TensorFlow 2 Debugging Methods

Overview

Table of Contents

  • 1. Debugging Tensor Values
  • 2. Debugging Device Placement
  • 3. Debugging Graph Structure
    • a. tf.function graphs
    • b. Runtime graphs
  • 4. Step-by-Step Debugging
  • 5. Debugging High-Level APIs (tf.keras)
  • 6. Numerical Issues (NaN / Infinity)
  • 7. TensorFlow Debugger (tfdbg)

1. Debugging Tensor Values

Printing Tensor values

import tensorflow as tf
import numpy as np
def log1p(x):
    y = 1.0 * x
    print(y)
    return tf.math.log(y)

y = log1p(tf.constant([1., 2., 3.]))
y = log1p(tf.constant([2., 3., 4.]) * np.pi)

Result

tf.Tensor([1. 2. 3.], shape=(3,), dtype=float32)
tf.Tensor([ 6.2831855  9.424778  12.566371 ], shape=(3,), dtype=float32)

Explanation

  • The function log1p is not decorated with @tf.function, so it executes eagerly.

  • The built-in print function can output the tensor's value (see the sketch after this list)

    • Similar to a numpy.ndarray
    • May trigger a device-to-host copy
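When the actual value is needed in Python code, here is a minimal sketch (an addition, not from the original) showing that an eagerly executed tensor can be converted to a NumPy array or a Python scalar:

y = log1p(tf.constant([1., 2., 3.]))
print(y.numpy())    # EagerTensor -> numpy.ndarray (may copy from device to host)
print(float(y[0]))  # individual elements convert to Python scalars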

Printing aggregate values of a Tensor

def log1p(x):
    y = 1.0 * x
    print(tf.reduce_mean(y), tf.reduce_max(y), tf.reduce_min(y))
    return tf.math.log(y)

y = log1p(tf.constant([1., 2., 3.]))
y = log1p(tf.constant([2., 3., 4.]) * np.pi)

Result

tf.Tensor(2.0, shape=(), dtype=float32) tf.Tensor(3.0, shape=(), dtype=float32) tf.Tensor(1.0, shape=(), dtype=float32)
tf.Tensor(9.424778, shape=(), dtype=float32) tf.Tensor(12.566371, shape=(), dtype=float32) tf.Tensor(6.2831855, shape=(), dtype=float32)
  • Built-in TF functions can be used to print transformed or aggregated tensor values

Changing the print format

np.set_printoptions(precision=3)

def log1p(x):
    y = 1.0 * x
    print(y)
    return tf.math.log(y)

y = log1p(tf.constant([1., 2., 3.]))
y = log1p(tf.constant([2., 3., 4.]) * np.pi)

Output

tf.Tensor([1. 2. 3.], shape=(3,), dtype=float32)
tf.Tensor([ 6.283  9.425 12.566], shape=(3,), dtype=float32)
  • EagerTensor.__str__() and __repr__() hook into numpy's string formatting
  • Therefore numpy.set_printoptions() can be used to control the print format

Printing tensors inside a graph

@tf.function
def collatz(n):
    counter = tf.constant(0)
    while n > 1:
        print(n)
        if n % 2 == 0:
            n //= 2
        else:
            n = n * 3 + 1
        counter += 1
    return counter

print(collatz(tf.constant(42)))

Result

Tensor("placeholder:0", shape=(), dtype=int32)
tf.Tensor(8, shape=(), dtype=int32)
  • The Placeholder is part of the graphlet for the TF while loop

Replace print(n) with tf.print(n), and the result becomes

42
21
64
32
16
8
4
2
tf.Tensor(8, shape=(), dtype=int32)
  • tf.print() prints the actual runtime values of the tensor n (a sketch of the modified function follows)
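For reference, a sketch of the modified function; the only change from the earlier block is that print(n) has been replaced with tf.print(n):

@tf.function
def collatz(n):
    counter = tf.constant(0)
    while n > 1:
        tf.print(n)  # prints the runtime value of n on every iteration
        if n % 2 == 0:
            n //= 2
        else:
            n = n * 3 + 1
        counter += 1
    return counter

print(collatz(tf.constant(42)))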

Ragged tensors (RaggedTensor)

ragged = tf.RaggedTensor.from_row_splits(
    values=[3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    row_splits=[0, 4, 4, 7, 8, 8]
)

@tf.function
def ragged_times_length_plus_one(x):
    row_lengths = tf.reduce_sum(x.row_lengths())
    y = x * tf.cast(row_lengths, tf.float32)
    tf.print(y)
    return y + 1.0

ragged_times_length_plus_one(ragged)

Output

tf.RaggedTensor(values=Tensor("Mul_1:0", shape=(8,), dtype=float32), row_splits=Tensor("x_1:0", shape=(6,), dtype=int64))
  • Ragged tensors are not printed properly inside a graph (a workaround sketch follows)
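A possible workaround (an addition, not from the original): print the ragged tensor's components, which are ordinary dense tensors and therefore print correctly with tf.print:

@tf.function
def ragged_print_components(x):
    tf.print("values:", x.values)          # the flat values tensor
    tf.print("row_splits:", x.row_splits)  # the row partition
    return x

ragged_print_components(ragged)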

Sparse tensors

sparse = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 2]],
    values=[1.1, 2.2],
    dense_shape=[3, 4]
)

@tf.function
def sparse_times_non_zero_count(x):
    count = tf.cast(tf.math.count_nonzero(x.values), tf.float32)
    y = x * count
    tf.print(y)
    return y

sparse_times_non_zero_count(sparse)

Output

'SparseTensor(indices=[[0 0]
 [1 2]], values=[2.2 4.4], shape=[3 4])'
  • Sparse tensors can be printed

Programmatically accessing tensor values inside a graph

random_normal = tf.random_normal_initializer()
w = tf.Variable(random_normal([2, 3]))
b = tf.Variable(random_normal([3]))

@tf.function
def my_dense_layer(x):
    y = tf.matmul(x, w)
    y_with_bias = y + b
    return tf.nn.relu(y_with_bias), y, y_with_bias

x = random_normal([4, 2])
print(my_dense_layer(x))

Result

(<tf.Tensor: id=460, shape=(4, 3), dtype=float32, numpy=
array([[0.   , 0.026, 0.   ],
       [0.   , 0.024, 0.   ],
       [0.   , 0.029, 0.   ],
       [0.   , 0.022, 0.   ]], dtype=float32)>, <tf.Tensor: id=461, shape=(4, 3), dtype=float32, numpy=
array([[-0.   ,  0.001,  0.001],
       [ 0.003, -0.001, -0.006],
       [-0.001,  0.003,  0.006],
       [ 0.002, -0.004, -0.008]], dtype=float32)>, <tf.Tensor: id=462, shape=(4, 3), dtype=float32, numpy=
array([[-0.092,  0.026, -0.011],
       [-0.088,  0.024, -0.019],
       [-0.093,  0.029, -0.007],
       [-0.09 ,  0.022, -0.021]], dtype=float32)>)
  • For intermediate tensors outside control flow, you can add them to the return values to obtain their runtime values

Programmatically accessing tensor values inside a graph: while loops

@tf.function
def collatz(n):
    counter = tf.constant(0)
    n_history = tf.TensorArray(n.dtype, size=0, dynamic_size=True)
    while n > 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = n * 3 + 1
        n_history = n_history.write(counter, n)
        counter += 1
    return counter, n_history.stack()

print(collatz(tf.constant(42)))

Result

(<tf.Tensor: id=556, shape=(), dtype=int32, numpy=8>, <tf.Tensor: id=557, shape=(8,), dtype=int32, numpy=array([21, 64, 32, 16,  8,  4,  2,  1])>)
  • This can be implemented with tf.TensorArray

2. Debugging Device Placement

Device placement of ops

import tensorflow as tf
import numpy as np
# Must be called at the very beginning of the program
tf.debugging.set_log_device_placement(True)

def log1p(x):
    y = 1.0 + x
    tf.print(y)
    return tf.math.log(y)

log1p(tf.constant([1.0, 2.0, 3.0]) * np.pi)

Result

Executing op Mul in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op AddV2 in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0
[4.14159298 7.28318548 10.424778]
Executing op Log in device /job:localhost/replica:0/task:0/device:CPU:0
  • Each time an individual op is placed on a device, its placement is logged
  • Repeated eager executions of the same op on the same device are not logged again

Device placement of tf.function

import tensorflow as tf
import numpy as np
# Must be called at the very beginning of the program
tf.debugging.set_log_device_placement(True)

@tf.function
def log1p(x):
    y = 1.0 + x
    tf.print(y)
    return tf.math.log(y)

log1p(tf.constant([1.0, 2.0, 3.0]) * np.pi)

Result in a Jupyter Notebook

Executing op __inference_log1p_19 in device /job:localhost/replica:0/task:0/device:CPU:0
[4.14159298 7.28318548 10.424778]

Result when run from the command line

x: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
add: (AddV2): /job:localhost/replica:0/task:0/device:CPU:0
StringFormat: (StringFormat): /job:localhost/replica:0/task:0/device:CPU:0
PrintV2: (PrintV2): /job:localhost/replica:0/task:0/device:CPU:0
Log: (Log): /job:localhost/replica:0/task:0/device:CPU:0
Identity: (Identity): /job:localhost/replica:0/task:0/device:CPU:0
identity_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:CPU:0
add/x: (Const): /job:localhost/replica:0/task:0/device:CPU:0
[4.14159298 7.28318548 10.424778]
  • set_log_device_placement() does not show the placement of in-graph ops in Jupyter
  • Because Jupyter only shows stdout, and the placement messages go to the info log
  • set_log_device_placement() only prints device placement for:
    • Eager op execution
    • Graph construction
  • For the latter, there is no guarantee that every op actually runs; Grappler optimization may prune it before execution
  • set_log_device_placement() does not work well on TPUs (a sketch for verifying placement with tf.device follows)
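A minimal sketch (an addition, not from the original) for double-checking placement: pin an op to a device with tf.device and confirm it in the placement log:

tf.debugging.set_log_device_placement(True)

with tf.device("/CPU:0"):  # force these ops onto the CPU
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.matmul(a, a)    # the log should report MatMul on device:CPU:0
print(b)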

3. Debugging Graph Structure

a. tf.function graphs

Getting the graph of a tf.function

random_normal = tf.random_normal_initializer()
w = tf.Variable(random_normal([2, 3]))
b = tf.Variable(random_normal([3]))

@tf.function
def my_dense_layer(x):
    y = tf.matmul(x, w)
    y_with_bias = y + b
    return tf.nn.relu(y_with_bias), y, y_with_bias

x = random_normal([4, 2])
print(my_dense_layer(x))

graph = my_dense_layer.get_concrete_function(x).graph
graph.as_graph_def()

Result

node {
  name: "x"
  op: "Placeholder"
  attr {
    key: "_user_specified_name"
    value {
      s: "x"
    }
  }
  attr {
    key: "dtype"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "shape"
    value {
      shape {
        dim {
          size: 4
        }
        dim {
          size: 2
        }
      }
    }
  }
}
node {
  name: "MatMul/ReadVariableOp/resource"
  op: "Placeholder"
  device: "/job:localhost/replica:0/task:0/device:CPU:0"
  attr {
    key: "dtype"
    value {
      type: DT_RESOURCE
    }
  }
...
  • Use get_concrete_function after the first call to (or tracing of) the tf.function
  • A concrete function is the result of compiling the Python function into a graph for a specific set of input arguments (see the sketch below)
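A minimal sketch (an addition, not from the original): instead of dumping the whole GraphDef, iterate over the concrete function's graph and list its ops:

graph = my_dense_layer.get_concrete_function(x).graph
for op in graph.get_operations():
    print(op.type, op.name)  # e.g. MatMul, AddV2, Relu, ...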

TensorBoard graph visualization

  • Vertical direction of data flow: bottom-up
  • Grouping by name scope: yes
  • Can handle the FunctionDefLibrary inside a GraphDef (e.g. V2 control flow): yes (shown as breakout boxes); a sketch for exporting a graph to TensorBoard follows
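A minimal sketch (an addition, not from the original) for exporting a tf.function graph so it can be inspected in TensorBoard's graph visualizer; the logdir name is an assumption:

writer = tf.summary.create_file_writer("tb_graph_logdir")
tf.summary.trace_on(graph=True)  # start recording graph information
my_dense_layer(x)                # trace the function once while recording is on
with writer.as_default():
    tf.summary.trace_export(name="my_dense_layer_graph", step=0)
# Then run: tensorboard --logdir tb_graph_logdir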

Getting and drawing function graphs: Colab (Google3-internal only)

$ blaze run -c opt --config=python3 --config=cuda \
  learning/brain/python/client/colab:colab_notebook_with_tfgraph_py3
random_normal = tf.random_normal_initializer()
w = tf.Variable(random_normal([2, 3]))
b = tf.Variable(random_normal([3]))

@tf.function
def my_dense_layer(x):
    y = tf.matmul(x, w)
    y_with_bias = y + b
    return tf.nn.relu(y_with_bias), y, y_with_bias

x = random_normal([4, 2])
print(my_dense_layer(x))

from google3.learning.brain.python.client import colab

graph = my_dense_layer.get_concrete_function(x).graph
colab.tfgraph.display(graph)

Getting and drawing function graphs: control flow in TF2

@tf.function
def collatz(n):
    counter = tf.constant(0)
    while n > 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = n * 3 + 1
        counter += 1
    return counter

print(collatz(tf.constant(42)))
collatz_graph = collatz.get_concrete_function(tf.constant(42)).graph
colab.tfgraph.display(collatz_graph)
  • Control flow V2 is converted into graphlets
  • The TensorBoard graph visualizer shows graphlets as breakout boxes
  • Netron cannot handle this kind of nested graph structure either (a sketch for listing the graphlets follows)
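A minimal sketch (an addition, not from the original): the while/cond graphlets live in the graph's FunctionDefLibrary, so listing the library's function names shows the nested structure:

collatz_graph = collatz.get_concrete_function(tf.constant(42)).graph
for fn in collatz_graph.as_graph_def().library.function:
    print(fn.signature.name)  # e.g. while_body_..., while_cond_...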

Distribution strategies

gpus = tf.config.list_physical_devices("GPU")
if len(gpus) == 1:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],  # Which physical device to use
        [tf.config.LogicalDeviceConfiguration(512) for _ in range(4)] # Resultant logical devices
    )
tf.config.list_logical_devices()

dist_strat = tf.distribute.MirroredStrategy()

with dist_strat.scope():
    w = tf.Variable(tf.ones([4, 10]))

def f():
    with tf.GradientTape() as tape:
        loss = tf.math.square(w)
    grads = tape.gradient(loss, w)
    return grads

dist_f = lambda: dist_strat.experimental_run_v2(f)
dist_f = tf.function(dist_f, autograph=True)
g = dist_f.get_concrete_function().graph
g.as_graph_def()

Result

...
node {
  name: "Square"
  op: "Square"
  input: "Square/ReadVariableOp"
  device: "/job:localhost/replica:0/task:0/device:GPU:0"
  attr {
    key: "T"
    value {
      type: DT_FLOAT
    }
  }
}
...
  • MirroredStrategy and some other strategies perform in-graph replication
  • This replication is reflected in the concrete function's graph (see the sketch below)
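A minimal sketch (an addition, not from the original) for inspecting the replicated results at runtime: experimental_local_results unpacks the PerReplica container returned by the replicated function into one tensor per logical device:

per_replica_grads = dist_f()
for i, g in enumerate(dist_strat.experimental_local_results(per_replica_grads)):
    print("replica", i, g.shape)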

How does tf.print() work?

Question: the result of the tf.print() op is never used, so how does it get executed?

Answer: the op is added as a control dependency of the returned results

Does tf.print still work in a function with no return value?

v1 = tf.Variable(40.0)

@tf.function
def increment_variable():
    tf.print(v1)
    tf.compat.v1.assign_add(v1, 1.0)
    
increment_variable()

Result

40

b. Runtime graphs

tf.print() may affect the optimization of the runtime graph

@tf.function
def harmonic_mean(x):
    x_reciprocals = tf.math.reciprocal(x)
    reciprocal_sum = tf.math.reduce_sum(x_reciprocals)
    # Originally a bare tf.math.reduce_min(x_reciprocals); wrapping it in tf.print forces it to execute:
    tf.print(tf.math.reduce_min(x_reciprocals))
    n = tf.cast(tf.size(x), tf.float32)
    return n / reciprocal_sum

harmonic_mean(tf.constant([10.0, 20.0, 30.0]))
  • Adding tf.print() forces the min op, which otherwise would not be executed, to actually execute

Dumping Grappler output: the graph that actually executes

$ TF_DUMP_GRAPH_PREFIX="/tmp/tf_graph_dump" \
  bazel run my/build/target -- --vmodule=meta_optimizer=4
  • Grappler is TF's built-in default graph optimizer
  • The file of interest is usually the last one: Grappler's final output
  • One goal of tfdbg2 is to make this workflow simpler (for both function graphs and Grappler-output graphs); a sketch for loading a dump follows
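A minimal sketch (an addition, not from the original) for loading one of the dumped text-format GraphDef files and listing its ops; the file name is an assumption, substitute whichever dump under TF_DUMP_GRAPH_PREFIX you want to inspect:

import tensorflow as tf
from google.protobuf import text_format

graph_def = tf.compat.v1.GraphDef()
with open("/tmp/tf_graph_dump/after_grappler.pbtxt") as f:  # hypothetical file name
    text_format.Merge(f.read(), graph_def)

for node in graph_def.node:
    print(node.op, node.name)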

4. Step-by-Step Debugging

tf.config.experimental_run_functions_eagerly()

  • Overrides graph compilation so all ops run eagerly, including backprop.
  • You can then set breakpoints in your IDE and step through the code (see the sketch below)
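A minimal sketch (an addition, not from the original): with function compilation disabled, the Python debugger can step into the function body:

import tensorflow as tf

tf.config.experimental_run_functions_eagerly(True)

@tf.function
def square_sum(x):
    y = x * x  # with eager execution you can set an IDE breakpoint here
    # import pdb; pdb.set_trace()  # or drop into pdb
    return tf.reduce_sum(y)

print(square_sum(tf.constant([1.0, 2.0, 3.0])))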

This API does not work inside tf.data.Dataset.map()

  • Because Dataset.map() always compiles its function into a graph before execution
  • Regardless of whether @tf.function is used
  • Implications:
    • Stepping through the map function is not possible
    • You must use tf.print() instead of print() to output tensor values (see the sketch below)
    • Workaround: use tfdbg2
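A minimal sketch (an addition, not from the original): tf.print inside a Dataset.map function prints runtime element values, whereas a plain print would only show the symbolic tensors once at tracing time:

def debug_map_fn(x):
    y = x * 2
    tf.print("element:", x, "mapped to:", y)  # executes for every element
    return y

ds = tf.data.Dataset.range(3).map(debug_map_fn)
for _ in ds:
    pass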

5. Debugging High-Level APIs (tf.keras)

Accessing intermediate layers in tf.keras

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(5, input_shape=[4], activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.5))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

debug_model = tf.keras.Model(
    inputs=model.inputs, 
    outputs=[model.layers[0].output, model.layers[1].output] + model.outputs)

xs = tf.random_normal_initializer()([8, 4])
print(debug_model(xs, training=True))

Result

[<tf.Tensor: id=103, shape=(8, 5), dtype=float32, numpy=
array([[0.03208053, 0.        , 0.        , 0.09101269, 0.0405516 ],
       [0.06668283, 0.        , 0.05414589, 0.        , 0.06441024],
       [0.        , 0.02470349, 0.0345275 , 0.        , 0.        ],
       [0.02822505, 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.03051471, 0.        ],
       [0.01117405, 0.        , 0.0744615 , 0.07232606, 0.09003952],
       [0.        , 0.03395397, 0.04608804, 0.        , 0.        ],
       [0.        , 0.02972447, 0.00674627, 0.        , 0.        ]],
      dtype=float32)>, <tf.Tensor: id=116, shape=(8, 5), dtype=float32, numpy=
array([[0.06416105, 0.        , 0.        , 0.        , 0.08110321],
       [0.13336566, 0.        , 0.        , 0.        , 0.12882048],
       [0.        , 0.04940698, 0.069055  , 0.        , 0.        ],
       [0.0564501 , 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.06102942, 0.        ],
       [0.0223481 , 0.        , 0.14892301, 0.14465213, 0.18007904],
       [0.        , 0.        , 0.09217609, 0.        , 0.        ],
       [0.        , 0.05944894, 0.        , 0.        , 0.        ]],
      dtype=float32)>, <tf.Tensor: id=121, shape=(8, 1), dtype=float32, numpy=
array([[0.51327056],
       [0.52288353],
       [0.49928164],
       [0.5032595 ],
       [0.5143335 ],
       [0.54077065],
       [0.49030966],
       [0.50787127]], dtype=float32)>]
  • To access a model's internal layers, you can build a new model whose outputs are those layers' outputs
  • What if you want to see the gradients inside the layers?
    • tfdbg can help you (a GradientTape alternative is sketched below)
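As an alternative to tfdbg, a minimal sketch (an addition, not from the original) that uses tf.GradientTape on the debug model to look at gradients flowing through a layer; it assumes debug_model.layers[1] is the first Dense layer:

with tf.GradientTape() as tape:
    dense_out, dropout_out, final_out = debug_model(xs, training=True)
    loss = tf.reduce_mean(final_out)

# gradients of the loss w.r.t. the first Dense layer's kernel and bias
grads = tape.gradient(loss, debug_model.layers[1].trainable_variables)
for v, g in zip(debug_model.layers[1].trainable_variables, grads):
    print(v.name, g.shape)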

Debugging Keras models with the TensorBoard callback

from tensorflow.keras import backend as K

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(5, input_shape=[4], activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.5))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

xs = tf.random_normal_initializer()([8, 4])
ys = tf.zeros([8])
model.fit(xs, ys, epochs=2, callbacks=[tf.keras.callbacks.TensorBoard("tb_logdir")])
  • The tf.keras.callbacks.TensorBoard callback writes logs of the training graph to the logdir, including loss, weights, and other information
  • Edges in the graph are annotated with tensor shapes, but only the shapes known at model construction time; launch TensorBoard as shown below to view them
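To view the logged graph and metrics, point TensorBoard at the callback's logdir (assuming TensorBoard is installed):

tensorboard --logdir tb_logdir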

6. Numerical Issues (NaN / Infinity)

Common causes of numerical problems

  • Lack of value clipping (see the sketch after this list)
    • Division by zero, log of zero
  • Problems in the ops themselves
  • Exploding gradients
  • Bad training examples
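A minimal sketch (an addition, not from the original) of the clipping idea: guard log and division against zero by clamping to a small epsilon before the risky op:

def safe_log(x, eps=1e-7):
    return tf.math.log(tf.clip_by_value(x, eps, tf.float32.max))

def safe_divide(a, b, eps=1e-7):
    return a / tf.where(tf.equal(b, 0.0), tf.ones_like(b) * eps, b)

print(safe_log(tf.constant([0.0, 1.0])))                # no -inf
print(safe_divide(tf.constant(1.0), tf.constant(0.0)))  # no inf/nan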

Debugging numerical problems with tfdbg2

tf.debugging.enable_check_numerics()

@tf.function
def bad_func(n):
    total = tf.constant(0.0)
    x = tf.constant(10.0)
    i = tf.constant(0, dtype=tf.int32)
    while tf.math.less(i, n):
        total += tf.math.log(x)
        x -= 1.0
        i += 1
    return total

# With n > 10 the loop eventually computes log(0) = -inf; with n <= 10 there is no error
n = tf.constant(12, dtype=tf.int32)
print(bad_func(n))

Output

InvalidArgumentError:  

!!! Detected Infinity or NaN in output 0 of graph op "Log" (# of outputs: 1) !!!
  dtype: <dtype: 'float32'>
  shape: ()

  Input tensor: Tensor("Placeholder:0", shape=(), dtype=float32)
  Graph name: "while_body_13"

  Stack trace of op's creation ("->": inferred user code):
    + ... (Omitted 21 frames)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L2848) run_cell
 -> |   raw_cell, store_history, silent, shell_futures)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L2874) _run_cell
 -> |   return runner(coro)
    + ...hon3.6/site-packages/IPython/core/async_helpers.py (L68) _pseudo_sync_runner
 -> |   coro.send(None)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L3051) run_cell_async
 -> |   interactivity=interactivity, compiler=compiler, result=result)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L3242) run_ast_nodes
 -> |   if (await self.run_code(code, result,  async_=asy)):
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L3319) run_code
 -> |   exec(code_obj, self.user_global_ns, self.user_ns)
    + <ipython-input-3-acc5c4cbe210> (L16) <module>
 -> |   print(bad_func(n))
    + ...kages/tensorflow_core/python/eager/def_function.py (L568) __call__
    |   result = self._call(*args, **kwds)
    + ...kages/tensorflow_core/python/eager/def_function.py (L615) _call
    |   self._initialize(args, kwds, add_initializers_to=initializers)
    + ...kages/tensorflow_core/python/eager/def_function.py (L497) _initialize
    |   *args, **kwds))
    + ...-packages/tensorflow_core/python/eager/function.py (L2389) _get_concrete_function_internal_garbage_collected
    |   graph_function, _, _ = self._maybe_define_function(args, kwargs)
    + ...-packages/tensorflow_core/python/eager/function.py (L2703) _maybe_define_function
    |   graph_function = self._create_graph_function(args, kwargs)
    + ...-packages/tensorflow_core/python/eager/function.py (L2593) _create_graph_function
    |   capture_by_value=self._capture_by_value),
    + ...ges/tensorflow_core/python/framework/func_graph.py (L978) func_graph_from_py_func
    |   func_outputs = python_func(*func_args, **func_kwargs)
    + ...kages/tensorflow_core/python/eager/def_function.py (L439) wrapped_fn
    |   return weak_wrapped_fn().__wrapped__(*args, **kwds)
    + ...ges/tensorflow_core/python/framework/func_graph.py (L964) wrapper
    |   user_requested=True,
    + <ipython-input-3-acc5c4cbe210> (L8) bad_func
 -> |   while tf.math.less(i, n):
    + ...ow_core/python/autograph/operators/control_flow.py (L746) while_stmt
    |   basic_symbol_names, composite_symbol_names, opts)
    + ...ow_core/python/autograph/operators/control_flow.py (L794) _tf_while_stmt
    |   aug_init_vars, **opts)
    + ...ges/tensorflow_core/python/ops/control_flow_ops.py (L2675) while_loop
    |   back_prop=back_prop)
    + ...te-packages/tensorflow_core/python/ops/while_v2.py (L194) while_loop
    |   add_control_dependencies=add_control_dependencies)
    + ...ges/tensorflow_core/python/framework/func_graph.py (L978) func_graph_from_py_func
    |   func_outputs = python_func(*func_args, **func_kwargs)
    + ...te-packages/tensorflow_core/python/ops/while_v2.py (L172) wrapped_body
    |   outputs = body(*_pack_sequence_as(orig_loop_vars, args))
    + ...ow_core/python/autograph/operators/control_flow.py (L781) aug_body
    |   loop_vars = body(*aug_loop_vars[loop_vars_slice])
    + <ipython-input-3-acc5c4cbe210> (L9) bad_func
 -> |   total += tf.math.log(x)
    + ...ackages/tensorflow_core/python/ops/gen_math_ops.py (L5248) log
    |   "Log", x=x, name=name)
    + ...tensorflow_core/python/framework/op_def_library.py (L742) _apply_op_helper
    |   attrs=attr_protos, op_def=op_def)
    + ...ges/tensorflow_core/python/framework/func_graph.py (L595) _create_op_internal
    |   compute_device)
    + ...e-packages/tensorflow_core/python/framework/ops.py (L3322) _create_op_internal
    |   op_def=op_def)
    + ...e-packages/tensorflow_core/python/framework/ops.py (L1756) __init__
    |   self._traceback = tf_stack.extract_stack()

 : Tensor had Inf values
	 [[{{node while/body/_1/Log/CheckNumerics}}]] [Op:__inference_bad_func_58]

Function call stack:
bad_func
  • enable_check_numerics() is the successor to add_check_numerics_ops() from TF1
  • Checks both eagerly executed ops and ops inside graphs
    • Works for both the forward and backward pass
    • Works across API levels
    • Works with TF1 as well
    • Works on CPU, GPU, and TPU
  • Relative overhead
    • 1.29x wall time on CPU, 1.76x on GPU; the overhead is modest
    • Note: 1.0x means no overhead
    • Measured on: tensorflow_models.official.transformers.v2, task type=training, batch size=64
    • TPU benchmarks will be added later, in coordination with TensorTracer

7. TensorFlow Debugger (tfdbg)

TensorFlow Debugger (tfdbg) V1

  • The predecessor of tfdbg v2, launched in early 2017
  • Provides visibility into tf.Session() at runtime
    • Hooks in by wrapping tf.Session()
    • Convenient APIs also available for Keras, Estimator, and slim
  • Supports distributed training
  • User interface: an interactive, clickable CLI
    • Intermediate tensor values and their summary statistics
      • Conditional breakpoints, e.g. has_inf_or_nan
    • Runtime graph structure (after Grappler and partitioning)
    • Op attributes, including the originating stack trace
    • Source code viewing

Why tfdbg v2?

  • The new TF execution paradigm
    • No tf.Session()
    • Eager execution + tf.function
  • Are print() and tf.print() enough for debuggability?
    • They help in some cases, but are not the complete answer
    • Generality matters: across hardware types
    • Low performance overhead matters
      • Graded levels of debugging intrusiveness
    • Frontend UX matters

Basic workflow (see the sketch below): the user's TF program calls tf.debugging.experimental.enable_dump_debug_info(logdir), and the Debugger V2 dashboard (still under construction) reads the dumped data from that logdir.
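A minimal sketch (an addition, not from the original) of enabling the dump mode; the logdir and tensor_debug_mode values are assumptions, see the API documentation for the available modes:

import tensorflow as tf

tf.debugging.experimental.enable_dump_debug_info(
    "/tmp/tfdbg2_logdir",
    tensor_debug_mode="FULL_HEALTH",
    circular_buffer_size=-1)  # <= 0 keeps everything instead of a circular buffer

# ... run the TF2 program as usual, then view the data with:
#   tensorboard --logdir /tmp/tfdbg2_logdir
# and open the Debugger V2 dashboard.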

Key documentation:

tf.debugging.experimental.enable_dump_debug_info

TensorFlow Debugger (TFDBG)

Debugger Dashboard usage guide
