Table of Contents
- 1. Debugging Tensor Values
- 2. Debugging Device Placement
- 3. Debugging Graph Structure
  - a. tf.function graphs
  - b. Runtime graphs
- 4. Single-Step Debugging
- 5. Debugging the High-Level API (tf.keras)
- 6. Numerical Problems (NaN / Infinity)
- 7. TensorFlow Debugger (tfdbg)
1. Debugging Tensor Values

Printing tensor values

```python
import tensorflow as tf
import numpy as np

def log1p(x):
    y = 1.0 * x
    print(y)
    return tf.math.log(y)

y = log1p(tf.constant([1., 2., 3.]))
y = log1p(tf.constant([2., 3., 4.]) * np.pi)
```
Output

```
tf.Tensor([1. 2. 3.], shape=(3,), dtype=float32)
tf.Tensor([ 6.2831855  9.424778  12.566371 ], shape=(3,), dtype=float32)
```
Explanation

- The function log1p is not decorated with @tf.function, so it executes eagerly.
- print() can output the tensor's value
  - similar to printing a numpy.ndarray
  - it may trigger a device-to-host copy
Printing aggregate values of a tensor

```python
def log1p(x):
    y = 1.0 * x
    print(tf.reduce_mean(y), tf.reduce_max(y), tf.reduce_min(y))
    return tf.math.log(y)

y = log1p(tf.constant([1., 2., 3.]))
y = log1p(tf.constant([2., 3., 4.]) * np.pi)
```
Output

```
tf.Tensor(2.0, shape=(), dtype=float32) tf.Tensor(3.0, shape=(), dtype=float32) tf.Tensor(1.0, shape=(), dtype=float32)
tf.Tensor(9.424778, shape=(), dtype=float32) tf.Tensor(12.566371, shape=(), dtype=float32) tf.Tensor(6.2831855, shape=(), dtype=float32)
```

- You can print transformed tensor values using built-in TF functions.
Changing the print format

```python
np.set_printoptions(precision=3)

def log1p(x):
    y = 1.0 * x
    print(y)
    return tf.math.log(y)

y = log1p(tf.constant([1., 2., 3.]))
y = log1p(tf.constant([2., 3., 4.]) * np.pi)
```
Output

```
tf.Tensor([1. 2. 3.], shape=(3,), dtype=float32)
tf.Tensor([ 6.283  9.425 12.566], shape=(3,), dtype=float32)
```
- EagerTensor.__str__() and __repr__() hook into NumPy's string formatting
  - so numpy.set_printoptions() can be used to control the print format
Printing tensors inside a graph

```python
@tf.function
def collatz(n):
    counter = tf.constant(0)
    while n > 1:
        print(n)
        if n % 2 == 0:
            n //= 2
        else:
            n = n * 3 + 1
        counter += 1
    return counter

print(collatz(tf.constant(42)))
```
Output

```
Tensor("placeholder:0", shape=(), dtype=int32)
tf.Tensor(8, shape=(), dtype=int32)
```

- The Placeholder is part of the graphlet built for the TF while loop; print() only runs at trace time and shows the symbolic tensor.
Replacing print(n) with tf.print(n), the result becomes:

```
42
21
64
32
16
8
4
2
tf.Tensor(8, shape=(), dtype=int32)
```
- tf.print() prints the actual values that the tensor n takes during execution.
Ragged tensors (RaggedTensor)

```python
ragged = tf.RaggedTensor.from_row_splits(
    values=[3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    row_splits=[0, 4, 4, 7, 8, 8]
)

@tf.function
def ragged_times_length_plus_one(x):
    row_lengths = tf.reduce_sum(x.row_lengths())
    y = x * tf.cast(row_lengths, tf.float32)
    tf.print(y)
    return y + 1.0

ragged_times_length_plus_one(ragged)
```
Output

```
tf.RaggedTensor(values=Tensor("Mul_1:0", shape=(8,), dtype=float32), row_splits=Tensor("x_1:0", shape=(6,), dtype=int64))
```

- Ragged tensors are not printed properly: tf.print() shows the symbolic component tensors rather than their values.
Sparse tensors

```python
sparse = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 2]],
    values=[1.1, 2.2],
    dense_shape=[3, 4]
)

@tf.function
def sparse_times_non_zero_count(x):
    count = tf.cast(tf.math.count_nonzero(x.values), tf.float32)
    y = x * count
    tf.print(y)
    return y

sparse_times_non_zero_count(sparse)
```
Output

```
'SparseTensor(indices=[[0 0]
 [1 2]], values=[2.2 4.4], shape=[3 4])'
```

- Sparse tensors can be printed.
Programmatic access to tensor values inside a graph

```python
random_normal = tf.random_normal_initializer()
w = tf.Variable(random_normal([2, 3]))
b = tf.Variable(random_normal([3]))

@tf.function
def my_dense_layer(x):
    y = tf.matmul(x, w)
    y_with_bias = y + b
    return tf.nn.relu(y_with_bias), y, y_with_bias

x = random_normal([4, 2])
print(my_dense_layer(x))
```
Output

```
(<tf.Tensor: id=460, shape=(4, 3), dtype=float32, numpy=
array([[0.   , 0.026, 0.   ],
       [0.   , 0.024, 0.   ],
       [0.   , 0.029, 0.   ],
       [0.   , 0.022, 0.   ]], dtype=float32)>, <tf.Tensor: id=461, shape=(4, 3), dtype=float32, numpy=
array([[-0.   ,  0.001,  0.001],
       [ 0.003, -0.001, -0.006],
       [-0.001,  0.003,  0.006],
       [ 0.002, -0.004, -0.008]], dtype=float32)>, <tf.Tensor: id=462, shape=(4, 3), dtype=float32, numpy=
array([[-0.092,  0.026, -0.011],
       [-0.088,  0.024, -0.019],
       [-0.093,  0.029, -0.007],
       [-0.09 ,  0.022, -0.021]], dtype=float32)>)
```

- For intermediate tensors outside of control flow, you can add them to the return values to obtain their runtime values.
Programmatic access to tensor values inside a graph: while loops

```python
@tf.function
def collatz(n):
    counter = tf.constant(0)
    n_history = tf.TensorArray(n.dtype, size=0, dynamic_size=True)
    while n > 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = n * 3 + 1
        n_history = n_history.write(counter, n)
        counter += 1
    return counter, n_history.stack()

print(collatz(tf.constant(42)))
```
Output

```
(<tf.Tensor: id=556, shape=(), dtype=int32, numpy=8>, <tf.Tensor: id=557, shape=(8,), dtype=int32, numpy=array([21, 64, 32, 16, 8, 4, 2, 1])>)
```

- This can be implemented with tf.TensorArray.
2. Debugging Device Placement

Placement of ops on devices

```python
import tensorflow as tf
import numpy as np

# Must be called at the very beginning of the program
tf.debugging.set_log_device_placement(True)

def log1p(x):
    y = 1.0 + x
    tf.print(y)
    return tf.math.log(y)

log1p(tf.constant([1.0, 2.0, 3.0]) * np.pi)
```
Output

```
Executing op Mul in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op AddV2 in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0
[4.14159298 7.28318548 10.424778]
Executing op Log in device /job:localhost/replica:0/task:0/device:CPU:0
```

- The placement of each individual op is logged when it is placed on a device.
- Repeated eager executions of the same op on the same device are not logged again.
Placement of a tf.function on devices

```python
import tensorflow as tf
import numpy as np

# Must be called at the very beginning of the program
tf.debugging.set_log_device_placement(True)

@tf.function
def log1p(x):
    y = 1.0 + x
    tf.print(y)
    return tf.math.log(y)

log1p(tf.constant([1.0, 2.0, 3.0]) * np.pi)
```
Output in a Jupyter notebook

```
Executing op __inference_log1p_19 in device /job:localhost/replica:0/task:0/device:CPU:0
[4.14159298 7.28318548 10.424778]
```
Output on the command line

```
x: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
add: (AddV2): /job:localhost/replica:0/task:0/device:CPU:0
StringFormat: (StringFormat): /job:localhost/replica:0/task:0/device:CPU:0
PrintV2: (PrintV2): /job:localhost/replica:0/task:0/device:CPU:0
Log: (Log): /job:localhost/replica:0/task:0/device:CPU:0
Identity: (Identity): /job:localhost/replica:0/task:0/device:CPU:0
identity_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:CPU:0
add/x: (Const): /job:localhost/replica:0/task:0/device:CPU:0
[4.14159298 7.28318548 10.424778]
```
- set_log_device_placement() does not show the placement of in-graph ops in Jupyter
  - because Jupyter only displays stdout, while these messages go to the info log
- set_log_device_placement() only prints device placement for the following:
  - eager op executions
  - graph construction
  - For the latter, there is no guarantee that every op actually executes at runtime; Grappler optimizations may prune ops before they run.
- set_log_device_placement() does not work well on TPUs
3. Debugging Graph Structure

a. tf.function graphs

Obtaining the graph of a tf.function

```python
random_normal = tf.random_normal_initializer()
w = tf.Variable(random_normal([2, 3]))
b = tf.Variable(random_normal([3]))

@tf.function
def my_dense_layer(x):
    y = tf.matmul(x, w)
    y_with_bias = y + b
    return tf.nn.relu(y_with_bias), y, y_with_bias

x = random_normal([4, 2])
print(my_dense_layer(x))

graph = my_dense_layer.get_concrete_function(x).graph
graph.as_graph_def()
```
Output

```
node {
  name: "x"
  op: "Placeholder"
  attr {
    key: "_user_specified_name"
    value {
      s: "x"
    }
  }
  attr {
    key: "dtype"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "shape"
    value {
      shape {
        dim {
          size: 4
        }
        dim {
          size: 2
        }
      }
    }
  }
}
node {
  name: "MatMul/ReadVariableOp/resource"
  op: "Placeholder"
  device: "/job:localhost/replica:0/task:0/device:CPU:0"
  attr {
    key: "dtype"
    value {
      type: DT_RESOURCE
    }
  }
...
```
- Use get_concrete_function after the tf.function has been called (traced) for the first time.
- A concrete function is the result of compiling the Python function into a graph for a specific set of input arguments.
TensorBoard graph visualizer

- Direction of information flow: bottom to top
- Grouped by name scope: yes
- Can handle a FunctionDefLibrary inside a GraphDef (e.g. V2 control flow): yes (shown as break-out boxes)

A sketch of exporting a tf.function graph to TensorBoard follows.
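As a minimal sketch (assuming the TF 2.x tf.summary tracing API; the logdir name and the small function are illustrative, not from the original text), a traced tf.function graph can be written out for the TensorBoard graph visualizer like this:

```python
import tensorflow as tf

writer = tf.summary.create_file_writer("tb_graph_logdir")  # hypothetical logdir

@tf.function
def my_dense_layer(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

w = tf.random.normal([2, 3])
b = tf.random.normal([3])
x = tf.random.normal([4, 2])

tf.summary.trace_on(graph=True)   # start recording the traced graph
my_dense_layer(x, w, b)           # the first call triggers tracing
with writer.as_default():
    tf.summary.trace_export(name="my_dense_layer_trace", step=0)
```

Running `tensorboard --logdir tb_graph_logdir` should then show the graph in the Graphs dashboard.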
Obtaining and drawing function graphs: Colab (Google3-only)

```shell
$ blaze run -c opt --config=python3 --config=cuda learning/brain/python/client/colab:colab_notebook_with_tfgraph_py3
```
```python
random_normal = tf.random_normal_initializer()
w = tf.Variable(random_normal([2, 3]))
b = tf.Variable(random_normal([3]))

@tf.function
def my_dense_layer(x):
    y = tf.matmul(x, w)
    y_with_bias = y + b
    return tf.nn.relu(y_with_bias), y, y_with_bias

x = random_normal([4, 2])
print(my_dense_layer(x))

from google3.learning.brain.python.client import colab
graph = my_dense_layer.get_concrete_function(x).graph
colab.tfgraph.display(graph)
```
Obtaining and drawing function graphs: control flow in TF2

```python
@tf.function
def collatz(n):
    counter = tf.constant(0)
    while n > 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = n * 3 + 1
        counter += 1
    return counter

print(collatz(tf.constant(42)))

collatz_graph = collatz.get_concrete_function(tf.constant(42)).graph
colab.tfgraph.display(collatz_graph)
```
- Control flow V2 is converted into graphlets.
- The TensorBoard graph visualizer shows graphlets as break-out boxes.
- Netron also cannot handle this kind of nested graph structure.
Distribution strategies

```python
gpus = tf.config.list_physical_devices("GPU")
if len(gpus) == 1:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],  # Which physical device to use
        [tf.config.LogicalDeviceConfiguration(512) for _ in range(4)]  # Resultant logical devices
    )
tf.config.list_logical_devices()

dist_strat = tf.distribute.MirroredStrategy()
with dist_strat.scope():
    w = tf.Variable(tf.ones([4, 10]))

def f():
    with tf.GradientTape() as tape:
        loss = tf.math.square(w)
    grads = tape.gradient(loss, w)
    return grads

dist_f = lambda: dist_strat.experimental_run_v2(f)
dist_f = tf.function(dist_f, autograph=True)
g = dist_f.get_concrete_function().graph
g.as_graph_def()
```
Output

```
...
node {
  name: "Square"
  op: "Square"
  input: "Square/ReadVariableOp"
  device: "/job:localhost/replica:0/task:0/device:GPU:0"
  attr {
    key: "T"
    value {
      type: DT_FLOAT
    }
  }
}
...
```
- MirroredStrategy and some other strategies perform in-graph replication.
- This replication is reflected in the concrete function's graph.
How does tf.print() work?

Question: the result of the tf.print() op is never consumed, so how does it get executed?
Answer: the op is added as a control dependency of the returned results.

Does tf.print() still work in a function with no return value?
```python
v1 = tf.Variable(40.0)

@tf.function
def increment_variable():
    tf.print(v1)
    tf.compat.v1.assign_add(v1, 1.0)

increment_variable()
```
Output

```
40
```
b. Runtime graphs

tf.print() may affect optimizations of the runtime graph
```python
@tf.function
def harmonic_mean(x):
    x_reciprocals = tf.math.reciprocal(x)
    reciprocal_sum = tf.math.reduce_sum(x_reciprocals)
    tf.math.reduce_min(x_reciprocals)  # ==> tf.print(tf.math.reduce_min(x_reciprocals))
    n = tf.cast(tf.size(x), tf.float32)
    return n / reciprocal_sum

harmonic_mean(tf.constant([10.0, 20.0, 30.0]))
```
- Adding tf.print() causes the min op, which would otherwise never execute, to actually run.
Dumping Grappler output: the graph that actually runs

```shell
$ TF_DUMP_GRAPH_PREFIX="/tmp/tf_graph_dump" bazel run my/build/target -- --vmodule=meta_optimizer=4
```
- Grappler is TF's built-in default graph optimizer.
- The file of interest is usually the last one: Grappler's final output.
- One goal of tfdbg2 is to make this workflow simpler (for both function graphs and Grappler-output graphs).
4. Single-Step Debugging

tf.config.experimental_run_functions_eagerly()

- Overrides graph compilation and runs all ops eagerly, including backprop.
- You can then set breakpoints and step through the code in an IDE, as sketched below.
This API does not work in tf.data.Dataset.map()

- Because Dataset.map() is always compiled before graph execution, whether or not @tf.function is used.
- This means:
  - single-stepping inside a map function is not possible
  - you must use tf.print() instead of print() to output tensor values (see the sketch below)
- Workaround: use tfdbg2
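A minimal sketch of the tf.print()-inside-map point above (the preprocess function and dataset are illustrative, not from the original text):

```python
import tensorflow as tf

def preprocess(x):
    y = tf.cast(x, tf.float32) / 255.0
    print(y)      # runs once at trace time and shows only a symbolic Tensor
    tf.print(y)   # runs for every element and shows the actual values
    return y

ds = tf.data.Dataset.range(3).map(preprocess)
for _ in ds:   # iterate to actually execute the map function
    pass
```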
5. Debugging the High-Level API (tf.keras)

Accessing tf.keras layers

```python
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(5, input_shape=[4], activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.5))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

debug_model = tf.keras.Model(
    inputs=model.inputs,
    outputs=[model.layers[0].output, model.layers[1].output] + model.outputs)

xs = tf.random_normal_initializer()([8, 4])
print(debug_model(xs, training=True))
```
Output

```
[<tf.Tensor: id=103, shape=(8, 5), dtype=float32, numpy=
array([[0.03208053, 0.        , 0.        , 0.09101269, 0.0405516 ],
       [0.06668283, 0.        , 0.05414589, 0.        , 0.06441024],
       [0.        , 0.02470349, 0.0345275 , 0.        , 0.        ],
       [0.02822505, 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.03051471, 0.        ],
       [0.01117405, 0.        , 0.0744615 , 0.07232606, 0.09003952],
       [0.        , 0.03395397, 0.04608804, 0.        , 0.        ],
       [0.        , 0.02972447, 0.00674627, 0.        , 0.        ]], dtype=float32)>, <tf.Tensor: id=116, shape=(8, 5), dtype=float32, numpy=
array([[0.06416105, 0.        , 0.        , 0.        , 0.08110321],
       [0.13336566, 0.        , 0.        , 0.        , 0.12882048],
       [0.        , 0.04940698, 0.069055  , 0.        , 0.        ],
       [0.0564501 , 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.06102942, 0.        ],
       [0.0223481 , 0.        , 0.14892301, 0.14465213, 0.18007904],
       [0.        , 0.        , 0.09217609, 0.        , 0.        ],
       [0.        , 0.05944894, 0.        , 0.        , 0.        ]], dtype=float32)>, <tf.Tensor: id=121, shape=(8, 1), dtype=float32, numpy=
array([[0.51327056],
       [0.52288353],
       [0.49928164],
       [0.5032595 ],
       [0.5143335 ],
       [0.54077065],
       [0.49030966],
       [0.50787127]], dtype=float32)>]
```
- To access a model's internal layers, you can build a new model whose outputs are those layers.
- What if you want to see the gradients inside the layers?
  - tfdbg can help with this (a GradientTape-based alternative is sketched below).
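Before reaching for tfdbg, a minimal sketch of inspecting gradients at an internal layer with tf.GradientTape (the model and data here are illustrative, not the ones from the example above):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(5, input_shape=[4], activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
xs = tf.random.normal([8, 4])
ys = tf.zeros([8, 1])

with tf.GradientTape() as tape:
    hidden = model.layers[0](xs)   # internal layer activations
    tape.watch(hidden)             # track this non-variable tensor
    preds = model.layers[1](hidden)
    loss = tf.keras.losses.binary_crossentropy(ys, preds)

# Gradient of the loss with respect to the hidden activations
print(tape.gradient(loss, hidden))
```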
Debugging a Keras model with the TensorBoard callback

```python
from tensorflow.keras import backend as K

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(5, input_shape=[4], activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.5))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

xs = tf.random_normal_initializer()([8, 4])
ys = tf.zeros([8])
model.fit(xs, ys, epochs=2,
          callbacks=[tf.keras.callbacks.TensorBoard("tb_logdir")])
```
- The tf.keras.callbacks.TensorBoard callback writes logs about the training graph to the logdir, including losses, weights, and so on.
- Edges are labeled with tensor shapes, to the extent that the shapes are known at model-construction time.
6. Numerical Problems (NaN / Infinity)

Common causes of numerical problems (a brief mitigation sketch follows this list):

- Missing value clipping
- Division by zero, log of zero
- Problems in ops
- Exploding gradients
- Bad training examples
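As a minimal sketch of typical mitigations for the first and fourth items (the epsilon and clip-norm values are illustrative, not from the original text):

```python
import tensorflow as tf

x = tf.constant([0.0, 1e-10, 0.5, 1.0])

# Avoid log(0) / division by zero by clipping into a safe range first
safe_log = tf.math.log(tf.clip_by_value(x, 1e-7, 1.0))

# Limit exploding gradients by clipping their global norm before applying them
grads = [tf.constant([1e8, -1e8]), tf.constant([3.0])]
clipped_grads, global_norm = tf.clip_by_global_norm(grads, clip_norm=1.0)

print(safe_log, clipped_grads, global_norm)
```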
Debugging numerical problems with tfdbg2

```python
tf.debugging.enable_check_numerics()

@tf.function
def bad_func(n):
    total = tf.constant(0.0)
    x = tf.constant(10.0)
    i = tf.constant(0, dtype=tf.int32)
    while tf.math.less(i, n):
        total += tf.math.log(x)
        x -= 1.0
        i += 1
    return total

# With n <= 10 there is no error; n = 12 makes the loop reach log(0)
n = tf.constant(12, dtype=tf.int32)
print(bad_func(n))
```
Output

```
InvalidArgumentError:

!!! Detected Infinity or NaN in output 0 of graph op "Log" (# of outputs: 1) !!!
  dtype: <dtype: 'float32'>
  shape: ()

  Input tensor: Tensor("Placeholder:0", shape=(), dtype=float32)
  Graph name: "while_body_13"

  Stack trace of op's creation ("->": inferred user code):
    + ... (Omitted 21 frames)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L2848) run_cell -> | raw_cell, store_history, silent, shell_futures)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L2874) _run_cell -> | return runner(coro)
    + ...hon3.6/site-packages/IPython/core/async_helpers.py (L68) _pseudo_sync_runner -> | coro.send(None)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L3051) run_cell_async -> | interactivity=interactivity, compiler=compiler, result=result)
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L3242) run_ast_nodes -> | if (await self.run_code(code, result, async_=asy)):
    + ...3.6/site-packages/IPython/core/interactiveshell.py (L3319) run_code -> | exec(code_obj, self.user_global_ns, self.user_ns)
    + <ipython-input-3-acc5c4cbe210> (L16) <module> -> | print(bad_func(n))
    + ...kages/tensorflow_core/python/eager/def_function.py (L568) __call__ | result = self._call(*args, **kwds)
    + ...kages/tensorflow_core/python/eager/def_function.py (L615) _call | self._initialize(args, kwds, add_initializers_to=initializers)
    + ...kages/tensorflow_core/python/eager/def_function.py (L497) _initialize | *args, **kwds))
    + ...-packages/tensorflow_core/python/eager/function.py (L2389) _get_concrete_function_internal_garbage_collected | graph_function, _, _ = self._maybe_define_function(args, kwargs)
    + ...-packages/tensorflow_core/python/eager/function.py (L2703) _maybe_define_function | graph_function = self._create_graph_function(args, kwargs)
    + ...-packages/tensorflow_core/python/eager/function.py (L2593) _create_graph_function | capture_by_value=self._capture_by_value),
    + ...ges/tensorflow_core/python/framework/func_graph.py (L978) func_graph_from_py_func | func_outputs = python_func(*func_args, **func_kwargs)
    + ...kages/tensorflow_core/python/eager/def_function.py (L439) wrapped_fn | return weak_wrapped_fn().__wrapped__(*args, **kwds)
    + ...ges/tensorflow_core/python/framework/func_graph.py (L964) wrapper | user_requested=True,
    + <ipython-input-3-acc5c4cbe210> (L8) bad_func -> | while tf.math.less(i, n):
    + ...ow_core/python/autograph/operators/control_flow.py (L746) while_stmt | basic_symbol_names, composite_symbol_names, opts)
    + ...ow_core/python/autograph/operators/control_flow.py (L794) _tf_while_stmt | aug_init_vars, **opts)
    + ...ges/tensorflow_core/python/ops/control_flow_ops.py (L2675) while_loop | back_prop=back_prop)
    + ...te-packages/tensorflow_core/python/ops/while_v2.py (L194) while_loop | add_control_dependencies=add_control_dependencies)
    + ...ges/tensorflow_core/python/framework/func_graph.py (L978) func_graph_from_py_func | func_outputs = python_func(*func_args, **func_kwargs)
    + ...te-packages/tensorflow_core/python/ops/while_v2.py (L172) wrapped_body | outputs = body(*_pack_sequence_as(orig_loop_vars, args))
    + ...ow_core/python/autograph/operators/control_flow.py (L781) aug_body | loop_vars = body(*aug_loop_vars[loop_vars_slice])
    + <ipython-input-3-acc5c4cbe210> (L9) bad_func -> | total += tf.math.log(x)
    + ...ackages/tensorflow_core/python/ops/gen_math_ops.py (L5248) log | "Log", x=x, name=name)
    + ...tensorflow_core/python/framework/op_def_library.py (L742) _apply_op_helper | attrs=attr_protos, op_def=op_def)
    + ...ges/tensorflow_core/python/framework/func_graph.py (L595) _create_op_internal | compute_device)
    + ...e-packages/tensorflow_core/python/framework/ops.py (L3322) _create_op_internal | op_def=op_def)
    + ...e-packages/tensorflow_core/python/framework/ops.py (L1756) __init__ | self._traceback = tf_stack.extract_stack()

: Tensor had Inf values
  [[{{node while/body/_1/Log/CheckNumerics}}]] [Op:__inference_bad_func_58]

Function call stack:
bad_func
```
- enable_check_numerics() is the successor of add_check_numerics_ops() from TF1
- Checks both eagerly executed ops and ops inside graphs
- Works for both the forward and the backward pass
- Works at the API level
- Also works with TF1
- Works on CPU, GPU, and TPU
- Relative overhead
  - 1.29x wall time on CPU and 1.76x on GPU; the overhead is modest
  - Note: 1.0x means no overhead
  - Measured on the model tensorflow_models.official.transformers.v2, task type=training, batch size=64
  - TPU benchmarks will be added later, in collaboration with TensorTracer
7. TensorFlow Debugger (tfdbg)

TensorFlow Debugger (tfdbg) V1

- The predecessor of tfdbg v2, launched in early 2017
- Provides visibility into tf.Session() runs
  - wraps tf.Session()
  - convenient APIs are also available for Keras, Estimator, and slim
- Supports distributed training
- User interface: an interactive, clickable CLI
  - intermediate tensor values and their summary statistics
  - conditional breakpoints, e.g. has_inf_or_nan
  - runtime graph structure (after Grappler and partitioning)
  - op attributes, including the originating stack trace
  - source code view
Why is tfdbg v2 needed?

- TF's new execution paradigm
  - no tf.Session()
  - eager execution + tf.function
- Can print() and tf.print() provide enough debuggability?
  - they help in some cases, but they are not the complete answer
- Generality matters: variety of hardware
- Low performance overhead matters
- Graded invasiveness of debugging
- Frontend UX matters
Important documentation (a usage sketch follows):

- tf.debugging.experimental.enable_dump_debug_info
- TensorFlow Debugger (TFDBG)
- Debugger Dashboard usage guide
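A minimal sketch of the tfdbg2 entry point referenced above (the dump directory and the failing function are illustrative; the parameters follow the public tf.debugging.experimental.enable_dump_debug_info API):

```python
import tensorflow as tf

# Dump debug information for later inspection in TensorBoard's Debugger V2 dashboard
tf.debugging.experimental.enable_dump_debug_info(
    "/tmp/tfdbg2_logdir",              # dump root (hypothetical path)
    tensor_debug_mode="FULL_HEALTH",   # record per-tensor health stats (NaN/Inf counts, etc.)
    circular_buffer_size=-1,           # keep all events instead of a bounded buffer
)

@tf.function
def unstable(x):
    return tf.math.log(x - 1.0)        # produces -inf / nan for x <= 1

unstable(tf.constant([0.5, 1.0, 2.0]))
# Then run: tensorboard --logdir /tmp/tfdbg2_logdir and open the "Debugger V2" tab.
```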