TensorFlow Notes: Common Operation Functions Explained


This post comes from 老迟到鞋子, a blogger on 靠谱客. It covers commonly used TensorFlow operation functions, collected during recent development work and shared here as a reference.

Overview

1. Matrix multiplication

tf.matmul(a, b, transpose_a=False, transpose_b=False, adjoint_a=False, adjoint_b=False, a_is_sparse=False, b_is_sparse=False, name=None)

This function implements matrix multiplication in the mathematical sense. The simplest two-dimensional example:

import numpy as np
import tensorflow as tf

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

prod = tf.matmul(a, b)

Out:
[[19 22]
 [43 50]]

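If a and b are both tensors of rank 3 or higher, every dimension except the last two is treated as a batch dimension that counts matrices. For example, with a.shape=[4, 2, 3] and b.shape=[4, 3, 2], the call multiplies four 2×3 matrices by four 3×2 matrices, pair by pair, and produces four 2×2 matrices.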
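To make the batch rule concrete, here is a minimal pure-Python sketch of the same arithmetic. The helper names matmul_2d and batch_matmul are made up for illustration and are not part of TensorFlow; the sketch only covers the rank-3 case with equal batch sizes.

```python
def matmul_2d(a, b):
    """Plain 2-D matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def batch_matmul(a, b):
    """Multiply matrices pairwise along the leading (batch) dimension."""
    return [matmul_2d(m, n) for m, n in zip(a, b)]


# Four 2x3 matrices and four 3x2 matrices (shapes [4, 2, 3] and [4, 3, 2]).
a = [[[1, 2, 3], [4, 5, 6]] for _ in range(4)]
b = [[[1, 0], [0, 1], [1, 1]] for _ in range(4)]

out = batch_matmul(a, b)
shape = (len(out), len(out[0]), len(out[0][0]))
print(shape)   # (4, 2, 2)
print(out[0])  # [[4, 5], [10, 11]]
```

Each of the four pairs is multiplied independently, which is why the leading dimension passes through unchanged.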

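The transpose_* and adjoint_* parameters also deserve a note. The former transposes the matrix before multiplying, and the latter takes its conjugate transpose before multiplying. The two cannot both be True for the same operand (transpose_a with adjoint_a, or transpose_b with adjoint_b); doing so raises an error.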

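Conjugate transpose:
First, a quick refresher on complex conjugates: two complex numbers with the same real part and imaginary parts of opposite sign, such as 2+3i and 2-3i. The conjugate transpose conjugates every element of the matrix and then transposes the result.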

$\begin{bmatrix} 2+3i & 4 \\ 5 & 6-7i \end{bmatrix} \rightarrow \begin{bmatrix} 2-3i & 5 \\ 4 & 6+7i \end{bmatrix}$

For a real-valued matrix, the conjugate transpose is the same as the ordinary transpose.
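The definition can be checked with Python's built-in complex numbers. This is only a sketch of the math; conjugate_transpose is a hypothetical helper, not a TensorFlow function.

```python
def conjugate_transpose(m):
    """Conjugate every element, then swap rows and columns."""
    return [[complex(v).conjugate() for v in col] for col in zip(*m)]


a = [[2 + 3j, 4], [5, 6 - 7j]]
print(conjugate_transpose(a))

# For a real matrix the result equals the plain transpose.
real = [[1, 2], [3, 4]]
print(conjugate_transpose(real) == [[1, 3], [2, 4]])  # True
```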

2. Masking

tf.sequence_mask(lengths, maxlen=None, dtype=tf.bool, name=None)

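This function is very handy when a batch holds sentences of different lengths: it marks which positions belong to each sentence.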


# [['OK'],
#  ['I', 'love', 'you'],
#  ['Damn', 'it']]

tf.sequence_mask([1, 3, 2], 5)

Out:
[[True, False, False, False, False],
 [True, True, True, False, False],
 [True, True, False, False, False]]

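[1, 3, 2] means there are three sentences: the first has 1 word, the second has 3, and the third has 2. The 5 sets the maximum sentence length to 5. maxlen is optional and defaults to the largest value in lengths.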

tf.sequence_mask([[1, 3],[2,0]])

Out:
[[[True, False, False],
  [True, True, True]],
 [[True, True, False],
  [False, False, False]]]
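Both outputs above follow one rule: position j along the new last axis is True exactly when j is smaller than the corresponding length. A small pure-Python sketch of that rule, assuming nested lists of non-negative integers (sequence_mask_sketch is an illustrative name, not the TensorFlow implementation):

```python
def sequence_mask_sketch(lengths, maxlen=None):
    """Build a boolean mask with one extra trailing axis of size maxlen."""
    def flatten(x):
        if isinstance(x, list):
            return [v for item in x for v in flatten(item)]
        return [x]

    if maxlen is None:
        # Default: the largest length found anywhere in the input.
        maxlen = max(flatten(lengths))

    def build(x):
        if isinstance(x, list):
            return [build(item) for item in x]
        return [j < x for j in range(maxlen)]

    return build(lengths)


print(sequence_mask_sketch([1, 3, 2], 5))
print(sequence_mask_sketch([[1, 3], [2, 0]]))
```

In the second call maxlen is not given, so it falls back to 3, the largest entry in lengths.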

3. Slicing

tf.slice(input_, begin, size, name=None)
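This one is straightforward: it extracts a block of shape size from input_, starting at the offsets given in begin. A fuller write-up is left for later.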
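As a rough illustration of the begin/size convention, here is a pure-Python sketch on nested lists. slice_sketch is a hypothetical helper; it assumes non-negative begin values and treats a size of -1 as "everything that remains in this dimension".

```python
def slice_sketch(data, begin, size):
    """Take size[i] elements starting at begin[i] in every dimension."""
    if not begin:
        return data
    start, count = begin[0], size[0]
    stop = len(data) if count == -1 else start + count
    return [slice_sketch(item, begin[1:], size[1:])
            for item in data[start:stop]]


data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

print(slice_sketch(data, [1, 0], [2, 2]))   # [[4, 5], [7, 8]]
print(slice_sketch(data, [0, 1], [1, -1]))  # [[2, 3]]
```

The second argument is a starting offset and the third is a count, not an end index.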

tf.strided_slice(input_, begin, end, strides=None, begin_mask=0, end_mask=0, ellipsis_mask=0, new_axis_mask=0, shrink_axis_mask=0, var=None, name=None)

Three parameters matter most: input_, begin, and end.

Suppose the input is:

input = [[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]],
         [[2, 3, 4],
          [5, 6, 7],
          [8, 9, 1]],
         [[3, 4, 5],
          [6, 7, 8],
          [9, 1, 2]]]

Take tf.strided_slice(input, [0, 0, 0], [3, 2, 2]) as an example. It selects indices [0, 3) along the first dimension, [0, 2) along the second, and [0, 2) along the third, giving:

[[[1 2]
  [4 5]]
 [[2 3]
  [5 6]]
 [[3 4]
  [6 7]]]

[Figure: the difference between the three dimensions]

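The strides parameter sets the step size, which defaults to 1 in every dimension. In tf.strided_slice(input, [0, 0, 0], [3, 2, 2], [2, 1, 1]), for example, the step along the first dimension is 2, so indices 0 and 2 are selected there, and the result is: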

[[[1 2]
  [4 5]]
 [[3 4]
  [6 7]]]
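The begin/end/strides behavior maps directly onto Python's own start:stop:step slicing, applied once per dimension. The sketch below reproduces both results on nested lists. strided_slice_sketch is an illustrative helper that assumes non-negative indices and positive strides, and it ignores the mask arguments entirely.

```python
def strided_slice_sketch(data, begin, end, strides=None):
    """Apply data[begin:end:stride] along each dimension in turn."""
    if strides is None:
        strides = [1] * len(begin)
    if not begin:
        return data
    picked = data[begin[0]:end[0]:strides[0]]
    return [strided_slice_sketch(item, begin[1:], end[1:], strides[1:])
            for item in picked]


data = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
        [[2, 3, 4], [5, 6, 7], [8, 9, 1]],
        [[3, 4, 5], [6, 7, 8], [9, 1, 2]]]

print(strided_slice_sketch(data, [0, 0, 0], [3, 2, 2]))
print(strided_slice_sketch(data, [0, 0, 0], [3, 2, 2], [2, 1, 1]))
```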

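4. softmax
tf.nn.softmax is typically used in classification tasks to turn scores into probabilities.
For example, take a batch of 5 examples in a 3-class problem (batch_size=5). The fully connected layers output logits, which softmax then turns into probabilities: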

logits = tf.constant([[1, 3, 6], [2, 3, 5], [5, 2, 3], [3, 4, 3], [4, 2, 4]], dtype=tf.float32)
probabilities = tf.nn.softmax(logits, axis=-1)
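To see what the transformation does to these logits, here is a pure-Python sketch of softmax along the last axis. It is meant to show the formula, not TensorFlow's implementation; subtracting the row maximum before exponentiating is a common numerical-stability step and an assumption of this sketch.

```python
import math


def softmax(row):
    """exp(x_i) / sum_j exp(x_j), shifted by max(row) for stability."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


logits = [[1, 3, 6], [2, 3, 5], [5, 2, 3], [3, 4, 3], [4, 2, 4]]
probabilities = [softmax(row) for row in logits]

for row in probabilities:
    print([round(p, 4) for p in row], "sum =", round(sum(row), 4))
```

Every row sums to 1, and the largest logit in a row gets the largest probability.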


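5. log_softmax
log_softmax applies a logarithm after the softmax operation. Because every value going into the log lies in [0, 1], every output is less than or equal to zero.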

log_softmax is commonly used to compute the cross-entropy loss for multi-class classification:

labels = tf.constant([2, 2, 0, 1, 0], dtype=tf.int32)
logits = tf.constant([[1, 3, 6], [2, 3, 5], [5, 2, 3], [3, 4, 3], [4, 2, 4]], dtype=tf.float32)

one_hot_labels = tf.one_hot(labels, depth=3, dtype=tf.float32)
log_probs = tf.nn.log_softmax(logits, axis=-1)
batch_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
loss = tf.reduce_mean(batch_loss)

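An equivalent alternative (in TensorFlow 2.x the same op is available as tf.nn.softmax_cross_entropy_with_logits):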

labels = tf.constant([2, 2, 0, 1, 0], dtype=tf.int32)
logits = tf.constant([[1, 3, 6], [2, 3, 5], [5, 2, 3], [3, 4, 3], [4, 2, 4]], dtype=tf.float32)

one_hot_labels = tf.one_hot(labels, depth=3, dtype=tf.float32)
batch_loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=one_hot_labels, logits=logits)
loss = tf.reduce_mean(batch_loss)
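Both versions compute the same thing: for each example, the negative log-probability of the true class. The pure-Python sketch below spells that out with the same labels and logits. The helper log_softmax here is a stand-in written for illustration, not the TensorFlow op.

```python
import math


def log_softmax(row):
    """x_i - log(sum_j exp(x_j)), computed with a max shift for stability."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_sum for v in row]


labels = [2, 2, 0, 1, 0]
logits = [[1, 3, 6], [2, 3, 5], [5, 2, 3], [3, 4, 3], [4, 2, 4]]
num_classes = 3

log_probs = [log_softmax(row) for row in logits]

# Way 1: multiply one-hot labels by log-probabilities and sum over classes.
one_hot = [[1.0 if c == label else 0.0 for c in range(num_classes)]
           for label in labels]
batch_loss_a = [-sum(t * lp for t, lp in zip(t_row, lp_row))
                for t_row, lp_row in zip(one_hot, log_probs)]

# Way 2: pick out the log-probability of the true class directly.
batch_loss_b = [-lp_row[label] for lp_row, label in zip(log_probs, labels)]

loss = sum(batch_loss_a) / len(batch_loss_a)
print(batch_loss_a)
print(loss)
```

The one-hot multiplication zeroes out every class except the true one, which is why the two per-example losses agree.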