Overview
Contents
- 1. Introduction
- 2. Common Layers
- 3. Scoping Mechanism (arg_scope)
- 4. Two Operators (repeat and stack)
- 5. Maintaining and Using Losses in TensorFlow-Slim
- 6. Fine-Tuning Existing Models
- 7. References
1. Introduction
- Developing TensorFlow programs with Slim improves readability and maintainability, simplifies hyperparameter tuning, and makes models reusable. Slim also packages common computer-vision models (such as VGG, Inception, and ResNet), makes complex models easy to extend, and lets you start training from the checkpoints of existing models. Commonly used Slim API components:
- arg_scope: operations inside the scope share the same argument settings
- layers: high-level definitions of neural-network layers
- nets: definitions of common network models such as VGG, Inception, and ResNet
import tensorflow.contrib.slim.nets as nets
vgg = nets.vgg
- regularizers: a standardized API for weight regularization
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)
- How to import:
import tensorflow.contrib.slim as slim
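As a quick illustration of the readability gain, here is a minimal sketch (the input placeholder and scope names are assumptions for illustration) of the same 3x3, 64-channel conv layer written by hand and with Slim:
import tensorflow as tf
import tensorflow.contrib.slim as slim

inputs = tf.placeholder(tf.float32, [None, 32, 32, 3])  # assumed NHWC input

# Raw TensorFlow: variables, convolution, bias and activation spelled out by hand.
with tf.variable_scope('conv1_raw'):
    weights = tf.get_variable('weights', [3, 3, 3, 64],
                              initializer=tf.contrib.layers.xavier_initializer())
    biases = tf.get_variable('biases', [64], initializer=tf.zeros_initializer())
    conv = tf.nn.conv2d(inputs, weights, strides=[1, 1, 1, 1], padding='SAME')
    net = tf.nn.relu(tf.nn.bias_add(conv, biases))

# TF-Slim: the same layer (Xavier weights, zero biases, ReLU) in one call.
net = slim.conv2d(inputs, 64, [3, 3], scope='conv1_slim')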
2. Common Layers
Arguments of conv2d:
- input tensor (NHWC), number of output channels, kernel size, stride (default 1), padding mode (default 'SAME')
- activation function (default ReLU) and variable scope
- weight and bias initializers (defaults: Xavier and zeros) and regularizers
- batch normalization and its parameters (optional)
# Adds a 2-D convolution followed by an optional batch_norm layer.
# Performs atrous convolution with input stride/dilation rate equal to `rate`
def conv2d(inputs,
num_outputs,
kernel_size,
stride=1,
padding='SAME',
activation_fn=nn.relu,
scope=None,
weights_initializer=initializers.xavier_initializer(),
weights_regularizer=None,
biases_initializer=init_ops.zeros_initializer(),
biases_regularizer=None,
normalizer_fn=None,
normalizer_params=None,
trainable=True,
data_format=None, # defaults to 'NHWC'
rate=1,
reuse=None,
variables_collections=None,
outputs_collections=None)
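A minimal usage sketch (input shape assumed for illustration): only the first three arguments are required, everything else falls back to the defaults above:
import tensorflow as tf
import tensorflow.contrib.slim as slim

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])  # NHWC
# Stride 1, 'SAME' padding, ReLU, Xavier weights and zero biases by default.
net = slim.conv2d(inputs, 64, [3, 3], scope='conv1')
# Defaults can be overridden positionally or by keyword.
net = slim.conv2d(net, 128, [3, 3], stride=2, padding='VALID', scope='conv2')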
Arguments of separable_conv2d:
- input tensor (NHWC), kernel size, stride (default 1), padding mode (default 'SAME')
- number of output channels: if not None, a pointwise convolution with kernel size and stride 1 is appended
- activation function (default ReLU) and variable scope
- weight and bias initializers (defaults: Xavier and zeros) and regularizers
- batch normalization and its parameters (optional)
# Adds a 2-D separable convolution followed by an optional batch_norm layer.
def separable_conv2d(
inputs,
num_outputs,
kernel_size,
depth_multiplier=1,
stride=1,
padding='SAME',
activation_fn=nn.relu,
scope=None,
normalizer_fn=None,
normalizer_params=None,
trainable=True,
weights_initializer=initializers.xavier_initializer(),
pointwise_initializer=None,
weights_regularizer=None,
biases_initializer=init_ops.zeros_initializer(),
biases_regularizer=None,
data_format=DATA_FORMAT_NHWC,
rate=1,
reuse=None,
variables_collections=None,
outputs_collections=None)
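A usage sketch (shapes assumed for illustration) showing the two modes of num_outputs:
import tensorflow as tf
import tensorflow.contrib.slim as slim

inputs = tf.placeholder(tf.float32, [None, 56, 56, 32])
# num_outputs=64: depthwise 3x3 convolution followed by a 1x1 pointwise
# convolution that mixes channels (the depthwise-separable pattern).
net = slim.separable_conv2d(inputs, 64, [3, 3], depth_multiplier=1,
                            scope='sep_conv1')
# num_outputs=None: depthwise step only; the output keeps
# 32 * depth_multiplier channels.
net_dw = slim.separable_conv2d(inputs, None, [3, 3], depth_multiplier=1,
                               scope='dw_conv1')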
Arguments of fully_connected:
- input tensor (NC), number of output units
- activation function (default ReLU) and variable scope
- weight and bias initializers (defaults: Xavier and zeros) and regularizers
- batch normalization and its parameters (optional)
def fully_connected(inputs,
num_outputs,
# Explicitly set it to None to skip it and maintain a linear activation.
activation_fn=nn.relu,
scope=None,
weights_initializer=initializers.xavier_initializer(),
weights_regularizer=None,
biases_initializer=init_ops.zeros_initializer(),
biases_regularizer=None,
normalizer_fn=None,
normalizer_params=None,
trainable=True,
reuse=None,
variables_collections=None,
outputs_collections=None)
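A usage sketch (shapes assumed): a 4-D feature map must be flattened to (N, C) first, and activation_fn=None on the last layer keeps a linear output:
import tensorflow as tf
import tensorflow.contrib.slim as slim

feature_map = tf.placeholder(tf.float32, [None, 7, 7, 512])
net = slim.flatten(feature_map)  # -> shape (N, 7 * 7 * 512)
net = slim.fully_connected(net, 4096, scope='fc6')
logits = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')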
Arguments of max_pool2d:
- input tensor (NHWC), pooling window size, stride (default 2), padding mode (default 'VALID')
- variable scope
def max_pool2d(inputs,
kernel_size,
stride=2,
padding='VALID',
data_format=DATA_FORMAT_NHWC,
outputs_collections=None,
scope=None)
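A usage sketch: with the defaults (stride 2, 'VALID' padding) a 2x2 window halves the spatial dimensions:
import tensorflow as tf
import tensorflow.contrib.slim as slim

net = tf.placeholder(tf.float32, [None, 112, 112, 64])
net = slim.max_pool2d(net, [2, 2], scope='pool1')  # -> (None, 56, 56, 64)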
Arguments of batch_norm:
- input tensor (NHWC): the normalization is over all but the last dimension
- decay: decay for the moving averages
- center (shift): default True; adds the offset beta to the normalized tensor
- scale: default False, so gamma is not used; when the next layer is linear (or e.g. nn.relu), this can be disabled, since the scaling can be done by the next layer
- activation_fn: default None to skip it and maintain a linear activation
- is_training: whether or not the layer is in training mode
def batch_norm(inputs,
decay=0.999,
center=True,
scale=False,
epsilon=0.001,
activation_fn=None,
is_training=True,
param_initializers=None,
param_regularizers=None,
updates_collections=ops.GraphKeys.UPDATE_OPS,
reuse=None,
variables_collections=None,
outputs_collections=None,
trainable=True,
batch_weights=None,
fused=None,
data_format=DATA_FORMAT_NHWC,
zero_debias_moving_mean=False,
scope=None,
renorm=False,
renorm_clipping=None,
renorm_decay=0.99,
adjustment=None)
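In practice batch_norm is usually attached to a layer through the normalizer_fn argument rather than called directly. A sketch (the loss is a stand-in for illustration) that also shows the moving-average update ops, which are collected in tf.GraphKeys.UPDATE_OPS and must run alongside the train step:
import tensorflow as tf
import tensorflow.contrib.slim as slim

inputs = tf.placeholder(tf.float32, [None, 32, 32, 3])
is_training = tf.placeholder(tf.bool)

# Attach batch_norm to conv2d via normalizer_fn; is_training must be threaded
# through so training uses batch statistics and inference uses moving averages.
net = slim.conv2d(inputs, 64, [3, 3],
                  normalizer_fn=slim.batch_norm,
                  normalizer_params={'is_training': is_training},
                  scope='conv1')

# The moving mean/variance updates are collected in UPDATE_OPS and are not
# run automatically; add them as a dependency of the train op.
loss = tf.reduce_mean(net)  # stand-in loss, just for illustration
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)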
Arguments of dropout:
- inputs: the tensor to pass to the nn.dropout op
- keep_prob: a scalar Tensor with the same type as x; the probability that each element is kept
- is_training: a bool Tensor indicating whether the model is in training mode; if so, dropout is applied and values scaled, otherwise inputs is returned unchanged
# With probability keep_prob, outputs the input element scaled up by 1 / keep_prob, otherwise outputs 0.
# The scaling is so that the expected sum is unchanged.
def dropout(inputs,
keep_prob=0.5,
noise_shape=None,
is_training=True,
outputs_collections=None,
scope=None,
seed=None)
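A usage sketch: driving is_training with a bool placeholder lets the same graph serve training (dropout active, kept values scaled by 1/keep_prob) and evaluation (identity):
import tensorflow as tf
import tensorflow.contrib.slim as slim

net = tf.placeholder(tf.float32, [None, 4096])
is_training = tf.placeholder(tf.bool)
net = slim.dropout(net, keep_prob=0.5, is_training=is_training,
                   scope='dropout6')
# Feed is_training=False at eval time and the inputs pass through unchanged.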
3. Scoping Mechanism (arg_scope)
- arg_scope lets the user name one or more operations and a set of arguments that are then passed to every one of those operations inside the scope
- argument values set in an arg_scope can be overridden locally
# In the first arg_scope, the conv and fully-connected layers share the same
# weight initializer and weight regularizer; in the second arg_scope, the
# extra arguments apply only to conv2d.
with slim.arg_scope([slim.conv2d, slim.fully_connected],
activation_fn=tf.nn.relu,
weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
weights_regularizer=slim.l2_regularizer(0.0005)):
with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
net = slim.conv2d(net, 256, [5, 5],
weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
scope='conv2')
net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')
4. Two Operators (repeat and stack)
- slim.repeat: call the same operation repeatedly with the same arguments
- slim.stack: call the same operation repeatedly with different arguments
# Example 1
net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
# slim.repeat calls the same operation repeatedly with the same arguments.
# It automatically names the conv layers' scopes 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3'.
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
# Example 2
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')
# slim.stack calls the same operation repeatedly with different arguments
# (slim.fully_connected is called three times, condensing the stack of fully connected layers above).
# It automatically names the layers' scopes 'fc/fc_1', 'fc/fc_2' and 'fc/fc_3'.
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
# Example 3
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')
# slim.stack calls the same operation repeatedly with different arguments
# (slim.conv2d is called four times, condensing the stack of conv layers above).
x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
5. Maintaining and Using Losses in TensorFlow-Slim
- When you create a loss through TF-Slim, TF-Slim adds it to a special TensorFlow collection of loss functions; this lets you either manage all losses by hand or let TF-Slim manage them for you
- What if you want TF-Slim to manage the losses but have a loss you implemented yourself?
  - loss_ops.py also provides a function, add_loss, that adds your custom loss to TF-Slim's collection
# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...
# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)
# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.
# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss
# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss(add_regularization_losses=True)
6. Fine-Tuning Existing Models
# 1. Restoring Variables (all/partial) from a Checkpoint
-------------------------------------------------------
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()
# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
# Restore variables from disk.
restorer.restore(sess, "/tmp/model.ckpt")
print("Model restored.")
# Do some work with the model
...
# 2. Partially Restoring Models
-------------------------------------------------------
# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...
# Get list of variables to restore (which contains only 'v2'); note the usage of exclude and include below.
# These are all equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])
# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)
with tf.Session() as sess:
# Restore variables from disk.
restorer.restore(sess, "/tmp/model.ckpt")
print("Model restored.")
# Do some work with the model
...
# 3. Restoring models with different variable names
-------------------------------------------------------
# restore a model from a checkpoint whose variables have different names to those in the current graph
# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
return 'vgg16/' + var.op.name
# Assuming that 'conv1/weights' and 'conv1/bias' should be restored from 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):
if "weights" in var.op.name:
return var.op.name.replace("weights", "params1")
if "bias" in var.op.name:
return var.op.name.replace("bias", "params2")
variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var):var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)
with tf.Session() as sess:
# Restore variables from disk.
restorer.restore(sess, "/tmp/model.ckpt")
# 4. Initialize our new model using the values of the pre-trained model, excluding the final layer
--------------------------------------------------------------------------------------------
# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)
# Create the model
predictions = vgg.vgg_16(images)
train_op = slim.learning.create_train_op(...)
# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'
# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'
# Restore only the convolutional layers (exclude the fully connected layers).
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)
# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
7. References
1. https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim
2. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/layers/python/layers/layers.py
3. https://github.com/tensorflow/models/tree/master/research/slim/nets
4. TensorFlow-Slim's list of released pre-trained models
5. A Chinese translation of the TensorFlow-Slim tutorial