Overview
0. Purpose
This article analyzes experimentally how Keras applies Dropout during training and at test time.
Conclusion: Keras implements inverted dropout, so the Dropout parameter (rate) does not need to be modified at test time.
1. Two implementations of Dropout
There are two common ways to implement dropout.
Classic dropout (rarely used today; this is the variant used in AlexNet):
Training phase (keepProb is the probability of keeping a neuron):
d3 = np.random.rand( a3.shape[0], a3.shape[1] ) < keepProb
a3 = a3 * d3
Test phase: the computed activations must be multiplied by keepProb:
a3 = a3 * keepProb
Inverted dropout (the method in common use today):
Training phase:
d3 = np.random.rand( a3.shape[0], a3.shape[1] ) < keepProb
a3 = a3 * d3
a3 = a3 / keepProb
Test phase: nothing needs to change; the network is used as-is.
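The two variants above can be sketched in NumPy. This is an illustrative toy, not Keras source code; keep_prob, a3, and the array shapes are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
keep_prob = 0.8                         # probability of keeping a unit
a3 = rng.standard_normal((4, 5))        # some layer's activations

# --- Classic dropout (AlexNet style) ---
d3 = rng.random(a3.shape) < keep_prob   # random keep/drop mask
a3_train = a3 * d3                      # training: zero out dropped units
a3_test = a3 * keep_prob                # test: rescale activations by keep_prob

# --- Inverted dropout (the variant this article concludes Keras uses) ---
d3 = rng.random(a3.shape) < keep_prob
a3_train_inv = a3 * d3 / keep_prob      # training: drop AND scale up by 1/keep_prob
a3_test_inv = a3                        # test: identity, nothing to adjust
```

Note that in the inverted variant all the work happens at training time, which is why the test-time path is a plain identity.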
2. Verifying the Dropout implementation experimentally (it can also be checked in the source code)
Idea: train a network containing Dropout, then load the trained model with a modified Dropout rate and check whether predictions on the same data stay the same. To rule out randomness, each prediction is repeated 10 times.
Procedure: run Code 1, then run Code 2, and compare their outputs.
Expected outcomes: if Code 1's results are roughly half of Code 2's, Keras uses the classic (AlexNet-style) dropout; if the two results are approximately equal, Keras uses inverted dropout.
Code 1:
# _*_ coding:utf-8 _*_
import keras
from keras.layers import Dense, Dropout, Input
from keras.optimizers import SGD
import numpy as np
from keras.models import Model, load_model
import tensorflow as tf

## y = 2 * x1 + x2
def generateData():
    X = np.array([[3, 2], [2, 4], [1, 6]])
    y = np.array([[8], [8], [8]])
    return X, y

def Net(rate=0):
    tf.reset_default_graph()
    input_x = Input(shape=(2,))
    x = Dense(units=100, activation='linear')(input_x)
    x = Dropout(rate=rate)(x)
    x = Dense(units=100, activation='linear')(x)
    x = Dense(units=1, activation='linear')(x)
    model = Model(inputs=input_x, outputs=x)
    model.summary()
    return model

def main():
    model_with = Net(rate=0.5)
    model_with.compile(optimizer=SGD(0.001), loss='mse')
    X, y = generateData()
    model_with.fit(X, y, epochs=1000, verbose=0)  # 'nb_epoch' is deprecated; Keras 2 uses 'epochs'
    model_with.save('model.h5')
    for ii in range(10):
        y_with = model_with.predict(X)
        print('model with dropout:{}'.format(y_with))

if __name__ == "__main__":
    main()
Code 2:
#!/usr/bin/env python
# _*_ coding:utf-8 _*_
import keras
from keras.layers import Dense, Dropout, Input
from keras.optimizers import SGD
import numpy as np
from keras.models import Model, load_model
import tensorflow as tf

## y = 2 * x1 + x2
def generateData():
    X = np.array([[3, 2], [2, 4], [1, 6]])
    y = np.array([[8], [8], [8]])
    return X, y

def Net(rate=0):
    tf.reset_default_graph()
    input_x = Input(shape=(2,))
    x = Dense(units=100, activation='linear')(input_x)
    x = Dropout(rate=rate)(x)
    x = Dense(units=100, activation='linear')(x)
    x = Dense(units=1, activation='linear')(x)
    model = Model(inputs=input_x, outputs=x)
    model.summary()
    return model

def main():
    X, y = generateData()
    model_without = Net(rate=0)
    model_without.load_weights('model.h5', by_name=True)
    # model_without = load_model('model.h5')
    for ii in range(10):
        y_without = model_without.predict(X)
        print('model without dropout: {}'.format(y_without))

if __name__ == "__main__":
    main()
3. Experimental results
Output of Code 1:
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
model with dropout:[[8.249627]
[8.171895]
[8.094164]]
Output of Code 2:
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
model without dropout: [[8.249627]
[8.171895]
[8.094164]]
4. Conclusion
The outputs of Code 1 (rate=0.5) and Code 2 (rate=0) are identical, which shows that Keras implements Dropout as inverted dropout.
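Why inverted dropout needs no test-time correction can also be seen from expectations: dividing by the keep probability during training makes the expected activation equal the raw activation. A small Monte Carlo sketch (the activation value 3.0 and keep probability 0.5 are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)
keep_prob = 0.5
a = 3.0                          # a single raw activation value
n = 200_000                      # number of Monte Carlo samples

masks = rng.random(n) < keep_prob
out = a * masks / keep_prob      # inverted dropout applied n independent times
print(out.mean())                # close to 3.0, i.e. the raw activation
```

Since the training-time output already matches the raw activation in expectation, using the unmodified activations at test time is consistent, which is exactly what the experiment observed.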
[Reference]
https://github.com/keras-team/keras/issues/5357