Overview
Following the official TensorFlow tutorial Deep MNIST for Experts, this post implements handwritten-digit recognition with a CNN on the MNIST dataset. The network stacks two convolution + max-pooling layers, a fully connected layer with dropout, and a softmax readout.
# load MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("Mnist_data/", one_hot=True)
# start a TensorFlow InteractiveSession
import tensorflow as tf
sess = tf.InteractiveSession()
# weight initialization
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
# convolution
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
# pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
# Create the model
# placeholder
x = tf.placeholder("float", [None, 784])
y_ = tf.placeholder("float", [None, 10])
# variables (W, b, y below are the plain softmax-regression model from the
# tutorial's first part; they are not actually used by the CNN defined next)
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
# first convolutional layer
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
# second convolutional layer
w_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
# densely connected layer
w_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
# dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
# readout layer
w_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2)
# train and evaluate the model
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
# note: the official tutorial uses tf.train.AdamOptimizer(1e-4) here
train_step = tf.train.AdagradOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, train accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print("test accuracy %g" % accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
Training took about 12 minutes, with the following results:
…
step 19300, train accuracy 0.94
step 19400, train accuracy 0.92
step 19500, train accuracy 0.86
step 19600, train accuracy 0.98
step 19700, train accuracy 0.94
step 19800, train accuracy 0.96
step 19900, train accuracy 0.94
Then it threw an error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10000,28,28,32]
This means the run exhausted memory (OOM, out of memory — not a memory leak) when evaluating all 10,000 test images at once. Not knowing the cause at the time, I shrank the test set to 2,000 images and tried again:
print "test accuracy %g" % accuracy.eval(feed_dict={x:mnist.test.images[:2000], y_:mnist.test.labels[:2000], keep_prob:1.0})
Output: test accuracy 0.9155.
That falls well short of the 99.2% accuracy reported in the official tutorial.
Following http://blog.csdn.net/yhl_leo/article/details/50624471, I switched the optimizer from Adagrad to gradient descent and ran the test again.
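For reference, the swap is a one-line change (a sketch; the learning rate below simply carries over the original 1e-4, and the referenced post may tune it differently):

# replace the Adagrad optimizer with plain gradient descent
# (1e-4 is reused from the original script, not a tuned value)
train_step = tf.train.GradientDescentOptimizer(1e-4).minimize(cross_entropy)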
step 19200, train accuracy 1
step 19300, train accuracy 1
step 19400, train accuracy 1
step 19500, train accuracy 1
step 19600, train accuracy 1
step 19700, train accuracy 1
step 19800, train accuracy 1
step 19900, train accuracy 1
test accuracy 0.9895
Accuracy 98.95%, with 2,000 test samples.
Addendum:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10000,28,28,32]
is caused by running out of GPU memory during testing: the shape [10000, 28, 28, 32] is the first convolutional layer's output for the entire 10,000-image test batch, which alone takes 10000 × 28 × 28 × 32 × 4 bytes ≈ 1 GB. The fix is to split the test set into several batches, evaluate each, and average the accuracy at the end. As follows:
accuracy_sum = tf.reduce_sum(tf.cast(correct_prediction, tf.float32))
good = 0
total = 0
# 200 batches x 50 images covers the full 10,000-image test set
# (the original snippet used xrange(10), which only tests 500 images)
for i in xrange(200):
    testSet = mnist.test.next_batch(50)
    good += accuracy_sum.eval(feed_dict={x: testSet[0], y_: testSet[1], keep_prob: 1.0})
    total += testSet[0].shape[0]
print("test accuracy %g" % (good / total))
References:
1. http://stackoverflow.com/questions/39076388/tensorflow-deep-mnist-resource-exhausted-oom-when-allocating-tensor-with-shape
2. https://github.com/tensorflow/tensorflow/pull/157
Addendum 2
MNIST has 55,000 training images, but the run above used 20,000 training iterations (the post calls them epochs, though each iteration processes only one batch) with a batch size of 50, so 20,000 × 50 = 1,000,000 images pass through training — each image about 18 times on average. Does training this long improve accuracy, or does it cause overfitting?
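The arithmetic behind that estimate, as a quick sanity check:

# how many passes over the training data do 20,000 steps make?
train_size = 55000                 # MNIST training images in the TF split
steps, batch_size = 20000, 50
images_seen = steps * batch_size   # 1,000,000
print(images_seen / float(train_size))   # ~18.2 average passes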
Changing the iteration count from 20,000 to 2,000: 97.8% accuracy on 5,000 test images.
Changing it from 20,000 to 10,000: 98.76% on 5,000 test images.
Changing it from 20,000 to 25,000: 98.78% on 5,000 test images.
All three runs test lower than the 20,000-iteration result.