用Tensorflow搭建神经网络tf.data的作用tf.data获取数据的方式

89 阅读 0 评论 59 点赞

我是靠谱客的博主着急长颈鹿，最近开发中收集的这篇文章主要介绍用Tensorflow搭建神经网络tf.data的作用tf.data获取数据的方式，觉得挺不错的，现在分享给大家，希望可以做个参考。

概述

一、神经网络的基本框架

1.层数

（1）input layer 接受信息的神经层，负责传递信息

（2）output layer 是信息在神经元中传递和中转，分析和权衡，输出结果，

（3）hidden layer 负责信息的加工处理

2.神经网络实质：梯度下降 Gradient Descent

Optimation ： Newton's method ,Least Squares method, Gradient Descent

3.cost function 预测值和真实值的差距，回归问题通常是平方差，分类问题通常用交叉熵

cost = (predict - reality)^2

找到梯度最小的点

4.黑盒（隐藏层）不黑

输入层的信息为特征，第一层加工叫代表特征（feature representation），多层加工形成不同的代表特征，一次次特征转换。如果代表特征太多，人类无法理解，但计算机能理解，并学习到它们的规律

5.迁移学习

切掉一个神经网络的输出层，套上另一个神经网络

二、常用的方法

6. Tensorflow 首先要定义神经网络的结构, 然后再把数据放入结构当中去运算和 training.

import tensorflow as tf

import numpy as np

# create data

x_data = np.random.rand(100).astype(np.float32)

y_data = x_data * 0.1 + 0.3

# create tensorflow structure start

Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

biases = tf.Variable(tf.zeros([1]))

y = Weights*x_data + biases

loss = tf.reduce_mean(tf.square(y-y_data))

optimizer = tf.train.GradientDescentOptimizer(0.5)

train = optimizer.minimize(loss)

init = tf.global_variables_initializer()

# create tensorflow structure end

sess = tf.Session()

sess.run(init) # very important

for step in range(201):

sess.run(train)

if step % 20 == 0:

print(step,sess.run(Weights),sess.run(biases))

6.1 会话 Session

凡是变量，要用session.run()激活才有意义

matrix1 = tf.constant([[3,3]])

matrix2 = tf.constant([[2],[2]])

product = tf.matmul(matrix1, matrix2) # matrix multiply np.dot(m1,m2)

'''

会话Session

'''

# # method1

# sess = tf.Session()

# result = sess.run(product)

# print(result)

# sess.close()

# method2

# with tf.Session() as sess:

# result2 = sess.run(product)

# print(result2)

1.记得激活init，定义session并运行run（）

2.激励函数

因为有些函数无法线性化。卷积神经网络函数常用relu，循环神经网络常用relu or tanh

6.2 变量Variable

import tensorflow as tf

matrix1 = tf.constant([[3,3]])

matrix2 = tf.constant([[2],[2]])

product = tf.matmul(matrix1, matrix2) # matrix multiply np.dot(m1,m2)

'''

variable

'''

state = tf.Variable(0,name='counter')

#print(state.name)

one = tf.constant(1)

new_variable = tf.add(state,one)

update = tf.assign(state,new_variable)

init = tf.global_variables_initializer()

with tf.Session() as sess:

sess.run(init)

for i in range(3):

sess.run(update)

print(sess.run(state))

6.3 placeholder

placeholders，顾名思义，就是占位的意思，举个例子：我们定义了一个关于x，y的函数 f(x,y)=2x+y，但是我们并不知道x，y的值，那么x，y就是等待确定的值输入的placeholders。

我们如下定义一个placeholders：

tf.placeholder(dtype, shape=None, name=None)

一个简单的实例如下：

1 a = tf.placeholder(tf.float32, shape=[3])

2 b = tf.constant([5, 5, 5], tf.float32)

3 c =a + b

4 with tf.Session() as sess:

5 print(sess.run(c, feed_dict={a: [1, 2, 3]}))

输出：

[ 6. 7. 8.]

6.4 激励函数：

目的：解决不能用线性方程解决的问题

作用：某一部分的神经元先激活起来 ,把激活的信息传递到后面的神经元

激励函数必须是可微分的，在误差反向传递的时候只些可微分的激励函数才能把误差传递过去

常用的有：relu，sigmoid，tanh，sotfmax

激励函数的选择：

（1）当神经网络的隐藏层只有一两层的时候，随意一种激励函数都可以

（2）在卷积神经网络中使用relu

（3）在循环神经网络中使用relu or tanh

7.例子：建造神经网络

import tensorflow as tf

import numpy as np

from tensorflow.contrib.model_pruning.python.learning import train_step

def add_layer(inputs, in_size, out_size, activation_function=None):

Weights = tf.Variable(tf.random_normal([in_size, out_size]))

biases = tf.Variable(tf.zeros([1, out_size])+0.1)

Wx_plus_b = tf.matmul(inputs, Weights) + biases

if activation_function is None:

outputs = Wx_plus_b

else:

outputs = activation_function(Wx_plus_b)

return outputs

x_data = np.linspace(-1, 1, 300)[:, np.newaxis]

noise = np.random.normal(0, 0.05, x_data.shape)

y_data = np.square(x_data)-0.05 + noise

xs = tf.placeholder(tf.float32, [None, 1]) #only one value is outputed

ys = tf.placeholder(tf.float32, [None, 1])

l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)

prediction = add_layer(l1, 10, 1, activation_function=None)

loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),

reduction_indices=[1])) # reduction_indices= [1]表示按行相加，[0]表示按列相加。# [0,1]表示全部元素相加

train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

init = tf.global_variables_initializer()

sess = tf.Session()

sess.run(init)

for i in range(1000):

sess.run(train_step, feed_dict={xs:x_data, ys:y_data})

if i % 50 == 0:

print(sess.run(loss, feed_dict={xs:x_data, ys:y_data}))

np.linspace

numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)

在指定的间隔内返回均匀间隔的数字。返回num均匀分布的样本，在[start, stop]。这个区间的端点可以任意的被排除在外。

start : scalar(标量)

The starting value of the sequence(序列的起始点).

stop : scalar

序列的结束点，除非endpoint被设置为False，在这种情况下, the sequence consists of all but the last of num + 1 evenly spaced samples(该序列包括所有除了最后的num+1上均匀分布的样本(感觉这样翻译有点坑)), 以致于stop被排除.当endpoint is False的时候注意步长的大小(下面有例子).

num : int, optional(可选)

生成的样本数，默认是50。必须是非负。

endpoint : bool, optional

如果是真，则一定包括stop，如果为False，一定不会有stop

retstep : bool, optional

If True, return (samples, step), where step is the spacing between samples.(看例子)

dtype : dtype, optional

The type of the output array. If dtype is not given, infer the data type from the other input arguments(推断这个输入用例从其他的输入中).

New in version 1.9.0.

samples : ndarray

There are num equally spaced samples in the closed interval [start, stop] or the half-open interval [start, stop) (depending on whether endpoint is True or False).

step : float(只有当retstep设置为真的时候才会存在)

Only returned if retstep is True

Size of spacing between samples.

np.newaxis

'''# np.newaxis equals None，is a nickname of None

# np.newaxis add a newaxis for numpy.ndarray

>> x = np.arange(3)

>> x

array([0, 1, 2])

>> x.shape

(3,)

>> x[:, np.newaxis]

array([[0],

[1],

[2]])

>> x[:, None]

array([[0],

[1],

[2]])

>> x[:, np.newaxis].shape

(3, 1)

'''

tf.placeholder

tf.placeholder(dtype, shape=None, name=None)

此函数可以理解为形参，用于定义过程，在执行的时候再赋具体的值

参数：

dtype：数据类型。常用的是tf.float32,tf.float64等数值类型

shape：数据形状。默认是None，就是一维值，也可以是多维，比如[2,3], [None, 3]表示列是3，行不定

name：名称。

numpy.random.normal

对于numpy.random.normal函数，有三个参数（loc, scale, size），分别l代表生成的高斯分布的随机数的均值、方差以及输出的size.我想让loc和scale分别为(1, 2)的数组，而输出的是一个(2, 2)的数组。也是可行的。

8.加速神经网络训练的方法:对learning rate 的改变

sgd，

momentum

adagrad

rmsprop

adam

9.dropput解决overfitting

10.tensorboard可视化

11.rnn

11.1 tf.argmax(vector, 1)

返回的是vector中的最大值的索引号，如果vector是一个向量，那就返回一个值，如果是一个矩阵，那就返回一个向量，这个向量的每一个维度都是相对应矩阵行的最大值元素的索引号。

11.2 tf.reshape(tensor, shape, name=None)

函数的作用是将tensor变换为参数shape的形式。

其中shape为一个列表形式，特殊的一点是列表中可以存在-1。-1代表的含义是不用我们自己指定这一维的大小，函数会自动计算，但列表中只能存在一个-1。（当然如果存在多个-1，就是一个存在多解的方程了）

好了我想说的重点还有一个就是根据shape如何变换矩阵。其实简单的想就是，

reshape（t, shape） => reshape(t, [-1]) => reshape(t, shape)

首先将矩阵t变为一维矩阵，然后再对矩阵的形式更改就可以了。

官方例子

11.3 tf.nn.rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0, state_is_tuple=True)

n_hidden表示神经元的个数，forget_bias就是LSTM们的忘记系数，如果等于1，就是不会忘记任何信息。如果等于0，就都忘记。state_is_tuple默认就是True，官方建议用True，就是表示返回的状态用一个元祖表示。这个里面存在一个状态初始化函数，就是zero_state（batch_size，dtype）两个参数。batch_size就是输入样本批次的数目，dtype就是数据类型。

11.4 clas