(2) TensorFlow, a bit deeper: solving linear equations in several variables

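I'm 乐观嚓茶, a blogger on 靠谱客. This post covers (2) TensorFlow, a bit deeper: solving linear equations in several variables. I'm sharing it here in the hope that it serves as a useful reference.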

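This post builds on the previous one, which solved a linear equation in two variables. If you got that example working and read it alongside the official documentation and other bloggers' material, I think you already have a foot in the door. Can that example be adapted to solve more complex problems? Consider the next one: some value depends on N parameters, but we do not know the weight of each parameter, and we want to find those weights with the TensorFlow we have just learned.

First, build a model that represents N groups of data. To be concrete, we solve for 5 variables and generate 10 data sets. A natural choice is a [10,5] matrix t_x for the inputs, a [5,1] matrix t_w for the parameter weights, and a [10,1] matrix t_y for the results, so that t_y = t_x * t_w, where * is the matrix product. To keep the code general, the number of variables and the number of data sets are held in constants. In numpy, the matrix product is computed with the dot function: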

import numpy as np
import tensorflow as tf

test_count = 10         # number of data sets
param_count = 5         # number of variables
t_x = np.floor(1000 * np.random.random([test_count, param_count])).astype(np.float32)

# the weights we want to recover
t_w = np.floor(1000 * np.random.random([param_count, 1])).astype(np.float32)

# compute t_y from the formula t_y = t_x * t_w
t_y = t_x.dot(t_w)

print(t_x)
print(t_w)
print(t_y)
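
Because t_y comes from an exact matrix product, the weights can also be recovered in closed form, which gives a reference answer to compare the training result against. The sketch below is not part of the original code. It assumes float64 data and a fixed seed for repeatability, and it uses np.linalg.lstsq as the solver:

```python
import numpy as np

rng = np.random.RandomState(0)            # fixed seed (assumption) so the run is repeatable
test_count, param_count = 10, 5

# Same construction as in the post: integer-valued inputs and weights in [0, 1000).
# float64 is used here (assumption) to keep round-off error small.
t_x = np.floor(1000 * rng.random_sample([test_count, param_count]))
t_w = np.floor(1000 * rng.random_sample([param_count, 1]))
t_y = t_x.dot(t_w)                        # [10,5] . [5,1] -> [10,1]

# Closed-form least-squares solution of t_x . w = t_y
w_ls, residuals, rank, singular_values = np.linalg.lstsq(t_x, t_y, rcond=None)

print(t_y.shape)
print(np.round(w_ls).ravel())             # recovered weights, rounded
print(t_w.ravel())                        # true weights
```

With 10 equations and 5 unknowns the system is overdetermined but consistent, so the least-squares solution should match t_w up to floating-point error.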

As in the previous example, the training inputs x and y are defined as TensorFlow placeholders; the matrix size is set with the shape parameter:

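# x is the input and corresponds to t_x; it is supplied from outside during training, so it is a placeholder
x = tf.placeholder(tf.float32, shape=[test_count, param_count])
# y is the expected output and corresponds to t_y
y = tf.placeholder(tf.float32, shape=[test_count, 1])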

The result w is defined as a TensorFlow variable:

# w holds the weight of each parameter; it is the target output and corresponds to t_w
w = tf.Variable(np.zeros([param_count, 1], dtype=np.float32), dtype=tf.float32)

Define the TensorFlow computed result curr_y, the loss function loss, and the training method:

curr_y = tf.matmul(x, w)                         # model output
loss = tf.reduce_sum(tf.square(y - curr_y))      # loss: sum of squared differences between the model output and the training output
optimizer = tf.train.GradientDescentOptimizer(0.00000001)
train = optimizer.minimize(loss)                 # training drives the loss toward its minimum
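
The learning rate has to be this small because the inputs are large. For a sum-of-squares loss the Hessian is the constant matrix 2 * x^T x, and plain gradient descent on a quadratic is only stable when the learning rate is below 2 divided by its largest eigenvalue. The NumPy sketch below estimates that limit. It is an illustration under assumptions (fixed seed, float64), not something the original code computes:

```python
import numpy as np

rng = np.random.RandomState(0)                        # fixed seed (assumption)
t_x = np.floor(1000 * rng.random_sample([10, 5]))     # same shape and range as the post

# For loss = sum((x.w - y)^2) the Hessian is 2 * x^T x, independent of w.
hessian = 2.0 * t_x.T.dot(t_x)
lambda_max = np.linalg.eigvalsh(hessian).max()

# Plain gradient descent on a quadratic diverges once lr >= 2 / lambda_max.
lr_limit = 2.0 / lambda_max
print("largest Hessian eigenvalue: %.3e" % lambda_max)
print("learning rate must stay below: %.3e" % lr_limit)
```

For data of this size the limit comes out on the order of 1e-7, so a rate of 0.00000001 stays on the stable side, while a much larger rate makes the loss grow from step to step.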

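Rather than fixing the number of training steps as before, we can stop once the loss falls below a threshold or stops changing. Because the comparison with loss is evaluated inside the Session, the threshold is expressed as a TensorFlow constant: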

LOSS_MIN_VALUE = tf.constant(1e-5)               # stop training once the loss reaches this precision

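The model is now complete, so training can begin. The variable run_count records the number of training steps, and last_loss holds the loss from the previous step, starting at 0.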

sess = tf.Session()
sess.run(tf.global_variables_initializer())
run_count = 0
last_loss = 0

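In the main training loop, the current loss is stored in curr_loss and compared with the previous value; if the two are equal, training stops. Training also stops once the loss falls below the configured precision LOSS_MIN_VALUE: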

while True:
        run_count += 1
        sess.run(train, {x:t_x, y:t_y})

        curr_loss,is_ok = sess.run([loss,loss < LOSS_MIN_VALUE],{x:t_x, y:t_y})
        print("run %d, loss=%s" % (run_count,curr_loss))

        # stop when the loss no longer changes
        if last_loss == curr_loss:
                break

        last_loss = curr_loss
        # stop when the loss is below LOSS_MIN_VALUE
        if is_ok:
                break
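
To see the two stopping rules without a TensorFlow session, the same loop can be written in plain NumPy with a hand-computed gradient. This is a sketch under assumptions, not the post's code: it uses float64, a fixed seed, and a hypothetical MAX_STEPS cap so the loop always ends even if neither rule fires:

```python
import numpy as np

rng = np.random.RandomState(0)                      # fixed seed (assumption)
t_x = np.floor(1000 * rng.random_sample([10, 5]))
t_w = np.floor(1000 * rng.random_sample([5, 1]))
t_y = t_x.dot(t_w)

LEARNING_RATE = 1e-8      # same value as in the full listing
LOSS_MIN_VALUE = 1e-5     # same threshold as in the post
MAX_STEPS = 20000         # hypothetical safety cap, not in the original loop

w = np.zeros([5, 1])
initial_loss = float(np.sum(np.square(t_x.dot(w) - t_y)))
last_loss = None
run_count = 0
while run_count < MAX_STEPS:
    run_count += 1
    residual = t_x.dot(w) - t_y
    # gradient of sum((x.w - y)^2) with respect to w is 2 * x^T (x.w - y)
    w -= LEARNING_RATE * 2.0 * t_x.T.dot(residual)
    curr_loss = float(np.sum(np.square(t_x.dot(w) - t_y)))
    # rule 1: loss unchanged since the last step; rule 2: loss below the threshold
    if curr_loss == last_loss or curr_loss < LOSS_MIN_VALUE:
        break
    last_loss = curr_loss

print("stopped after %d steps, loss=%g" % (run_count, curr_loss))
```

In float32, as in the TensorFlow version, the "loss unchanged" rule tends to fire once updates fall below the available precision; in float64 the cap or the threshold is more likely to end the loop.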

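Finally, print the result. Since we know the values in t_w are integers, we also print fix_w, the result rounded to the nearest integer, and then fix_w_loss, which measures how far fix_w is from t_w: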

curr_W, curr_loss = sess.run([w, loss], {x:t_x,y:t_y})
print("t_w: %s\nw: %s\nfix_w: %s\nloss: %s\nfix_w_loss:%s" % (t_w, curr_W, np.round(curr_W), curr_loss, np.sum(np.square(t_w - np.round(curr_W)))))

exit(0)

The complete code:

#!/usr/bin/python
#coding=utf-8
import sys

import tensorflow as tf
import numpy as np

tf.logging.set_verbosity(tf.logging.ERROR)              # log only errors to keep the output clean
np.set_printoptions(threshold=sys.maxsize)              # print arrays in full, without truncation

test_count = 10         # number of data sets
param_count = 5         # number of variables
t_x = np.floor(1000 * np.random.random([test_count, param_count])).astype(np.float32)

# the weights we want to recover
t_w = np.floor(1000 * np.random.random([param_count, 1])).astype(np.float32)

# compute t_y from the formula t_y = t_x * t_w
t_y = t_x.dot(t_w)

print(t_x)
print(t_w)
print(t_y)

# x is the input and corresponds to t_x; it is supplied from outside during training, so it is a placeholder
x = tf.placeholder(tf.float32, shape=[test_count, param_count])
# y is the expected output and corresponds to t_y
y = tf.placeholder(tf.float32, shape=[test_count, 1])

# w holds the weight of each parameter; it is the target output and corresponds to t_w
w = tf.Variable(np.zeros([param_count, 1], dtype=np.float32), dtype=tf.float32)

curr_y = tf.matmul(x, w)                         # model output
loss = tf.reduce_sum(tf.square(y - curr_y))      # loss: sum of squared differences between the model output and the training output
optimizer = tf.train.GradientDescentOptimizer(0.00000001)
train = optimizer.minimize(loss)                 # training drives the loss toward its minimum

LOSS_MIN_VALUE = tf.constant(1e-5)               # stop training once the loss reaches this precision

sess = tf.Session()
sess.run(tf.global_variables_initializer())
run_count = 0
last_loss = 0
while True:
        run_count += 1
        sess.run(train, {x:t_x, y:t_y})

        curr_loss,is_ok = sess.run([loss,loss < LOSS_MIN_VALUE],{x:t_x, y:t_y})
        print("run %d, loss=%s" % (run_count,curr_loss))

        # stop when the loss no longer changes
        if last_loss == curr_loss:
                break

        last_loss = curr_loss
        # stop when the loss is below LOSS_MIN_VALUE
        if is_ok:
                break

curr_W, curr_loss = sess.run([w, loss], {x:t_x,y:t_y})
print("t_w: %s\nw: %s\nfix_w: %s\nloss: %s\nfix_w_loss:%s" % (t_w, curr_W, np.round(curr_W), curr_loss, np.sum(np.square(t_w - np.round(curr_W)))))

exit(0)

Run it. As before, the head and tail of the output are recorded and the long middle section is omitted:

$ python ./test1.py 
[[ 842.  453.  586.  919.   91.]
 [ 867.  600.  156.  993.  558.]
 [ 795.  809.  146.  793.  118.]
 [ 202.  184.  125.  132.  450.]
 [ 214.   36.  436.  118.  290.]
 [ 207.  916.  757.  647.  670.]
 [ 679.  176.  872.  522.  927.]
 [ 552.  602.  981.  563.  937.]
 [  31.  519.  718.  226.  178.]
 [ 571.  464.  289.  141.  769.]]
[[  42.]
 [ 465.]
 [ 890.]
 [  84.]
 [ 488.]]
[[  889153.]
 [  809970.]
 [  663711.]
 [  435982.]
 [  565200.]
 [ 1489672.]
 [ 1382662.]
 [ 1680752.]
 [  987505.]
 [  884068.]]
run 1, loss=3.30516e+13
run 2, loss=1.02875e+14
run 3, loss=3.22531e+14
run 4, loss=1.01237e+15
run 5, loss=3.17825e+15
run 6, loss=9.97822e+15
run 7, loss=3.13272e+16
run 8, loss=9.83534e+16
run 9, loss=3.08786e+17
run 10, loss=9.69452e+17
run 11, loss=3.04365e+18
run 12, loss=9.55571e+18
run 13, loss=3.00007e+19
run 14, loss=9.41889e+19
run 15, loss=2.95712e+20
...
run 2821, loss=6839.32
run 2822, loss=6780.68
run 2823, loss=6767.86
run 2824, loss=6735.09
run 2825, loss=6709.06
run 2826, loss=6662.66
run 2827, loss=6637.81
run 2828, loss=6637.81
t_w: [[ 117.]
 [ 642.]
 [ 662.]
 [ 318.]
 [ 771.]]
w: [[ 117.0872879 ]
 [ 641.80706787]
 [ 662.05078125]
 [ 318.10388184]
 [ 771.01501465]]
fix_w: [[ 117.]
 [ 642.]
 [ 662.]
 [ 318.]
 [ 771.]]
loss: 6637.81
fix_loss:0.0

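So this run stopped after 2828 steps, with the loss falling from 3.30516e+13 to 6637.81 and then no longer changing. That looks large, but the y values are themselves very large, and the recovered weights differ from the true values by less than about one part in a thousand. The gap can be narrowed by lowering the gradient descent learning rate and increasing the number of training steps. Meanwhile, fix_w already equals t_w. The code can also be adapted by changing the number of data sets and the number of variables, then retuning the gradient descent learning rate before training.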
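
The "less than one part in a thousand" claim can be checked directly from the printed weights. The short sketch below uses the t_w and w values copied from the tail of the run output; the variable rel_err is introduced here only for illustration:

```python
import numpy as np

# Values copied from the tail of the run output above
t_w = np.array([117., 642., 662., 318., 771.])
w = np.array([117.0872879, 641.80706787, 662.05078125, 318.10388184, 771.01501465])

# Relative error of each recovered weight
rel_err = np.abs(w - t_w) / t_w
print(rel_err)
print("largest relative error: %.4f%%" % (100 * rel_err.max()))

# Rounding recovers the integer weights exactly
print(np.array_equal(np.round(w), t_w))   # True
```

The largest relative error comes from the first weight, at roughly 0.075%, which is why rounding is enough to recover every integer weight.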