Machine Learning: Gradient Descent

I am 寂寞小白菜, a blogger on 靠谱客. This post introduces gradient descent in machine learning; I am sharing it here in the hope that it serves as a useful reference.

Machine Learning: Linear Regression with Gradient Descent

~~This article contains many errors and has been retracted; please do not rely on it.~~

Main goal:

Use linear regression to train on a known data set and obtain a line that fits the data reasonably well.

Experiment steps:

1. Plot the data set

[Figure: myplot, plot of the data set]

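2. Find an approximating curve:

From the plot, the relationship looks linear, so we can start by assuming the hypothesis
$h_\theta(x) = \theta_0 + \theta_1 x$

What we need now is to find the two parameters, $\theta_0$ and $\theta_1$. How? We can guess arbitrary values for them, draw the resulting line, and see whether it matches the plot. Because the data give us the known answers (the $y$ values) to compare against, this is supervised learning. For any given pair of parameters, evaluating the hypothesis over the training set yields a loss function, where $M$ is the number of training samples:
$J(\theta_0,\theta_1) = \frac{1}{2M}\sum_{i=1}^{M}(h_\theta(x^{(i)}) - y^{(i)})^2$
We now initialize $\theta_1$ and $\theta_0$: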

theta = [[0, 0]]

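(There is a small subtlety here; see numpy.md.)

With both parameters at zero, $h_\theta(x) = 0$ for every sample, so we can expect cost $= \sum_{i=1}^{M}\frac{(y^{(i)})^2}{2M}$. The code that computes and prints the cost is: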

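import numpy as np

def computeCost(x, y, theta):
    # Assumed shapes: x is the (M, 2) design matrix with a leading column of ones,
    # y is (M, 1), and theta is the (2, 1) column vector [[theta_0], [theta_1]].
    m = len(y)
    residual = np.subtract(np.dot(x, theta), y)   # h_theta(x^(i)) - y^(i) for every sample
    cost = np.sum(residual ** 2) / (2 * m)        # divide by 2M, not (… / 2) * M
    print(cost)
    return cost                                   # returned so the caller can monitor it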
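
As a quick sanity check of the claim that the cost at $\theta = (0, 0)$ equals $\sum y^2 / (2M)$, here is a minimal sketch on a tiny made-up data set. The data values and the helper name `squared_error_cost` are illustrative assumptions, not the data or code used in this article.

```python
import numpy as np

def squared_error_cost(X, y, theta):
    """Return J(theta) = sum((X @ theta - y)^2) / (2M) as a Python float."""
    m = len(y)
    residual = X @ theta - y
    return float(np.sum(residual ** 2) / (2 * m))

# Hypothetical data: three (x, y) samples, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0])
y = np.array([[2.0], [4.0], [6.0]])

# Design matrix with a leading column of ones so that X @ theta = theta_0 + theta_1 * x.
X = np.column_stack([np.ones_like(x), x])

theta_zero = np.zeros((2, 1))
cost_at_zero = squared_error_cost(X, y, theta_zero)
expected = float(np.sum(y ** 2) / (2 * len(y)))

print(cost_at_zero, expected)
```

The two printed numbers should agree, since every prediction is zero when both parameters are zero.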

Next, set $\theta$ to $(-1, 2)$:

theta = [[-1, 2]]

Clearly, the smaller the cost, the better the line fits the data. We now use gradient descent to choose values of $\theta$ that fit the data better.

Gradient Descent Algorithm

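Definition:
$\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)$
Here $\alpha$ is the learning rate, which we can tune by hand, and the symbol := denotes assignment. The partial derivatives with respect to each $\theta$ are:
$\theta_0: \frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1) = \frac{1}{M}\sum_{i=1}^{M}(h_\theta(x^{(i)}) - y^{(i)})$

$\theta_1: \frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1) = \frac{1}{M}\sum_{i=1}^{M}(h_\theta(x^{(i)}) - y^{(i)})\times x^{(i)}$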
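
For completeness, the two expressions follow from the chain rule. The short derivation below assumes the squared-error cost $J(\theta_0,\theta_1) = \frac{1}{2M}\sum_{i=1}^{M}(h_\theta(x^{(i)}) - y^{(i)})^2$ with $h_\theta(x) = \theta_0 + \theta_1 x$:

$\frac{\partial J}{\partial\theta_j} = \frac{1}{2M}\sum_{i=1}^{M} 2\,(h_\theta(x^{(i)}) - y^{(i)})\,\frac{\partial h_\theta(x^{(i)})}{\partial\theta_j} = \frac{1}{M}\sum_{i=1}^{M}(h_\theta(x^{(i)}) - y^{(i)})\,\frac{\partial h_\theta(x^{(i)})}{\partial\theta_j}$

$\frac{\partial h_\theta(x^{(i)})}{\partial\theta_0} = 1, \qquad \frac{\partial h_\theta(x^{(i)})}{\partial\theta_1} = x^{(i)}$

Substituting the second line into the first gives the $\theta_0$ and $\theta_1$ formulas. The factor $\frac{1}{2}$ in the cost exists only to cancel the 2 produced by differentiating the square.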

The implementation is as follows:

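def updateTheta(rate, cost, profit, population, theta):
    # Note: `cost` holds the predictions h_theta(x), not the scalar cost J.
    # theta 0 and theta 1 must be updated simultaneously, so both
    # gradients are computed before either parameter changes.
    theta0 = theta[0][0]
    theta1 = theta[1][0]
    length = len(cost)
    # Flatten both arrays so the subtraction cannot broadcast (M, 1) against (M,).
    residual = np.subtract(np.ravel(cost), np.ravel(profit))   # h_theta(x^(i)) - y^(i)
    temp = np.sum(residual) / length                           # dJ/dtheta_0
    # Element-wise product, then sum: the x^(i)-weighted residuals.
    temp1 = np.sum(np.multiply(np.ravel(population), residual)) / length   # dJ/dtheta_1
    theta0 -= rate * temp
    theta1 -= rate * temp1
    theta = np.array([[theta0, theta1]]).T
    return theta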
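
One way to gain confidence in the derivative formulas is to compare them with a finite-difference approximation of the cost. The sketch below does this on made-up data. The function names `cost`, `analytic_gradient`, and `numeric_gradient` and the data values are assumptions for illustration; the vectorized form $\frac{1}{M}X^T(X\theta - y)$ is simply the two partial derivatives stacked into one vector.

```python
import numpy as np

def cost(X, y, theta):
    """Squared-error cost J(theta) = sum((X @ theta - y)^2) / (2M)."""
    m = len(y)
    residual = X @ theta - y
    return float(np.sum(residual ** 2) / (2 * m))

def analytic_gradient(X, y, theta):
    """Gradient of J in vectorized form: (1/M) * X^T (X theta - y)."""
    m = len(y)
    return X.T @ (X @ theta - y) / m

def numeric_gradient(X, y, theta, eps=1e-6):
    """Central-difference approximation of the gradient, one parameter at a time."""
    grad = np.zeros_like(theta)
    for j in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[j, 0] = eps
        grad[j, 0] = (cost(X, y, theta + step) - cost(X, y, theta - step)) / (2 * eps)
    return grad

# Hypothetical data, chosen only for illustration.
x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([[1.5], [3.0], [8.5], [13.0]])
X = np.column_stack([np.ones_like(x), x])   # leading column of ones for theta_0
theta = np.array([[-1.0], [2.0]])

g_analytic = analytic_gradient(X, y, theta)
g_numeric = numeric_gradient(X, y, theta)
print(np.max(np.abs(g_analytic - g_numeric)))
```

If the printed difference is tiny, the analytic formulas and the cost function agree with each other.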

Next, we update the parameters in a loop:

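# Use `and`, not `|`: the bitwise operator binds tighter than the comparisons,
# and the loop must stop once the iteration budget is used up.
while res > 0 and iteration < iterations:
    theta = updateTheta(alpha, cost, profit, population1, theta)
    cost = np.dot(population, theta)               # new predictions h_theta(x)
    res = computeCost(population, profit, theta)   # computeCost must return the cost
    iteration += 1
print(theta)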
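
To see the whole procedure end to end, here is a self-contained sketch of batch gradient descent on synthetic data. It is not the article's data set: the noise-free line $y = 1 + 2x$, the learning rate 0.1, the iteration count, and the function name `gradient_descent` are all assumptions made for illustration. The result is compared with NumPy's least-squares solver as an independent reference.

```python
import numpy as np

def gradient_descent(X, y, alpha, iterations):
    """Run batch gradient descent from theta = 0; return (theta, cost history)."""
    m = len(y)
    theta = np.zeros((X.shape[1], 1))
    history = []
    for _ in range(iterations):
        residual = X @ theta - y
        # Both parameters are updated at once from the same residual.
        theta = theta - alpha * (X.T @ residual) / m
        history.append(float(np.sum((X @ theta - y) ** 2) / (2 * m)))
    return theta, history

# Hypothetical noise-free data on the line y = 1 + 2x.
x = np.linspace(0.0, 4.0, 9)
y = (1.0 + 2.0 * x).reshape(-1, 1)
X = np.column_stack([np.ones_like(x), x])

theta_gd, history = gradient_descent(X, y, alpha=0.1, iterations=5000)

# Independent reference: the closed-form least-squares fit.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta_gd.ravel(), theta_ls.ravel())
```

On this data both approaches should recover parameters close to $(1, 2)$. A fixed iteration count is used here instead of a `res > 0` test, because the cost rarely reaches exactly zero on real data.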

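The output is as follows:

Note that this is with a learning rate of 0.01. With a learning rate of 0.05, we get the following result instead:

This is what happens when the step size is too large: the parameters jump back and forth across the minimum instead of settling into it.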
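
The effect of the learning rate can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions, not a rerun of the article's experiment: the data and the function name `cost_history` are made up. It relies on the standard result that, for a quadratic cost, batch gradient descent stops converging once $\alpha$ exceeds $2/\lambda_{\max}$, where $\lambda_{\max}$ is the largest eigenvalue of $X^T X / M$. The exact threshold therefore depends on the data, so the values 0.01 and 0.05 above are specific to the article's data set.

```python
import numpy as np

def cost_history(X, y, alpha, iterations):
    """Return the cost before each of `iterations` gradient-descent updates."""
    m = len(y)
    theta = np.zeros((X.shape[1], 1))
    costs = []
    for _ in range(iterations):
        residual = X @ theta - y
        costs.append(float(np.sum(residual ** 2) / (2 * m)))
        theta = theta - alpha * (X.T @ residual) / m
    return costs

# Hypothetical data on the line y = -4 + 1.2x.
x = np.linspace(5.0, 22.0, 18)
y = (-4.0 + 1.2 * x).reshape(-1, 1)
X = np.column_stack([np.ones_like(x), x])

# Stability limit for this particular data set.
lam_max = np.linalg.eigvalsh(X.T @ X / len(y)).max()
alpha_limit = 2.0 / lam_max

small = cost_history(X, y, alpha=0.5 * alpha_limit, iterations=50)   # below the limit
large = cost_history(X, y, alpha=1.1 * alpha_limit, iterations=50)   # above the limit

print("below limit:", small[0], "->", small[-1])
print("above limit:", large[0], "->", large[-1])
```

Below the limit the cost shrinks; above it, each step overshoots the minimum by more than the previous one, so the cost grows.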

3. Verification

The steps above give a reasonably good value of $\theta$. Next, we check whether the fitted line matches the sample data:

def test(theta):
    x = np.arange(5.0, 22.5, 0.5)
    y = [theta[0][0] + theta[1][0] * i for i in x]
    plt.figure()
    plt.plot(x, y, color='r', linestyle='-.')
    plt.show()