Overview
Linear Regression with one variable
- Introduction
- Code
- Plotting the Data
- Gradient Descent
Introduction
In this part of the exercise, you will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities, and you have data for profits and populations from those cities. You would like to use this data to help you select which city to expand to next. The file ex1data1.txt contains the dataset for our linear regression problem. The first column is the population of a city and the second column is the profit of a food truck in that city. A negative value for profit indicates a loss.
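The file itself is plain comma-separated text with no header row. A quick sketch to confirm the format (the printed lines are inferred from the head() preview shown later, so treat them as expected rather than guaranteed formatting):
# Peek at the first few raw lines of the file; each line is "population,profit"
with open('ex1data1.txt') as f:
    for _ in range(3):
        print(f.readline().rstrip())
# Expected output (inferred from the head() preview below):
# 6.1101,17.592
# 5.5277,9.1302
# 8.5186,13.662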
Code
Plotting the Data
Before starting on any task, it is often useful to understand the data by visualizing it. For this dataset, you can use a scatter plot to visualize the data, since it has only two properties to plot (profit and population). (Many other problems that you will encounter in real life are multi-dimensional and can’t be plotted on a 2-d plot.)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
path = 'ex1data1.txt'
data = pd.read_csv(path,header=None,names=['Population','Profit'])
data.head()
# print(data.head())
Population Profit
0 6.1101 17.5920
1 5.5277 9.1302
2 8.5186 13.6620
3 7.0032 11.8540
4 5.8598 6.8233
data.describe()
# print(data.describe())
Population Profit
count 97.000000 97.000000
mean 8.159800 5.839135
std 3.869884 5.510262
min 5.026900 -2.680700
25% 5.707700 1.986900
50% 6.589400 4.562300
75% 8.578100 7.046700
max 22.203000 24.147000
data.plot(kind='scatter',x='Population',y='Profit',figsize=(12,8))
# plt.show()
Figure 1: Scatter plot of training data
Gradient Descent
In this part, you will fit the linear regression parameters θ to our dataset using gradient descent.
- Update Equations
The objective of linear regression is to minimize the cost function
$$J(\theta_0,\theta_1)=\dfrac{1}{2m}\sum_{i=1}^{m}\left(\hat{y}_i-y_i\right)^2=\dfrac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$
where the hypothesis $h_\theta(x)$ is given by the linear model

$$h_\theta(x)=\theta^T x=\theta_0+\theta_1 x_1$$
Recall that the parameters of your model are the $\theta_j$ values. These are the values you will adjust to minimize the cost $J(\theta)$. One way to do this is to use the batch gradient descent algorithm. In batch gradient descent, each iteration performs the update
$$\theta_j := \theta_j - \alpha\dfrac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$$

(simultaneously update $\theta_j$ for all $j$).
With each step of gradient descent, your parameters $\theta_j$ come closer to the optimal values that achieve the lowest cost $J(\theta)$.
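To make the simultaneous update concrete, here is a minimal one-step sketch on made-up toy data (three points, not the food-truck dataset):
import numpy as np
# X already carries the column of ones; theta = (theta_0, theta_1)
X_toy = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_toy = np.array([1.0, 2.0, 3.0])
theta_toy = np.zeros(2)
alpha_toy, m = 0.1, len(y_toy)
error = X_toy @ theta_toy - y_toy          # h_theta(x) - y for every example
grad = (X_toy.T @ error) / m               # one component per theta_j
theta_toy = theta_toy - alpha_toy * grad   # both parameters updated together
print(theta_toy)                           # [0.2        0.46666667]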
Note: We store each example as a row in the X matrix in Python. To take into account the intercept term ($\theta_0$), we add an additional first column to X and set it to all ones. This allows us to treat $\theta_0$ as simply another 'feature'.
- Implementation
In the following lines, we add another dimension to our data to accommodate the $\theta_0$ intercept term. We also initialize the parameters to 0 and the learning rate alpha to 0.01.
- Computing the cost $J(\theta)$
As you perform gradient descent to minimize the cost function $J(\theta)$, it is helpful to monitor the convergence by computing the cost. In this section, you will implement a function to calculate $J(\theta)$ so you can check the convergence of your gradient descent implementation.
def computeCost(X, y, theta):
    # Vectorized cost: $$J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$
    inner = np.power(((X * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))
# Append a column of ones to the training set so we can use a vectorized
# solution to compute the cost and gradients
data.insert(0,'Ones',1)
# Set X (training data) and y (target variable)
cols = data.shape[1]
X = data.iloc[:,0:cols-1]
y = data.iloc[:,cols-1:cols]
X.head()
y.head()
# print(X.head())
# print(y.head())
# The cost function works on numpy matrices, so convert X and y
# Initialize X, y, and theta
X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.array([0,0]))
'''
Check the matrix dimensions of theta, X, and y, and the cost with the initial theta of zeros:
print(theta)
print(X.shape, theta.shape, y.shape)
print(computeCost(X, y, theta))
'''
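As a quick sanity check: the original exercise handout says the cost for θ = (0, 0) on this dataset should come out to about 32.07.
print(computeCost(X, y, theta))  # expect roughly 32.07 for theta = [0, 0]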
# Output of the earlier X.head() call (before the matrix conversion):
Ones Population
0 1 6.1101
1 1 5.5277
2 1 8.5186
3 1 7.0032
4 1 5.8598
# Output of the earlier y.head() call:
Profit
0 17.5920
1 9.1302
2 13.6620
3 11.8540
4 6.8233
- Batch gradient descent
$$\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta)$$
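Carrying out the differentiation (the derivative of the squared error brings down a factor of $x_j^{(i)}$ and cancels the 2 in $\frac{1}{2m}$) recovers the update rule stated earlier:

$$\frac{\partial}{\partial\theta_j}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$$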
def gradientDescent(X, y, theta, alpha, iters):
    temp = np.matrix(np.zeros(theta.shape))   # buffer for the simultaneously updated parameters
    parameters = int(theta.ravel().shape[1])  # number of parameters (here 2)
    cost = np.zeros(iters)                    # cost recorded at every iteration
    for i in range(iters):
        error = (X * theta.T) - y             # residuals h_theta(x) - y, shape (m, 1)
        for j in range(parameters):
            term = np.multiply(error, X[:, j])
            temp[0, j] = theta[0, j] - ((alpha / len(X)) * np.sum(term))
        theta = temp
        cost[i] = computeCost(X, y, theta)
    return theta, cost
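As a side note, the inner loop over parameters can be collapsed into a single matrix product. A sketch of an equivalent fully vectorized variant, under the same np.matrix conventions (not part of the original exercise):
def gradientDescentVectorized(X, y, theta, alpha, iters):
    # Same algorithm; the gradient for all parameters at once is (1/m) * X^T (X theta^T - y)
    theta = theta.copy()
    cost = np.zeros(iters)
    for i in range(iters):
        error = (X * theta.T) - y                        # (m, 1) residuals
        theta = theta - (alpha / len(X)) * (X.T * error).T
        cost[i] = computeCost(X, y, theta)
    return theta, cost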
- Debugging
# Initialize the learning rate alpha and the number of iterations to run
alpha = 0.01
iters = 1000
g,cost = gradientDescent(X,y,theta,alpha,iters)
# print(g)
computeCost(X,y,g)
# print(computeCost(X,y,g))
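If scikit-learn happens to be installed, an optional cross-check (an extra step, not part of the exercise) is to fit the same line with its LinearRegression and compare against g:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(data[['Population']], data['Profit'])
print(model.intercept_, model.coef_)  # should land close to g[0,0] and g[0,1]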
# Plot the fitted linear model together with the data
x = np.linspace(data.Population.min(),data.Population.max(),100)
f = g[0,0]+(g[0,1]*x)
fig,ax = plt.subplots(figsize=(12,8))
ax.plot(x,f,'r',label='Prediction')
ax.scatter(data.Population,data.Profit,label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
plt.show()
fig,ax =plt.subplots(figsize=(12,8))
ax.plot(np.arange(iters),cost,'r')
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
plt.show()
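Since this problem also has a closed-form solution, the normal equation $\theta=(X^T X)^{-1}X^T y$ gives an independent check on the gradient descent result (a sketch, not in the original exercise):
theta_ne = np.linalg.solve(X.T * X, X.T * y)  # solve (X^T X) theta = X^T y
print(theta_ne.ravel())  # should be close to g once gradient descent has converged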