Overview
In this blog, I will summarise the theory and implementation of Linear Regression.
The base materials I used are the CS229 Lecture Notes Part 1 and Coursera's Machine Learning course, Lectures 2 to 3.
If you haven't read those materials yet, I suggest reading them first.
Linear Regression is the basic problem we care about when we start studying Supervised Learning.
Since there are lots of formulas and figures, I don't have time to make screenshots of all of them. Maybe I am just too lazy.
This is my first blog written in English. Why? Because it is nearly impossible to study deep learning only in Chinese, and I believe it is much more efficient to use English to describe all the related things.
OK. This blog may be a bit chaotic, but it follows my thinking style.
OK!
Linear Regression
The simplest case is Linear Regression with one variable.
We assume a hypothesis h(x) that approximates y, where the data set is stored as a matrix X. We want to find the optimal thetas that make h(x) fit the data, so that we can use it for prediction.
We already have a training set, and we use this data to solve the problem.
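For reference, the hypothesis in the one-variable case is the usual linear one (standard notation from the lectures, which I otherwise don't copy here):

h_\theta(x) = \theta_0 + \theta_1 x

and the goal is to choose the thetas so that h_\theta(x^{(i)}) is close to y^{(i)} on the training set.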
Step 1: Cost Function LMS (least mean squares)
Step 2: Gradient descent to find the thetas.
repeat until convergence
The key problem is computing the partial derivatives!
That is easy for Linear Regression, but far more complicated for Neural Networks, where Back Propagation is one approach.
I don't want to copy the formula here.
Batch gradient descent (use the whole training set per iteration).
Stochastic gradient descent (use one example, or a small subset, per iteration; much faster per step, and in practice it still gets close to a local optimum).
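Still, a compact restatement may help (standard forms, written with the Coursera 1/m convention so they match the Matlab code further down):

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

For stochastic gradient descent, the sum over all m examples is replaced by the single current example.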
Since this is ultimately an equation-solving problem, we can instead set all the partial derivatives to 0 and solve for theta directly.
Therefore the normal equations are another approach.
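In closed form (the standard result, which matches the normalEqn code later in this post):

\theta = (X^{\top} X)^{-1} X^{\top} y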
Why might the least-squares cost function J be a reasonable choice?
There is a probabilistic interpretation: if the errors are assumed to be i.i.d. Gaussian, then maximising the likelihood of the data is equivalent to minimising J.
Underfitting and overfitting depend on the features and parameters we choose!
Locally weighted linear regression:
Basic idea: give different training examples different weights.
Sometimes the prediction should be driven mainly by the nearby (or most recent) data, especially in time series. The closer, the more important!
How to set the weights? That still needs some thought!
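A common choice (from the CS229 notes; \tau is a bandwidth parameter that controls how fast the weight falls off with distance):

w^{(i)} = \exp\!\left( -\frac{(x^{(i)} - x)^2}{2\tau^2} \right)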
OK.
Next, Linear Regression with multiple variables!
Approach: vectorize h(x) so that the multi-variable case can be handled just like the one-variable problem.
(The corresponding figure is from Andrew Ng's Lecture 4 slides.)
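The trick the slide shows, restated here in the standard notation (x_0 = 1 is the intercept feature):

h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^{\top} x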
Feature scaling and mean normalization: bring all features onto a similar scale so that gradient descent converges faster (a small Matlab sketch follows after the next note).
Learning rate: if alpha is too small, convergence is slow; if alpha is too large, the cost function may not decrease on every iteration and may not converge at all.
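Here is a minimal feature-normalization sketch in Matlab (my own version; the course assignment ships a similar featureNormalize.m, and the names here are just my choice):

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Subtract the mean of each feature and divide by its
%   standard deviation, so all features end up on a comparable scale.
mu = mean(X);                      % 1 x n row vector of feature means
sigma = std(X);                    % 1 x n row vector of standard deviations
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
end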
OK.
The next topic is the implementation of Linear Regression in Matlab.
We use Programming Assignment 1 of the ML course as the example.
1. How to compute the cost function?
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
%J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
predictions = X*theta;
sqrErrors = (predictions - y).^2;
J = 1/(2*m)*sum(sqrErrors);
% =========================================================================
end
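A quick sanity check with toy numbers of my own (just to show the calling convention; the first column of X is the intercept term):

% Toy data: y = 1 + 2*x exactly, so the cost at theta = [1; 2] should be 0.
X = [1 1; 1 2; 1 3];            % m = 3 examples, first column is all ones
y = [3; 5; 7];
J = computeCost(X, y, [1; 2])   % returns 0
J = computeCost(X, y, [0; 0])   % returns 83/6, roughly 13.83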
2. How to compute gradient descent?
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCost) and gradient here.
%
theta = theta - alpha/m*X'*(X*theta - y);
% ============================================================
% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);
end
end
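After running it, plotting J_history is an easy way to check that alpha is reasonable (a small sketch; 0.01 and 1500 are just the values the assignment uses for the one-variable data):

[theta, J_history] = gradientDescent(X, y, zeros(size(X, 2), 1), 0.01, 1500);
plot(1:numel(J_history), J_history, '-b');
xlabel('Iteration'); ylabel('Cost J');   % J should decrease on every iteration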
3. How to solve the normal equation?
function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
% NORMALEQN(X,y) computes the closed-form solution to linear
% regression using the normal equations.
%theta = zeros(size(X, 2), 1);
% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
% to linear regression and put the result in theta.
%
% ---------------------- Sample Solution ----------------------
theta = pinv(X'*X)*X'*y;
% -------------------------------------------------------------
% ============================================================
end
That's it!
Remember that X is a matrix, and its first column should be all ones (the intercept term)!
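Putting the three functions together, a minimal end-to-end sketch (file and variable names are just illustrative; the assignment's ex1data1.txt works the same way):

data = load('ex1data1.txt');                 % two columns: feature, target
X = [ones(size(data, 1), 1), data(:, 1)];    % add the intercept column
y = data(:, 2);
theta_gd = gradientDescent(X, y, zeros(2, 1), 0.01, 1500);  % iterative
theta_ne = normalEqn(X, y);                                 % closed form
% The two estimates should end up close to each other.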