Preface
This post is a multivariate linear regression exercise, practicing the simplest case: a linear fit with two parameters (intercept and slope), following the Stanford OpenClassroom exercise at http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex2/ex2.html. The exercise provides 50 data samples: x is the age of each of 50 children, ranging from 2 to 8 years (ages may be fractional), and y is each child's corresponding height, also given in decimal form. The task is to estimate a child's height at ages 3.5 and 7 from these 50 training samples. Plotting the training samples makes it clear, even just by eye, that this is a classic linear regression problem.
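Writing the model in the course's standard notation, the hypothesis is a straight line and the parameters are chosen to minimize the squared-error cost:

h_\theta(x) = \theta_0 + \theta_1 x

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

where m = 50 is the number of training samples. Both programs below estimate \theta_0 (the intercept) and \theta_1 (the slope).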
MATLAB function notes:
legend:
For example, legend('Training data', 'Linear regression') labels what each curve in the figure represents: here the first "curve" (actually a set of discrete points) is the training data, and the second (actually a straight line) is the regression fit.
hold on, hold off:
hold on keeps the current figure open so that subsequent plotting commands draw on top of the existing plot. hold off restores the default behavior, so the next plotting command replaces the current plot instead of overlaying it.
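A minimal sketch showing hold and legend together (the data here is invented purely for illustration):

x = 1:10;
y = 2*x + randn(1, 10);      % made-up noisy samples
figure;
plot(x, y, 'o');             % first "curve": discrete points
hold on;                     % keep them visible while adding more
plot(x, 2*x, '-');           % second curve: a straight line
legend('Training data', 'Linear regression');
hold off;                    % subsequent plots replace, not overlay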
linspace:
For example, linspace(-3, 3, 100) returns 100 points between -3 and 3, evenly (i.e. linearly) spaced; see the sketch after the logspace entry.
logspace:
For example, logspace(-2, 2, 15) returns 15 points between 10^(-2) and 10^(2). The exponents are chosen uniformly, and since each is then used as a power of 10, the resulting points are logarithmically spaced.
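A quick sketch of both functions, with the returned values shown in the comments:

v = linspace(0, 1, 5)      % [0 0.25 0.50 0.75 1.00], evenly spaced values
w = logspace(-2, 2, 5)     % [0.01 0.1 1 10 100], evenly spaced exponents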
Experimental results:
[Figure: scatter plot of the training samples with the fitted regression line]
[Figure: surface plot of the loss function over the two parameters]
[Figure: contour plot of the loss function]
Code and comments:
Solving with the normal equations:
%% Method 1: normal equations
x = load('ex2x.dat');
y = load('ex2y.dat');
plot(x, y, '*')
xlabel('age')                 % x is age in years
ylabel('height')              % y is height in meters
x = [ones(length(x), 1), x];  % prepend the intercept column
w = inv(x' * x) * x' * y      % closed-form least-squares solution
hold on
% plot(x, 0.0639*x + 0.7502)  % wrong: x now has two columns
plot(x(:,2), 0.0639*x(:,2) + 0.7502) % corrected code
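For reference, the closed-form solution that w = inv(x'*x)*x'*y computes is the normal equation, obtained by setting the gradient of the squared-error cost to zero:

\theta = (X^T X)^{-1} X^T y

In MATLAB the backslash form (x' * x) \ (x' * y) is generally preferred to an explicit inv for numerical stability; the gradient descent script below uses exactly that form for its exact_theta check.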
Solving with gradient descent:
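For reference, the vectorized batch update that the loop below implements is

\theta := \theta - \frac{\alpha}{m} X^T (X\theta - y)

where X is the design matrix (with the column of ones prepended), m is the number of training examples, and \alpha = 0.07 is the learning rate; grad in the code is exactly \frac{1}{m} X^T (X\theta - y).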
% Exercise 2 Linear Regression
% Data is roughly based on 2000 CDC growth figures for boys
%
% x refers to a boy's age
% y is a boy's height in meters

clear all; close all; clc

x = load('ex2x.dat');
y = load('ex2y.dat');
m = length(y); % number of training examples

% Plot the training data
figure; % open a new figure window
plot(x, y, 'o');
ylabel('Height in meters')
xlabel('Age in years')

% Gradient descent
x = [ones(m, 1) x]; % Add a column of ones to x
theta = zeros(size(x(1,:)))'; % initialize fitting parameters
MAX_ITR = 1500;
alpha = 0.07;

for num_iterations = 1:MAX_ITR
    % This is a vectorized version of the
    % gradient descent update formula
    % It's also fine to use the summation formula from the videos

    % Here is the gradient
    grad = (1/m) .* x' * ((x * theta) - y);

    % Here is the actual update
    theta = theta - alpha .* grad;

    % Sequential update: The wrong way to do gradient descent
    % grad1 = (1/m).* x(:,1)' * ((x * theta) - y);
    % theta(1) = theta(1) + alpha*grad1;
    % grad2 = (1/m).* x(:,2)' * ((x * theta) - y);
    % theta(2) = theta(2) + alpha*grad2;
end

% print theta to screen
theta

% Plot the linear fit
hold on; % keep previous plot visible
plot(x(:,2), x*theta, '-')
legend('Training data', 'Linear regression') % label what each curve represents
hold off % don't overlay any more plots on this figure

% Closed form solution for reference
% You will learn about this method in future videos
exact_theta = (x' * x) \ x' * y

% Predict values for age 3.5 and 7
predict1 = [1, 3.5] * theta
predict2 = [1, 7] * theta

% Calculate J matrix

% Grid over which we will calculate J
theta0_vals = linspace(-3, 3, 100);
theta1_vals = linspace(-1, 1, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = (0.5/m) .* (x * t - y)' * (x * t - y);
    end
end

% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';

% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 15 contours spaced logarithmically between 0.01 and 100
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 2, 15)) % draw the contour lines
xlabel('\theta_0'); ylabel('\theta_1'); % TeX-style markup: \theta gives the Greek letter, _ subscripts the next single character (digits 0-9)
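As a sanity check on predict1 and predict2: with the fitted parameters quoted earlier (\theta_0 \approx 0.7502, \theta_1 \approx 0.0639), the predicted heights work out to roughly 0.7502 + 0.0639 × 3.5 ≈ 0.97 m at age 3.5 and 0.7502 + 0.0639 × 7 ≈ 1.20 m at age 7.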
References:
Stanford OpenClassroom, Deep Learning Exercise 2 (Linear Regression): http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex2/ex2.html