Machine Learning Study Notes (1): Linear Regression and Logistic Regression

Reposted article. 2015-03-04 19:54. Tags: matlab / machine learning / computer vision / logistic regression / linear regression

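These notes summarize what I took away from studying machine learning. If anything here is misunderstood or incomplete, please point it out and I will do my best to correct it.

I studied the subject mainly through Professor Andrew Ng's "Machine Learning" course, offered on a MOOC platform. Professor Ng clearly put a great deal of care into teaching it, the programming exercises in particular, and the course is genuinely good. My thanks to him for the effort, and to the friend who told me about the platform. Much of this post draws on the materials Professor Ng provides with the course, for which I am again grateful.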

Please credit the source when reposting: http://blog.csdn.net/u010278305

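What is machine learning? As I see it, machine learning takes some given information (the floor area of a house, the pixel values of an image, and so on) and "learns" from it to produce a "learned model". When new information of the same kind is fed in, the model outputs the result we are interested in. Take handwritten digit recognition: we are given some known information (a set of images, together with the digit written in each one), and we can learn from it in the following steps:

1. Feed the pixel values of each image, together with the digit each image shows, into the "learning system".

2. The "learning process" produces a "learned model" which, given a new image of a handwritten digit, returns a reasonable estimate of the digit it shows.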

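What is linear regression? My understanding is that we fit a linear function to the known data so that the resulting function meets some criterion, such as minimum squared error (below we define a cost function that makes this goal quantitative). We can then use the function to make predictions for new inputs, for example predicting the price of a house from its floor area. (The figure that accompanied this in the original post is not reproduced here.)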

Suppose the hypothesis function we want to end up with has the following form:
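The formula itself did not survive in this copy of the post. A reconstruction consistent with the line `hThetaX=X*theta` in the cost code further down, assuming each input x carries a leading intercept feature x_1 = 1 and using 1-based indices as MATLAB does (the symbol n, the number of features including the intercept, is my own notation):

```latex
h_\theta(x) = \theta^{T} x = \sum_{j=1}^{n} \theta_j x_j, \qquad x_1 = 1
```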

where x is our input and theta is the parameter vector we want to find.

The cost function is as follows:
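The formula image is missing here as well. The following is reconstructed from the line `J=1/(2*m)*sum((hThetaX-y).^2)` in the post's `computeCostMulti`, where m is the number of training examples and h_θ is the linear hypothesis; the superscript (i) for the i-th training example is assumed notation:

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
```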

Our goal is to minimize this cost function.

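To do so, we also need the derivative of the cost function with respect to the parameters theta, that is, the gradient, which has the following form: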
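The gradient formula is not preserved in this copy. Assuming the squared-error cost J(θ) = (1/2m) Σ (θᵀx⁽ⁱ⁾ − y⁽ⁱ⁾)² implemented in the post's `computeCostMulti`, differentiating with respect to one component θ_j gives (index conventions are my own: 1-based j, superscript (i) for the i-th example):

```latex
\frac{\partial J(\theta)}{\partial \theta_j}
  = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( \theta^{T} x^{(i)} - y^{(i)} \right) x_j^{(i)}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```

In vectorized form this is `(X' * (X*theta - y)) / m`.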

With these pieces in hand, we can use the gradient descent algorithm to find theta. The procedure is as follows:
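The procedure itself is missing from this copy. A minimal sketch of batch gradient descent, which repeatedly applies theta := theta - alpha * gradient, might look like the following. The toy data, the learning rate `alpha`, the iteration count `num_iters`, and the name `J_history` are all illustrative assumptions, not values from the post.

```matlab
% Batch gradient descent for linear regression on a tiny made-up dataset.
% alpha, num_iters and the data are illustrative choices, not from the post.
X = [ones(5,1), (1:5)'];      % design matrix with a leading column of ones
y = 2 + 3 * (1:5)';           % targets generated from y = 2 + 3*x (no noise)
m = length(y);                % number of training examples

theta = zeros(2,1);           % start from theta = 0
alpha = 0.1;                  % learning rate (assumed)
num_iters = 2000;             % number of update steps (assumed)
J_history = zeros(num_iters,1);

for iter = 1:num_iters
    err = X * theta - y;              % h_theta(x) - y for every example
    grad = (X' * err) / m;            % gradient of the squared-error cost
    theta = theta - alpha * grad;     % simultaneous update of all parameters
    J_history(iter) = sum((X * theta - y).^2) / (2*m);  % cost after the step
end

disp(theta');
```

Recording the cost at every step makes it easy to check that it decreases; if it grows instead, the learning rate is probably too large.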

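In fact, there are other, more sophisticated algorithms for finding theta. We can use one by calling MATLAB's fminunc function; all we have to do is compute the cost and the gradient for it to call.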
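As a rough illustration of handing the minimization to fminunc, here is a sketch on made-up data. To stay self-contained it passes a cost-only anonymous function (the name `costFn` is hypothetical), so fminunc has to estimate the gradient numerically, which is less precise than supplying it.

```matlab
% Minimise the linear regression cost with fminunc instead of a hand-written loop.
% The data and the name costFn are illustrative assumptions.
X = [ones(5,1), (1:5)'];      % design matrix with a leading column of ones
y = 2 + 3 * (1:5)';           % targets generated from y = 2 + 3*x
m = length(y);

% Cost only: fminunc approximates the gradient by finite differences.
costFn = @(t) sum((X * t - y).^2) / (2*m);

options = optimset('MaxIter', 400);
[theta, J] = fminunc(costFn, zeros(2,1), options);

disp(theta');
```

To supply the gradient as well, the objective should return `[J, grad]`, as the `costFunction` later in this post does, and the options should include `'GradObj', 'on'`.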

Based on the formulas above, here is a concrete implementation of the cost function:

function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;

% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
hThetaX=X*theta;
J=1/(2*m)*sum((hThetaX-y).^2);

end

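What is logistic regression? Unlike linear regression, logistic regression is used when the output takes only a few specific discrete values (for example, when deciding whether an email is spam, the output is either 0 or 1). The hypothesis function is passed through a transformation so that its value always lies between 0 and 1, and the final 0/1 label is obtained by thresholding that value.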

The hypothesis function is as follows:
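The formula image is missing in this copy. Reconstructed from the line `hThetaX=sigmoid(X * theta)` and the `sigmoid` function given later in the post, the hypothesis would be (g is my name for the sigmoid):

```latex
h_\theta(x) = g(\theta^{T} x), \qquad g(z) = \frac{1}{1 + e^{-z}}
```

Since g(z) lies strictly between 0 and 1 for every real z, so does the hypothesis.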

The cost function is as follows:
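The formula did not survive in this copy. The following is reconstructed from the line `J=1/m*sum(-y.*log(hThetaX)-(1-y).*log(1-hThetaX))` in the post's `costFunction`, with m training examples, labels y in {0, 1}, and h_θ the sigmoid hypothesis; the superscript (i) for the i-th example is assumed notation:

```latex
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right]
```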

The gradient is as follows. Note that it has the same form as in linear regression; only the hypothesis function differs:
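The gradient formula is missing from this copy. Reconstructed from the line `grad=(1/m*(hThetaX-y)'*X)'` in the post's `costFunction`, with h_θ(x) = g(θᵀx) the sigmoid hypothesis (index conventions are my own: 1-based j, superscript (i) for the i-th example):

```latex
\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```

The simple form comes from the sigmoid identity g'(z) = g(z)(1 - g(z)), which cancels the denominators introduced by the logarithms in the cost.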

With these pieces, we can find the optimal theta with fminunc; we only need to supply the computation of the cost and the gradient. The code is as follows:

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
hThetaX=sigmoid(X * theta);
J=1/m*sum(-y.*log(hThetaX)-(1-y).*log(1-hThetaX));
grad=(1/m*(hThetaX-y)'*X)';

end

Here the sigmoid function is as follows:

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly 
g = zeros(size(z));

% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).
g = 1 ./ (1 + exp(-z));

end
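As a quick sanity check on the sigmoid, cost, and gradient above, the sketch below re-expresses them as anonymous functions on made-up data (the names `costFn` and `gradFn` and the data are hypothetical). It verifies two things: at theta = 0 every prediction is 0.5, so the cost equals log(2); and the analytic gradient agrees with a central-difference estimate.

```matlab
% Sanity checks for logistic regression cost and gradient (illustrative data).
sigmoid = @(z) 1 ./ (1 + exp(-z));

X = [1 0.5; 1 -1.2; 1 2.0; 1 0.3];   % made-up examples, first column is the intercept
y = [1; 0; 1; 0];                    % made-up 0/1 labels
m = length(y);

costFn = @(t) sum(-y .* log(sigmoid(X*t)) - (1 - y) .* log(1 - sigmoid(X*t))) / m;
gradFn = @(t) (X' * (sigmoid(X*t) - y)) / m;

% Check 1: with theta = 0 every h is 0.5, so the cost is -log(0.5) = log(2).
J0 = costFn(zeros(2,1));

% Check 2: compare the analytic gradient with central differences.
theta = [0.3; -0.7];
epsilon = 1e-5;
numGrad = zeros(2,1);
for j = 1:2
    step = zeros(2,1);
    step(j) = epsilon;
    numGrad(j) = (costFn(theta + step) - costFn(theta - step)) / (2*epsilon);
end
gradErr = max(abs(numGrad - gradFn(theta)));

fprintf('J at theta=0: %.6f, max gradient error: %.2e\n', J0, gradErr);
```

A gradient check like this is a cheap way to catch sign or transpose mistakes before passing the function to an optimizer.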

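Sometimes "overfitting" occurs: the fitted parameters match the training data very well, but the model's predictions clearly depart from the underlying trend. (The figure that accompanied this in the original post is not reproduced here.)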

In that case we need to regularize, penalizing the parameters so that every theta value except theta(1) stays small.

The regularized cost function is as follows:
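The formula image is missing in this copy. The following is reconstructed from the post's `costFunctionReg`, which zeroes theta(1) before adding `lambda/(2*m)*sum(theta.^2)`. Indices are 1-based as in MATLAB, so the penalty starts at j = 2 and leaves the intercept θ_1 out; n, the number of parameters, is my own symbol:

```latex
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right] + \frac{\lambda}{2m} \sum_{j=2}^{n} \theta_j^{2}
```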

The regularized gradient is as follows:
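The formula did not survive in this copy. The following is reconstructed from the line `grad=(1/m*(hThetaX-y)'*X)' + lambda/m*theta` in the post's `costFunctionReg`, where theta(1) has been set to zero beforehand. Indices are 1-based as in MATLAB, so the intercept θ_1 receives no penalty term:

```latex
\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta_1} &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_1^{(i)} \\
\frac{\partial J(\theta)}{\partial \theta_j} &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j, \qquad j \ge 2
\end{aligned}
```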

The code for the regularized cost and gradient is given below:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
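hThetaX=sigmoid(X * theta);
% Zero the intercept so that theta(1) is excluded from the penalty;
% hThetaX above was already computed with the original theta.
theta(1)=0;
J=1/m*sum(-y.*log(hThetaX)-(1-y).*log(1-hThetaX))+lambda/(2*m)*sum(theta.^2);
grad=(1/m*(hThetaX-y)'*X)' + lambda/m*theta;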

end

For linear regression, regularization works in much the same way.
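To make the parallel concrete, here is a sketch of the regularized cost and gradient for linear regression, written as a short script on made-up numbers. The data, the value of `lambda`, and the name `thetaReg` are illustrative assumptions. As in the logistic case, the intercept theta(1) is left out of the penalty.

```matlab
% Regularized cost and gradient for linear regression (illustrative values).
X = [ones(4,1), [1; 2; 3; 4]];   % design matrix with a leading column of ones
y = [2; 4; 6; 8];                % made-up targets
theta = [0.5; 1.5];              % parameters at which to evaluate cost and gradient
lambda = 1;                      % regularization constant (assumed)
m = length(y);

err = X * theta - y;             % prediction errors, here [0; -0.5; -1; -1.5]

thetaReg = theta;
thetaReg(1) = 0;                 % do not penalize the intercept

J = sum(err.^2) / (2*m) + lambda / (2*m) * sum(thetaReg.^2);
grad = (X' * err) / m + lambda / m * thetaReg;

fprintf('J = %.5f\n', J);
disp(grad');
```

The only change from the unregularized version is the extra `lambda` term in each line; the data term is the same squared-error cost used earlier.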

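As for choosing the regularization constant lambda, we can split the data into three parts: a training set, a cross-validation set, and a test set. For each candidate lambda, we first fit theta on the training set, then compute the cost on the training set and on the cross-validation set, and pick a suitable lambda by comparing the two, typically the one with the lowest cross-validation cost. (The figure that accompanied this in the original post is not reproduced here.)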
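The selection loop described above can be sketched as follows. Everything here is an assumption made for illustration: the synthetic data, the degree-5 polynomial features, the candidate list `lambdas`, and the use of the closed-form regularized least-squares solution in place of fminunc, to keep the example short. The training and cross-validation errors are computed without the penalty term so that they stay comparable across lambda values.

```matlab
% Choosing lambda with a train / cross-validation / test split (synthetic data).
x = linspace(-1, 1, 30)';
noise = 0.3 * sin(37 * (1:30)');           % deterministic stand-in for noise
y = 1 + 2*x + noise;                       % underlying trend is a straight line
P = [ones(30,1), x, x.^2, x.^3, x.^4, x.^5];   % degree-5 features, prone to overfit

trIdx = 1:3:30;  cvIdx = 2:3:30;  teIdx = 3:3:30;   % interleaved three-way split
Xtr = P(trIdx,:);  ytr = y(trIdx);
Xcv = P(cvIdx,:);  ycv = y(cvIdx);
Xte = P(teIdx,:);  yte = y(teIdx);

lambdas = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10];   % candidate values (assumed)
L = eye(6);
L(1,1) = 0;                                % leave the intercept unpenalized

errTrain = zeros(size(lambdas));
errCv = zeros(size(lambdas));
for k = 1:numel(lambdas)
    % Closed-form minimizer of the regularized squared-error cost on the training set
    theta = (Xtr' * Xtr + lambdas(k) * L) \ (Xtr' * ytr);
    errTrain(k) = sum((Xtr * theta - ytr).^2) / (2 * numel(ytr));
    errCv(k) = sum((Xcv * theta - ycv).^2) / (2 * numel(ycv));
end

[~, best] = min(errCv);                    % lowest cross-validation error wins
bestLambda = lambdas(best);

% Report the held-out test error once, for the chosen lambda only.
thetaBest = (Xtr' * Xtr + bestLambda * L) \ (Xtr' * ytr);
errTest = sum((Xte * thetaBest - yte).^2) / (2 * numel(yte));
fprintf('best lambda = %g, test error = %.4f\n', bestLambda, errTest);
```

Plotting `errTrain` and `errCv` against `lambdas` gives the kind of curve the paragraph refers to: training error rises as lambda grows, while cross-validation error usually falls and then rises again.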
