Stanford Coursera Machine Learning Programming Assignment: Exercise 3 (Multi-class Classification with Logistic Regression)

Reposted article. 2016-11-21 18:37. Tags: programming assignments / machine learning

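This assignment uses logistic regression and neural networks to recognize handwritten digits (0-9).

For an earlier programming exercise on logistic regression, see: Stanford Coursera Andrew Ng Machine Learning Programming Assignment (Exercise 2) and Summary

Below, logistic regression is used to solve a multi-class classification problem: recognizing handwritten digits (0-9). For the neural network approach to the same task, see: Neural Network Implementation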

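The data, once loaded into Matlab, has the following format:

There are 5000 training examples in total. Each example is a 20x20-pixel grayscale image unrolled into a 400-dimensional vector and stored as one row of the matrix X, so X is 5000x400. The labels of the training set are stored in the vector y, a column vector with 5000 rows and 1 column.

For example, y = (1,2,3,4,5,6,7,8,9,10......)^T. Note that because Matlab indices start at 1, the label 10 is used to represent the digit 0.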
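The label convention can be undone when the actual digit is needed for display. The following is a minimal sketch with a made-up label vector; the name digit_values is hypothetical and not part of the exercise files.

```matlab
% Made-up labels using the exercise convention (10 stands for the digit 0).
y = [1; 2; 10; 5];

% mod maps the label 10 back to 0 and leaves labels 1..9 unchanged.
digit_values = mod(y, 10);

disp(digit_values');
```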

① Visualizing the sample data

Randomly selecting 100 examples and visualizing them in Matlab gives the following result:
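One way such a picture can be produced is sketched below. This is not the display helper shipped with the exercise. It is a simplified stand-in that tiles 100 random rows of X into a 10x10 grid. To keep the sketch runnable on its own, X is filled with random numbers here, and the names sel, tile and grid_img are assumptions.

```matlab
% Stand-in data so the sketch runs on its own; in the exercise, X holds the
% 5000 unrolled 20x20 images.
X = rand(5000, 400);

% Pick 100 distinct rows at random.
idx = randperm(size(X, 1));
sel = X(idx(1:100), :);

tile = 20;                      % each image is 20x20 pixels
grid_img = zeros(10 * tile);    % 200x200 canvas holding a 10x10 grid of tiles
for k = 1:100
    r = floor((k - 1) / 10);    % tile row, 0..9
    c = mod(k - 1, 10);         % tile column, 0..9
    grid_img(r * tile + (1:tile), c * tile + (1:tile)) = ...
        reshape(sel(k, :), tile, tile);
end

% Show the canvas as a grayscale image.
imagesc(grid_img);
colormap(gray);
axis image off;
```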

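② Multi-class classification with logistic regression (one-vs-all)

A multi-class classification problem is one where there are three or more possible classes. For example, tomorrow's weather might be predicted as one of three classes: sunny (y==1), cloudy (y==2), or rainy (y==3).

The idea is very similar to ordinary logistic regression classification (which by default means binary classification). When classifying "sunny", the other two classes (cloudy and rainy) are treated as a single class, "not sunny". This turns the multi-class problem into a binary one. A diagram is shown below (the circles in the figure represent all the other classes that do not belong to the class in question):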
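In Matlab, this relabeling is a single comparison. The sketch below uses a made-up weather label vector; the variable names is_sunny, is_cloudy and is_rainy are hypothetical.

```matlab
% Made-up labels: 1 = sunny, 2 = cloudy, 3 = rainy.
y = [1; 2; 3; 1; 3];

% "Sunny vs. not sunny": 1 where the label is sunny, 0 for every other class.
is_sunny  = (y == 1);
is_cloudy = (y == 2);
is_rainy  = (y == 3);

% One binary target vector per column; each row has exactly one 1.
disp([is_sunny, is_cloudy, is_rainy]);
```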

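An N-class problem (N>=3) requires N hypothesis functions (prediction models), that is, N sets of model parameters θ (θ is generally a vector).

Then, for each example, every model is applied in turn, and the class whose model produces the largest output is taken as the final prediction.

This works because each model's output, after passing through the sigmoid function, is in effect a probability. Note that the three models h_θ⁽¹⁾(x), h_θ⁽²⁾(x), h_θ⁽³⁾(x) generally have different parameters θ. For example: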

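h_θ⁽¹⁾(x) outputs the predicted probability of a sunny day (y==1)

h_θ⁽²⁾(x) outputs the predicted probability of a cloudy day (y==2)

h_θ⁽³⁾(x) outputs the predicted probability of a rainy day (y==3)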
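Written as a formula, the decision rule for the weather example might look as follows. The symbol \hat{y} for the predicted class and the notation \theta^{(i)} for the parameters of the i-th model are introduced here for illustration; g is the sigmoid function mentioned above.

```latex
\hat{y} = \arg\max_{i \in \{1,2,3\}} h_\theta^{(i)}(x),
\qquad
h_\theta^{(i)}(x) = g\!\left(\big(\theta^{(i)}\big)^{T} x\right),
\qquad
g(z) = \frac{1}{1 + e^{-z}}
```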

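③ Matlab implementation

For the digit recognition problem above, 10 logistic regression models have to be trained, one for each digit.

There are 5000 examples in total, and each label in y takes one of the values (1,2,3,4,5,6,7,8,9,10), where 10 stands for the digit 0.

The fmincg function supplied with the exercise is used to find the model parameters θ that minimize the cost function.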

function [all_theta] = oneVsAll(X, y, num_labels, lambda)
%ONEVSALL trains multiple logistic regression classifiers and returns all
%the classifiers in a matrix all_theta, where the i-th row of all_theta 
%corresponds to the classifier for label i
%   [all_theta] = ONEVSALL(X, y, num_labels, lambda) trains num_labels
%   logistic regression classifiers and returns each of these classifiers
%   in a matrix all_theta, where the i-th row of all_theta corresponds 
%   to the classifier for label i

% Some useful variables
m = size(X, 1);% num of samples
n = size(X, 2);% num of features

% You need to return the following variables correctly 
all_theta = zeros(num_labels, n + 1);

% Add ones to the X data matrix
X = [ones(m, 1) X];

% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the following code to train num_labels
%               logistic regression classifiers with regularization
%               parameter lambda. 
%
% Hint: theta(:) will return a column vector.
%
% Hint: You can use y == c to obtain a vector of 1's and 0's that tell us 
%       whether the ground truth is true/false for this class.
%
% Note: For this assignment, we recommend using fmincg to optimize the cost
%       function. It is okay to use a for-loop (for c = 1:num_labels) to
%       loop over the different classes.
%
%       fmincg works similarly to fminunc, but is more efficient when we
%       are dealing with large number of parameters.
%
% Example Code for fmincg:
%
%     % Set Initial theta
%     initial_theta = zeros(n + 1, 1);
%     
%     % Set options for fminunc
%     options = optimset('GradObj', 'on', 'MaxIter', 50);
% 
%     % Run fmincg to obtain the optimal theta
%     % This function will return theta and the cost 
%     [theta] = ...
%         fmincg (@(t)(lrCostFunction(t, X, (y == c), lambda)), ...
%                 initial_theta, options);
%
initial_theta = zeros(n + 1, 1);

options = optimset('GradObj','on','MaxIter',50);

for c = 1:num_labels % num_labels is the number of logistic regression classifiers
    all_theta(c, :) = fmincg(@(t)(lrCostFunction(t, X, (y == c), lambda)), initial_theta, options);
end
% =========================================================================
end

lrCostFunction can follow the regularized logistic regression implementation in the file costFunctionReg.m described at http://www.cnblogs.com/hapjin/p/6078530.html
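For reference, a vectorized version of such a cost function could look like the sketch below. It is not the code from the linked post. It assumes the standard regularized logistic regression cost, with the intercept theta(1) left out of the penalty, and takes X with the column of ones already added, as oneVsAll does. The helper name theta_reg is hypothetical.

```matlab
function [J, grad] = lrCostFunction(theta, X, y, lambda)
% Regularized logistic regression cost and gradient (vectorized sketch).
%   theta  : (n+1) x 1 parameter vector, theta(1) is the intercept
%   X      : m x (n+1) design matrix whose first column is all ones
%   y      : m x 1 vector of 0/1 labels (for example the result of y == c)
%   lambda : regularization strength

m = length(y);                       % number of training examples
y = double(y);                       % y == c is logical; work in doubles
h = 1 ./ (1 + exp(-(X * theta)));    % sigmoid hypothesis, m x 1

% Copy of theta with the intercept zeroed so that it is not penalized.
theta_reg = [0; theta(2:end)];

% Cross-entropy cost plus the L2 penalty on the non-intercept parameters.
J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m ...
    + lambda / (2 * m) * (theta_reg' * theta_reg);

% Gradient of J with respect to theta, (n+1) x 1.
grad = (X' * (h - y)) / m + (lambda / m) * theta_reg;
end
```

Both outputs are needed because the options passed to fmincg set 'GradObj' to 'on', which means the optimizer expects the cost function to return the gradient as well as the cost.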

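The for loop works as follows:

num_labels is the number of classifiers, 10 in total; each classifier (model) recognizes one of the 10 digits.

There are 5000 examples, each with 400 features, so with the added intercept term the parameter vector θ of each model has 401 elements.

initial_theta = zeros(n + 1, 1); % initial value of the model parameters θ (n == 400)

all_theta is a 10x401 matrix; each row stores the parameter vector θ of one classifier (model). Running the for loop above calls fmincg once per class and yields the parameter vectors θ of all the models.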

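Once the parameter vector θ of every model is known, the trained models can be used to recognize digits. For a given input instance (400 feature variables), each model's hypothesis h_θ⁽ⁱ⁾(x) outputs a value (i = 1,2,...10), and the index of the largest of these 10 values is the final result. For example, if h_θ⁽⁸⁾(x) == 0.96 is larger than every other h_θ⁽ⁱ⁾(x) (i = 1,2,...10, i not equal to 8), the recognized digit is 8. Because the sigmoid function is monotonically increasing, comparing the raw scores X * all_theta' picks the same class as comparing the sigmoid outputs, which is why the code below does not apply the sigmoid.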

function p = predictOneVsAll(all_theta, X)
%PREDICT Predict the label for a trained one-vs-all classifier. The labels 
%are in the range 1..K, where K = size(all_theta, 1). 
%  p = PREDICTONEVSALL(all_theta, X) will return a vector of predictions
%  for each example in the matrix X. Note that X contains the examples in
%  rows. all_theta is a matrix where the i-th row is a trained logistic
%  regression theta vector for the i-th class. You should set p to a vector
%  of values from 1..K (e.g., p = [1; 3; 1; 2] predicts classes 1, 3, 1, 2
%  for 4 examples) 

m = size(X, 1);
num_labels = size(all_theta, 1);

% You need to return the following variables correctly 
p = zeros(size(X, 1), 1);

% Add ones to the X data matrix
X = [ones(m, 1) X];

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters (one-vs-all).
%               You should set p to a vector of predictions (from 1 to
%               num_labels).
%
% Hint: This code can be done all vectorized using the max function.
%       In particular, the max function can also return the index of the 
%       max element, for more information see 'help max'. If your examples 
%       are in rows, then, you can use max(A, [], 2) to obtain the max 
%       for each row.
%       

[~,p] = max( X * all_theta',[],2); % take the maximum of each row of the matrix (X*all_theta'); p records the column index of each row's maximum
% =========================================================================
end
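The row-wise max call can be checked on a small example. The sketch below uses made-up scores for 3 examples and 4 classes (the matrix Z and the names p_raw and p_prob are hypothetical). It also illustrates that taking the maximum over raw scores and over sigmoid outputs selects the same class, since the sigmoid is monotonically increasing.

```matlab
% Made-up scores: one row per example, one column per class.
Z = [ 0.2 -1.0  3.1  0.5;
     -2.0  0.7  0.1  0.6;
      1.5  1.4 -0.3  2.2];

sigmoid = @(z) 1 ./ (1 + exp(-z));

% The second output of max is the column index of each row's maximum.
[~, p_raw]  = max(Z, [], 2);
[~, p_prob] = max(sigmoid(Z), [], 2);

disp([p_raw, p_prob]);
```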
