Convolutional Neural Networks (CNN) — Study Notes on the MATLAB DeepLearnToolbox


These are my study notes on the CNN code in the DeepLearnToolbox for MATLAB.

The toolbox can be downloaded from GitHub: https://github.com/rasmusbergpalm/DeepLearnToolbox

A helpful companion analysis of the CNN code: https://blog.csdn.net/u013007900/article/details/51428186

Notes explaining error backpropagation: https://blog.csdn.net/viatorsun/article/details/82696475

Notes on convolution operations in MATLAB: https://blog.csdn.net/baoxiao7872/article/details/80435214

Recommended reading: the book Deep Learning.

 

The CNN example trains and tests a CNN on the bundled data, images of handwritten digits (MNIST).

The code consists of the test script test_example_CNN, the data file mnist_uint8, and the functions cnnsetup, cnntrain, cnnff, cnnbp, cnnapplygrads, and cnntest, each covered below.

test_example_CNN is the test example and mnist_uint8 holds the data. The script, with comments, is as follows:

function test_example_CNN
load mnist_uint8;   % handwritten digit samples; each sample is a 28*28 image stored as a 784-element vector

train_x = double(reshape(train_x',28,28,60000))/255;   % training data: reshape into 60000 images of 28*28 and normalize to [0,1]
test_x = double(reshape(test_x',28,28,10000))/255;    % test data: 10000 images
train_y = double(train_y');
test_y = double(test_y');

%% ex1 Train a 6c-2s-12c-2s Convolutional neural network 
%will run 1 epoch in about 200 seconds and get around 11% error.
%With 100 epochs you'll get around 1.2% error

rand('state',0)    % seed the random number generator so every run produces the same random numbers

cnn.layers = {
    struct('type', 'i') %input layer
    struct('type', 'c', 'outputmaps', 6, 'kernelsize', 5) %convolution layer
    % outputmaps: number of output feature maps, 6
    % kernelsize: kernel size, 5
    struct('type', 's', 'scale', 2) %sub sampling layer, works like average pooling
    % downsampling factor 2
    struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5) %convolution layer
    % outputmaps: number of output feature maps, 12
    % kernelsize: kernel size, 5
    struct('type', 's', 'scale', 2) %subsampling layer
    % downsampling factor 2
};
% The network has 5 layers: input - convolution - subsampling - convolution - subsampling

opts.alpha = 1;   % learning rate
opts.batchsize = 50;   % number of samples per mini-batch
opts.numepochs = 1;  % number of epochs

cnn = cnnsetup(cnn, train_x, train_y);   % initialize the CNN
cnn = cnntrain(cnn, train_x, train_y, opts); % train the CNN

[er, bad] = cnntest(cnn, test_x, test_y);  % test the CNN

%plot mean squared error
figure; plot(cnn.rL);    % plot the smoothed mean squared error
assert(er<0.12, 'Too big error');
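
Once training and testing are done, the same forward pass can classify individual samples. Below is a minimal sketch of my own (not part of the toolbox) that assumes cnn, test_x, and test_y are still in the workspace after running test_example_CNN; note that cnnff expects a batch along the third dimension, so at least two images are passed:

% Minimal sketch: classify the first 10 test images with the trained network.
imgs = test_x(:, :, 1:10);            % a small batch of test images
net = cnnff(cnn, imgs);               % forward pass through the trained net
[~, pred] = max(net.o);               % most active output neuron per image
[~, truth] = max(test_y(:, 1:10));    % ground-truth label indices
disp([pred; truth]);                  % compare predictions (row 1) with labels (row 2)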

 

Following the execution order, cnnsetup runs next to initialize the CNN parameters, mainly the initial kernel weights and biases. The code, with comments, is as follows:

%% Initialize the CNN parameters
% kernels, biases, and the final single-layer perceptron

function net = cnnsetup(net, x, y)
    assert(~isOctave() || compare_versions(OCTAVE_VERSION, '3.8.0', '>='), ['Octave 3.8.0 or greater is required for CNNs as there is a bug in convolution in previous versions. See http://savannah.gnu.org/bugs/?39314. Your version is ' myOctaveVersion]);
    inputmaps = 1;   % number of maps fed into the current layer (1 at the input)
    mapsize = size(squeeze(x(:, :, 1)));
    % squeeze the 3-D array to 2-D and take its size
    % mapsize = [28, 28]
    for l = 1 : numel(net.layers)   % initialize the parameters of each layer
        if strcmp(net.layers{l}.type, 's')    % subsampling layer
            mapsize = mapsize / net.layers{l}.scale;
            % for l=3, mapsize = [24, 24]/2 = [12, 12]
            % for l=5, mapsize = [8, 8]/2 = [4, 4]
            assert(all(floor(mapsize)==mapsize), ['Layer ' num2str(l) ' size must be integer. Actual: ' num2str(mapsize)]);
            for j = 1 : inputmaps
                net.layers{l}.b{j} = 0;  % the bias of every input map of a subsampling layer is initialized to 0
            end
        end
        if strcmp(net.layers{l}.type, 'c')   % convolution layer
            mapsize = mapsize - net.layers{l}.kernelsize + 1;
            % map size after convolution (stride defaults to 1)
            % formula: (input size - kernel size)/stride + 1
            % for l=2, mapsize = [28, 28] - 5 + 1 = [24, 24]
            % for l=4, mapsize = [12, 12] - 5 + 1 = [8, 8]
            fan_out = net.layers{l}.outputmaps * net.layers{l}.kernelsize ^ 2;
            % fan-out of this layer's kernels
            % for l=2, fan_out = 6 * 5^2 = 150
            % for l=4, fan_out = 12 * 5^2 = 300
            for j = 1 : net.layers{l}.outputmaps  %  output map
                fan_in = inputmaps * net.layers{l}.kernelsize ^ 2;
                % fan-in per output map: number of kernel weights feeding it
                % for l=2, fan_in = 1 * 5^2 = 25
                % for l=4, fan_in = 6 * 5^2 = 150
                for i = 1 : inputmaps  %  input map
                    net.layers{l}.k{i}{j} = (rand(net.layers{l}.kernelsize) - 0.5) * 2 * sqrt(6 / (fan_in + fan_out));
                    % initialize each kernel with uniform values in +-sqrt(6/(fan_in + fan_out))
                    % k{i}{j} = (random 5*5 matrix - 0.5) * 2 * sqrt(6/(fan_in + fan_out))
                    % for l=2, the layer has 1*6 = 6 kernels
                    % for l=4, the layer has 6*12 = 72 kernels: each of the 12 output maps connects to the 6 input maps through its own 6 kernels
                end
                net.layers{l}.b{j} = 0;
                % bias initialized to 0
                % there is one bias per output map, not one per kernel
            end
            inputmaps = net.layers{l}.outputmaps;   % update the number of input maps for the next layer
            % for l=2, inputmaps = 6
            % for l=4, inputmaps = 12
        end
    end
        end
    end
    % 'onum' is the number of labels, that's why it is calculated using size(y, 1). If you have 20 labels so the output of the network will be 20 neurons.
    % 'fvnum' is the number of output neurons at the last layer, the layer just before the output layer.
    % 'ffb' is the biases of the output neurons.
    % 'ffW' is the weights between the last layer and the output neurons. Note that the last layer is fully connected to the output layer, that's why the size of the weights is (onum * fvnum)
    fvnum = prod(mapsize) * inputmaps;
    % number of neurons in the layer just before the output layer
    % fvnum = 4 * 4 * 12 = 192
    onum = size(y, 1);    % number of labels
    % onum = 10
    net.ffb = zeros(onum, 1);     % biases of the output neurons
    net.ffW = (rand(onum, fvnum) - 0.5) * 2 * sqrt(6 / (onum + fvnum));   % weights between the last layer and the output neurons
    % ffW = (random 10*192 matrix - 0.5) * 2 * sqrt(6 / (10 + 192))
end
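
As a sanity check on the size bookkeeping above, here is a small standalone sketch of my own that replays the map-size arithmetic for the 6c-2s-12c-2s architecture:

% Replay cnnsetup's map-size arithmetic for the 6c-2s-12c-2s network.
mapsize = [28 28];                 % input image size
mapsize = mapsize - 5 + 1;         % conv, kernel 5   -> [24 24]
mapsize = mapsize / 2;             % subsample by 2   -> [12 12]
mapsize = mapsize - 5 + 1;         % conv, kernel 5   -> [8 8]
mapsize = mapsize / 2;             % subsample by 2   -> [4 4]
fvnum = prod(mapsize) * 12;        % 4*4*12 = 192 neurons feed the output layer
fprintf('final map size: %dx%d, fvnum = %d\n', mapsize(1), mapsize(2), fvnum);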

Next in the execution order is cnntrain, which trains the network on mini-batches. The code, with comments, is as follows:

%% Train the CNN
function net = cnntrain(net, x, y, opts)
    m = size(x, 3);
    % total number of training samples
    % m = 60000
    numbatches = m / opts.batchsize;
    % number of mini-batches
    % numbatches = 60000/50 = 1200
    if rem(numbatches, 1) ~= 0
        error('numbatches not integer');
    end
    net.rL = [];
    for i = 1 : opts.numepochs    % loop over epochs
        disp(['epoch ' num2str(i) '/' num2str(opts.numepochs)]);
        tic;    % start a timer
        kk = randperm(m);     % random permutation of the indices 1..m
        for l = 1 : numbatches
            batch_x = x(:, :, kk((l - 1) * opts.batchsize + 1 : l * opts.batchsize));    % draw a random mini-batch of samples
            batch_y = y(:,    kk((l - 1) * opts.batchsize + 1 : l * opts.batchsize));    % and the corresponding labels

            net = cnnff(net, batch_x);     % forward pass
            net = cnnbp(net, batch_y);    % backpropagate the error and compute the gradients
            net = cnnapplygrads(net, opts);  % update the kernel weights and all other parameters
            if isempty(net.rL)
                net.rL(1) = net.L;
            end
            net.rL(end + 1) = 0.99 * net.rL(end) + 0.01 * net.L;
            % net.L is the MSE loss of the current batch
            % net.rL is an exponentially smoothed series of the loss
        end
        toc;
    end  
end
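
The rL update above is an exponential moving average: each new point keeps 99% of the previous smoothed value and mixes in 1% of the current batch loss. A tiny standalone illustration of my own:

% Exponential moving average as used for net.rL: smooths a noisy loss curve.
L  = 1 + 0.5 * randn(1, 200);     % a synthetic noisy "loss" sequence
rL = L(1);                        % initialize with the first value
for t = 2 : numel(L)
    rL(end + 1) = 0.99 * rL(end) + 0.01 * L(t);   % same rule as cnntrain
end
figure; plot(L); hold on; plot(rL, 'LineWidth', 2); legend('raw', 'smoothed');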

Within cnntrain, cnnff runs first: the forward pass feeds the batch through the network and produces the outputs. The code, with comments, is as follows:

%% Forward pass
function net = cnnff(net, x)
    n = numel(net.layers);   % number of layers, n = 5
    net.layers{1}.a{1} = x;
    % first layer (the input layer)
    % a holds the input maps, here a [28, 28, 50] array (one 28*28 image per sample in the batch; sizes vary with the data)
    inputmaps = 1;

    for l = 2 : n   %  for each layer
        if strcmp(net.layers{l}.type, 'c')   % convolution layer
            %  !!below can probably be handled by insane matrix operations
            for j = 1 : net.layers{l}.outputmaps   %  for each output map
                %  create temp output map
                z = zeros(size(net.layers{l - 1}.a{1}) - [net.layers{l}.kernelsize - 1 net.layers{l}.kernelsize - 1 0]);
                % z accumulates the output feature map
                % for l=2, size(net.layers{l - 1}.a{1}) = [28, 28, 50],
                % so z = zeros([28, 28, 50] - [5 - 1, 5 - 1, 0]) = zeros([24, 24, 50])
                for i = 1 : inputmaps   %  for each input map
                    %  convolve with corresponding kernel and add to temp output map
                    z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
                    % convolve the input map with its kernel and accumulate; 'valid' keeps only the fully overlapping part (no zero padding)
                end
                %  add bias, pass through nonlinearity
                net.layers{l}.a{j} = sigm(z + net.layers{l}.b{j});
                % add the bias and apply the sigmoid nonlinearity
                % the activation result becomes this layer's output
            end
            %  set number of input maps to this layers number of outputmaps
            inputmaps = net.layers{l}.outputmaps;
            % the next layer's input count equals this layer's number of output maps
        elseif strcmp(net.layers{l}.type, 's')   % subsampling layer
            %  downsample
            for j = 1 : inputmaps
                z = convn(net.layers{l - 1}.a{j}, ones(net.layers{l}.scale) / (net.layers{l}.scale ^ 2), 'valid');   %  !! replace with variable
                % convolve the previous layer's output with a 2*2 matrix of all 1/4 ('valid', unpadded part), i.e. take local means
                net.layers{l}.a{j} = z(1 : net.layers{l}.scale : end, 1 : net.layers{l}.scale : end, :);   % keep every scale-th entry; together with the mean filter this implements average pooling
            end
        end
    end
    % final single-layer perceptron
    %  concatenate all end layer feature maps into vector
    net.fv = [];
    for j = 1 : numel(net.layers{n}.a)
        sa = size(net.layers{n}.a{j});    % size of a last-layer map
        % sa = [4, 4, 50]
        net.fv = [net.fv; reshape(net.layers{n}.a{j}, sa(1) * sa(2), sa(3))];
        % each 4*4*50 map is reshaped to [16, 50]; after stacking all 12 maps, net.fv is [192, 50]
    end
    % feedforward into output perceptrons
    net.o = sigm(net.ffW * net.fv + repmat(net.ffb, 1, size(net.fv, 2)));   % output
    % sigmoid nonlinearity
    % sigmoid([10, 192] * [192, 50] + the bias replicated for the 50 samples)
    % the feature vector multiplied by the output weights
end
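
The subsampling trick above, convolving with a uniform kernel and then keeping every scale-th element, is worth seeing in isolation. A minimal sketch of my own with a 4*4 matrix and scale 2:

% Average pooling implemented as convolution plus strided sampling,
% exactly as in cnnff's subsampling branch.
A = magic(4);                                 % a 4*4 test matrix
scale = 2;
z = convn(A, ones(scale) / scale^2, 'valid'); % local 2*2 means (3*3 result)
pooled = z(1:scale:end, 1:scale:end);         % keep every 2nd row/col -> 2*2
disp(pooled)                                  % each entry is the mean of one 2*2 block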

Back in cnntrain, cnnbp runs next to compute the backpropagated error and the gradients. This part is harder to follow, so it is worth reviewing how error backpropagation works first. The code, with comments, is as follows:

function net = cnnbp(net, y)
    n = numel(net.layers);
    %   error
    net.e = net.o - y;    % difference between the network's forward-pass output and the desired output
    %  loss function
    net.L = 1/2* sum(net.e(:) .^ 2) / size(net.e, 2);
    % loss function: mean squared error (the 1/2 makes the derivative cleaner)
    %%  backprop deltas
    net.od = net.e .* (net.o .* (1 - net.o));   %  output delta
    % the output-layer error propagated back through the sigmoid
    % error * output * (1 - output) is the partial derivative of the loss; the learning rate is not applied yet
    net.fvd = (net.ffW' * net.od);
    % feature vector delta: the error propagated back to the input of the single-layer perceptron
    if strcmp(net.layers{n}.type, 'c')         %  only conv layers has sigm function; taken when the last layer is a convolution layer
        net.fvd = net.fvd .* (net.fv .* (1 - net.fv));
        % multiply by the sigmoid derivative once more, because the convolution result was passed through the nonlinearity
    end
    %  reshape feature vector deltas into output map style
    sa = size(net.layers{n}.a{1});   % each last-layer output map is 4*4; there are 12 maps and 50 samples
    fvnum = sa(1) * sa(2);   % 4*4 = 16
    for j = 1 : numel(net.layers{n}.a)   % j = 1:12
        net.layers{n}.d{j} = reshape(net.fvd(((j - 1) * fvnum + 1) : j * fvnum, :), sa(1), sa(2), sa(3));
        % each d{j} is a 4*4*50 delta array
    end
    end

    for l = (n - 1) : -1 : 1   % walk the layers from back to front
        if strcmp(net.layers{l}.type, 'c')  % convolution layer: its delta comes from the following subsampling layer
            for j = 1 : numel(net.layers{l}.a)
                net.layers{l}.d{j} = net.layers{l}.a{j} .* (1 - net.layers{l}.a{j}) .* (expand(net.layers{l + 1}.d{j}, [net.layers{l + 1}.scale net.layers{l + 1}.scale 1]) / net.layers{l + 1}.scale ^ 2);
                % expand upsamples the next layer's delta by replicating each entry scale*scale times (the inverse of the downsampling), then divides by scale^2
                % the a .* (1 - a) factor is again the error * output * (1 - output) form
            end
        elseif strcmp(net.layers{l}.type, 's')  % subsampling layer: its delta comes from the following convolution layer, via a full (transposed) convolution
            for i = 1 : numel(net.layers{l}.a)
                z = zeros(size(net.layers{l}.a{1}));
                for j = 1 : numel(net.layers{l + 1}.a)
                     z = z + convn(net.layers{l + 1}.d{j}, rot180(net.layers{l + 1}.k{i}{j}), 'full');    % rotate the kernel by 180 degrees and convolve with 'full': the adjoint of the forward 'valid' convolution
                end
                net.layers{l}.d{i} = z;
            end
        end
    end

    %%  calc gradients
    for l = 2 : n   % walk the layers from front to back
        if strcmp(net.layers{l}.type, 'c')        % convolution layer
            for j = 1 : numel(net.layers{l}.a)
                for i = 1 : numel(net.layers{l - 1}.a)  % previous layer
                    net.layers{l}.dk{i}{j} = convn(flipall(net.layers{l - 1}.a{i}), net.layers{l}.d{j}, 'valid') / size(net.layers{l}.d{j}, 3);
                    % kernel gradient = (flipped) input map convolved with the output delta, averaged over the batch
                end
                net.layers{l}.db{j} = sum(net.layers{l}.d{j}(:)) / size(net.layers{l}.d{j}, 3);  % bias gradient: the summed delta, averaged over the batch
            end
        end
    end
    % gradients (update amounts) for the single-layer perceptron
    net.dffW = net.od * (net.fv)' / size(net.od, 2);    % weight gradient
    net.dffb = mean(net.od, 2); % bias gradient

    function X = rot180(X)
        X = flipdim(flipdim(X, 1), 2);
    end
end
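
The factor output .* (1 - output) that appears throughout is the sigmoid derivative evaluated at the layer's own activations. A quick numerical check of my own (assuming sigm is defined as 1./(1+exp(-x)), as in the toolbox's util directory):

% Numerical check: d/dz sigmoid(z) equals sigmoid(z) .* (1 - sigmoid(z)).
sigm = @(x) 1 ./ (1 + exp(-x));     % same definition as the toolbox's sigm
z = linspace(-4, 4, 9);
h = 1e-6;
numeric  = (sigm(z + h) - sigm(z - h)) / (2 * h);   % central difference
analytic = sigm(z) .* (1 - sigm(z));                % form used in cnnbp
disp(max(abs(numeric - analytic)))                  % tiny, i.e. they agree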

Back in cnntrain, cnnapplygrads runs last, applying the computed updates to the kernel weights and biases. This file was missing from my copy of the toolbox, so I copied a version from code found online. The code, with comments, is as follows:

function net = cnnapplygrads(net, opts) % apply the gradients
    % update the weights of the feature-extraction layers (convolution/subsampling)
    for l = 2 : numel(net.layers) % start from the second layer
        if strcmp(net.layers{l}.type, 'c') % for each convolution layer
            for j = 1 : numel(net.layers{l}.a) % for each output map of this layer
                % iterate over all kernels net.layers{l}.k{ii}{j}
                for ii = 1 : numel(net.layers{l - 1}.a) % for each output map of the previous layer
                    net.layers{l}.k{ii}{j} = net.layers{l}.k{ii}{j} - opts.alpha * net.layers{l}.dk{ii}{j};
                    % gradient-descent step on the kernel
                end
                % gradient-descent step on the bias
                net.layers{l}.b{j} = net.layers{l}.b{j} - opts.alpha * net.layers{l}.db{j};
            end
        end
    end
    % update the weights of the single-layer perceptron
    net.ffW = net.ffW - opts.alpha * net.dffW;
    net.ffb = net.ffb - opts.alpha * net.dffb;
end

With that, the CNN training is complete. Finally, cnntest measures how accurately the trained network classifies the test set:

function [er, bad] = cnntest(net, x, y)
    %  feedforward
    net = cnnff(net, x);   % forward pass
    [~, h] = max(net.o);  % index of the largest output per sample (predicted class)
    [~, a] = max(y);   % index of the largest label entry (true class)
    bad = find(h ~= a);  % indices of the misclassified samples

    er = numel(bad) / size(y, 2);  % error rate = misclassified / total
end
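
The bad vector holds indices rather than a count, which makes it easy to inspect failure cases. A small follow-up sketch of my own, assuming cnn, test_x, and test_y are in the workspace:

% Inspect the first misclassified test image.
[er, bad] = cnntest(cnn, test_x, test_y);
if ~isempty(bad)
    idx = bad(1);                          % index of one misclassified sample
    figure; imagesc(test_x(:, :, idx));    % display the image
    colormap gray; axis image;
    title(sprintf('misclassified sample #%d, error rate %.2f%%', idx, 100 * er));
end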

 

