The concept of an energy model comes from statistical mechanics, where it describes the state of an entire system: the more ordered the system, the smaller its energy fluctuations and the closer it is to equilibrium; the more disordered, the larger the fluctuations. For example, in an isolated body whose internal temperature differs from place to place, heat flows from the hotter regions to the colder ones until the temperature is the same everywhere, i.e. the state of thermal equilibrium. In statistical mechanics, the relative probability of the system being in state $i$ is the Boltzmann factor $e^{-E_i/(k_B T)}$, where $T$ is the temperature, $k_B$ is the Boltzmann constant, and $E_i$ is the energy of state $i$. The Boltzmann factor is not itself a probability, because it is not normalized. To normalize it into a probability, we divide it by the sum of the Boltzmann factors over all possible states of the system, $Z = \sum_j e^{-E_j/(k_B T)}$, called the partition function. This gives the Boltzmann distribution $p_i = e^{-E_i/(k_B T)} / Z$.
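As a quick numerical sketch (my own illustration, not from the original post), the Boltzmann distribution can be computed directly from a list of state energies:

```python
import numpy as np

def boltzmann_distribution(energies, kT=1.0):
    """Normalize Boltzmann factors exp(-E/kT) into probabilities."""
    factors = np.exp(-np.asarray(energies, dtype=float) / kT)
    return factors / factors.sum()  # dividing by the partition function Z

# Three hypothetical states: lower energy -> higher probability
p = boltzmann_distribution([0.0, 1.0, 2.0], kT=1.0)
print(p, p.sum())
```

Lower-energy states receive higher probability, and the division by Z makes the probabilities sum to 1.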
A Boltzmann Machine (BM) is a special form of log-linear Markov Random Field (MRF), i.e. one whose energy function is linear in its free parameters. By introducing hidden units, we can increase the model's expressive power and represent very complex probability distributions. The Restricted Boltzmann Machine (RBM) adds further constraints: in an RBM there are no visible-to-visible connections and no hidden-to-hidden connections, as shown in the figure below.
In the restricted Boltzmann machine, the energy function is defined as $E(v,h) = -b^{T}v - c^{T}h - h^{T}Wv$, where $b$, $c$, and $W$ are the model parameters: $b$ and $c$ are the biases of the visible and hidden layers respectively, and $W$ is the weight matrix connecting the visible and hidden layers.
With the three formulas above we can use maximum likelihood estimation to solve for the model parameters. Define the free energy $F(v) = -\log \sum_{h} e^{-E(v,h)}$; the probability $p(v)$ can then be rewritten as $p(v) = e^{-F(v)} / Z$, with $Z = \sum_{v} e^{-F(v)}$.
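For binary hidden units the sum over $h$ in the free energy can be carried out analytically, giving $F(v) = -b^{T}v - \sum_i \log(1 + e^{c_i + W_i v})$. The following sketch (toy sizes and random weights are my own assumptions) checks this identity against brute-force enumeration of all hidden states:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(3, 4))   # hidden x visible weights (toy sizes)
b = np.zeros(4)                          # visible biases
c = np.zeros(3)                          # hidden biases

def energy(v, h):
    """E(v, h) = -b'v - c'h - h'Wv for a binary RBM."""
    return -b @ v - c @ h - h @ W @ v

def free_energy_analytic(v):
    """F(v) = -b'v - sum_i log(1 + exp(c_i + W_i v)), binary hidden units."""
    return -b @ v - np.sum(np.logaddexp(0.0, c + W @ v))

def free_energy_bruteforce(v):
    """F(v) = -log sum_h exp(-E(v, h)), enumerating all 2^3 hidden states."""
    hs = [np.array(h) for h in itertools.product([0, 1], repeat=3)]
    return -np.log(sum(np.exp(-energy(v, h)) for h in hs))

v = np.array([1, 0, 1, 1])
print(free_energy_analytic(v), free_energy_bruteforce(v))  # the two agree
```

With only a handful of hidden units the brute-force sum is feasible, which makes it a convenient sanity check for the analytic form.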
Because the units within each layer are conditionally independent given the other layer (the visible units are independent given $h$, and the hidden units are independent given $v$), we can use this property to obtain $p(h \mid v) = \prod_i p(h_i \mid v)$ and $p(v \mid h) = \prod_j p(v_j \mid h)$.
The probability of a unit taking the value 1 is given by the logistic function, as in logistic regression: $P(h_i = 1 \mid v) = \mathrm{sigm}(c_i + W_i v)$ and $P(v_j = 1 \mid h) = \mathrm{sigm}(b_j + W_j^{T} h)$, where $\mathrm{sigm}(x) = 1/(1+e^{-x})$.
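These conditionals translate directly into code; the sketch below uses made-up toy dimensions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
W = rng.normal(scale=0.1, size=(3, 4))  # hidden x visible weights (toy sizes)
b = np.zeros(4)
c = np.zeros(3)

def p_h_given_v(v):
    """P(h_i = 1 | v) = sigm(c_i + W_i v), one value per hidden unit."""
    return sigmoid(c + W @ v)

def p_v_given_h(h):
    """P(v_j = 1 | h) = sigm(b_j + W'_j h), one value per visible unit."""
    return sigmoid(b + W.T @ h)

v = np.array([1, 0, 1, 1])
probs = p_h_given_v(v)
h_sample = (rng.random(3) < probs).astype(int)  # Bernoulli sample of h
print(probs, h_sample)
```

Sampling a layer then amounts to comparing these probabilities against uniform random numbers, exactly as the MATLAB code at the end of the post does with `rand`.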
With the above in hand, we can derive the gradients for the parameter updates. Differentiating the negative log-likelihood gives, for each parameter $\theta$, $-\frac{\partial \log p(v)}{\partial \theta} = \frac{\partial F(v)}{\partial \theta} - \sum_{\tilde{v}} p(\tilde{v}) \frac{\partial F(\tilde{v})}{\partial \theta}$, which for the RBM parameters reduces to $\Delta w_{ij} \propto \langle v_j h_i \rangle_{\text{data}} - \langle v_j h_i \rangle_{\text{model}}$, $\Delta b_j \propto \langle v_j \rangle_{\text{data}} - \langle v_j \rangle_{\text{model}}$, and $\Delta c_i \propto \langle h_i \rangle_{\text{data}} - \langle h_i \rangle_{\text{model}}$. The model expectation is intractable to compute exactly, which is why sampling is needed.
We use Gibbs sampling, a Markov-chain method. For a $d$-dimensional random vector $x = (x_1, x_2, \ldots, x_d)$, suppose we cannot obtain the joint distribution $p(x)$ directly, but we do know the conditional distribution of each component $x_i$ given the others, $p(x_i \mid x_{-i})$, where $x_{-i} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_d)$. Then, starting from an arbitrary state $(x_1^{(0)}, x_2^{(0)}, \ldots, x_d^{(0)})$, we iteratively resample each component of the state from its conditional distribution. As the number of sweeps $n$ increases, the distribution of $(x_1^{(n)}, x_2^{(n)}, \ldots, x_d^{(n)})$ converges at a geometric rate in $n$ to the joint distribution $p(x)$.
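As a toy illustration of Gibbs sampling in general (my own example, not from the post), consider a 2-D Gaussian with correlation $\rho$, chosen because its conditionals are known in closed form:

```python
import numpy as np

def gibbs_bivariate_normal(n_sweeps, rho, rng):
    """Gibbs-sample a standard bivariate normal with correlation rho."""
    x1, x2 = 0.0, 0.0
    samples = []
    for _ in range(n_sweeps):
        # p(x1 | x2) = N(rho * x2, 1 - rho^2), and symmetrically for x2
        x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))
        x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))
        samples.append((x1, x2))
    return np.array(samples)

rho = 0.8
rng = np.random.default_rng(3)
s = gibbs_bivariate_normal(20000, rho, rng)
print(np.corrcoef(s[:, 0], s[:, 1])[0, 1])  # close to rho after many sweeps
```

Even though each step only ever samples one coordinate from its conditional, the empirical correlation of the chain's samples recovers the joint structure.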
Thanks to the RBM's symmetric structure and the conditional independence of its units, we can use Gibbs sampling to draw random samples from the distribution the RBM defines. Concretely, k-step Gibbs sampling in an RBM works as follows: initialize the visible units with a training sample (or a random visible state) $v_0$, then alternate the two sampling steps $h_t \sim p(h \mid v_t)$ and $v_{t+1} \sim p(v \mid h_t)$ for $t = 0, \ldots, k-1$.
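The alternating chain can be sketched as follows (the sizes and weights here are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_k(v0, W, b, c, k, rng):
    """k steps of alternating Gibbs sampling in a binary RBM.

    W: (n_visible, n_hidden) weights; b, c: visible/hidden biases.
    Follows the chain v0 -> h0 -> v1 -> h1 -> ... -> vk.
    """
    v = v0
    for _ in range(k):
        h = (rng.random(W.shape[1]) < sigmoid(v @ W + c)).astype(float)  # h_t ~ p(h|v_t)
        v = (rng.random(W.shape[0]) < sigmoid(h @ W.T + b)).astype(float)  # v_{t+1} ~ p(v|h_t)
    return v

rng = np.random.default_rng(5)
W = 0.1 * rng.normal(size=(6, 4))  # toy sizes, not from the post
b = np.zeros(6)
c = np.zeros(4)
v0 = (rng.random(6) < 0.5).astype(float)
print(gibbs_k(v0, W, b, c, k=3, rng=rng))
```

Each sweep touches a whole layer at once rather than one component at a time, which is exactly where the RBM's conditional independence pays off.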
In theory, each parameter update would require running the chain above all the way to convergence; the performance cost this implies is clearly unacceptable.
Professor Hinton proposed an improvement called Contrastive Divergence, i.e. CD-k. He pointed out that there is no need to wait for the chain to converge: samples can be obtained after just k steps of Gibbs sampling, and very few steps (a single step in the experiments) already give sufficiently good results.
Below is pseudocode for the CD-k algorithm as used by the RBM.
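A CD-1 update can also be sketched in NumPy (sizes, learning rate, and data below are made up; the structure parallels the MATLAB code further down):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(data, W, b, c, lr, rng):
    """One CD-1 update for a binary RBM; returns the reconstruction error.

    data: (batch, n_visible); W: (n_visible, n_hidden); b, c: biases.
    """
    n = data.shape[0]
    # Positive phase: hidden probabilities and states driven by the data
    pos_h_probs = sigmoid(data @ W + c)
    pos_h_states = (rng.random(pos_h_probs.shape) < pos_h_probs).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again
    neg_v_probs = sigmoid(pos_h_states @ W.T + b)
    neg_h_probs = sigmoid(neg_v_probs @ W + c)
    # Gradient estimates: <v h>_data - <v h>_recon, and bias analogues
    W += lr * (data.T @ pos_h_probs - neg_v_probs.T @ neg_h_probs) / n
    b += lr * (data - neg_v_probs).mean(axis=0)
    c += lr * (pos_h_probs - neg_h_probs).mean(axis=0)
    return np.sum((data - neg_v_probs) ** 2)  # squared reconstruction error

rng = np.random.default_rng(4)
W = 0.1 * rng.normal(size=(6, 4))  # toy RBM: 6 visible, 4 hidden units
b = np.zeros(6)
c = np.zeros(4)
data = (rng.random((8, 6)) < 0.5).astype(float)
errs = [cd1_step(data, W, b, c, lr=0.1, rng=rng) for _ in range(200)]
print(errs[0], errs[-1])  # reconstruction error tends to decrease
```

Note that, as in Hinton's code, only the hidden states of the positive phase are binarized; the negative phase keeps probabilities, which lowers the variance of the update.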
A C++ implementation of deep learning has been put on GitHub. Due to time constraints only the rough framework is implemented and the details still need improvement; everyone is welcome to participate: https://github.com/loujiayu/deeplearning
Attached below is Geoff Hinton's MATLAB code for the RBM.
% Version 1.000
%
% Code provided by Geoff Hinton and Ruslan Salakhutdinov
%
% Permission is granted for anyone to copy, use, modify, or distribute this
% program and accompanying programs and documents for any purpose, provided
% this copyright notice is retained and prominently displayed, along with
% a note saying that the original programs are available from our
% web page.
% The programs and documents are distributed without any warranty, express or
% implied. As the programs were written for research purposes only, they have
% not been tested to the degree that would be advisable in any important
% application. All use of these programs is entirely at the user's own risk.

% This program trains Restricted Boltzmann Machine in which
% visible, binary, stochastic pixels are connected to
% hidden, binary, stochastic feature detectors using symmetrically
% weighted connections. Learning is done with 1-step Contrastive Divergence.
% The program assumes that the following variables are set externally:
% maxepoch  -- maximum number of epochs
% numhid    -- number of hidden units
% batchdata -- the data that is divided into batches (numcases numdims numbatches)
% restart   -- set to 1 if learning starts from beginning

epsilonw  = 0.1;   % Learning rate for weights
epsilonvb = 0.1;   % Learning rate for biases of visible units
epsilonhb = 0.1;   % Learning rate for biases of hidden units
weightcost = 0.0002;
initialmomentum = 0.5;
finalmomentum   = 0.9;

[numcases numdims numbatches]=size(batchdata);

if restart ==1,
  restart=0;
  epoch=1;

  % Initializing symmetric weights and biases.
  vishid    = 0.1*randn(numdims, numhid);
  hidbiases = zeros(1,numhid);
  visbiases = zeros(1,numdims);

  poshidprobs = zeros(numcases,numhid);
  neghidprobs = zeros(numcases,numhid);
  posprods    = zeros(numdims,numhid);
  negprods    = zeros(numdims,numhid);
  vishidinc   = zeros(numdims,numhid);
  hidbiasinc  = zeros(1,numhid);
  visbiasinc  = zeros(1,numdims);
  batchposhidprobs=zeros(numcases,numhid,numbatches);
end

for epoch = epoch:maxepoch,
  fprintf(1,'epoch %d\r',epoch);
  errsum=0;
  for batch = 1:numbatches,
    fprintf(1,'epoch %d batch %d\r',epoch,batch);

    %%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    data = batchdata(:,:,batch);
    poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1)));
    batchposhidprobs(:,:,batch)=poshidprobs;
    posprods  = data' * poshidprobs;
    poshidact = sum(poshidprobs);
    posvisact = sum(data);
    %%%%%%%%% END OF POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    poshidstates = poshidprobs > rand(numcases,numhid);

    %%%%%%%%% START NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));
    neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1)));
    negprods  = negdata'*neghidprobs;
    neghidact = sum(neghidprobs);
    negvisact = sum(negdata);
    %%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    err = sum(sum( (data-negdata).^2 ));
    errsum = err + errsum;

    if epoch>5,
      momentum=finalmomentum;
    else
      momentum=initialmomentum;
    end;

    %%%%%%%%% UPDATE WEIGHTS AND BIASES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    vishidinc = momentum*vishidinc + ...
                epsilonw*( (posprods-negprods)/numcases - weightcost*vishid);
    visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact);
    hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact);

    vishid = vishid + vishidinc;
    visbiases = visbiases + visbiasinc;
    hidbiases = hidbiases + hidbiasinc;
    %%%%%%%%%%%%%%%% END OF UPDATES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  end
  fprintf(1, 'epoch %4i error %6.1f \n', epoch, errsum);
end;