Machine Learning: Probabilistic Graphical Models (Homework: Exact Inference)


  In the previous three weeks' assignments, I built probabilistic graphical models and called a third-party solver to perform inference on them, obtaining the distribution of each random variable (for directed graphs) and the maximum a posteriori assignment (for undirected graphs). The main task of this week's assignment is to write the solver ourselves. Strictly speaking, a solver is not fundamentally necessary: its only job is to compute marginal distributions or the MAP. Once the joint CPD is available, finding its maximum yields the MAP, and marginalizing it over each variable yields the marginals. But this brute-force approach is extremely inefficient: for MAP queries, the entire joint CPD must be searched again every time new evidence arrives, and for single-variable marginals, the full joint must be marginalized over and over, once per variable. Taking a six-letter word as an example, the joint CPD contains as many as 26^6 (about 3.1×10^8) entries, so repeating these operations wastes an enormous amount of computation.

1. Initializing the Clique Tree

  The idea behind the clique tree algorithm is divide and conquer. Given random variables A, B, C, D, E, F, G, if A is independent of the others, then A can be handled on its own, whether we want marginals or the MAP. If A, B, C are tightly coupled and C, D, E are tightly coupled, and the two cliques agree on the marginal distribution of C, then there is no need to multiply A, B, C, D, E all together before computing the individual marginals. Looking at it the other way around: the product only lines up the corresponding values of C, so if the two marginals of C coincide, the multiplication introduces no new information from one clique into the other, and it cannot change the marginals of A, B, D, E. Mathematically, the clique tree algorithm is the same process as Variable Elimination.

  In the computer, a PGM is represented as a factorList. Each factor has a var field listing the variables in its scope (and hence which nodes it connects), a card field giving their cardinalities, and a val field describing the relationship among the variables in var. A cliqueTree is really a special kind of factorList: each of its "variables" is a clique, i.e. a cluster of the original vars, and its val still encodes a relationship among vars; the original edges between individual variables simply no longer appear. A clique tree therefore consists of two fields: 1. cliqueList; 2. edges.
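  For concreteness, here is a toy illustration of this layout, following the struct conventions used throughout the course code (the variable numbers and values are made up):

phi = struct('var', [1 2], 'card', [2 2], 'val', [0.5 0.8 0.1 0.3]);  % a factor over variables 1 and 2, both binary

C.nodes = {[1 2], [2 3]};   % two cliques: {1,2} and {2,3}
C.edges = [0 1; 1 0];       % adjacency matrix of the tree: one edge between them
C.factorList = phi;         % factors used to build the tree (normally an array of factors)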

  Initializing the clique tree involves two steps: 1. grouping the variables into cliques; 2. computing each clique's initial potential.

  Grouping variables into cliques is something of a dark art: there are many different ways to do it, and all of them can be valid. Common heuristics include minimum-edge and minimum-cut style criteria, among others. A person can usually eyeball a good clustering, but doing it algorithmically takes real work. Since this is not essential to understanding the clique tree algorithm itself, Professor Koller's starter code takes care of it for us.

  A clique's initial potential encodes the relationships among the variables inside it. The algorithm is given below. One caveat: a factor must not be used more than once. Each factor states one piece of information about the model; if two cliques both contained the same factor, that information would be counted twice. Repeating someone else's story makes you partly responsible for it too, you know? (The code below blanks out each factor after it is assigned, for exactly this reason.)

  

%COMPUTEINITIALPOTENTIALS Sets up the cliques in the clique tree that is
%passed in as a parameter.
%
%   P = COMPUTEINITIALPOTENTIALS(C) Takes the clique tree skeleton C which is a
%   struct with three fields:
%   - nodes: cell array representing the cliques in the tree.
%   - edges: represents the adjacency matrix of the tree.
%   - factorList: represents the list of factors that were used to build
%   the tree.
%
%   It returns the standard form of a clique tree P that we will use through
%   the rest of the assignment. P is a struct with two fields:
%   - cliqueList: represents an array of cliques with appropriate factors
%   from factorList assigned to each clique, where the .val of each clique
%   is initialized to the initial potential of that clique.
%   - edges: represents the adjacency matrix of the tree.
%
% Copyright (C) Daphne Koller, Stanford University, 2012

function P = ComputeInitialPotentials(C)
Input = C;
% number of cliques
N = length(Input.nodes);

% initialize cluster potentials
P.cliqueList = repmat(struct('var', [], 'card', [], 'val', []), N, 1);
P.edges = zeros(N);

% Assign each factor to exactly one clique, then multiply the factors
% assigned to each clique to obtain its initial potential.
N_factors = length(Input.factorList);
for i = 1:N
    k = 1;
    clear clique_
    for j = 1:N_factors
        % A factor may be assigned to clique i only if its entire scope is
        % contained in the clique.
        if ~isempty(Input.factorList(j).var) && ...
                all(ismember(Input.factorList(j).var, Input.nodes{i}))
            clique_(k) = Input.factorList(j);
            k = k + 1;
            % Blank out the factor so it cannot be assigned to a second
            % clique; assigning it twice would double-count its information.
            Input.factorList(j) = struct('var', [], 'card', [], 'val', []);
        end
    end
    % The initial potential is the product of the assigned factors, with
    % variables sorted into standard order.
    Joint_Dis_cliq = ComputeJointDistribution(clique_);
    Joint_Dis_cliq_std = StandardizeFactors(Joint_Dis_cliq);
    P.cliqueList(i) = Joint_Dis_cliq_std;
end
P.edges = Input.edges;
end

2. Calibrating the Clique Tree

  Continuing the earlier example: A, B, C are tightly coupled and C, D, E are tightly coupled, so they form two cliques. If the two cliques agree on the marginal of C, we can compute the marginals of A, B, D, E directly within each clique, without multiplying the cliques together. Sadly, in practice the two marginals of C usually differ. The tree must then be calibrated, in the hope that after "calibration" both sides have reached a consensus about C. Clearly, a calibrated clique tree makes the marginal of any variable cheap to obtain: we only ever marginalize a small local joint distribution.

  The simplest way to make the two sides agree about C is to multiply C's marginal as computed in the "A clique" into the potential of the "E clique", and conversely to multiply C's marginal as computed in the "E clique" into the potential of the A clique. After that, C's marginal is identical in both cliques: on each side it becomes margin(C,A)*margin(C,E). This is the naive idea behind clique tree calibration. Mathematically, calibration again comes from the VE algorithm: once A and B have been eliminated, C moves on to the next round of VE, and what is left after summing out A and B is exactly C's marginal within the A clique.
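  In standard notation, the message that clique i sends to its neighbor j is the generic sum-product message (this matches the code shown later):

    \delta_{i \to j} = \sum_{C_i \setminus S_{i,j}} \psi_i \prod_{k \in N(i) \setminus \{j\}} \delta_{k \to i}

where \psi_i is the initial potential of clique i, S_{i,j} is the sepset shared by cliques i and j, and N(i) is the set of i's neighbors.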

  The key to calibration is knowing the order in which messages propagate. Messages are generally passed first from the leaves toward the root, and then from the root back to the leaves. Moreover, a clique may not send a message to a neighbor until it has received messages from all of its other neighbors. A sketch of a routine that picks the next message to send is given below.

  

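  The course provides the interface GetNextCliques(P, MESSAGES), which returns the next ready pair (i,j), or (0,0) once all messages have been sent. This is a minimal sketch of it, assuming the convention that a message MESSAGES(i,j) counts as sent once its .var field is non-empty; it illustrates the idea rather than reproducing the official implementation:

function [i, j] = GetNextCliques(P, messages)
% Find a pair (i,j) such that clique i is ready to transmit to its
% neighbor j: i has already received messages from all of its other
% neighbors, and the message i->j has not been computed yet.
N = length(P.cliqueList);
for i = 1:N
    for j = 1:N
        if P.edges(i,j) == 1 && isempty(messages(i,j).var)
            ready = true;
            for k = 1:N
                % Every neighbor of i other than j must have sent to i.
                if P.edges(k,i) == 1 && k ~= j && isempty(messages(k,i).var)
                    ready = false;
                    break;
                end
            end
            if ready
                return;
            end
        end
    end
end
% All messages have been passed.
i = 0;
j = 0;
end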

  

  Once the message-passing order is available, the messages themselves can be computed. The message a clique passes on should be its "complete knowledge" of the propagated variables: the clique's own belief about them, plus all the intelligence it has received from its other neighbors. Note that when reporting to the clique downstream, you must summarize everything you know, but you must not repeat back what that clique itself told you; echoing its own message double-counts that information. (Repeating the rumor makes you responsible too, remember?)

  

    % Upward/downward message-passing loop (N is the number of cliques;
    % MESSAGES is an N-by-N array of factors, initially empty).
    while (1)

      [i, j] = GetNextCliques(P, MESSAGES);

      % GetNextCliques returns i == 0 once every message has been sent.
      if i == 0
        break
      end

      % Sum out the variables of clique i that are not shared with clique
      % j; what remains (the sepset) is what gets propagated.
      to_be_summed = setdiff(P.cliqueList(i).var, P.cliqueList(j).var);
      to_be_propagated = setdiff(P.cliqueList(i).var, to_be_summed);

      % Collect the incoming messages from every neighbor of i except j,
      % so that j's own message is never echoed back to it.
      tmp_ = 1;
      clear factorList
      for k = 1:N
          if P.edges(i,k) == 1 && k ~= j && ~isempty(MESSAGES(k,i).var)
              factorList(tmp_) = MESSAGES(k,i);
              tmp_ = tmp_ + 1;
          end
      end
      factorList(tmp_) = P.cliqueList(i);

      % The message is the marginal of (initial potential x incoming
      % messages) over the sepset variables.
      MESSAGES(i,j) = ComputeMarginal(to_be_propagated, ComputeJointDistribution(factorList), []);
    end

  After the messages have finished propagating from the leaves up and from the root back down, each clique summarizes the messages arriving from its neighbors, i.e. multiplies them into its own potential (a message is itself a marginal distribution):

    % Belief computation: each clique multiplies its initial potential
    % with every incoming message.
    N = length(P.cliqueList);
    for i = 1:N
        tmp_ = 1;
        for k = 1:N
          if P.edges(i,k) == 1
              factorList(tmp_) = MESSAGES(k,i);
              tmp_ = tmp_ + 1;
          end
        end
        factorList(tmp_) = P.cliqueList(i);
        belief(i) = ComputeJointDistribution(factorList);
        clear factorList
    end

  At this point the clique tree is said to be calibrated. Marginalizing the resulting beliefs over individual variables yields every variable's marginal distribution.
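  For example, here is a minimal usage sketch, assuming belief(i) holds the beliefs computed above (the variable number v is made up):

v = 3;                             % hypothetical variable of interest
for i = 1:length(belief)
    if ismember(v, belief(i).var)
        % Sum out everything else in the clique, leaving P(v).
        Mv = ComputeMarginal(v, belief(i), []);
        break;
    end
end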

3. MAP Estimation with the Clique Tree

  Often we are not interested in the distribution of any single variable, but in which joint assignment of [A,B,C,D,E] has the highest probability. This idea applies to signal decoding, OCR, image processing, and similar domains: frequently we do not care what label an individual pixel gets, only what label an entire segmented region gets. Such problems are called maximum a posteriori (MAP) estimation.

            argmax_{a,b} P(a,b) = argmax_{a,b} P(a)P(b|a) = argmax_a { P(a) · max_b P(b|a) }

  This immediately recalls the marginalization discussed earlier, except that summation is replaced by argmax. P(a) · max_b P(b|a) is still a distribution over A, but one in which, for every value of A, the assignment to val already corresponds to the best choice of b. So if we then select the largest val in that factor, the assignment corresponds to argmax_{a,b}. The point of this operation is that the MAP over a set of variables can be computed by divide and conquer, and each variable's individual MAP value is simply its coordinate in the global MAP assignment. The MESSAGES are then computed as follows:

  

% Work in log-space: products of potentials become sums of log-potentials.
for i = 1:N
    P.cliqueList(i).val = log(P.cliqueList(i).val);
end

    while (1)

      [i, j] = GetNextCliques(P, MESSAGES);

      if i == 0
        break
      end

      % Max out the variables of clique i that are not shared with j.
      to_be_summed = setdiff(P.cliqueList(i).var, P.cliqueList(j).var);

      % Collect the incoming messages from every neighbor of i except j.
      tmp_ = 1;
      clear factorList
      for k = 1:N
          if P.edges(i,k) == 1 && k ~= j && ~isempty(MESSAGES(k,i).var)
              factorList(tmp_) = MESSAGES(k,i);
              tmp_ = tmp_ + 1;
          end
      end
      factorList(tmp_) = P.cliqueList(i);

      % In log-space the factor "product" is FactorSum.
      F = factorList;
      Joint = F(1);
      for l = 2:length(F)
          Joint = FactorSum(Joint, F(l));
      end
      % Max-marginalize (instead of sum-marginalize) over the eliminated
      % variables to obtain the max-product message.
      MESSAGES(i,j) = FactorMaxMarginalization(Joint, to_be_summed);
    end

    The val fields are converted to logs here because in MAP estimation the cardinalities tend to be large (the OCR alphabet has card 26!). The product of many correspondingly small vals quickly becomes too small to multiply reliably in floating point, whereas in log-space products become sums (hence FactorSum above).
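    A small numeric illustration of the underflow problem (the numbers are made up for demonstration):

p = 1/26;
prod_direct = p ^ 300;        % ~1e-424: underflows to 0 in double precision
sum_of_logs = 300 * log(p);   % about -977.4: perfectly representable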

   The messages are then combined into beliefs as follows:

    % Log-space belief computation: add each clique's log-potential to all
    % of its incoming log-messages via FactorSum.
    for i = 1:N
        tmp_ = 1;
        for k = 1:N
          if P.edges(i,k) == 1
              factorList(tmp_) = MESSAGES(k,i);
              tmp_ = tmp_ + 1;
          end
        end
        factorList(tmp_) = P.cliqueList(i);

        F = factorList;
        belief = F(1);
        for l = 2:length(F)
            belief = FactorSum(belief, F(l));
        end

        clear factorList
        Belief(i) = belief;
    end
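  Once Belief holds the calibrated max-marginals, each variable's MAP value can be read off locally. Here is a hypothetical decoding sketch (numVars, the total number of variables, is assumed known; this also assumes the maximizer is unique):

MAP = zeros(1, numVars);
for v = 1:numVars
    for i = 1:length(Belief)
        if ismember(v, Belief(i).var)
            % Max out every other variable in the clique, leaving only v.
            mm = FactorMaxMarginalization(Belief(i), setdiff(Belief(i).var, v));
            [~, MAP(v)] = max(mm.val);   % index of v's best assignment
            break;
        end
    end
end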

  

4. Summary

  As an exact inference method, the clique tree algorithm greatly reduces the amount of computation and the size of the search space relative to plain VE. Still, being an exact method, it has significant limitations. Next week's homework will implement the MCMC algorithm, the same Monte Carlo family that AlphaGo made famous. Stay tuned.

      For all the code, click here.

  

 

