Directed Minimum Spanning Tree: Chu-Liu/Edmonds Algorithm


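Our modern databases course project requires implementing a graph query system, with features including attribute-based subgraph queries, reachability queries (optional), shortest-path queries (optional), top-K shortest-path queries (optional), and graphical display (optional). The class is split into a subgraph isomorphism query group and a reachability and top-K path query group.

Our group leader had previously studied the paper "Efficiently answering reachability queries on very large directed graphs", which computes reachability with a Path-tree and requires constructing a maximum spanning tree (with no fixed root). So I, the one mostly tagging along, started looking into Edmonds' algorithm, the maximum spanning tree algorithm for singly connected directed graphs.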

 

 

Introduction to Edmonds' Algorithm


Solving The Directed MST Problem

Chu and Liu [2], Edmonds [3], and Bock [4] have independently given efficient algorithms for finding the MST on a directed graph. The Chu-Liu and Edmonds algorithms are virtually identical; the Bock algorithm is similar but stated on matrices instead of on graphs. Furthermore, a distributed algorithm is given by Humblet [5]. In the sequel, we shall briefly illustrate the Chu-Liu/Edmonds algorithm, followed by a comprehensive example (due to [1]). Readers can also refer to [6] and [7] for an efficient implementation of this algorithm, which runs in O(m log n) time in general and O(n^2) time on dense graphs.

Chu-Liu/Edmonds Algorithm

  1. Discard the arcs entering the root, if any. For each node other than the root, select the entering arc with the smallest cost. Let the selected n-1 arcs be the set S.
  2. If no cycle is formed, G(N,S) is an MST. Otherwise, continue.
  3. For each cycle formed, contract the nodes in the cycle into a pseudo-node (k), and modify the cost of each arc which enters a node (j) in the cycle from some node (i) outside the cycle according to the following equation.

    c(i,k) = c(i,j) - (c(x(j),j) - min_{j} c(x(j),j))

    where c(x(j),j) is the cost of the arc in the cycle which enters j, and the minimum is taken over all nodes j in the cycle.

  4. For each pseudo-node, select the entering arc which has the smallest modified cost. Replace the arc in S which enters the same real node by the newly selected arc.
  5. Go to step 2 with the contracted graph.

The key idea of the algorithm is to find the replacement arc(s) that eliminate the cycle(s), if any, at the minimum extra cost. The given equation expresses this extra cost. The following example illustrates that the contraction technique finds the minimum-extra-cost replacement arc (2,3) for arc (4,3), and hence the cycle is eliminated.

[Figure: ex2]
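To make the cost-modification equation concrete, here is a minimal Python sketch. The function name `modified_costs` and all arc costs below are made up for illustration (they are not taken from the original figure); only the shape of the example, a cycle between nodes 3 and 4 with arc (2,3) entering it, mirrors the text.

```python
def modified_costs(cycle_in_cost, entering_arcs):
    """Apply c(i,k) = c(i,j) - (c(x(j),j) - min_j c(x(j),j)).

    cycle_in_cost: {j: c(x(j), j)}, the cost of the cycle arc entering each
                   cycle node j.
    entering_arcs: list of (i, j, cost) with i outside the cycle, j inside.
    Returns a list of (i, j, modified_cost) in the same order.
    """
    cheapest_cycle_arc = min(cycle_in_cost.values())
    return [
        (i, j, cost - (cycle_in_cost[j] - cheapest_cycle_arc))
        for i, j, cost in entering_arcs
    ]


# Hypothetical numbers: cycle 3 -> 4 -> 3 with c(4,3) = 2 and c(3,4) = 1.
cycle = {3: 2, 4: 1}
# Two arcs enter the cycle from outside: (2,3) costing 5 and (1,4) costing 6.
arcs = [(2, 3, 5), (1, 4, 6)]

result = modified_costs(cycle, arcs)
best = min(result, key=lambda arc: arc[2])
print(result)  # modified costs of the entering arcs
print(best)    # the arc chosen to break the cycle
```

With these assumed costs, arc (2,3) gets the smaller modified cost, so it replaces the cycle arc (4,3) that enters the same real node.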

 

References

  1. E. Lawler, "Combinatorial optimization: networks and matroids", Saunders College Publishing, 1976.
  2. Y. J. Chu and T. H. Liu, "On the shortest arborescence of a directed graph", Science Sinica, v.14, 1965, pp.1396-1400.
  3. J. Edmonds, "Optimum branchings", J. Research of the National Bureau of Standards, 71B, 1967, pp.233-240.
  4. F. Bock, "An algorithm to construct a minimum spanning tree in a directed network", Developments in Operations Research, Gordon and Breach, NY, 1971, pp.29-44.
  5. P. Humblet, "A distributed algorithm for minimum weighted directed spanning trees", IEEE Trans. on Communications, v.COM-31, n.6, 1983, pp.756-762.
  6. R. E. Tarjan, "Finding Optimum Branchings", Networks, v.7, 1977, pp.25-35.
  7. P. M. Camerini, L. Fratta, and F. Maffioli, "A note on finding optimum branchings", Networks, v.9, 1979, pp.309-312.

 

Below is the algorithm description from Wikipedia, which also covers computing the total weight of the final maximum spanning tree.

BV: a vertex bucket

BE: an edge bucket

G0 = (V0, E0): the original digraph

v: a vertex

e: an edge of maximum positive weight that is incident to v

Ci: a circuit

ui: a replacement vertex for Ci

[Image: Wikipedia's description of the algorithm]

 

Improvements in Algorithmic Complexity


Regarding the algorithm's complexity, Wikipedia says:

The order of this algorithm is O(EV). There is a faster implementation of the algorithm by Robert Tarjan. The order is O(E log V) for a sparse graph and O(V^2) for a dense graph. This is as fast as Prim's algorithm for an undirected minimum spanning tree. In 1986, Gabow, Galil, Spencer, and Tarjan made a faster implementation, and its order is O(E + V log V).

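Fibonacci heaps were invented by Fredman and Tarjan in 1984. Tarjan went on to apply F-heaps to many graph algorithms to reduce their complexity, for example in this 1986 paper, which applies them to Edmonds' algorithm: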

H. N. Gabow, Z. Galil, T. Spencer, and R. E. Tarjan, “Efficient algorithms for finding minimum spanning trees in undirected and directed graphs,” Combinatorica 6 (1986), 109-122.

By observing that in certain situations items can be moved among F-heaps in O(1) amortized time per item moved, we obtain an implementation of Edmonds' minimum directed spanning tree algorithm [16] with a running time of O(n log n + m).

[16] R. E. Tarjan, Applications of path compression on balanced trees, J. Assoc. Comput. Mach. 26 (1979), 690-715.

 

Implementations of Tarjan's Version


At the end, Wikipedia links to two implementations:

Edmonds's algorithm ( edmonds-alg ) – An open source implementation of Edmonds's algorithm written in C++ and licensed under the MIT License. This source is using Tarjan's implementation for the dense graph.

The package edmonds-alg contains a C++-implementation of Edmonds's optimum branching algorithm as described by Tarjan in 1977.

 

AlgoWiki – Edmonds's algorithm - A public-domain implementation of Edmonds's algorithm written in Java.

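In any case, after integrating it into my code, I could not make sense of how the code behaved. I saw someone say that getCycles() in the AlgoWiki implementation is buggy; they also provided a new implementation of Tarjan's version, though I do not know how well it works.

Tarjan's paper: Finding Optimum Branchings

A correction to the above paper: A Note on Finding Optimum Branchings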

 

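Coolshell once covered some interesting algorithm code, including a Java implementation of Edmonds's Matching Algorithm. On closer inspection, that is not the Edmonds's algorithm for maximum spanning trees, so I got excited for nothing.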

 

Here are some MATLAB implementations as well:

http://www.mathworks.com/matlabcentral/fileexchange/24327-maximumminimum-weight-spanning-tree-directed

http://www.mathworks.com/matlabcentral/fileexchange/24899

 

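Algorithm with a fixed root:

1. Remove all self-loops (edges from a vertex to itself).
2. Remove all edges entering the root.
3. Check whether the root can reach every vertex in the graph; if it cannot, no spanning tree exists.
4. Repeat the following steps until a spanning tree is formed:
4.1 Find the minimum incoming edge of every vertex. O(E)
4.2 Find all jellyfish (cycles). If there are none, the current edge set is already the minimum spanning tree. O(V)
4.3 Adjust the weights of all edges entering a jellyfish cycle. O(E)
w(a, x) -= w(å, x), where åx is the minimum incoming edge of vertex x and ax is any other edge entering x.
4.4 Contract each jellyfish cycle into a single vertex. O(E)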
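The fixed-root steps above can be sketched in Python as follows. This is a minimal O(VE) sketch that returns only the total weight, not the tree itself; the function name `min_arborescence_cost` and the edge format `(u, v, w)` with vertices numbered `0..n-1` are assumptions made for illustration. Two simplifications are labeled in the comments: the reachability check of step 3 is folded into the loop, and the weight adjustment of step 4.3 is applied to every edge rather than only to edges entering a cycle.

```python
import math


def min_arborescence_cost(n, root, edges):
    """Total weight of a minimum spanning arborescence rooted at `root`,
    or None if the root cannot reach every vertex."""
    # Steps 1-2: drop self-loops and edges entering the root.
    edges = [(u, v, w) for u, v, w in edges if u != v and v != root]
    total = 0
    while True:
        # Step 4.1: cheapest incoming edge of every vertex.
        best = [math.inf] * n
        parent = [-1] * n
        for u, v, w in edges:
            if w < best[v]:
                best[v], parent[v] = w, u
        # Step 3, checked lazily: a non-root vertex without an incoming
        # edge (possibly a contracted cycle) cannot be reached.
        if any(parent[v] == -1 for v in range(n) if v != root):
            return None
        best[root] = 0
        total += sum(best)
        # Step 4.2: follow parent pointers to find cycles.
        comp = [-1] * n   # id of the contracted vertex
        seen = [-1] * n   # start vertex of the walk that last visited x
        count = 0
        for v in range(n):
            x = v
            while x != root and seen[x] != v and comp[x] == -1:
                seen[x] = v
                x = parent[x]
            if x != root and comp[x] == -1:  # walk returned to itself
                comp[x] = count
                y = parent[x]
                while y != x:
                    comp[y] = count
                    y = parent[y]
                count += 1
        if count == 0:
            return total  # no cycle: the selected edges form the tree
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        # Steps 4.3-4.4: w(a, x) -= w(å, x), then contract. Applying the
        # reduction to every edge is a common simplification.
        edges = [(comp[u], comp[v], w - best[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        n, root = count, comp[root]


example = [(0, 1, 5), (1, 2, 1), (2, 1, 1), (0, 2, 6), (2, 3, 2)]
print(min_arborescence_cost(4, 0, example))
```

In the example graph, vertices 1 and 2 first pick each other and form a cycle; after contraction the edge 0 -> 1 breaks it, giving the tree 0 -> 1 -> 2 -> 3.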

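Algorithm without a fixed root:

1. Remove all self-loops (edges from a vertex to itself).
2. Repeat the following steps until a spanning tree is formed:
2.1 Find the minimum incoming edge of every vertex. O(E)
If two or more vertices have no incoming edge, no spanning tree exists.
(A vertex with no incoming edge can serve as the root of the spanning tree.)
2.2 Find all jellyfish (cycles). If there are none, the current edge set is already the minimum spanning tree. O(V)
2.3 Adjust the weights of all edges entering a jellyfish cycle. O(E)
w(a, x) -= w(å, x), where åx is the minimum incoming edge of vertex x and ax is any other edge entering x.
2.4 Contract each jellyfish cycle into a single vertex. O(E)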
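One way to handle the unrooted case without tracking candidate roots is a different, widely used trick rather than the procedure listed above: add a virtual root with a very expensive edge to every vertex, solve the fixed-root problem, and reject the answer if more than one virtual edge was needed. The sketch below assumes that approach; the names `rooted_cost` and `unrooted_cost` and the `(u, v, w)` edge format are hypothetical, and `rooted_cost` is just a compact O(VE) contraction routine included so the block is self-contained.

```python
import math


def rooted_cost(n, root, edges):
    """Compact O(VE) Chu-Liu/Edmonds: minimum cost for a fixed root, or None."""
    edges = [(u, v, w) for u, v, w in edges if u != v and v != root]
    total = 0
    while True:
        best, parent = [math.inf] * n, [-1] * n
        for u, v, w in edges:
            if w < best[v]:
                best[v], parent[v] = w, u
        if any(parent[v] == -1 for v in range(n) if v != root):
            return None
        best[root] = 0
        total += sum(best)
        comp, seen, count = [-1] * n, [-1] * n, 0
        for v in range(n):
            x = v
            while x != root and seen[x] != v and comp[x] == -1:
                seen[x] = v
                x = parent[x]
            if x != root and comp[x] == -1:  # found a new cycle through x
                comp[x] = count
                y = parent[x]
                while y != x:
                    comp[y] = count
                    y = parent[y]
                count += 1
        if count == 0:
            return total
        for v in range(n):
            if comp[v] == -1:
                comp[v], count = count, count + 1
        edges = [(comp[u], comp[v], w - best[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        n, root = count, comp[root]


def unrooted_cost(n, edges):
    """Minimum cost over all possible roots, or None if no vertex reaches all."""
    s = sum(abs(w) for _, _, w in edges)
    big = 2 * s + 1      # dearer than any possible saving on real edges
    virtual = n          # index of the added virtual root
    extended = list(edges) + [(virtual, v, big) for v in range(n)]
    cost = rooted_cost(n + 1, virtual, extended)
    # Exactly one virtual edge leaves a real part within [-s, s];
    # two or more push the remainder above s.
    if cost is None or cost - big > s:
        return None
    return cost - big


print(unrooted_cost(3, [(0, 1, 1), (1, 2, 2), (2, 0, 3)]))
```

Since the post is after a maximum spanning tree, the same sketch could be used by negating every weight before the call and negating the result; the choice of `big` above is meant to tolerate negative weights.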

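So I picked up Java again after not touching it for two years... The plan is to first integrate the weaker AlgoWiki implementation into our group's code framework, then tinker with the C++ implementation of Tarjan's version, which optimizes the algorithm with F-heaps, and port it to Java.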

 

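AlgoWiki's pseudocode for the algorithm; this is the fixed-root version, so it needs some changes. The numbers in parentheses refer to the steps of the fixed-root algorithm above.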

Algorithm Overview

  • Remove all edges going into the root node (2)
  • For each node, select only the incoming edge with smallest weight (4.1)
  • For each circuit that is formed: (4.2)
    • edge "m" is the edge in this circuit with minimum weight
    • Combine all the circuit's nodes into one pseudo-node "k"  (4.4)
    • For each edge "e" entering a node in "k" in the original graph: (4.3) 
      • edge "n" is the edge currently entering this node in the circuit
      • track the minimum modified edge weight between each "e" based on the following:
        • modWeight = weight("e") - ( weight("n") - weight("m") )
    • On edge "e" with minimum modified weight, add edge "e" and remove edge "n"
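Because every node keeps at most one selected incoming edge, the circuits in the overview above can be found by following those edges backwards. The sketch below is one possible way to do that step, not AlgoWiki's actual getCycles(); the function name `find_cycles` and the `parent` dictionary format are assumptions for illustration.

```python
def find_cycles(parent):
    """Find all cycles among the selected incoming edges.

    parent: dict mapping each node to the source of its selected incoming
            edge. The root has no entry.
    Returns a list of cycles, each listed by following parent pointers
    (that is, against the edge direction).
    """
    visited_by = {}  # node -> start node of the walk that first reached it
    cycles = []
    for start in parent:
        path = []
        x = start
        # Walk backwards until we fall off the tree or meet a visited node.
        while x is not None and x not in visited_by:
            visited_by[x] = start
            path.append(x)
            x = parent.get(x)
        # Meeting a node from the *current* walk closes a cycle; meeting a
        # node from an earlier walk does not create a new one.
        if x is not None and visited_by[x] == start:
            cycles.append(path[path.index(x):])
    return cycles


# Hypothetical selection: nodes 1 and 2 pick each other, node 3 picks 2.
print(find_cycles({1: 2, 2: 1, 3: 2}))
# A selection that already forms a tree under root 0.
print(find_cycles({1: 0, 2: 1}))
```

Each node is visited once, so the scan is linear in the number of nodes apart from the `path.index` lookup, which is bounded by the length of the current walk.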

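With a C++ background, plus some help from Google, reading Java code is fairly painless. A few articles that helped:

[Java] The difference between final and C++ const

Using Comparator and Comparable for sorting

A worked example of the red-black-tree-based TreeMap class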

 

English

The Directed Minimum Spanning Tree Problem: a description of the algorithm, summarized by Shanchieh Jay Yang, May 2000.

http://en.wikipedia.org/wiki/Edmonds'_algorithm

http://en.vionto.com/show/me/Edmonds's+algorithm

Chinese

http://hi.baidu.com/zhanggmcn/item/aed6f75d0247e710aaf6d7e7

http://acm.nudt.edu.cn/~twcourse/Tree.html#a17

