《算法4》1.5 - Union-Find 算法解決動態連通性問題,Python實現


Union-Find 算法(中文稱並查集算法)是解決動態連通性(Dynamic Conectivity)問題的一種算法,作者以此為實例,講述了如何分析和改進算法,本節涉及三個算法實現,分別是Quick Find, Quick Union 和 Weighted Quick Union。

動態連通性(Dynamic Connectivity)

動態連通性是計算機圖論中的一種數據結構,動態維護圖結構中相連接的組信息。
簡單的說就是,圖中各個點之間是否相連、連接后組成了多少個組等信息。我們稱連接在一起就像形成了一個圈子似的,成為一個組(Component),每個組有其自己的一些特征,比如組內所有成員都有同一個標記等。

提到圈子,大家比較好理解,我們在社交網絡中,彼此熟悉的人之間組成自己的圈子,"熟悉"用計算機中的語言來表示就是“Connected 連通的。圈子是會變化的,今天你又新認識了某人,明天你跟某人友盡了,這種變化是動態的,所以有了動態連通性這種數據結構和問題。

比較常見的應用有,社交網絡中比如LinkedIn, 判斷某個用戶與其它用戶是否熟悉:如果你與用戶A熟悉,用戶A與用戶B熟悉,則認為你與用戶B也是連接的,你可以看到用戶B的信息。在計算機網絡中,也存在類似的情況,判斷網絡中某兩個節點是否相連。

Dynamic Connectivity 的計算機語言表述

給定一個整數對(p,q),如果p 和 q尚未連通,則使二者相連通。p 和 q 相連后,我們稱 p 和 q 在同一個組內。

當p 和 q 連通時,以下關系則成立:

  • 自反性:p和p自身是相連的
  • 對稱性:如果p和q相連,那么q和p也相連
  • 傳遞性:如果p和q相連,q和r相連,那么p和r也相連

在一個網絡中,會存在很多類似的整數對(p,q),假設網絡容量是 N,我們可以定義一個從 0 到 N-1的整數數組,p,q是其中的值,我們可能需要的操作有:

  • 判斷 p 和 q 是否相連
  • 如果未相連,則連接 p 和 q, 如果已相連,則可以不做啥
  • 查找 p 或 q 屬於哪個組中 (如圈子)

這里的一個關鍵是,如何確定 p 和 q 是在同一個組內。這意味着,每個組需要有一些特定的屬性,我們在后面的算法中會有考慮。

Union-Find 算法描述 Dynamic Connectivity

Union-Find 算法中,提供了對應的方法來實現我們前面提到的可能的操作:

  • connected(): 判斷 p 和 q 是否相連,這里要調用 find(p) 和 find(q),如果二者屬於同一個組,則認為是相連的,即isConnected()返回true.

  • union(): 如果未相連,則連接 p 和 q, 如果已相連,則可以不做啥

  • find(): 查找 p 或 q 屬於哪個組中 (如圈子),這里返回值是整數,作為組的標識符(component identifier)。

  • count(): 返回組的數量

算法4中的API:

class UF:
    def __init__(self,N):
    def union(self,p,q): # initialize N sites with integer names
    def find(self,p): #return component identifier for p
    def connected(self,p,q): #return true if p and q are in the same component
    def count(): #number of components

Union-Find 算法及實現

根據我們前面的描述,如果確定每個組的標識符似乎比較關鍵,只要確定了,就可以判斷是否相連。

那用什么來作為標識符,區分各個組呢?

最簡單的一個辦法是,所有的節點都賦予一個 ID,如果兩個節點相連,則將這兩個節點的 ID 設成一樣的,這樣,這兩個節點便屬於同一個組了。網絡中每個組都有了一個唯一的 ID。只要節點 p 和 q 的 ID 相同,則認為節點 p 和 q 相連。我們用數組來放置節點 ID,find()方法可以快速返回 ID,所以我們的第一個算法就叫做 QuickFind。

QuickFind 算法

QuickFind 算法中,find方法比較簡單,union(p,q)方法需要考慮的一點是,要將與p相連的所有節點 id 都設為q當前的 id,使p所在的組和q所在的組結合成了一個同一組。(注:也可以把與q相連的所有節點id都設為p的id)

最開始的時候,所有節點都互不相連。我們假設所有的節點由id=0到N-1的整數表示。

圖片.png

代碼:

# -*- coding: utf-8 -*-

class QuickFind(object):
    id=[]
    count=0
    
    def __init__(self,n):
        self.count = n
        i=0
        while i<n:
            self.id.append(i)
            i+=1
            
    def connected(self,p,q):
        return self.find(p) == self.find(q)
    
    def find(self,p):    
        return self.id[p]
    
    def union(self,p,q):
        idp = self.find(p)
        if not self.connected(p,q):
            for i in range(len(self.id)):
                if self.id[i]==idp: # 將p所在組內的所有節點的id都設為q的當前id
                    self.id[i] = self.id[q] 
            self.count -= 1 
                                

我們的測試端代碼如下:

# -*- coding: utf-8 -*-

import quickfind

qf = quickfind.QuickFind(10)

print "initial id list is %s" % (",").join(str(x) for x in qf.id)

list = [
        (4,3),
        (3,8),
        (6,5),
        (9,4),
        (2,1),
        (8,9),
        (5,0),
        (7,2),
        (6,1),
        (1,0),
        (6,7)
        ]

for k in list:
    p =  k[0]
    q =  k[1]
    qf.union(p,q)
    print "%d and %d is connected? %s" % (p,q,str(qf.connected(p,q)    ))
    
print "final id list is %s" % (",").join(str(x) for x in qf.id)
print "count of components is: %d" % qf.count

運行結果:

initial id list is 0,1,2,3,4,5,6,7,8,9
4 and 3 is connected? True
3 and 8 is connected? True
6 and 5 is connected? True
9 and 4 is connected? True
2 and 1 is connected? True
8 and 9 is connected? True
5 and 0 is connected? True
7 and 2 is connected? True
6 and 1 is connected? True
1 and 0 is connected? True
6 and 7 is connected? True
final id list is 1,1,1,8,8,1,1,1,8,8
count of components is: 2

下圖是算法4中的圖示,可供參考:

圖片.png

QuickFind 算法分析:

find方法快速返回數組的值,但union方法最壞情況下,幾乎需要遍歷整個數組,如果數組很大(比如社交網絡巨大) 、需要連接的節點對很多的時候,QuickFind算法的復雜度就相當大了。所以我們需要改進一下union方法。

QuickUnion 算法

前面的QuickFind算法中,union的時候可能需要遍歷整個數組,導致算法性能下降。有沒有什么辦法可以不用遍歷整個數組,又可以保證同一個組內的所有節點都有一個共同屬性呢?樹結構。樹的所有節點都有一個共同的根節點,每個樹只有一個根節點,那每個樹就可以代表一個組。union(p,q)的時候,只要把p所在的樹附加到q所在的樹的根節點,這樣,p和q就在同一樹中了。

改進后的算法即是QuickUnion算法。我們同樣要用到 id 數組,只是這里的 id 放的是節點所在樹的根節點。

find(p): 返回的是 p 所在樹的根節點
union(p,q): 將 p 所在樹的根節點的 id 設為 q 所在樹的根節點

圖片.png

代碼實現:

# -*- coding: utf-8 -*-

class QuickUnion(object):
    id=[]
    count=0
    
    def __init__(self,n):
        self.count = n
        i=0
        while i<n:
            self.id.append(i)
            i+=1

    def connected(self,p,q):
        if self.find(p) == self.find(q):
            return True
        else:            
            return False
    
    def find(self,p):   
        while (p != self.id[p]):
            p = self.id[p]
        return p
    
    def union(self,p,q):
        idq = self.find(q)
        idp = self.find(p)
        if not self.connected(p,q):
            self.id[idp]=idq
            self.count -=1            
                               

類似的測試端代碼:

# -*- coding: utf-8 -*-

import quickunion

qf = quickunion.QuickUnion(10)

print "initial id list is %s" % (",").join(str(x) for x in qf.id)

list = [
        (4,3),
        (3,8),
        (6,5),
        (9,4),
        (2,1),
        (8,9),
        (5,0),
        (7,2),
        (6,1),
        (1,0),
        (6,7)
        ]

for k in list:
    p =  k[0]
    q =  k[1]
    qf.union(p,q)
    print "%d and %d is connected? %s" % (p,q,str(qf.connected(p,q)    ))
    
print "final root list is %s" % (",").join(str(x) for x in qf.id)
print "count of components is: %d" % qf.count

運行結果:

initial id list is 0,1,2,3,4,5,6,7,8,9
4 and 3 is connected? True
3 and 8 is connected? True
6 and 5 is connected? True
9 and 4 is connected? True
2 and 1 is connected? True
8 and 9 is connected? True
5 and 0 is connected? True
7 and 2 is connected? True
6 and 1 is connected? True
1 and 0 is connected? True
6 and 7 is connected? True
final root list is 1,1,1,8,3,0,5,1,8,8
count of components is: 2

算法4中的圖示供參考理解:

圖片.png

QuickUnion 算法分析:

union方法已經很快速了現在,find方法比QuickFind慢了,其最壞的情況下,如下圖,一次find需要訪問1+..+N次數組,union方法中需要調用兩次find方法,即復雜度變成2(1+...+N)=(N+1)N,接近N的平方了。

圖片.png

Weighted Quick Union 算法

前面的QuickUnion算法中,union的時候只是簡單的將兩個樹合並起來,並沒有考慮兩個樹的大小,所以導致最壞情況的發生。改進的方法可以是,在union之前,先判斷兩個樹的大小(節點數量),將小點的樹附加到大點的樹上,這樣,合並后的樹的深度不會變得非常大。

示例如下:

圖片.png

要判斷樹的大小,需要引進一個新的數組,size 數組,存放樹的大小。初始化的時候 size 各元素都設為 1。

代碼:

# -*- coding: utf-8 -*-

class WeightedQuickUnion(object):
    id=[]
    count=0
    sz=[]
    
    def __init__(self,n):
        self.count = n
        i=0
        while i<n:
            self.id.append(i)
            self.sz.append(1) # inital size of each tree is 1
            i+=1

    def connected(self,p,q):
        if self.find(p) == self.find(q):
            return True
        else:            
            return False
    
    def find(self,p):   
        while (p != self.id[p]):
            p = self.id[p]
        return p
    
    def union(self,p,q):
        idp = self.find(p)
        print "id of %d is: %d" % (p,idp)
        idq = self.find(q)
        print "id of %d is: %d" % (q,idq)
        if not self.connected(p,q):            
            print "Before Connected: tree size of %d's id is: %d" % (p,self.sz[idp])
            print "Before Connected: tree size of %d's id is: %d" % (q,self.sz[idq])
            if (self.sz[idp] < self.sz[idq]):
                print "tree size of %d's id is smaller than %d's id" %(p,q)
                print "id of %d's id (%d) is set to %d" % (p,idp,idq)
                self.id[idp] = idq
                         
                print "tree size of %d's id is incremented by tree size of %d's id" %(q,p)
                self.sz[idq] += self.sz[idp]    
                print "After Connected: tree size of %d's id is: %d" % (p,self.sz[idp])
                print "After Connected: tree size of %d's id is: %d" % (q,self.sz[idq])         
            else:                  
                print "tree size of %d's id is larger than or equal with %d's id" %(p,q)
                print "id of %d's id (%d) is set to %d" % (q,idq,idp)
                self.id[idq] = idp
                print "tree size of %d's id is incremented by tree size of %d's id" %(p,q)
                self.sz[idp] += self.sz[idq]   
                print "After Connected: tree size of %d's id is: %d" % (p,self.sz[idp])
                print "After Connected: tree size of %d's id is: %d" % (q,self.sz[idq])         
        
            self.count -=1    

測試端代碼:

# -*- coding: utf-8 -*-

import weightedquickunion

qf = weightedquickunion.WeightedQuickUnion(10)

print "initial id list is %s" % (",").join(str(x) for x in qf.id)

list = [
        (4,3),
        (3,8),
        (6,5),
        (9,4),
        (2,1),
        (8,9),
        (5,0),
        (7,2),
        (6,1),
        (1,0),
        (6,7)
        ]

for k in list:
    p =  k[0]
    q =  k[1]
    print "." * 10 + "unioning %d and %d"  % (p,q)  + "." * 10
    qf.union(p,q)
    print "%d and %d is connected? %s" % (p,q,str(qf.connected(p,q)    ))
    
print "final id list is %s" % (",").join(str(x) for x in qf.id)
print "count of components is: %d" % qf.count

代碼運行結果:

initial id list is 0,1,2,3,4,5,6,7,8,9
..........unioning 4 and 3..........
id of 4 is: 4
id of 3 is: 3
Before Connected: tree size of 4's id is: 1
Before Connected: tree size of 3's id is: 1
tree size of 4's id is larger than or equal with 3's id
id of 3's id (3) is set to 4
tree size of 4's id is incremented by tree size of 3's id
After Connected: tree size of 4's id is: 2
After Connected: tree size of 3's id is: 1
4 and 3 is connected? True
..........unioning 3 and 8..........
id of 3 is: 4
id of 8 is: 8
Before Connected: tree size of 3's id is: 2
Before Connected: tree size of 8's id is: 1
tree size of 3's id is larger than or equal with 8's id
id of 8's id (8) is set to 4
tree size of 3's id is incremented by tree size of 8's id
After Connected: tree size of 3's id is: 3
After Connected: tree size of 8's id is: 1
3 and 8 is connected? True
..........unioning 6 and 5..........
id of 6 is: 6
id of 5 is: 5
Before Connected: tree size of 6's id is: 1
Before Connected: tree size of 5's id is: 1
tree size of 6's id is larger than or equal with 5's id
id of 5's id (5) is set to 6
tree size of 6's id is incremented by tree size of 5's id
After Connected: tree size of 6's id is: 2
After Connected: tree size of 5's id is: 1
6 and 5 is connected? True
..........unioning 9 and 4..........
id of 9 is: 9
id of 4 is: 4
Before Connected: tree size of 9's id is: 1
Before Connected: tree size of 4's id is: 3
tree size of 9's id is smaller than 4's id
id of 9's id (9) is set to 4
tree size of 4's id is incremented by tree size of 9's id
After Connected: tree size of 9's id is: 1
After Connected: tree size of 4's id is: 4
9 and 4 is connected? True
..........unioning 2 and 1..........
id of 2 is: 2
id of 1 is: 1
Before Connected: tree size of 2's id is: 1
Before Connected: tree size of 1's id is: 1
tree size of 2's id is larger than or equal with 1's id
id of 1's id (1) is set to 2
tree size of 2's id is incremented by tree size of 1's id
After Connected: tree size of 2's id is: 2
After Connected: tree size of 1's id is: 1
2 and 1 is connected? True
..........unioning 8 and 9..........
id of 8 is: 4
id of 9 is: 4
8 and 9 is connected? True
..........unioning 5 and 0..........
id of 5 is: 6
id of 0 is: 0
Before Connected: tree size of 5's id is: 2
Before Connected: tree size of 0's id is: 1
tree size of 5's id is larger than or equal with 0's id
id of 0's id (0) is set to 6
tree size of 5's id is incremented by tree size of 0's id
After Connected: tree size of 5's id is: 3
After Connected: tree size of 0's id is: 1
5 and 0 is connected? True
..........unioning 7 and 2..........
id of 7 is: 7
id of 2 is: 2
Before Connected: tree size of 7's id is: 1
Before Connected: tree size of 2's id is: 2
tree size of 7's id is smaller than 2's id
id of 7's id (7) is set to 2
tree size of 2's id is incremented by tree size of 7's id
After Connected: tree size of 7's id is: 1
After Connected: tree size of 2's id is: 3
7 and 2 is connected? True
..........unioning 6 and 1..........
id of 6 is: 6
id of 1 is: 2
Before Connected: tree size of 6's id is: 3
Before Connected: tree size of 1's id is: 3
tree size of 6's id is larger than or equal with 1's id
id of 1's id (2) is set to 6
tree size of 6's id is incremented by tree size of 1's id
After Connected: tree size of 6's id is: 6
After Connected: tree size of 1's id is: 3
6 and 1 is connected? True
..........unioning 1 and 0..........
id of 1 is: 6
id of 0 is: 6
1 and 0 is connected? True
..........unioning 6 and 7..........
id of 6 is: 6
id of 7 is: 6
6 and 7 is connected? True
final id list is 6,2,6,4,4,6,6,2,4,4
count of components is: 2

算法4中的圖示:

圖片.png


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM