Implementing a fast priority queue with a d-ary heap (C#)

Reposted article, dated 2018-08-22 13:56. Category: Game

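Introduction to the d-ary heap:

A d-ary heap generalizes the binary heap (d = 2): each non-leaf node has at most d children.

A d-ary heap has the following properties:

Like a complete binary tree, every level except the last is completely filled, and new nodes are added from left to right.
Like a binary heap, it comes in two kinds: max-heap and min-heap.

Below is an example of a 3-ary heap: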

3-ary max heap - root node is maximum of all nodes
             10
       /      |     \
      7       9      8
  /  |  \    /
 4   6   5  7


3-ary min heap - root node is minimum of all nodes
             10
         /    |    \
       12     11    13
     / | \
    14 15 18

The height of a complete d-ary tree with n nodes is approximately log_d n.
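To make "approximately log_d n" precise, the following short derivation gives the exact height, measured in edges from the root to the deepest leaf. The symbol h is introduced here for illustration and does not appear in the original article.

```latex
% A complete d-ary tree whose levels 0..h are all full contains
\sum_{k=0}^{h} d^{k} = \frac{d^{h+1}-1}{d-1} \text{ nodes.}

% A heap with n nodes and height h therefore satisfies
\frac{d^{h}-1}{d-1} < n \le \frac{d^{h+1}-1}{d-1}
\quad\Longrightarrow\quad
h = \bigl\lceil \log_d\bigl(n(d-1)+1\bigr) \bigr\rceil - 1 = \Theta(\log_d n).
```

For the 3-ary example above with n = 8, this gives h = ⌈log_3 17⌉ - 1 = 2, which matches the two levels below the root in the diagram.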

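Applications of the d-ary heap:

The d-ary heap is commonly used as the basis of a priority queue. Compared with a binary heap, such a queue is faster at adding elements: O(log₂ n) for the binary heap versus O(log_d n) for the d-ary heap, and log_d n < log₂ n whenever d > 2. The drawback is that extracting the first element costs more: O(log₂ n) for the binary heap versus O((d-1) log_d n) for the d-ary heap, and (d-1) log_d n > log₂ n whenever d > 2, which follows from the change-of-base formula.

The result looks mixed, so when is a d-ary heap a particularly good fit? The answer is the pathfinding algorithms common in games, such as A* and Dijkstra's algorithm. Both usually need a priority queue (some A* variants, such as iterative deepening A*, do not use one), and each time they take the first element off the queue they typically add several neighbouring nodes to it. Nodes are therefore added far more often than they are extracted, which plays to the d-ary heap's strength and limits the impact of its weakness.

In addition, a d-ary heap is more cache-friendly than a binary heap because more sibling nodes sit next to each other in memory, so it often runs faster in practice.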
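The change-of-base argument can be written out explicitly. The worked case d = 4 below is chosen only for illustration.

```latex
\begin{align*}
\log_d n &= \frac{\log_2 n}{\log_2 d},
&
(d-1)\log_d n &= \frac{d-1}{\log_2 d}\,\log_2 n \\[4pt]
% Worked case d = 4, where \log_2 d = 2
\log_4 n &= \tfrac{1}{2}\log_2 n,
&
3\log_4 n &= \tfrac{3}{2}\log_2 n
\end{align*}
```

For n = 2^20, an insertion climbs at most 10 levels instead of 20, while an extraction costs roughly 30 comparisons instead of 20, using the article's count of d - 1 comparisons per level.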

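Implementing the d-ary heap and the priority queue:

We store the d-ary heap in an array indexed from 0, which gives the following rules:

For a non-root node with index i, the index of its parent is (i-1)/d (integer division);
For a node with index i, the indices of its children are (d*i)+1, (d*i)+2, ..., (d*i)+d;
For a heap of size n (n ≥ 2), the index of the last non-leaf node is (n-2)/d, that is, the parent of the last item at index n-1. (Note: the implementation in this article does not use this rule.)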
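The index rules above can be sketched as a few helper functions. The class and method names (`DaryIndex`, `Parent`, `Child`, `LastNonLeaf`) are hypothetical and are not part of the article's implementation. The last non-leaf index is computed as the parent of the last item, (n-2)/d.

```csharp
using System;

// Hypothetical helper (not part of the article's code): index arithmetic
// for a d-ary heap stored in a 0-based array.
public static class DaryIndex
{
    // Parent of node i; only meaningful for i > 0.
    public static int Parent(int i, int d) => (i - 1) / d;

    // k-th child of node i, with k in the range 1..d.
    public static int Child(int i, int k, int d) => d * i + k;

    // Last non-leaf node of a heap with n >= 2 items:
    // the parent of the last item, which sits at index n - 1.
    public static int LastNonLeaf(int n, int d) => (n - 2) / d;
}
```

In the 3-ary max heap shown earlier (8 nodes), the final leaf sits at index 7; `DaryIndex.Parent(7, 3)` and `DaryIndex.LastNonLeaf(8, 3)` both give 2, the node holding 9.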

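Building the d-ary heap: the implementation in this article is aimed at building a priority queue and uses a min-heap, which suits pathfinding algorithms. Heapifying an input array is therefore not a core operation. To keep the code easy to write and read, the build algorithm works from the root downward, sifting items up one at a time, rather than from the last non-leaf node upward. The advantage is clear code: no recursion and no chain of if/else statements to find the smallest child. Whenever a node is smaller than its parent, the two are swapped. The drawback is equally clear: more swaps, and therefore lower efficiency.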

public void BuildHeap()
{
    // Treat heap[0..i-1] as a valid heap and sift item i up into it.
    for (int i = 1; i < numberOfItems; i++)
    {
        int bubbleIndex = i;
        var node = heap[i];

        while (bubbleIndex != 0)
        {
            int parentIndex = (bubbleIndex - 1) / D;

            if (node.CompareTo(heap[parentIndex]) < 0)
            {
                // The node is smaller than its parent: swap them and keep climbing.
                heap[bubbleIndex] = heap[parentIndex];
                heap[parentIndex] = node;
                bubbleIndex = parentIndex;
            }
            else
            {
                break;
            }
        }
    }
}
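For comparison, here is a minimal sketch of the alternative the article mentions but does not implement: building the heap from the last non-leaf node upward. This is not the author's code. The class name `BottomUpHeapify`, its method signatures, and passing `d` as a parameter are all assumptions made for illustration.

```csharp
using System;

// Hypothetical helper (not the article's code): bottom-up construction
// of a d-ary min-heap in place.
public static class BottomUpHeapify
{
    public static void Build<T>(T[] heap, int count, int d) where T : IComparable<T>
    {
        if (count < 2) return;

        // Start at the last non-leaf node (the parent of the last item)
        // and walk back to the root, sifting each node down.
        for (int i = (count - 2) / d; i >= 0; i--)
        {
            SiftDown(heap, count, d, i);
        }
    }

    private static void SiftDown<T>(T[] heap, int count, int d, int index) where T : IComparable<T>
    {
        T item = heap[index];
        while (true)
        {
            int firstChild = d * index + 1;
            if (firstChild >= count) break;                    // index is a leaf
            int lastChild = Math.Min(firstChild + d - 1, count - 1);

            // Find the smallest child.
            int smallest = firstChild;
            for (int c = firstChild + 1; c <= lastChild; c++)
            {
                if (heap[c].CompareTo(heap[smallest]) < 0) smallest = c;
            }

            if (heap[smallest].CompareTo(item) >= 0) break;    // heap property holds
            heap[index] = heap[smallest];                      // move the child up into the hole
            index = smallest;
        }
        heap[index] = item;                                    // store the item once, at its final position
    }
}
```

The sift-down step needs the extra loop to find the smallest child, which is the code the article's top-down version avoids, in exchange for fewer moves overall.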

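Push: adds a new element to the priority queue. If node is null, an exception is thrown; if the array is full, it is expanded. Finally, the internal function DecreaseKey places the new node in the d-ary heap.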

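public void Push(T node)
{
    if (node == null) throw new System.ArgumentNullException("node");

    // Grow the backing array when it is full.
    if (numberOfItems == heap.Length)
    {
        Expand();
    }

    // Note: because of the ushort cast, indices above 65,535 cannot be represented.
    DecreaseKey(node, (ushort)numberOfItems);
    numberOfItems++;
}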

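DecreaseKey: when called from Push, index is the number of elements currently in the queue, that is, the first free slot. The function is private because the priority queue does not need to expose this interface. It uses an optimization: the new node is not written into the array until its final position is known, which avoids unnecessary swaps.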

private void DecreaseKey(T node, ushort index)
{
    // When updating an existing slot, the new key must not be larger than the old one.
    if (index < numberOfItems)
    {
        if (node.CompareTo(heap[index]) > 0)
        {
            throw new System.Exception("New node key greater than original key");
        }
    }

    int bubbleIndex = index;

    while (bubbleIndex != 0)
    {
        // Parent node of the bubble node
        int parentIndex = (bubbleIndex - 1) / D;

        if (node.CompareTo(heap[parentIndex]) < 0)
        {
            // Swap the bubble node and parent node
            // (we don't really need to store the bubble node until we know the final index though
            // so we do that after the loop instead)
            heap[bubbleIndex] = heap[parentIndex];
            bubbleIndex = parentIndex;
        }
        else
        {
            break;
        }
    }

    heap[bubbleIndex] = node;
}

Pop: removes and returns the top element of the priority queue by calling the internal function ExtractMin.

public T Pop () 
{
     return ExtractMin();
}

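ExtractMin: returns the current root node, updates numberOfItems and restores the heap property. The last leaf is conceptually moved to the root and then sinks down according to the heap rule. The same optimization is used here: the last leaf is not written to index 0 but is stored only once its final position is known, which saves swaps.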

private T ExtractMin()
{
    // An empty queue has nothing to extract.
    if (numberOfItems == 0) throw new System.InvalidOperationException("The queue is empty");

    T returnItem = heap[0];

    numberOfItems--;
    if (numberOfItems == 0) return returnItem;

    // Last item in the heap array; conceptually it moves to the root and sinks down
    var swapItem = heap[numberOfItems];

    int swapIndex = 0, parent;

    while (true)
    {
        parent = swapIndex;
        var curSwapItem = swapItem;
        int pd = parent * D + 1;   // index of the first child

        // Find the smallest child that is also smaller than swapItem
        for (int i = 0; i < D; i++)
        {
            if (pd + i < numberOfItems && heap[pd + i].CompareTo(curSwapItem) < 0)
            {
                curSwapItem = heap[pd + i];
                swapIndex = pd + i;
            }
        }

        // If one of the parent's children is smaller, move it up
        // (actually we are just pretending we swapped them: swapItem is held
        // in a local variable and only assigned once we know the final index)
        if (parent != swapIndex)
        {
            heap[parent] = heap[swapIndex];
        }
        else
        {
            break;
        }
    }

    // Assign element to the final position
    heap[swapIndex] = swapItem;

    // For debugging
    Validate();

    return returnItem;
}
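The body of Validate is not shown in the article. As a rough idea of what such a debugging check might do, the sketch below tests the min-heap property over the first `count` items. The name `HeapCheck.IsMinHeap` and its signature are hypothetical and should not be read as the author's implementation.

```csharp
using System;

// Hypothetical stand-in for a heap validation check; the article's
// Validate() is not shown, so this is only an assumption about its intent.
public static class HeapCheck
{
    // Returns true if the first `count` items of `heap` satisfy the
    // d-ary min-heap property: no item is smaller than its parent.
    public static bool IsMinHeap<T>(T[] heap, int count, int d) where T : IComparable<T>
    {
        for (int i = 1; i < count; i++)
        {
            int parent = (i - 1) / d;
            if (heap[i].CompareTo(heap[parent]) < 0) return false;
        }
        return true;
    }
}
```

Applied to the 3-ary min heap from the introduction, stored as { 10, 12, 11, 13, 14, 15, 18 }, the check passes; applied to the first four items of the max heap, { 10, 7, 9, 8 }, it fails because 7 is smaller than its parent 10.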

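Time complexity analysis:

For a priority queue backed by a d-ary heap with n elements, the heap height is at most about log_d n, so adding a new element takes O(log_d n).
For the same queue, removing the first element has to pick the smallest (or largest) of up to d children at each level while the moved node sinks down level by level, so removal takes O((d-1) log_d n).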
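Converting an array into a d-ary heap from the last non-leaf node upward takes O(n); the analysis is the same as for a binary heap. For example, in a 4-ary heap with n nodes, counting a single leaf as a subtree of height 1, there are about (3/4)n subtrees of height 1, (3/16)n subtrees of height 2, and so on. Processing a subtree of height 1 costs on the order of 1 step, a subtree of height 2 on the order of 2 steps, and so on. The total is $\sum_{k=1}^{\log_4 n} \frac{3}{4^{k}} n k$. By the ratio test, the infinite series $\sum_{k} \frac{k}{4^{k}}$ converges, so the complexity remains O(n). What about the top-down approach used in this article? The answer is $O(n \log_d n)$ (see the next item for the derivation), which in theory is higher than the bottom-up approach. How much the two differ in practice would have to be measured.
For the BuildHeap in this article, a node at level i needs at most i comparisons and swaps, and level i holds $d^{i}$ nodes, so the total cost is $\sum_{i=1}^{\log_d n} i\,d^{i}$; for d = 4 this is $\sum_{i=1}^{\log_4 n} i\,4^{i}$. We need the partial sum $S_i$ as a function of i. We have $S_i = 1\cdot 4 + 2\cdot 4^{2} + 3\cdot 4^{3} + \dots + i\cdot 4^{i}$, so $4S_i = 1\cdot 4^{2} + 2\cdot 4^{3} + \dots + i\cdot 4^{i+1}$. Subtracting the first from the second term by term gives $3S_i = i\cdot 4^{i+1} - (4 + 4^{2} + \dots + 4^{i})$, where the bracket is conveniently a geometric series. The whole expression becomes $S_i = \frac{4}{9} + \frac{1}{3}\left(i - \frac{1}{3}\right)4^{i+1}$. Substituting $i = \log_4 n$ gives $4^{i+1} = 4n$, hence $O(n \log_4 n)$, and in general $O(n \log_d n)$ for fixed d.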
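The same shifted-subtraction argument works for any d. The general closed form below is added as a cross-check of the d = 4 case and is not part of the original article.

```latex
\begin{align*}
S_i &= \sum_{k=1}^{i} k\,d^{k}
     = \frac{d}{(d-1)^{2}}\Bigl[\bigl(i(d-1)-1\bigr)\,d^{i} + 1\Bigr] \\[4pt]
\text{with } d^{i} = n:\quad
S_i &= \frac{d}{(d-1)^{2}}\Bigl[\bigl((d-1)\log_d n - 1\bigr)\,n + 1\Bigr]
     \approx \frac{d}{d-1}\,n\log_d n
\end{align*}
```

Setting d = 4 gives $\frac{4}{9}\left[(3i-1)4^{i}+1\right]$, which expands to the same $\frac{4}{9} + \frac{1}{3}\left(i-\frac{1}{3}\right)4^{i+1}$. For i = 2 both forms evaluate to 36 = 1·4 + 2·16.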

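Summary:

Repeated tests with System.Diagnostics.Stopwatch showed that both Push and Pop perform well at d = 4. Push was faster than with d = 2 in many cases, so the improvement for Push appears to be real. For Pop the results went both ways, so it is unclear whether it got better or worse. After all, timing with System.Diagnostics.Stopwatch is not a precise benchmark, and the measurements include noise from the .NET runtime.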

Appendix:

Complete priority queue program

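Q&A:

My pathfinding algorithm uses the PriorityQueue that ships with the C++ or Java standard library. Neither provides a DecreaseKey function, so I cannot update the key of an element already in the queue and cannot perform edge relaxation. What should I do?

DecreaseKey is private in this article too and is not exposed to users of the priority queue. Why not? Even if it were exposed, how would the pathfinding algorithm supply the index that DecreaseKey needs? We know the element to update is in the queue, but not at which index. Finding the index requires either a search or an auxiliary data structure. An auxiliary structure costs extra memory, and a search costs extra time, especially when the queue holds many elements. The trick is not to change the element at all. Instead, add a new copy of the node, with its new and better cost, to the priority queue. Because its cost is lower, the new copy is extracted before the original and is processed first. Duplicates encountered later are simply ignored, and in many cases the path has already been found before any duplicate is reached. The only overhead is some redundant objects in the priority queue, which is very small, and the approach is simple to implement.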
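The "add a new copy and skip stale duplicates" approach can be sketched with Dijkstra's algorithm. The example below uses .NET's built-in `System.Collections.Generic.PriorityQueue<TElement, TPriority>` (available from .NET 6) rather than the article's own queue. The graph representation, the class name `LazyDijkstra`, and the method name are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: Dijkstra with duplicate insertion instead of DecreaseKey.
// The graph representation and all names here are assumptions for illustration.
public static class LazyDijkstra
{
    // graph[u] lists (neighbour, edge cost) pairs; costs must be non-negative.
    public static int[] ShortestDistances(List<(int To, int Cost)>[] graph, int source)
    {
        var dist = new int[graph.Length];
        Array.Fill(dist, int.MaxValue);
        dist[source] = 0;

        var done = new bool[graph.Length];
        var queue = new PriorityQueue<int, int>();   // element = node, priority = cost
        queue.Enqueue(source, 0);

        while (queue.TryDequeue(out int u, out int cost))
        {
            if (done[u]) continue;   // stale duplicate: a cheaper copy was already processed
            done[u] = true;

            foreach (var (v, w) in graph[u])
            {
                int candidate = cost + w;
                if (candidate < dist[v])
                {
                    dist[v] = candidate;
                    queue.Enqueue(v, candidate);   // push a new copy instead of updating the old one
                }
            }
        }
        return dist;
    }
}
```

The `done` array is the only extra state needed to discard duplicates. A pathfinder with a single target could also stop as soon as the target node is dequeued, which is the early exit the answer above alludes to.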

References:

https://www.geeksforgeeks.org/k-ary-heap/

http://en.wikipedia.org/wiki/Binary_heap

https://en.wikipedia.org/wiki/D-ary_heap

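Discussion, criticism and corrections are welcome in the comments.

Original article; please credit the source when reposting. Thank you.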
