數據結構--堆的實現之深入分析

本文轉載自查看原文 2016-05-07 16:42 8859 堆/ 數據結構

一，介紹

以前在學習堆時，寫了兩篇文章：數據結構--堆的實現(上) 和數據結構--堆的實現(下)，感覺對堆的認識還是不夠。本文主要分析數據結構堆(討論小頂堆)的基本操作的一些細節，比如 insert(插入)操作和 deleteMin(刪除堆頂元素)操作的實現細節、分析建堆的時間復雜度、堆的優缺點及二叉堆的不足。

二，堆的實現分析

堆的物理存儲結構是一維數組，邏輯存儲結構是完全二叉樹。堆的基本操作有：insert--向堆中插入一個元素；deleteMin--刪除堆頂元素

故堆的類結構如下：

public class BinaryHeap<T extends Comparable<? super T>> {
    
    private T[] array;
    private int currentSize;
    
    public BinaryHeap() {
        
    }
    
    public BinaryHeap(T[] array){
        
    }
    
    public void insert(T x){
        //do something
    }
    public T deleteMin(){
        
    }
    
    //other operations....
}

①insert操作

1     public void insert(T x){
2         if(currentSize == array.length - 1)//數組0號位置作為哨兵
3             enlarge(array.length * 2 + 1);0
4         
5         int hole = currentSize++;
6         for(array[0] = x; x.compareTo(array[hole / 2]) < 0; hole /= 2)
7             array[hole] = array[hole / 2];//將父節點往下移
8         array[hole] = x;//將待插入的元素放置到合適位置
9     }

1）數組0號元素作為哨兵，可以避免交換操作。

因為，在與父節點的比較過程中，若父節點比待插入的節點大(子節點)，則需要交換父節點和待插入節點。而引入哨兵，將待插入節點保存在數組0號元素處，當父節點比待插入的節點大時，直接用父節點替換待插入的節點大(子節點)。

2）復雜度分析

可以看出，最壞情況下，比較進行到根節點時會結束。因此，insert操作時間取決於樹的高度。故復雜度為O(logN)。但是在平均情況下，insert操作只需要O(1)時間就能完成，因為畢竟並不是所有的節點都會被調度至根結點，只有在待插入的節點的權值最小時才會向上調整堆頂。

此外，對於二叉樹，求解父節點操作： hole = hole / 2，除以2可以使用右移一位來實現。

因此，可以看出d叉樹(完成二叉樹 d=2 )，當 d 很大時，樹的高度就很小，插入的性能會有一定的提高。為什么說是一定的？？后面會詳細分析。

②deleteMin操作

deleteMin操作將堆中最后一個元素替換第一個元素，然后在第一個元素處向下進行堆調整。

 1     public AnyType deleteMin( )
 2     {
 3         if( isEmpty( ) )
 4             throw new UnderflowException( );
 5 
 6         AnyType minItem = findMin( );
 7         array[ 1 ] = array[ currentSize-- ];//最后一個元素替換堆頂元素
 8         percolateDown( 1 );//向下執行堆調整
 9 
10         return minItem;
11     }

 1     /**
 2      * Internal method to percolate down in the heap.
 3      * @param hole the index at which the percolate begins.
 4      */
 5     private void percolateDown( int hole )
 6     {
 7         int child;
 8         AnyType tmp = array[ hole ];
 9 
10         for( ; hole * 2 <= currentSize; hole = child )
11         {
12             child = hole * 2;
13             if( child != currentSize &&
14                     array[ child + 1 ].compareTo( array[ child ] ) < 0 )
15                 child++;
16             if( array[ child ].compareTo( tmp ) < 0 )
17                 array[ hole ] = array[ child ];
18             else
19                 break;
20         }
21         array[ hole ] = tmp;
22     }

當從第一個元素(堆頂元素)處向下進行堆調整時，一般該元素會被調整至葉子結點。堆頂元素的高度為樹的高度。故時間復雜度為：O(logN)。

③其他一些操作

1）decreaseKey(p,Δ)/increaseKey(p,Δ)---更改位置p處元素的權值

這兩個操作一般不常用。它們會破壞堆的性質。因此，當修改了p處元素的權值時，需要進行堆調整(decreseKey為向上調整，increaseKey為向下調整)

2）delete(p)--刪除堆中位置為p處的元素

前面介紹的deleteMin操作刪除的是堆頂元素，那如何刪除堆中的任一一個元素？

其實，可以將刪除堆中任一一個元素（該元素位置為 p）轉換成刪除堆頂元素。

借助 1）中的修改位置p處元素的權值操作：decrese(p,Δ)。將p處元素的權值降為負無窮大。此時，該元素會向上調整至堆頂，然后執行deleteMin即可。

三，建堆（buildHeap）

從最后一個非葉子結點開始向前進行向下調整。

1     /**
2      * Establish heap order property from an arbitrary
3      * arrangement of items. Runs in linear time.
4      */
5     private void buildHeap( )
6     {
7         for( int i = currentSize / 2; i > 0; i-- )
8             percolateDown( i );
9     }

i 的初始值為最后一個非葉子結點的位置。

時間復雜度分析：

建堆的時間復雜度與堆中所有的結點的高度相同。

分析如下：首先，葉子結點的高度為0。而建堆，就是從最后一個非葉子結點開始，不斷調用percolateDown(i)，而percolateDown(i)方法的時間復雜度就是位置 i 處節點的高度。在上面第7行for循環中，當 i 自減為1時，表明已經到了堆頂元素，因此整個buildHeap的時間復雜度就是所有非葉子結點的高度之和。而葉子結點的高度為0，故buildHeap的時間復雜度可理解成整個二叉堆的所有的結點的高度之和。

而對於理想二叉堆而言:(二叉堆是一顆完全二叉樹，理想二叉堆為滿二叉樹)

所有結點的高度之為：2^(h+1)-1-(h+1)。其中，h表示二叉堆的高度

又可以表示成：N-b(N)，N是堆中結點的個數，b(N)是N的二進制表示法中1的個數，如：b(7)=3

另，JDK8類庫：java.util.concurrent.PriorityBlockingQueue就是基於數組實現的優先級隊列。（The implementation uses an array-based binary heap....）

四，d 堆

上面分析了二叉堆的基本操作。那什么是 d 堆呢？為什么要有 d 堆呢？

對於二叉堆，d=2。顧名思義，d堆就是所有節點都有d個兒子的堆。為什么需要這種堆？

分析二叉堆的基本操作，insert操作需要定位父結點，這需要一個除法操作，操作的次數與樹的高度有關。deleteMin操作需要找出所有兒子中權值最小的那個兒子，而尋找兒子節點則需要乘法操作，操作的復雜度與兒子的個數有關(d越大，節點的兒子數越多，查找越慢)。

假設，我們的需求是有大量的insert操作，而僅有少量的deleteMin，那d堆從理論上講就有性能優勢了。因為d 遠大於2時，樹的高度很小啊，但是當d不是2的倍數時，除法操作不能通過移位來實現，也許會有一定的性能損失，這也是為什么insert操作分析中講的“插入性能會有一定的提高”。

而如果有大量的deleteMin操作，那d堆反而可能會除低性能，因為：d 越大，說明節點的兒子個數越多，找出權值最小的兒子就需要更多的比較次數了。

可見，d堆的提出，是因為需求不同而導致的。比如，insert屬於高頻需求.....

五，二叉堆的不足

根據上面的分析，二叉堆的insert復雜度O(logN)，deleteMin最壞也是O(logN)。

但是如果需要查找堆中某個元素呢？或者需要合並兩個堆呢？

對於二叉堆而言，對find 和 merge操作的支持不夠。這是由二叉堆的存儲結構決定的，因為二叉堆中的元素實際存儲在數組中。正因為如此，所有支持有效合並的高級數據結構都需要使用鏈式數據結構。另外，關於數據結構的合並操作，可參考：數據結構--並查集的原理及實現

六，其他形式的“堆”

為了克服二叉堆的不足，提出了一面一些類型的堆，它們主要是為了支持merge 和 find 操作。這就不詳細介紹了。

①左式堆

對堆的結構有一定的要求：它有一個“零路徑長”的概念，①任意一個節點的零路徑長比它的各個兒子的零路徑長的最小值大1。②對於堆中每一個節點，它的左兒子的零路徑長至少與右兒子的零路徑長相等。

②斜堆

對堆的結構沒有要求。

③二項隊列

最大的特點就是，做到了merge操作時間復雜度為O(logN)，而insert操作的平均時間復雜度為O(1)。

關於二項隊列，可參考：數據結構--二項隊列分析及實現

參考的BinaryHeap的完整實現如下：

package c9.shortestPath;
// BinaryHeap class
//
// CONSTRUCTION: with optional capacity (that defaults to 100)
//               or an array containing initial items
//
// ******************PUBLIC OPERATIONS*********************
// void insert( x )       --> Insert x
// Comparable deleteMin( )--> Return and remove smallest item
// Comparable findMin( )  --> Return smallest item
// boolean isEmpty( )     --> Return true if empty; else false
// void makeEmpty( )      --> Remove all items
// ******************ERRORS********************************
// Throws RuntimeExceptionException as appropriate

/**
 * Implements a binary heap.
 * Note that all "matching" is based on the compareTo method.
 * @author Mark Allen Weiss
 */
public class BinaryHeap<AnyType extends Comparable<? super AnyType>>
{
    /**
     * Construct the binary heap.
     */
    public BinaryHeap( )
    {
        this( DEFAULT_CAPACITY );
    }

    /**
     * Construct the binary heap.
     * @param capacity the capacity of the binary heap.
     */
    public BinaryHeap( int capacity )
    {
        currentSize = 0;
        array = (AnyType[]) new Comparable[ capacity + 1 ];
    }
    
    /**
     * Construct the binary heap given an array of items.
     */
    public BinaryHeap( AnyType [ ] items )
    {
            currentSize = items.length;
            array = (AnyType[]) new Comparable[ ( currentSize + 2 ) * 11 / 10 ];

            int i = 1;
            for( AnyType item : items )
                array[ i++ ] = item;
            buildHeap( );
    }

    /**
     * Insert into the priority queue, maintaining heap order.
     * Duplicates are allowed.
     * @param x the item to insert.
     */
    public void insert( AnyType x )
    {
        if( currentSize == array.length - 1 )
            enlargeArray( array.length * 2 + 1 );

            // Percolate up
        int hole = ++currentSize;
        for( array[ 0 ] = x; x.compareTo( array[ hole / 2 ] ) < 0; hole /= 2 )
            array[ hole ] = array[ hole / 2 ];
        array[ hole ] = x;
    }


    private void enlargeArray( int newSize )
    {
            AnyType [] old = array;
            array = (AnyType []) new Comparable[ newSize ];
            for( int i = 0; i < old.length; i++ )
                array[ i ] = old[ i ];        
    }
    
    /**
     * Find the smallest item in the priority queue.
     * @return the smallest item, or throw an UnderflowException if empty.
     */
    public AnyType findMin( )
    {
        if( isEmpty( ) )
            throw new RuntimeException( );
        return array[ 1 ];
    }

    /**
     * Remove the smallest item from the priority queue.
     * @return the smallest item, or throw an UnderflowException if empty.
     */
    public AnyType deleteMin( )
    {
        if( isEmpty( ) )
            throw new RuntimeException( );

        AnyType minItem = findMin( );
        array[ 1 ] = array[ currentSize-- ];
        percolateDown( 1 );

        return minItem;
    }

    /**
     * Establish heap order property from an arbitrary
     * arrangement of items. Runs in linear time.
     */
    public void buildHeap( )
    {
        for( int i = currentSize / 2; i > 0; i-- )
            percolateDown( i );
    }

    /**
     * Test if the priority queue is logically empty.
     * @return true if empty, false otherwise.
     */
    public boolean isEmpty( )
    {
        return currentSize == 0;
    }

    /**
     * Make the priority queue logically empty.
     */
    public void makeEmpty( )
    {
        currentSize = 0;
    }

    private static final int DEFAULT_CAPACITY = 10;

    private int currentSize;      // Number of elements in heap
    private AnyType [ ] array; // The heap array

    /**
     * Internal method to percolate down in the heap.
     * @param hole the index at which the percolate begins.
     */
    private void percolateDown( int hole )
    {
        int child;
        AnyType tmp = array[ hole ];

        for( ; hole * 2 <= currentSize; hole = child )
        {
            child = hole * 2;
            if( child != currentSize &&
                    array[ child + 1 ].compareTo( array[ child ] ) < 0 )
                child++;
            if( array[ child ].compareTo( tmp ) < 0 )
                array[ hole ] = array[ child ];
            else
                break;
        }
        array[ hole ] = tmp;
    }

        // Test program
    public static void main( String [ ] args )
    {
        int numItems = 10000;
        BinaryHeap<Integer> h = new BinaryHeap<>( );
        int i = 37;

        for( i = 37; i != 0; i = ( i + 37 ) % numItems )
            h.insert( i );
        for( i = 1; i < numItems; i++ )
            if( h.deleteMin( ) != i )
                System.out.println( "Oops! " + i );
    }
}

View Code

參考資料

排序算法總結之堆排序

數據結構與算法分析 Mark Allen Weiss著

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 數據結構--堆的實現(上) Linux堆內存管理深入分析(上) 數據結構之“堆” 數據結構堆 Linux堆內存管理深入分析（下） java數據結構----堆數據結構：堆（Heap）數據結構：堆與棧基本數據結構 —— 堆以及堆排序（C++實現）使用C#實現數據結構堆