Interview Question: HashMap Data Structure and Implementation Principles

This article is a repost (2016-07-02 19:42).

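Data Structure

HashMap's data structure

Arrays and linked lists are both used to store data, but they sit close to opposite extremes.

Array: an array occupies one contiguous block of memory that has to be reserved as a whole, so it can be wasteful of space. In exchange, access by index takes O(1) time (binary search over a sorted array takes O(log N)). Arrays make addressing easy but insertion and deletion hard.

Linked list: a linked list stores its nodes in scattered locations and allocates them as needed, so it is economical with memory, but reaching an element takes O(N) time because the list has to be traversed. Linked lists make addressing hard but insertion and deletion easy.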
                  
                 
                 
                  
                 
                
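Hash table

Can we combine the strengths of both and build a data structure that is easy to address and also easy to insert into and delete from?

The answer is yes: this is the hash table.

A hash table makes lookups fast without taking up too much memory, and it is convenient to use.

There are several ways to implement a hash table. The one explained here is the most common, separate chaining, which can be thought of as an "array of linked lists", as in the figure:

[Figure: a hash table as an array of 16 slots, each holding the head of a linked list; 12, 28, 108, and 140 chain from slot 12]

From the figure we can see that a hash table consists of an array plus linked lists: in an array of length 16, each element stores the head node of a linked list.

By what rule are elements placed into the array?

In general the slot is hash(key) % len, that is, the hash of the element's key modulo the array length.

In the hash table above, for example, 12 % 16 = 12, 28 % 16 = 12, 108 % 16 = 12, and 140 % 16 = 12, so 12, 28, 108, and 140 are all stored at array index 12.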
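To make the modulo rule concrete, here is a minimal Java sketch that reproduces the arithmetic above. The class name BucketIndexDemo and the helper bucketIndex are illustrative assumptions, not part of HashMap, and the helper assumes a non-negative hash value.

```java
// Hypothetical demo class; names are illustrative only.
class BucketIndexDemo {
    // Maps a non-negative hash value to a slot in a table of the given length.
    static int bucketIndex(int hash, int length) {
        return hash % length;
    }

    public static void main(String[] args) {
        int length = 16;
        int[] hashes = {12, 28, 108, 140};
        for (int h : hashes) {
            // Each of these four values lands in slot 12.
            System.out.println(h + " % " + length + " = " + bucketIndex(h, length));
        }
    }
}
```

All four keys collide in the same slot, which is why each slot has to be able to hold more than one element.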
                 
                 
                  
                 
                
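HashMap can also be understood as a container whose underlying storage is a linear array.

This may seem puzzling: how can a linear array store and retrieve data as key-value pairs?

HashMap does some extra work here. First, it implements a static nested class Entry, whose important fields are key, value, and next. From key and value it is clear that Entry is the basic bean that implements a HashMap key-value pair. The linear array mentioned above is Entry[], and everything in the Map is stored in it.

    /** The table, resized as necessary. Length MUST Always be a power of two. */
    transient Entry[] table;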
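The shape of such an entry and its bucket array can be sketched as follows. SimpleEntry and EntryTableDemo are hypothetical names for illustration; this is a stand-in for the Entry class described above, not the JDK source (which also stores the hash in each entry).

```java
class EntryTableDemo {
    // Hypothetical stand-in for the Entry class described above.
    static final class SimpleEntry<K, V> {
        final K key;             // the map key
        V value;                 // the mapped value, may be updated in place
        SimpleEntry<K, V> next;  // next entry in the same bucket, or null at the end

        SimpleEntry(K key, V value, SimpleEntry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Creates an empty bucket array; every slot starts out as null (an empty chain).
    @SuppressWarnings("unchecked")
    static SimpleEntry<String, Integer>[] newTable(int capacity) {
        return (SimpleEntry<String, Integer>[]) new SimpleEntry[capacity];
    }

    public static void main(String[] args) {
        SimpleEntry<String, Integer>[] table = newTable(16);
        table[3] = new SimpleEntry<>("apple", 1, null);
        System.out.println(table[3].key + " = " + table[3].value);
        System.out.println("slot 4 is empty: " + (table[4] == null));
    }
}
```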

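Storing Data

If the storage is a linear array, how can entries be reached directly by key? HashMap uses a small algorithm, which roughly works like this:

    // When storing:
    int hash = key.hashCode();       // a given key always produces the same int hash
    int index = hash % table.length; // modulo; for a non-negative hash the result is in [0, length - 1]
    table[index] = value;            // store the value at that index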
               
      
      
      
              
      
      
      
              
       
       
       
               
      
      
      
              
      
      
      
              
       
       
       
               
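Question: if two keys produce the same index through hash % table.length, is there a danger that one overwrites the other?

HashMap uses the idea of a linked structure here.

As mentioned above, the Entry class has a next field that points to the next Entry.

For example, the first key-value pair A arrives, and the hash of its key gives index = 0, so we record: table[0] = A.

A little later another pair B arrives whose index is also 0. What now?

HashMap does this: B.next = A, table[0] = B.

If C then arrives with index 0 as well, then C.next = B, table[0] = C.

So the slot at index 0 actually holds three key-value pairs, A, B, and C, linked together through their next fields.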
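The A, B, C walk-through can be replayed with a small sketch. ChainDemo, Node, pushFront, and describe are hypothetical names used only for this illustration; the point is the head insertion, where the newest pair becomes the first node of the chain.

```java
class ChainDemo {
    static final class Node {
        final String name;
        Node next;

        Node(String name, Node next) {
            this.name = name;
            this.next = next;
        }
    }

    // Inserts a new node in front of the current head and returns the new head.
    static Node pushFront(Node head, String name) {
        return new Node(name, head);
    }

    // Walks the chain from head to tail and joins the names with " -> ".
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] table = new Node[16];
        for (String name : new String[] {"A", "B", "C"}) {
            table[0] = pushFront(table[0], name); // every pair lands in slot 0
        }
        System.out.println(describe(table[0])); // C -> B -> A
    }
}
```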
               
       
       
       
                
                 
               
      
      
      
              
      
      
      
              
       
       
       
                
                 
    public V put(K key, V value) {
        if (key == null) return putForNullKey(value); // a null key always goes into the first chain of the array
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        // Walk the chain in this bucket
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            Object k;
            // If the key is already in the chain, update that entry's value in place and
            // return the old value (the entry itself is not replaced and no entry is added)
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

    void addEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K, V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K, V>(hash, key, value, e); // the argument e becomes the new Entry's next
        // If size has reached threshold, grow the table and rehash
        if (size++ >= threshold) resize(2 * table.length);
    }
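The return value of put described in the listing can be observed from the outside with java.util.HashMap. The listing appears to follow an older, Entry-based JDK implementation, but the public contract of put (return the previous value, or null if there was none) is the same in current versions. PutReturnDemo and putTwice are hypothetical names for this sketch, and putTwice assumes the key is not yet in the map.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Puts the same key twice and reports what each call to put returned.
    static Integer[] putTwice(Map<String, Integer> map, String key, int first, int second) {
        Integer r1 = map.put(key, first);
        Integer r2 = map.put(key, second);
        return new Integer[] {r1, r2};
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        Integer[] results = putTwice(map, "a", 1, 2);
        System.out.println(results[0]);   // null: the key was new
        System.out.println(results[1]);   // 1: the previous value
        System.out.println(map.get("a")); // 2
        System.out.println(map.size());   // 1: still a single entry
    }
}
```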
      
      
      
              
      
      
      
              
       
       
       
               
      
      
      
              
      
      
      
              
       
       
       
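HashMap also contains some optimizations. For example, once the length of Entry[] is fixed, the chain at each index grows longer as more data is added to the map. Won't that hurt performance?

HashMap defines a load factor for this: as the map's size grows and reaches the threshold (capacity * loadFactor), the Entry[] array is doubled in length and the entries are rehashed.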
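As a worked example of the threshold rule, the sketch below uses a capacity of 16 and a load factor of 0.75. Those two numbers are the documented defaults of java.util.HashMap rather than values stated in this article, and ThresholdDemo is a hypothetical name.

```java
class ThresholdDemo {
    // Same arithmetic as in the HashMap source: capacity * loadFactor, truncated to int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24, after the table has doubled once
    }
}
```

With these values the table doubles from 16 to 32 slots once it already holds 12 entries and another one is added, which keeps the average chain short.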

Retrieving Data

    // When retrieving:
    int hash = key.hashCode();
    int index = hash % table.length; // same computation as when storing
    return table[index];             // head of the chain for this key
        
        
        
                

       
       
       
               
       
       
       
               
        
        
        
                
       
       
       
               
       
       
       
               
        
        
        
                 
                  
    public V get(Object key) {
        if (key == null) return getForNullKey();
        int hash = hash(key.hashCode());
        // Locate the array slot first, then walk the chain stored there
        for (Entry<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) return e.value;
        }
        return null;
    }
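Putting the store and lookup paths together, a minimal chained map might look like the sketch below. TinyChainedMap is a hypothetical teaching class, not java.util.HashMap: it has a fixed capacity (assumed to be a power of two), never resizes, skips the supplemental hash() step, and lets null keys fall into bucket 0 through Objects.hashCode.

```java
import java.util.Objects;

// Teaching sketch only: fixed capacity, no resizing.
class TinyChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;

    // capacity is assumed to be a power of two so that the bit mask below works.
    @SuppressWarnings("unchecked")
    TinyChainedMap(int capacity) {
        table = (Node<K, V>[]) new Node[capacity];
    }

    private int indexFor(int hash) {
        return hash & (table.length - 1);
    }

    // Returns the previous value for key, or null if the key was absent.
    V put(K key, V value) {
        int hash = Objects.hashCode(key);
        int i = indexFor(hash);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Node<>(hash, key, value, table[i]); // insert at the head
        return null;
    }

    // Locates the bucket first, then walks its chain comparing hash and key.
    V get(Object key) {
        int hash = Objects.hashCode(key);
        for (Node<K, V> e = table[indexFor(hash)]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<Integer, String> map = new TinyChainedMap<>(16);
        map.put(12, "twelve");
        map.put(28, "twenty-eight");     // same bucket as 12
        System.out.println(map.get(12)); // twelve
        System.out.println(map.get(28)); // twenty-eight
        System.out.println(map.get(44)); // null: same bucket, but the key is absent
    }
}
```

Comparing the stored hash before calling equals is a cheap filter: most non-matching entries in a chain are rejected by the int comparison alone.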

Other Logic

Storing and retrieving the null key

The null key is always stored in the first element of the Entry[] array (index 0).
                 
        
        
        
                
        
        
        
                
         
         
         
                  
                   
    private V putForNullKey(V value) {
        for (Entry<K, V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(0, null, value, 0);
        return null;
    }

    private V getForNullKey() {
        for (Entry<K, V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) return e.value;
        }
        return null;
    }
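From the caller's side, the null key behaves like any other key. The sketch below uses java.util.HashMap directly; NullKeyDemo and build are hypothetical names for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

class NullKeyDemo {
    // Stores two values under the null key; the second replaces the first.
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first");
        map.put(null, "second");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.get(null));         // second
        System.out.println(map.containsKey(null)); // true
        System.out.println(map.size());            // 1
    }
}
```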
        
        
        
                
        
        
        
                
         
         
         
                 
        
        
        
                
        
        
        
                
         
         
         
                 
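Determining the array index: hashcode modulo table.length

On every put and get, HashMap has to work out which element of the Entry[] array the current key maps to, that is, compute the array index. The algorithm is: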
                 
        
        
        
                
        
        
        
                
         
         
         
                  
                   
    /**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }
        
        
        
                
        
        
        
                
         
         
         
                 
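This is a bitwise AND. Because length is always a power of two, it has the same effect as taking the hash modulo length (the % remainder, for a non-negative hash).

Note: different hashCodes can produce the same value after this operation, which means they share an array index. Do not draw the reverse conclusion, though: the same array index does not imply the same hashCode.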
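The relationship between the bit mask and the remainder can be checked with a few values. IndexForDemo is a hypothetical class; indexFor copies the one-line method shown above, and Math.floorMod is used as the always-non-negative reference.

```java
class IndexForDemo {
    // Same body as the indexFor method quoted above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int[] hashes = {12, 28, -7};
        for (int h : hashes) {
            System.out.println(h + ": mask=" + indexFor(h, 16)
                    + " remainder=" + (h % 16)
                    + " floorMod=" + Math.floorMod(h, 16));
        }
        // With a length that is not a power of two the mask no longer matches the remainder.
        System.out.println(indexFor(12, 10) + " vs " + (12 % 10)); // 8 vs 2
    }
}
```

For 12 and 28 all three computations give 12, so two different hash codes share one index. For -7 the mask and floorMod both give 9, while the % operator gives -7, which is one reason the mask is the safer choice for computing an array index.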
                 
        
        
        
                
        
        
        
                
         
         
         
                 
        
        
        
                
        
        
        
                
         
         
         
Initial capacity
         
         
         
                 

        
        
        
                
        
        
        
                
         
         
         
                  
                   
    public HashMap(int initialCapacity, float loadFactor) {
        .....
        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor);
        table = new Entry[capacity];
        init();
    }
        
        
        
                
        
        
        
                
         
         
         
                  
Note that the initial capacity is not the initialCapacity passed to the constructor, but the smallest power of two that is >= initialCapacity!
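The rounding loop can be tried in isolation. CapacityDemo, roundUpToPowerOfTwo, and the MAX_CAPACITY bound of 1 << 30 are assumptions made for this sketch; the bound exists because the shift would overflow for larger requests, and the part of the constructor elided above is not reproduced here.

```java
class CapacityDemo {
    static final int MAX_CAPACITY = 1 << 30; // assumed upper bound for this sketch

    // Returns the smallest power of two that is >= initialCapacity.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity < 0 || initialCapacity > MAX_CAPACITY) {
            throw new IllegalArgumentException("unsupported capacity: " + initialCapacity);
        }
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 1, 10, 16, 17}) {
            System.out.println(requested + " -> " + roundUpToPowerOfTwo(requested));
        }
    }
}
```

A request for 10 therefore yields a table of 16 slots, and a request for 17 yields 32.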

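Ways to Resolve Hash Collisions

Open addressing (linear probing, quadratic probing, pseudo-random probing)

Rehashing with another hash function

Separate chaining

A shared overflow area

Java's HashMap resolves collisions with separate chaining.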
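For contrast with chaining, here is a minimal sketch of the first alternative in the list, open addressing with linear probing. It is not how HashMap works; LinearProbingDemo and insert are hypothetical names, and the table stores bare Integer keys with null marking an empty slot.

```java
class LinearProbingDemo {
    // Inserts key into an open-addressing table (null = empty slot).
    // Returns the slot that now holds the key, or -1 if the table is full.
    static int insert(Integer[] table, int key) {
        int start = Math.floorMod(key, table.length);
        for (int step = 0; step < table.length; step++) {
            int i = (start + step) % table.length; // try the next slot, wrapping around
            if (table[i] == null || table[i] == key) {
                table[i] = key;
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[16];
        for (int key : new int[] {12, 28, 108, 140, 44}) {
            System.out.println(key + " -> slot " + insert(table, key));
        }
    }
}
```

All five keys hash to slot 12, so they end up in slots 12, 13, 14, 15, and then 0 after wrapping around. With chaining they would all stay in slot 12 as one linked list.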

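The Rehash Process

When the number of entries in the hash table reaches the threshold (capacity * loadFactor), the table has to be resized.

If the capacity has already reached the maximum possible value, the method sets the threshold to Integer.MAX_VALUE and returns without resizing. Otherwise it creates a new table and transfers the entries of the old table into it.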
         
         
         
                 

        
        
        
                
        
        
        
                
         
         
         
                  
                   
    /**
     * Rehashes the contents of this map into a new array with a
     * larger capacity.  This method is called automatically when the
     * number of keys in this map reaches its threshold.
     *
     * If current capacity is MAXIMUM_CAPACITY, this method does not
     * resize the map, but sets threshold to Integer.MAX_VALUE.
     * This has the effect of preventing future calls.
     *
     * @param newCapacity the new capacity, MUST be a power of two;
     *        must be greater than current capacity unless current
     *        capacity is MAXIMUM_CAPACITY (in which case value
     *        is irrelevant).
     */
    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int) (newCapacity * loadFactor);
    }

    /**
     * Transfers all entries from current table to newTable.
     */
    void transfer(Entry[] newTable) {
        Entry[] src = table;
        int newCapacity = newTable.length;
        for (int j = 0; j < src.length; j++) {
            Entry<K, V> e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Entry<K, V> next = e.next;
                    // Recompute the index for the new capacity
                    int i = indexFor(e.hash, newCapacity);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }
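To see why the index has to be recomputed during the transfer, the sketch below reuses the four hash values from the earlier example and compares their slots before and after the table doubles. RehashDemo and indexes are hypothetical names; indexFor has the same body as the method quoted earlier.

```java
import java.util.Arrays;

class RehashDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Computes the bucket index of every hash for a table of the given capacity.
    static int[] indexes(int[] hashes, int capacity) {
        int[] result = new int[hashes.length];
        for (int k = 0; k < hashes.length; k++) {
            result[k] = indexFor(hashes[k], capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] hashes = {12, 28, 108, 140};
        System.out.println(Arrays.toString(indexes(hashes, 16))); // [12, 12, 12, 12]
        System.out.println(Arrays.toString(indexes(hashes, 32))); // [12, 28, 12, 12]
    }
}
```

After doubling, 28 moves to its own slot while the other three stay in slot 12, so the chain in slot 12 becomes shorter.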
