JDK 1.7 HashMap Source Code Analysis

Reposted article. 2019-06-13 11:28. Category: JDK source code analysis.

Before looking at how HashMap works, it helps to review a few data structures:

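1. Array: stores data in one contiguous block of memory. Lookup by index is O(1). Searching for a given element requires scanning the whole array, which is O(n); if the array is sorted, binary search brings this down to O(log n). Ordinary insertions and deletions have to shift elements, so their average time complexity is O(n).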

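2. Hash table: exploits the array's support for random access by index, mapping each key to an array index in order to locate its element. A hash table is therefore an extension of the array. The function that maps a key to an index is called the hash function, and its result is the hash value. The design of the hash function is critical: a good one is cheap to compute and spreads hash addresses as evenly as possible.

   Hash collision: different keys produce the same hash value. Two ways of resolving collisions are open addressing and chaining. ThreadLocalMap, which holds few elements, uses open addressing. HashMap uses chaining: all elements that map to the same slot are stored in that slot's linked list (the "array + linked list" layout).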

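3. Linked list: stores data in non-contiguous blocks of memory. It does not support random access, so finding an element means walking the list, which is O(n).

HashMap is built from an array plus linked lists, that is, an array whose slots hold linked lists. The array is the backbone; the linked lists exist only to resolve hash collisions. If the located slot holds no chain (the entry's next is null), lookup, removal and similar operations cost only O(1). If the slot does hold a chain, insertion is O(n): the chain is traversed first, and if an equal key exists its value is overwritten, otherwise a new entry is added. Lookup likewise traverses the chain, comparing keys one by one with the key's equals method, so it is also O(n). The fewer and shorter the chains, the better HashMap performs, which is why HashMap keeps a threshold and resizes when it is reached.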
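The "array + linked list" layout described above can be sketched in a few lines. This is a simplified illustration, not JDK code: the class `ChainedTable`, its `Node` type and the `chainLength` helper are hypothetical names, the table has a fixed length of 16, and it never resizes. It relies on the fact that the strings "Aa" and "BB" have the same `String.hashCode()`, so they collide.

```java
import java.util.Objects;

// Hypothetical, simplified chained hash table; illustrative only, not the JDK implementation.
class ChainedTable<K, V> {
    static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] buckets = (Node<K, V>[]) new Node[16];

    private int indexOf(Object key) {
        // The length is a power of two, so masking keeps the index in range (null maps to 0).
        return Objects.hashCode(key) & (buckets.length - 1);
    }

    V put(K key, V value) {
        int i = indexOf(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) { // key already present: overwrite the value
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // new key: insert at the head of the chain
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = buckets[indexOf(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    // Number of nodes in the chain that the given key maps to.
    int chainLength(Object key) {
        int length = 0;
        for (Node<K, V> n = buckets[indexOf(key)]; n != null; n = n.next) {
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        ChainedTable<String, Integer> t = new ChainedTable<>();
        t.put("Aa", 1);
        t.put("BB", 2); // same hashCode as "Aa", so both land in one bucket
        System.out.println(t.get("Aa") + " " + t.get("BB") + " chain=" + t.chainLength("Aa"));
    }
}
```

Both lookups still succeed because the chain is searched with equals; the cost is that the bucket now has to be walked node by node.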

The backbone of HashMap is an Entry array. Entry is HashMap's basic building block; each Entry holds one key-value pair.

    /**
     * An empty table instance to share when the table is not inflated.
     */
    static final Entry<?,?>[] EMPTY_TABLE = {};
    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

Entry is a static nested class of HashMap:

    static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        int hash;

        /**
         * Creates new entry.
         */
        Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }

        public final K getKey() {
            return key;
        }

        public final V getValue() {
            return value;
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry e = (Map.Entry)o;
            Object k1 = getKey();
            Object k2 = e.getKey();
            if (k1 == k2 || (k1 != null && k1.equals(k2))) {
                Object v1 = getValue();
                Object v2 = e.getValue();
                if (v1 == v2 || (v1 != null && v1.equals(v2)))
                    return true;
            }
            return false;
        }

        public final int hashCode() {
            return Objects.hashCode(getKey()) ^ Objects.hashCode(getValue());
        }

        public final String toString() {
            return getKey() + "=" + getValue();
        }

        /**
         * This method is invoked whenever the value in an entry is
         * overwritten by an invocation of put(k,v) for a key k that's already
         * in the HashMap.
         */
        void recordAccess(HashMap<K,V> m) {
        }

        /**
         * This method is invoked whenever the entry is
         * removed from the table.
         */
        void recordRemoval(HashMap<K,V> m) {
        }
    }

Other fields:

    /**
     * The default initial capacity - MUST be a power of two.
     * The initial capacity is 2^4 = 16; the capacity after every resize is also a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30. (Maximum capacity: 2^30.)
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor. (Default: 0.75f.)
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The next size value at which to resize (capacity * load factor).
     * @serial
     */
    // If table == EMPTY_TABLE then this is the initial capacity at which the
    // table will be created when inflated.
    int threshold; // resize threshold

Constructor:

    public HashMap(int initialCapacity, float loadFactor) {
        // Validate the initial capacity and the load factor
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        this.loadFactor = loadFactor;
        // For now the threshold just holds the initial capacity; once the table is
        // actually allocated it becomes capacity * load factor
        threshold = initialCapacity;
        init();
    }

As the constructor shows, no table array is allocated at construction time; the table is only built when put is first called.

The put method:

    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        // Compute the hash from the key
        int hash = hash(key);
        // Compute the array index from the hash and the table length
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // Same hash, then compare keys: if the key already exists, replace its value;
            // otherwise fall through and add a new entry to this bucket's chain
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++; // supports fail-fast iteration; note it is incremented before the entry is actually added
        addEntry(hash, key, value, i); // add the entry; the last argument i is the index into table
        return null;
    }

inflateTable:

    /**
     * Inflates the table.
     */
    private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize, i.e. the smallest power of two that is >= toSize.
        // toSize=13 -> capacity=16; toSize=16 -> capacity=16; toSize=28 -> capacity=32.
        // So when an initial capacity initCapacity is requested, the map does not resize once
        // initCapacity * loadFactor entries are stored; it resizes at capacity * loadFactor,
        // where capacity = roundUpToPowerOf2(initCapacity).

        // For toSize > 1 this returns the largest power of two <= (toSize - 1) * 2.
        // If toSize=1 then capacity=1, and with that initial capacity the first put does not resize.
        int capacity = roundUpToPowerOf2(toSize);

        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];
        initHashSeedAsNeeded(capacity);
    }
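To make the rounding concrete, here is a small sketch that reproduces the capacity and threshold arithmetic for the values mentioned in the comments. The body of `roundUpToPowerOf2` is written from memory of the JDK 7 helper with that name and should be read as an assumption, not a verbatim copy; the class name `InflateDemo` and the `threshold` helper are illustrative.

```java
class InflateDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Assumed to match the JDK 7 helper of the same name; reproduced for illustration only.
    static int roundUpToPowerOf2(int number) {
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    // Same expression inflateTable uses to derive the resize threshold.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        for (int toSize : new int[] {1, 13, 16, 28}) {
            int capacity = roundUpToPowerOf2(toSize);
            System.out.println("toSize=" + toSize + " -> capacity=" + capacity
                    + ", threshold=" + threshold(capacity, 0.75f));
        }
    }
}
```

With a load factor of 0.75, a requested size of 13 therefore gives a threshold of 12 (16 * 0.75), not 9 (13 * 0.75).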

The hash method:

    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }
        // XOR the key's hashCode into hashSeed
        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
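The reason for the extra shifts is easier to see with numbers. The sketch below assumes hashSeed is 0 and copies the two shift-and-XOR lines into a hypothetical `spread` method; `HashSpreadDemo` is an illustrative name. Two hash codes that differ only in their high bits collide when masked directly, but land in different buckets once the high bits are folded down.

```java
class HashSpreadDemo {
    // The supplemental hash from the listing above, with hashSeed assumed to be 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000;
        int b = 0x20000; // differs from a only above the low 16 bits
        int length = 16;

        // Without spreading, only the low 4 bits are used, so both map to bucket 0.
        System.out.println(indexFor(a, length) + " " + indexFor(b, length));
        // After spreading, the high bits influence the index and the keys separate.
        System.out.println(indexFor(spread(a), length) + " " + indexFor(spread(b), length));
    }
}
```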

indexFor:

    /**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1); // guarantees the index always falls within the array bounds
    }
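The mask only behaves like a modulo when the length is a power of two, which is why the table length must always be one. A quick check, using the hypothetical helper `matchesFloorMod` and Java 8's `Math.floorMod` as the reference:

```java
class IndexForDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if masking agrees with a non-negative modulo for every h in a sample range.
    static boolean matchesFloorMod(int length) {
        for (int h = -1000; h <= 1000; h++) {
            if (indexFor(h, length) != Math.floorMod(h, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("length 16: " + matchesFloorMod(16)); // power of two
        System.out.println("length 10: " + matchesFloorMod(10)); // not a power of two
        System.out.println("index of -7 in 16 slots: " + indexFor(-7, 16));
    }
}
```

With length 10 the mask is binary 1001, so only indexes 0, 1, 8 and 9 could ever be produced and most slots would stay empty.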

So the final storage position is derived as follows:

key --hashCode()--> hashCode --hash()--> h --indexFor(): h & (length-1)--> bucket index

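addEntry:

    transient int size; // number of key-value mappings in the map (not the table length)

    void addEntry(int hash, K key, V value, int bucketIndex) {
        // Before adding, check whether size has reached the threshold. If it has and the
        // target bucket is already occupied, resize the table and recompute the new key's
        // index against the new table length
        if ((size >= threshold) && (null != table[bucketIndex])) {
            // Double the table length
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

    // Store the new key-value pair at the head of the bucket and increment size
    void createEntry(int hash, K key, V value, int bucketIndex) {
        // If two threads reach this point at the same time, one thread's assignment
        // overwrites the other's; this is one way entries get lost
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }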
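A consequence of the two-part condition in addEntry is that reaching the threshold alone does not trigger a resize; the target bucket must also be occupied. The model below is a rough sketch, not the real class: it tracks only bucket occupancy, assumes hashSeed is 0, and uses Integer keys 0..n-1 (whose hashCode is the value itself). `ResizeConditionModel` and `capacityAfter` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class ResizeConditionModel {
    // Supplemental hash from the hash method, with hashSeed assumed to be 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Table capacity after inserting the distinct Integer keys 0..n-1 into a default-sized map.
    static int capacityAfter(int n) {
        int capacity = 16;
        int threshold = 12;
        int size = 0;
        int[] bucketCount = new int[capacity];
        List<Integer> hashes = new ArrayList<>();

        for (int key = 0; key < n; key++) {
            int hash = spread(key);
            int index = hash & (capacity - 1);
            // Same test as addEntry: threshold reached AND target bucket already occupied.
            if (size >= threshold && bucketCount[index] != 0) {
                capacity *= 2;
                threshold = (int) (capacity * 0.75f);
                bucketCount = new int[capacity];
                for (int h : hashes) {
                    bucketCount[h & (capacity - 1)]++; // re-index existing entries
                }
                index = hash & (capacity - 1);
            }
            bucketCount[index]++;
            hashes.add(hash);
            size++;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println("after 16 keys: capacity " + capacityAfter(16));
        System.out.println("after 17 keys: capacity " + capacityAfter(17));
    }
}
```

Keys 0 to 15 each fall into their own empty bucket, so in this model the table stays at 16 slots even though size passes the threshold of 12; the 17th key hits an occupied bucket and the table doubles.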

resize:

    void resize(int newCapacity) {
        // Keep a reference to the old table
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        // If the table is already at maximum capacity, stop resizing
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        // Allocate the new table
        Entry[] newTable = new Entry[newCapacity];
        // Move the contents of the old table into the new one
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        // Recompute the resize threshold for the new table
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

transfer:

Entries from a bucket end up in the new table in reverse order.

    /**
     * Transfers all entries from current table to newTable.
     */
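    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        // Walk the old table and re-index every entry against the new table length;
        // every entry in a bucket's chain gets its index recomputed
        for (Entry<K,V> e : table) {
            // If this slot holds entries, walk the chain until e == null
            while(null != e) {
                Entry<K,V> next = e.next;
                // The current entry always goes directly into the slot (head insertion),
                // not onto the tail of the chain, so the chain order is reversed
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);
                // The entry previously in the slot becomes the current entry's next
                e.next = newTable[i];
                // The migrated entry is placed directly in the slot
                newTable[i] = e;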
                e = next;
            }
        }
    }
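The order reversal caused by head insertion can be reproduced in isolation. This sketch uses a hypothetical `Node` type and assumes every node in the chain maps to the same new slot, which keeps the example short; in the real transfer some nodes may move to a different index.

```java
class TransferOrderDemo {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Moves a chain into a single new slot with head insertion, following the loop in transfer.
    static Node transferChain(Node e) {
        Node newSlot = null;
        while (e != null) {
            Node next = e.next;
            e.next = newSlot; // what is already in the slot becomes this node's successor
            newSlot = e;      // this node becomes the new head
            e = next;
        }
        return newSlot;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            sb.append(n.key).append("-->");
        }
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        Node chain = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(render(chain));
        System.out.println(render(transferChain(chain)));
    }
}
```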

The get method:

    public V get(Object key) {
        // A null key is always stored in table[0], so look there directly
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key); // look up the Entry for this key

        return null == entry ? null : entry.getValue();
    }

    final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }
        // Recompute the hash from the key's hashCode
        int hash = (key == null) ? 0 : hash(key);
        // Locate the key's bucket, then walk the chain comparing keys with equals
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

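The get method is comparatively simple: key (hashCode) --> hash --> indexFor --> index. It goes to table[index], then walks whatever chain is there, comparing keys with equals until it finds the matching entry.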

Why must hashCode() be overridden whenever equals() is overridden, and what goes wrong if it is not?

For example, the User class below overrides equals but not hashCode:

import java.util.Objects;

public class User {
    private int age;
    private String name;

    public User(int age, String name) {
        this.age = age;
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        User user = (User) o;
        return age == user.age &&
                Objects.equals(name, user.name);
    }

}

Use it as a HashMap key, then look the value up:

        User user = new User(20, "yangyongjie");
        Map<User, String> map = new HashMap<>(1);
        map.put(user, "rookie");
        String value = map.get(new User(20, "yangyongjie"));
        System.out.println(value); // prints null

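The result is null. Why? By default, Object.hashCode() is a native method that yields an identity-based value, typically related to the object's memory address, and it differs for most distinct objects. The User passed to put and the User passed to get are logically equal, because the overridden equals compares age and name, but they are two separate objects, so their default hash codes almost certainly differ. The get call therefore usually probes a different bucket than put used, and even if both happen to land in the same bucket, the e.hash == hash check fails before equals is ever called. So whenever equals is overridden, hashCode must be overridden as well.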
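One possible fix is sketched below. `FixedUser` is a hypothetical variant of the User class that adds a hashCode built from the same fields equals compares; using `Objects.hash` is one convenient choice among several, not the only correct one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical variant of User that keeps equals and hashCode consistent.
class FixedUser {
    private final int age;
    private final String name;

    FixedUser(int age, String name) {
        this.age = age;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        FixedUser other = (FixedUser) o;
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Derived from exactly the fields that equals compares.
        return Objects.hash(age, name);
    }

    // Stores a value under one instance and looks it up with a second, equal instance.
    static String lookup() {
        Map<FixedUser, String> map = new HashMap<>(1);
        map.put(new FixedUser(20, "yangyongjie"), "rookie");
        return map.get(new FixedUser(20, "yangyongjie"));
    }

    public static void main(String[] args) {
        System.out.println(lookup());
    }
}
```

Because equal objects now produce equal hash codes, both calls reach the same bucket and the lookup finds the stored value.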

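In addition, a hash-based Set such as HashSet stores no duplicate objects and uses hashCode and equals to decide what counts as a duplicate, so custom classes stored in one must override both methods too.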

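One more note: without overrides, equals compares object references, and hashCode is likewise derived from object identity. The difference is that two distinct objects may happen to share the same hashCode, whereas the default equals never reports them as equal.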

Some problems with HashMap

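Dead loop (circular chain):

Two threads A and B run resize() on the same HashMap at the same time. Both are inside the while loop of transfer, and the current slot holds a-->b-->null.

1. Thread A is suspended right after executing Entry<K,V> next = e.next; at that point e=a and next=b.

2. Thread B runs the whole method to completion; in its newTable the chain is now b-->a-->null.

3. Thread A resumes. It places a, then b (whose next now points to a), then a again, which leaves a and b pointing at each other (a<-->b). The transfer loop itself ends, but any later traversal of that bucket, for example the e != null loop in get or put, never reaches null and spins forever, driving CPU usage up.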
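The interleaving above can be replayed deterministically on a single thread, which makes the cycle easy to verify. This is a sketch under simplifying assumptions: `Node` and `DeadLinkReplay` are hypothetical names, both nodes are assumed to map to the same slot in the new table, and each "thread" is just a block of statements executed in the order described.

```java
class DeadLinkReplay {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Replays steps 1 to 3 and returns the head of the slot thread A ends up with.
    static Node replay() {
        Node b = new Node("b", null);
        Node a = new Node("a", b); // old slot: a-->b-->null

        // Step 1: thread A executes "next = e.next" and is suspended.
        Node e = a;
        Node next = e.next; // b

        // Step 2: thread B runs the whole transfer loop (head insertion).
        Node slotB = null;
        for (Node x = a; x != null; ) {
            Node n = x.next;
            x.next = slotB;
            slotB = x;
            x = n;
        }
        // Thread B's slot is now b-->a-->null.

        // Step 3: thread A resumes with its stale e and next, using its own new slot.
        Node slotA = null;
        while (true) {
            e.next = slotA;
            slotA = e;
            e = next;
            if (e == null) {
                break;
            }
            next = e.next;
        }
        return slotA;
    }

    public static void main(String[] args) {
        Node head = replay();
        // Following next twice from the head leads back to the head.
        System.out.println(head.key + " -> " + head.next.key + " -> " + head.next.next.key);
    }
}
```

Thread A's loop finishes after three iterations, yet the chain it leaves behind has no null terminator, so a lookup that walks this bucket without finding its key would never stop.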

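Data loss during resize:

This also happens in resize's transfer method.

1. While one thread is migrating entries, another thread may add an entry to a bucket the migrating thread has already traversed. Once migration finishes and the table reference is pointed at newTable, that entry is lost and eventually garbage collected.

2. If several threads run resize at the same time, each one creates its own new Entry[newCapacity]. At that stage the array is a local variable, invisible to the other threads. When a thread finishes migrating, it assigns its array to the shared table field, overwriting whatever another thread has published. Entries inserted into the overwritten new table are discarded.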
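The second scenario can also be replayed on one thread. The sketch below is heavily simplified and purely illustrative: the shared table field is modeled by a local variable, each slot holds at most one key instead of a chain, and the names `LostUpdateReplay`, `newTable1` and `newTable2` are hypothetical.

```java
import java.util.Arrays;

class LostUpdateReplay {
    // Returns true if the key inserted between the two publications is still reachable.
    static boolean replay() {
        String[] table = {"k1", null}; // shared table before the resize

        // Threads 1 and 2 both enter resize and each migrates into a private array.
        String[] newTable1 = Arrays.copyOf(table, 4);
        String[] newTable2 = Arrays.copyOf(table, 4);

        table = newTable1; // thread 1 publishes its array
        table[3] = "k2";   // a third thread inserts into the currently published table
        table = newTable2; // thread 2 publishes, replacing thread 1's array

        return Arrays.asList(table).contains("k2");
    }

    public static void main(String[] args) {
        System.out.println("k2 survived: " + replay());
    }
}
```

The key written between the two assignments lived only in the first array; once the second array replaces it, nothing references that entry any more.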
