HashMap: Capacity, Size, and Resizing

This article is a repost (2019-12-09 16:01).

HashMap is the most commonly used key-value data type in Java, and many parts of its source code reward a close reading.

First, three concepts need to be clearly distinguished: capacity, size, and threshold.

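Capacity is the length of the map's internal bucket array (table), which bounds how many entries the map holds before it has to grow; size is the number of key-value pairs currently stored in the map. The threshold is the resize trigger: once size exceeds it, the map is resized. In other words, with a load factor below 1, the number of entries normally never reaches the capacity. The threshold is generally capacity * loadFactor (the load factor).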
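As a small illustration of the capacity * loadFactor relationship, here is a minimal sketch. The class and method names (ThresholdSketch, threshold) are made up for this example and are not part of the JDK; the truncating cast mirrors how the quoted source converts the product to an int.

```java
class ThresholdSketch {
    // Hypothetical helper: the resize threshold for a given capacity and load factor.
    // The float product is truncated toward zero, as in the (int) casts in HashMap.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
        System.out.println(threshold(1, 0.75f));  // 0
    }
}
```

With the defaults (capacity 16, load factor 0.75), the map therefore grows when the 13th entry is added, not when the 17th is.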

1. By default, an object created with new HashMap() has a capacity of 16 and a load factor of 0.75

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

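2. If an initial capacity is specified when the map is created, the actual capacity is the smallest power of two greater than or equal to that number.

That is, the result of the tableSizeFor method (1 => 1; 5 => 8; 8 => 8; 9 => 16).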

When the following code runs:

Map<String, String> m = new HashMap<>(5);

the constructor ends up calling:

    /**
     * Returns a power of two size for the given target capacity.
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

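The tableSizeFor method here repeatedly shifts n to the right and ORs the result back into n.

In effect, it takes a binary number and, starting from its highest set bit, sets every lower bit to 1. For example: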

0010 0000   (original value)
0001 0000   (shifted right by 1)
------------- (bitwise OR)
0011 0000
0000 1100   (shifted right by 2)
------------- (bitwise OR)
0011 1100
0000 0011   (shifted right by 4)
------------- (bitwise OR)
0011 1111

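So tableSizeFor takes a 32-bit value, sets every bit below the highest set bit to 1, and then adds 1. Because it starts from cap - 1, the result is the smallest power of two that is greater than or equal to the requested size.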
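To check the mappings listed earlier (1 => 1; 5 => 8; 8 => 8; 9 => 16), the method can be copied into a standalone class and run directly. This is a sketch: the class name TableSizeForSketch is made up, and the method body is the JDK 8 version quoted above (later JDK releases may implement it differently).

```java
class TableSizeForSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as the JDK 8 tableSizeFor quoted in the article,
    // copied here so it can be run in isolation.
    static int tableSizeFor(int cap) {
        int n = cap - 1;   // subtract 1 so exact powers of two map to themselves
        n |= n >>> 1;      // each step copies the set bits further to the right
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;     // every bit below the highest set bit is now 1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {1, 5, 8, 9}) {
            System.out.println(cap + " => " + tableSizeFor(cap));
        }
    }
}
```

Without the initial cap - 1, an input of 8 (binary 1000) would smear to 1111 and return 16 instead of 8.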

For a detailed analysis, see: https://www.hollischuang.com/archives/2431

Note that in the source above, the value of tableSizeFor is assigned to threshold. So why do we say that the initial capacity is that value?

Because the table is initialized lazily: it is only allocated the first time data is added to the map. On the first put, the call is:

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;   // note this line
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
           ...
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

Note that putVal above checks whether table is null:

if ((tab = table) == null || (n = tab.length) == 0)

If it is null, resize is called:

n = (tab = resize()).length;

And inside resize:

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr; // note this line
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor; // the threshold is recomputed as capacity * load factor
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        ... // rest of resize() omitted
    }

In effect, the threshold set earlier is used as the initial capacity.

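The case of an initial capacity of 1, i.e. new HashMap(1), is slightly different: before put is called, capacity == 1, and after the first put, capacity == 2. This is because a resize has occurred: the first resize() recomputes the threshold as (int)(1 * 0.75) = 0, so the ++size > threshold check in putVal immediately triggers a second resize.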

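        HashMap<String,String> m = new HashMap<>(1);
        //m.put("",""); // after this call, capacity will be 2
        // capacity() is package-private, hence reflection; on JDK 16+ this also
        // requires --add-opens java.base/java.util=ALL-UNNAMED
        Method method = m.getClass().getDeclaredMethod("capacity");
        method.setAccessible(true);
        System.out.println(method.invoke(m)); // prints: 1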

In other cases:

        HashMap<String,String> m = new HashMap<>(3);
        //m.put("",""); // after this call, capacity is 4; the result is the same whether or not put is called
        Method method = m.getClass().getDeclaredMethod("capacity");
        method.setAccessible(true);
        System.out.println(method.invoke(m)); // prints: 4
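The difference between the two cases can be reproduced without reflection by modeling only the arithmetic. The sketch below is a simplified model, not the JDK code: LazyInitSketch and capacityAfterFirstPut are hypothetical names, the default load factor of 0.75 is assumed, and the MAXIMUM_CAPACITY edge cases are ignored.

```java
class LazyInitSketch {
    static final float LOAD_FACTOR = 0.75f; // assumed: the default load factor

    // Hypothetical model: the table length after the first put, given the
    // threshold stored by the constructor (a power of two from tableSizeFor).
    static int capacityAfterFirstPut(int initialThreshold) {
        int capacity = initialThreshold;                // first resize(): newCap = oldThr
        int threshold = (int) (capacity * LOAD_FACTOR); // threshold is recomputed
        int size = 1;                                   // the entry just inserted
        if (size > threshold) {                         // the ++size > threshold check in putVal
            capacity <<= 1;                             // second resize(): the table doubles
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterFirstPut(1)); // 2, since (int)(1 * 0.75) == 0
        System.out.println(capacityAfterFirstPut(4)); // 4, since (int)(4 * 0.75) == 3
    }
}
```

Under this model, an initial threshold of 1 is the only power of two for which the recomputed threshold truncates to 0, which is why new HashMap(1) is the odd case.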

In fact, when the capacity method is called:

    final int capacity() {
        return (table != null) ? table.length :
            (threshold > 0) ? threshold :
            DEFAULT_INITIAL_CAPACITY;
    }

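That is, if the table array has already been initialized, its length is returned; otherwise threshold is returned (or DEFAULT_INITIAL_CAPACITY if threshold is 0).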

3. What happens on resize:

Each time a resize is triggered, capacity becomes twice its previous value.
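The doubling can be traced with a small model of a default map (capacity 16, threshold 12). This is only a sketch of the arithmetic: DoublingSketch and capacityAfter are hypothetical names, distinct keys are assumed, and the MAXIMUM_CAPACITY limit is ignored.

```java
class DoublingSketch {
    // Hypothetical model: the table length after inserting n distinct keys
    // into a map created with new HashMap<>() (capacity 16, load factor 0.75).
    static int capacityAfter(int n) {
        int capacity = 16;
        int threshold = 12;            // 16 * 0.75
        for (int size = 1; size <= n; size++) {
            if (size > threshold) {    // same condition as ++size > threshold
                capacity <<= 1;        // capacity doubles
                threshold <<= 1;       // threshold doubles with it
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12)); // 16
        System.out.println(capacityAfter(13)); // 32
        System.out.println(capacityAfter(25)); // 64
    }
}
```

The 13th and 25th insertions are the ones that cross the thresholds 12 and 24.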

Reference: https://www.hollischuang.com/archives/2431
