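HashMap is the most commonly used key-value data type in Java, and many parts of its source code reward a close reading.
First, three concepts need to be clearly distinguished: capacity, size, and threshold.
Capacity is the length of the map's internal bucket array (table). Size is the number of key-value pairs the map currently holds. Threshold is the resize threshold: once size exceeds it, the map is resized. In other words, with a load factor below 1 the number of key-value pairs normally never reaches the capacity. The threshold is generally capacity * loadFactor (the load factor).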
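To make the relationship concrete, here is a minimal sketch that computes the threshold from a capacity and a load factor. The names ThresholdSketch and thresholdFor are made up for illustration and are not part of the JDK.

```java
final class ThresholdSketch {
    // Hypothetical helper: threshold = capacity * loadFactor, truncated to an int.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(thresholdFor(16, 0.75f)); // 12
        System.out.println(thresholdFor(64, 0.75f)); // 48
    }
}
```

With the default values, a 16-bucket table is therefore resized when the 13th entry is added.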
1. By default, an object created with new HashMap() has a capacity of 16 and a load factor of 0.75.
/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

/**
 * The load factor used when none specified in constructor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;
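2. If an initial capacity is specified when the map is created, the actual capacity is the smallest power of two that is greater than or equal to that number.
That is, the result of the tableSizeFor method (1 => 1; 5 => 8; 8 => 8; 9 => 16).
When the following code runs: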
Map<String, String> m = new HashMap<>(5);
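it ends up calling the following (HashMap(int) delegates to the two-argument constructor with the default load factor):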
/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}
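The tableSizeFor method repeatedly shifts the value right and ORs it with itself.
In effect, starting from the highest set bit of the binary number, it sets every lower bit to 1. For example: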
0010 0000   (original value)
0001 0000   (shifted right by 1)
---------   (bitwise OR)
0011 0000
0000 1100   (shifted right by 2)
---------   (bitwise OR)
0011 1100
0000 0011   (shifted right by 4)
---------   (bitwise OR)
0011 1111
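So tableSizeFor takes a 32-bit value, sets every bit below the highest set bit to 1, and then adds 1. Because it starts from cap - 1, a value that is already a power of two maps to itself, so the result is the smallest power of two greater than or equal to the requested size.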
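The bit smearing can be checked in isolation. The sketch below copies the arithmetic of tableSizeFor into a standalone class (TableSizeForSketch is a made-up name) and adds tableSizeForAlt, an alternative formulation based on Integer.numberOfLeadingZeros that is included here only as a cross-check.

```java
final class TableSizeForSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same shift-and-OR steps as the method quoted above.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Cross-check (an assumption of this sketch): build the all-ones mask
    // directly from the position of the highest set bit of cap - 1.
    static int tableSizeForAlt(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        // Smearing 0010 0000 yields 0011 1111.
        int n = 0b0010_0000;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        System.out.println(Integer.toBinaryString(n)); // 111111

        for (int cap : new int[] {1, 5, 8, 9}) {
            System.out.println(cap + " => " + tableSizeFor(cap));
        }
    }
}
```

The loop prints the same mappings listed earlier: 1 => 1, 5 => 8, 8 => 8, 9 => 16.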
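For a detailed analysis, see: https://www.hollischuang.com/archives/2431
Note that in the source above, the result of tableSizeFor is assigned to threshold. So why do we say that the initial capacity takes this value?
Because the table is not allocated when the map is constructed; allocation is triggered the first time data is added to the map. The first put call goes through: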
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length; // note this line
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        ...
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
Note that putVal checks whether table is null (or empty):
if ((tab = table) == null || (n = tab.length) == 0)
If it is, resize is called:
n = (tab = resize()).length;
And inside the resize method:
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr; // note this line
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor; // the threshold is recomputed as capacity * load factor
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    ...
}
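In other words, the threshold set earlier is used as the initial capacity.
An initial capacity of 1, i.e. new HashMap(1), behaves slightly differently: before put is called, capacity == 1, and after the first put, capacity == 2. This is because a resize takes place. The first resize sets the capacity to 1 and recomputes the threshold as (int)(1 * 0.75) = 0, so the first insertion makes size (1) exceed the threshold and immediately triggers a second resize.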
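import java.lang.reflect.Method;

HashMap<String, String> m = new HashMap<>(1);
// m.put("", ""); // after this call, capacity would be 2
// capacity() is package-private, so it is read reflectively here;
// on JDK 16 and later this needs --add-opens java.base/java.util=ALL-UNNAMED
Method method = m.getClass().getDeclaredMethod("capacity");
method.setAccessible(true);
System.out.println(method.invoke(m)); // prints: 1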
In other cases:
HashMap<String, String> m = new HashMap<>(3);
// m.put("", ""); // after this call capacity is still 4; the result is the same either way
Method method = m.getClass().getDeclaredMethod("capacity");
method.setAccessible(true);
System.out.println(method.invoke(m)); // prints: 4
What the capacity method actually does is:
final int capacity() {
    return (table != null) ? table.length :
        (threshold > 0) ? threshold :
        DEFAULT_INITIAL_CAPACITY;
}
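In other words, if the table array has already been allocated, its length is returned; otherwise the threshold is returned, or DEFAULT_INITIAL_CAPACITY when the threshold is 0.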
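The capacity values above can also be traced without reflection. The following is a simplified model, not the JDK class: HashMapCapacityModel and putNewKey are made-up names, the table is reduced to its length, and the step that stores the new threshold and table at the end of resize is an assumption of this sketch, since that part of the method is not quoted above.

```java
final class HashMapCapacityModel {
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private int tableLength = -1; // -1 models table == null
    private int threshold;
    private int size;
    private final float loadFactor = DEFAULT_LOAD_FACTOR;

    HashMapCapacityModel(int initialCapacity) {
        // As in the constructor quoted earlier: the rounded capacity is parked in threshold.
        this.threshold = tableSizeFor(initialCapacity);
    }

    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Mirrors the capacity() method quoted above.
    int capacity() {
        return (tableLength >= 0) ? tableLength
                : (threshold > 0) ? threshold
                : DEFAULT_INITIAL_CAPACITY;
    }

    // Capacity and threshold arithmetic of resize(); no entries are moved.
    private void resize() {
        int oldCap = Math.max(tableLength, 0);
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return;
            } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY
                    && oldCap >= DEFAULT_INITIAL_CAPACITY) {
                newThr = oldThr << 1;
            }
        } else if (oldThr > 0) {
            newCap = oldThr; // initial capacity was placed in threshold
        } else {
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float) newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                    ? (int) ft : Integer.MAX_VALUE;
        }
        threshold = newThr;   // assumed final step of resize()
        tableLength = newCap; // stands in for allocating the new table
    }

    // Models put() with a key that is not yet present.
    void putNewKey() {
        if (tableLength <= 0) {
            resize();
        }
        if (++size > threshold) {
            resize();
        }
    }

    public static void main(String[] args) {
        for (int initial : new int[] {1, 3}) {
            HashMapCapacityModel m = new HashMapCapacityModel(initial);
            int before = m.capacity();
            m.putNewKey();
            System.out.println(initial + ": " + before + " -> " + m.capacity());
        }
        // prints "1: 1 -> 2" and "3: 4 -> 4"
    }
}
```

The model reproduces both observations: an initial capacity of 1 grows to 2 on the first insertion, while an initial capacity of 3 stays at 4.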
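3. What happens on resize:
Each time a resize is triggered, the capacity becomes twice its previous value (until MAXIMUM_CAPACITY is reached).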
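As a rough illustration of the doubling, the sketch below counts how the capacity grows as distinct keys are added to a map created with the default settings. It assumes the defaults shown earlier (capacity 16, load factor 0.75), tracks only the two numbers, and ignores everything else HashMap does; DoublingSketch and capacityAfter are made-up names.

```java
final class DoublingSketch {
    // Capacity after inserting n distinct keys, assuming a default-constructed map:
    // capacity 16, threshold 12, both doubling whenever size exceeds the threshold.
    static int capacityAfter(int n) {
        int capacity = 16;
        int threshold = 12;
        for (int size = 1; size <= n; size++) {
            if (size > threshold) {
                capacity <<= 1;
                threshold <<= 1;
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12));  // 16
        System.out.println(capacityAfter(13));  // 32
        System.out.println(capacityAfter(25));  // 64
        System.out.println(capacityAfter(100)); // 256
    }
}
```

The resizes happen at the 13th, 25th, 49th and 97th insertions, which is why a map expected to hold many entries is often created with an explicit initial capacity.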