This article is a record of my own learning, so it may have some shortcomings.
Part 1: HashMap in JDK 1.7
In JDK 1.7, HashMap is backed by an array plus linked lists.
/**
 * The default initial capacity - MUST be a power of two.
 * 1 << 4 shifts 1 left by 4 bits, giving 10000 in binary, i.e. 16; working with powers of two keeps the bit operations cheap.
 * This is the default length of the HashMap array.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30; // 1 073 741 824

/**
 * The load factor used when none specified in constructor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
 * An empty table instance to share when the table is not inflated.
 */
static final Entry<?,?>[] EMPTY_TABLE = {};

/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

/**
 * The number of key-value mappings contained in this map.
 */
transient int size;

/**
 * The next size value at which to resize (capacity * load factor).
 * @serial
 */
// If table == EMPTY_TABLE then this is the initial capacity at which the
// table will be created when inflated.
int threshold;

/**
 * The load factor for the hash table.
 *
 * @serial
 */
final float loadFactor;

/**
 * The number of times this HashMap has been structurally modified.
 * Structural modifications are those that change the number of mappings in
 * the HashMap or otherwise modify its internal structure (e.g.,
 * rehash). This field is used to make iterators on Collection-views of
 * the HashMap fail-fast. (See ConcurrentModificationException).
 */
transient int modCount;
1: DEFAULT_INITIAL_CAPACITY is the default initial capacity of the HashMap; it must be a power of two.
2: MAXIMUM_CAPACITY is the maximum capacity the HashMap supports.
3: DEFAULT_LOAD_FACTOR is the default load factor, 0.75; it determines how densely the HashMap is allowed to fill before resizing.
4: Entry<K,V>[] table is the HashMap's array; it is resized as needed, and its length must always be a power of two.
5: size records the number of elements currently in the HashMap.
6: threshold is the size at which the HashMap resizes, i.e. capacity * load factor.
7: loadFactor is the load factor actually in use, which can be customized.
8: modCount records how many times the HashMap's structure has been modified.
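To make the relationship between these fields concrete, here is a minimal sketch (plain arithmetic, not JDK source) showing that with the default capacity of 16 and load factor of 0.75, the resize threshold works out to 12:

public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 1 << 4;          // DEFAULT_INITIAL_CAPACITY = 16
        float loadFactor = 0.75f;       // DEFAULT_LOAD_FACTOR
        int threshold = (int) (capacity * loadFactor);
        System.out.println(threshold);  // 12 -> a resize is considered once size reaches this
    }
}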
The HashMap source has four constructors; at construction time you can specify the initial capacity and the load factor.
/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 * Does two things: 1) assigns threshold and loadFactor; 2) calls init().
 *
 * @param initialCapacity the initial capacity
 * @param loadFactor the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 * or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)             // cap at the maximum capacity
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))     // validate loadFactor
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);

    // All this constructor really does is record loadFactor and initialCapacity
    this.loadFactor = loadFactor;    // record loadFactor
    threshold = initialCapacity;     // initial threshold = initialCapacity (16 by default)
    init();
}

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param initialCapacity the initial capacity.
 * @throws IllegalArgumentException if the initial capacity is negative.
 */
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

/**
 * Constructs an empty <tt>HashMap</tt> with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);    // 16, 0.75
}

/**
 * Constructs a new <tt>HashMap</tt> with the same mappings as the
 * specified <tt>Map</tt>. The <tt>HashMap</tt> is created with
 * default load factor (0.75) and an initial capacity sufficient to
 * hold the mappings in the specified <tt>Map</tt>.
 *
 * @param m the map whose mappings are to be placed in this map
 * @throws NullPointerException if the specified map is null
 */
public HashMap(Map<? extends K, ? extends V> m) {
    this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
                  DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);
    inflateTable(threshold);
    putAllForCreate(m);
}
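For reference, this is how those four constructors are typically exercised from user code (standard java.util.HashMap API, nothing assumed beyond the constructors shown above):

import java.util.HashMap;
import java.util.Map;

public class ConstructorDemo {
    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();           // default capacity 16, load factor 0.75
        Map<String, Integer> b = new HashMap<>(64);          // custom capacity, default load factor
        Map<String, Integer> c = new HashMap<>(64, 0.5f);    // custom capacity and load factor
        Map<String, Integer> d = new HashMap<>(b);           // copy constructor
        System.out.println(a.size() + " " + b.size() + " " + c.size() + " " + d.size());
    }
}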
Next, let's look at the put method:
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);                 // lazily create the table on first put
    }
    if (key == null)                             // a null key gets special handling
        return putForNullKey(value);
    int hash = hash(key);                        // compute the hash
    int i = indexFor(hash, table.length);        // compute the index from the hash
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {   // walk the linked list at index i
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;                // same key: overwrite the old value
            e.value = value;
            e.recordAccess(this);                // does nothing in HashMap
            return oldValue;                     // return the old value
        }
    }

    modCount++;                                  // structural modification count, similar to a version number
    addEntry(hash, key, value, i);
    return null;
}
You can see that when the table is empty, the following method is called:
private void inflateTable(int toSize) {
    // Find a power of 2 >= toSize
    int capacity = roundUpToPowerOf2(toSize);

    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];                 // allocate the table
    initHashSeedAsNeeded(capacity);
}
This method creates the table; roundUpToPowerOf2 rounds the requested size up so that the HashMap's capacity is always a power of two.
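As a rough illustration of what roundUpToPowerOf2 achieves (a simplified sketch, not the JDK's exact implementation), the requested size is rounded up to the next power of two, capped at the maximum capacity:

// Simplified sketch of rounding a requested size up to a power of two.
static int roundUpToPowerOf2Sketch(int number) {
    int maximumCapacity = 1 << 30;
    if (number >= maximumCapacity) return maximumCapacity;
    int result = 1;
    while (result < number) {
        result <<= 1;                 // keep doubling until we reach or pass the request
    }
    return result;                    // e.g. 13 -> 16, 16 -> 16, 17 -> 32
}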
When putting an element, HashMap first computes the hash of the key, then derives the element's index in the table from the hash and the table length (effectively a modulo, implemented as a bitwise AND). If an entry with the same key already exists at that slot, its value is overwritten; otherwise the new entry is added to the linked list at that slot.
/**
 * Returns index for hash code h.
 * Computes the element's slot in the table.
 */
static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}

/**
 * Adds a new entry with the specified key, value and hash code to
 * the specified bucket. It is the responsibility of this
 * method to resize the table if appropriate.
 *
 * Subclass overrides this to alter the behavior of put method.
 */
void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {   // size has reached threshold AND the target bucket is already occupied
        resize(2 * table.length);                                // double the array length
        hash = (null != key) ? hash(key) : 0;                    // recompute the hash
        bucketIndex = indexFor(hash, table.length);              // recompute the index
    }

    createEntry(hash, key, value, bucketIndex);                  // create the entry
}

/**
 * Like addEntry except that this version is used when creating entries
 * as part of Map construction or "pseudo-construction" (cloning,
 * deserialization). This version needn't worry about resizing the table.
 *
 * Subclass overrides this to alter the behavior of HashMap(Map),
 * clone, and readObject.
 */
void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];                           // current head of the bucket
    table[bucketIndex] = new Entry<>(hash, key, value, e);       // the new entry becomes the head; its next points at the old head
    size++;                                                      // bump the element count
}
When the number of elements reaches the threshold (capacity * load factor, i.e. 12 with the defaults), the map may be resized. As the source above shows, resizing also requires that the bucket the new element maps to is already occupied; if that bucket is null, no resize happens. On resize the table doubles in length, and because the length changed, the hash and the index have to be recomputed.
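To see why the index must be recomputed after the table doubles, here is a small sketch (a hand-rolled copy of indexFor, not JDK code) showing how the same hash can land in a different slot once the length changes:

public class ResizeIndexDemo {
    // same masking trick HashMap uses; works because length is a power of two
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int hash = 0b10101;                          // 21
        System.out.println(indexFor(hash, 16));      // 5  -> bucket in the old table
        System.out.println(indexFor(hash, 32));      // 21 -> bucket in the doubled table
    }
}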
Next, the get method:
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    Entry<K,V> entry = getEntry(key);

    return null == entry ? null : entry.getValue();
}

/**
 * Returns the entry associated with the specified key in the
 * HashMap. Returns null if the HashMap contains no mapping
 * for the key.
 */
final Entry<K,V> getEntry(Object key) {
    if (size == 0) {
        return null;
    }

    int hash = (key == null) ? 0 : hash(key);
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}
get likewise computes the hash and the index first, then walks the bucket's linked list looking for the matching key.
Part 2: HashMap in JDK 1.8
The biggest difference between the JDK 1.8 HashMap and the 1.7 version is the introduction of red-black trees.
/**
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K,V>[] table;

/**
 * Holds cached entrySet(). Note that AbstractMap fields are used
 * for keySet() and values().
 */
transient Set<Map.Entry<K,V>> entrySet;

/**
 * The number of key-value mappings contained in this map.
 */
transient int size;

/**
 * The number of times this HashMap has been structurally modified
 * Structural modifications are those that change the number of mappings in
 * the HashMap or otherwise modify its internal structure (e.g.,
 * rehash). This field is used to make iterators on Collection-views of
 * the HashMap fail-fast. (See ConcurrentModificationException).
 */
transient int modCount;

/**
 * The next size value at which to resize (capacity * load factor).
 *
 * @serial
 */
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;

/**
 * The load factor for the hash table.
 *
 * @serial
 */
final float loadFactor;

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

/**
 * The load factor used when none specified in constructor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
 * The bin count threshold for using a tree rather than list for a
 * bin. Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2 and should be at least 8 to mesh with assumptions in
 * tree removal about conversion back to plain bins upon
 * shrinkage.
 */
static final int TREEIFY_THRESHOLD = 8;

/**
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 */
static final int UNTREEIFY_THRESHOLD = 6;

/**
 * The smallest table capacity for which bins may be treeified.
 * (Otherwise the table is resized if too many nodes in a bin.)
 * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
 * between resizing and treeification thresholds.
 */
static final int MIN_TREEIFY_CAPACITY = 64;

/**
 * Basic hash bin node, used for most entries. (See below for
 * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
 */
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
Now the put method:
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods.
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)            // table is null or empty
        n = (tab = resize()).length;                               // resize (also handles initialization)
    if ((p = tab[i = (n - 1) & hash]) == null)                     // compute index i; if slot i is empty
        tab[i] = newNode(hash, key, value, null);                  // create a new node and drop it into the array
    else {                                                         // slot i already holds a node p
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))    // same key as the head node
            e = p;                                                 // remember it; the value is overwritten below
        else if (p instanceof TreeNode)                            // head is a tree node
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);   // put into the red-black tree
        else {                                                     // different key and not a tree: walk the list at i
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {                        // reached the tail
                    p.next = newNode(hash, key, value, null);      // append a new node at the end
                    if (binCount >= TREEIFY_THRESHOLD - 1)         // -1 for 1st: the list has grown to the treeify threshold (8)
                        treeifyBin(tab, hash);                     // convert the list into a red-black tree
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))   // same key found: stop scanning
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
As the source above shows, put now has to deal with red-black trees. When appending to a bucket would push the linked list past TREEIFY_THRESHOLD (8), the list is converted into a red-black tree; note that treeification also requires the table capacity to be at least MIN_TREEIFY_CAPACITY (64), otherwise the table is resized instead. Conversely, when a tree bin is split during a resize and ends up with 6 or fewer nodes (UNTREEIFY_THRESHOLD), it is converted back into a linked list.
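Treeification is invisible to callers, but you can provoke it by using keys whose hashCode always collides; the map keeps behaving correctly, it just switches the colliding bucket from a list to a tree once the bin grows past the thresholds described above. A small self-contained sketch (the BadKey class is made up purely for illustration):

import java.util.HashMap;
import java.util.Map;

public class TreeifyDemo {
    // Every BadKey hashes to the same value, so all entries collide in one bucket.
    static class BadKey implements Comparable<BadKey> {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int compareTo(BadKey other) { return Integer.compare(id, other.id); }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 20; i++) {               // well past TREEIFY_THRESHOLD (8)
            map.put(new BadKey(i), i);
        }
        System.out.println(map.get(new BadKey(13))); // 13 - lookups still work despite every key colliding
    }
}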
Part 3: ConcurrentHashMap in JDK 1.7
The difference between the JDK 1.7 ConcurrentHashMap and the JDK 1.7 HashMap lies in what the top-level array stores; and as we know, ConcurrentHashMap is thread-safe.
public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key);                                    // compute the hash
    int j = (hash >>> segmentShift) & segmentMask;           // compute the segment index j
    if ((s = (Segment<K,V>)UNSAFE.getObject                  // nonvolatile; recheck
         (segments, (j << SSHIFT) + SBASE)) == null)         //  in ensureSegment
        s = ensureSegment(j);                                // return the segment at j, creating it if absent
    return s.put(key, hash, value, false);                   // delegate the put to that segment
}

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value);                 // if tryLock succeeds node is null; otherwise scan the bucket while waiting for the lock
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        int index = (tab.length - 1) & hash;                 // compute the index from the segment table's length and the hash
        HashEntry<K,V> first = entryAt(tab, index);          // head of the list at index
        for (HashEntry<K,V> e = first;;) {                   // walk the list starting at first
            if (e != null) {
                K k;
                if ((k = e.key) == key ||
                    (e.hash == hash && key.equals(k))) {     // same key
                    oldValue = e.value;                      // grab the old value
                    if (!onlyIfAbsent) {                     // if we are allowed to replace
                        e.value = value;                     // overwrite the old value
                        ++modCount;
                    }
                    break;                                   // found it, stop scanning
                }
                e = e.next;                                  // different key, keep walking
            }
            else {                                           // reached the end of the list
                if (node != null)                            // a node was pre-built while waiting for the lock
                    node.setNext(first);                     // link it in at the head of the list
                else
                    node = new HashEntry<K,V>(hash, key, value, first);   // create a new entry pointing at the old head
                int c = count + 1;                           // count tracks the number of elements in this segment
                if (c > threshold && tab.length < MAXIMUM_CAPACITY)       // over the threshold and still below the maximum capacity
                    rehash(node);                            // like resize: double the table, repack the entries, then add the given node
                else                                         // still room
                    setEntryAt(tab, index, node);            // install the node at index
                ++modCount;                                  // bump the modification count
                count = c;                                   // publish the new count
                oldValue = null;
                break;
            }
        }
    } finally {
        unlock();                                            // release the lock when done
    }
    return oldValue;                                         // return the old value
}

private Segment<K,V> ensureSegment(int k) {
    final Segment<K,V>[] ss = this.segments;
    long u = (k << SSHIFT) + SBASE; // raw offset of slot k
    Segment<K,V> seg;
    if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {   // no segment at slot k yet
        Segment<K,V> proto = ss[0];                          // use segment 0 as prototype
        int cap = proto.table.length;                        // take the capacity from the prototype
        float lf = proto.loadFactor;                         // and its load factor
        int threshold = (int)(cap * lf);                     // compute the threshold
        HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
        if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
            == null) {                                       // recheck: still no segment at slot k
            Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);   // create the segment
            while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                   == null) {                                // spin while slot k is still empty
                if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))   // CAS it in; on success we are done
                    break;
            }
        }
    }
    return seg;
}
/**
 * The segments, each of which is a specialized hash table.
 */
final Segment<K,V>[] segments;
So the top-level array of the JDK 1.7 ConcurrentHashMap stores Segments, and each Segment is itself a small hash table. On put, the target segment is locked, and the index is computed twice: once to pick the Segment within the segments array, and once to pick the bucket within that Segment's table.
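A small sketch of that two-level indexing: segmentShift = 28 and segmentMask = 15 are the values produced by the default concurrency level of 16 (16 segments), and the per-segment table length of 2 is just an assumed starting size for illustration, not read from a live map:

public class SegmentIndexDemo {
    public static void main(String[] args) {
        int segmentShift = 28;          // 32 - log2(16) with the default 16 segments
        int segmentMask = 15;           // 16 - 1
        int segmentTableLength = 2;     // assumed small initial HashEntry[] length per segment

        int hash = 0x6B8C1234;          // an arbitrary, already-spread hash value

        int segmentIndex = (hash >>> segmentShift) & segmentMask;   // which Segment
        int bucketIndex  = (segmentTableLength - 1) & hash;         // which bucket inside that Segment

        System.out.println("segment " + segmentIndex + ", bucket " + bucketIndex);   // segment 6, bucket 0
    }
}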
Part 4: ConcurrentHashMap in JDK 1.8
The JDK 1.8 ConcurrentHashMap has the same structure as the JDK 1.8 HashMap; the difference is in how the operations are carried out.
public V put(K key, V value) {
    return putVal(key, value, false);
}

/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());                        // compute the hash
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {                         // spin
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)             // table == null || table.length == 0
            tab = initTable();                                // initialize the table
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {   // the element at index i is null
            if (casTabAt(tab, i, null,                        // install the new node at i with a CAS
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)                      // slot i is occupied and f.hash == MOVED (-1): a resize is in progress
            tab = helpTransfer(tab, f);                       // help transfer buckets to the new table
        else {                                                // an ordinary bin
            V oldVal = null;
            synchronized (f) {                                // the head is non-null, so lock it before touching the list or tree
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {                            // head hash >= 0: it is a linked list
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {   // same key
                                oldVal = e.val;
                                if (!onlyIfAbsent)            // and not putIfAbsent
                                    e.val = value;            // overwrite the old value with the new one
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {       // reached the tail without a match: append a new node
                                pred.next = new Node<K,V>(hash, key, value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {          // head is a tree bin
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
On put, a CAS is tried first: if the slot the element maps to is empty, the new node is installed with a single CAS and no lock. If the CAS fails, the loop spins; on the next pass the slot is no longer empty, so the head of that bucket's linked list or red-black tree is locked with synchronized and the element is inserted beneath it under the lock.
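The "CAS first, lock only on collision" idea can be sketched outside the JDK with an AtomicReferenceArray. This is a simplified illustration of the pattern, not ConcurrentHashMap's actual code; the Node class, field names, and put method here are made up for the example:

import java.util.concurrent.atomic.AtomicReferenceArray;

public class CasThenLockSketch {
    static final class Node {
        final String key; volatile String value; Node next;
        Node(String key, String value) { this.key = key; this.value = value; }
    }

    private final AtomicReferenceArray<Node> table = new AtomicReferenceArray<>(16);

    void put(String key, String value) {
        int i = (table.length() - 1) & key.hashCode();
        for (;;) {
            Node head = table.get(i);
            if (head == null) {
                // empty slot: try to install the node lock-free
                if (table.compareAndSet(i, null, new Node(key, value)))
                    return;                        // CAS won, no lock needed
                // CAS lost: another thread filled the slot, loop and retry
            } else {
                synchronized (head) {              // slot occupied: lock the bin head
                    if (table.get(i) == head) {    // make sure the head did not change under us
                        for (Node e = head;; e = e.next) {
                            if (e.key.equals(key)) { e.value = value; return; }
                            if (e.next == null) { e.next = new Node(key, value); return; }
                        }
                    }
                }
                // the head changed while we were acquiring the lock: loop and try again
            }
        }
    }
}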
Part 5: Hash collisions
On put, a hash and an index are computed, and two different keys may end up with the same values; when they map to the same slot of the array, a hash collision occurs.

Identical hash values always collide, but identical hashes are relatively rare; identical index values are far more likely, because many hash values are squeezed into a small table. So collisions mostly come from different hashes mapping to the same index. HashMap resolves them with separate chaining: colliding entries are linked together in a list (or, in JDK 1.8, a tree) at that slot; the value is only overwritten when the keys are equal, otherwise the new entry is added to the next position in the chain.
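A concrete example of a collision: the strings "Aa" and "BB" happen to have exactly the same hashCode, so they always land in the same bucket, yet a HashMap keeps both entries because the keys themselves are not equal (a small self-contained demo):

import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    public static void main(String[] args) {
        System.out.println("Aa".hashCode());     // 2112
        System.out.println("BB".hashCode());     // 2112 - same hash, hence the same bucket index

        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);                        // collides with "Aa", chained in the same bucket
        System.out.println(map.get("Aa"));       // 1 - not overwritten, because the keys differ
        System.out.println(map.get("BB"));       // 2
    }
}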