How Go's map Is Implemented

Reposted article, 2019-02-20 20:16, Golang

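Map structure

As a whole, the map is an array. Each element of the array can be thought of as a slot, a slot is a linked list, and each node of that list can hold 8 elements. Once this structure is clear, the corresponding insert, delete, update and lookup operations are not hard to work out.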

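1: Slot count and the hash algorithm

Put simply: there are 1<<N slots. A hash value hashCode is computed for each element and mapped to a slot with hashCode & (1<<N - 1), which covers exactly the slot index range 0 to 1<<N - 1. The parentheses are needed: in Go, & and << share the same precedence and both bind tighter than -.

Hash collisions are unavoidable. When you give an expected element count, the table is sized so that a slot holds at most 6.5 elements on average. As far as I can tell, many languages implement their hash tables with the same hashCode & (1<<N - 1) masking, probably because it is simple and spreads elements fairly evenly.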
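The masking step can be sketched as follows. This is a minimal illustration, not runtime code, and `slotIndex` is a hypothetical helper name. It shows that for a power-of-two slot count the mask gives the same result as a modulo.

```go
package main

import "fmt"

// slotIndex is a hypothetical helper: it maps a hash code to one of 1<<n
// slots by keeping only the low n bits of the hash.
func slotIndex(hash uint64, n uint) uint64 {
	return hash & (uint64(1)<<n - 1)
}

func main() {
	const n = 3 // 8 slots
	for _, h := range []uint64{5, 8, 13, 0xdeadbeef} {
		// For a power-of-two slot count the mask and the modulo agree.
		fmt.Println(h, slotIndex(h, n), h%(1<<n))
	}
}
```

Without the parentheses, `hash&1<<n-1` would parse as `((hash&1)<<n) - 1`, which is not a slot index.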

For how hashCode is computed, see src/runtime/hash64.go and src/runtime/hash32.go.

2: Capacity calculation and growth

m := make(map[string]int, hint)
func overLoadFactor(count int64, B uint8) bool {
   return count >= 8 && float32(count) >= 6.5*float32((uint64(1)<<B))
}
b := uint8(0)  // b is log2 of the slot count, so a uint8 is more than enough
for ; overLoadFactor(hint, b); b++ {
}

func makeBucketArray(t *maptype, b uint8) (buckets unsafe.Pointer, nextOverflow *bmap) {
   base := uintptr(1 << b)
   nbuckets := base
   if b >= 4 {
      nbuckets += 1 << (b - 4)    // allocate 1/16 extra; when a slot fills up, its overflow node is taken from here
      sz := t.bucket.size * nbuckets
      up := roundupsize(sz)
      if up != sz {
         nbuckets = up / t.bucket.size
      }
   }
   buckets = newarray(t.bucket, int(nbuckets))
   if base != nbuckets {
      nextOverflow = (*bmap)(add(buckets, base*uintptr(t.bucketsize)))
      last := (*bmap)(add(buckets, (nbuckets-1)*uintptr(t.bucketsize)))
      last.setoverflow(t, (*bmap)(buckets))
   }
   return buckets, nextOverflow
}

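Both the initial allocation and growth call the makeBucketArray function above.

For example, with m := make(map[int]string, 10), the loop above yields b=1, that is 2 = 1<<1 slots: b=0 is rejected because 10 >= 6.5*1, and b=1 is accepted because 10 < 6.5*2 = 13. Each slot starts with one node and each node holds 8 elements, so the table has room for 16 elements, enough for the expected 10.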
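The calculation of b can be reproduced outside the runtime. The sketch below copies overLoadFactor as quoted above and wraps the loop in `bucketShiftFor`, a hypothetical helper name introduced only for this illustration.

```go
package main

import "fmt"

// overLoadFactor mirrors the check quoted above: it reports whether count
// elements reach an average of 6.5 per slot across 1<<B slots.
func overLoadFactor(count int64, B uint8) bool {
	return count >= 8 && float32(count) >= 6.5*float32(uint64(1)<<B)
}

// bucketShiftFor is a hypothetical helper that repeats the sizing loop:
// it returns the smallest B for which hint is below the load factor.
func bucketShiftFor(hint int64) uint8 {
	b := uint8(0)
	for ; overLoadFactor(hint, b); b++ {
	}
	return b
}

func main() {
	for _, hint := range []int64{0, 7, 8, 10, 13, 100, 1000} {
		b := bucketShiftFor(hint)
		fmt.Printf("hint=%d B=%d slots=%d cells=%d\n",
			hint, b, uint64(1)<<b, uint64(8)<<b)
	}
}
```

Hints below 8 always give B=0, because a single node already holds 8 elements.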

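Growth doubles the slot count. As makeBucketArray shows, once b >= 4 an extra 1/16 of the new slot count is allocated on top of that. The extra nodes are used when a slot fills up: new overflow nodes are taken from this reserve, and when the reserve is used up as well, fresh memory is allocated directly.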
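The allocation size can be sketched as below. `bucketCount` is a hypothetical helper modeled on makeBucketArray, and it deliberately ignores the roundupsize adjustment, which can add a few more nodes depending on the allocator's size classes.

```go
package main

import "fmt"

// bucketCount is a hypothetical helper modeled on makeBucketArray.
// For a given b it returns the number of regular slots and the number of
// preallocated overflow nodes, ignoring the roundupsize adjustment.
func bucketCount(b uint8) (base, extra uintptr) {
	base = uintptr(1) << b
	if b >= 4 {
		extra = uintptr(1) << (b - 4) // 1/16 of the regular slots
	}
	return base, extra
}

func main() {
	for b := uint8(2); b <= 6; b++ {
		base, extra := bucketCount(b)
		fmt.Printf("b=%d slots=%d preallocated overflow nodes=%d\n", b, base, extra)
	}
}
```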

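The node size (t.bucketsize) is presumably computed ahead of time. Each node holds 8 elements. When a slot is full and another element hashes to it, a new node is allocated and linked behind the existing ones as a linked list.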

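How data is copied after growth

Go does not copy all the data over in one go, which would be very slow for a large map. Instead, when a write hits an old slot, all the data in that slot is rehashed into the new data structure. While growth is in progress, both the old and the new data structures serve lookups. Writes go only to the new data structure, and each write also rehashes the data of the old slot it hits into the new data structure.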
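The migration strategy described above can be sketched with a toy table. This is not the runtime's code: `toyMap` and its methods are hypothetical, each slot is a plain slice, the key doubles as its own hash, and the grow threshold (2 elements per slot) is chosen only to keep the example small.

```go
package main

import "fmt"

type entry struct {
	key uint64
	val int
}

// toyMap models only the incremental migration, nothing else.
type toyMap struct {
	buckets    [][]entry // current table
	oldbuckets [][]entry // previous table while a grow is in progress, else nil
	evacuated  []bool    // evacuated[i] is true once old slot i has been moved
	count      int
}

func newToyMap() *toyMap { return &toyMap{buckets: make([][]entry, 1)} }

func (m *toyMap) growing() bool { return m.oldbuckets != nil }

// evacuate rehashes every entry of old slot i into the new table.
func (m *toyMap) evacuate(i uint64) {
	if m.evacuated[i] {
		return
	}
	mask := uint64(len(m.buckets) - 1)
	for _, e := range m.oldbuckets[i] {
		m.buckets[e.key&mask] = append(m.buckets[e.key&mask], e)
	}
	m.oldbuckets[i], m.evacuated[i] = nil, true
	for _, done := range m.evacuated {
		if !done {
			return
		}
	}
	m.oldbuckets, m.evacuated = nil, nil // every old slot moved: grow finished
}

// Set writes only into the new table, after moving the old slot the key hits.
func (m *toyMap) Set(key uint64, val int) {
	if !m.growing() && m.count >= 2*len(m.buckets) { // toy threshold
		m.oldbuckets = m.buckets
		m.evacuated = make([]bool, len(m.oldbuckets))
		m.buckets = make([][]entry, 2*len(m.oldbuckets))
	}
	if m.growing() {
		m.evacuate(key & uint64(len(m.oldbuckets)-1))
	}
	b := key & uint64(len(m.buckets)-1)
	for i := range m.buckets[b] {
		if m.buckets[b][i].key == key {
			m.buckets[b][i].val = val
			return
		}
	}
	m.buckets[b] = append(m.buckets[b], entry{key, val})
	m.count++
}

// Get reads from the old slot while that slot has not been evacuated.
func (m *toyMap) Get(key uint64) (int, bool) {
	slot := m.buckets[key&uint64(len(m.buckets)-1)]
	if m.growing() {
		if i := key & uint64(len(m.oldbuckets)-1); !m.evacuated[i] {
			slot = m.oldbuckets[i]
		}
	}
	for _, e := range slot {
		if e.key == key {
			return e.val, true
		}
	}
	return 0, false
}

func main() {
	m := newToyMap()
	for k := uint64(0); k < 10; k++ {
		m.Set(k, int(k)*10)
	}
	v, ok := m.Get(2)
	fmt.Println(v, ok, "grow in progress:", m.growing())
}
```

In this sketch a grow only completes once writes have touched every old slot, which is why a second grow cannot start in the meantime.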

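Growth conditions

1: A map that is already growing cannot grow again. There are only two data structures, old and new, and never a third.

2: The element count reaches 6.5 * the number of slots (each node of a slot can hold 8 elements, but the distribution is never that even): growth is allowed.

3: The number of overflow nodes reaches a certain threshold: growth is also allowed.

In other words, when no growth is in progress, either too many elements or too many overflow nodes triggers growth.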
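The three conditions combine into a single predicate, sketched below. `shouldGrow` and `tooManyOverflow` are hypothetical names, and the overflow threshold used here (as many overflow nodes as regular slots) is an assumption for illustration, since the article does not quote the body of tooManyOverflowBuckets.

```go
package main

import "fmt"

// overLoadFactor is the load-factor check quoted earlier in the article.
func overLoadFactor(count int64, B uint8) bool {
	return count >= 8 && float32(count) >= 6.5*float32(uint64(1)<<B)
}

// tooManyOverflow stands in for tooManyOverflowBuckets. The threshold
// (about one overflow node per regular slot) is an assumption.
func tooManyOverflow(noverflow uint16, B uint8) bool {
	if B > 15 {
		B = 15 // keep the shift within uint16 range
	}
	return noverflow >= uint16(1)<<B
}

// shouldGrow combines the three conditions: no grow in progress, and either
// too many elements or too many overflow nodes.
func shouldGrow(growing bool, count int64, noverflow uint16, B uint8) bool {
	return !growing && (overLoadFactor(count, B) || tooManyOverflow(noverflow, B))
}

func main() {
	fmt.Println(shouldGrow(false, 13, 0, 1)) // load factor reached
	fmt.Println(shouldGrow(true, 13, 0, 1))  // already growing
	fmt.Println(shouldGrow(false, 5, 2, 1))  // too many overflow nodes
}
```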

A slot is a linked list made up of several nodes. The node data structure is as follows:

// A bucket for a Go map.
const bucketCnt = 8
type bmap struct {
   // tophash generally contains the top byte of the hash value
   // for each key in this bucket. If tophash[0] < minTopHash,
   // tophash[0] is a bucket evacuation state instead.
   tophash [bucketCnt]uint8
   // Followed by bucketCnt keys and then bucketCnt values.
   // NOTE: packing all the keys together and then all the values together makes the
   // code a bit more complicated than alternating key/value/key/value/... but it allows
   // us to eliminate padding which would be needed for, e.g., map[int64]int8.
   // Followed by an overflow pointer.
}

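Node layout:

1: An array of 8 mark bytes (tophash), one per element, recording the state of each cell, for example whether it holds an element. 8 bytes in total.

2: 8 consecutive keys. The total size of the 8 keys follows from the key type.

3: 8 consecutive values. The total size of the 8 values follows from the value type.

4: A pointer to the next node, 8 bytes on 64-bit platforms.

The struct definition only declares the array of 8 mark bytes and leaves the rest out. The size of a node can still be computed, so a block large enough for all four parts is allocated and then cast to *bmap. The address of each element is then found as base address plus offset.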

value = map[key]: source analysis

//go:linkname reflect_mapaccess reflect.mapaccess      map[key]
func reflect_mapaccess(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
   val, ok := mapaccess2(t, h, key)
   if !ok {
      val = nil
   }
   return val
}
func mapaccess2(t *maptype, h *hmap, key unsafe.Pointer) (unsafe.Pointer, bool) {
   alg := t.key.alg
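   // compute the hashCode
   hash := alg.hash(key, uintptr(h.hash0))
   m := uintptr(1)<<h.B - 1
   // map the hash to its slot; Go's rule is hash & (1<<h.B - 1)
   b := (*bmap)(unsafe.Pointer(uintptr(h.buckets) + (hash&m)*uintptr(t.bucketsize)))
   // if the matching slot in the old table still holds data, read from the old table
   if c := h.oldbuckets; c != nil {
      if !h.sameSizeGrow() {
         m >>= 1   // during a doubling grow the old table has half as many slots as the new one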
      }
      oldb := (*bmap)(unsafe.Pointer(uintptr(c) + (hash&m)*uintptr(t.bucketsize)))
      if !evacuated(oldb) {
         b = oldb
      }
   }
   top := uint8(hash >> (sys.PtrSize*8 - 8))
   if top < minTopHash {
      top += minTopHash
   }
   for {
      for i := uintptr(0); i < bucketCnt; i++ {
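         if b.tophash[i] != top {    // mark byte differs, so cell i cannot hold this key
            continue
         }
         // address of key i: node base address + offset
         k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
         if t.indirectkey {
            k = *((*unsafe.Pointer)(k)) // the cell stores a pointer to the key
         }
         // if key i equals the key being looked up, return value i
         if alg.equal(key, k) {
            // address of value i: node base address + offset
            v := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
            if t.indirectvalue {
               v = *((*unsafe.Pointer)(v)) // the cell stores a pointer to the value
            }
            return v, true
         }
      }
      // move b to the next node; keep scanning as long as there is one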
      b = b.overflow(t)
      if b == nil {
         return unsafe.Pointer(&zeroVal[0]), false
      }
   }
}
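The lookup above uses two different parts of the same hash: the low B bits pick the slot, and the top byte becomes the mark stored in tophash. The sketch below assumes a 64-bit hash; `splitHash` is a hypothetical helper, and the value of minTopHash is an assumption, since the article does not quote it.

```go
package main

import "fmt"

// minTopHash is an assumed value for illustration. Marks below it are
// reserved for cell states such as "empty" or "evacuated".
const minTopHash = 4

// splitHash is a hypothetical helper: it returns the slot index (low B bits)
// and the tophash mark (top 8 bits, shifted out of the reserved range).
func splitHash(hash uint64, B uint8) (slot uint64, top uint8) {
	slot = hash & (uint64(1)<<B - 1)
	top = uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash
	}
	return slot, top
}

func main() {
	slot, top := splitHash(0xAB00000000000005, 3)
	fmt.Printf("slot=%d top=%#x\n", slot, top)
}
```

Comparing the one-byte mark first lets the scan skip most cells without running the full key comparison.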

map[key] = value: source analysis

//go:linkname reflect_mapassign reflect.mapassign      map[key] = value
func reflect_mapassign(t *maptype, h *hmap, key unsafe.Pointer, val unsafe.Pointer) {
   p := mapassign(t, h, key)  // stores the key and returns the address of the value cell
   typedmemmove(t.elem, p, val) // the value is written here
}
// Like mapaccess, but allocates a slot for the key if it is not present in the map.
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
   alg := t.key.alg
   hash := alg.hash(key, uintptr(h.hash0))
   h.flags |= hashWriting
   // no table yet: allocate space for a single bucket
   if h.buckets == nil {
      h.buckets = newarray(t.bucket, 1)
   }
 
again:
   // hash to slot number bucket
   bucket := hash & (uintptr(1)<<h.B - 1)
   // if a grow is in progress, rehash all data in the matching old slot into the new table
   if h.growing() {
      growWork(t, h, bucket)
   }
   // compute the address of slot number bucket
   b := (*bmap)(unsafe.Pointer(uintptr(h.buckets) + bucket*uintptr(t.bucketsize)))
   top := uint8(hash >> (sys.PtrSize*8 - 8))
   if top < minTopHash {
      top += minTopHash
   }
 
   var inserti *uint8         // mark byte to fill in
   var insertk unsafe.Pointer // the key is written here
   var val unsafe.Pointer     // the value is written here
   for {
      for i := uintptr(0); i < bucketCnt; i++ {
         if b.tophash[i] != top {
            // cell i is empty: if this turns out to be an insert, the data goes here
            if b.tophash[i] == empty && inserti == nil {
               inserti = &b.tophash[i]
               insertk = add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
               val = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
            }
            continue
         }
         k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
         if t.indirectkey {
            k = *((*unsafe.Pointer)(k))
         }
         // equal keys mean this is an update
         if !alg.equal(key, k) {
            continue
         }
         // already have a mapping for key. Update it.
         if t.needkeyupdate {
            typedmemmove(t.key, k, key)
         }
         // update: find the existing value cell and return it
         val = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
         goto done
      }
      // move b to the next node; keep scanning as long as there is one
      ovf := b.overflow(t)
      if ovf == nil {
         break
      }
      b = ovf
   }
 
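   // a map that is already growing cannot grow again: there are only two tables, old and new, never a third
   // grow when the element count reaches 6.5 * the number of slots (a node holds 8 elements, but the distribution is never that even)
   // grow as well when there are too many overflow nodes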
   if !h.growing() && (overLoadFactor(int64(h.count), h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
      hashGrow(t, h)
      goto again // Growing the table invalidates everything, so try again
   }
 
   if inserti == nil {
      // every node in this slot is full: allocate a new node, which is linked after the current one
      newb := h.newoverflow(t, b)
      inserti = &newb.tophash[0]
      insertk = add(unsafe.Pointer(newb), dataOffset)
      val = add(insertk, bucketCnt*uintptr(t.keysize))
   }
 
   // key stored indirectly: allocate memory for the key and point the key cell at it
   if t.indirectkey {
      kmem := newobject(t.key)
      *(*unsafe.Pointer)(insertk) = kmem
      insertk = kmem
   }
   // value stored indirectly: allocate memory for the value and point the value cell at it
   if t.indirectvalue {
      vmem := newobject(t.elem)
      *(*unsafe.Pointer)(val) = vmem
   }
   // write the key into its cell, or into the memory just allocated for it
   typedmemmove(t.key, insertk, key)
   // update the mark byte
   *inserti = top
   // one more element
   h.count++
   h.count++
 
done:
   if h.flags&hashWriting == 0 {
      throw("concurrent map writes")
   }
   h.flags &^= hashWriting
   if t.indirectvalue {
      val = *((*unsafe.Pointer)(val))
   }
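   // val, the address of the value cell, is returned directly.
   // When the value type is a struct (not a pointer), map[key].fieldXXX = 1 is not allowed: a map element is not addressable, and reading map[key] yields a copy, so the stored value would not be updated. Pointer values do not have this problem, so large structs are best stored in the map as pointers.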
   return val
}
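The point about struct values can be seen from user code. The sketch below uses a hypothetical `point` type and two hypothetical helpers to show the two common ways of updating a field of a map element.

```go
package main

import "fmt"

type point struct{ x, y int }

// setXByValue updates a struct stored by value: read a copy, change it,
// write it back. Writing m[key].x = x directly does not compile, because
// m[key] is not addressable.
func setXByValue(m map[string]point, key string, x int) {
	p := m[key] // p is a copy of the stored struct
	p.x = x
	m[key] = p // store the modified copy
}

// setXByPointer updates a struct stored by pointer. The map cell holds the
// pointer, so the write goes to the struct itself. Assumes key is present.
func setXByPointer(m map[string]*point, key string, x int) {
	m[key].x = x
}

func main() {
	byValue := map[string]point{"a": {1, 2}}
	p := byValue["a"]
	p.x = 10 // changes only the local copy
	fmt.Println(byValue["a"].x, p.x)

	setXByValue(byValue, "a", 10)
	fmt.Println(byValue["a"].x)

	byPointer := map[string]*point{"a": {1, 2}}
	setXByPointer(byPointer, "a", 10)
	fmt.Println(byPointer["a"].x)
}
```

Storing pointers avoids copying the struct on every read and write, at the cost of one extra allocation per element.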
