Go Garbage Collection (GC)


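  Go's GC is non-generational and non-compacting, and it relies on write barriers and concurrent mark and sweep. Its core goals are to restrain heap growth and make full use of CPU resources.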

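1. Tri-color marking

  Tri-color marking assigns a color to every object in the system concurrently (the collector runs alongside user code) and then reclaims objects according to their color. The basic procedure:

  • Initially, mark every object on the heap white;
  • Traverse from the roots, mark each white object reached grey, and put it on the work queue;
  • Take a grey object from the queue, mark every white object it references grey, and mark the grey object itself black;
  • Repeat the previous step until no grey objects remain.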

Once the last step finishes, every object still marked white is unreachable. These white objects are garbage and are reclaimed.
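The marking procedure can be sketched on a toy object graph. This is a minimal, single-threaded illustration only: the `object` type, the `mark` function, and the `demo` helper are hypothetical names made up for this sketch and are not part of the Go runtime, which marks concurrently and stores colors in bitmaps rather than in the objects.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not reached yet; garbage if still white at the end
	grey               // reached, but its references are not scanned yet
	black              // reached and fully scanned
)

// object is a hypothetical heap object used only in this sketch.
type object struct {
	name  string
	refs  []*object
	color color
}

// mark runs tri-color marking from roots and returns the names of the
// objects that are still white afterwards (the unreachable ones).
func mark(heap, roots []*object) []string {
	for _, o := range heap {
		o.color = white
	}
	var queue []*object
	for _, r := range roots {
		if r.color == white {
			r.color = grey
			queue = append(queue, r)
		}
	}
	for len(queue) > 0 {
		o := queue[0]
		queue = queue[1:]
		for _, ref := range o.refs {
			if ref.color == white {
				ref.color = grey
				queue = append(queue, ref)
			}
		}
		o.color = black
	}
	var garbage []string
	for _, o := range heap {
		if o.color == white {
			garbage = append(garbage, o.name)
		}
	}
	return garbage
}

// demo builds a small graph: A and B reference each other and A is a root,
// while C references D and neither is reachable from the root.
func demo() []string {
	a := &object{name: "A"}
	b := &object{name: "B"}
	c := &object{name: "C"}
	d := &object{name: "D"}
	a.refs = []*object{b}
	b.refs = []*object{a}
	c.refs = []*object{d}
	return mark([]*object{a, b, c, d}, []*object{a})
}

func main() {
	fmt.Println(demo()) // [C D]
}
```

The cycle between A and B does not confuse the traversal, because an object is only queued while it is still white.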

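Write barrier

  Tri-color marking runs without STW (stop the world), so objects can still be modified while marking is in progress. Consider this case: while scanning grey objects, the collector scans object A and marks everything A references. Just as it starts scanning object D's references, another goroutine moves the reference D->E so that it becomes A->E. A has already been scanned and D no longer points to E, so E is never reached, stays white, and would be mistaken for garbage. The write barrier exists to solve this problem. With the write barrier in place, E is treated as live once A->E is written. Even if A later drops E, E is only reclaimed in the next GC cycle, not in the current one.

  The write barrier watches writes to object memory and re-colors the affected objects or puts them back on the queue.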

  Go 1.8 introduced the hybrid write barrier. Its pseudocode is:

writePointer(slot, ptr):
    shade(*slot)
    if any stack is grey:
        shade(ptr)
    *slot = ptr

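  The hybrid write barrier shades both the "old pointer" stored in the slot being written and the "new pointer" being written into it.

  The old pointer is shaded because another running thread may at the same moment copy that pointer value into a register or a local variable on its stack. Copies into registers or stack locals do not go through the write barrier, so the pointer could otherwise go unmarked. The new pointer is shaded because another running thread may move the pointer to a different location.

  With the hybrid write barrier, the GC does not need to rescan each G's stack after concurrent marking finishes, which shortens the STW time in Mark Termination.

In addition to the write barrier, every object allocated while the GC is running is marked black immediately.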
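The A/D/E scenario described above can be replayed on a toy heap to see what the barrier changes. This is a rough sketch under stated assumptions: `object`, `collector`, `scanOne`, and `survives` are hypothetical names, everything runs on one thread, and the `stackGrey` flag simply stands in for the stack-color check in the pseudocode. The real runtime buffers barrier work and keeps colors in bitmaps.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	grey
	black
)

// object is a hypothetical heap object with a single pointer field.
type object struct {
	name  string
	field *object
	color color
}

// collector is a toy marker. barrier switches the write barrier on or off;
// stackGrey stands in for the stack-color condition in the pseudocode.
type collector struct {
	queue     []*object
	barrier   bool
	stackGrey bool
}

// shade greys a white object and queues it for scanning.
func (c *collector) shade(o *object) {
	if o != nil && o.color == white {
		o.color = grey
		c.queue = append(c.queue, o)
	}
}

// writePointer follows the pseudocode: shade the old value, conditionally
// shade the new value, then perform the store.
func (c *collector) writePointer(slot **object, ptr *object) {
	if c.barrier {
		c.shade(*slot)
		if c.stackGrey {
			c.shade(ptr)
		}
	}
	*slot = ptr
}

// scanOne scans the next grey object and turns it black.
func (c *collector) scanOne() {
	o := c.queue[0]
	c.queue = c.queue[1:]
	c.shade(o.field)
	o.color = black
}

// survives replays the scenario and reports whether E is still considered
// live when marking ends.
func survives(barrier bool) bool {
	e := &object{name: "E"}
	d := &object{name: "D", field: e}
	a := &object{name: "A"}
	c := &collector{barrier: barrier, stackGrey: true}

	c.shade(a) // A and D are roots
	c.shade(d)
	c.scanOne() // A is scanned first and becomes black

	// The mutator now moves the reference from D to A.
	c.writePointer(&a.field, e)
	c.writePointer(&d.field, nil)

	for len(c.queue) > 0 {
		c.scanOne()
	}
	return e.color != white
}

func main() {
	fmt.Println(survives(false), survives(true)) // false true
}
```

Without the barrier, E stays white even though the black object A points to it. With the barrier, E is shaded at the moment of the write.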

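Controller

The controller takes part in the whole concurrent collection: it records the relevant state, adjusts the running strategy dynamically, influences the working mode and number of concurrent mark workers, and balances CPU usage. When a cycle ends, it helps set the next_gc threshold, which adjusts how often garbage collection is triggered.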

   // mgc.go

// gcController implements the GC pacing controller that determines
// when to trigger concurrent garbage collection and how much marking
// work to do in mutator assists and background marking.
//
// It uses a feedback control algorithm to adjust the memstats.gc_trigger
// trigger based on the heap growth and GC CPU utilization each cycle.
// This algorithm optimizes for heap growth to match GOGC and for CPU
// utilization between assist and background marking to be 25% of
// GOMAXPROCS. The high-level design of this algorithm is documented
// at https://golang.org/s/go15gcpacing.
//
// All fields of gcController are used only during a single mark
// cycle.

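Mutator assist

   When objects are allocated much faster than background marking can keep up, the heap expands out of control and the collection may never finish. In that situation the user goroutines need to help with marking. When a goroutine allocates heap memory, a pacing policy makes it perform a bounded amount of collection work, which balances allocation against collection and keeps the process in a healthy state.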
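The balance between allocation and assist work can be pictured as a per-goroutine credit account. The sketch below is only an illustration: the `goroutine` type, `allocate`, and the fixed `workUnitBytes` exchange rate are assumptions made for this example. In the runtime, the rate between allocated bytes and required scan work is computed by the controller and changes during a cycle.

```go
package main

import "fmt"

// workUnitBytes is an assumed fixed exchange rate for this sketch: one unit
// of mark work pays for this many allocated bytes.
const workUnitBytes = 64

// goroutine is a hypothetical stand-in for the per-G assist accounting.
type goroutine struct {
	assistBytes int64 // credit in bytes; a negative value is debt
}

// allocate charges size bytes to the goroutine and returns how many units of
// mark work it had to perform to get out of debt before allocating.
func (g *goroutine) allocate(size int64) int64 {
	g.assistBytes -= size
	if g.assistBytes >= 0 {
		return 0 // enough credit, no assist needed
	}
	debt := -g.assistBytes
	units := (debt + workUnitBytes - 1) / workUnitBytes // round up
	g.assistBytes += units * workUnitBytes
	return units
}

func main() {
	g := &goroutine{assistBytes: 100}
	for _, size := range []int64{64, 100, 1} {
		units := g.allocate(size)
		fmt.Printf("alloc %d bytes: %d unit(s) of assist work, credit now %d\n",
			size, units, g.assistBytes)
	}
}
```

A goroutine that allocates little rarely goes into debt, while a fast allocator keeps being pulled into marking, which is the back-pressure the paragraph above describes.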

2. Initialization

  During initialization, the key steps are setting gcpercent and next_gc.

 // mgc.go

// Initialized from $GOGC. GOGC=off means no GC.
var gcpercent int32

func gcinit() {
    if unsafe.Sizeof(workbuf{}) != _WorkbufSize {
        throw("size of Workbuf is suboptimal")
    }

    // No sweep on the first cycle.
    mheap_.sweepdone = 1

    // Set a reasonable initial GC trigger.
    memstats.triggerRatio = 7 / 8.0

    // Fake a heap_marked value so it looks like a trigger at
    // heapminimum is the appropriate growth from heap_marked.
    // This will go into computing the initial GC goal.
    memstats.heap_marked = uint64(float64(heapminimum) / (1 + memstats.triggerRatio))

    // Set gcpercent from the environment. This will also compute
    // and set the GC trigger and goal.
    // Apply the GOGC setting.
    _ = setGCPercent(readgogc())

    work.startSema = 1
    work.markDoneSema = 1
}

func readgogc() int32 {
    p := gogetenv("GOGC")
    if p == "off" {
        return -1
    }
    if n, ok := atoi32(p); ok {
        return n
    }
    return 100
}

// gcenable is called after the bulk of the runtime initialization,
// just before we're about to start letting user code run.
// It kicks off the background sweeper goroutine and enables GC.
func gcenable() {
    c := make(chan int, 1)
    go bgsweep(c)
    <-c
    memstats.enablegc = true // now that runtime is initialized, GC is okay
}

//go:linkname setGCPercent runtime/debug.setGCPercent
func setGCPercent(in int32) (out int32) {
    lock(&mheap_.lock)
    out = gcpercent
    if in < 0 {
        in = -1
    }
    gcpercent = in
    heapminimum = defaultHeapMinimum * uint64(gcpercent) / 100
    // Update pacing in response to gcpercent change.
    gcSetTriggerRatio(memstats.triggerRatio)
    unlock(&mheap_.lock)

    // If we just disabled GC, wait for any concurrent GC mark to
    // finish so we always return with no GC running.
    if in < 0 {
        gcWaitOnMark(atomic.Load(&work.cycles))
    }

    return out
}
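To make the arithmetic in gcinit and setGCPercent concrete, the sketch below recomputes the initial values outside the runtime. It is a simplification under stated assumptions: `initialPacing` is a hypothetical helper, it only handles a non-negative GOGC value, and it ignores the clamping that gcSetTriggerRatio applies. The 4 MB value for `defaultHeapMinimum` matches the runtime of this era.

```go
package main

import "fmt"

// defaultHeapMinimum mirrors the runtime constant (4 MB) used with GOGC=100.
const defaultHeapMinimum = 4 << 20

// initialPacing recomputes the faked heap_marked value, the resulting GC
// trigger, and the heap goal for a given non-negative GOGC percentage.
// This is a simplified sketch, not the runtime's actual code path.
func initialPacing(gcPercent int32, triggerRatio float64) (marked, trigger, goal uint64) {
	heapMinimum := uint64(defaultHeapMinimum) * uint64(gcPercent) / 100
	// Fake heap_marked so that the trigger lands at heapMinimum.
	marked = uint64(float64(heapMinimum) / (1 + triggerRatio))
	// The trigger is heap_marked grown by the trigger ratio.
	trigger = uint64(float64(marked) * (1 + triggerRatio))
	// The goal is heap_marked grown by GOGC percent.
	goal = marked + marked*uint64(gcPercent)/100
	return marked, trigger, goal
}

func main() {
	for _, gogc := range []int32{50, 100, 200} {
		marked, trigger, goal := initialPacing(gogc, 7/8.0)
		fmt.Printf("GOGC=%d marked=%d trigger=%d goal=%d\n", gogc, marked, trigger, goal)
	}
}
```

With GOGC=100 the goal is twice the marked heap, and the trigger sits at roughly 4 MB, so the first collection starts before the heap reaches its goal.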

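3. Startup

  After allocating heap memory for an object, mallocgc checks the GC trigger condition and, depending on the current state, starts a collection or takes part in assisting it.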

// malloc.go

// Allocate an object of size bytes.
// Small objects are allocated from the per-P cache's free lists.
// Large objects (> 32 kB) are allocated straight from the heap.
func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
    if gcphase == _GCmarktermination {
        throw("mallocgc called with gcphase == _GCmarktermination")
    }

    // ...

    // assistG is the G to charge for this allocation, or nil if
    // GC is not currently active.
    var assistG *g
    if gcBlackenEnabled != 0 {
        // Charge the current user G for this allocation.
        assistG = getg()
        if assistG.m.curg != nil {
            assistG = assistG.m.curg
        }
        // Charge the allocation against the G. We'll account
        // for internal fragmentation at the end of mallocgc.
        assistG.gcAssistBytes -= int64(size)

        if assistG.gcAssistBytes < 0 {
            // Take part in the collection as an assist.
            // This G is in debt. Assist the GC to correct
            // this before allocating. This must happen
            // before disabling preemption.
            gcAssistAlloc(assistG)
        }
    }

    // Set mp.mallocing to keep from being preempted by GC.
    mp := acquirem()
    if mp.mallocing != 0 {
        throw("malloc deadlock")
    }
    if mp.gsignal == getg() {
        throw("malloc during signal")
    }
    mp.mallocing = 1

    shouldhelpgc := false
    dataSize := size
    c := gomcache()
    var x unsafe.Pointer
    noscan := typ == nil || typ.kind&kindNoPointers != 0

    // Choose the allocation path based on the object size.
    // ...

    // Allocate black during GC.
    // All slots hold nil so no scanning is needed.
    // This may be racing with GC so do it atomically if there can be
    // a race marking the bit.
    if gcphase != _GCoff {
        // Allocate the object directly as black.
        gcmarknewobject(uintptr(x), size, scanSize)
    }

    if assistG != nil {
        // Account for internal fragmentation in the assist
        // debt now that we know it.
        assistG.gcAssistBytes -= int64(size - dataSize)
    }
    // Check the GC trigger condition.
    if shouldhelpgc {
        // Start a concurrent garbage collection.
        if t := (gcTrigger{kind: gcTriggerHeap}); t.test() {
            gcStart(t)
        }
    }

    return x
}

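  Garbage collection runs fully concurrently by default, but concurrent marking and concurrent sweeping can be disabled with an environment variable or debug parameter. The GC goroutine loops forever and is woken whenever the trigger condition is met.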
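From user code, collections can also be started by hand, which is a convenient way to observe cycles without relying on the heap trigger. The sketch below uses only public APIs (`runtime.GC`, `runtime.ReadMemStats`, `debug.SetGCPercent`); the `forceCycles` helper is a hypothetical name introduced for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// forceCycles turns off the heap-triggered collector, forces n collections
// with runtime.GC, and returns how many GC cycles completed in the meantime.
func forceCycles(n int) uint32 {
	old := debug.SetGCPercent(-1) // same effect as GOGC=off
	defer debug.SetGCPercent(old) // restore the previous setting

	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		runtime.GC() // blocks until the collection is complete
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	fmt.Println("completed cycles:", forceCycles(3))
}
```

Running a program with `GODEBUG=gctrace=1` prints one summary line per cycle, and `GODEBUG=gcstoptheworld=1` disables concurrent marking, which corresponds to the debug.gcstoptheworld check in gcStart.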

gcStart

  // mgc.go

func gcStart(mode gcMode, trigger gcTrigger) {
    // Since this is called from malloc and malloc is called in
    // the guts of a number of libraries that might be holding
    // locks, don't attempt to start GC in non-preemptible or
    // potentially unstable situations.
    // Check whether the current g can be preempted; if not, do not trigger GC.
    mp := acquirem()
    if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {
        releasem(mp)
        return
    }
    releasem(mp)
    mp = nil

    // Pick up the remaining unswept/not being swept spans concurrently
    //
    // This shouldn't happen if we're being invoked in background
    // mode since proportional sweep should have just finished
    // sweeping everything, but rounding errors, etc, may leave a
    // few spans unswept. In forced mode, this is necessary since
    // GC can be forced at any point in the sweeping cycle.
    //
    // We check the transition condition continuously here in case
    // this G gets delayed in to the next GC cycle.
    // Sweep whatever garbage was left unswept.
    for trigger.test() && gosweepone() != ^uintptr(0) {
        sweep.nbgsweep++
    }

    // Perform GC initialization and the sweep termination
    // transition.
    semacquire(&work.startSema)
    // Re-check transition condition under transition lock.
    // Check whether the gcTrigger condition still holds.
    if !trigger.test() {
        semrelease(&work.startSema)
        return
    }

    // For stats, check if this GC was forced by the user
    // Record whether this GC was forced; user code can force one by calling runtime.GC().
    work.userForced = trigger.kind == gcTriggerAlways || trigger.kind == gcTriggerCycle

    // In gcstoptheworld debug mode, upgrade the mode accordingly.
    // We do this after re-checking the transition condition so
    // that multiple goroutines that detect the heap trigger don't
    // start multiple STW GCs.
    // Set the GC mode.
    if mode == gcBackgroundMode {
        if debug.gcstoptheworld == 1 {
            mode = gcForceMode
        } else if debug.gcstoptheworld == 2 {
            mode = gcForceBlockMode
        }
    }

    // Ok, we're doing it! Stop everybody else
    semacquire(&worldsema)

    if trace.enabled {
        traceGCStart()
    }

    // Check that all Ps have finished deferred mcache flushes.
    for _, p := range allp {
        if fg := atomic.Load(&p.mcache.flushGen); fg != mheap_.sweepgen {
            println("runtime: p", p.id, "flushGen", fg, "!= sweepgen", mheap_.sweepgen)
            throw("p mcache not flushed")
        }
    }

    // Start the background mark workers.
    gcBgMarkStartWorkers()
    // Reset the mark-related GC state.
    gcResetMarkState()

    work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs
    if work.stwprocs > ncpu {
        // This is used to compute CPU time of the STW phases,
        // so it can't be more than ncpu, even if GOMAXPROCS is.
        work.stwprocs = ncpu
    }
    work.heap0 = atomic.Load64(&memstats.heap_live)
    work.pauseNS = 0
    work.mode = mode

    now := nanotime()
    work.tSweepTerm = now
    work.pauseStart = now
    if trace.enabled {
        traceGCSTWStart(1)
    }
    // STW: stop the world.
    systemstack(stopTheWorldWithSema)
    // Finish sweep before we start concurrent scan.
    // Sweep the previous cycle's garbage first so that the previous GC is complete.
    systemstack(func() {
        finishsweep_m()
    })
    // clearpools before we start the GC. If we wait they memory will not be
    // reclaimed until the next GC cycle.
    // Clear sync.Pool, sched.sudogcache, and sched.deferpool. Not expanded here:
    // sync.Pool was covered earlier and the rest will come up in later articles.
    clearpools()

    // Increment the GC cycle count.
    work.cycles++
    if mode == gcBackgroundMode { // Do as much work concurrently as possible
        gcController.startCycle()
        work.heapGoal = memstats.next_gc

        // Enter concurrent mark phase and enable
        // write barriers.
        //
        // Because the world is stopped, all Ps will
        // observe that write barriers are enabled by
        // the time we start the world and begin
        // scanning.
        //
        // Write barriers must be enabled before assists are
        // enabled because they must be enabled before
        // any non-leaf heap objects are marked. Since
        // allocations are blocked until assists can
        // happen, we want enable assists as early as
        // possible.
        // Set the GC phase to _GCmark (this enables the write barrier).
        setGCPhase(_GCmark)

        // Update the background mark state.
        gcBgMarkPrepare() // Must happen before assist enable.
        // Compute and queue the root scan jobs and initialize the related scan state.
        gcMarkRootPrepare()

        // Mark all active tinyalloc blocks. Since we're
        // allocating from these, they need to be black like
        // other allocations. The alternative is to blacken
        // the tiny block on every allocation from it, which
        // would slow down the tiny allocator.
        // Mark the tiny objects.
        gcMarkTinyAllocs()

        // At this point all Ps have enabled the write
        // barrier, thus maintaining the no white to
        // black invariant. Enable mutator assists to
        // put back-pressure on fast allocating
        // mutators.
        // Set gcBlackenEnabled to 1 so that assists and mark workers may blacken objects.
        atomic.Store(&gcBlackenEnabled, 1)

        // Assists and workers can start the moment we start
        // the world.
        gcController.markStartTime = now

        // Concurrent mark.
        systemstack(func() {
            now = startTheWorldWithSema(trace.enabled)
        })
        work.pauseNS += now - work.pauseStart
        work.tMark = now
    } else {
        // Non-concurrent mode.
        // Record the start time of the mark termination phase.
        if trace.enabled {
            // Switch to mark termination STW.
            traceGCSTWDone()
            traceGCSTWStart(0)
        }
        t := nanotime()
        work.tMark, work.tMarkTerm = t, t
        work.heapGoal = work.heap0

        // Perform mark termination. This will restart the world.
        // Under STW: mark, sweep, and then start the world.
        gcMarkTermination(memstats.triggerRatio)
    }

    semrelease(&work.startSema)
}

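4. Concurrent marking

  • Scan: traverse the relevant memory regions, use the pointer bitmap to find reachable objects, mark them grey, and add them to the queue;
  • Mark: take a grey object off the queue, mark the objects it references grey, and mark the object itself black.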

gcBgMarkStartWorkers

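  This function prepares the mark worker goroutines that perform background marking. They do not start working immediately: before the collection starts, each one is bound to a P and then goes to sleep. Once the GC phase is set to _GCmark, the scheduler wakes them and they begin working.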

func gcBgMarkStartWorkers() {
    // Background marking is performed by per-P G's. Ensure that
    // each P has a background GC G.
    for _, p := range allp {
        if p.gcBgMarkWorker == 0 {
            go gcBgMarkWorker(p)
            // Wait for the bgMarkReady signal from the gcBgMarkWorker goroutine before continuing.
            notetsleepg(&work.bgMarkReady, -1)
            noteclear(&work.bgMarkReady)
        }
    }
}

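  Mark workers run in one of three modes:

  • gcMarkWorkerDedicatedMode: runs flat out until the concurrent mark phase ends;
  • gcMarkWorkerFractionalMode: takes part in marking but can be preempted and rescheduled;
  • gcMarkWorkerIdleMode: takes part in marking only when the P is otherwise idle.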

gcBgMarkWorker

  This is the function behind the background mark task. Mark workers in different modes treat the work very differently.

func gcBgMarkWorker(_p_ *p) {
    gp := getg()
    // Used to reacquire the p and m after the worker wakes up.
    type parkInfo struct {
        m      muintptr // Release this m on park.
        attach puintptr // If non-nil, attach to this p on park.
    }
    // We pass park to a gopark unlock function, so it can't be on
    // the stack (see gopark). Prevent deadlock from recursively
    // starting GC by disabling preemption.
    gp.m.preemptoff = "GC worker init"
    park := new(parkInfo)
    gp.m.preemptoff = ""
    // Record the m and p in park so they can be passed to gopark and
    // recovered when gcController.findRunnable wakes this worker.
    park.m.set(acquirem())
    park.attach.set(_p_)
    // Inform gcBgMarkStartWorkers that this worker is ready.
    // After this point, the background mark worker is scheduled
    // cooperatively by gcController.findRunnable. Hence, it must
    // never be preempted, as this would put it into _Grunnable
    // and put it on a run queue. Instead, when the preempt flag
    // is set, this puts itself into _Gwaiting to be woken up by
    // gcController.findRunnable at the appropriate time.
    // Let the notetsleepg in gcBgMarkStartWorkers stop waiting and continue.
    notewakeup(&work.bgMarkReady)

    for {
        // Go to sleep until woken by gcController.findRunnable.
        // We can't releasem yet since even the call to gopark
        // may be preempted.
        // Put the g to sleep.
        gopark(func(g *g, parkp unsafe.Pointer) bool {
            park := (*parkInfo)(parkp)

            // The worker G is no longer running, so it's
            // now safe to allow preemption.
            // Release the m acquired earlier.
            releasem(park.m.ptr())

            // If the worker isn't attached to its P,
            // attach now. During initialization and after
            // a phase change, the worker may have been
            // running on a different P. As soon as we
            // attach, the owner P may schedule the
            // worker, so this must be done after the G is
            // stopped.
            // Attach to the p that was recorded above.
            if park.attach != 0 {
                p := park.attach.ptr()
                park.attach.set(nil)
                // cas the worker because we may be
                // racing with a new worker starting
                // on this P.
                if !p.gcBgMarkWorker.cas(0, guintptr(unsafe.Pointer(g))) {
                    // The P got a new worker.
                    // Exit this worker.
                    return false
                }
            }
            return true
        }, unsafe.Pointer(park), waitReasonGCWorkerIdle, traceEvGoBlock, 0)

        // Loop until the P dies and disassociates this
        // worker (the P may later be reused, in which case
        // it will get a new worker) or we failed to associate.
        // If the P's gcBgMarkWorker is no longer this G, end this worker.
        if _p_.gcBgMarkWorker.ptr() != gp {
            break
        }

        // Disable preemption so we can use the gcw. If the
        // scheduler wants to preempt us, we'll stop draining,
        // dispose the gcw, and then preempt.
        // The gopark callback released the m; acquire it again here.
        park.m.set(acquirem())

        if gcBlackenEnabled == 0 {
            throw("gcBgMarkWorker: blackening not enabled")
        }

        startTime := nanotime()
        // Record when this mark worker started.
        _p_.gcMarkWorkerStartTime = startTime

        decnwait := atomic.Xadd(&work.nwait, -1)
        if decnwait == work.nproc {
            println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc)
            throw("work.nwait was > work.nproc")
        }
        // Switch to g0 to do the work.
        systemstack(func() {
            // Mark our goroutine preemptible so its stack
            // can be scanned. This lets two mark workers
            // scan each other (otherwise, they would
            // deadlock). We must not modify anything on
            // the G stack. However, stack shrinking is
            // disabled for mark workers, so it is safe to
            // read from the G stack.
            // Put the G in _Gwaiting so another g can scan its stack
            // (two workers can scan each other's stacks).
            casgstatus(gp, _Grunning, _Gwaiting)
            switch _p_.gcMarkWorkerMode {
            default:
                throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
            case gcMarkWorkerDedicatedMode:
                // Dedicated mode: concentrate on marking.
                gcDrain(&_p_.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)
                if gp.preempt {
                    // Preempted: move every G in the local run queue to the global run queue.
                    // We were preempted. This is
                    // a useful signal to kick
                    // everything out of the run
                    // queue so it can run
                    // somewhere else.
                    lock(&sched.lock)
                    for {
                        gp, _ := runqget(_p_)
                        if gp == nil {
                            break
                        }
                        globrunqput(gp)
                    }
                    unlock(&sched.lock)
                }
                // Go back to draining, this time
                // without preemption.
                // Continue marking.
                gcDrain(&_p_.gcw, gcDrainNoBlock|gcDrainFlushBgCredit)
            case gcMarkWorkerFractionalMode:
                // Mark until preempted.
                gcDrain(&_p_.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)
            case gcMarkWorkerIdleMode:
                // Mark only while idle.
                gcDrain(&_p_.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)
            }
            // Move the G from _Gwaiting back to _Grunning.
            casgstatus(gp, _Gwaiting, _Grunning)
        })

        // If we are nearing the end of mark, dispose
        // of the cache promptly. We must do this
        // before signaling that we're no longer
        // working so that other workers can't observe
        // no workers and no work while we have this
        // cached, and before we compute done.
        // Flush the local cache promptly and hand it to the global queue.
        if gcBlackenPromptly {
            _p_.gcw.dispose()
        }

        // Account for time.
        // Accumulate the elapsed time.
        duration := nanotime() - startTime
        switch _p_.gcMarkWorkerMode {
        case gcMarkWorkerDedicatedMode:
            atomic.Xaddint64(&gcController.dedicatedMarkTime, duration)
            atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 1)
        case gcMarkWorkerFractionalMode:
            atomic.Xaddint64(&gcController.fractionalMarkTime, duration)
            atomic.Xaddint64(&_p_.gcFractionalMarkTime, duration)
        case gcMarkWorkerIdleMode:
            atomic.Xaddint64(&gcController.idleMarkTime, duration)
        }

        // Was this the last worker and did we run out
        // of work?
        incnwait := atomic.Xadd(&work.nwait, +1)
        if incnwait > work.nproc {
            println("runtime: p.gcMarkWorkerMode=", _p_.gcMarkWorkerMode,
                "work.nwait=", incnwait, "work.nproc=", work.nproc)
            throw("work.nwait > work.nproc")
        }

        // If this worker reached a background mark completion
        // point, signal the main GC goroutine.
        if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
            // Make this G preemptible and disassociate it
            // as the worker for this P so
            // findRunnableGCWorker doesn't try to
            // schedule it.
            // Detach from the p and release the m.
            _p_.gcBgMarkWorker.set(nil)
            releasem(park.m.ptr())

            gcMarkDone()

            // Disable preemption and prepare to reattach
            // to the P.
            //
            // We may be running on a different P at this
            // point, so we can't reattach until this G is
            // parked.
            park.m.set(acquirem())
            park.attach.set(_p_)
        }
    }
}

gcDrain

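  This is the main implementation of tri-color marking.

  gcDrain scans the roots and objects and blackens grey objects until all roots and objects have been marked.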

func gcDrain(gcw *gcWork, flags gcDrainFlags) {
    if !writeBarrier.needed {
        throw("gcDrain phase incorrect")
    }

    gp := getg().m.curg
    // Whether to return when the preempt flag is seen.
    preemptible := flags&gcDrainUntilPreempt != 0
    // Whether to wait for work when none is available.
    blocking := flags&(gcDrainUntilPreempt|gcDrainIdle|gcDrainFractional|gcDrainNoBlock) == 0
    // Whether to flush background scan credit, which reduces assist work and wakes waiting Gs.
    flushBgCredit := flags&gcDrainFlushBgCredit != 0
    // Whether marking is performed only while idle.
    idle := flags&gcDrainIdle != 0
    // Record the scan work already performed at entry.
    initScanWork := gcw.scanWork

    // checkWork is the scan work before performing the next
    // self-preempt check.
    // Choose the check function for the current mode.
    checkWork := int64(1<<63 - 1)
    var check func() bool
    if flags&(gcDrainIdle|gcDrainFractional) != 0 {
        checkWork = initScanWork + drainCheckThreshold
        if idle {
            check = pollWork
        } else if flags&gcDrainFractional != 0 {
            check = pollFractionalWorkerExit
        }
    }

    // Drain root marking jobs.
    // Scan the roots if they have not all been scanned yet.
    if work.markrootNext < work.markrootJobs {
        for !(preemptible && gp.preempt) {
            job := atomic.Xadd(&work.markrootNext, +1) - 1
            if job >= work.markrootJobs {
                break
            }
            // Run the root scan job.
            markroot(gcw, job)
            if check != nil && check() {
                goto done
            }
        }
    }

    // Drain heap marking jobs.
    // Loop until preempted.
    for !(preemptible && gp.preempt) {
        // Try to keep work available on the global queue. We used to
        // check if there were waiting workers, but it's better to
        // just keep work available than to make workers wait. In the
        // worst case, we'll do O(log(_WorkbufSize)) unnecessary
        // balances.
        if work.full == 0 {
            // Balance: if the global mark queue is empty, move part of the local work to it.
            gcw.balance()
        }

        var b uintptr
        if blocking {
            b = gcw.get()
        } else {
            b = gcw.tryGetFast()
            if b == 0 {
                b = gcw.tryGet()
            }
        }
        // No work could be obtained; leave the loop.
        if b == 0 {
            // work barrier reached or tryGet failed.
            break
        }
        // Scan the object that was obtained.
        scanobject(b, gcw)

        // Flush background scan work credit to the global
        // account if we've accumulated enough locally so
        // mutator assists can draw on it.
        // Once the local scan work exceeds gcCreditSlack, add it to the global total in one batch.
        if gcw.scanWork >= gcCreditSlack {
            atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)
            if flushBgCredit {
                gcFlushBgCredit(gcw.scanWork - initScanWork)
                initScanWork = 0
            }
            checkWork -= gcw.scanWork
            gcw.scanWork = 0
            // Once the scan work reaches the checkWork threshold for the next self-preempt
            // check, call the check function chosen above: pollWork in idle mode,
            // pollFractionalWorkerExit in fractional mode.
            if checkWork <= 0 {
                checkWork += drainCheckThreshold
                if check != nil && check() {
                    break
                }
            }
        }
    }

    // In blocking mode, write barriers are not allowed after this
    // point because we must preserve the condition that the work
    // buffers are empty.

done:
    // Flush remaining scan work credit.
    if gcw.scanWork > 0 {
        // Add the scan work to the global total.
        atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)
        if flushBgCredit {
            gcFlushBgCredit(gcw.scanWork - initScanWork)
        }
        gcw.scanWork = 0
    }
}

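  Processing a grey object does not require knowing its real size; it is treated simply as an object block handed out by the memory allocator. Walking the block in pointer-sized steps and consulting the bitmap finds every reference it holds. Each referenced object is pushed onto the queue as grey, while the current object becomes black and leaves the queue.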

markroot

  This function scans the root objects.

func markroot(gcw *gcWork, i uint32) {
    // TODO(austin): This is a bit ridiculous. Compute and store
    // the bases in gcMarkRootPrepare instead of the counts.
    baseFlushCache := uint32(fixedRootCount)
    baseData := baseFlushCache + uint32(work.nFlushCacheRoots)
    baseBSS := baseData + uint32(work.nDataRoots)
    baseSpans := baseBSS + uint32(work.nBSSRoots)
    baseStacks := baseSpans + uint32(work.nSpanRoots)
    end := baseStacks + uint32(work.nStackRoots)

    // Note: if you add a case here, please also update heapdump.go:dumproots.
    switch {
    // Flush the spans held in the mcache.
    case baseFlushCache <= i && i < baseData:
        flushmcache(int(i - baseFlushCache))
    // Scan the data segment (initialized global variables).
    case baseData <= i && i < baseBSS:
        for _, datap := range activeModules() {
            markrootBlock(datap.data, datap.edata-datap.data, datap.gcdatamask.bytedata, gcw, int(i-baseData))
        }
    // Scan the BSS segment (uninitialized global variables).
    case baseBSS <= i && i < baseSpans:
        for _, datap := range activeModules() {
            markrootBlock(datap.bss, datap.ebss-datap.bss, datap.gcbssmask.bytedata, gcw, int(i-baseBSS))
        }
    // Scan the finalizer queue.
    case i == fixedRootFinalizers:
        // Only do this once per GC cycle since we don't call
        // queuefinalizer during marking.
        if work.markrootDone {
            break
        }
        for fb := allfin; fb != nil; fb = fb.alllink {
            cnt := uintptr(atomic.Load(&fb.cnt))
            scanblock(uintptr(unsafe.Pointer(&fb.fin[0])), cnt*unsafe.Sizeof(fb.fin[0]), &finptrmask[0], gcw)
        }
    // Free the stacks of dead goroutines.
    case i == fixedRootFreeGStacks:
        // Only do this once per GC cycle; preferably
        // concurrently.
        if !work.markrootDone {
            // Switch to the system stack so we can call
            // stackfree.
            systemstack(markrootFreeGStacks)
        }
    // Scan MSpan.specials.
    case baseSpans <= i && i < baseStacks:
        // mark MSpan.specials
        markrootSpans(gcw, int(i-baseSpans))

    default:
        // the rest is scanning goroutine stacks
        // Pick the g whose stack needs scanning.
        var gp *g
        if baseStacks <= i && i < end {
            gp = allgs[i-baseStacks]
        } else {
            throw("markroot: bad index")
        }

        // remember when we've first observed the G blocked
        // needed only to output in traceback
        status := readgstatus(gp) // We are not in a scan state
        if (status == _Gwaiting || status == _Gsyscall) && gp.waitsince == 0 {
            gp.waitsince = work.tstart
        }

        // scang must be done on the system stack in case
        // we're trying to scan our own stack.
        // Hand the scan over to g0.
        systemstack(func() {
            // If this is a self-scan, put the user G in
            // _Gwaiting to prevent self-deadlock. It may
            // already be in _Gwaiting if this is a mark
            // worker or we're in mark termination.
            userG := getg().m.curg
            selfScan := gp == userG && readgstatus(userG) == _Grunning
            // When scanning our own stack, switch our own g's status first.
            if selfScan {
                casgstatus(userG, _Grunning, _Gwaiting)
                userG.waitreason = waitReasonGarbageCollectionScan
            }

            // TODO: scang blocks until gp's stack has
            // been scanned, which may take a while for
            // running goroutines. Consider doing this in
            // two phases where the first is non-blocking:
            // we scan the stacks we can and ask running
            // goroutines to scan themselves; and the
            // second blocks.
            // Scan the g's stack.
            scang(gp, gcw)

            if selfScan {
                casgstatus(userG, _Gwaiting, _Grunning)
            }
        })
    }
}

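  All of these scans ultimately go through scanblock, which checks the bitmap to find valid pointers and adds their targets to the work queue as grey, reachable objects.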
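The bitmap-driven walk can be illustrated on a plain slice of words. The sketch below is an assumption-laden simplification: `scanWords` is a hypothetical function, a "word" is just a slice element, the mask holds one bit per word with the least significant bit first, and any non-zero word whose bit is set counts as a pointer. The runtime additionally resolves each pointer to its object and span before greying it.

```go
package main

import "fmt"

// scanWords walks a block of words guided by a pointer mask and returns the
// indices of the words that hold non-nil pointers. ptrmask must contain at
// least one byte for every 8 words.
func scanWords(words []uintptr, ptrmask []uint8) []int {
	var found []int
	for i := 0; i < len(words); {
		bits := ptrmask[i/8]
		if bits == 0 {
			// None of the next 8 words is a pointer: skip them all.
			i += 8
			continue
		}
		for j := 0; j < 8 && i < len(words); j++ {
			if bits&1 != 0 && words[i] != 0 {
				found = append(found, i) // the runtime would grey the target here
			}
			bits >>= 1
			i++
		}
	}
	return found
}

func main() {
	words := make([]uintptr, 10)
	words[0] = 0x1000 // pointer slot holding an address
	words[1] = 0xdead // scalar data that merely looks like an address
	words[2] = 0      // pointer slot holding nil
	words[9] = 0x2000 // pointer slot holding an address

	// Bits 0 and 2 are set in the first byte, bit 1 (word 9) in the second.
	mask := []uint8{0b00000101, 0b00000010}

	fmt.Println(scanWords(words, mask)) // [0 9]
}
```

Word 1 is ignored because its mask bit is clear, which is how the collector avoids treating ordinary integers as references.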

markrootBlock

  Scans the region [b0, b0+n0) according to ptrmask0.

 1 func markrootBlock(b0, n0 uintptr, ptrmask0 *uint8, gcw *gcWork, shard int) {
 2     if rootBlockBytes%(8*sys.PtrSize) != 0 {
 3         // This is necessary to pick byte offsets in ptrmask0.
 4         throw("rootBlockBytes must be a multiple of 8*ptrSize")
 5     }
 6 
 7     b := b0 + uintptr(shard)*rootBlockBytes
 8     // 如果需掃描的block區域,超出b0+n0的區域,直接返回
 9     if b >= b0+n0 {
10         return
11     }
12     ptrmask := (*uint8)(add(unsafe.Pointer(ptrmask0), uintptr(shard)*(rootBlockBytes/(8*sys.PtrSize))))
13     n := uintptr(rootBlockBytes)
14     if b+n > b0+n0 {
15         n = b0 + n0 - b
16     }
17 
18     // Scan this shard.
19     // 掃描給定block的shard
20     scanblock(b, n, ptrmask, gcw)
21 }
 1 func scanblock(b0, n0 uintptr, ptrmask *uint8, gcw *gcWork) {
 2     // Use local copies of original parameters, so that a stack trace
 3     // due to one of the throws below shows the original block
 4     // base and extent.
 5     b := b0
 6     n := n0
 7 
 8     for i := uintptr(0); i < n; {
 9         // Find bits for the next word.
10         // 找到bitmap中對應的bits
11         bits := uint32(*addb(ptrmask, i/(sys.PtrSize*8)))
12         if bits == 0 {
13             i += sys.PtrSize * 8
14             continue
15         }
16         for j := 0; j < 8 && i < n; j++ {
17             if bits&1 != 0 {
18                 // 如果該地址包含指針
19                 // Same work as in scanobject; see comments there.
20                 obj := *(*uintptr)(unsafe.Pointer(b + i))
21                 if obj != 0 {
22                     // If the pointer resolves to a heap object, grey it
23                     if obj, span, objIndex := findObject(obj, b, i); obj != 0 {
24                         greyobject(obj, b, i, span, gcw, objIndex)
25                     }
26                 }
27             }
28             bits >>= 1
29             i += sys.PtrSize
30         }
31     }
32 }
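
  The bitmap walk in scanblock can be tried outside the runtime. The following is a minimal sketch, not the runtime's code: it assumes the block is a plain slice of words and that bit k of the mask describes word k. The names scanBlock, words and ptrmask are hypothetical. Like the real function, it skips eight words at a time when a mask byte is zero and otherwise tests one bit per word.

```go
package main

import "fmt"

// scanBlock is a hypothetical stand-in for runtime.scanblock.
// Bit k of ptrmask describes word k of the block. It returns the
// indexes of the words that are flagged as pointers and are non-nil.
func scanBlock(words []uintptr, ptrmask []uint8) []int {
	var found []int
	for i := 0; i < len(words); {
		bits := ptrmask[i/8]
		if bits == 0 {
			// No pointers in the next eight words, skip them all.
			i += 8
			continue
		}
		for j := 0; j < 8 && i < len(words); j++ {
			if bits&1 != 0 && words[i] != 0 {
				// The runtime would call findObject and greyobject here.
				found = append(found, i)
			}
			bits >>= 1
			i++
		}
	}
	return found
}

func main() {
	words := []uintptr{0x1000, 42, 0, 0x2000, 7, 0, 0, 0, 0x3000, 9}
	// 0x0D flags words 0, 2 and 3; 0x01 flags word 8.
	ptrmask := []uint8{0x0D, 0x01}
	fmt.Println(scanBlock(words, ptrmask))
}
```

  Word 2 is flagged as a pointer but holds nil, so it is skipped, just as the obj != 0 check does in the listing above.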

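  The gcWork used here is a purpose-built high-performance queue. It lets a local queue cooperate with the global work.full/partial lists so that work is distributed evenly.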

greyobject

  Greying an object means finding its mark bit in the bitmap, marking it live, and pushing it onto the queue.

 1 func greyobject(obj, base, off uintptr, span *mspan, gcw *gcWork, objIndex uintptr) {
 2     // obj should be start of allocation, and so must be at least pointer-aligned.
 3     if obj&(sys.PtrSize-1) != 0 {
 4         throw("greyobject: obj not pointer-aligned")
 5     }
 6     mbits := span.markBitsForIndex(objIndex)
 7 
 8     if useCheckmark {
 9         // Debug-only path: verify that every object was marked correctly
10         if !mbits.isMarked() {
11             // This object has not been marked
12             printlock()
13             print("runtime:greyobject: checkmarks finds unexpected unmarked object obj=", hex(obj), "\n")
14             print("runtime: found obj at *(", hex(base), "+", hex(off), ")\n")
15 
16             // Dump the source (base) object
17             gcDumpObject("base", base, off)
18 
19             // Dump the object
20             gcDumpObject("obj", obj, ^uintptr(0))
21 
22             getg().m.traceback = 2
23             throw("checkmark found unmarked object")
24         }
25         hbits := heapBitsForAddr(obj)
26         if hbits.isCheckmarked(span.elemsize) {
27             return
28         }
29         hbits.setCheckmarked(span.elemsize)
30         if !hbits.isCheckmarked(span.elemsize) {
31             throw("setCheckmarked and isCheckmarked disagree")
32         }
33     } else {
34         if debug.gccheckmark > 0 && span.isFree(objIndex) {
35             print("runtime: marking free object ", hex(obj), " found at *(", hex(base), "+", hex(off), ")\n")
36             gcDumpObject("base", base, off)
37             gcDumpObject("obj", obj, ^uintptr(0))
38             getg().m.traceback = 2
39             throw("marking free object")
40         }
41 
42         // If marked we have nothing to do.
43         // The object is already marked, nothing more to do
44         if mbits.isMarked() {
45             return
46         }
47         // mbits.setMarked() // Avoid extra call overhead with manual inlining.
48         // Mark the object
49         atomic.Or8(mbits.bytep, mbits.mask)
50         // If this is a noscan object, fast-track it to black
51         // instead of greying it.
52         // A noscan object contains no pointers, so marking it is enough; it is not queued, which is equivalent to blackening it directly
53         if span.spanclass.noscan() {
54             gcw.bytesMarked += uint64(span.elemsize)
55             return
56         }
57     }
58 
59     // Queue the obj for scanning. The PREFETCH(obj) logic has been removed but
60     // seems like a nice optimization that can be added back in.
61     // There needs to be time between the PREFETCH and the use.
62     // Previously we put the obj in an 8 element buffer that is drained at a rate
63     // to give the PREFETCH time to do its work.
64     // Use of PREFETCHNTA might be more appropriate than PREFETCH
65     // Try the fast path first and fall back to put if it fails; this completes the greying step
66     if !gcw.putFast(obj) {
67         gcw.put(obj)
68     }
69 }
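
  The core of greyobject is "set the mark bit once, and queue the object only if this call set it". The sketch below illustrates that idea under stated assumptions: markBitmap, tryMark and grey are hypothetical names, and it uses a compare-and-swap loop from sync/atomic instead of the runtime-internal atomic.Or8 shown in the listing.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// markBitmap is a hypothetical mark bitmap with one bit per object.
type markBitmap struct{ words []uint32 }

// tryMark sets the mark bit for object i and reports whether this
// call was the one that set it.
func (m *markBitmap) tryMark(i uint) bool {
	w := &m.words[i/32]
	mask := uint32(1) << (i % 32)
	for {
		old := atomic.LoadUint32(w)
		if old&mask != 0 {
			return false // already marked, nothing to do
		}
		if atomic.CompareAndSwapUint32(w, old, old|mask) {
			return true
		}
	}
}

// grey marks object i. Objects without pointers (noscan) are only
// marked, which amounts to blackening them; others are queued.
func grey(m *markBitmap, queue []uint, i uint, noscan bool) []uint {
	if !m.tryMark(i) || noscan {
		return queue
	}
	return append(queue, i)
}

func main() {
	m := &markBitmap{words: make([]uint32, 2)}
	var queue []uint
	queue = grey(m, queue, 5, false)  // marked and queued
	queue = grey(m, queue, 5, false)  // already marked, ignored
	queue = grey(m, queue, 40, true)  // noscan: marked, not queued
	fmt.Println("queue:", queue, "bitmap:", m.words)
}
```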

gcWork.putFast

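  A gcWork holds two buffers, wbuf1 and wbuf2, for grey objects. Grey objects go into wbuf1 first. Once wbuf1 is full, wbuf1 and wbuf2 are swapped, so the former wbuf2 becomes wbuf1 and keeps accepting grey objects. When both buffers are full, a new buffer is requested from the global list.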

  putFast only tries to put the object into wbuf1.

 1 func (w *gcWork) putFast(obj uintptr) bool {
 2     wbuf := w.wbuf1
 3     if wbuf == nil {
 4         // No buffer has been allocated yet, return false
 5         return false
 6     } else if wbuf.nobj == len(wbuf.obj) {
 7         // wbuf1 is full, return false
 8         return false
 9     }
10 
11     // wbuf1 has room, append the object
12     wbuf.obj[wbuf.nobj] = obj
13     wbuf.nobj++
14     return true
15 }

gcWork.put

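  put not only tries to place the object in wbuf1. When wbuf1 is full it also swaps the roles of wbuf1 and wbuf2, and if both are full it hands the full buffer over to the global list and requests an empty one in exchange.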

 1 func (w *gcWork) put(obj uintptr) {
 2     flushed := false
 3     wbuf := w.wbuf1
 4     if wbuf == nil {
 5         // wbuf1 does not exist yet, so initialize both wbuf1 and wbuf2
 6         w.init()
 7         wbuf = w.wbuf1
 8         // wbuf is empty at this point.
 9     } else if wbuf.nobj == len(wbuf.obj) {
10         // wbuf1 is full, swap the roles of wbuf1 and wbuf2
11         w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1
12         wbuf = w.wbuf1
13         if wbuf.nobj == len(wbuf.obj) {
14             // After the swap wbuf1 is full as well, so both buffers are full
15             // Hand wbuf1 to the global list and take an empty buffer
16             putfull(wbuf)
17             wbuf = getempty()
18             w.wbuf1 = wbuf
19             // Record that a buffer was flushed to the global list
20             flushed = true
21         }
22     }
23 
24     wbuf.obj[wbuf.nobj] = obj
25     wbuf.nobj++
26 
27     // If we put a buffer on full, let the GC controller know so
28     // it can encourage more workers to run. We delay this until
29     // the end of put so that w is in a consistent state, since
30     // enlistWorker may itself manipulate w.
31     // The global list now holds a full buffer, so the GC controller may schedule more workers
32     if flushed && gcphase == _GCmark {
33         gcController.enlistWorker()
34     }
35 }

gcw.balance()

  Back to line 58 of gcDrain: what does balancing the work involve?

 1 func (w *gcWork) balance() {
 2     if w.wbuf1 == nil {
 3         // wbuf1 and wbuf2 have not been initialized yet
 4         return
 5     }
 6     // If wbuf2 is not empty, hand it to the global list and give wbuf2 an empty buffer
 7     if wbuf := w.wbuf2; wbuf.nobj != 0 {
 8         putfull(wbuf)
 9         w.wbuf2 = getempty()
10     } else if wbuf := w.wbuf1; wbuf.nobj > 4 {
11         // Split wbuf1 in two and hand one half to the global list
12         w.wbuf1 = handoff(wbuf)
13     } else {
14         return
15     }
16     // We flushed a buffer to the full list, so wake a worker.
17     // The global list now has a buffer on it, so other workers can pick up work
18     if gcphase == _GCmark {
19         gcController.enlistWorker()
20     }
21 }
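
  The handoff branch of balance can be pictured as splitting a buffer. The sketch below is only an illustration under assumptions: it models a buffer as a slice, it assumes the worker keeps the newer half and publishes the older half, and the names handoffThreshold, handoff and balance are hypothetical stand-ins for the runtime functions.

```go
package main

import "fmt"

// handoffThreshold mirrors the "nobj > 4" check in the listing above.
const handoffThreshold = 4

// handoff copies the newer half of buf into a fresh slice that the
// worker keeps, and returns the older half for the global list.
func handoff(buf []uintptr) (kept, shared []uintptr) {
	n := len(buf) / 2
	kept = append([]uintptr(nil), buf[len(buf)-n:]...)
	shared = buf[:len(buf)-n]
	return kept, shared
}

// balance shares work only when the local buffer is large enough
// to be worth splitting.
func balance(local []uintptr) (kept, shared []uintptr) {
	if len(local) <= handoffThreshold {
		return local, nil
	}
	return handoff(local)
}

func main() {
	kept, shared := balance([]uintptr{1, 2, 3, 4, 5, 6})
	fmt.Println("kept:", kept, "shared:", shared)

	kept, shared = balance([]uintptr{1, 2, 3})
	fmt.Println("kept:", kept, "shared:", shared)
}
```

  With four or fewer entries nothing is shared, because splitting such a small buffer would cost more than it saves.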

gcw.get()

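  Back to line 63 of gcDrain. get first takes an object from the local queue: if wbuf1 is empty it tries wbuf2, and if both are empty it tries to fetch a full buffer from the global list and takes an object from that.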

 1 func (w *gcWork) get() uintptr {
 2     wbuf := w.wbuf1
 3     if wbuf == nil {
 4         w.init()
 5         wbuf = w.wbuf1
 6         // wbuf is empty at this point.
 7     }
 8     if wbuf.nobj == 0 {
 9         // wbuf1 is empty, swap the roles of wbuf1 and wbuf2
10         w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1
11         wbuf = w.wbuf1
12         // The former wbuf2 is empty too, so try to get a full buffer from the global list
13         if wbuf.nobj == 0 {
14             owbuf := wbuf
15             wbuf = getfull()
16             // Nothing available, return
17             if wbuf == nil {
18                 return 0
19             }
20             // Return the empty buffer to the global empty list and use the full buffer as wbuf1
21             putempty(owbuf)
22             w.wbuf1 = wbuf
23         }
24     }
25 
26     // TODO: This might be a good place to add prefetch code
27 
28     wbuf.nobj--
29     return wbuf.obj[wbuf.nobj]
30 }
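
  The interplay of put and get with the two local buffers and the global lists can be reproduced in a small model. This is a simplified sketch, not the runtime implementation: it is single-threaded, uses a tiny buffer capacity so the swaps happen quickly, and the types workbuf, global and localQueue are hypothetical.

```go
package main

import "fmt"

const bufCap = 4 // deliberately tiny; real workbufs hold far more

type workbuf struct {
	obj  [bufCap]uintptr
	nobj int
}

// global models the shared lists of full and empty buffers.
type global struct{ full, empty []*workbuf }

func (g *global) putFull(b *workbuf)  { g.full = append(g.full, b) }
func (g *global) putEmpty(b *workbuf) { g.empty = append(g.empty, b) }

func (g *global) getEmpty() *workbuf {
	if n := len(g.empty); n > 0 {
		b := g.empty[n-1]
		g.empty = g.empty[:n-1]
		return b
	}
	return &workbuf{}
}

func (g *global) getFull() *workbuf {
	n := len(g.full)
	if n == 0 {
		return nil
	}
	b := g.full[n-1]
	g.full = g.full[:n-1]
	return b
}

// localQueue models gcWork with its primary and secondary buffers.
type localQueue struct {
	g            *global
	wbuf1, wbuf2 *workbuf
}

func newLocalQueue(g *global) *localQueue {
	return &localQueue{g: g, wbuf1: g.getEmpty(), wbuf2: g.getEmpty()}
}

func (w *localQueue) put(obj uintptr) {
	if w.wbuf1.nobj == bufCap {
		w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1 // swap roles
		if w.wbuf1.nobj == bufCap {
			w.g.putFull(w.wbuf1) // both full: publish one
			w.wbuf1 = w.g.getEmpty()
		}
	}
	w.wbuf1.obj[w.wbuf1.nobj] = obj
	w.wbuf1.nobj++
}

// get returns 0 when no work is left anywhere.
func (w *localQueue) get() uintptr {
	if w.wbuf1.nobj == 0 {
		w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1 // swap roles
		if w.wbuf1.nobj == 0 {
			full := w.g.getFull()
			if full == nil {
				return 0
			}
			w.g.putEmpty(w.wbuf1)
			w.wbuf1 = full
		}
	}
	w.wbuf1.nobj--
	return w.wbuf1.obj[w.wbuf1.nobj]
}

func main() {
	g := &global{}
	q := newLocalQueue(g)
	for i := uintptr(1); i <= 10; i++ {
		q.put(i)
	}
	fmt.Println("full buffers published:", len(g.full))
	for obj := q.get(); obj != 0; obj = q.get() {
		fmt.Print(obj, " ")
	}
	fmt.Println()
}
```

  Keeping two local buffers gives hysteresis: a worker that alternates between put and get near a buffer boundary swaps locally instead of touching the global lists on every operation.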

  gcw.tryGet() and gcw.tryGetFast() follow much the same logic and are comparatively simple, so they are not analyzed further here.

scanobject

  Continuing at line 76 of gcDrain: b has been obtained at this point, and the queue starts being consumed.

  1 func scanobject(b uintptr, gcw *gcWork) {
  2     // Find the bits for b and the size of the object at b.
  3     //
  4     // b is either the beginning of an object, in which case this
  5     // is the size of the object to scan, or it points to an
  6     // oblet, in which case we compute the size to scan below.
  7     // Get the heap bitmap bits for b
  8     hbits := heapBitsForAddr(b)
  9     // Get the span that contains b
 10     s := spanOfUnchecked(b)
 11     n := s.elemsize
 12     if n == 0 {
 13         throw("scanobject n == 0")
 14     }
 15     // Large objects are split before scanning; maxObletBytes is 128 KiB
 16     if n > maxObletBytes {
 17         // Large object. Break into oblets for better
 18         // parallelism and lower latency.
 19         if b == s.base() {
 20             // It's possible this is a noscan object (not
 21             // from greyobject, but from other code
 22             // paths), in which case we must *not* enqueue
 23             // oblets since their bitmaps will be
 24             // uninitialized.
 25             // A noscan object has no pointers: account for it and return, which is equivalent to blackening it
 26             if s.spanclass.noscan() {
 27                 // Bypass the whole scan.
 28                 gcw.bytesMarked += uint64(n)
 29                 return
 30             }
 31 
 32             // Enqueue the other oblets to scan later.
 33             // Some oblets may be in b's scalar tail, but
 34             // these will be marked as "no more pointers",
 35             // so we'll drop out immediately when we go to
 36             // scan those.
 37             // Split into maxObletBytes-sized oblets and queue them
 38             for oblet := b + maxObletBytes; oblet < s.base()+s.elemsize; oblet += maxObletBytes {
 39                 if !gcw.putFast(oblet) {
 40                     gcw.put(oblet)
 41                 }
 42             }
 43         }
 44 
 45         // Compute the size of the oblet. Since this object
 46         // must be a large object, s.base() is the beginning
 47         // of the object.
 48         n = s.base() + s.elemsize - b
 49         if n > maxObletBytes {
 50             n = maxObletBytes
 51         }
 52     }
 53 
 54     var i uintptr
 55     for i = 0; i < n; i += sys.PtrSize {
 56         // Find bits for this word.
 57         // Get the bits for this word
 58         if i != 0 {
 59             // Avoid needless hbits.next() on last iteration.
 60             hbits = hbits.next()
 61         }
 62         // Load bits once. See CL 22712 and issue 16973 for discussion.
 63         bits := hbits.bits()
 64         // During checkmarking, 1-word objects store the checkmark
 65         // in the type bit for the one word. The only one-word objects
 66         // are pointers, or else they'd be merged with other non-pointer
 67         // data into larger allocations.
 68         if i != 1*sys.PtrSize && bits&bitScan == 0 {
 69             break // no more pointers in this object
 70         }
 71         // Not a pointer, move on
 72         if bits&bitPointer == 0 {
 73             continue // not a pointer
 74         }
 75 
 76         // Work here is duplicated in scanblock and above.
 77         // If you make changes here, make changes there too.
 78         obj := *(*uintptr)(unsafe.Pointer(b + i))
 79 
 80         // At this point we have extracted the next potential pointer.
 81         // Quickly filter out nil and pointers back to the current object.
 82         if obj != 0 && obj-b >= n {
 83             // Test if obj points into the Go heap and, if so,
 84             // mark the object.
 85             //
 86             // Note that it's possible for findObject to
 87             // fail if obj points to a just-allocated heap
 88             // object because of a race with growing the
 89             // heap. In this case, we know the object was
 90             // just allocated and hence will be marked by
 91             // allocation itself.
 92             // Find the object the pointer refers to and grey it
 93             if obj, span, objIndex := findObject(obj, b, i); obj != 0 {
 94                 greyobject(obj, b, i, span, gcw, objIndex)
 95             }
 96         }
 97     }
 98     gcw.bytesMarked += uint64(n)
 99     gcw.scanWork += int64(i)
100 }
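
  The oblet arithmetic in scanobject is easy to check in isolation. The sketch below computes the same boundaries that the loop and the "n = s.base() + s.elemsize - b" clamp produce. The oblet type and the splitOblets helper are hypothetical; maxObletBytes uses the 128 KiB value mentioned in the listing.

```go
package main

import "fmt"

const maxObletBytes = 128 << 10 // 128 KiB, as noted in the listing

// oblet describes one slice of a large object.
type oblet struct{ start, size uintptr }

// splitOblets returns the pieces that a large object at base with
// the given element size would be scanned in.
func splitOblets(base, elemsize uintptr) []oblet {
	var out []oblet
	for b := base; b < base+elemsize; b += maxObletBytes {
		n := base + elemsize - b
		if n > maxObletBytes {
			n = maxObletBytes // every piece but the last is full size
		}
		out = append(out, oblet{start: b, size: n})
	}
	return out
}

func main() {
	// A 300 KiB object yields two full oblets and a 44 KiB tail.
	for _, o := range splitOblets(0x100000, 300<<10) {
		fmt.Printf("start=%#x size=%d KiB\n", o.start, o.size>>10)
	}
}
```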

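  Greying means marking the object and putting it on the queue; blackening means just marking it. So once a grey object has been taken off the queue and scanned, it can be regarded as black.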

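  That completes the analysis of gcDrain's marking work. We now return to gcBgMarkWorker, which calls gcMarkDone once its mark work is exhausted.

gcMarkDone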

  1 func gcMarkDone() {
  2 top:
  3     semacquire(&work.markDoneSema)
  4 
  5     // Re-check transition condition under transition lock.
  6     if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) {
  7         semrelease(&work.markDoneSema)
  8         return
  9     }
 10 
 11     // Disallow starting new workers so that any remaining workers
 12     // in the current mark phase will drain out.
 13     //
 14     // TODO(austin): Should dedicated workers keep an eye on this
 15     // and exit gcDrain promptly?
 16     // Disallow starting new mark workers
 17     atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, -0xffffffff)
 18     prevFractionalGoal := gcController.fractionalUtilizationGoal
 19     gcController.fractionalUtilizationGoal = 0
 20 
 21     // gcBlackenPromptly means every local work cache must be flushed to the global list immediately and local caching is disabled
 22     if !gcBlackenPromptly {
 23         // Transition from mark 1 to mark 2.
 24         //
 25         // The global work list is empty, but there can still be work
 26         // sitting in the per-P work caches.
 27         // Flush and disable work caches.
 28 
 29         // Disallow caching workbufs and indicate that we're in mark 2.
 30         // Disable local work caches and enter the mark 2 phase
 31         gcBlackenPromptly = true
 32 
 33         // Prevent completion of mark 2 until we've flushed
 34         // cached workbufs.
 35         atomic.Xadd(&work.nwait, -1)
 36 
 37         // GC is set up for mark 2. Let Gs blocked on the
 38         // transition lock go while we flush caches.
 39         semrelease(&work.markDoneSema)
 40         // Switch to g0 to flush the local caches to the global list
 41         systemstack(func() {
 42             // Flush all currently cached workbufs and
 43             // ensure all Ps see gcBlackenPromptly. This
 44             // also blocks until any remaining mark 1
 45             // workers have exited their loop so we can
 46             // start new mark 2 workers.
 47             forEachP(func(_p_ *p) {
 48                 wbBufFlush1(_p_)
 49                 _p_.gcw.dispose()
 50             })
 51         })
 52 
 53         // Check that roots are marked. We should be able to
 54         // do this before the forEachP, but based on issue
 55         // #16083 there may be a (harmless) race where we can
 56         // enter mark 2 while some workers are still scanning
 57         // stacks. The forEachP ensures these scans are done.
 58         //
 59         // TODO(austin): Figure out the race and fix this
 60         // properly.
 61         // Check that all roots have been marked
 62         gcMarkRootCheck()
 63 
 64         // Now we can start up mark 2 workers.
 65         atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 0xffffffff)
 66         gcController.fractionalUtilizationGoal = prevFractionalGoal
 67 
 68         incnwait := atomic.Xadd(&work.nwait, +1)
 69         // If no work remains, go around again to move from mark 2 to mark termination
 70         if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
 71             // This loop will make progress because
 72             // gcBlackenPromptly is now true, so it won't
 73             // take this same "if" branch.
 74             goto top
 75         }
 76     } else {
 77         // Transition to mark termination.
 78         now := nanotime()
 79         work.tMarkTerm = now
 80         work.pauseStart = now
 81         getg().m.preemptoff = "gcing"
 82         if trace.enabled {
 83             traceGCSTWStart(0)
 84         }
 85         systemstack(stopTheWorldWithSema)
 86         // The gcphase is _GCmark, it will transition to _GCmarktermination
 87         // below. The important thing is that the wb remains active until
 88         // all marking is complete. This includes writes made by the GC.
 89 
 90         // Record that one root marking pass has completed.
 91         work.markrootDone = true
 92 
 93         // Disable assists and background workers. We must do
 94         // this before waking blocked assists.
 95         atomic.Store(&gcBlackenEnabled, 0)
 96 
 97         // Wake all blocked assists. These will run when we
 98         // start the world again.
 99         // Wake all blocked GC assists
100         gcWakeAllAssists()
101 
102         // Likewise, release the transition lock. Blocked
103         // workers and assists will run when we start the
104         // world again.
105         semrelease(&work.markDoneSema)
106 
107         // endCycle depends on all gcWork cache stats being
108         // flushed. This is ensured by mark 2.
109         // Compute the trigger ratio for the next GC cycle
110         nextTriggerRatio := gcController.endCycle()
111 
112         // Perform mark termination. This will restart the world.
113         // Enter the termination phase, which also restarts the world
114         gcMarkTermination(nextTriggerRatio)
115     }
116 }

gcMarkTermination

  Finishes marking, then starts sweeping and the related bookkeeping.

  1 func gcMarkTermination(nextTriggerRatio float64) {
  2     // World is stopped.
  3     // Start marktermination which includes enabling the write barrier.
  4     atomic.Store(&gcBlackenEnabled, 0)
  5     gcBlackenPromptly = false
  6     // Set the GC phase
  7     setGCPhase(_GCmarktermination)
  8 
  9     work.heap1 = memstats.heap_live
 10     startTime := nanotime()
 11 
 12     mp := acquirem()
 13     mp.preemptoff = "gcing"
 14     _g_ := getg()
 15     _g_.m.traceback = 2
 16     gp := _g_.m.curg
 17     // Put the current g into the waiting state
 18     casgstatus(gp, _Grunning, _Gwaiting)
 19     gp.waitreason = waitReasonGarbageCollection
 20 
 21     // Run gc on the g0 stack. We do this so that the g stack
 22     // we're currently running on will no longer change. Cuts
 23     // the root set down a bit (g0 stacks are not scanned, and
 24     // we don't need to scan gc's internal state).  We also
 25     // need to switch to g0 so we can shrink the stack.
 26     systemstack(func() {
 27         // Run gcMark on g0 so that the current g's stack can be scanned
 28         gcMark(startTime)
 29         // Must return immediately.
 30         // The outer function's stack may have moved
 31         // during gcMark (it shrinks stacks, including the
 32         // outer function's stack), so we must not refer
 33         // to any of its variables. Return back to the
 34         // non-system stack to pick up the new addresses
 35         // before continuing.
 36     })
 37 
 38     systemstack(func() {
 39         work.heap2 = work.bytesMarked
 40         if debug.gccheckmark > 0 {
 41             // Run a full stop-the-world mark using checkmark bits,
 42             // to check that we didn't forget to mark anything during
 43             // the concurrent mark process.
 44             // If gccheckmark is enabled, verify that every reachable object was marked
 45             gcResetMarkState()
 46             initCheckmarks()
 47             gcMark(startTime)
 48             clearCheckmarks()
 49         }
 50 
 51         // marking is complete so we can turn the write barrier off
 52         // Set the GC phase; _GCoff turns the write barrier off
 53         setGCPhase(_GCoff)
 54         // Start sweeping
 55         gcSweep(work.mode)
 56 
 57         if debug.gctrace > 1 {
 58             startTime = nanotime()
 59             // The g stacks have been scanned so
 60             // they have gcscanvalid==true and gcworkdone==true.
 61             // Reset these so that all stacks will be rescanned.
 62             gcResetMarkState()
 63             finishsweep_m()
 64 
 65             // Still in STW but gcphase is _GCoff, reset to _GCmarktermination
 66             // At this point all objects will be found during the gcMark which
 67             // does a complete STW mark and object scan.
 68             setGCPhase(_GCmarktermination)
 69             gcMark(startTime)
 70             setGCPhase(_GCoff) // marking is done, turn off wb.
 71             gcSweep(work.mode)
 72         }
 73     })
 74 
 75     _g_.m.traceback = 0
 76     casgstatus(gp, _Gwaiting, _Grunning)
 77 
 78     if trace.enabled {
 79         traceGCDone()
 80     }
 81 
 82     // all done
 83     mp.preemptoff = ""
 84 
 85     if gcphase != _GCoff {
 86         throw("gc done but gcphase != _GCoff")
 87     }
 88 
 89     // Update GC trigger and pacing for the next cycle.
 90     // Update the heap growth ratio that triggers the next GC
 91     gcSetTriggerRatio(nextTriggerRatio)
 92 
 93     // Update timing memstats
 94     // Update timing statistics
 95     now := nanotime()
 96     sec, nsec, _ := time_now()
 97     unixNow := sec*1e9 + int64(nsec)
 98     work.pauseNS += now - work.pauseStart
 99     work.tEnd = now
100     atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user
101     atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us
102     memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS)
103     memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow)
104     memstats.pause_total_ns += uint64(work.pauseNS)
105 
106     // Update work.totaltime.
107     sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm)
108     // We report idle marking time below, but omit it from the
109     // overall utilization here since it's "free".
110     markCpu := gcController.assistTime + gcController.dedicatedMarkTime + gcController.fractionalMarkTime
111     markTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm)
112     cycleCpu := sweepTermCpu + markCpu + markTermCpu
113     work.totaltime += cycleCpu
114 
115     // Compute overall GC CPU utilization.
116     totalCpu := sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs)
117     memstats.gc_cpu_fraction = float64(work.totaltime) / float64(totalCpu)
118 
119     // Reset sweep state.
120     // Reset the sweep state
121     sweep.nbgsweep = 0
122     sweep.npausesweep = 0
123 
124     // If this GC was forced by the user, bump the forced-GC counter
125     if work.userForced {
126         memstats.numforcedgc++
127     }
128 
129     // Bump GC cycle count and wake goroutines waiting on sweep.
130     // Bump the GC count, then wake the goroutines waiting on sweep
131     lock(&work.sweepWaiters.lock)
132     memstats.numgc++
133     injectglist(work.sweepWaiters.head.ptr())
134     work.sweepWaiters.head = 0
135     unlock(&work.sweepWaiters.lock)
136 
137     // Finish the current heap profiling cycle and start a new
138     // heap profiling cycle. We do this before starting the world
139     // so events don't leak into the wrong cycle.
140     mProf_NextCycle()
141     // start the world
142     systemstack(func() { startTheWorldWithSema(true) })
143 
144     // Flush the heap profile so we can start a new cycle next GC.
145     // This is relatively expensive, so we don't do it with the
146     // world stopped.
147     mProf_Flush()
148 
149     // Prepare workbufs for freeing by the sweeper. We do this
150     // asynchronously because it can take non-trivial time.
151     prepareFreeWorkbufs()
152 
153     // Free stack spans. This must be done between GC cycles.
154     systemstack(freeStackSpans)
155 
156     // Print gctrace before dropping worldsema. As soon as we drop
157     // worldsema another cycle could start and smash the stats
158     // we're trying to print.
159     if debug.gctrace > 0 {
160         util := int(memstats.gc_cpu_fraction * 100)
161 
162         var sbuf [24]byte
163         printlock()
164         print("gc ", memstats.numgc,
165             " @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ",
166             util, "%: ")
167         prev := work.tSweepTerm
168         for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} {
169             if i != 0 {
170                 print("+")
171             }
172             print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev))))
173             prev = ns
174         }
175         print(" ms clock, ")
176         for i, ns := range []int64{sweepTermCpu, gcController.assistTime, gcController.dedicatedMarkTime + gcController.fractionalMarkTime, gcController.idleMarkTime, markTermCpu} {
177             if i == 2 || i == 3 {
178                 // Separate mark time components with /.
179                 print("/")
180             } else if i != 0 {
181                 print("+")
182             }
183             print(string(fmtNSAsMS(sbuf[:], uint64(ns))))
184         }
185         print(" ms cpu, ",
186             work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ",
187             work.heapGoal>>20, " MB goal, ",
188             work.maxprocs, " P")
189         if work.userForced {
190             print(" (forced)")
191         }
192         print("\n")
193         printunlock()
194     }
195 
196     semrelease(&worldsema)
197     // Careful: another GC cycle may start now.
198 
199     releasem(mp)
200     mp = nil
201 
202     // now that gc is done, kick off finalizer thread if needed
203     // If sweeping is not concurrent, yield so that queued finalizers get a chance to run
204     if !concurrentSweep {
205         // give the queued finalizers, if any, a chance to run
206         Gosched()
207     }
208 }

5. Sweeping

gcSweep

  The sweep task:

 1 func gcSweep(mode gcMode) {
 2     if gcphase != _GCoff {
 3         throw("gcSweep being done but phase is not GCoff")
 4     }
 5 
 6     lock(&mheap_.lock)
 7     // sweepgen grows by 2 after every GC, and the two sweepSpans swap roles each cycle
 8     mheap_.sweepgen += 2
 9     mheap_.sweepdone = 0
10     if mheap_.sweepSpans[mheap_.sweepgen/2%2].index != 0 {
11         // We should have drained this list during the last
12         // sweep phase. We certainly need to start this phase
13         // with an empty swept list.
14         throw("non-empty swept list")
15     }
16     mheap_.pagesSwept = 0
17     unlock(&mheap_.lock)
18     // If sweeping is not concurrent, or this is a forced blocking GC
19     if !_ConcurrentSweep || mode == gcForceBlockMode {
20         // Special case synchronous sweep.
21         // Record that no proportional sweeping has to happen.
22         lock(&mheap_.lock)
23         mheap_.sweepPagesPerByte = 0
24         unlock(&mheap_.lock)
25         // Sweep all spans eagerly.
26         // Sweep all spans
27         for sweepone() != ^uintptr(0) {
28             sweep.npausesweep++
29         }
30         // Free workbufs eagerly.
31         // Free all workbufs
32         prepareFreeWorkbufs()
33         for freeSomeWbufs(false) {
34         }
35         // All "free" events for this mark/sweep cycle have
36         // now happened, so we can make this profile cycle
37         // available immediately.
38         mProf_NextCycle()
39         mProf_Flush()
40         return
41     }
42 
43     // Background sweep.
44     lock(&sweep.lock)
45     // Wake the background sweeper (the bgsweep function); its flow is much like the synchronous sweep above
46     if sweep.parked {
47         sweep.parked = false
48         ready(sweep.g, 0, true)
49     }
50     unlock(&sweep.lock)
51 }

  Concurrent sweeping is likewise handled by a dedicated goroutine, which is created when runtime.main runs.

sweepone

  Next, let's walk through how sweepone sweeps a span.

 1 func sweepone() uintptr {
 2     _g_ := getg()
 3     sweepRatio := mheap_.sweepPagesPerByte // For debugging
 4 
 5     // increment locks to ensure that the goroutine is not preempted
 6     // in the middle of sweep thus leaving the span in an inconsistent state for next GC
 7     _g_.m.locks++
 8     // Check whether sweeping has already finished
 9     if atomic.Load(&mheap_.sweepdone) != 0 {
10         _g_.m.locks--
11         return ^uintptr(0)
12     }
13     // Increment the number of active sweepers
14     atomic.Xadd(&mheap_.sweepers, +1)
15 
16     npages := ^uintptr(0)
17     sg := mheap_.sweepgen
18     for {
19         // Loop until a span that needs sweeping is found
20         s := mheap_.sweepSpans[1-sg/2%2].pop()
21         if s == nil {
22             atomic.Store(&mheap_.sweepdone, 1)
23             break
24         }
25         if s.state != mSpanInUse {
26             // This can happen if direct sweeping already
27             // swept this span, but in that case the sweep
28             // generation should always be up-to-date.
29             if s.sweepgen != sg {
30                 print("runtime: bad span s.state=", s.state, " s.sweepgen=", s.sweepgen, " sweepgen=", sg, "\n")
31                 throw("non in-use span in unswept list")
32             }
33             continue
34         }
35         // sweepgen == h->sweepgen - 2: this span needs sweeping
36         // sweepgen == h->sweepgen - 1: this span is being swept
37         // Check the span's state and try to claim it for sweeping
38         if s.sweepgen != sg-2 || !atomic.Cas(&s.sweepgen, sg-2, sg-1) {
39             continue
40         }
41         npages = s.npages
42         // Sweep a single span
43         if !s.sweep(false) {
44             // Span is still in-use, so this returned no
45             // pages to the heap and the span needs to
46             // move to the swept in-use list.
47             npages = 0
48         }
49         break
50     }
51 
52     // Decrement the number of active sweepers and if this is the
53     // last one print trace information.
54     // This sweeper is done, so decrement the sweepers count
55     if atomic.Xadd(&mheap_.sweepers, -1) == 0 && atomic.Load(&mheap_.sweepdone) != 0 {
56         if debug.gcpacertrace > 0 {
57             print("pacer: sweep done at heap size ", memstats.heap_live>>20, "MB; allocated ", (memstats.heap_live-mheap_.sweepHeapLiveBasis)>>20, "MB during sweep; swept ", mheap_.pagesSwept, " pages at ", sweepRatio, " pages/byte\n")
58         }
59     }
60     _g_.m.locks--
61     return npages
62 }
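
  The sweepgen protocol used by gcSweep and sweepone can be modelled in a few lines. This is a sketch under assumptions, not runtime code: the span type, tryClaim, finish and the two index helpers are hypothetical. The final state (a swept span has sweepgen equal to the heap's sweepgen) is not shown in the listings here and follows the convention described in the runtime's own comments.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type span struct{ sweepgen uint32 }

// tryClaim succeeds only if the span still needs sweeping (sg-2)
// and this caller wins the race to move it to "being swept" (sg-1).
func tryClaim(s *span, sg uint32) bool {
	return atomic.LoadUint32(&s.sweepgen) == sg-2 &&
		atomic.CompareAndSwapUint32(&s.sweepgen, sg-2, sg-1)
}

// finish marks the span as swept for generation sg.
func finish(s *span, sg uint32) { atomic.StoreUint32(&s.sweepgen, sg) }

// sweptIndex and unsweptIndex pick one of the two sweepSpans sets.
// Because sweepgen grows by 2 per cycle, the roles alternate.
func sweptIndex(sg uint32) uint32   { return sg / 2 % 2 }
func unsweptIndex(sg uint32) uint32 { return 1 - sg/2%2 }

func main() {
	sg := uint32(4)
	s := &span{sweepgen: sg - 2}

	fmt.Println("first claim: ", tryClaim(s, sg))
	fmt.Println("second claim:", tryClaim(s, sg))
	finish(s, sg)
	fmt.Println("swept:", s.sweepgen == sg)

	for _, g := range []uint32{4, 6, 8} {
		fmt.Println("sweepgen", g, "swept set", sweptIndex(g), "unswept set", unsweptIndex(g))
	}
}
```

  The compare-and-swap is what lets the background sweeper and allocating goroutines sweep concurrently without ever sweeping the same span twice.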

mspan.sweep

  1 func (s *mspan) sweep(preserve bool) bool {
  2     // It's critical that we enter this function with preemption disabled,
  3     // GC must not start while we are in the middle of this function.
  4     _g_ := getg()
  5     if _g_.m.locks == 0 && _g_.m.mallocing == 0 && _g_ != _g_.m.g0 {
  6         throw("MSpan_Sweep: m is not locked")
  7     }
  8     sweepgen := mheap_.sweepgen
  9     // Only a span in the being-swept state may proceed
 10     if s.state != mSpanInUse || s.sweepgen != sweepgen-1 {
 11         print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n")
 12         throw("MSpan_Sweep: bad span state")
 13     }
 14 
 15     if trace.enabled {
 16         traceGCSweepSpan(s.npages * _PageSize)
 17     }
 18     // Update the count of swept pages first
 19     atomic.Xadd64(&mheap_.pagesSwept, int64(s.npages))
 20 
 21     spc := s.spanclass
 22     size := s.elemsize
 23     res := false
 24 
 25     c := _g_.m.mcache
 26     freeToHeap := false
 27 
 28     // The allocBits indicate which unmarked objects don't need to be
 29     // processed since they were free at the end of the last GC cycle
 30     // and were not allocated since then.
 31     // If the allocBits index is >= s.freeindex and the bit
 32     // is not marked then the object remains unallocated
 33     // since the last GC.
 34     // This situation is analogous to being on a freelist.
 35 
 36     // Unlink & free special records for any objects we're about to free.
 37     // Two complications here:
 38     // 1. An object can have both finalizer and profile special records.
 39     //    In such case we need to queue finalizer for execution,
 40     //    mark the object as live and preserve the profile special.
 41     // 2. A tiny object can have several finalizers setup for different offsets.
 42     //    If such object is not marked, we need to queue all finalizers at once.
 43     // Both 1 and 2 are possible at the same time.
 44     specialp := &s.specials
 45     special := *specialp
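 46     // For each object with special records, check whether it is still live and whether it has at least one finalizer; free the specials of objects without finalizers and queue the finalizers of the rest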
 47     for special != nil {
 48         // A finalizer can be set for an inner byte of an object, find object beginning.
 49         objIndex := uintptr(special.offset) / size
 50         p := s.base() + objIndex*size
 51         mbits := s.markBitsForIndex(objIndex)
 52         if !mbits.isMarked() {
 53             // This object is not marked and has at least one special record.
 54             // Pass 1: see if it has at least one finalizer.
 55             hasFin := false
 56             endOffset := p - s.base() + size
 57             for tmp := special; tmp != nil && uintptr(tmp.offset) < endOffset; tmp = tmp.next {
 58                 if tmp.kind == _KindSpecialFinalizer {
 59                     // Stop freeing of object if it has a finalizer.
 60                     mbits.setMarkedNonAtomic()
 61                     hasFin = true
 62                     break
 63                 }
 64             }
 65             // Pass 2: queue all finalizers _or_ handle profile record.
 66             for special != nil && uintptr(special.offset) < endOffset {
 67                 // Find the exact byte for which the special was setup
 68                 // (as opposed to object beginning).
 69                 p := s.base() + uintptr(special.offset)
 70                 if special.kind == _KindSpecialFinalizer || !hasFin {
 71                     // Splice out special record.
 72                     y := special
 73                     special = special.next
 74                     *specialp = special
 75                     freespecial(y, unsafe.Pointer(p), size)
 76                 } else {
 77                     // This is profile record, but the object has finalizers (so kept alive).
 78                     // Keep special record.
 79                     specialp = &special.next
 80                     special = *specialp
 81                 }
 82             }
 83         } else {
 84             // object is still live: keep special record
 85             specialp = &special.next
 86             special = *specialp
 87         }
 88     }
 89 
 90     if debug.allocfreetrace != 0 || raceenabled || msanenabled {
 91         // Find all newly freed objects. This doesn't have to be
 92         // efficient; allocfreetrace has massive overhead.
 93         mbits := s.markBitsForBase()
 94         abits := s.allocBitsForIndex(0)
 95         for i := uintptr(0); i < s.nelems; i++ {
 96             if !mbits.isMarked() && (abits.index < s.freeindex || abits.isMarked()) {
 97                 x := s.base() + i*s.elemsize
 98                 if debug.allocfreetrace != 0 {
 99                     tracefree(unsafe.Pointer(x), size)
100                 }
101                 if raceenabled {
102                     racefree(unsafe.Pointer(x), size)
103                 }
104                 if msanenabled {
105                     msanfree(unsafe.Pointer(x), size)
106                 }
107             }
108             mbits.advance()
109             abits.advance()
110         }
111     }
112 
113     // Count the number of free objects in this span.
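114     // Count the objects still allocated in this span after marking; the number freed is derived from it below
115     nalloc := uint16(s.countAlloc())
116     // If the size class is 0 (a large-object span) and nothing is still allocated, free the span to mheap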
117     if spc.sizeclass() == 0 && nalloc == 0 {
118         s.needzero = 1
119         freeToHeap = true
120     }
121     nfreed := s.allocCount - nalloc
122     if nalloc > s.allocCount {
123         print("runtime: nelems=", s.nelems, " nalloc=", nalloc, " previous allocCount=", s.allocCount, " nfreed=", nfreed, "\n")
124         throw("sweep increased allocation count")
125     }
126 
127     s.allocCount = nalloc
128     // Record whether the span had no free slots left before this sweep
129     wasempty := s.nextFreeIndex() == s.nelems
130     // Reset freeindex
131     s.freeindex = 0 // reset allocation index to start of span.
132     if trace.enabled {
133         getg().m.p.ptr().traceReclaimed += uintptr(nfreed) * s.elemsize
134     }
135 
136     // gcmarkBits becomes the allocBits.
137     // get a fresh cleared gcmarkBits in preparation for next GC
138     // The mark bits become the new allocBits
139     s.allocBits = s.gcmarkBits
140     // Allocate fresh, cleared gcmarkBits
141     s.gcmarkBits = newMarkBits(s.nelems)
142 
143     // Initialize alloc bits cache.
144     // Refresh allocCache
145     s.refillAllocCache(0)
146 
147     // We need to set s.sweepgen = h.sweepgen only when all blocks are swept,
148     // because of the potential for a concurrent free/SetFinalizer.
149     // But we need to set it before we make the span available for allocation
150     // (return it to heap or mcentral), because allocation code assumes that a
151     // span is already swept if available for allocation.
152     if freeToHeap || nfreed == 0 {
153         // The span must be in our exclusive ownership until we update sweepgen,
154         // check for potential races.
155         if s.state != mSpanInUse || s.sweepgen != sweepgen-1 {
156             print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n")
157             throw("MSpan_Sweep: bad span state after sweep")
158         }
159         // Serialization point.
160         // At this point the mark bits are cleared and allocation ready
161         // to go so release the span.
162         atomic.Store(&s.sweepgen, sweepgen)
163     }
164 
165     if nfreed > 0 && spc.sizeclass() != 0 {
166         c.local_nsmallfree[spc.sizeclass()] += uintptr(nfreed)
167         // Return the span to mcentral
168         res = mheap_.central[spc].mcentral.freeSpan(s, preserve, wasempty)
169         // MCentral_FreeSpan updates sweepgen
170     } else if freeToHeap {
171         // Large-object span release; corresponds to the check on line 117
172         // Free large span to heap
173 
174         // NOTE(rsc,dvyukov): The original implementation of efence
175         // in CL 22060046 used SysFree instead of SysFault, so that
176         // the operating system would eventually give the memory
177         // back to us again, so that an efence program could run
178         // longer without running out of memory. Unfortunately,
179         // calling SysFree here without any kind of adjustment of the
180         // heap data structures means that when the memory does
181         // come back to us, we have the wrong metadata for it, either in
182         // the MSpan structures or in the garbage collection bitmap.
183         // Using SysFault here means that the program will run out of
184         // memory fairly quickly in efence mode, but at least it won't
185         // have mysterious crashes due to confused memory reuse.
186         // It should be possible to switch back to SysFree if we also
187         // implement and then call some kind of MHeap_DeleteSpan.
188         if debug.efence > 0 {
189             s.limit = 0 // prevent mlookup from finding this span
190             sysFault(unsafe.Pointer(s.base()), size)
191         } else {
192             // Free the span to mheap
193             mheap_.freeSpan(s, 1)
194         }
195         c.local_nlargefree++
196         c.local_largefree += size
197         res = true
198     }
199     if !res {
200         // The span has been swept and is still in-use, so put
201         // it on the swept in-use list.
202         // If the span was returned to neither mcentral nor mheap, it is still in use
203         mheap_.sweepSpans[sweepgen/2%2].push(s)
204     }
205     return res
206 }

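  Concurrent sweeping is essentially an infinite loop. Once woken, the sweeper runs the sweep task: it walks all spans and triggers reclamation of the memory they hold. When the task is done it goes back to sleep and waits for the next one.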
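The wake, sweep, sleep cycle can be modelled with a small toy program. This is only a sketch: the `span` type, `bgSweep`, and the channels are hypothetical stand-ins invented for illustration, not the runtime's own data structures. It shows a sweeper that parks until woken, frees every allocated but unmarked slot, and then lets the mark bits become the new allocation bits.

```go
package main

import "fmt"

// span is a toy stand-in for the runtime's mspan: one bit per object slot.
type span struct {
	allocBits []bool // slots currently allocated
	markBits  []bool // slots the last mark phase found reachable
}

// sweep counts every allocated-but-unmarked slot as freed. The mark bits then
// become the new alloc bits and a cleared mark bitmap is installed for the
// next cycle.
func (s *span) sweep() int {
	freed := 0
	for i, allocated := range s.allocBits {
		if allocated && !s.markBits[i] {
			freed++
		}
	}
	s.allocBits = s.markBits
	s.markBits = make([]bool, len(s.allocBits))
	return freed
}

// bgSweep models the background sweeper: it blocks until woken, sweeps every
// span it is handed, reports the result, and goes back to waiting.
func bgSweep(wake <-chan []*span, done chan<- int) {
	for spans := range wake { // parked here between GC cycles
		freed := 0
		for _, s := range spans {
			freed += s.sweep()
		}
		done <- freed
	}
}

func main() {
	wake := make(chan []*span)
	done := make(chan int)
	go bgSweep(wake, done)

	spans := []*span{
		{allocBits: []bool{true, true, false, true}, markBits: []bool{true, false, false, false}},
		{allocBits: []bool{true, true}, markBits: []bool{true, true}},
	}
	wake <- spans // the end of a mark phase wakes the sweeper
	fmt.Println("freed slots:", <-done)
}
```

In this model the first span loses two slots and the second loses none, so the program reports two freed slots.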

6. Collection process

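  Go's GC is a concurrent collector: most of its work runs at the same time as ordinary Go code, which makes the collection process fairly complex.
A GC cycle has four phases: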

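  • Sweep Termination: sweep any spans that are still unswept. A new GC cycle can start only after the previous cycle's sweeping has finished.
  • Mark: scan all root objects and every object reachable from them, marking them so they are not reclaimed.
  • Mark Termination: finish the marking work and rescan some root objects (requires STW).
  • Sweep: sweep spans according to the mark results.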

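  Two kinds of background tasks (Gs) run during GC: mark workers and a sweeper. Mark workers are started on demand, and the number that can run at the same time is roughly 25% of the number of Ps, which is the basis for Go's statement that 25% of the CPU is spent on GC. One sweeper is started at program startup and is woken when the sweep phase begins.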
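The 25% figure can be turned into a quick back-of-the-envelope calculation. The helper below is a hypothetical simplification that only splits the target into whole workers plus a fractional remainder; the runtime's real rounding and adjustment rules are more involved and vary between Go versions.

```go
package main

import (
	"fmt"
	"runtime"
)

// backgroundUtilization is the 25% target described in the text.
const backgroundUtilization = 0.25

// markWorkerGoal splits the 25% target for procs Ps into whole dedicated
// workers and a fractional remainder. Hypothetical helper for illustration.
func markWorkerGoal(procs int) (dedicated int, fractional float64) {
	goal := float64(procs) * backgroundUtilization
	dedicated = int(goal)
	fractional = goal - float64(dedicated)
	return dedicated, fractional
}

func main() {
	for _, procs := range []int{1, 4, 6, 8, runtime.GOMAXPROCS(0)} {
		d, f := markWorkerGoal(procs)
		fmt.Printf("P=%d: %d dedicated worker(s), %.2f of a P left over\n", procs, d, f)
	}
}
```

With 8 Ps the target works out to two whole workers; with 6 Ps it is one worker plus half a P that has to be covered some other way.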

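  At present a full GC cycle performs two STW (Stop The World) pauses: the first at the start of the Mark phase, the second in the Mark Termination phase. The first STW prepares the scanning of root objects and enables the write barrier and mutator assists. The second STW rescans some root objects and disables the write barrier and mutator assists. Note that not all root scanning requires STW; scanning the objects on a stack, for example, only requires stopping the G that owns that stack. The write barrier is implemented as a Hybrid Write Barrier, which greatly shortens the second STW.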
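The cost of these pauses can be observed from user code through `runtime.MemStats`. The sketch below forces a few collections and reads `NumGC` and `PauseTotalNs` before and after. The `gcPauseStats` helper and the `sink` variable are names made up for this example, and the figures it prints depend entirely on the machine and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sink keeps the allocations below from being optimized away.
var sink []byte

// gcPauseStats forces n collections and returns how many GC cycles completed
// and how much stop-the-world pause time they added in total.
func gcPauseStats(n int) (cycles uint32, pause time.Duration) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		sink = make([]byte, 1<<20) // give the collector some garbage
		runtime.GC()               // blocks until the collection is complete
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC, time.Duration(after.PauseTotalNs - before.PauseTotalNs)
}

func main() {
	cycles, pause := gcPauseStats(5)
	fmt.Printf("%d cycles, %v total STW pause\n", cycles, pause)
}
```

Dividing the total pause by the number of cycles gives a rough per-cycle STW cost covering both pauses.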

7. Monitoring

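  Scenario: the service restarts, a huge number of clients reconnect, and a large number of objects are allocated in an instant. This pushes the GC trigger threshold next_gc to a very large value. Once the service is back to normal, the live heap is far smaller than that threshold, so garbage collection is not triggered for a long time. The process then holds a large number of white objects that cannot be reclaimed, which amounts to a hidden memory leak. The same thing can happen when some piece of code uses a large number of temporary objects within a short period.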
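The arithmetic behind this effect is simple. As a simplified model, assume the next threshold is the live heap after marking plus GOGC percent of it. The `nextGC` helper below is a hypothetical illustration of that rule and ignores the minimum heap size and the other inputs that the real pacer takes into account. The heap sizes are made-up numbers.

```go
package main

import "fmt"

// nextGC estimates the next collection threshold from the live heap size and
// the GOGC percentage. Simplified model for illustration only.
func nextGC(liveBytes, gogc uint64) uint64 {
	return liveBytes + liveBytes*gogc/100
}

func main() {
	const mb = 1 << 20
	// Live heap measured during the reconnect burst (hypothetical size).
	fmt.Println("threshold after burst:", nextGC(200*mb, 100)/mb, "MB")
	// Threshold the steady-state live heap would call for (hypothetical size).
	fmt.Println("steady-state threshold:", nextGC(10*mb, 100)/mb, "MB")
}
```

Under this model the burst leaves a 400 MB threshold, while the steady-state heap would only call for 20 MB, so the heap has to grow a long way before allocation alone triggers another collection.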

Example scenario:

//testms.go
package main

import (
    "fmt"
    "runtime"
    "time"
)

func test() {
    type M [1 << 10]byte
    data := make([]*M, 1024*20)

    // Allocate 20 MB, exceeding the initial threshold and pushing NextGC up
    for i := range data {
        data[i] = new(M)
    }

    // Drop the references, in case inlining extends the lifetime of data
    for i := range data {
        data[i] = nil
    }
}

func main() {
    test()
    for {
        var ms runtime.MemStats
        runtime.ReadMemStats(&ms)
        // Print the current time and the next GC threshold in MB
        fmt.Printf("%s %d MB\n", time.Now().Format("15:04:05"), ms.NextGC>>20)

        time.Sleep(time.Second * 30)
    }
}

Compile and run:

  The test() function simulates allocating a large number of objects within a short period.

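  The output shows that no garbage collection is triggered for some time after test() returns. Only when forcegc steps in does next_gc return to normal. This is the last line of defense for garbage collection: the monitoring thread sysmon periodically checks the garbage collection state and forces a collection if none has run for more than 2 minutes.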

 

