The basic characteristics of the Go GC are: non-generational, non-compacting, write-barrier based, concurrent mark-and-sweep. Its core goals are to keep heap growth under control and to make full use of CPU resources.
1. Tri-color marking
Tri-color marking means coloring the objects in the system while the collector runs concurrently with user code, and then sweeping objects according to their color. The basic principle:
- At the start, every object on the heap is marked white;
- Traversal starts from the roots: each white object reached is marked grey and put into a pending-work queue;
- Grey objects are taken from the queue, the white objects they reference are marked grey, and the grey object itself is marked black;
- The previous step is repeated until no grey objects remain.
After the last step, every object still marked white is unreachable, i.e. garbage; those white objects are what gets swept away.
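To make the idea concrete, here is a minimal, self-contained sketch of the tri-color traversal; the object type, color constants and mark function are invented for this illustration and are not the runtime's implementation:

package main

import "fmt"

type color int

const (
	white color = iota // not yet visited; candidate for collection
	grey               // visited, but its references are not yet scanned
	black              // visited and fully scanned
)

type object struct {
	name string
	refs []*object
	c    color
}

// mark performs the tri-color traversal starting from the given roots.
func mark(roots []*object) {
	var queue []*object
	// Roots start out grey and go into the work queue.
	for _, r := range roots {
		if r.c == white {
			r.c = grey
			queue = append(queue, r)
		}
	}
	// Drain the queue: shade referenced white objects grey, then
	// blacken the object itself.
	for len(queue) > 0 {
		obj := queue[0]
		queue = queue[1:]
		for _, ref := range obj.refs {
			if ref.c == white {
				ref.c = grey
				queue = append(queue, ref)
			}
		}
		obj.c = black
	}
}

func main() {
	a := &object{name: "A"}
	b := &object{name: "B"}
	c := &object{name: "C"} // never referenced, stays white
	a.refs = []*object{b}
	mark([]*object{a})
	for _, o := range []*object{a, b, c} {
		fmt.Println(o.name, "garbage:", o.c == white)
	}
}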
Write barrier
Tri-color marking does not stop the world, which means objects can still be modified while marking is in progress. Consider the following situation: the collector scans object A and marks all of A's references; while it is scanning object D's references, another goroutine changes the reference D->E into A->E. E would then never be scanned, would stay white, and would wrongly be treated as garbage. The write barrier exists to solve exactly this problem: with it enabled, when A->E is written, E is treated as live. Even if A later drops its reference to E, E is only reclaimed in the next GC cycle; the current cycle will not collect it.
The write barrier watches writes to object pointers and re-shades the written object or puts it back on the queue.
The hybrid write barrier has been enabled since Go 1.8; its pseudocode is as follows:
writePointer(slot, ptr):
    shade(*slot)
    if any stack is grey:
        shade(ptr)
    *slot = ptr
The hybrid write barrier shades both the "old pointer" being overwritten in the destination slot and the "new pointer" being written.
The old pointer is shaded because another running thread may have copied it into a register or a stack-local variable, and such copies do not go through the write barrier, so the pointer might otherwise escape marking. The new pointer is shaded because another running thread may move the pointer somewhere else.
With the hybrid write barrier, the GC no longer needs to rescan each G's stack after concurrent marking finishes, which reduces the STW time of mark termination.
In addition to the write barrier, every object newly allocated during GC is immediately marked black.
Controller
The controller takes part in the whole concurrent collection cycle: it records state data, dynamically adjusts the running strategy, influences the working mode and number of concurrent mark workers, and balances CPU usage. When a cycle ends it participates in setting the next_gc trigger threshold, which tunes how often garbage collection is triggered.
//mgc.go
// gcController implements the GC pacing controller that determines
// when to trigger concurrent garbage collection and how much marking
// work to do in mutator assists and background marking.
//
// It uses a feedback control algorithm to adjust the memstats.gc_trigger
// trigger based on the heap growth and GC CPU utilization each cycle.
// This algorithm optimizes for heap growth to match GOGC and for CPU
// utilization between assist and background marking to be 25% of
// GOMAXPROCS. The high-level design of this algorithm is documented
// at https://golang.org/s/go15gcpacing.
//
// All fields of gcController are used only during a single mark
// cycle.
Assist GC (mutator assist)
When objects are allocated far faster than background marking can keep up with, the heap can grow out of control and, in the worst case, the collection can never finish. In that situation it is necessary to let user-code goroutines take part in marking: when heap memory is allocated for an object, a bounded amount of collection work is performed according to the pacing policy, balancing allocation against collection and keeping the process healthy.
2. Initialization
During initialization the key points are setting gcpercent and next_gc; a small user-level example follows the gcinit listing below.
//mgc.go
// Initialized from $GOGC.  GOGC=off means no GC.
var gcpercent int32

func gcinit() {
	if unsafe.Sizeof(workbuf{}) != _WorkbufSize {
		throw("size of Workbuf is suboptimal")
	}

	// No sweep on the first cycle.
	mheap_.sweepdone = 1

	// Set a reasonable initial GC trigger.
	memstats.triggerRatio = 7 / 8.0

	// Fake a heap_marked value so it looks like a trigger at
	// heapminimum is the appropriate growth from heap_marked.
	// This will go into computing the initial GC goal.
	memstats.heap_marked = uint64(float64(heapminimum) / (1 + memstats.triggerRatio))

	// Set gcpercent from the environment. This will also compute
	// and set the GC trigger and goal.
	// Set GOGC.
	_ = setGCPercent(readgogc())

	work.startSema = 1
	work.markDoneSema = 1
}

func readgogc() int32 {
	p := gogetenv("GOGC")
	if p == "off" {
		return -1
	}
	if n, ok := atoi32(p); ok {
		return n
	}
	return 100
}

// gcenable is called after the bulk of the runtime initialization,
// just before we're about to start letting user code run.
// It kicks off the background sweeper goroutine and enables GC.
func gcenable() {
	c := make(chan int, 1)
	go bgsweep(c)
	<-c
	memstats.enablegc = true // now that runtime is initialized, GC is okay
}

//go:linkname setGCPercent runtime/debug.setGCPercent
func setGCPercent(in int32) (out int32) {
	lock(&mheap_.lock)
	out = gcpercent
	if in < 0 {
		in = -1
	}
	gcpercent = in
	heapminimum = defaultHeapMinimum * uint64(gcpercent) / 100
	// Update pacing in response to gcpercent change.
	gcSetTriggerRatio(memstats.triggerRatio)
	unlock(&mheap_.lock)

	// If we just disabled GC, wait for any concurrent GC mark to
	// finish so we always return with no GC running.
	if in < 0 {
		gcWaitOnMark(atomic.Load(&work.cycles))
	}

	return out
}
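To see gcpercent in action from user code, the runtime/debug and runtime packages can be used directly (a minimal sketch; the exact NextGC value depends on the Go version and the live heap, so the printed numbers are only illustrative):

package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func main() {
	// Programmatic equivalent of the GOGC environment variable:
	// the next goal is roughly live heap * (1 + 200/100).
	old := debug.SetGCPercent(200)
	fmt.Println("previous GOGC value:", old)

	// Allocate something so the live heap is non-trivial.
	data := make([][]byte, 1024)
	for i := range data {
		data[i] = make([]byte, 1<<10)
	}
	runtime.GC() // complete a cycle so heap_marked reflects the live heap

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	fmt.Printf("HeapAlloc=%d KB NextGC=%d KB\n", ms.HeapAlloc>>10, ms.NextGC>>10)
	runtime.KeepAlive(data)
}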
3. Startup
After allocating heap memory for an object, the mallocgc function checks the GC trigger conditions and, depending on the current state, starts a collection or takes part in assist GC.
//malloc.go
1 // Allocate an object of size bytes. 2 // Small objects are allocated from the per-P cache's free lists. 3 // Large objects (> 32 kB) are allocated straight from the heap. 4 func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer { 5 if gcphase == _GCmarktermination { 6 throw("mallocgc called with gcphase == _GCmarktermination") 7 } 8 9 // ... 10 11 // assistG is the G to charge for this allocation, or nil if 12 // GC is not currently active. 13 var assistG *g 14 if gcBlackenEnabled != 0 { 15 //讓出資源 16 // Charge the current user G for this allocation. 17 assistG = getg() 18 if assistG.m.curg != nil { 19 assistG = assistG.m.curg 20 } 21 // Charge the allocation against the G. We'll account 22 // for internal fragmentation at the end of mallocgc. 23 assistG.gcAssistBytes -= int64(size) 24 25 if assistG.gcAssistBytes < 0 { 26 //輔助參與回收任務 27 // This G is in debt. Assist the GC to correct 28 // this before allocating. This must happen 29 // before disabling preemption. 30 gcAssistAlloc(assistG) 31 } 32 } 33 34 // Set mp.mallocing to keep from being preempted by GC. 35 mp := acquirem() 36 if mp.mallocing != 0 { 37 throw("malloc deadlock") 38 } 39 if mp.gsignal == getg() { 40 throw("malloc during signal") 41 } 42 mp.mallocing = 1 43 44 shouldhelpgc := false 45 dataSize := size 46 c := gomcache() 47 var x unsafe.Pointer 48 noscan := typ == nil || typ.kind&kindNoPointers != 0 49 50 //判斷對象大小 51 //…… 52 53 // Allocate black during GC. 54 // All slots hold nil so no scanning is needed. 55 // This may be racing with GC so do it atomically if there can be 56 // a race marking the bit. 57 if gcphase != _GCoff { 58 //直接分配黑色對象 59 gcmarknewobject(uintptr(x), size, scanSize) 60 } 61 62 if assistG != nil { 63 // Account for internal fragmentation in the assist 64 // debt now that we know it. 65 assistG.gcAssistBytes -= int64(size - dataSize) 66 } 67 //檢查垃圾回收觸發條件 68 if shouldhelpgc { 69 //啟動並發垃圾回收 70 if t := (gcTrigger{kind: gcTriggerHeap}); t.test() { 71 gcStart(t) 72 } 73 } 74 75 return x 76 }
Garbage collection runs fully concurrently by default, but concurrent marking and concurrent sweeping can be disabled through environment variables or debug settings. The GC goroutine loops forever and is woken up whenever a trigger condition is satisfied.
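The knobs mentioned above can also be exercised from user code (a small sketch; GODEBUG=gcstoptheworld=1/2 is the environment variable that degrades concurrent marking/sweeping to stop-the-world mode, while the calls below are standard library entry points):

package main

import (
	"runtime"
	"runtime/debug"
)

func main() {
	// Force a collection right now, regardless of the heap trigger
	// (the user-forced path recorded in work.userForced).
	runtime.GC()

	// Disable automatic collection entirely (same effect as GOGC=off)...
	old := debug.SetGCPercent(-1)

	// ...allocate with automatic GC disabled...
	_ = make([]byte, 1<<20)

	// ...and restore the previous setting.
	debug.SetGCPercent(old)
}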
gcStart
//mgc.go
1 func gcStart(mode gcMode, trigger gcTrigger) { 2 // Since this is called from malloc and malloc is called in 3 // the guts of a number of libraries that might be holding 4 // locks, don't attempt to start GC in non-preemptible or 5 // potentially unstable situations. 6 // 判斷當前g是否可以搶占,不可搶占時不觸發GC 7 mp := acquirem() 8 if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" { 9 releasem(mp) 10 return 11 } 12 releasem(mp) 13 mp = nil 14 15 // Pick up the remaining unswept/not being swept spans concurrently 16 // 17 // This shouldn't happen if we're being invoked in background 18 // mode since proportional sweep should have just finished 19 // sweeping everything, but rounding errors, etc, may leave a 20 // few spans unswept. In forced mode, this is necessary since 21 // GC can be forced at any point in the sweeping cycle. 22 // 23 // We check the transition condition continuously here in case 24 // this G gets delayed in to the next GC cycle. 25 // 清掃 殘留的未清掃的垃圾 26 for trigger.test() && gosweepone() != ^uintptr(0) { 27 sweep.nbgsweep++ 28 } 29 30 // Perform GC initialization and the sweep termination 31 // transition. 32 semacquire(&work.startSema) 33 // Re-check transition condition under transition lock. 34 // 判斷gcTrriger的條件是否成立 35 if !trigger.test() { 36 semrelease(&work.startSema) 37 return 38 } 39 40 // For stats, check if this GC was forced by the user 41 // 判斷並記錄GC是否被強制執行的,runtime.GC()可以被用戶調用並強制執行 42 work.userForced = trigger.kind == gcTriggerAlways || trigger.kind == gcTriggerCycle 43 44 // In gcstoptheworld debug mode, upgrade the mode accordingly. 45 // We do this after re-checking the transition condition so 46 // that multiple goroutines that detect the heap trigger don't 47 // start multiple STW GCs. 48 // 設置gc的mode 49 if mode == gcBackgroundMode { 50 if debug.gcstoptheworld == 1 { 51 mode = gcForceMode 52 } else if debug.gcstoptheworld == 2 { 53 mode = gcForceBlockMode 54 } 55 } 56 57 // Ok, we're doing it! Stop everybody else 58 semacquire(&worldsema) 59 60 if trace.enabled { 61 traceGCStart() 62 } 63 64 // Check that all Ps have finished deferred mcache flushes.
// 啟動后台標記任務
67 // 重置gc 標記相關的狀態 68 gcResetMarkState() 69 70 work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs 71 if work.stwprocs > ncpu { 72 // This is used to compute CPU time of the STW phases, 73 // so it can't be more than ncpu, even if GOMAXPROCS is. 74 work.stwprocs = ncpu 75 } 76 work.heap0 = atomic.Load64(&memstats.heap_live) 77 work.pauseNS = 0 78 work.mode = mode 79 80 now := nanotime() 81 work.tSweepTerm = now 82 work.pauseStart = now 83 if trace.enabled { 84 traceGCSTWStart(1) 85 } 86 // STW,停止世界 87 systemstack(stopTheWorldWithSema) 88 // Finish sweep before we start concurrent scan. 89 // 先清掃上一輪的垃圾,確保上輪GC完成 90 systemstack(func() { 91 finishsweep_m() 92 }) 93 // clearpools before we start the GC. If we wait they memory will not be 94 // reclaimed until the next GC cycle. 95 // 清理 sync.pool sched.sudogcache、sched.deferpool,這里不展開,sync.pool已經說了,剩余的后面的文章會涉及 96 clearpools() 97 98 // 增加GC技術 99 work.cycles++ 100 if mode == gcBackgroundMode { // Do as much work concurrently as possible 101 gcController.startCycle() 102 work.heapGoal = memstats.next_gc 103 104 // Enter concurrent mark phase and enable 105 // write barriers. 106 // 107 // Because the world is stopped, all Ps will 108 // observe that write barriers are enabled by 109 // the time we start the world and begin 110 // scanning. 111 // 112 // Write barriers must be enabled before assists are 113 // enabled because they must be enabled before 114 // any non-leaf heap objects are marked. Since 115 // allocations are blocked until assists can 116 // happen, we want enable assists as early as 117 // possible. 118 // 設置GC的狀態為 gcMark 119 setGCPhase(_GCmark) 120 121 // 更新 bgmark 的狀態 122 gcBgMarkPrepare() // Must happen before assist enable. 123 // 計算並排隊root 掃描任務,並初始化相關掃描任務狀態 124 gcMarkRootPrepare() 125 126 // Mark all active tinyalloc blocks. Since we're 127 // allocating from these, they need to be black like 128 // other allocations. The alternative is to blacken 129 // the tiny block on every allocation from it, which 130 // would slow down the tiny allocator. 131 // 標記 tiny 對象 132 gcMarkTinyAllocs() 133 134 // At this point all Ps have enabled the write 135 // barrier, thus maintaining the no white to 136 // black invariant. Enable mutator assists to 137 // put back-pressure on fast allocating 138 // mutators. 139 // 設置 gcBlackenEnabled 為 1,啟用寫屏障 140 atomic.Store(&gcBlackenEnabled, 1) 141 142 // Assists and workers can start the moment we start 143 // the world. 144 gcController.markStartTime = now 145 146 // Concurrent mark. 147 systemstack(func() { 148 now = startTheWorldWithSema(trace.enabled) 149 }) 150 work.pauseNS += now - work.pauseStart 151 work.tMark = now 152 } else { 153 // 非並行模式 154 // 記錄完成標記階段的開始時間 155 if trace.enabled { 156 // Switch to mark termination STW. 157 traceGCSTWDone() 158 traceGCSTWStart(0) 159 } 160 t := nanotime() 161 work.tMark, work.tMarkTerm = t, t 162 work.heapGoal = work.heap0 163 164 // Perform mark termination. This will restart the world. 165 // stw,進行標記,清掃並start the world 166 gcMarkTermination(memstats.triggerRatio) 167 } 168 169 semrelease(&work.startSema) 170 }
4. Concurrent marking
- Scan: traverse the relevant memory regions and, following pointers, find reachable white objects, mark them grey and add them to the queue;
- Mark: take grey objects off the queue, mark the objects they reference grey, and mark the object itself black.
gcBgMarkStartWorkers
This function prepares the mark worker goroutines that perform background marking, but they do not start working right away. They are bound to a P before the collection begins and then go to sleep; only when the GC phase is set to gcMark does the scheduler wake them up and they start working.
func gcBgMarkStartWorkers() {
	// Background marking is performed by per-P G's. Ensure that
	// each P has a background GC G.
	for _, p := range allp {
		if p.gcBgMarkWorker == 0 {
			go gcBgMarkWorker(p)
			// Wait for the gcBgMarkWorker goroutine to signal bgMarkReady before continuing.
			notetsleepg(&work.bgMarkReady, -1)
			noteclear(&work.bgMarkReady)
		}
	}
}
Mark workers have three working modes:
- gcMarkWorkerDedicatedMode: runs flat out until the concurrent mark phase ends;
- gcMarkWorkerFractionalMode: takes part in marking but can be preempted and rescheduled;
- gcMarkWorkerIdleMode: takes part in marking only when the P is otherwise idle.
gcBgMarkWorker
The function run by the background mark workers; workers in different modes treat the work very differently.
1 func gcBgMarkWorker(_p_ *p) { 2 gp := getg() 3 // 用於休眠結束后重新獲取p和m 4 type parkInfo struct { 5 m muintptr // Release this m on park. 6 attach puintptr // If non-nil, attach to this p on park. 7 } 8 // We pass park to a gopark unlock function, so it can't be on 9 // the stack (see gopark). Prevent deadlock from recursively 10 // starting GC by disabling preemption. 11 gp.m.preemptoff = "GC worker init" 12 park := new(parkInfo) 13 gp.m.preemptoff = "" 14 // 設置park的m和p的信息,留着后面傳給gopark,在被gcController.findRunnable喚醒的時候,便於找回 15 park.m.set(acquirem()) 16 park.attach.set(_p_) 17 // Inform gcBgMarkStartWorkers that this worker is ready. 18 // After this point, the background mark worker is scheduled 19 // cooperatively by gcController.findRunnable. Hence, it must 20 // never be preempted, as this would put it into _Grunnable 21 // and put it on a run queue. Instead, when the preempt flag 22 // is set, this puts itself into _Gwaiting to be woken up by 23 // gcController.findRunnable at the appropriate time. 24 // 讓gcBgMarkStartWorkers notetsleepg停止等待並繼續及退出 25 notewakeup(&work.bgMarkReady) 26 27 for { 28 // Go to sleep until woken by gcController.findRunnable. 29 // We can't releasem yet since even the call to gopark 30 // may be preempted. 31 // 讓g進入休眠 32 gopark(func(g *g, parkp unsafe.Pointer) bool { 33 park := (*parkInfo)(parkp) 34 35 // The worker G is no longer running, so it's 36 // now safe to allow preemption. 37 // 釋放當前搶占的m 38 releasem(park.m.ptr()) 39 40 // If the worker isn't attached to its P, 41 // attach now. During initialization and after 42 // a phase change, the worker may have been 43 // running on a different P. As soon as we 44 // attach, the owner P may schedule the 45 // worker, so this must be done after the G is 46 // stopped. 47 // 設置關聯p,上面已經設置過了 48 if park.attach != 0 { 49 p := park.attach.ptr() 50 park.attach.set(nil) 51 // cas the worker because we may be 52 // racing with a new worker starting 53 // on this P. 54 if !p.gcBgMarkWorker.cas(0, guintptr(unsafe.Pointer(g))) { 55 // The P got a new worker. 56 // Exit this worker. 57 return false 58 } 59 } 60 return true 61 }, unsafe.Pointer(park), waitReasonGCWorkerIdle, traceEvGoBlock, 0) 62 63 // Loop until the P dies and disassociates this 64 // worker (the P may later be reused, in which case 65 // it will get a new worker) or we failed to associate. 66 // 檢查P的gcBgMarkWorker是否和當前的G一致, 不一致時結束當前的任務 67 if _p_.gcBgMarkWorker.ptr() != gp { 68 break 69 } 70 71 // Disable preemption so we can use the gcw. If the 72 // scheduler wants to preempt us, we'll stop draining, 73 // dispose the gcw, and then preempt. 74 // gopark第一個函數中釋放了m,這里再搶占回來 75 park.m.set(acquirem()) 76 77 if gcBlackenEnabled == 0 { 78 throw("gcBgMarkWorker: blackening not enabled") 79 } 80 81 startTime := nanotime() 82 // 設置gcmark的開始時間 83 _p_.gcMarkWorkerStartTime = startTime 84 85 decnwait := atomic.Xadd(&work.nwait, -1) 86 if decnwait == work.nproc { 87 println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc) 88 throw("work.nwait was > work.nproc") 89 } 90 // 切換到g0工作 91 systemstack(func() { 92 // Mark our goroutine preemptible so its stack 93 // can be scanned. This lets two mark workers 94 // scan each other (otherwise, they would 95 // deadlock). We must not modify anything on 96 // the G stack. However, stack shrinking is 97 // disabled for mark workers, so it is safe to 98 // read from the G stack. 
99 // 設置G的狀態為waiting,以便於另一個g掃描它的棧(兩個g可以互相掃描對方的棧) 100 casgstatus(gp, _Grunning, _Gwaiting) 101 switch _p_.gcMarkWorkerMode { 102 default: 103 throw("gcBgMarkWorker: unexpected gcMarkWorkerMode") 104 case gcMarkWorkerDedicatedMode: 105 // 專心執行標記工作的模式 106 gcDrain(&_p_.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit) 107 if gp.preempt { 108 // 被搶占了,把所有本地運行隊列中的G放到全局運行隊列中 109 // We were preempted. This is 110 // a useful signal to kick 111 // everything out of the run 112 // queue so it can run 113 // somewhere else. 114 lock(&sched.lock) 115 for { 116 gp, _ := runqget(_p_) 117 if gp == nil { 118 break 119 } 120 globrunqput(gp) 121 } 122 unlock(&sched.lock) 123 } 124 // Go back to draining, this time 125 // without preemption. 126 // 繼續執行標記工作 127 gcDrain(&_p_.gcw, gcDrainNoBlock|gcDrainFlushBgCredit) 128 case gcMarkWorkerFractionalMode: 129 // 執行標記工作,知道被搶占 130 gcDrain(&_p_.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit) 131 case gcMarkWorkerIdleMode: 132 // 空閑的時候執行標記工作 133 gcDrain(&_p_.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit) 134 } 135 // 把G的waiting狀態轉換到runing狀態 136 casgstatus(gp, _Gwaiting, _Grunning) 137 }) 138 139 // If we are nearing the end of mark, dispose 140 // of the cache promptly. We must do this 141 // before signaling that we're no longer 142 // working so that other workers can't observe 143 // no workers and no work while we have this 144 // cached, and before we compute done. 145 // 及時處理本地緩存,上交到全局的隊列中 146 if gcBlackenPromptly { 147 _p_.gcw.dispose() 148 } 149 150 // Account for time. 151 // 累加耗時 152 duration := nanotime() - startTime 153 switch _p_.gcMarkWorkerMode { 154 case gcMarkWorkerDedicatedMode: 155 atomic.Xaddint64(&gcController.dedicatedMarkTime, duration) 156 atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 1) 157 case gcMarkWorkerFractionalMode: 158 atomic.Xaddint64(&gcController.fractionalMarkTime, duration) 159 atomic.Xaddint64(&_p_.gcFractionalMarkTime, duration) 160 case gcMarkWorkerIdleMode: 161 atomic.Xaddint64(&gcController.idleMarkTime, duration) 162 } 163 164 // Was this the last worker and did we run out 165 // of work? 166 incnwait := atomic.Xadd(&work.nwait, +1) 167 if incnwait > work.nproc { 168 println("runtime: p.gcMarkWorkerMode=", _p_.gcMarkWorkerMode, 169 "work.nwait=", incnwait, "work.nproc=", work.nproc) 170 throw("work.nwait > work.nproc") 171 } 172 173 // If this worker reached a background mark completion 174 // point, signal the main GC goroutine. 175 if incnwait == work.nproc && !gcMarkWorkAvailable(nil) { 176 // Make this G preemptible and disassociate it 177 // as the worker for this P so 178 // findRunnableGCWorker doesn't try to 179 // schedule it. 180 // 取消p m的關聯 181 _p_.gcBgMarkWorker.set(nil) 182 releasem(park.m.ptr()) 183 184 gcMarkDone() 185 186 // Disable preemption and prepare to reattach 187 // to the P. 188 // 189 // We may be running on a different P at this 190 // point, so we can't reattach until this G is 191 // parked. 192 park.m.set(acquirem()) 193 park.attach.set(_p_) 194 } 195 } 196 }
gcDrain
The main implementation of tri-color marking.
gcDrain scans roots and objects, blackening grey objects, until all roots and objects have been marked.
1 func gcDrain(gcw *gcWork, flags gcDrainFlags) { 2 if !writeBarrier.needed { 3 throw("gcDrain phase incorrect") 4 } 5 6 gp := getg().m.curg 7 // 看到搶占標識是否要返回 8 preemptible := flags&gcDrainUntilPreempt != 0 9 // 沒有任務時是否要等待任務 10 blocking := flags&(gcDrainUntilPreempt|gcDrainIdle|gcDrainFractional|gcDrainNoBlock) == 0 11 // 是否計算后台的掃描量來減少輔助GC和喚醒等待中的G 12 flushBgCredit := flags&gcDrainFlushBgCredit != 0 13 // 是否在空閑的時候執行標記任務 14 idle := flags&gcDrainIdle != 0 15 // 記錄初始的已經執行過的掃描任務 16 initScanWork := gcw.scanWork 17 18 // checkWork is the scan work before performing the next 19 // self-preempt check. 20 // 設置對應模式的工作檢查函數 21 checkWork := int64(1<<63 - 1) 22 var check func() bool 23 if flags&(gcDrainIdle|gcDrainFractional) != 0 { 24 checkWork = initScanWork + drainCheckThreshold 25 if idle { 26 check = pollWork 27 } else if flags&gcDrainFractional != 0 { 28 check = pollFractionalWorkerExit 29 } 30 } 31 32 // Drain root marking jobs. 33 // 如果root對象沒有掃描完,則掃描 34 if work.markrootNext < work.markrootJobs { 35 for !(preemptible && gp.preempt) { 36 job := atomic.Xadd(&work.markrootNext, +1) - 1 37 if job >= work.markrootJobs { 38 break 39 } 40 // 執行root掃描任務 41 markroot(gcw, job) 42 if check != nil && check() { 43 goto done 44 } 45 } 46 } 47 48 // Drain heap marking jobs. 49 // 循環直到被搶占 50 for !(preemptible && gp.preempt) { 51 // Try to keep work available on the global queue. We used to 52 // check if there were waiting workers, but it's better to 53 // just keep work available than to make workers wait. In the 54 // worst case, we'll do O(log(_WorkbufSize)) unnecessary 55 // balances. 56 if work.full == 0 { 57 // 平衡工作,如果全局的標記隊列為空,則分一部分工作到全局隊列中 58 gcw.balance() 59 } 60 61 var b uintptr 62 if blocking { 63 b = gcw.get() 64 } else { 65 b = gcw.tryGetFast() 66 if b == 0 { 67 b = gcw.tryGet() 68 } 69 } 70 // 獲取任務失敗,跳出循環 71 if b == 0 { 72 // work barrier reached or tryGet failed. 73 break 74 } 75 // 掃描獲取的到對象 76 scanobject(b, gcw) 77 78 // Flush background scan work credit to the global 79 // account if we've accumulated enough locally so 80 // mutator assists can draw on it. 81 // 如果當前掃描的數量超過了 gcCreditSlack,就把掃描的對象數量加到全局的數量,批量更新 82 if gcw.scanWork >= gcCreditSlack { 83 atomic.Xaddint64(&gcController.scanWork, gcw.scanWork) 84 if flushBgCredit { 85 gcFlushBgCredit(gcw.scanWork - initScanWork) 86 initScanWork = 0 87 } 88 checkWork -= gcw.scanWork 89 gcw.scanWork = 0 90 // 如果掃描的對象數量已經達到了 執行下次搶占的目標數量 checkWork, 則調用對應模式的函數 91 // idle模式為 pollWork, Fractional模式為 pollFractionalWorkerExit ,在第20行 92 if checkWork <= 0 { 93 checkWork += drainCheckThreshold 94 if check != nil && check() { 95 break 96 } 97 } 98 } 99 } 100 101 // In blocking mode, write barriers are not allowed after this 102 // point because we must preserve the condition that the work 103 // buffers are empty. 104 105 done: 106 // Flush remaining scan work credit. 107 if gcw.scanWork > 0 { 108 // 把掃描的對象數量添加到全局 109 atomic.Xaddint64(&gcController.scanWork, gcw.scanWork) 110 if flushBgCredit { 111 gcFlushBgCredit(gcw.scanWork - initScanWork) 112 } 113 gcw.scanWork = 0 114 } 115 }
When processing a grey object, the collector does not need to know its real size; it just treats it as an object block handed out by the memory allocator. Walking the block at pointer-size alignment, together with the bitmap marks, is enough to find every referenced member; each is pushed onto the queue as a grey object, and the current object itself naturally becomes black and is removed from the queue.
markroot
This function is used to scan root objects.
1 func markroot(gcw *gcWork, i uint32) { 2 // TODO(austin): This is a bit ridiculous. Compute and store 3 // the bases in gcMarkRootPrepare instead of the counts. 4 baseFlushCache := uint32(fixedRootCount) 5 baseData := baseFlushCache + uint32(work.nFlushCacheRoots) 6 baseBSS := baseData + uint32(work.nDataRoots) 7 baseSpans := baseBSS + uint32(work.nBSSRoots) 8 baseStacks := baseSpans + uint32(work.nSpanRoots) 9 end := baseStacks + uint32(work.nStackRoots) 10 11 // Note: if you add a case here, please also update heapdump.go:dumproots. 12 switch { 13 // 釋放mcache中的span 14 case baseFlushCache <= i && i < baseData: 15 flushmcache(int(i - baseFlushCache)) 16 // 掃描可讀寫的全局變量 17 case baseData <= i && i < baseBSS: 18 for _, datap := range activeModules() { 19 markrootBlock(datap.data, datap.edata-datap.data, datap.gcdatamask.bytedata, gcw, int(i-baseData)) 20 } 21 // 掃描只讀的全局隊列 22 case baseBSS <= i && i < baseSpans: 23 for _, datap := range activeModules() { 24 markrootBlock(datap.bss, datap.ebss-datap.bss, datap.gcbssmask.bytedata, gcw, int(i-baseBSS)) 25 } 26 // 掃描Finalizer隊列 27 case i == fixedRootFinalizers: 28 // Only do this once per GC cycle since we don't call 29 // queuefinalizer during marking. 30 if work.markrootDone { 31 break 32 } 33 for fb := allfin; fb != nil; fb = fb.alllink { 34 cnt := uintptr(atomic.Load(&fb.cnt)) 35 scanblock(uintptr(unsafe.Pointer(&fb.fin[0])), cnt*unsafe.Sizeof(fb.fin[0]), &finptrmask[0], gcw) 36 } 37 // 釋放已經終止的stack 38 case i == fixedRootFreeGStacks: 39 // Only do this once per GC cycle; preferably 40 // concurrently. 41 if !work.markrootDone { 42 // Switch to the system stack so we can call 43 // stackfree. 44 systemstack(markrootFreeGStacks) 45 } 46 // 掃描MSpan.specials 47 case baseSpans <= i && i < baseStacks: 48 // mark MSpan.specials 49 markrootSpans(gcw, int(i-baseSpans)) 50 51 default: 52 // the rest is scanning goroutine stacks 53 // 獲取需要掃描的g 54 var gp *g 55 if baseStacks <= i && i < end { 56 gp = allgs[i-baseStacks] 57 } else { 58 throw("markroot: bad index") 59 } 60 61 // remember when we've first observed the G blocked 62 // needed only to output in traceback 63 status := readgstatus(gp) // We are not in a scan state 64 if (status == _Gwaiting || status == _Gsyscall) && gp.waitsince == 0 { 65 gp.waitsince = work.tstart 66 } 67 68 // scang must be done on the system stack in case 69 // we're trying to scan our own stack. 70 // 轉交給g0進行掃描 71 systemstack(func() { 72 // If this is a self-scan, put the user G in 73 // _Gwaiting to prevent self-deadlock. It may 74 // already be in _Gwaiting if this is a mark 75 // worker or we're in mark termination. 76 userG := getg().m.curg 77 selfScan := gp == userG && readgstatus(userG) == _Grunning 78 // 如果是掃描自己的,則轉換自己的g的狀態 79 if selfScan { 80 casgstatus(userG, _Grunning, _Gwaiting) 81 userG.waitreason = waitReasonGarbageCollectionScan 82 } 83 84 // TODO: scang blocks until gp's stack has 85 // been scanned, which may take a while for 86 // running goroutines. Consider doing this in 87 // two phases where the first is non-blocking: 88 // we scan the stacks we can and ask running 89 // goroutines to scan themselves; and the 90 // second blocks. 91 // 掃描g的棧 92 scang(gp, gcw) 93 94 if selfScan { 95 casgstatus(userG, _Gwaiting, _Grunning) 96 } 97 }) 98 } 99 }
All of these scan paths ultimately go through scanblock, which checks the bitmap information to find valid pointers and adds their targets to the pending queue as reachable grey objects.
markrootBlock
Scans the region [b0, b0+n0) according to ptrmask0.
func markrootBlock(b0, n0 uintptr, ptrmask0 *uint8, gcw *gcWork, shard int) {
	if rootBlockBytes%(8*sys.PtrSize) != 0 {
		// This is necessary to pick byte offsets in ptrmask0.
		throw("rootBlockBytes must be a multiple of 8*ptrSize")
	}

	b := b0 + uintptr(shard)*rootBlockBytes
	// If the block to scan lies beyond b0+n0, return immediately.
	if b >= b0+n0 {
		return
	}
	ptrmask := (*uint8)(add(unsafe.Pointer(ptrmask0), uintptr(shard)*(rootBlockBytes/(8*sys.PtrSize))))
	n := uintptr(rootBlockBytes)
	if b+n > b0+n0 {
		n = b0 + n0 - b
	}

	// Scan this shard.
	scanblock(b, n, ptrmask, gcw)
}
func scanblock(b0, n0 uintptr, ptrmask *uint8, gcw *gcWork) {
	// Use local copies of original parameters, so that a stack trace
	// due to one of the throws below shows the original block
	// base and extent.
	b := b0
	n := n0

	for i := uintptr(0); i < n; {
		// Find bits for the next word.
		// Locate the corresponding bits in the bitmap.
		bits := uint32(*addb(ptrmask, i/(sys.PtrSize*8)))
		if bits == 0 {
			i += sys.PtrSize * 8
			continue
		}
		for j := 0; j < 8 && i < n; j++ {
			if bits&1 != 0 {
				// This address holds a pointer.
				// Same work as in scanobject; see comments there.
				obj := *(*uintptr)(unsafe.Pointer(b + i))
				if obj != 0 {
					// If an object is found at this address, shade it grey.
					if obj, span, objIndex := findObject(obj, b, i); obj != 0 {
						greyobject(obj, b, i, span, gcw, objIndex)
					}
				}
			}
			bits >>= 1
			i += sys.PtrSize
		}
	}
}
The gcWork here is a purpose-built high-performance queue: per-P local queues cooperate with the global work.full/partial lists to balance how work is distributed.
greyobject
Greying an object simply means finding the corresponding bitmap bits, marking the object as live, and throwing it onto the queue.
1 func greyobject(obj, base, off uintptr, span *mspan, gcw *gcWork, objIndex uintptr) { 2 // obj should be start of allocation, and so must be at least pointer-aligned. 3 if obj&(sys.PtrSize-1) != 0 { 4 throw("greyobject: obj not pointer-aligned") 5 } 6 mbits := span.markBitsForIndex(objIndex) 7 8 if useCheckmark { 9 // 這里是用來debug,確保所有的對象都被正確標識 10 if !mbits.isMarked() { 11 // 這個對象沒有被標記 12 printlock() 13 print("runtime:greyobject: checkmarks finds unexpected unmarked object obj=", hex(obj), "\n") 14 print("runtime: found obj at *(", hex(base), "+", hex(off), ")\n") 15 16 // Dump the source (base) object 17 gcDumpObject("base", base, off) 18 19 // Dump the object 20 gcDumpObject("obj", obj, ^uintptr(0)) 21 22 getg().m.traceback = 2 23 throw("checkmark found unmarked object") 24 } 25 hbits := heapBitsForAddr(obj) 26 if hbits.isCheckmarked(span.elemsize) { 27 return 28 } 29 hbits.setCheckmarked(span.elemsize) 30 if !hbits.isCheckmarked(span.elemsize) { 31 throw("setCheckmarked and isCheckmarked disagree") 32 } 33 } else { 34 if debug.gccheckmark > 0 && span.isFree(objIndex) { 35 print("runtime: marking free object ", hex(obj), " found at *(", hex(base), "+", hex(off), ")\n") 36 gcDumpObject("base", base, off) 37 gcDumpObject("obj", obj, ^uintptr(0)) 38 getg().m.traceback = 2 39 throw("marking free object") 40 } 41 42 // If marked we have nothing to do. 43 // 對象被正確標記了,無需做其他的操作 44 if mbits.isMarked() { 45 return 46 } 47 // mbits.setMarked() // Avoid extra call overhead with manual inlining. 48 // 標記對象 49 atomic.Or8(mbits.bytep, mbits.mask) 50 // If this is a noscan object, fast-track it to black 51 // instead of greying it. 52 // 如果對象不是指針,則只需要標記,不需要放進隊列,相當於直接標黑 53 if span.spanclass.noscan() { 54 gcw.bytesMarked += uint64(span.elemsize) 55 return 56 } 57 } 58 59 // Queue the obj for scanning. The PREFETCH(obj) logic has been removed but 60 // seems like a nice optimization that can be added back in. 61 // There needs to be time between the PREFETCH and the use. 62 // Previously we put the obj in an 8 element buffer that is drained at a rate 63 // to give the PREFETCH time to do its work. 64 // Use of PREFETCHNTA might be more appropriate than PREFETCH 65 // 判斷對象是否被放進隊列,沒有則放入,標灰步驟完成 66 if !gcw.putFast(obj) { 67 gcw.put(obj) 68 } 69 }
gcWork.putFast
A gcWork has two buffers, wbuf1 and wbuf2, for holding grey objects. Grey objects are first added to wbuf1; when wbuf1 fills up, wbuf1 and wbuf2 are swapped, so wbuf2 is promoted to wbuf1 and keeps receiving grey objects. When both buffers are full, a full buffer is handed to the global queue and an empty one is requested in exchange.
putFast only tries to put the object into wbuf1.
func (w *gcWork) putFast(obj uintptr) bool {
	wbuf := w.wbuf1
	if wbuf == nil {
		// No local buffer has been allocated yet; give up.
		return false
	} else if wbuf.nobj == len(wbuf.obj) {
		// wbuf1 is full; give up.
		return false
	}

	// Append the object to the non-full wbuf1.
	wbuf.obj[wbuf.nobj] = obj
	wbuf.nobj++
	return true
}
gcWork.put
put not only tries to place the object into wbuf1; when wbuf1 is full it swaps the roles of wbuf1 and wbuf2, and if both are full it hands the full buffer over to the global queue and requests an empty one.
func (w *gcWork) put(obj uintptr) {
	flushed := false
	wbuf := w.wbuf1
	if wbuf == nil {
		// wbuf1 does not exist yet; initialize both wbuf1 and wbuf2.
		w.init()
		wbuf = w.wbuf1
		// wbuf is empty at this point.
	} else if wbuf.nobj == len(wbuf.obj) {
		// wbuf1 is full; swap the roles of wbuf1 and wbuf2.
		w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1
		wbuf = w.wbuf1
		if wbuf.nobj == len(wbuf.obj) {
			// After the swap wbuf1 is still full, so both buffers are full.
			// Hand wbuf1 to the global full list and fetch an empty buffer.
			putfull(wbuf)
			wbuf = getempty()
			w.wbuf1 = wbuf
			// Remember that a buffer was flushed to the global list.
			flushed = true
		}
	}

	wbuf.obj[wbuf.nobj] = obj
	wbuf.nobj++

	// If we put a buffer on full, let the GC controller know so
	// it can encourage more workers to run. We delay this until
	// the end of put so that w is in a consistent state, since
	// enlistWorker may itself manipulate w.
	// The global list now has a full buffer, so the controller can schedule more workers.
	if flushed && gcphase == _GCmark {
		gcController.enlistWorker()
	}
}
gcw.balance()
Back to line 58 of gcDrain: what does balancing the work mean?
func (w *gcWork) balance() {
	if w.wbuf1 == nil {
		// wbuf1 and wbuf2 have not been initialized yet.
		return
	}
	// If wbuf2 is not empty, hand it to the global full list and take an empty buffer for wbuf2.
	if wbuf := w.wbuf2; wbuf.nobj != 0 {
		putfull(wbuf)
		w.wbuf2 = getempty()
	} else if wbuf := w.wbuf1; wbuf.nobj > 4 {
		// Split the non-full wbuf1 in half and hand one half to the global queue.
		w.wbuf1 = handoff(wbuf)
	} else {
		return
	}
	// We flushed a buffer to the full list, so wake a worker.
	// The global queue now has a full buffer, so other workers can get to work.
	if gcphase == _GCmark {
		gcController.enlistWorker()
	}
}
gcw.get()
Back to line 63 of gcDrain: get first tries to take an object from the local wbuf1; if wbuf1 is empty it tries wbuf2, and if both are empty it tries to fetch a full buffer from the global queue and take an object from it.
func (w *gcWork) get() uintptr {
	wbuf := w.wbuf1
	if wbuf == nil {
		w.init()
		wbuf = w.wbuf1
		// wbuf is empty at this point.
	}
	if wbuf.nobj == 0 {
		// wbuf1 is empty; swap the roles of wbuf1 and wbuf2.
		w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1
		wbuf = w.wbuf1
		// The old wbuf2 is empty too; try to fetch a full buffer from the global queue.
		if wbuf.nobj == 0 {
			owbuf := wbuf
			wbuf = getfull()
			// Nothing available; return.
			if wbuf == nil {
				return 0
			}
			// Hand the empty buffer to the global empty list and use the fetched full buffer as wbuf1.
			putempty(owbuf)
			w.wbuf1 = wbuf
		}
	}

	// TODO: This might be a good place to add prefetch code

	wbuf.nobj--
	return wbuf.obj[wbuf.nobj]
}
gcw.tryGet()
gcw.tryGetFast()
Their logic is similar and fairly simple, so we will not walk through them.
scanobject
Continuing at line 76 of gcDrain: b has been obtained at this point, and the queue starts being consumed by scanobject.
1 func scanobject(b uintptr, gcw *gcWork) { 2 // Find the bits for b and the size of the object at b. 3 // 4 // b is either the beginning of an object, in which case this 5 // is the size of the object to scan, or it points to an 6 // oblet, in which case we compute the size to scan below. 7 // 獲取b對應的bits 8 hbits := heapBitsForAddr(b) 9 // 獲取b所在的span 10 s := spanOfUnchecked(b) 11 n := s.elemsize 12 if n == 0 { 13 throw("scanobject n == 0") 14 } 15 // 對象過大,則切割后再掃描,maxObletBytes為128k 16 if n > maxObletBytes { 17 // Large object. Break into oblets for better 18 // parallelism and lower latency. 19 if b == s.base() { 20 // It's possible this is a noscan object (not 21 // from greyobject, but from other code 22 // paths), in which case we must *not* enqueue 23 // oblets since their bitmaps will be 24 // uninitialized. 25 // 如果不是指針,直接標記返回,相當於標黑了 26 if s.spanclass.noscan() { 27 // Bypass the whole scan. 28 gcw.bytesMarked += uint64(n) 29 return 30 } 31 32 // Enqueue the other oblets to scan later. 33 // Some oblets may be in b's scalar tail, but 34 // these will be marked as "no more pointers", 35 // so we'll drop out immediately when we go to 36 // scan those. 37 // 按maxObletBytes切割后放入到 隊列 38 for oblet := b + maxObletBytes; oblet < s.base()+s.elemsize; oblet += maxObletBytes { 39 if !gcw.putFast(oblet) { 40 gcw.put(oblet) 41 } 42 } 43 } 44 45 // Compute the size of the oblet. Since this object 46 // must be a large object, s.base() is the beginning 47 // of the object. 48 n = s.base() + s.elemsize - b 49 if n > maxObletBytes { 50 n = maxObletBytes 51 } 52 } 53 54 var i uintptr 55 for i = 0; i < n; i += sys.PtrSize { 56 // Find bits for this word. 57 // 獲取到對應的bits 58 if i != 0 { 59 // Avoid needless hbits.next() on last iteration. 60 hbits = hbits.next() 61 } 62 // Load bits once. See CL 22712 and issue 16973 for discussion. 63 bits := hbits.bits() 64 // During checkmarking, 1-word objects store the checkmark 65 // in the type bit for the one word. The only one-word objects 66 // are pointers, or else they'd be merged with other non-pointer 67 // data into larger allocations. 68 if i != 1*sys.PtrSize && bits&bitScan == 0 { 69 break // no more pointers in this object 70 } 71 // 不是指針,繼續 72 if bits&bitPointer == 0 { 73 continue // not a pointer 74 } 75 76 // Work here is duplicated in scanblock and above. 77 // If you make changes here, make changes there too. 78 obj := *(*uintptr)(unsafe.Pointer(b + i)) 79 80 // At this point we have extracted the next potential pointer. 81 // Quickly filter out nil and pointers back to the current object. 82 if obj != 0 && obj-b >= n { 83 // Test if obj points into the Go heap and, if so, 84 // mark the object. 85 // 86 // Note that it's possible for findObject to 87 // fail if obj points to a just-allocated heap 88 // object because of a race with growing the 89 // heap. In this case, we know the object was 90 // just allocated and hence will be marked by 91 // allocation itself. 92 // 找到指針對應的對象,並標灰 93 if obj, span, objIndex := findObject(obj, b, i); obj != 0 { 94 greyobject(obj, b, i, span, gcw, objIndex) 95 } 96 } 97 } 98 gcw.bytesMarked += uint64(n) 99 gcw.scanWork += int64(i) 100 }
Greying means marking and enqueueing; blackening just means marking. So once a grey object has been taken off the queue and scanned, it can be considered black.
That completes the analysis of gcDrain's marking work; we now return to gcBgMarkWorker, which calls gcMarkDone once the last worker runs out of work.
gcMarkDone
1 func gcMarkDone() { 2 top: 3 semacquire(&work.markDoneSema) 4 5 // Re-check transition condition under transition lock. 6 if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) { 7 semrelease(&work.markDoneSema) 8 return 9 } 10 11 // Disallow starting new workers so that any remaining workers 12 // in the current mark phase will drain out. 13 // 14 // TODO(austin): Should dedicated workers keep an eye on this 15 // and exit gcDrain promptly? 16 // 禁止新的標記任務 17 atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, -0xffffffff) 18 prevFractionalGoal := gcController.fractionalUtilizationGoal 19 gcController.fractionalUtilizationGoal = 0 20 21 // 如果gcBlackenPromptly表名需要所有本地緩存隊列立即上交到全局隊列,並禁用本地緩存隊列 22 if !gcBlackenPromptly { 23 // Transition from mark 1 to mark 2. 24 // 25 // The global work list is empty, but there can still be work 26 // sitting in the per-P work caches. 27 // Flush and disable work caches. 28 29 // Disallow caching workbufs and indicate that we're in mark 2. 30 // 禁用本地緩存隊列,進入mark2階段 31 gcBlackenPromptly = true 32 33 // Prevent completion of mark 2 until we've flushed 34 // cached workbufs. 35 atomic.Xadd(&work.nwait, -1) 36 37 // GC is set up for mark 2. Let Gs blocked on the 38 // transition lock go while we flush caches. 39 semrelease(&work.markDoneSema) 40 // 切換到g0執行,本地緩存上傳到全局的操作 41 systemstack(func() { 42 // Flush all currently cached workbufs and 43 // ensure all Ps see gcBlackenPromptly. This 44 // also blocks until any remaining mark 1 45 // workers have exited their loop so we can 46 // start new mark 2 workers. 47 forEachP(func(_p_ *p) { 48 wbBufFlush1(_p_) 49 _p_.gcw.dispose() 50 }) 51 }) 52 53 // Check that roots are marked. We should be able to 54 // do this before the forEachP, but based on issue 55 // #16083 there may be a (harmless) race where we can 56 // enter mark 2 while some workers are still scanning 57 // stacks. The forEachP ensures these scans are done. 58 // 59 // TODO(austin): Figure out the race and fix this 60 // properly. 61 // 檢查所有的root是否都被標記了 62 gcMarkRootCheck() 63 64 // Now we can start up mark 2 workers. 65 atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 0xffffffff) 66 gcController.fractionalUtilizationGoal = prevFractionalGoal 67 68 incnwait := atomic.Xadd(&work.nwait, +1) 69 // 如果沒有更多的任務,則執行第二次調用,從mark2階段轉換到mark termination階段 70 if incnwait == work.nproc && !gcMarkWorkAvailable(nil) { 71 // This loop will make progress because 72 // gcBlackenPromptly is now true, so it won't 73 // take this same "if" branch. 74 goto top 75 } 76 } else { 77 // Transition to mark termination. 78 now := nanotime() 79 work.tMarkTerm = now 80 work.pauseStart = now 81 getg().m.preemptoff = "gcing" 82 if trace.enabled { 83 traceGCSTWStart(0) 84 } 85 systemstack(stopTheWorldWithSema) 86 // The gcphase is _GCmark, it will transition to _GCmarktermination 87 // below. The important thing is that the wb remains active until 88 // all marking is complete. This includes writes made by the GC. 89 90 // Record that one root marking pass has completed. 91 work.markrootDone = true 92 93 // Disable assists and background workers. We must do 94 // this before waking blocked assists. 95 atomic.Store(&gcBlackenEnabled, 0) 96 97 // Wake all blocked assists. These will run when we 98 // start the world again. 99 // 喚醒所有的輔助GC 100 gcWakeAllAssists() 101 102 // Likewise, release the transition lock. Blocked 103 // workers and assists will run when we start the 104 // world again. 
105 semrelease(&work.markDoneSema) 106 107 // endCycle depends on all gcWork cache stats being 108 // flushed. This is ensured by mark 2. 109 // 計算下一次gc出發的閾值 110 nextTriggerRatio := gcController.endCycle() 111 112 // Perform mark termination. This will restart the world. 113 // start the world,並進入完成階段 114 gcMarkTermination(nextTriggerRatio) 115 } 116 }
gcMarkTermination
Ends the mark phase and carries out sweeping and related work.
1 func gcMarkTermination(nextTriggerRatio float64) { 2 // World is stopped. 3 // Start marktermination which includes enabling the write barrier. 4 atomic.Store(&gcBlackenEnabled, 0) 5 gcBlackenPromptly = false 6 // 設置GC的階段標識 7 setGCPhase(_GCmarktermination) 8 9 work.heap1 = memstats.heap_live 10 startTime := nanotime() 11 12 mp := acquirem() 13 mp.preemptoff = "gcing" 14 _g_ := getg() 15 _g_.m.traceback = 2 16 gp := _g_.m.curg 17 // 設置當前g的狀態為waiting狀態 18 casgstatus(gp, _Grunning, _Gwaiting) 19 gp.waitreason = waitReasonGarbageCollection 20 21 // Run gc on the g0 stack. We do this so that the g stack 22 // we're currently running on will no longer change. Cuts 23 // the root set down a bit (g0 stacks are not scanned, and 24 // we don't need to scan gc's internal state). We also 25 // need to switch to g0 so we can shrink the stack. 26 systemstack(func() { 27 // 通過g0掃描當前g的棧 28 gcMark(startTime) 29 // Must return immediately. 30 // The outer function's stack may have moved 31 // during gcMark (it shrinks stacks, including the 32 // outer function's stack), so we must not refer 33 // to any of its variables. Return back to the 34 // non-system stack to pick up the new addresses 35 // before continuing. 36 }) 37 38 systemstack(func() { 39 work.heap2 = work.bytesMarked 40 if debug.gccheckmark > 0 { 41 // Run a full stop-the-world mark using checkmark bits, 42 // to check that we didn't forget to mark anything during 43 // the concurrent mark process. 44 // 如果啟用了gccheckmark,則檢查所有可達對象是否都有標記 45 gcResetMarkState() 46 initCheckmarks() 47 gcMark(startTime) 48 clearCheckmarks() 49 } 50 51 // marking is complete so we can turn the write barrier off 52 // 設置gc的階段標識,GCoff時會關閉寫屏障 53 setGCPhase(_GCoff) 54 // 開始清掃 55 gcSweep(work.mode) 56 57 if debug.gctrace > 1 { 58 startTime = nanotime() 59 // The g stacks have been scanned so 60 // they have gcscanvalid==true and gcworkdone==true. 61 // Reset these so that all stacks will be rescanned. 62 gcResetMarkState() 63 finishsweep_m() 64 65 // Still in STW but gcphase is _GCoff, reset to _GCmarktermination 66 // At this point all objects will be found during the gcMark which 67 // does a complete STW mark and object scan. 68 setGCPhase(_GCmarktermination) 69 gcMark(startTime) 70 setGCPhase(_GCoff) // marking is done, turn off wb. 71 gcSweep(work.mode) 72 } 73 }) 74 75 _g_.m.traceback = 0 76 casgstatus(gp, _Gwaiting, _Grunning) 77 78 if trace.enabled { 79 traceGCDone() 80 } 81 82 // all done 83 mp.preemptoff = "" 84 85 if gcphase != _GCoff { 86 throw("gc done but gcphase != _GCoff") 87 } 88 89 // Update GC trigger and pacing for the next cycle. 90 // 更新下次出發gc的增長比 91 gcSetTriggerRatio(nextTriggerRatio) 92 93 // Update timing memstats 94 // 更新用時 95 now := nanotime() 96 sec, nsec, _ := time_now() 97 unixNow := sec*1e9 + int64(nsec) 98 work.pauseNS += now - work.pauseStart 99 work.tEnd = now 100 atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user 101 atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us 102 memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS) 103 memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow) 104 memstats.pause_total_ns += uint64(work.pauseNS) 105 106 // Update work.totaltime. 107 sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm) 108 // We report idle marking time below, but omit it from the 109 // overall utilization here since it's "free". 
110 markCpu := gcController.assistTime + gcController.dedicatedMarkTime + gcController.fractionalMarkTime 111 markTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm) 112 cycleCpu := sweepTermCpu + markCpu + markTermCpu 113 work.totaltime += cycleCpu 114 115 // Compute overall GC CPU utilization. 116 totalCpu := sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs) 117 memstats.gc_cpu_fraction = float64(work.totaltime) / float64(totalCpu) 118 119 // Reset sweep state. 120 // 重置清掃的狀態 121 sweep.nbgsweep = 0 122 sweep.npausesweep = 0 123 124 // 如果是強制開啟的gc,標識增加 125 if work.userForced { 126 memstats.numforcedgc++ 127 } 128 129 // Bump GC cycle count and wake goroutines waiting on sweep. 130 // 統計執行GC的次數然后喚醒等待清掃的G 131 lock(&work.sweepWaiters.lock) 132 memstats.numgc++ 133 injectglist(work.sweepWaiters.head.ptr()) 134 work.sweepWaiters.head = 0 135 unlock(&work.sweepWaiters.lock) 136 137 // Finish the current heap profiling cycle and start a new 138 // heap profiling cycle. We do this before starting the world 139 // so events don't leak into the wrong cycle. 140 mProf_NextCycle() 141 // start the world 142 systemstack(func() { startTheWorldWithSema(true) }) 143 144 // Flush the heap profile so we can start a new cycle next GC. 145 // This is relatively expensive, so we don't do it with the 146 // world stopped. 147 mProf_Flush() 148 149 // Prepare workbufs for freeing by the sweeper. We do this 150 // asynchronously because it can take non-trivial time. 151 prepareFreeWorkbufs() 152 153 // Free stack spans. This must be done between GC cycles. 154 systemstack(freeStackSpans) 155 156 // Print gctrace before dropping worldsema. As soon as we drop 157 // worldsema another cycle could start and smash the stats 158 // we're trying to print. 159 if debug.gctrace > 0 { 160 util := int(memstats.gc_cpu_fraction * 100) 161 162 var sbuf [24]byte 163 printlock() 164 print("gc ", memstats.numgc, 165 " @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ", 166 util, "%: ") 167 prev := work.tSweepTerm 168 for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} { 169 if i != 0 { 170 print("+") 171 } 172 print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev)))) 173 prev = ns 174 } 175 print(" ms clock, ") 176 for i, ns := range []int64{sweepTermCpu, gcController.assistTime, gcController.dedicatedMarkTime + gcController.fractionalMarkTime, gcController.idleMarkTime, markTermCpu} { 177 if i == 2 || i == 3 { 178 // Separate mark time components with /. 179 print("/") 180 } else if i != 0 { 181 print("+") 182 } 183 print(string(fmtNSAsMS(sbuf[:], uint64(ns)))) 184 } 185 print(" ms cpu, ", 186 work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ", 187 work.heapGoal>>20, " MB goal, ", 188 work.maxprocs, " P") 189 if work.userForced { 190 print(" (forced)") 191 } 192 print("\n") 193 printunlock() 194 } 195 196 semrelease(&worldsema) 197 // Careful: another GC cycle may start now. 198 199 releasem(mp) 200 mp = nil 201 202 // now that gc is done, kick off finalizer thread if needed 203 // 如果不是並行GC,則讓當前M開始調度 204 if !concurrentSweep { 205 // give the queued finalizers, if any, a chance to run 206 Gosched() 207 } 208 }
5. Sweeping
gcSweep
The sweep task:
1 func gcSweep(mode gcMode) { 2 if gcphase != _GCoff { 3 throw("gcSweep being done but phase is not GCoff") 4 } 5 6 lock(&mheap_.lock) 7 // sweepgen在每次GC之后都會增長2,每次GC之后sweepSpans的角色都會互換 8 mheap_.sweepgen += 2 9 mheap_.sweepdone = 0 10 if mheap_.sweepSpans[mheap_.sweepgen/2%2].index != 0 { 11 // We should have drained this list during the last 12 // sweep phase. We certainly need to start this phase 13 // with an empty swept list. 14 throw("non-empty swept list") 15 } 16 mheap_.pagesSwept = 0 17 unlock(&mheap_.lock) 18 // 如果不是並行GC,或者強制GC 19 if !_ConcurrentSweep || mode == gcForceBlockMode { 20 // Special case synchronous sweep. 21 // Record that no proportional sweeping has to happen. 22 lock(&mheap_.lock) 23 mheap_.sweepPagesPerByte = 0 24 unlock(&mheap_.lock) 25 // Sweep all spans eagerly. 26 // 清掃所有的span 27 for sweepone() != ^uintptr(0) { 28 sweep.npausesweep++ 29 } 30 // Free workbufs eagerly. 31 // 釋放所有的 workbufs 32 prepareFreeWorkbufs() 33 for freeSomeWbufs(false) { 34 } 35 // All "free" events for this mark/sweep cycle have 36 // now happened, so we can make this profile cycle 37 // available immediately. 38 mProf_NextCycle() 39 mProf_Flush() 40 return 41 } 42 43 // Background sweep. 44 lock(&sweep.lock) 45 // 喚醒后台清掃任務,也就是 bgsweep 函數,清掃流程跟上面非並行清掃差不多 46 if sweep.parked { 47 sweep.parked = false 48 ready(sweep.g, 0, true) 49 } 50 unlock(&sweep.lock) 51 }
Concurrent sweeping is likewise done by a dedicated goroutine, created while runtime.main is running (see gcenable above).
sweepone
Next we look at the sweep flow in sweepone.
1 func sweepone() uintptr { 2 _g_ := getg() 3 sweepRatio := mheap_.sweepPagesPerByte // For debugging 4 5 // increment locks to ensure that the goroutine is not preempted 6 // in the middle of sweep thus leaving the span in an inconsistent state for next GC 7 _g_.m.locks++ 8 // 檢查是否已經完成了清掃 9 if atomic.Load(&mheap_.sweepdone) != 0 { 10 _g_.m.locks-- 11 return ^uintptr(0) 12 } 13 // 增加清掃的worker數量 14 atomic.Xadd(&mheap_.sweepers, +1) 15 16 npages := ^uintptr(0) 17 sg := mheap_.sweepgen 18 for { 19 // 循環獲取需要清掃的span 20 s := mheap_.sweepSpans[1-sg/2%2].pop() 21 if s == nil { 22 atomic.Store(&mheap_.sweepdone, 1) 23 break 24 } 25 if s.state != mSpanInUse { 26 // This can happen if direct sweeping already 27 // swept this span, but in that case the sweep 28 // generation should always be up-to-date. 29 if s.sweepgen != sg { 30 print("runtime: bad span s.state=", s.state, " s.sweepgen=", s.sweepgen, " sweepgen=", sg, "\n") 31 throw("non in-use span in unswept list") 32 } 33 continue 34 } 35 // sweepgen == h->sweepgen - 2, 表示這個span需要清掃 36 // sweepgen == h->sweepgen - 1, 表示這個span正在被清掃 37 // 這是里確定span的狀態及嘗試轉換span的狀態 38 if s.sweepgen != sg-2 || !atomic.Cas(&s.sweepgen, sg-2, sg-1) { 39 continue 40 } 41 npages = s.npages 42 // 單個span的清掃 43 if !s.sweep(false) { 44 // Span is still in-use, so this returned no 45 // pages to the heap and the span needs to 46 // move to the swept in-use list. 47 npages = 0 48 } 49 break 50 } 51 52 // Decrement the number of active sweepers and if this is the 53 // last one print trace information. 54 // 當前worker清掃任務完成,更新sweepers的數量 55 if atomic.Xadd(&mheap_.sweepers, -1) == 0 && atomic.Load(&mheap_.sweepdone) != 0 { 56 if debug.gcpacertrace > 0 { 57 print("pacer: sweep done at heap size ", memstats.heap_live>>20, "MB; allocated ", (memstats.heap_live-mheap_.sweepHeapLiveBasis)>>20, "MB during sweep; swept ", mheap_.pagesSwept, " pages at ", sweepRatio, " pages/byte\n") 58 } 59 } 60 _g_.m.locks-- 61 return npages 62 }
mspan.sweep
1 func (s *mspan) sweep(preserve bool) bool { 2 // It's critical that we enter this function with preemption disabled, 3 // GC must not start while we are in the middle of this function. 4 _g_ := getg() 5 if _g_.m.locks == 0 && _g_.m.mallocing == 0 && _g_ != _g_.m.g0 { 6 throw("MSpan_Sweep: m is not locked") 7 } 8 sweepgen := mheap_.sweepgen 9 // 只有正在清掃中狀態的span才可以正常執行 10 if s.state != mSpanInUse || s.sweepgen != sweepgen-1 { 11 print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n") 12 throw("MSpan_Sweep: bad span state") 13 } 14 15 if trace.enabled { 16 traceGCSweepSpan(s.npages * _PageSize) 17 } 18 // 先更新清掃的page數 19 atomic.Xadd64(&mheap_.pagesSwept, int64(s.npages)) 20 21 spc := s.spanclass 22 size := s.elemsize 23 res := false 24 25 c := _g_.m.mcache 26 freeToHeap := false 27 28 // The allocBits indicate which unmarked objects don't need to be 29 // processed since they were free at the end of the last GC cycle 30 // and were not allocated since then. 31 // If the allocBits index is >= s.freeindex and the bit 32 // is not marked then the object remains unallocated 33 // since the last GC. 34 // This situation is analogous to being on a freelist. 35 36 // Unlink & free special records for any objects we're about to free. 37 // Two complications here: 38 // 1. An object can have both finalizer and profile special records. 39 // In such case we need to queue finalizer for execution, 40 // mark the object as live and preserve the profile special. 41 // 2. A tiny object can have several finalizers setup for different offsets. 42 // If such object is not marked, we need to queue all finalizers at once. 43 // Both 1 and 2 are possible at the same time. 44 specialp := &s.specials 45 special := *specialp 46 // 判斷在special中的對象是否存活,是否至少有一個finalizer,釋放沒有finalizer的對象,把有finalizer的對象組成隊列 47 for special != nil { 48 // A finalizer can be set for an inner byte of an object, find object beginning. 49 objIndex := uintptr(special.offset) / size 50 p := s.base() + objIndex*size 51 mbits := s.markBitsForIndex(objIndex) 52 if !mbits.isMarked() { 53 // This object is not marked and has at least one special record. 54 // Pass 1: see if it has at least one finalizer. 55 hasFin := false 56 endOffset := p - s.base() + size 57 for tmp := special; tmp != nil && uintptr(tmp.offset) < endOffset; tmp = tmp.next { 58 if tmp.kind == _KindSpecialFinalizer { 59 // Stop freeing of object if it has a finalizer. 60 mbits.setMarkedNonAtomic() 61 hasFin = true 62 break 63 } 64 } 65 // Pass 2: queue all finalizers _or_ handle profile record. 66 for special != nil && uintptr(special.offset) < endOffset { 67 // Find the exact byte for which the special was setup 68 // (as opposed to object beginning). 69 p := s.base() + uintptr(special.offset) 70 if special.kind == _KindSpecialFinalizer || !hasFin { 71 // Splice out special record. 72 y := special 73 special = special.next 74 *specialp = special 75 freespecial(y, unsafe.Pointer(p), size) 76 } else { 77 // This is profile record, but the object has finalizers (so kept alive). 78 // Keep special record. 79 specialp = &special.next 80 special = *specialp 81 } 82 } 83 } else { 84 // object is still live: keep special record 85 specialp = &special.next 86 special = *specialp 87 } 88 } 89 90 if debug.allocfreetrace != 0 || raceenabled || msanenabled { 91 // Find all newly freed objects. This doesn't have to 92 // efficient; allocfreetrace has massive overhead. 
93 mbits := s.markBitsForBase() 94 abits := s.allocBitsForIndex(0) 95 for i := uintptr(0); i < s.nelems; i++ { 96 if !mbits.isMarked() && (abits.index < s.freeindex || abits.isMarked()) { 97 x := s.base() + i*s.elemsize 98 if debug.allocfreetrace != 0 { 99 tracefree(unsafe.Pointer(x), size) 100 } 101 if raceenabled { 102 racefree(unsafe.Pointer(x), size) 103 } 104 if msanenabled { 105 msanfree(unsafe.Pointer(x), size) 106 } 107 } 108 mbits.advance() 109 abits.advance() 110 } 111 } 112 113 // Count the number of free objects in this span. 114 // 獲取需要釋放的alloc對象的總數 115 nalloc := uint16(s.countAlloc()) 116 // 如果sizeclass為0,卻分配的總數量為0,則釋放到mheap 117 if spc.sizeclass() == 0 && nalloc == 0 { 118 s.needzero = 1 119 freeToHeap = true 120 } 121 nfreed := s.allocCount - nalloc 122 if nalloc > s.allocCount { 123 print("runtime: nelems=", s.nelems, " nalloc=", nalloc, " previous allocCount=", s.allocCount, " nfreed=", nfreed, "\n") 124 throw("sweep increased allocation count") 125 } 126 127 s.allocCount = nalloc 128 // 判斷span是否empty 129 wasempty := s.nextFreeIndex() == s.nelems 130 // 重置freeindex 131 s.freeindex = 0 // reset allocation index to start of span. 132 if trace.enabled { 133 getg().m.p.ptr().traceReclaimed += uintptr(nfreed) * s.elemsize 134 } 135 136 // gcmarkBits becomes the allocBits. 137 // get a fresh cleared gcmarkBits in preparation for next GC 138 // 重置 allocBits為 gcMarkBits 139 s.allocBits = s.gcmarkBits 140 // 重置 gcMarkBits 141 s.gcmarkBits = newMarkBits(s.nelems) 142 143 // Initialize alloc bits cache. 144 // 更新allocCache 145 s.refillAllocCache(0) 146 147 // We need to set s.sweepgen = h.sweepgen only when all blocks are swept, 148 // because of the potential for a concurrent free/SetFinalizer. 149 // But we need to set it before we make the span available for allocation 150 // (return it to heap or mcentral), because allocation code assumes that a 151 // span is already swept if available for allocation. 152 if freeToHeap || nfreed == 0 { 153 // The span must be in our exclusive ownership until we update sweepgen, 154 // check for potential races. 155 if s.state != mSpanInUse || s.sweepgen != sweepgen-1 { 156 print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n") 157 throw("MSpan_Sweep: bad span state after sweep") 158 } 159 // Serialization point. 160 // At this point the mark bits are cleared and allocation ready 161 // to go so release the span. 162 atomic.Store(&s.sweepgen, sweepgen) 163 } 164 165 if nfreed > 0 && spc.sizeclass() != 0 { 166 c.local_nsmallfree[spc.sizeclass()] += uintptr(nfreed) 167 // 把span釋放到mcentral上 168 res = mheap_.central[spc].mcentral.freeSpan(s, preserve, wasempty) 169 // MCentral_FreeSpan updates sweepgen 170 } else if freeToHeap { 171 // 這里是大對象的span釋放,與117行呼應 172 // Free large span to heap 173 174 // NOTE(rsc,dvyukov): The original implementation of efence 175 // in CL 22060046 used SysFree instead of SysFault, so that 176 // the operating system would eventually give the memory 177 // back to us again, so that an efence program could run 178 // longer without running out of memory. Unfortunately, 179 // calling SysFree here without any kind of adjustment of the 180 // heap data structures means that when the memory does 181 // come back to us, we have the wrong metadata for it, either in 182 // the MSpan structures or in the garbage collection bitmap. 
183 // Using SysFault here means that the program will run out of 184 // memory fairly quickly in efence mode, but at least it won't 185 // have mysterious crashes due to confused memory reuse. 186 // It should be possible to switch back to SysFree if we also 187 // implement and then call some kind of MHeap_DeleteSpan. 188 if debug.efence > 0 { 189 s.limit = 0 // prevent mlookup from finding this span 190 sysFault(unsafe.Pointer(s.base()), size) 191 } else { 192 // 把sapn釋放到mheap上 193 mheap_.freeSpan(s, 1) 194 } 195 c.local_nlargefree++ 196 c.local_largefree += size 197 res = true 198 } 199 if !res { 200 // The span has been swept and is still in-use, so put 201 // it on the swept in-use list. 202 // 如果span未釋放到mcentral或mheap,表示span仍然處於in-use狀態 203 mheap_.sweepSpans[sweepgen/2%2].push(s) 204 } 205 return res 206 }
Concurrent sweeping is essentially an endless loop: once woken, it starts executing sweep work, walking all spans and triggering the allocator's reclamation; when the work is done it goes back to sleep and waits for the next cycle.
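Conceptually the background sweeper behaves like the sketch below; this is a simplification for illustration, with wake and sweepOne standing in for the runtime's parking mechanism and sweepone, not the actual bgsweep code:

package main

import (
	"runtime"
	"time"
)

// Simplified model of the background sweeper: loop forever, sweep spans
// one at a time while there is work, then park until the next GC cycle
// provides new sweep work.
func bgsweepLoop(wake <-chan struct{}, sweepOne func() bool) {
	for {
		// Sweep spans until sweepOne reports nothing is left, yielding
		// between spans so the sweeper stays low priority (the real
		// bgsweep calls Gosched between spans for the same reason).
		for sweepOne() {
			runtime.Gosched()
		}
		// Everything for this cycle is swept; sleep until woken again.
		<-wake
	}
}

func main() {
	wake := make(chan struct{})
	remaining := 3 // pretend three spans need sweeping this cycle
	go bgsweepLoop(wake, func() bool {
		remaining--
		return remaining >= 0
	})
	time.Sleep(100 * time.Millisecond) // let the fake sweep run, then exit
}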
6. Collection flow
Go's GC is a concurrent GC: most of the GC work runs at the same time as ordinary Go code, which makes the overall flow fairly complex.
The GC has four phases:
- Sweep Termination: sweep any spans left unswept; a new cycle can only start once the previous cycle's sweeping has finished
- Mark: scan all root objects and every object reachable from them, marking them so they are not collected
- Mark Termination: finish the marking work and rescan part of the roots (requires STW)
- Sweep: sweep spans according to the mark results
During GC there are two kinds of background tasks (Gs): mark tasks and sweep tasks. Mark tasks are started when needed, and the number that can run at the same time is about 25% of the number of Ps, which is where Go's claim of spending 25% of the CPU on GC comes from. A single sweep task is started when the program starts and is woken up when the sweep phase begins.
Currently the whole GC flow stops the world (STW) twice: the first time at the start of the Mark phase and the second at Mark Termination. The first STW prepares the root scan and enables the write barrier and mutator assists. The second STW rescans part of the roots and disables the write barrier and mutator assists. Note that not all root scanning requires STW; scanning the objects on a stack, for example, only requires stopping the G that owns that stack. The hybrid write barrier greatly reduces the time of the second STW.
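To get a feel for these pauses in a running program, the standard runtime.MemStats counters can be read directly (a small sketch; the indexing of PauseNs follows the runtime documentation, and the actual values depend entirely on the workload):

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Create some garbage and force a complete cycle.
	for i := 0; i < 1000; i++ {
		_ = make([]byte, 1<<16)
	}
	runtime.GC()

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	// PauseNs is a circular buffer of recent stop-the-world pause times;
	// the most recent one is at PauseNs[(NumGC+255)%256].
	last := ms.PauseNs[(ms.NumGC+255)%256]
	fmt.Printf("GC cycles: %d, last pause: %d ns, total pause: %d ns\n",
		ms.NumGC, last, ms.PauseTotalNs)
}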
7. Monitoring
Scenario: a service restarts and a huge number of clients reconnect, allocating a large number of objects in a burst; this pushes the GC trigger threshold next_gc up to a very large value. After the service returns to normal, the live heap is far below that threshold, so garbage collection is not triggered for a long time, and a large number of white objects in the process cannot be reclaimed, causing a hidden memory leak. The same thing can happen when some code path churns through a large number of temporary objects in a short period.
Example of the scenario:
//testms.go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func test() {
	type M [1 << 10]byte
	data := make([]*M, 1024*20)

	// Allocate 20 MB, exceeding the initial threshold and pushing next_gc up.
	for i := range data {
		data[i] = new(M)
	}

	// Drop the references, so inlining cannot extend data's lifetime.
	for i := range data {
		data[i] = nil
	}
}

func main() {
	test()
	for {
		var ms runtime.MemStats
		runtime.ReadMemStats(&ms)
		fmt.Printf("%s %d MB\n", time.Now().Format("15:04:05"), ms.NextGC>>20)

		time.Sleep(time.Second * 30)
	}
}
Compile and run it: the test() function simulates allocating a large number of objects over a short period.
The output shows that for quite a while after test() returns, no garbage collection is triggered; only when forcegc steps in is next_gc brought back to normal. Forced GC is the last line of defense: the monitoring task sysmon checks the GC state every two minutes, and if more than two minutes have passed without a collection, it forces one.
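If waiting up to two minutes for forcegc is not acceptable, an application can impose its own ceiling by forcing collections periodically; the sketch below simply calls runtime.GC on a hypothetical interval, it is not how sysmon itself works:

package main

import (
	"runtime"
	"time"
)

// forceGCEvery forces a garbage collection at a fixed interval, as a
// safety net against a next_gc threshold that was pushed too high by a
// burst of allocations.
func forceGCEvery(interval time.Duration) {
	go func() {
		for range time.Tick(interval) {
			runtime.GC()
		}
	}()
}

func main() {
	forceGCEvery(30 * time.Second) // hypothetical interval for illustration
	select {}                      // stand-in for the real service loop
}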