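I ran into this problem: during a local deployment the service failed to start with java.lang.OutOfMemoryError: GC overhead limit exceeded. The logs showed that too many resources were being loaded into memory, and my local machine is not very fast either, so a lot of time was going into GC. The fix has two parts: add the flag -XX:-UseGCOverheadLimit to turn the check off, and at the same time increase the heap size with -Xmx1024m. Pit filled, but why?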
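To confirm that the two flags actually reached the JVM, a small check like the following can help. This is only a sketch: the class name HeapSettingsReport and its methods are made up for illustration, while Runtime.maxMemory() and RuntimeMXBean.getInputArguments() are standard JDK calls.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper (name is illustrative) that reports the heap limit
// and the command-line flags the running JVM actually received.
class HeapSettingsReport {

    // Maximum heap the JVM will attempt to use, in MiB (roughly the -Xmx value).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // True if the exact flag string was passed to this JVM on startup.
    static boolean hasJvmFlag(String flag) {
        List<String> args = ManagementFactory.getRuntimeMXBean().getInputArguments();
        return args.contains(flag);
    }

    public static void main(String[] argv) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("overhead limit disabled: "
            + hasJvmFlag("-XX:-UseGCOverheadLimit"));
    }
}
```

Run it with the same options as the service, for example java -XX:-UseGCOverheadLimit -Xmx1024m HeapSettingsReport. The reported maximum can be somewhat below the -Xmx value, depending on the collector.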
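Everyone knows OOM: the JVM has run out of memory. But what about GC overhead limit exceeded?
The GC overhead limit exceeded check is a policy defined by the HotSpot VM in JDK 1.6. It uses GC time statistics to predict that an OOM is coming and throws the error early, before the heap is completely exhausted. Sun's official definition is: "The parallel/concurrent collector will throw an OutOfMemoryError if too much time is being spent in garbage collection. Too much means that more than 98% of the time is spent in GC and less than 2% of the heap is recovered. This prevents an application from limping along because the heap is too small."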
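The 98% / 2% rule can be approximated from inside a running JVM. The sketch below is my own illustration, not what HotSpot does internally: the class GcTimeFraction and its methods are hypothetical, and the GC fraction is a crude lifetime average built from GarbageCollectorMXBean, whereas HotSpot tracks a cost per recent collections.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: rough view of how much time the JVM has spent in GC.
class GcTimeFraction {

    // Accumulated GC time divided by JVM uptime (both in milliseconds).
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) gcMillis / uptime;
    }

    // The rule as quoted: more than 98% of time in GC, less than 2% of heap recovered.
    static boolean looksLikeOverheadLimit(double gcFraction, double heapFreedFraction) {
        return gcFraction > 0.98 && heapFreedFraction < 0.02;
    }

    public static void main(String[] args) {
        System.out.println("GC time fraction so far: " + gcTimeFraction());
    }
}
```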
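Sounds pretty useless... what good is predicting an OOM? At first glance it looks like something you can only catch in order to release memory and keep the application from dying. Later I found that in most cases this policy cannot save your application, but it does give you one last struggle before the application goes down, for example saving data or preserving the scene (a heap dump).
On top of that, the policy sometimes causes problems of its own, such as frequent OOMs while loading a large chunk of data into memory.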
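The "last struggle" idea could look roughly like this. It is a minimal sketch under my own assumptions: LastDitchHandler, runWithRescue, and the saveState callback are hypothetical names, and real rescue code should allocate as little as possible because the heap is already nearly full.

```java
// Hypothetical wrapper: run a task and, if it dies with OutOfMemoryError
// (including "GC overhead limit exceeded"), give the caller one chance to
// save state before giving up.
class LastDitchHandler {

    // Message of the most recent OutOfMemoryError seen, or null if none.
    static volatile String lastFailure;

    static boolean runWithRescue(Runnable task, Runnable saveState) {
        try {
            task.run();
            return true;
        } catch (OutOfMemoryError e) {
            // Keep this path cheap: no big buffers, no heavy logging.
            lastFailure = e.getMessage();
            saveState.run();
            return false;
        }
    }
}
```

A caller would pass the memory-hungry work as task and something like "flush pending records to disk" as saveState. Whether the rescue succeeds is not guaranteed, since the JVM may be unable to allocate even the little that saveState needs.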
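If you hit this problem in production, do not just guess and work around it before you know the cause. Use -verbose:gc -XX:+PrintGCDetails to see what is actually triggering the error (on JDK 9 and later, -Xlog:gc* is the unified-logging equivalent). The usual cause is that the old generation is too full, which leads to frequent Full GCs and eventually to GC overhead limit exceeded. If the GC log is not enough, a tool such as JProfiler can show how memory is being used and whether the old generation has a leak. Another way to analyze a leak is -XX:+HeapDumpOnOutOfMemoryError, which makes the JVM write a heap dump automatically on OOM so that you can investigate it with MAT. Keep an eye on the young generation as well: allocating too many short-lived objects can also trigger this error.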
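Before waiting for the next OOM, it can be worth checking that the diagnostic flags are really in effect. The sketch below assumes a HotSpot-based JVM and uses the JDK's com.sun.management.HotSpotDiagnosticMXBean; the class DiagnosticFlags and the "unavailable" fallback string are my own illustrative choices.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: read the current value of a HotSpot VM option.
class DiagnosticFlags {

    // Returns the option's value as a string, or "unavailable" if the option
    // does not exist on this JVM or the diagnostic bean cannot be obtained.
    static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
            + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("UseGCOverheadLimit = " + vmOption("UseGCOverheadLimit"));
    }
}
```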
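The log output is not hard to read: every GC writes one line recording the type of GC, the sizes before and after, and the time taken. For example: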
33.125: [GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]
100.667:[Full GC [Tenured: 0K->210K(10240K), 0.0149142 secs] 4603K->210K(19456K), [Perm : 2999K->2999K(21248K)], 0.0150007 secs]
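GC and Full GC indicate the pause type of the collection; Full GC means a stop-the-world collection. The numbers on either side of each arrow are the occupancy of a region before and after the GC, for the young, tenured, and perm regions respectively, and the number in parentheses is that region's total capacity. The unlabeled pair of numbers is the same information for the whole heap. The number before the colon is the time at which the GC happened, in seconds since JVM startup. DefNew stands for the Serial collector and is short for Default New Generation; similarly, PSYoungGen stands for the Parallel Scavenge collector. Reading the log this way lets you find what is causing GC overhead limit exceeded and fix it by tuning the relevant parameters.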
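When there are many such lines, pulling the per-region numbers out mechanically is easier than reading them by eye. The following is a small sketch that handles only the old-style format shown in the two sample lines; the class GcLogRegions and its regular expression are my own and will not match other collectors' or unified-logging output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "Name: beforeK->afterK(capacityK)" fragments.
class GcLogRegions {

    static final class Region {
        final String name;
        final long beforeK;
        final long afterK;
        final long capacityK;

        Region(String name, long beforeK, long afterK, long capacityK) {
            this.name = name;
            this.beforeK = beforeK;
            this.afterK = afterK;
            this.capacityK = capacityK;
        }
    }

    // Region name, optional spaces around the colon, then before->after(capacity).
    private static final Pattern REGION =
        Pattern.compile("(\\w+)\\s*:\\s*(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns the labeled regions in the order they appear on the line.
    static List<Region> parse(String line) {
        List<Region> regions = new ArrayList<>();
        Matcher m = REGION.matcher(line);
        while (m.find()) {
            regions.add(new Region(
                m.group(1),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4))));
        }
        return regions;
    }
}
```

A Tenured region whose afterK stays close to capacityK across several Full GC lines is the pattern described above: the old generation is full and collections are recovering very little.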
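Terms used in this post:
Eden Space: heap memory pool where most objects are first allocated.
Survivor Space: heap memory pool holding objects that survived a GC of the Eden Space.
Tenured Generation: heap memory pool holding objects that have survived several GCs in the Survivor Space.
Permanent Generation: non-heap space that stores class and method objects.
Code Cache: non-heap space that the JVM uses to compile and store native code.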
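The pools in this glossary can be listed from a running JVM through the standard MemoryPoolMXBean API. This is a minimal sketch with a made-up class name, MemoryPoolList; the pool names it prints depend on the JDK version and collector, so they may not match the names above exactly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one line per memory pool with its type and usage.
class MemoryPoolList {

    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedK = usage == null ? -1 : usage.getUsed() / 1024;
            lines.add(pool.getName() + " (" + kind + "), used " + usedK + "K");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```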
Finally, here is HotSpot's implementation of the GC overhead limit exceeded check:
  bool print_gc_overhead_limit_would_be_exceeded = false;
  if (is_full_gc) {
    if (gc_cost() > gc_cost_limit &&
        free_in_old_gen < (size_t) mem_free_old_limit &&
        free_in_eden < (size_t) mem_free_eden_limit) {
      // Collections, on average, are taking too much time, and
      //      gc_cost() > gc_cost_limit
      // we have too little space available after a full gc.
      //      total_free_limit < mem_free_limit
      // where
      //   total_free_limit is the free space available in
      //     both generations
      //   total_mem is the total space available for allocation
      //     in both generations (survivor spaces are not included
      //     just as they are not included in eden_limit).
      //   mem_free_limit is a fraction of total_mem judged to be an
      //     acceptable amount that is still unused.
      // The heap can ask for the value of this variable when deciding
      // whether to thrown an OutOfMemory error.
      // Note that the gc time limit test only works for the collections
      // of the young gen + tenured gen and not for collections of the
      // permanent gen.  That is because the calculation of the space
      // freed by the collection is the free space in the young gen +
      // tenured gen.
      // At this point the GC overhead limit is being exceeded.
      inc_gc_overhead_limit_count();
      if (UseGCOverheadLimit) {
        if (gc_overhead_limit_count() >= AdaptiveSizePolicyGCTimeLimitThreshold) {
          // All conditions have been met for throwing an out-of-memory
          set_gc_overhead_limit_exceeded(true);
          // Avoid consecutive OOM due to the gc time limit by resetting
          // the counter.
          reset_gc_overhead_limit_count();
        } else {
          // The required consecutive collections which exceed the
          // GC time limit may or may not have been reached. We
          // are approaching that condition and so as not to
          // throw an out-of-memory before all SoftRef's have been
          // cleared, set _should_clear_all_soft_refs in CollectorPolicy.
          // The clearing will be done on the next GC.
          bool near_limit = gc_overhead_limit_near();
          if (near_limit) {
            collector_policy->set_should_clear_all_soft_refs(true);
            if (PrintGCDetails && Verbose) {
              gclog_or_tty->print_cr("  Nearing GC overhead limit, "
                "will be clearing all SoftReference");
            }
          }
        }
      }
      // Set this even when the overhead limit will not
      // cause an out-of-memory.  Diagnostic message indicating
      // that the overhead limit is being exceeded is sometimes
      // printed.
      print_gc_overhead_limit_would_be_exceeded = true;
    } else {
      // Did not exceed overhead limits
      reset_gc_overhead_limit_count();
    }
  }
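The core of the HotSpot code above is a counter of consecutive "bad" full GCs. The following Java model is my own simplification for illustration, not a port: OverheadLimitCounter and its parameters are hypothetical names, the two free-space comparisons are collapsed into boolean inputs, and the SoftReference clearing and diagnostic printing are left out.

```java
// Simplified model of the consecutive-full-GC counter in the HotSpot excerpt.
class OverheadLimitCounter {
    private final double gcCostLimit;       // e.g. 0.98 for "98% of time in GC"
    private final int threshold;            // consecutive bad full GCs required
    private final boolean useGcOverheadLimit;
    private int count;

    OverheadLimitCounter(double gcCostLimit, int threshold, boolean useGcOverheadLimit) {
        this.gcCostLimit = gcCostLimit;
        this.threshold = threshold;
        this.useGcOverheadLimit = useGcOverheadLimit;
    }

    // Call once per full GC. Returns true when the limit counts as exceeded,
    // which is the point where the JVM would throw OutOfMemoryError.
    boolean afterFullGc(double gcCost, boolean oldGenFreeBelowLimit, boolean edenFreeBelowLimit) {
        if (gcCost > gcCostLimit && oldGenFreeBelowLimit && edenFreeBelowLimit) {
            count++;
            if (useGcOverheadLimit && count >= threshold) {
                count = 0; // reset so the error is not raised again immediately
                return true;
            }
            return false;
        }
        count = 0; // a healthy full GC breaks the streak
        return false;
    }

    int count() {
        return count;
    }
}
```

With illustrative values such as new OverheadLimitCounter(0.98, 3, true), the third consecutive bad full GC returns true, and any healthy full GC in between starts the count over. This also shows why -XX:-UseGCOverheadLimit only silences the check: the streak is still counted, it just never turns into an error.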
References & further reading:
http://javaeesupportpatterns.blogspot.com/2012/01/gc-overhead-limit-exceeded-understand.html
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
http://reins.altervista.org/java/gc1.4.2_example.html
http://stackoverflow.com/questions/2129044/java-heap-terminology-young-old-and-permanent-generations
http://book.51cto.com/art/201306/399236.htm
https://blogs.oracle.com/jonthecollector/entry/presenting_the_permanent_generation