Notes on fixing java.lang.OutOfMemoryError: GC overhead limit exceeded


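I ran into this problem: when deploying locally, the service failed to start with java.lang.OutOfMemoryError: GC overhead limit exceeded. The logs showed that too many resources were being loaded into memory, and because my local machine is not very powerful, a large share of the time went to GC. The fix had two parts: add -XX:-UseGCOverheadLimit to turn the check off, and enlarge the heap with -Xmx1024m. The hole is filled, but why?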
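To confirm that the two options actually reached the JVM, a small check like the following can help. This is only a sketch: the class name JvmFlagCheck and its methods are made up for illustration, and it relies on the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Illustrative helper (hypothetical name): reports whether the options from the text are in effect.
class JvmFlagCheck {
    // True if the argument list contains the flag that disables the overhead check.
    static boolean overheadLimitDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:-UseGCOverheadLimit");
    }

    // Maximum heap the JVM will attempt to use, in MiB (roughly reflects -Xmx).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM args: " + jvmArgs);
        System.out.println("Overhead limit disabled: " + overheadLimitDisabled(jvmArgs));
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Started with java -XX:-UseGCOverheadLimit -Xmx1024m JvmFlagCheck, it should list both options among the JVM arguments.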

Everyone knows what an OOM is: the JVM has run out of memory. But what about GC overhead limit exceeded?

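The GC overhead limit exceeded check is a policy defined in HotSpot VM 1.6. It uses GC time statistics to predict that an OOM is about to happen and throws the error early, before memory is completely exhausted. Sun's official definition is: "The parallel/concurrent collector will throw an OutOfMemoryError if too much time is being spent in garbage collection. Too much means that more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered. This prevents applications from running for a long time while making little or no progress because the heap is too small."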
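The 98% / 2% rule in the quoted definition can be written as a simple predicate. This is a sketch of the rule as stated, not HotSpot's actual code; the class, method, and parameter names are assumptions made for illustration.

```java
// Illustrative model of the quoted rule (hypothetical names, not HotSpot code).
class GcOverheadRule {
    static final double GC_TIME_LIMIT = 0.98; // more than 98% of the time spent in GC
    static final double RECOVERED_LIMIT = 0.02; // less than 2% of the heap recovered

    // True when both halves of the rule hold at once.
    static boolean exceeded(long gcMillis, long totalMillis, long recoveredBytes, long heapBytes) {
        if (totalMillis <= 0 || heapBytes <= 0) {
            return false; // nothing measured yet
        }
        double gcShare = (double) gcMillis / totalMillis;
        double recoveredShare = (double) recoveredBytes / heapBytes;
        return gcShare > GC_TIME_LIMIT && recoveredShare < RECOVERED_LIMIT;
    }
}
```

For instance, 990 ms of GC in a 1000 ms window that recovers 1 MB of a 100 MB heap satisfies both conditions, while the same GC share with 10 MB recovered does not.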

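That does not sound very useful... what good is predicting an OOM? At first glance its only use seems to be letting you catch the error and release memory so that the application does not die. In practice this policy usually cannot save your application, but it does give you a chance for a last-ditch effort before the application goes down, such as saving data or capturing the scene (a heap dump).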
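A last-ditch handler of the kind described above might look like the sketch below. LastResort and runWithRescue are hypothetical names; the rescue action stands for whatever "save the data" means in your application, and it should allocate as little as possible because the heap is nearly full at that point.

```java
// Illustrative wrapper (hypothetical names): give the application one chance to save state.
class LastResort {
    // Runs the task; if the JVM raises OutOfMemoryError, runs the rescue action
    // and reports failure instead of letting the error propagate.
    static boolean runWithRescue(Runnable task, Runnable rescue) {
        try {
            task.run();
            return true;
        } catch (OutOfMemoryError e) {
            // Keep this path allocation-light: memory is almost exhausted here.
            rescue.run();
            return false;
        }
    }
}
```

There is no guarantee the rescue action itself will succeed, which is why the automatic heap dump option is usually the more reliable way to capture the scene.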

Sometimes the policy even causes problems of its own, for example frequent OOMs while loading a large in-memory data set.

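If you hit this problem in production, do not simply guess or work around it before you know the cause. Use -verbose:gc -XX:+PrintGCDetails to see what is actually triggering the error. The usual cause is that the old generation is too full, which leads to frequent Full GCs and eventually to GC overhead limit exceeded. If the GC log is not enough, use a tool such as JProfiler to inspect memory usage and check whether the old generation has a memory leak. Another way to analyze a leak is -XX:+HeapDumpOnOutOfMemoryError, which writes a heap dump automatically when the OOM occurs so you can investigate it with MAT. Keep an eye on the young generation as well: allocating too many short-lived objects can also trigger this error.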
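Besides the GC log, the share of time spent in GC can be sampled from inside the process through the standard GarbageCollectorMXBean API. The sketch below is illustrative only (GcTimeShare is a made-up name), and it measures the share since JVM start rather than over the recent window that the overhead check looks at.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative probe (hypothetical name): accumulated GC time as a share of JVM uptime.
class GcTimeShare {
    // Sum of accumulated collection time in ms; collectors reporting -1 (undefined) are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // GC time divided by uptime since JVM start.
    static double gcShare() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) totalGcMillis() / uptime;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("GC share of uptime: " + gcShare());
    }
}
```

A share that keeps climbing while the old generation stays full after each collection points to the situation described above.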

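The log is not hard to read: each GC writes one entry recording the type of collection, the sizes before and after, and the time taken. For example: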

33.125: [GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]

100.667:[Full GC [Tenured: 0K->210K(10240K), 0.0149142 secs] 4603K->210K(19456K), [Perm : 2999K->2999K(21248K)], 0.0150007 secs] 

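GC and Full GC indicate the type of pause; Full GC means a stop-the-world collection. The numbers on either side of each arrow are the occupied size of a region before and after the collection, for the young, tenured, and perm regions respectively, and the number in parentheses is the total size of that region. The number before the first colon is the time at which the GC happened, in seconds since JVM startup. DefNew stands for the Serial collector and is short for Default New Generation; similarly, PSYoungGen stands for the Parallel Scavenge collector. By analyzing the log this way you can find what is causing GC overhead limit exceeded and fix it by tuning the relevant parameters.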
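The per-generation figures can also be pulled out of such lines programmatically. The following is a rough sketch that only understands the "Name: beforeK->afterK(totalK)" groups in the two sample lines above; GcLogLine and Usage are hypothetical names, and other collectors or JVM versions use different log formats.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative parser (hypothetical names) for the sample log format shown above.
class GcLogLine {
    // One "Name: beforeK->afterK(totalK)" group; all sizes are in KB.
    static final class Usage {
        final String generation;
        final long beforeK;
        final long afterK;
        final long totalK;

        Usage(String generation, long beforeK, long afterK, long totalK) {
            this.generation = generation;
            this.beforeK = beforeK;
            this.afterK = afterK;
            this.totalK = totalK;
        }
    }

    // Generation name, optional spaces, colon, then before->after(total).
    private static final Pattern GENERATION =
            Pattern.compile("(\\w+)\\s*:\\s*(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns the named generation groups in the order they appear in the line.
    static List<Usage> parse(String line) {
        List<Usage> result = new ArrayList<>();
        Matcher m = GENERATION.matcher(line);
        while (m.find()) {
            result.add(new Usage(m.group(1), Long.parseLong(m.group(2)),
                    Long.parseLong(m.group(3)), Long.parseLong(m.group(4))));
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "33.125: [GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs]"
                + "[Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]";
        for (Usage u : parse(line)) {
            System.out.println(u.generation + " freed " + (u.beforeK - u.afterK) + "K of " + u.totalK + "K");
        }
    }
}
```

The unnamed whole-heap figure (18973K->2704K(32576K) in the first sample) is deliberately skipped, because the pattern requires a generation name before the colon.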

Glossary of terms used in this article:

Eden Space: a heap memory pool; most objects are allocated here.

Survivor Space: a heap memory pool; holds objects that survived a GC of the Eden Space.

Tenured Generation: a heap memory pool; holds objects that have survived several GCs in the Survivor Space.

Permanent Generation: non-heap space; stores class and method objects.

Code Cache: non-heap space; used by the JVM to compile and store native code.
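The pools in this glossary can be listed on a running JVM with the standard MemoryPoolMXBean API. This is a small sketch with a made-up class name; the pool names you see depend on the JVM version and collector, and on Java 8 and later Metaspace appears in place of the Permanent Generation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

// Illustrative listing (hypothetical name) of the JVM's heap and non-heap memory pools.
class MemoryPools {
    // Number of pools of the given type (MemoryType.HEAP or MemoryType.NON_HEAP).
    static long count(MemoryType type) {
        long n = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + "  " + pool.getName());
        }
    }
}
```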

Finally, here is the HotSpot implementation of the GC overhead limit exceeded check:

  bool print_gc_overhead_limit_would_be_exceeded = false;
  if (is_full_gc) {
    if (gc_cost() > gc_cost_limit &&
      free_in_old_gen < (size_t) mem_free_old_limit &&
      free_in_eden < (size_t) mem_free_eden_limit) {
      // Collections, on average, are taking too much time, and
      //      gc_cost() > gc_cost_limit
      // we have too little space available after a full gc.
      //      total_free_limit < mem_free_limit
      // where
      //   total_free_limit is the free space available in
      //     both generations
      //   total_mem is the total space available for allocation
      //     in both generations (survivor spaces are not included
      //     just as they are not included in eden_limit).
      //   mem_free_limit is a fraction of total_mem judged to be an
      //     acceptable amount that is still unused.
      // The heap can ask for the value of this variable when deciding
      // whether to thrown an OutOfMemory error.
      // Note that the gc time limit test only works for the collections
      // of the young gen + tenured gen and not for collections of the
      // permanent gen.  That is because the calculation of the space
      // freed by the collection is the free space in the young gen +
      // tenured gen.
      // At this point the GC overhead limit is being exceeded.
      inc_gc_overhead_limit_count();
      if (UseGCOverheadLimit) {
        if (gc_overhead_limit_count() >=
            AdaptiveSizePolicyGCTimeLimitThreshold){
          // All conditions have been met for throwing an out-of-memory
          set_gc_overhead_limit_exceeded(true);
          // Avoid consecutive OOM due to the gc time limit by resetting
          // the counter.
          reset_gc_overhead_limit_count();
        } else {
          // The required consecutive collections which exceed the
          // GC time limit may or may not have been reached. We
          // are approaching that condition and so as not to
          // throw an out-of-memory before all SoftRef's have been
          // cleared, set _should_clear_all_soft_refs in CollectorPolicy.
          // The clearing will be done on the next GC.
          bool near_limit = gc_overhead_limit_near();
          if (near_limit) {
            collector_policy->set_should_clear_all_soft_refs(true);
            if (PrintGCDetails && Verbose) {
              gclog_or_tty->print_cr("  Nearing GC overhead limit, "
                "will be clearing all SoftReference");
            }
          }
        }
      }
      // Set this even when the overhead limit will not
      // cause an out-of-memory.  Diagnostic message indicating
      // that the overhead limit is being exceeded is sometimes
      // printed.
      print_gc_overhead_limit_would_be_exceeded = true;

    } else {
      // Did not exceed overhead limits
      reset_gc_overhead_limit_count();
    }
  }
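The counting logic in the excerpt can be modeled in a few lines of Java. This is a simplified sketch, not a port: OverheadLimitCounter is a hypothetical name, the threshold parameter stands in for AdaptiveSizePolicyGCTimeLimitThreshold, and the UseGCOverheadLimit switch and the SoftReference clearing step are left out.

```java
// Simplified model (hypothetical name) of the consecutive-full-GC counter in the excerpt.
class OverheadLimitCounter {
    private final int threshold; // stands in for AdaptiveSizePolicyGCTimeLimitThreshold
    private int count;

    OverheadLimitCounter(int threshold) {
        this.threshold = threshold;
    }

    // Call once per full GC. Returns true when the out-of-memory condition should be signalled.
    boolean onFullGc(boolean limitsExceeded) {
        if (!limitsExceeded) {
            count = 0; // a healthy full GC resets the streak
            return false;
        }
        count++;
        if (count >= threshold) {
            count = 0; // reset to avoid signalling on every following GC
            return true;
        }
        return false;
    }
}
```

The point the model makes explicit is that a single bad full GC is not enough: the limits must be exceeded on several consecutive full collections before the error is raised.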

References & further reading:

http://javaeesupportpatterns.blogspot.com/2012/01/gc-overhead-limit-exceeded-understand.html

http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html

http://reins.altervista.org/java/gc1.4.2_example.html

http://stackoverflow.com/questions/2129044/java-heap-terminology-young-old-and-permanent-generations

http://book.51cto.com/art/201306/399236.htm

https://blogs.oracle.com/jonthecollector/entry/presenting_the_permanent_generation

