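Recently I ran into a problem: the monitoring platform suddenly raised an alert for a GC overhead limit exceeded exception.
My first guess was that the heap had overflowed, so I used jmap and jstack to pull down a heap dump and a thread dump.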
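As an aside, when attaching jstack to a process is not convenient, a roughly similar thread listing can be produced from inside the JVM with the standard ThreadMXBean API. The sketch below is only an illustration: the class name InProcessThreadDump is made up for this example, and the output is simpler than what jstack prints (no tid/nid, no lock details).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; not part of the original investigation.
class InProcessThreadDump {

    /** Builds a plain-text listing of every live thread with its state and stack. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked-monitor and locked-synchronizer details
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

For a production incident, the file written by jstack is still the more complete source, since it also carries the native thread ids and the "parking to wait for" lines.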
Below is the log file produced by the thread dump.
jstack pid > xxx.log    thread dump [pid is the process ID]
"DubboClientHandler-172.16.3.244:20885-thread-168" #5165 daemon prio=5 os_prio=0 tid=0x00007f6604070000 nid=0x1151 waiting on condition [0x00007f65c31f8000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000731228070> (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
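Clearly this Dubbo thread is parked, waiting on other threads. Strictly speaking its state is TIMED_WAITING rather than BLOCKED: the frames show a pool worker inside ThreadPoolExecutor.getTask, polling a SynchronousQueue for work to be handed to it. Note also the thread name (thread-168) and id (#5165), which show how many threads this process has already created.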
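To make the dump easier to read, here is a small sketch that reproduces the same kind of parked worker outside Dubbo. It assumes a pool backed by a SynchronousQueue, which is what Executors.newCachedThreadPool() creates; whether the Dubbo client pool in this incident was configured that way is my assumption, based only on the SynchronousQueue.poll frame above. The class name IdleCachedWorkerDemo is hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; it only mimics the shape of the pool seen in the dump.
class IdleCachedWorkerDemo {

    /** Runs one task, then reports the state the worker settles into afterwards. */
    static Thread.State observeIdleWorkerState() throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            AtomicReference<Thread> worker = new AtomicReference<>();
            // Capture the worker thread, and wait until the task has finished.
            pool.submit(() -> worker.set(Thread.currentThread())).get();

            Thread t = worker.get();
            // The worker needs a moment to get back into getTask() and park.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (t.getState() != Thread.State.TIMED_WAITING
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return t.getState();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(observeIdleWorkerState());
    }
}
```

A worker in this state holds no lock that others need; it is waiting for the next task and will exit once its keep-alive time runs out. What matters in a dump is therefore how many such threads exist, more than the state of any single one.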
There is another anomaly:
"DubboClientReconnectTimer-thread-3" #13057 daemon prio=5 os_prio=0 tid=0x00007f01e8e8d000 nid=0x4631 waiting on condition [0x00007f01dd5a6000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000730e115a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
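This entry shows that the Dubbo client reconnect thread is also parked in a waiting state. Why would the client reconnect at all? Because Dubbo's heartbeat check found that the connection to the server had timed out; typically, after about 1 minute, it initiates a reconnect [the consumer and the provider rely on the heartbeat mechanism to keep the long-lived connection alive].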
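The DelayedWorkQueue.take frame in that trace is what a ScheduledThreadPoolExecutor worker looks like while it waits for its next scheduled task. The sketch below reproduces that state with a plain JDK scheduler. It is not Dubbo's reconnect code; IdleScheduledWorkerDemo is a hypothetical name, and the single one-shot task merely stands in for a reconnect check.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; a plain JDK timer standing in for the reconnect timer.
class IdleScheduledWorkerDemo {

    /** Runs one scheduled task, then reports the state of the idle timer thread. */
    static Thread.State observeIdleWorkerState() throws Exception {
        ScheduledExecutorService timer = Executors.newScheduledThreadPool(1);
        try {
            AtomicReference<Thread> worker = new AtomicReference<>();
            // One-shot task; afterwards the delay queue is empty.
            timer.schedule(() -> worker.set(Thread.currentThread()),
                    10, TimeUnit.MILLISECONDS).get();

            Thread t = worker.get();
            // With nothing queued, the worker parks in DelayedWorkQueue.take().
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (t.getState() != Thread.State.WAITING
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return t.getState();
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(observeIdleWorkerState());
    }
}
```

Seen this way, the dump entry shows that a reconnect timer exists and is waiting for its next run. The reconnect activity itself has to be confirmed from the application logs.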
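To summarize: the Dubbo service that the client calls timed out, its responses were far too slow, and the client kept reconnecting.
The root cause: a third-party service timing out made the client-side consumer respond slowly. With severe timeouts, a large number of threads piled up without being released, which led to the memory overflow...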
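One way to check for this kind of pile-up is to count live threads per pool and state. The sketch below does that from inside the JVM. It assumes pool threads are named with a trailing "-<number>", as in the dump above; ThreadPileUpCounter and poolName are names invented for this example.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration.
class ThreadPileUpCounter {

    /** Strips a trailing "-<number>" so workers of one pool share a key. */
    static String poolName(String threadName) {
        return threadName.replaceFirst("-\\d+$", "");
    }

    /** Counts live threads grouped by "<pool name> <state>". */
    static Map<String, Integer> countByPoolAndState() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String key = poolName(t.getName()) + " " + t.getState();
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByPoolAndState()
                .forEach((key, n) -> System.out.println(n + "\t" + key));
    }
}
```

If one pool name shows a count in the hundreds or thousands, that pool is the first place to look at timeouts and pool size limits.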