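By default we use the JRE variant of the openjdk image. When a container hangs during startup, nothing shows what is going wrong.
A thread dump would show what each thread is doing and point to the likely cause.
The JRE image cannot produce one, because it does not ship JDK tools such as jps and jstack.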
Switching from the JRE image to the JDK image
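Switching to the JDK image is not enough on its own. jstack then fails with the error: Unable to get pid of LinuxThreads manager thread.
This error is commonly reported when the JVM runs as PID 1 inside the container, which is why the images below start it under tini.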
Building a base image that supports thread dumps
Following the earlier article, build this image:
FROM openjdk:8u191-jdk-alpine3.9
RUN apk add --no-cache tini
ENTRYPOINT ["tini"]
Changing the project's image and startup command
Assume the image built above is tagged openjdk:8u191-jdk-alpine3.9-tini:
FROM openjdk:8u191-jdk-alpine3.9-tini
COPY app.jar /opt/dubbo-app/app.jar
WORKDIR /opt/dubbo-app
EXPOSE 20880
ENTRYPOINT ["/sbin/tini", "--", "java", "-jar", "app.jar"]
Start the image, then enter the container
- Run jps to find the pid of the Java process
- Run jstack -l pid to print the thread information for that pid
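When only a JRE image is at hand, a rough substitute for jstack can be produced from inside the application itself. The following is a minimal sketch, not part of the original setup: the class name ThreadDumper is hypothetical, and it relies only on the standard Thread.getAllStackTraces() call. Unlike jstack -l it reports no lock information, and it has to be called from a thread that is still running, for example a watchdog thread started early in main.

```java
import java.util.Map;

// Hypothetical helper (not from the article): formats a jstack-like dump
// of all live threads using only the standard library.
final class ThreadDumper {

    // Builds one text block per thread: name and state, then its stack frames.
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

The JDK image with jstack remains the more complete option, since it works even when no application thread can be made to run the dump.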
About this bug
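The stack trace and the code showed that Dubbo waits on a CountDownLatch while connecting to ZooKeeper. The ZooKeeper address was configured through an environment variable, and the variable name turned out to be wrong, so the ZooKeeper connection was never established and the main thread stayed blocked.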
The missing timeout on this wait is itself a major bug in Dubbo 2.7.1.
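The effect of the missing timeout can be reproduced without Dubbo. The sketch below is an illustration under assumptions, not Dubbo's actual code: the class LatchTimeoutDemo, the method waitForConnection, and the 200 ms timeout are all hypothetical. It uses a latch that nobody counts down, which is what happens when the connection never succeeds, and shows that await with a timeout returns false instead of parking forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; this is not Dubbo's actual code.
final class LatchTimeoutDemo {

    // Waits for the latch for at most timeoutMillis.
    // Returns true if the latch reached zero, false on timeout or interruption.
    static boolean waitForConnection(CountDownLatch connected, long timeoutMillis) {
        try {
            // connected.await() with no arguments would park forever
            // if countDown() is never called.
            return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        // Nobody calls countDown(), mimicking a connection that never succeeds.
        CountDownLatch neverConnected = new CountDownLatch(1);
        if (!waitForConnection(neverConnected, 200)) {
            System.out.println("timed out waiting for the connection");
        }
    }
}
```

With a bounded wait, startup can fail with a clear message instead of hanging silently.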
Dubbo 2.7.1 has many serious bugs, and its fix and release cycle is particularly long, so use it with caution.
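Since the root cause here was a misnamed environment variable, one defensive option is to validate required variables before any framework code runs. The following is a sketch only: the class EnvCheck is hypothetical, and ZOOKEEPER_ADDRESS is a placeholder, because the real variable name is not given in this article.

```java
import java.util.Map;

// Hypothetical startup check; not part of Dubbo or of the original project.
final class EnvCheck {

    // Returns the value of the named variable, or throws if it is unset or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException(
                    "Missing required environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // ZOOKEEPER_ADDRESS is a placeholder name (an assumption).
        String address = require(System.getenv(), "ZOOKEEPER_ADDRESS");
        System.out.println("ZooKeeper address: " + address);
    }
}
```

Run as the first step of main, a check like this turns a silent hang into an immediate, readable failure.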
Stack trace of the main thread:
"main" #1 prio=5 os_prio=0 tid=0x00005592eb0f1000 nid=0x9 waiting on condition [0x00007fda15afd000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000f885dac0> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
    at org.apache.dubbo.configcenter.support.zookeeper.ZookeeperDynamicConfiguration.<init>(ZookeeperDynamicConfiguration.java:64)
    at org.apache.dubbo.configcenter.support.zookeeper.ZookeeperDynamicConfigurationFactory.createDynamicConfiguration(ZookeeperDynamicConfigurationFactory.java:38)
    at org.apache.dubbo.configcenter.AbstractDynamicConfigurationFactory.getDynamicConfiguration(AbstractDynamicConfigurationFactory.java:33)
    - locked <0x00000000f885db68> (a org.apache.dubbo.configcenter.support.zookeeper.ZookeeperDynamicConfigurationFactory)
    at org.apache.dubbo.config.AbstractInterfaceConfig.getDynamicConfiguration(AbstractInterfaceConfig.java:275)
    at org.apache.dubbo.config.AbstractInterfaceConfig.prepareEnvironment(AbstractInterfaceConfig.java:250)
    at org.apache.dubbo.config.AbstractInterfaceConfig.startConfigCenter(AbstractInterfaceConfig.java:240)
    at org.apache.dubbo.config.AbstractInterfaceConfig.lambda$null$7(AbstractInterfaceConfig.java:584)
    at org.apache.dubbo.config.AbstractInterfaceConfig$$Lambda$218/1961945640.get(Unknown Source)
    at java.util.Optional.orElseGet(Optional.java:267)
A screenshot of the corresponding code follows:
