===========================================================================================
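Environment:
A web service is deployed in Tomcat on Linux.
It is unstable: it is often started in the morning and unreachable by the afternoon.
It keeps going down for no obvious reason.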
===========================================================================================
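Troubleshooting steps:
1. First, while the service is down and before restarting Tomcat, check the log file catalina.out.
It is in the logs directory under your Tomcat installation directory.
tail -n 200 -f catalina.out
The collected log looks roughly like this: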

[GC [PSYoungGen: 1610144K->54342K(1998336K)] 2233750K->678204K(6777344K), 0.0687220 secs] [Times: user=0.22 sys=0.00, real=0.07 secs]
[GC [PSYoungGen: 1661510K->91007K(2016768K)] 2285372K->764127K(6795776K), 0.1119750 secs] [Times: user=0.29 sys=0.01, real=0.11 secs]
[GC [PSYoungGen: 1721727K->3808K(2003456K)] 2394847K->760833K(6782464K), 0.0980690 secs] [Times: user=0.27 sys=0.03, real=0.10 secs]
java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:117)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177)
    at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)
    at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:131)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
    at util.httpclient.HttpXmlClient.sendRequest(HttpXmlClient.java:234)
    at util.httpclient.HttpXmlClient.invoke(HttpXmlClient.java:201)
    at util.httpclient.HttpXmlClient.post(HttpXmlClient.java:43)
    at quartz.InvoiceApplyOrderStatusJob.getSalOrderStatus(InvoiceApplyOrderStatusJob.java:87)
    at quartz.InvoiceApplyOrderStatusJob.findOrderStatus(InvoiceApplyOrderStatusJob.java:45)
    at quartz.InvoiceApplyOrderStatusJob.excute(InvoiceApplyOrderStatusJob.java:37)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:64)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:53)
    at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[ERROR][quartz.InvoiceApplyOrderStatusJob] [ ]
[ERROR][org.springframework.scheduling.support.TaskUtils$LoggingErrorHandler] [ Unexpected error occurred in scheduled task. ]
java.lang.NullPointerException
    at quartz.InvoiceApplyOrderStatusJob.findOrderStatus(InvoiceApplyOrderStatusJob.java:48)
    at quartz.InvoiceApplyOrderStatusJob.excute(InvoiceApplyOrderStatusJob.java:37)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:64)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:53)
    at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[GC [PSYoungGen: 1634528K->3520K(2018304K)] 2391553K->761134K(6797312K), 0.0790130 secs] [Times: user=0.21 sys=0.00, real=0.08 secs]
[GC [PSYoungGen: 1647552K->71575K(2015232K)] 2405166K->829734K(6794240K), 0.1101140 secs] [Times: user=0.34 sys=0.01, real=0.11 secs]
[GC [PSYoungGen: 1715607K->75061K(2053120K)] 2473766K->898535K(6832128K), 0.1175420 secs] [Times: user=0.37 sys=0.00, real=0.12 secs]
Java HotSpot(TM) 64-Bit Server VM warning: Attempt to protect stack guard pages failed.
Java HotSpot(TM) 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
Java HotSpot(TM) 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
Java HotSpot(TM) 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f906fa7b000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 12288 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /backup/tomcat7/bin/hs_err_pid916618.log
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f8f11051000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
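In a long catalina.out the crash markers are easy to miss among the GC records and stack traces. A minimal sketch for pulling them out with grep follows; the function name find_jvm_oom is made up for this example, and the patterns are simply the three strings seen in the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the numbered lines of a log file that
# mention the JVM's native out-of-memory markers seen in the log above.
find_jvm_oom() {
  grep -n -E 'insufficient memory|Cannot allocate memory|hs_err_pid[0-9]+\.log' "$1"
}

# Usage (run from Tomcat's logs directory):
#   find_jvm_oom catalina.out
```

The matching hs_err_pid line gives the path of the crash report that is examined in step 3.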
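2. From the log above, the following problems can be identified
1) First, young-generation GC happens very frequently
[GC [PSYoungGen: 1975488K->5056K(2202112K)] 3281950K->1312494K(6981120K), 0.1478450 secs] [Times: user=0.43 sys=0.01, real=0.15 secs]
Taking this one record as an example, its fields mean the following:
1. GC indicates a Minor GC (a young-generation collection)
2. PSYoungGen indicates that the young generation uses the multithreaded Parallel Scavenge collector
3. 1975488K is the space used by the young generation before the collection
4. 5056K is the space used by the young generation after the collection
5. The young generation is subdivided into one Eden space and two Survivor spaces; after a Minor GC Eden is empty, so 5056K is the space occupied in Survivor
6. (2202112K) is the total size of the young generation
7. 3281950K->1312494K(6981120K) is the Java heap usage before (3281950K) and after (1312494K) the collection; (6981120K) is the total heap size, which covers both the young and the old generation
8. [Times: user=0.43 sys=0.01, real=0.15 secs] reports CPU usage and elapsed time: user is the CPU time the collector spent in user mode (0.43 seconds here), sys is the CPU time spent in kernel mode, and real is the wall-clock time the collection took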
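The arithmetic behind such a record can be worked out mechanically. The sketch below assumes bash and the exact PSYoungGen format shown above; gc_summary and its output field names are invented for this example. It derives how much the young generation and the whole heap shrank, and from the difference how much the old generation grew during the collection.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise one "[GC [PSYoungGen: ...]" record.
gc_summary() {
  local line=$1
  # Groups: young before, young after, young capacity, heap before, heap after, heap capacity
  local re='PSYoungGen: ([0-9]+)K->([0-9]+)K\(([0-9]+)K\)] ([0-9]+)K->([0-9]+)K\(([0-9]+)K\)'
  if [[ $line =~ $re ]]; then
    local yb=${BASH_REMATCH[1]} ya=${BASH_REMATCH[2]}
    local hb=${BASH_REMATCH[4]} ha=${BASH_REMATCH[5]}
    # Old generation usage = heap usage minus young generation usage
    local old_before=$(( hb - yb )) old_after=$(( ha - ya ))
    echo "young_freed=$(( yb - ya ))K heap_freed=$(( hb - ha ))K old_growth=$(( old_after - old_before ))K"
  else
    echo "not a PSYoungGen record" >&2
    return 1
  fi
}

gc_summary '[GC [PSYoungGen: 1975488K->5056K(2202112K)] 3281950K->1312494K(6981120K), 0.1478450 secs]'
# prints: young_freed=1970432K heap_freed=1969456K old_growth=976K
```

For the sample record almost everything in the young generation was garbage and the old generation grew by under 1 MB, which suggests these collections are frequent but cheap rather than a sign of heap exhaustion.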
For details, see: https://jingyan.baidu.com/article/3ea51489c045d852e61bbaab.html
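2) These log lines show that the Java runtime no longer has enough memory to keep running
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 12288 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /backup/tomcat7/bin/hs_err_pid916618.log
Note that what failed is a native memory allocation (malloc), that is, memory the JVM process requests from the operating system, not the Java heap: in the GC records above the heap holds well under 1 GB out of a capacity of roughly 6.5 GB.
More error details can be found in the file /backup/tomcat7/bin/hs_err_pid916618.log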
3. Open that file at the path given above
cat /backup/tomcat7/bin/hs_err_pid916618.log
The file contents are as follows:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 12288 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2756), pid=916618, tid=140258326329088
#
# JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
--------------- T H R E A D ---------------
Current thread (0x00007f93bd3bc000): JavaThread "elasticsearch[Cap 'N Hawk][generic][T#44]" daemon [_thread_new, id=869274, stack(0x00007f906fa7b000,0x00007f906fb7c000)]
Stack: [0x00007f906fa7b000,0x00007f906fb7c000], sp=0x00007f906fb7a800, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x9a32da] VMError::report_and_die()+0x2ea
V [libjvm.so+0x497f7b] report_vm_out_of_memory(char const*, int, unsigned long, char const*)+0x9b
V [libjvm.so+0x81fcce] os::Linux::commit_memory_impl(char*, unsigned long, bool)+0xfe
V [libjvm.so+0x81fd8c] os::pd_commit_memory(char*, unsigned long, bool)+0xc
V [libjvm.so+0x817afa] os::commit_memory(char*, unsigned long, bool)+0x2a
V [libjvm.so+0x81e25d] os::pd_create_stack_guard_pages(char*, unsigned long)+0x6d
V [libjvm.so+0x95581e] JavaThread::create_stack_guard_pages()+0x5e
V [libjvm.so+0x95c164] JavaThread::run()+0x34
V [libjvm.so+0x821ca8]
[root@dscrmapp bin]#
Two key points:
1) Possible reasons
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
That is, either the system has run out of physical memory or swap space, or, in 32-bit mode, the process has reached the maximum size (address space) a single process may use.
2) Possible solutions
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
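4. Order of fixes
Before applying any of the fixes below, be sure to review the application itself: fix its bugs, catch exceptions properly (especially in multithreaded code), and release references to objects that are no longer needed so that GC can reclaim them.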
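4.1 Make sure the operating system is 64-bit and the Java version is 64-bit
Command to check whether the Linux distribution is CentOS or Ubuntu:
lsb_release -a
Check whether the system is 32-bit or 64-bit (on CentOS):
getconf LONG_BIT
Confirm the Java version:
java -version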
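The two checks can be combined into one small script. This is only a sketch: it assumes bash, and it relies on the fact that a 64-bit HotSpot JVM prints "64-Bit" in its version banner, as in the "Java HotSpot(TM) 64-Bit Server VM" lines of the log above. The function name is_64bit_jvm is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the given `java -version` text
# reports a 64-bit VM (HotSpot prints "64-Bit" in its banner).
is_64bit_jvm() {
  grep -q '64-Bit' <<<"$1"
}

os_bits=$(getconf LONG_BIT)   # prints 32 or 64
echo "OS word size: ${os_bits}-bit"

if command -v java >/dev/null 2>&1; then
  # java -version writes its banner to stderr, hence 2>&1
  if is_64bit_jvm "$(java -version 2>&1)"; then
    echo "JVM: 64-bit"
  else
    echo "JVM: not reported as 64-bit"
  fi
fi
```

In this case both were already 64-bit (the hs_err file reports linux-amd64), so the 32-bit process size limit can be ruled out.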
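4.2 Reduce the Java heap size and thread stack size
Background reference: http://www.cnblogs.com/hrhguanli/p/4509544.html
The underlying cause of the error is that too many threads were created and not released in time (the crash above happened in JavaThread::create_stack_guard_pages, while a new thread was starting).
Each thread's stack is allocated from native memory outside the Java heap, so the more memory you give to the JVM heap, the fewer threads you can create.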
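To see whether the thread count really does keep growing, it can be sampled from the running process. The sketch below assumes Linux, where /proc/PID/status contains a "Threads:" field; thread_count is a name made up for this example, and the pgrep pattern in the usage comment assumes Tomcat was started through its usual org.apache.catalina.startup.Bootstrap main class.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of threads of the process with
# the given PID, read from the "Threads:" field of /proc/PID/status.
thread_count() {
  awk '/^Threads:/ {print $2}' "/proc/$1/status"
}

# Usage: sample Tomcat's thread count once a minute.
#   pid=$(pgrep -f org.apache.catalina.startup.Bootstrap | head -n 1)
#   while sleep 60; do echo "$(date +%T) $(thread_count "$pid")"; done
```

A number that climbs steadily between the morning start and the afternoon crash would support the thread-leak explanation.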
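Go to Tomcat's bin directory,
then locate and edit the catalina.sh file:
vi catalina.sh
Find the line below, which sets the JVM parameters; it is usually one line above the line 【cygwin=false】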
export JAVA_OPTS='-Xms7000m -Xmx8192m -XX:PermSize=1024m -XX:MaxPermSize=2048m -XX:+PrintGCDetails -server'
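For the meaning of each parameter, see: https://www.cnblogs.com/sxdcgaq8080/p/7196580.html
Here the JVM is configured with:
initial heap size -Xms7000m
maximum heap size -Xmx8192m
[non-heap memory] initial PermGen size -XX:PermSize=1024m
[non-heap memory] maximum PermGen size -XX:MaxPermSize=2048m
Change the parameters to: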
export JAVA_OPTS='-Xms2048m -Xmx3072m -XX:PermSize=1024m -XX:MaxPermSize=2048m -XX:+PrintGCDetails -server'
This cuts the heap by more than half: the maximum heap goes from 8192 MB to 3072 MB.
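A rough budget shows why the smaller heap leaves more room for threads. The sketch below adds maximum heap, maximum PermGen and thread stacks; the thread count of 1000 is a purely hypothetical figure, and 1024 KB per stack is an assumption based on the 64-bit Linux default (the stack range in the hs_err file above is close to 1 MB). Code cache, direct buffers and other native allocations are ignored, so this is a lower bound on what the process may claim, not a measurement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rough upper bound, in MB, of heap + PermGen +
# thread stacks. Arguments: max_heap_mb max_perm_mb threads stack_kb
jvm_reserve_mb() {
  local max_heap_mb=$1 max_perm_mb=$2 threads=$3 stack_kb=$4
  echo $(( max_heap_mb + max_perm_mb + threads * stack_kb / 1024 ))
}

jvm_reserve_mb 8192 2048 1000 1024   # original settings, prints 11240
jvm_reserve_mb 3072 2048 1000 1024   # reduced heap, prints 6120
```

With the same assumed thread count the JVM's potential footprint drops by about 5 GB, memory that the operating system can then hand out for thread stacks and other native allocations.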
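4.3 Check and increase swap space (virtual memory)
First: command to check memory usage
free -m
-m shows the values in megabytes
Increasing swap on CentOS 6.4:
Reference: https://www.linuxidc.com/Linux/2014-09/106100.htm
Second: turn off the existing swap
sudo swapoff -a
Checking again now shows that swap has dropped to 0
Third: create the new swap file at the desired size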
dd if=/dev/zero of=/swapfile bs=1M count=31906
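of is the path at which the swapfile file is created.
bs is the block size, that is, the size of each block. Here it is 1M, so count is expressed in units of 1 MB.
count tells dd how many blocks the new swapfile should have. Here it is 31906, so the new swap file is 31906 MB, about 31 GB.
Note: this step may take a while; wait patiently for it to finish.
Note: the rule of thumb for swap size is 1 to 2 times the physical memory.
Since the initial analysis pointed to insufficient physical memory or swap, swap is set here to twice the physical memory.
When the command finishes, the swap file has been created.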
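The count value can be derived from the machine's own memory instead of being typed by hand. The sketch below assumes Linux, where /proc/meminfo reports MemTotal in KB; swap_blocks_mb is a name made up for this example, and the sample figure 16335872 KB is a hypothetical MemTotal chosen only because doubling it reproduces the count of 31906 used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of 1 MB blocks for a swap file that is
# `factor` times the given physical memory. Arguments: mem_kb factor
swap_blocks_mb() {
  local mem_kb=$1 factor=$2
  echo $(( mem_kb / 1024 * factor ))
}

swap_blocks_mb 16335872 2   # prints 31906

# On the target machine, read MemTotal (in KB) from /proc/meminfo:
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "dd count for 2x RAM: $(swap_blocks_mb "$mem_kb" 2)"
```

The printed number is what would be passed as count= to the dd command when bs=1M.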
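Fourth: format the newly created swapfile as swap
Because a swap file can contain memory contents, it is advisable to make it accessible to root only beforehand (chmod 600 /swapfile).
sudo mkswap /swapfile
Fifth: edit /etc/fstab so that the swap is enabled automatically at boot
vi /etc/fstab
Append the following as the last line of the file
/swapfile swap swap defaults 0 0
Sixth: reboot the server
Command:
reboot
Reconnect after the reboot
Seventh: enable the swapfile
swapon /swapfile
With the /etc/fstab entry in place, swap is normally already active after the reboot; this command is only needed if it is not.
Check swap again, for example with free -m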