Problem Description
During load testing, the system eventually stopped responding and accepted no new requests.
The logs contained the following direct-memory allocation error:
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 2130706439, max: 2147483648)
at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:742) ~[netty-all-4.1.47.Final.jar!/:4.1.47.Final]
at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:697) ~[netty-all-4.1.47.Final.jar!/:4.1.47.Final]
at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:758) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:734) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:245) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:215) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:147) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:356) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114) ~[netty-all-4.1.47.Final.jar!/:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147) [netty-all-4.1.47.Final.jar!/:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-all-4.1.47.Final.jar!/:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) [netty-all-4.1.47.Final.jar!/:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) [netty-all-4.1.47.Final.jar!/:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-all-4.1.47.Final.jar!/:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-all-4.1.47.Final.jar!/:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-all-4.1.47.Final.jar!/:4.1.47.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.47.Final.jar!/:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
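As a quick sanity check on the numbers in the error message (plain arithmetic, nothing Netty-specific): the pool tries to allocate one 16 MiB chunk while only about 15 MiB of the 2 GiB direct-memory budget remains, so the allocation must fail.

```java
public class OomErrorMath {
    public static void main(String[] args) {
        // figures copied from the OutOfDirectMemoryError above
        long used = 2130706439L;    // bytes already allocated (~2032 MiB)
        long request = 16777216L;   // one 16 MiB PoolArena chunk
        long max = 2147483648L;     // 2 GiB direct-memory cap

        System.out.println(used + request > max);                     // true: must fail
        System.out.println((max - used) / (1024 * 1024) + " MiB headroom left");
    }
}
```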
Problem Analysis
Heap Memory
The first suspicion was a heap memory leak, so we inspected heap usage with the jmap command. Usage looked normal: Xmx was set to 8 GB but only 954 MB was in use.
If a heap overflow had been confirmed, we could dump a heap snapshot with jmap -dump:format=b,file=heap2.hprof 28624
and analyze it with tools such as MAT or JProfiler; JConsole and JVisualVM can also be used to monitor heap allocation.
jmap -heap 17667
Attaching to process ID 17667, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.191-b12
using thread-local object allocation.
Garbage-First (G1) GC with 4 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 8589934592 (8192.0MB)
NewSize = 1363144 (1.2999954223632812MB)
MaxNewSize = 5152702464 (4914.0MB)
OldSize = 5452592 (5.1999969482421875MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 268435456 (256.0MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 2097152 (2.0MB)
Heap Usage:
G1 Heap:
regions = 4096
capacity = 8589934592 (8192.0MB)
used = 1001361104 (954.9723663330078MB)
free = 7588573488 (7237.027633666992MB)
11.657377518713474% used
G1 Young Generation:
Eden Space:
regions = 54
capacity = 1847590912 (1762.0MB)
used = 113246208 (108.0MB)
free = 1734344704 (1654.0MB)
6.129398410896708% used
Survivor Space:
regions = 0
capacity = 0 (0.0MB)
used = 0 (0.0MB)
free = 0 (0.0MB)
0.0% used
G1 Old Generation:
regions = 429
capacity = 1113587712 (1062.0MB)
used = 888114896 (846.9723663330078MB)
free = 225472816 (215.0276336669922MB)
79.75257686751486% used
43044 interned Strings occupying 5008680 bytes.
Off-Heap Memory
Next we analyzed off-heap memory. The -Xmx flag only limits the Java Heap portion of the Java process; the process's memory also comprises the Class, Thread, Code, GC, Internal, Symbol, Native Memory Tracking, and Unknown regions.
For example, under heavy network traffic the Internal region can grow sharply and consume a large amount of memory.
In practice, combine top, jmap, and NMT to analyze the Java process's memory.
NMT (Native Memory Tracking) is a native memory tracking tool, used together with jcmd, and is commonly used to analyze the JVM's memory usage.
NMT is disabled by default and can be enabled with:
-XX:NativeMemoryTracking=[off | summary | detail]
Option Description
off Default (tracking disabled)
summary Collect summary information only
detail Track memory usage down to individual call sites
Note: according to the official Java documentation, enabling NMT incurs roughly a 5%-10% performance overhead.
To have the JVM print its memory usage on exit, add:
-XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics
Use jcmd to view the detailed NMT figures for heap, class, internal, and other memory:
jcmd 17667 VM.native_memory detail scale=MB >temp.txt
17667:
Native Memory Tracking:
Total: reserved=15306MB, committed=11529MB
- Java Heap (reserved=8192MB, committed=5678MB)
(mmap: reserved=8192MB, committed=5678MB)
- Class (reserved=1146MB, committed=138MB)
(classes #23198)
(malloc=4MB #56498)
(mmap: reserved=1142MB, committed=134MB)
- Thread (reserved=461MB, committed=461MB)
(thread #458)
(stack: reserved=459MB, committed=459MB)
(malloc=2MB #2292)
(arena=1MB #912)
- Code (reserved=265MB, committed=118MB)
(malloc=21MB #25211)
(mmap: reserved=244MB, committed=97MB)
- GC (reserved=409MB, committed=315MB)
(malloc=73MB #151121)
(mmap: reserved=336MB, committed=243MB)
- Compiler (reserved=2MB, committed=2MB)
(malloc=2MB #3347)
- Internal (reserved=4778MB, committed=4778MB)
(malloc=4778MB #218361)
- Symbol (reserved=27MB, committed=27MB)
(malloc=23MB #249386)
(arena=4MB #1)
- Native Memory Tracking (reserved=11MB, committed=11MB)
(tracking overhead=11MB)
- Unknown (reserved=16MB, committed=0MB)
(mmap: reserved=16MB, committed=0MB)
Virtual memory map:
[0x00000005c0000000 - 0x00000007c0000000] reserved 8192MB for Java Heap from
[0x00007fce6c249efb] ReservedSpace::initialize(unsigned long, unsigned long, bool, char*, unsigned long, bool)+0xab
[0x00007fce6c24a4e2] ReservedHeapSpace::ReservedHeapSpace(unsigned long, unsigned long, bool, char*)+0x52
[0x00007fce6c216f65] Universe::reserve_heap(unsigned long, unsigned long)+0x65
[0x00007fce6bd6e681] G1CollectedHeap::initialize()+0x151
The Internal region is clearly oversized. Its growth can be tracked over time as follows.
For off-heap memory that grows slowly until it blows up, first set a baseline with jcmd pid VM.native_memory baseline, wait a while, then diff against it at summary or detail level with jcmd pid VM.native_memory detail.diff (or summary.diff). For off-heap leaks we focus on growth in the Internal region: significant growth there indicates a problem.
jcmd 17667 VM.native_memory baseline
jcmd 13527 VM.native_memory scale=MB detail.diff -- includes detailed allocation information
jcmd 13527 VM.native_memory scale=MB summary.diff -- summary information only
The steps above essentially confirm an off-heap memory leak. To monitor direct memory more closely, the following code can be loaded at startup.
package netty.util;
import java.lang.reflect.Field;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.util.ReflectionUtils;
import io.netty.util.internal.PlatformDependent;

/** Direct memory monitor: reads Netty's internal DIRECT_MEMORY_COUNTER via reflection. */
public class DirectMemoryReporter {
    private static final Logger LOGGER = LoggerFactory.getLogger(DirectMemoryReporter.class);
    private static final long _1M = 1024 * 1024;
    private static final String BUSINESS_KEY = "netty_direct_memory";
    private AtomicLong directMemory;

    public void init() {
        Field field = ReflectionUtils.findField(PlatformDependent.class, "DIRECT_MEMORY_COUNTER");
        field.setAccessible(true);
        try {
            // static field, so the instance argument is ignored
            directMemory = (AtomicLong) field.get(PlatformDependent.class);
            startReport();
        } catch (IllegalAccessException e) {
            LOGGER.error("netty_direct_memory error", e);
        }
    }

    public void startReport() {
        ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor();
        // second argument: initial delay; third argument: interval between runs
        service.scheduleAtFixedRate(this::doReport, 30, 1, TimeUnit.SECONDS);
    }

    private void doReport() {
        try {
            // keep the value as long: an int cast overflows right at the 2 GiB mark we are debugging
            long memoryInBytes = directMemory.get();
            LOGGER.info("{}:{}MB({}B)", BUSINESS_KEY, memoryInBytes / _1M, memoryInBytes);
        } catch (Exception e) {
            LOGGER.error("netty_direct_memory error", e);
        }
    }
}
Start it with new DirectMemoryReporter().init(). Direct memory usage then appears in the log every second:
2021-03-02 09:58:59,073|[INFO]|[pool-11-thread-1]|[]|xxx.DirectMemoryReporter|netty_direct_memory:2032MB(2130706439B)
The monitoring showed that under load, direct memory usage kept climbing until it reached the Xmx value (when -XX:MaxDirectMemorySize is not set, it defaults to the same value as Xmx).
Pinpointing the Leak
Having established an off-heap direct memory leak, the next step is to locate exactly where it leaks, using pmap (inspect memory mappings), gdb (dump the contents of the suspect memory addresses), strace (trace memory allocation per thread), and jstack (identify the specific thread).
Note: apart from the pmap output, which is small, the strace and jstack logs can be very large. In this investigation the final jstack log was 6 GB and the strace log 1 GB; copied to a local machine, text editors could not open them, so work in the Linux environment as much as possible.
1. Start pmap, strace, and jstack tracing
First save the current memory map with pmap -X pid > pmap1.txt
and start the strace and jstack tracing.
After the performance run has reproduced the overflow, save the memory map again with pmap -X pid > pmap2.txt
to capture the allocations at overflow time.
pmap
pmap -X pid >> pmap1.txt
strace
Run this command before the load test and keep the session open:
strace -o logfile -p pid -Ff -Tt -e trace=brk,mmap,munmap
jstack
Print a thread snapshot at intervals via the script test.sh:
#!/bin/bash
while [ 1 ];
do
jstack -l 16067 >> /opt/jstack_0301.txt
sleep 10
done
./test.sh &
Comparing the two pmap snapshots reveals the candidate leaked address. Analysis showed that the row
7f4674483000 rw-p 00000000 00:00 0 1575296 1574868 1574868 1535980 1574868 0 0
grew by a very large amount.
2. Dump the suspect memory with gdb
Using an offset from the start address, dump the suspect region's contents with gdb. This step requires the leaked address identified above; if the contents contain timestamps, decode them.
gdb
The end address can be written as the start plus a decimal offset: for start address 0x7f4674483000 and a 1 GB offset, i.e. 1x1024x1024x1024 (1073741824), the end address is 0x7f4674483000+1073741824. Note that addresses must begin with 0x; pmap output omits the 0x prefix.
gdb attach pid
c
dump memory <file> <start-address> <end-address>
Open the dumped file to inspect the decoded memory contents,
e.g. with vi or hexdump.
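The end-address arithmetic above can be sketched in a few lines (the helper name is mine, not from any tool):

```java
public class GdbRange {
    // pmap prints addresses without the 0x prefix; gdb wants them with it
    static String endAddress(String pmapStartHex, long offsetBytes) {
        long start = Long.parseUnsignedLong(pmapStartHex, 16);
        return "0x" + Long.toHexString(start + offsetBytes);
    }

    public static void main(String[] args) {
        // start of the suspect region plus a 1 GiB offset
        System.out.println(endAddress("7f4674483000", 1L << 30)); // 0x7f46b4483000
    }
}
```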
An example follows:
[root@host-x-x-x-x ~]# gdb attach 16067
GNU gdb (GDB) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
... (output omitted)
warning: Cannot parse .gnu_debugdata section; LZMA support was disabled at compile time
--Type <RET> for more, q to quit, c to continue without paging--c # type c and press Enter to continue into gdb
(gdb) dump memory out.bin 0x7f4674483000 0x7f4674483000+1073741824 # dump the memory region; the offset usually needs computing, e.g. 64 MB
(gdb) q # type q to exit
A debugging session is active.
Inferior 1 [process 16067] will be detached.
Quit anyway? (y or n) y # type y to detach and exit
Detaching from program: /root/jdk-8u191/bin/java, process 16067
[Inferior 1 (process 16067) detached]
The dump is written to out.bin; open it to inspect the decoded contents.
[root@host-x-x-x-x ~]# ls -alh out.bin
-rw------- 1 root root 1.0G Mar 2 11:01 out.bin
[root@host-10-33-144-227 ~]# vi out.bin
^@^@^@^@^@^@^@^@^B^P^@^A^@^@^@^@^@^@^@^A^@^@^@^@^G^@^@ó^@^@^@^@DATA^M
2 xxx@xxx^M
3 To: xxx^M
4 Message-ID: <651317388.574761.1614644538528.JavaMail.root@host-xxx>^M
5 Subject: =?UTF-8?B?5rWL6K+V6YKu5Lu2?=^M
6 MIME-Version: 1.0^M
7 Content-Type: multipart/mixed; ^M
8 boundary="----=_Part_574760_1068795689.1614644538528"^M
9 ^M
10 ------=_Part_574760_1068795689.1614644538528^M
11 Content-Type: text/plain; charset=UTF-8^M
12 Content-Transfer-Encoding: base64^M
13 ^M
14 6L+Z5piv5LiA5p2h5rWL6K+V6YKu5Lu277yBClRoaXMgaXMgYSB0ZXN0IGVtYWlsIQDATA^
[root@host-10-33-144-227 ~]# hexdump -C out.bin # view in hexadecimal
00219420 41 55 54 48 20 4c 4f 47 49 4e 0d 0a 40 68 6d 61 |AUTH LOGIN..@xx|
00219430 69 6c 2e 68 75 61 77 65 69 2e 63 6f 6d 0d 0a 54 |xx.xxx.com..T|
00219440 6f 3a 20 6c 68 6e 39 34 31 34 40 68 6d 61 69 6c |o: xxx|
00219450 2e 68 75 61 77 65 69 2e 63 6f 6d 0d 0a 4d 65 73 |.xxx.com..Mes|
00219460 73 61 67 65 2d 49 44 3a 20 3c 31 33 38 38 32 34 |sage-ID: <138824|
00219470 33 34 39 2e 35 38 31 31 31 33 2e 31 36 31 34 36 |349.581113.16146|
00219480 34 34 39 35 33 32 32 30 2e 4a 61 76 61 4d 61 69 |44953220.JavaMai|
00219490 6c 2e 72 6f 6f 74 40 68 6f 73 74 2d 31 30 2d 33 |l.root@host-10-3|
002194a0 33 2d 31 34 34 2d 31 39 30 3e 0d 0a 53 75 62 6a |3-144-190>..Subj|
002194b0 65 63 74 3a 20 3d 3f 55 54 46 2d 38 3f 42 3f 35 |ect: =?UTF-8?B?5|
002194c0 72 57 4c 36 4b 2b 56 36 59 4b 75 35 4c 75 32 3f |rWL6K+V6YKu5Lu2?|
002194d0 3d 0d 0a 4d 49 4d 45 2d 56 65 72 73 69 6f 6e 3a |=..MIME-Version:|
002194e0 20 31 2e 30 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 | 1.0..Content-Ty|
002194f0 70 65 3a 20 6d 75 6c 74 69 70 61 72 74 2f 6d 69 |pe: multipart/mi|
00219500 78 65 64 3b 20 0d 0a 09 62 6f 75 6e 64 61 72 79 |xed; ...boundary|
00219510 3d 22 2d 2d 2d 2d 3d 5f 50 61 72 74 5f 35 38 31 |="----=_Part_581|
00219520 31 31 32 5f 37 38 38 34 33 32 36 31 34 2e 31 36 |112_788432614.16|
00219530 31 34 36 34 34 39 35 33 32 32 30 22 0d 0a 0d 0a |14644953220"....|
00219540 2d 2d 2d 2d 2d 2d 3d 5f 50 61 72 74 5f 35 38 31 |------=_Part_581|
00219550 31 31 32 5f 37 38 38 34 33 32 36 31 34 2e 31 36 |112_788432614.16|
00219560 31 34 36 34 34 39 35 33 32 32 30 0d 0a 43 6f 6e |14644953220..Con|
00219570 74 65 6e 74 2d 54 79 70 65 3a 20 74 65 78 74 2f |tent-Type: text/|
00219580 70 6c 61 69 6e 3b 20 63 68 61 72 73 65 74 3d 55 |plain; charset=U|
00219590 54 46 2d 38 0d 0a 43 6f 6e 74 65 6e 74 2d 54 72 |TF-8..Content-Tr|
002195a0 61 6e 73 66 65 72 2d 45 6e 63 6f 64 69 6e 67 3a |ansfer-Encoding:|
002195b0 20 62 61 73 65 36 34 0d 0a 0d 0a 36 4c 2b 5a 35 | base64....6L+Z5|
002195c0 70 69 76 35 4c 69 41 35 70 32 68 35 72 57 4c 36 |piv5LiA5p2h5rWL6|
002195d0 4b 2b 56 36 59 4b 75 35 4c 75 32 37 37 79 42 43 |K+V6YKu5Lu277yBC|
002195e0 6c 52 6f 61 58 4d 67 61 58 4d 67 59 53 42 30 5a |lRoaXMgaXMgYSB0Z|
002195f0 58 4e 30 49 47 56 74 59 57 6c 73 49 51 3d 3d 0d |XN0IGVtYWlsIQ==.|
00219600 0a 2d 2d 2d 2d 2d 2d 3d 5f 50 61 72 74 5f 35 38 |.------=_Part_58|
00219610 31 31 31 32 5f 37 38 38 34 33 32 36 31 34 2e 31 |1112_788432614.1|
00219620 36 31 34 36 34 34 39 35 33 32 32 30 2d 2d 0d 0a |614644953220--..|
00219630 2e 0d 0a 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00219640 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00219820 45 48 4c 4f 20 68 6f 73 74 2d 31 30 2d 33 33 2d |EHLO host-xxx-|
00219830 31 34 34 2d 31 39 30 0d 0a 00 00 00 00 00 00 00 |xx-xx.........|
00219840 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
From the contents, determine the data's timestamp (to the second: drop the last 3 digits, which are milliseconds), e.g.:
[root@host-x-x-x-x ~]# date -d @1614625596
Tue Mar 2 03:06:36 CST 2021
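The same conversion can be done in Java (dropping the trailing milliseconds as described; the time zone is assumed to be the server's, here China Standard Time):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class EpochCheck {
    // Message-ID timestamps like 1614644538528 are milliseconds; divide by 1000 for seconds
    static String toLocal(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds)
                .atZone(ZoneId.of("Asia/Shanghai"))   // assumed server zone (CST, UTC+8)
                .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
    }

    public static void main(String[] args) {
        System.out.println(toLocal(1614644538528L / 1000));
        System.out.println(toLocal(1614625596L));   // matches the date -d output above
    }
}
```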
3. Use strace to find the thread that allocated the memory
In the strace log, search for the leaked address obtained in the previous step to see which thread requested it, and check whether that address was only ever mmap'ed without a matching munmap; if it was never released, it is the leak suspect.
The analysis above identified 0x7f4674483000 as leaked, so search for the thread that allocated it:
[root@host-x-x-x-x opt]# grep "0x7f4674483000" strace_0301.txt
25897 08:22:40.071076 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4674483000 <0.000524>
The result shows that thread 25897 allocated this memory with a single mmap (about 16 MB) and never released it with munmap, so it leaks.
Filtering by the allocated size, 16781312 bytes:
[root@host-10-33-144-227 opt]# grep 16781312 strace_0301.txt |wc -l
96
[root@host-10-33-144-227 opt]# grep 16781312 strace_0301.txt
25900 22:48:40.293407 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46d34e2000 <0.001026>
25898 22:48:40.779514 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46d24e1000 <0.000931>
25895 22:48:42.072311 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25894 22:48:43.126104 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46d04df000 <0.000846>
25896 22:48:43.453893 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 22:48:54.240661 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25893 22:48:55.871836 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 22:48:58.918825 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46cc4db000 <0.000926>
25899 23:40:52.070042 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25900 23:40:52.076028 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46ca4d9000 <0.001435>
25894 23:40:53.138579 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25895 23:40:53.993883 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 23:40:54.021415 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46c74d6000 <0.000961>
25896 23:40:55.046603 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46c64d5000 <0.001035>
25898 23:41:01.209776 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46c54d4000 <0.001205>
25893 23:41:38.545986 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25898 00:32:43.584853 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 00:32:45.662573 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25900 00:32:46.247013 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25896 00:32:47.807808 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46c04cf000 <0.000738>
25894 00:32:48.105049 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46bf4ce000 <0.000832>
25895 00:32:48.646481 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46be4cd000 <0.000840>
25897 00:32:51.705296 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25893 00:33:31.796557 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25898 01:24:53.214635 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 01:24:54.268791 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25900 01:24:55.509481 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46b94c8000 <0.001084>
25894 01:24:55.975972 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 01:24:56.313316 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46b74c6000 <0.000877>
25896 01:24:57.998530 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46b64c5000 <0.001148>
25895 01:25:02.973712 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25893 01:25:39.658599 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46b44c3000 <0.000751>
25898 02:16:08.550683 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46b34c2000 <0.000872>
25894 02:16:09.253646 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46b24c1000 <0.000981>
25900 02:16:09.868975 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 02:16:10.299096 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46b04bf000 <0.000774>
25895 02:16:10.898468 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46af4be000 <0.000921>
25896 02:16:11.490967 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46ae4bd000 <0.000816>
25899 02:16:16.064957 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46ad4bc000 <0.001132>
25893 02:16:56.362348 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25898 03:07:37.304262 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46ab4ba000 <0.000832>
25899 03:07:37.985234 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46aa4b9000 <0.000840>
25895 03:07:40.860608 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46a94b8000 <0.001047>
25900 03:07:41.419694 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25896 03:07:43.400255 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 03:07:43.572092 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46a64b5000 <0.001165>
25894 03:07:49.068627 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46a54b4000 <0.001139>
25893 03:08:22.889127 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 04:00:37.946945 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46a34b2000 <0.001101>
25900 04:00:41.183475 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25898 04:00:41.470902 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46a14b0000 <0.000641>
25896 04:00:42.676048 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25894 04:00:43.223683 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f469f4ae000 <0.001021>
25897 04:00:45.400673 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f469e4ad000 <0.000564>
25895 04:00:46.895991 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25893 04:01:24.135505 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25898 04:52:44.120418 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f469b4aa000 <0.000958>
25894 04:52:46.038504 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f469a4a9000 <0.001086>
25900 04:52:46.404094 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46994a8000 <0.001060>
25896 04:52:46.780690 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25895 04:52:47.319693 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 04:52:47.591897 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 04:52:54.528811 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46954a4000 <0.000924>
25893 04:53:27.404271 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 05:44:25.134003 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25898 05:44:26.525414 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25900 05:44:27.165630 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25894 05:44:28.995850 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f469049f000 <0.000962>
25896 05:44:31.020882 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 05:44:31.657577 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25895 05:44:33.198010 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f468d49c000 <0.000796>
25893 05:45:08.843821 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25898 06:37:21.368773 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f468b49a000 <0.000512>
25899 06:37:23.990442 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f468a499000 <0.000969>
25894 06:37:24.215510 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4689498000 <0.001021>
25897 06:37:26.174882 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25895 06:37:26.414563 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4687496000 <0.000920>
25900 06:37:26.507630 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25896 06:37:39.847076 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25893 06:37:51.091554 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25894 07:31:27.933343 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25896 07:31:29.163123 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4682491000 <0.000441>
25898 07:31:29.324907 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4681490000 <0.000298>
25900 07:31:31.294559 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25893 07:31:31.443325 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 07:31:39.104532 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f467e48d000 <0.000974>
25895 07:31:43.078283 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25897 07:31:52.607503 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f467c48b000 <0.001193>
25898 08:22:19.669250 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f467b48a000 <0.000921>
25900 08:22:20.093744 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f467a489000 <0.000832>
25896 08:22:22.625889 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4679488000 <0.001093>
25893 08:22:23.076138 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25894 08:22:27.951371 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25895 08:22:38.615218 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
25899 08:22:38.734819 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4675484000 <0.000974>
25897 08:22:40.071076 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4674483000 <0.000524>
The total memory requested is 1536.375 MB (16781312 x 96 = 1611005952 bytes). This roughly matches the suspect region found in step 2,
7f4674483000 rw-p 00000000 00:00 0 1575296 1574868 1574868 1535980 1574868 0 0
whose size is 1575296 KB / 1024 = 1538.375 MB. We conclude that all of this memory leaked.
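Checking that arithmetic (96 mmap calls of 16781312 bytes each, against the ~1538 MB pmap region):

```java
public class LeakSizeCheck {
    static double mmapTotalMb(long bytesPerCall, int calls) {
        return bytesPerCall * (double) calls / (1024 * 1024);
    }

    public static void main(String[] args) {
        double mmapMb = mmapTotalMb(16781312L, 96);   // size and count taken from strace
        double pmapMb = 1575296 / 1024.0;             // pmap reports the region in KiB
        System.out.printf("mmap total: %.3f MB, pmap region: %.3f MB%n", mmapMb, pmapMb);
    }
}
```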
4. Find the thread in the jstack log
Convert the thread id 25897 from the previous step to hexadecimal, 6529 (printf "%x\n" 25897), and search the jstack log for it. Because the thread pool reuses threads, also match the strace timestamps against the jstack snapshot times and thread ids to identify the exact thread.
[root@host-x-x-x-x opt]# printf "%x\n" 25897
6529
[root@host-x-x-x-x opt]# grep 6529 -C 10 jstack_0301.txt |more
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
"WorkerPetalGroup-9-5" #236 prio=5 os_prio=0 tid=0x0000000002d4b000 nid=0x6529 waiting for monitor entry [0x00007f47142cc000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1906)
- waiting to lock <0x000000009bfd68b8> (a java.lang.Object)
at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:288)
at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:223)
at org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:269)
at org.apache.james.queue.activemq.ActiveMQMailQueue.produceMail(ActiveMQMailQueue.java:203)
at org.apache.james.queue.jms.JMSMailQueue.enQueue(JMSMailQueue.java:275)
at org.apache.james.queue.api.MailQueue.enQueue(MailQueue.java:96)
at org.apache.james.queue.jms.JMSMailQueue.enQueue(JMSMailQueue.java:304)
--
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
"WorkerPetalGroup-9-5" #236 prio=5 os_prio=0 tid=0x0000000002d4b000 nid=0x6529 waiting for monitor entry [0x00007f47142cc000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1906)
- waiting to lock <0x000000009bfd68b8> (a java.lang.Object)
at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:288)
at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:223)
at org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:269)
at org.apache.james.queue.activemq.ActiveMQMailQueue.produceMail(ActiveMQMailQueue.java:203)
at org.apache.james.queue.jms.JMSMailQueue.enQueue(JMSMailQueue.java:275)
at org.apache.james.queue.api.MailQueue.enQueue(MailQueue.java:96)
at org.apache.james.queue.jms.JMSMailQueue.enQueue(JMSMailQueue.java:304)
--
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
This thread, WorkerPetalGroup-9-5, is one of our custom Netty threads. To confirm it further, take a timestamp such as 22:48:58 from strace:
25897 22:48:58.918825 mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46cc4db000 <0.000926>
and search the jstack log for 2021-03-01 22:48:5 (jstack snapshots are taken every 10 seconds). Searching forward from that hit (log line 8093667) for the thread finds the following:
8114644 "WorkerPetalGroup-9-5" #236 prio=5 os_prio=0 tid=0x0000000002d4b000 nid=0x6529 runnable [0x00007f47142cd000]
8114645 java.lang.Thread.State: RUNNABLE
8114646 at sun.nio.ch.NativeThread.current(Native Method)
8114647 at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:326)
8114648 - locked <0x000000009a3d1f10> (a java.lang.Object)
8114649 - locked <0x000000009a3d1ef0> (a java.lang.Object)
8114650 at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
8114651 at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1133)
8114652 at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
8114653 at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
8114654 at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
8114655 at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
8114656 at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
8114657 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
8114658 at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
8114659 at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
8114660 at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
8114661 at java.lang.Thread.run(Thread.java:748)
8114662
8114663 Locked ownable synchronizers:
8114664 - None
These samples show that every allocation goes through
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
Searching the source for
io.netty.channel.nio.AbstractNioByteChannel.NioByteUnsafe#read
confirms that the buffers are allocated in this method.
5. Analyze the code
Netty reclaims off-heap memory through the ByteBuf reference counter, readable via refCnt(). Our handler extends io.netty.channel.SimpleChannelInboundHandler, whose channelRead calls the release method for the off-heap buffer (the memory is only freed once the count reaches 0):
@Override
public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
    boolean release = true;
    try {
        if (acceptInboundMessage(msg)) {
            @SuppressWarnings("unchecked")
            I imsg = (I) msg;
            channelRead0(ctx, imsg);
        } else {
            release = false;
            ctx.fireChannelRead(msg);
        }
    } finally {
        if (autoRelease && release) {
            ReferenceCountUtil.release(msg);
        }
    }
}
Set a breakpoint on the ReferenceCountUtil.release(msg) line and compare msg's refCnt() value before and after the call. If the value is still non-zero after release, the off-heap memory can never be freed. Check the code for calls to msg.retain(), which increments the reference count by 1.
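To make this failure mode concrete, here is a toy model of reference counting (not Netty's actual ByteBuf API) showing why an unmatched retain() keeps the buffer from ever being freed:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy reference-counted buffer: freed only when the count drops to zero.
public class RefCountDemo {
    static class Buf {
        final AtomicInteger refCnt = new AtomicInteger(1); // starts at 1, like a new ByteBuf
        boolean freed;

        Buf retain() { refCnt.incrementAndGet(); return this; }

        boolean release() {
            if (refCnt.decrementAndGet() == 0) {
                freed = true; // only now would the direct memory be returned
            }
            return freed;
        }
    }

    public static void main(String[] args) {
        Buf leaked = new Buf();
        leaked.retain();   // e.g. an extra msg.retain() somewhere in a handler
        leaked.release();  // count 2 -> 1: NOT freed, so the direct memory leaks
        System.out.println("freed after one release: " + leaked.freed);
        leaked.release();  // the matching release: count 1 -> 0, finally freed
        System.out.println("freed after matching release: " + leaked.freed);
    }
}
```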
Miscellaneous
Custom Netty thread names
Other components in the system also depend on Netty, so to tell the threads apart we give our business threads explicit names:
// specify the event loop groups and their thread names
bossGroup = new NioEventLoopGroup(new DefaultThreadFactory("BossPetalGroup"));
workerGroup = new NioEventLoopGroup(ioWorker, new DefaultThreadFactory("WorkerPetalGroup"));
Tools involved
1. JVisualVM/JConsole to monitor JVM resources
2. MAT (Memory Analyzer Tool) to analyze leaked heap memory and thread resources; JProfiler to inspect object instances
3. NMT (Native Memory Tracking) together with the jcmd command (more powerful than jmap) for off-heap memory
4. pmap combined with strace to trace memory allocation; gdb to decode memory contents
5. jstack, together with the tools from item 4, for thread analysis
6. jps to list Java processes; jinfo -flags to view JVM flags
7. free -g to check system memory
8. netstat -aon | grep port to check port status
9. jstat -gcutil 31974 1000 10 to check JVM GC behavior
Netty memory-leak basics
Inbound messages
AbstractNioByteChannel.NioByteUnsafe.read() creates a ByteBuf and sends it into the handler chain via pipeline.fireChannelRead(byteBuf). Per the rule that whoever handles a message last is responsible for releasing it, each handler can treat the message in one of three ways:
- Pass the original message on unchanged with ctx.fireChannelRead(msg): no release is needed.
- Transform the original message into a new one and pass that on with ctx.fireChannelRead(newMsg): the original message must be released.
- Stop forwarding altogether and never call ctx.fireChannelRead(msg): then the original message must certainly be released.
Assuming every handler forwards the message, no handler knows whether it is the last one in the chain configured at Netty startup, so Netty appends a tail handler (TailContext) at the end of the pipeline; if the message arriving there is still a ReferenceCounted, it is released there.