1. coredump
When a user-space process crashes, a coredump file is generated in the process's working directory. To change where coredump files are written, use the settings below.
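    echo "/home/core-%e-%p-%u-%g-%t" > /proc/sys/kernel/core_pattern
    echo 0x000003ff > /proc/self/coredump_filter
    ulimit -c unlimited        (add this line to /etc/profile)
    source /etc/profile
The first command sets the file name template for core files, the second selects which kinds of memory mappings are written to the dump, and the third removes the size limit on core files.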
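The specifiers available in core_pattern include:
    %%    a literal %
    %p    PID of the dumped process
    %u    UID of the dumped process
    %g    GID of the dumped process
    %s    number of the signal that caused the dump
    %t    time of the dump (seconds since the Epoch)
    %e    name of the executable
    %h    hostname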
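To see what file name a given template produces, the expansion can be imitated in the shell. This is only a rough sketch: expand_core_pattern is a hypothetical helper, not a kernel interface, and it handles just %%, %e, %p, %u, %g and %t, leaving any other specifier untouched.

```shell
# Hypothetical helper: expand a core_pattern template by hand.
# Arguments: pattern, executable name, pid, uid, gid, timestamp.
expand_core_pattern() {
    awk -v pat="$1" -v e="$2" -v p="$3" -v u="$4" -v g="$5" -v t="$6" 'BEGIN {
        out = ""
        n = length(pat)
        for (i = 1; i <= n; i++) {
            c = substr(pat, i, 1)
            if (c != "%") { out = out c; continue }
            if (i == n) break              # a lone trailing % is dropped
            s = substr(pat, ++i, 1)        # character following the %
            if      (s == "%") out = out "%"
            else if (s == "e") out = out e
            else if (s == "p") out = out p
            else if (s == "u") out = out u
            else if (s == "g") out = out g
            else if (s == "t") out = out t
            else               out = out "%" s   # not handled by this sketch
        }
        print out
    }'
}

# Example values (pid and timestamp chosen for illustration only)
expand_core_pattern '/home/core-%e-%p-%u-%g-%t' sysmonitor 98927 0 0 1468144735
```

With those example values the template expands to /home/core-sysmonitor-98927-0-0-1468144735.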
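Commonly used gdb commands for analysing a coredump:
    bt                                     print the call stack
    f num                                  select and show frame num of the call stack
    disassemble 0x000000000040b9f0         disassemble the code around the given address
    i r                                    show the register contents
    p *(struct link_map*)0x7fab515ff690    print the memory at an address as a struct
    x /40xb 0x7fab515ff690                 dump 40 bytes in hex starting at an address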
info proc mappings
    (gdb) info proc mappings        (shows the process memory map)
    Mapped address spaces:

        Start Addr        End Addr          Size        Offset      objfile
        0x400000          0x410000          0x10000     0x0         /usr/bin/sysmonitor
        0x60f000          0x610000          0x1000      0xf000      /usr/bin/sysmonitor
        0x610000          0x611000          0x1000      0x10000     /usr/bin/sysmonitor
        0x7fab509ee000    0x7fab50ba4000    0x1b6000    0x0         /usr/lib64/libc-2.17.so
        0x7fab50ba4000    0x7fab50da3000    0x1ff000    0x1b6000    /usr/lib64/libc-2.17.so
        0x7fab50da3000    0x7fab50da7000    0x4000      0x1b5000    /usr/lib64/libc-2.17.so
        0x7fab50da7000    0x7fab50da9000    0x2000      0x1b9000    /usr/lib64/libc-2.17.so
        0x7fab50dae000    0x7fab50daf000    0x1000      0x0         /usr/lib64/libalarm.so
        0x7fab50daf000    0x7fab50faf000    0x200000    0x1000      /usr/lib64/libalarm.so
        0x7fab50faf000    0x7fab50fb0000    0x1000      0x1000      /usr/lib64/libalarm.so
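A mapping list like this is mostly used to answer "which file does this address belong to?". The lookup can be sketched in the shell as follows. find_mapping is a hypothetical helper that reads the five-column rows (start, end, size, offset, objfile) on standard input and assumes they are whitespace-separated as in the gdb output.

```shell
# Hypothetical helper: report which mapping contains an address.
# Usage: find_mapping ADDRESS < mappings.txt
find_mapping() {
    addr=$(( $1 ))
    while read -r start end size offset objfile; do
        # Skip headers and anything that is not a row of hex numbers
        case $start in 0x*) ;; *) continue ;; esac
        if [ "$addr" -ge $(( $start )) ] && [ "$addr" -lt $(( $end )) ]; then
            # objfile plus the offset of the address inside that file
            printf '%s+0x%x\n' "$objfile" $(( addr - $start + $offset ))
            return 0
        fi
    done
    return 1
}

# Example: the address disassembled earlier, against two of the rows
find_mapping 0x40b9f0 <<'EOF'
0x400000 0x410000 0x10000 0x0 /usr/bin/sysmonitor
0x60f000 0x610000 0x1000 0xf000 /usr/bin/sysmonitor
EOF
```

For 0x40b9f0 this reports /usr/bin/sysmonitor+0xb9f0, i.e. the address lies in the first mapping of the executable.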
info files
    (gdb) info files
    Symbols from "/home/sysmonitor".
    Local core dump file:
        `/home/core.sysmonitor_98927_1468144735', file type elf64-x86-64.
        0x0000000000400000 - 0x0000000000410000 is load1
        0x000000000060f000 - 0x0000000000610000 is load2
        0x0000000000610000 - 0x0000000000611000 is load3
        0x0000000000611000 - 0x0000000000644000 is load4
        0x00000000023c0000 - 0x00000000023e1000 is load5
        0x00007fab3c000000 - 0x00007fab3c021000 is load6
        0x00007fab3c021000 - 0x00007fab40000000 is load7
        0x00007fab44000000 - 0x00007fab44021000 is load8
        0x00007fab44021000 - 0x00007fab48000000 is load9
        0x00007fab4a9e2000 - 0x00007fab4a9e3000 is load10
        0x00007fab4a9e3000 - 0x00007fab4b1e3000 is load11
        0x00007fab4b1e3000 - 0x00007fab4b1e4000 is load12
        0x00007fab4b1e4000 - 0x00007fab4b9e4000 is load13
        0x00007fab50da3740 - 0x00007fab50da3748 is .init_array in /usr/lib64/libc.so.6
        0x00007fab50da3748 - 0x00007fab50da3838 is __libc_subfreeres in /usr/lib64/libc.so.6
        0x00007fab50da3838 - 0x00007fab50da3840 is __libc_atexit in /usr/lib64/libc.so.6
        0x00007fab50da3840 - 0x00007fab50da3858 is __libc_thread_subfreeres in /usr/lib64/libc.so.6
        0x00007fab50da3860 - 0x00007fab50da6b80 is .data.rel.ro in /usr/lib64/libc.so.6
        0x00007fab50da6b80 - 0x00007fab50da6d70 is .dynamic in /usr/lib64/libc.so.6
        0x00007fab50da6d70 - 0x00007fab50da6ff0 is .got in /usr/lib64/libc.so.6
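To open a core file in gdb, pass the executable and then the core file: gdb /usr/bin/cmd core_xxx. The executable should be the one with debuginfo, or one built with the -g option.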
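For a quick first look, the same commands can be run without an interactive session. The wrapper below is a minimal sketch: dump_core_summary is a hypothetical name, and it relies only on gdb's -batch and -ex options.

```shell
# Hypothetical wrapper: print a short crash summary from a core file.
# Usage: dump_core_summary /path/to/executable /path/to/core
dump_core_summary() {
    exe=$1
    core=$2
    # Refuse to start gdb if either file is missing or unreadable
    if [ ! -r "$exe" ] || [ ! -r "$core" ]; then
        echo "usage: dump_core_summary EXECUTABLE COREFILE" >&2
        return 2
    fi
    # -batch exits after the -ex commands have run
    gdb -batch \
        -ex 'bt' \
        -ex 'info registers' \
        -ex 'info proc mappings' \
        "$exe" "$core"
}
```

It would be called as dump_core_summary /usr/bin/cmd core_xxx, with the same two paths used for the interactive session.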
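watch *(int*)<address> sets a watchpoint on the 4 bytes at that address, to catch user-space code modifying them. Watchpoints need a running process, not a core file.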
2. dmesg
When a user process dumps core, the fault is also recorded in the messages log, and in particular in dmesg, for example:
    php-fpm[22053]: segfault at 2559 ip 000000398a6145b2 sp 00007fffad1d4b78 error 4 in ld-2.5.so[398a600000+1c000]
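The dmesg log gives useful information too: the faulting address 0x2559, the instruction pointer (IP) 0x000000398a6145b2, the current stack pointer (SP) 0x00007fffad1d4b78, the error code 4, and the fact that the segfault happened inside ld-2.5.so. We have no debug symbols for ld-2.5.so, so we cannot map the fault directly to a source line. From the IP register, however, we can still find the exact assembly instruction.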
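The error code in the segfault line is a bit mask. A small sketch of decoding its three lowest bits follows, assuming the x86 page-fault error code layout (bit 0: protection violation rather than page not present, bit 1: write rather than read, bit 2: user mode rather than kernel mode). decode_segv_error is a hypothetical helper name.

```shell
# Hypothetical helper: decode the low bits of an x86 page-fault error code.
decode_segv_error() {
    code=$(( $1 ))
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then mode="user"; else mode="kernel"; fi
    echo "$mode-mode $access, $cause"
}

decode_segv_error 4   # prints: user-mode read, page not present
```

So error 4 in the log line means a user-mode read from an address that is not mapped, which fits the small faulting address 0x2559.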
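    objdump -d /lib64/ld-2.5.so > ld.asm

    000000398a6145b0 <strcmp>:
      398a6145b0:   8a 07    mov    (%rdi),%al
      398a6145b2:   3a 06    cmp    (%rsi),%al
      398a6145b4:   75 0d    jne    398a6145c3 <strcmp+0x13>
The IP 0x398a6145b2 is the cmp (%rsi),%al instruction, so the fault was a read through %rsi. That suggests the second argument passed to strcmp was the invalid pointer 0x2559.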
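In this example the addresses in the disassembly already match the runtime IP. When they do not, for instance when the library is loaded at a different base address, the IP first has to be converted into an offset inside the mapping, using the base and size from the [base+size] part of the dmesg line. A minimal sketch, where ip_offset is a hypothetical helper:

```shell
# Hypothetical helper: offset of the faulting IP inside its mapping.
# Usage: ip_offset IP BASE SIZE   (all hex, taken from the dmesg line)
ip_offset() {
    ip=$(( $1 ))
    base=$(( $2 ))
    size=$(( $3 ))
    off=$(( ip - base ))
    # Sanity check: the IP must fall inside [base, base+size)
    if [ "$off" -lt 0 ] || [ "$off" -ge "$size" ]; then
        echo "ip is outside the mapping" >&2
        return 1
    fi
    printf '0x%x\n' "$off"
}

ip_offset 0x398a6145b2 0x398a600000 0x1c000   # prints 0x145b2
```

The resulting offset is the one to look for in the objdump output of a library whose disassembly starts at address 0.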