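Keywords: gdb, strace, kprobe, uprobe, objdump, meminfo, valgrind, backtrace, and others.
Debug Hacks (Chinese edition, subtitled "In-depth debugging techniques and tools") was written jointly by several colleagues at Miracle Linux and focuses on debugging techniques and tools under Linux.
This article summarizes the book and supplements it where appropriate.
The content is organized along the Debug Hacks map as follows: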
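0. General techniques
"objdump", "strace", "kprobes", "uprobes", systemtap, oprofile, "valgrind", "/proc/meminfo", "/proc/<pid>/maps"
1. Handling abnormal program termination
1.1 Application problems
For SEGV-type illegal memory accesses, the options include "offline analysis with core + gdb", using watch (gdb watchpoints) to catch the illegal access, and "stack backtracing with backtrace()/backtrace_symbols()".
1.2 Kernel problems
1.2.1 Kernel dump analysis
How is a kernel dump set up? How is the dump file analyzed?
Analysis of kernel panics caused by null pointer dereferences, corrupted linked lists, and similar faults.
Kernel hangs caused by infinite loops, spinlocks, semaphores, and the like.
Real-time processes that stop responding.
1.2.2 Other analysis
"Reading a memory Oops": Oops output differs widely between architectures, especially in the architecture-specific registers and stack information.
2. Programs that fail to terminate
2.1 Application problems
Use top to check whether the load is too high. If the load is low, go to "Analyzing an application hang caused by deadlock"; if it is high, go to "Analyzing an application hang caused by an infinite loop".
2.2 Kernel problems
Set up the kernel WDT (watchdog timer) to detect anomalies
Analyze the problem through SysRq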
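1. Using strace to find clues to a failure
strace traces system calls and shows their arguments and return values.
strace -o filename saves the strace output to filename.
strace -f cmd also traces the child processes created by fork().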
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;

    fp = fopen("/etc/shadow", "r");
    if (fp == NULL) {
        printf("Error!\n");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
Compile the code above and run it with strace ./st1. The result is:
execve("./st1", ["./st1"], [/* 81 vars */]) = 0
...
open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied) ----- the root cause: a permission problem
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
write(1, "Error!\n", 7Error! ----- the text the program wrote to the console
) = 7
exit_group(1) = ?
+++ exited with 1 +++
1.1 Showing the system call address
strace -i prints the instruction pointer at the time of each system call:
[00007f53cb39e777] execve("./st1", ["./st1"], [/* 81 vars */]) = 0
...
[00007f5265e9a040] open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied)
[00007f5265e99c34] fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
[00007f5265e9a2c0] write(1, "Error!\n", 7Error!
) = 7
[00007f5265e6f748] exit_group(1) = ?
[????????????????] +++ exited with 1 +++
1.2 Showing system call timing
strace -tt and strace -ttt both prefix each system call with the absolute time at which it was made; only the format differs. -tt prints the time of day with microseconds, -ttt prints seconds since the epoch with microseconds.
09:10:35.411774 execve("./st1", ["./st1"], [/* 81 vars */]) = 0
...
09:10:35.416745 open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied)
09:10:35.416842 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
09:10:35.416925 write(1, "Error!\n", 7Error!
) = 7
09:10:35.417019 exit_group(1) = ?
09:10:35.417161 +++ exited with 1 +++

1537060239.606438 execve("./st1", ["./st1"], [/* 81 vars */]) = 0
...
1537060239.609897 open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied)
1537060239.609989 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
1537060239.610065 write(1, "Error!\n", 7Error!
) = 7
1537060239.610160 exit_group(1) = ?
1537060239.610330 +++ exited with 1 +++
strace -r instead prints a relative timestamp: the time between the start of the previous system call and the start of the current one.
0.000000 execve("./st1", ["./st1"], [/* 81 vars */]) = 0
...
0.000074 open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied)
0.000086 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
0.000076 write(1, "Error!\n", 7Error!
) = 7
0.000095 exit_group(1) = ?
0.000146 +++ exited with 1 +++
strace -T shows the time each system call took from its start to its end.
execve("./st1", ["./st1"], [/* 81 vars */]) = 0 <0.000439>
...
open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied) <0.000037>
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0 <0.000029>
write(1, "Error!\n", 7Error!
) = 7 <0.000039>
exit_group(1) = ?
strace -c prints a summary of system call timing: the share of total time (% time), the total time (seconds), the average time per call (usecs/call), the number of calls (calls), the number of failed calls (errors), and the system call name (syscall).
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 20.35    0.000046           7         7           mmap
 17.70    0.000040          10         4           mprotect
 15.04    0.000034          34         1           munmap
 11.95    0.000027           9         3         1 open
  7.96    0.000018           6         3         3 access
  7.96    0.000018          18         1           execve
  5.75    0.000013          13         1           write
  3.98    0.000009           3         3           fstat
  3.98    0.000009           3         3           brk
  2.21    0.000005           5         1           read
  2.21    0.000005           3         2           close
  0.88    0.000002           2         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000226                    30         4 total
By default strace -c counts only the CPU time spent in the kernel. strace -w -c, shown below, instead sums the wall-clock time from the start to the end of each call, which is closer to the latency the user actually sees.
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 31.18    0.000386         386         1           execve
 15.67    0.000194          28         7           mmap
 10.18    0.000126          32         4           mprotect
  7.19    0.000089          30         3         1 open
  6.38    0.000079          26         3         3 access
  5.90    0.000073          24         3           fstat
  5.82    0.000072          24         3           brk
  4.60    0.000057          57         1           munmap
  4.28    0.000053          53         1           arch_prctl
  3.72    0.000046          23         2           close
  2.99    0.000037          37         1           write
  2.10    0.000026          26         1           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.001238                    30         4 total
The two commands are compared below on a program that calls sleep(3); the difference is clear.
sudo strace -w -c -p `pidof st2`
strace: Process 6277 attached
^Cstrace: Process 6277 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.16   93.011518     3000372        31           nanosleep
  0.83    0.781669      781669         1           restart_syscall
  0.00    0.003852         120        32        32 open
  0.00    0.003817         119        32           write
------ ----------- ----------- --------- --------- ----------------
100.00   93.800856                    96        32 total

sudo strace -c -p `pidof st2`
strace: Process 6277 attached
^Cstrace: Process 6277 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 41.00    0.007169          54       132       132 open
 29.53    0.005164          39       131           nanosleep
 29.11    0.005091          39       132           write
  0.35    0.000062          62         1           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.017486                   396       132 total
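From the user's point of view strace -w -c -p is the more accurate of the two: each nanosleep() takes almost exactly 3 seconds. From the kernel's point of view, however, nanosleep() does not consume 3 seconds of CPU time but about 39 microseconds per call.
In summary, strace -T is the better fit for analyzing the performance of individual system calls, and strace -w -c is the better fit for system call statistics.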
1.3 Attaching to a running process
If the process is already running, strace -p `pidof st2` attaches to st2 and traces its system calls.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

int main(void)
{
    FILE *fp;

    while (1) {
        fp = fopen("/etc/shadow", "r");
        if (fp == NULL) {
            printf("Error!\n");
        } else {
            fclose(fp);   /* a FILE * is closed with fclose(), not close() */
        }
        sleep(3);
    }
    return EXIT_SUCCESS;
}
Once the program has started printing Error!, attach strace to st2 with system call timing enabled (the <...> durations below are what -T prints).
The trace shows both why Error! is printed and how long sleep(3) really takes.
strace: Process 6231 attached
restart_syscall(<... resuming interrupted nanosleep ...>) = 0 <0.637061>
open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied) <0.000117>
write(1, "Error!\n", 7) = 7 <0.000051>
nanosleep({3, 0}, 0x7ffea3ee5560) = 0 <3.000289>
open("/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied) <0.000107>
write(1, "Error!\n", 7) = 7 <0.000060>
1.4 Advanced use of strace -e
strace -e expr filters which system calls are traced.
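2. Using valgrind
2.1 Detecting memory leaks
A memory leak is memory that was allocated but never freed. Available memory keeps shrinking, which eventually leads to memory pressure.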
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main()
{
    char *p = malloc(10);

    return EXIT_SUCCESS;
}
Running valgrind --leak-check=yes ./test1 gives the following result.
==9218== Memcheck, a memory error detector
==9218== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9218== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9218== Command: ./test1
==9218==
==9218==
==9218== HEAP SUMMARY: ----- heap usage
==9218==     in use at exit: 10 bytes in 1 blocks
==9218==   total heap usage: 1 allocs, 0 frees, 10 bytes allocated ----- one allocation and no free: the root of the problem
==9218==
==9218== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1
==9218==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9218==    by 0x400537: main (test1.c:7) ----- where the block was allocated, i.e. the leak site
==9218==
==9218== LEAK SUMMARY: ----- leak categories and the amount in each
==9218==    definitely lost: 10 bytes in 1 blocks
==9218==    indirectly lost: 0 bytes in 0 blocks
==9218==      possibly lost: 0 bytes in 0 blocks
==9218==    still reachable: 0 bytes in 0 blocks
==9218==         suppressed: 0 bytes in 0 blocks
==9218==
==9218== For counts of detected and suppressed errors, rerun with: -v
==9218== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
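2.2 Detecting accesses to illegal memory addresses
Out-of-bounds accesses, that is, illegal accesses to heap addresses, cause problems that are hard to spot. The effects of the resulting memory corruption can also surface far away from the faulty write.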
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main()
{
    char *p = malloc(10);
    p[10] = 1;
    free(p);

    return EXIT_SUCCESS;
}
Running valgrind ./test2 gives the following result:
==9265== Memcheck, a memory error detector
==9265== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9265== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9265== Command: ./test2
==9265==
==9265== Invalid write of size 1 ----- error type: an invalid one-byte write
==9265==    at 0x400584: main (test2.c:8) ----- where the error occurred
==9265==  Address 0x520404a is 0 bytes after a block of size 10 alloc'd ----- the offending address
==9265==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9265==    by 0x400577: main (test2.c:7)
==9265==
==9265==
==9265== HEAP SUMMARY: ----- heap usage is fine: the allocated memory was freed correctly
==9265==     in use at exit: 0 bytes in 0 blocks
==9265==   total heap usage: 1 allocs, 1 frees, 10 bytes allocated
==9265==
==9265== All heap blocks were freed -- no leaks are possible
==9265==
==9265== For counts of detected and suppressed errors, rerun with: -v
==9265== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
2.3 Accessing freed memory
Accessing memory that has already been freed can likewise cause errors that are unpredictable and hard to explain.
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main()
{
    char *x = malloc(sizeof(int));

    free(x);
    int a = *x + 1;

    return EXIT_SUCCESS;
}
Running valgrind ./test4 gives the following result:
==9341== Memcheck, a memory error detector
==9341== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9341== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9341== Command: ./test4
==9341==
==9341== Invalid read of size 1 ----- an invalid one-byte read
==9341==    at 0x40058C: main (test4.c:10) ----- where the error occurred
==9341==  Address 0x5204040 is 0 bytes inside a block of size 4 free'd ----- the address points into a block that has already been freed; the allocation and free sites follow
==9341==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9341==    by 0x400587: main (test4.c:9)
==9341==  Block was alloc'd at
==9341==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9341==    by 0x400577: main (test4.c:7)
==9341==
==9341==
==9341== HEAP SUMMARY:
==9341==     in use at exit: 0 bytes in 0 blocks
==9341==   total heap usage: 1 allocs, 1 frees, 4 bytes allocated
==9341==
==9341== All heap blocks were freed -- no leaks are possible
==9341==
==9341== For counts of detected and suppressed errors, rerun with: -v
==9341== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
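2.4 Double free
A double free usually shows up as soon as the program runs, because the C library typically detects it and aborts the process. It can also be analyzed from a core dump with ulimit -c unlimited plus gdb.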
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main()
{
    char *x = malloc(sizeof(int));

    free(x);
    free(x);

    return EXIT_SUCCESS;
}
valgrind ./test5 nevertheless provides more detailed information:
==9636== Memcheck, a memory error detector
==9636== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9636== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9636== Command: ./test5
==9636==
==9636== Invalid free() / delete / delete[] / realloc() ----- an invalid free
==9636==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ----- where the error occurred; the legitimate allocation and free sites follow
==9636==    by 0x400593: main (test5.c:10)
==9636==  Address 0x5204040 is 0 bytes inside a block of size 4 free'd
==9636==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9636==    by 0x400587: main (test5.c:9)
==9636==  Block was alloc'd at
==9636==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9636==    by 0x400577: main (test5.c:7)
==9636==
==9636==
==9636== HEAP SUMMARY: ----- heap usage: one allocation, two frees
==9636==     in use at exit: 0 bytes in 0 blocks
==9636==   total heap usage: 1 allocs, 2 frees, 4 bytes allocated
==9636==
==9636== All heap blocks were freed -- no leaks are possible
==9636==
==9636== For counts of detected and suppressed errors, rerun with: -v
==9636== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
2.5 Illegal stack operations
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main()
{
    int a;
    int *p = &a;

    p -= 0x80;
    *p = 1;

    return EXIT_SUCCESS;
}
The code moves p back by 512 bytes (0x80 elements of type int), to an address below the stack pointer and therefore outside the part of the stack that is in use. The result is:
==10026== Memcheck, a memory error detector
==10026== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==10026== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==10026== Command: ./test6
==10026==
==10026== Invalid write of size 4 ----- an invalid write outside the live stack area
==10026==    at 0x400571: main (in /home/al/debug_hacks/valgrind/test6) ----- where the invalid write occurred
==10026==  Address 0xffefff8ac is on thread 1's stack
==10026==  500 bytes below stack pointer ----- offset relative to the stack pointer
==10026==
==10026==
==10026== HEAP SUMMARY:
==10026==     in use at exit: 0 bytes in 0 blocks
==10026==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==10026==
==10026== All heap blocks were freed -- no leaks are possible
==10026==
==10026== For counts of detected and suppressed errors, rerun with: -v
==10026== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
2.6 Mismatched free
A mismatched free is a call to free() on memory that was not allocated by malloc().
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>


int main()
{
    char szTest[100] = {0};
    char *p = szTest;
    free(p);

    return 0;
}
==11548== Memcheck, a memory error detector
==11548== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11548== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==11548== Command: ./test7
==11548==
==11548== Invalid free() / delete / delete[] / realloc() ----- an incorrect free()
==11548==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11548==    by 0x4005DD: main (test7.c:10) ----- where the error occurred
==11548==  Address 0xffefffa50 is on thread 1's stack
==11548==  in frame #1, created by main (test7.c:7)
==11548==
==11548==
==11548== HEAP SUMMARY:
==11548==     in use at exit: 0 bytes in 0 blocks
==11548==   total heap usage: 0 allocs, 1 frees, 0 bytes allocated ----- a free() without any malloc()
==11548==
==11548== All heap blocks were freed -- no leaks are possible
==11548==
==11548== For counts of detected and suppressed errors, rerun with: -v
==11548== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
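3. Stack backtracing with backtrace()/backtrace_symbols()
backtrace() captures the call stack of the current thread, and backtrace_symbols() turns the captured addresses into an array of strings. Together they provide stack backtracing in user space.
The header execinfo.h declares three functions for obtaining the function call stack of the current thread.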
#include <execinfo.h>

int backtrace(void **buffer, int size);
char **backtrace_symbols(void *const *buffer, int size);
void backtrace_symbols_fd(void *const *buffer, int size, int fd);
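int backtrace(void **buffer, int size)
This function captures the call stack of the current thread and stores it in buffer, which is an array of pointers. The size argument gives the number of void * elements that buffer can hold.
The return value is the number of pointers actually stored, which never exceeds size. The pointers in buffer are return addresses taken from the stack, one per stack frame.
Note that some compiler optimizations interfere with obtaining a correct call stack. Inlined functions have no stack frame of their own, and omitting the frame pointer can also prevent the stack from being parsed correctly.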
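char **backtrace_symbols(void *const *buffer, int size)
backtrace_symbols() converts the information obtained from backtrace() into an array of strings.
The buffer argument should be the array filled in by backtrace(), and size the number of elements in it (the return value of backtrace()). The return value is a pointer to an array of strings with the same number of entries as buffer.
Each string is a printable description of the corresponding element of buffer. It contains the function name, the offset into the function, and the actual return address.
At present, function names and offsets are available only for programs and libraries that use the ELF binary format. On other systems only the hexadecimal return address is available. You may also need to pass extra flags to the linker to make function names available (on systems using GNU ld, pass -rdynamic).
The strings produced by backtrace_symbols() live in memory obtained with malloc(), but they must not be freed one by one: backtrace_symbols() allocates a single block, sized for the number of frames reported by backtrace(), and stores all the result strings in it. As in the example later in this section, it is enough to free() the pointer returned by backtrace_symbols() once at the end.
The backtrace manual page makes a point of this.
Note: if memory for the strings cannot be obtained, the function returns NULL.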
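void backtrace_symbols_fd(void *const *buffer, int size, int fd)
backtrace_symbols_fd() does the same job as backtrace_symbols(), but instead of returning an array of strings to the caller it writes the result to the file descriptor fd, one line per frame. It does not call malloc(), so it is suitable for situations in which malloc() might fail.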
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <time.h>   /* time() */

#define SIZE 1000

/* Signal handler: print the call stack of the interrupted thread. */
void trace(int signo)
{
    int j, nptrs;
    void *buffer[SIZE];
    char **strings;

    printf("signo: %d\n", signo);

    nptrs = backtrace(buffer, SIZE);
    printf("backtrace() returned %d addresses\n", nptrs);

    /* The call backtrace_symbols_fd(buffer, nptrs, STDOUT_FILENO)
     * would produce similar output to the following: */
    strings = backtrace_symbols(buffer, nptrs);
    if (strings == NULL) {
        perror("Backtrace:");
        exit(EXIT_FAILURE);
    }

    for (j = 0; j < nptrs; j++)
        printf("%s\n", strings[j]);

    free(strings);   /* a single free() releases all the strings */

    if (SIGSEGV == signo || SIGQUIT == signo) {
        exit(0);
    }
}

/* Deliberately dereference a null pointer to raise SIGSEGV. */
void segfault(void)
{
    int *p = NULL;
    *p = 1;
}

int main(int argc, char *argv[])
{
    signal(SIGSEGV, trace);
    signal(SIGINT, trace);
    signal(SIGQUIT, trace);

    while (1) {
        sleep(1);
        if (time(0) % 7 == 0) {
            segfault();
        }
    }

    return 0;
}
Compile with the -g -rdynamic -fexceptions options:
gcc -g -rdynamic -fexceptions backtrace.c -o backtrace
Then run ./backtrace, which produces the following result:
./backtrace(trace+0x4b) [0x400b21]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7fc4a97b54b0]
./backtrace(segfault+0x10) [0x400c12]
./backtrace(main+0x85) [0x400ca0]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fc4a97a0830]
./backtrace(_start+0x29) [0x400a09]
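Leaving aside trace() itself and the libc frame through which the signal handler was entered, the nearest frame is segfault(), where the fault occurred.
addr2line maps the address to a location in the source file.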
addr2line 0x400c12 -e backtrace -afs
0x0000000000400c12
segfault
backtrace.c:43
objdump -SC backtrace shows further detail: address 0x400c12 corresponds to the statement *p = 1;.
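4. Using objdump
objdump displays the structure of object files and executables. It is part of GNU binutils, the toolchain normally used alongside gcc, rather than part of gcc itself.
Common commands: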
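objdump -x obj: prints all headers of the object file, grouped by category; the dynamic section lists the shared libraries the file needs.
objdump -t obj: prints the symbol table of the object file.
objdump -h obj: prints a summary of all sections (the section headers).
objdump -j .text -S obj (or -j .data): prints only the named section, disassembled with source code interleaved.
objdump -S obj: prints the disassembly interleaved with source code; the result is most useful when the file was built with gcc -g.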
objdump -j .text -Sl stack1 | more
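-S Interleaves source code with the disassembly where possible; the effect is most visible when the program was compiled with a debug option such as -g. Implies -d. Note: objdump must be able to find the source files, for example by running it in the directory where the program was built.
-l Labels the object code with file names and line numbers. It is only meaningful together with -d, -D, or -r. Output with -ld differs little from -d alone, but it helps in source-level debugging and requires a debug option such as -g at compile time.
-j name Displays only the section called name.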
5. Analyzing an application hang caused by deadlock
A badly used lock can make an application stop responding.
Compile the following code with gcc -g astall.c -o astall -lpthread:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>    /* sleep() */
#include <pthread.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int cnt = 0;

void cnt_reset(void)
{
    pthread_mutex_lock(&mutex);   /* deadlocks: thr() already holds mutex */
    cnt = 0;
    pthread_mutex_unlock(&mutex);
}

/* Thread entry point; pthread_create() expects void *(*)(void *). */
void *thr(void *arg)
{
    (void)arg;
    while (1) {
        pthread_mutex_lock(&mutex);
        if (cnt > 2)
            cnt_reset();
        else
            cnt++;
        pthread_mutex_unlock(&mutex);

        printf("%d\n", cnt);
        sleep(1);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, thr, NULL);
    pthread_join(tid, NULL);
    return EXIT_SUCCESS;
}
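About 3 seconds after it starts, the program stops responding.
top -p `pidof astall` shows that the process is not spinning in a loop; it is sleeping.
ps ax -L | grep astall shows both threads in state Sl+, that is, both are sleeping.
At this point, attach to the process with sudo gdb -p `pidof astall`.
gdb reports two threads in the process.
Look at the stack of each thread. Thread 1 sleeps in pthread_join(), as expected; thread 2 sleeps in pthread_mutex_lock(), which is the root of the problem.
The backtrace of thread 2, start_thread() -> thr() -> cnt_reset() -> pthread_mutex_lock(), shows the scene of the deadlock: cnt_reset() tries to take the mutex that thr() is already holding.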
A gdb script can also record the stack every time pthread_mutex_lock() or pthread_mutex_unlock() is called. Run it as follows, with the script in debug.cmd:
gdb astall -x debug.cmd
set pagination off
set logging file debug.log
set logging overwrite
set logging on
start
set $addr1 = pthread_mutex_lock
set $addr2 = pthread_mutex_unlock
b *$addr1
b *$addr2
while 1
    c
    if $pc != $addr1 && $pc != $addr2
        quit
    end
    bt full
end
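6. Analyzing an application hang caused by an infinite loop
Sometimes an application gets stuck in an infinite loop; top gives a quick way to confirm this.
If CPU usage is very high, the program has most likely entered an infinite loop.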
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>    /* sleep() */
#include <pthread.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int cnt = 0;

void cnt_reset(void)
{
    while (1) {}   /* spins forever: this is the infinite loop */
}

/* Thread entry point; pthread_create() expects void *(*)(void *). */
void *thr(void *arg)
{
    (void)arg;
    while (1) {
        if (cnt > 2)
            cnt_reset();
        else
            cnt++;

        printf("%d\n", cnt);
        sleep(1);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, thr, NULL);
    pthread_join(tid, NULL);
    return EXIT_SUCCESS;
}
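top shows that astall2 uses close to 100% of a CPU.
ps then shows two threads: one stuck in the infinite loop and the other sleeping.
Looking at the stack of each thread makes the situation clear: thread 1 is sleeping, and thread 2 is inside while(1), which is the root of the problem.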
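6. Using uprobe
uprobe is a debugging facility similar to kprobe. It is enabled at kernel build time with CONFIG_UPROBE_EVENT=y (the option is named CONFIG_UPROBE_EVENTS on newer kernels).
6.1 Introduction to uprobe
As with kprobe, it does not need to be activated through current_tracer. Probe points are defined through /sys/kernel/debug/tracing/uprobe_events and enabled through /sys/kernel/debug/tracing/events/uprobes/<EVENT>/enable.
Unlike kprobe, however, the user has to work out the offset of the probe point within the user-space file, for example with tools such as nm, which is somewhat tedious.
/sys/kernel/debug/tracing/uprobe_profile reports how often each probe event was hit. Older kernel documentation describes the columns as event name, number of hits, and number of miss-hits; current kernels appear to print the file name, the event name, and the number of hits.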
6.2 uprobe example
#include <stdlib.h>
#include <stdio.h>

int count = 0;

int do_sth()
{
    printf("current count = %d\n", count);
    return count++;
}

int main(int argc, char* argv[])
{
    int i = 0;

    while (1) {
        i = do_sth();
    };
    return 0;
}
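To obtain the offset of a function: 1. use objdump to find the address A1 of the function; 2. look up the load address A2 of the program in its maps file. A1 - A2 is the offset of the probed function.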
# cat /proc/`pgrep bash`/maps | grep /bin/bash | grep r-xp
00400000-004e1000 r-xp 00000000 08:01 786439 /bin/bash
# objdump -T /bin/zsh | grep -w free
00000000004ab500 g DF .text 0000000000000009 Base free
Usage is as follows:
echo > /sys/kernel/debug/tracing/trace
echo 'p:myprobe do_sys_open dfd=%ax filename=%dx flags=%cx mode=+4($stack)' > /sys/kernel/debug/tracing/kprobe_events
echo 'r:myretprobe do_sys_open ret=$retval' >> /sys/kernel/debug/tracing/kprobe_events
echo 'p:do_sth /home/al/debug_hacks/uprobe/loop_print:0x526 %ip %ax' > /sys/kernel/debug/tracing/uprobe_events
echo 'r:do_sth_exit /home/al/debug_hacks/uprobe/loop_print:0x526 %ip %ax' >> /sys/kernel/debug/tracing/uprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable
echo 1 > /sys/kernel/debug/tracing/events/uprobes/do_sth/enable
echo 1 > /sys/kernel/debug/tracing/events/uprobes/do_sth_exit/enable
echo 0 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
echo 0 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable
echo 0 > /sys/kernel/debug/tracing/events/uprobes/do_sth/enable
echo 0 > /sys/kernel/debug/tracing/events/uprobes/do_sth_exit/enable
echo > /sys/kernel/debug/tracing/kprobe_events
echo > /sys/kernel/debug/tracing/uprobe_events
cat /sys/kernel/debug/tracing/trace
7. systemtap
8. oprofile
Set up the build environment and download the source:
sudo apt install binutils-dev libiberty-dev libpopt-dev -y
wget https://nchc.dl.sourceforge.net/project/oprofile/oprofile/oprofile-1.3.0/oprofile-1.3.0.tar.gz