Debugging segmentation faults (core dumped) on Linux



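  • I often hit segmentation faults while working on programming contests, but until now I mostly fell back on the clumsy approach of sprinkling printf statements, which is slow at locating a bug. Today I tried debugging a segmentation fault with gdb instead.
  • A segmentation fault (core dumped) is usually caused by a wrong array index or an out-of-bounds array access, and anyone programming on Linux is likely to run into one sooner or later.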

Debugging workflow with GDB

What is a Segmentation fault (core dumped)?

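  • A segmentation fault generally means the program tried to access a memory address it is not allowed to access. Typical causes:
    • dereferencing a null pointer: the system does not allow access to the memory at address 0;
    • dereferencing a pointer that lies outside the memory the process is allowed to access;
    • in a C++ program, a class's vtable (its table of virtual function pointers), or the pointer to it, is corrupted and points to the wrong place, so the program tries to execute an address that has no execute permission;
    • a misaligned memory access can also cause a segmentation fault.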
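
To watch such a fault happen without taking down your own session, here is a minimal C++ sketch (not from the original article). It assumes Linux, and the helper name signal_from_forbidden_write is made up for illustration. A child process writes to a page it has no permission to access, and the parent reports which signal killed it.

```cpp
#include <cassert>
#include <csignal>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the number of the signal that killed a child
// process which wrote to a page it may not access, 0 if the child exited
// normally, or -1 if fork/waitpid failed.
int signal_from_forbidden_write() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: disable core files so the demo leaves nothing behind.
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);
        // Map one page with no access rights, then write to it.
        void* page = mmap(nullptr, 4096, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED) _exit(2);
        volatile char* p = static_cast<volatile char*>(page);
        *p = 1;    // the kernel is expected to deliver SIGSEGV here
        _exit(0);  // not reached if the write faults
    }
    // Parent: wait for the child and inspect how it terminated.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the function is expected to return SIGSEGV. Running the faulty access in a child keeps the calling process alive, which is the same separation the shell relies on when it prints "Segmentation fault (core dumped)" for a crashed program.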

Quick debugging with valgrind

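  • valgrind's default tool, Memcheck, detects memory errors and reports the call stack at which each one occurs. On Debian/Ubuntu, install it first with sudo apt-get install valgrind.
  • Then run the problematic binary under it: valgrind -v <executable>.
  • Below is the output for my program. In this particular run the program was started without its arguments, printed its usage message and exited, so valgrind reports 0 errors: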
$ valgrind -v ./bin/CodeCraft-2019 
==19578== Memcheck, a memory error detector
==19578== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19578== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==19578== Command: ./bin/CodeCraft-2019
==19578== 
--19578-- Valgrind options:
--19578--    -v
--19578-- Contents of /proc/version:
--19578--   Linux version 4.15.0-46-generic (buildd@lgw01-amd64-038) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC 2019
--19578-- 
--19578-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-rdtscp-sse3-avx
--19578-- Page sizes: currently 4096, max supported 4096
--19578-- Valgrind library directory: /usr/lib/valgrind
--19578-- Reading syms from /usrdata/applications/huawei2019/03-28-01-coredump/bin/CodeCraft-2019
--19578-- Reading syms from /lib/x86_64-linux-gnu/ld-2.27.so
--19578--   Considering /lib/x86_64-linux-gnu/ld-2.27.so ..
--19578--   .. CRC mismatch (computed 1b7c895e wanted 2943108a)
--19578--   Considering /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.27.so ..
--19578--   .. CRC is valid
--19578-- Reading syms from /usr/lib/valgrind/memcheck-amd64-linux
--19578--   Considering /usr/lib/valgrind/memcheck-amd64-linux ..
--19578--   .. CRC mismatch (computed c25f395c wanted 0a9602a8)
--19578--    object doesn't have a symbol table
--19578--    object doesn't have a dynamic symbol table
--19578-- Scheduler: using generic scheduler lock implementation.
--19578-- Reading suppressions file: /usr/lib/valgrind/default.supp
==19578== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-19578-by-jl-on-???
==19578== embedded gdbserver: writing to   /tmp/vgdb-pipe-to-vgdb-from-19578-by-jl-on-???
==19578== embedded gdbserver: shared mem   /tmp/vgdb-pipe-shared-mem-vgdb-19578-by-jl-on-???
==19578== 
==19578== TO CONTROL THIS PROCESS USING vgdb (which you probably
==19578== don't want to do, unless you know exactly what you're doing,
==19578== or are doing some strange experiment):
==19578==   /usr/lib/valgrind/../../bin/vgdb --pid=19578 ...command...
==19578== 
==19578== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==19578==   /path/to/gdb ./bin/CodeCraft-2019
==19578== and then give GDB the following command
==19578==   target remote | /usr/lib/valgrind/../../bin/vgdb --pid=19578
==19578== --pid is optional if only one valgrind process is running
==19578== 
--19578-- REDIR: 0x401f2f0 (ld-linux-x86-64.so.2:strlen) redirected to 0x58060901 (???)
--19578-- REDIR: 0x401f0d0 (ld-linux-x86-64.so.2:index) redirected to 0x5806091b (???)
--19578-- Reading syms from /usr/lib/valgrind/vgpreload_core-amd64-linux.so
--19578--   Considering /usr/lib/valgrind/vgpreload_core-amd64-linux.so ..
--19578--   .. CRC mismatch (computed 4b63d83e wanted 670599e6)
--19578--    object doesn't have a symbol table
--19578-- Reading syms from /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so
--19578--   Considering /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so ..
--19578--   .. CRC mismatch (computed a4b37bee wanted 8ad4dc94)
--19578--    object doesn't have a symbol table
==19578== WARNING: new redirection conflicts with existing -- ignoring it
--19578--     old: 0x0401f2f0 (strlen              ) R-> (0000.0) 0x58060901 ???
--19578--     new: 0x0401f2f0 (strlen              ) R-> (2007.0) 0x04c32db0 strlen
--19578-- REDIR: 0x401d360 (ld-linux-x86-64.so.2:strcmp) redirected to 0x4c33ee0 (strcmp)
--19578-- REDIR: 0x401f830 (ld-linux-x86-64.so.2:mempcpy) redirected to 0x4c374f0 (mempcpy)
--19578-- Reading syms from /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
--19578--    object doesn't have a symbol table
--19578-- Reading syms from /lib/x86_64-linux-gnu/libgcc_s.so.1
--19578--    object doesn't have a symbol table
--19578-- Reading syms from /lib/x86_64-linux-gnu/libc-2.27.so
--19578--   Considering /lib/x86_64-linux-gnu/libc-2.27.so ..
--19578--   .. CRC mismatch (computed b1c74187 wanted 042cc048)
--19578--   Considering /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.27.so ..
--19578--   .. CRC is valid
--19578-- Reading syms from /lib/x86_64-linux-gnu/libm-2.27.so
--19578--   Considering /lib/x86_64-linux-gnu/libm-2.27.so ..
--19578--   .. CRC mismatch (computed 7feae033 wanted b29b2508)
--19578--   Considering /usr/lib/debug/lib/x86_64-linux-gnu/libm-2.27.so ..
--19578--   .. CRC is valid
--19578-- REDIR: 0x547bc70 (libc.so.6:memmove) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547ad40 (libc.so.6:strncpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bf50 (libc.so.6:strcasecmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547a790 (libc.so.6:strcat) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547ad70 (libc.so.6:rindex) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547d7c0 (libc.so.6:rawmemchr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bde0 (libc.so.6:mempcpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bc10 (libc.so.6:bcmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547ad00 (libc.so.6:strncmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547a800 (libc.so.6:strcmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bd40 (libc.so.6:memset) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x54990f0 (libc.so.6:wcschr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547aca0 (libc.so.6:strnlen) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547a870 (libc.so.6:strcspn) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bfa0 (libc.so.6:strncasecmp) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547a840 (libc.so.6:strcpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547c0e0 (libc.so.6:memcpy@@GLIBC_2.14) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547ada0 (libc.so.6:strpbrk) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547a7c0 (libc.so.6:index) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547ac70 (libc.so.6:strlen) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x54856c0 (libc.so.6:memrchr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bff0 (libc.so.6:strcasecmp_l) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bbe0 (libc.so.6:memchr) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x5499eb0 (libc.so.6:wcslen) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547b050 (libc.so.6:strspn) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bf20 (libc.so.6:stpncpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547bef0 (libc.so.6:stpcpy) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547d7f0 (libc.so.6:strchrnul) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x547c040 (libc.so.6:strncasecmp_l) redirected to 0x4a2a6e0 (_vgnU_ifunc_wrapper)
--19578-- REDIR: 0x548e330 (libc.so.6:__strrchr_sse2) redirected to 0x4c32790 (__strrchr_sse2)
--19578-- REDIR: 0x5474070 (libc.so.6:malloc) redirected to 0x4c2faa0 (malloc)
--19578-- REDIR: 0x548e620 (libc.so.6:__strlen_sse2) redirected to 0x4c32d30 (__strlen_sse2)
--19578-- REDIR: 0x556cfc0 (libc.so.6:__memcmp_sse4_1) redirected to 0x4c35d50 (__memcmp_sse4_1)
--19578-- REDIR: 0x5486e70 (libc.so.6:__strcmp_sse2_unaligned) redirected to 0x4c33da0 (strcmp)
Begin
--19578-- REDIR: 0x5498440 (libc.so.6:__mempcpy_sse2_unaligned) redirected to 0x4c37130 (mempcpy)
please input args: carPath, roadPath, crossPath, answerPath
--19578-- REDIR: 0x5498870 (libc.so.6:__memset_sse2_unaligned) redirected to 0x4c365d0 (memset)
--19578-- REDIR: 0x5474950 (libc.so.6:free) redirected to 0x4c30cd0 (free)
==19578== 
==19578== HEAP SUMMARY:
==19578==     in use at exit: 0 bytes in 0 blocks
==19578==   total heap usage: 2 allocs, 2 frees, 73,728 bytes allocated
==19578== 
==19578== All heap blocks were freed -- no leaks are possible
==19578== 
==19578== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==19578== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

How to obtain a core dump file

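  • A core dump file is a snapshot of the program's memory at the moment it crashed; with it you can debug the program and find where the bug occurred.
  • When a program hits a segmentation fault, the Linux kernel writes a core dump file to disk, depending on how the system is configured.
  • ulimit is the shell built-in that sets per-process resource limits. Changes made with it are temporary: they apply only to the current shell session and the processes started from it:
    • ulimit -c shows or sets the maximum size of a core file, measured in blocks;
    • ulimit -a shows all current resource limits;
    • ulimit -c unlimited removes the size limit on core files.
  • Reasons a core file may not be produced:
    • there is not enough free space to write it;
    • core file creation is disabled;
    • the process has no write permission in its current working directory.
  • sudo sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t sets the naming pattern and location the kernel uses for core files: they go into /tmp, and the name contains the executable name (%e), the PID (%p), the hostname (%h), and a timestamp (%t).
    • After that, when a program segfaults, the Linux kernel automatically saves a core file in /tmp.
  • cat /proc/PID/limits also shows the limits of a running process, including its maximum core file size.
  • kernel.core_pattern determines where core dump files are written. It is a kernel parameter that can be inspected and changed with sysctl:
    • sysctl -a lists all kernel parameters; sysctl kernel.core_pattern shows just the value of kernel.core_pattern.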
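
The same two settings can also be reached from inside a program. The following C++ sketch is an illustration under assumptions, not part of the original workflow: the function names are made up, and it relies on the POSIX getrlimit/setrlimit calls and on the Linux file /proc/sys/kernel/core_pattern. It raises the soft core-size limit to the hard limit (roughly what ulimit -c does for a shell) and reads the current core pattern.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <sys/resource.h>

// Hypothetical helper: raise the soft core-file size limit of this process
// to its hard limit. Returns true on success.
bool raise_core_limit_to_hard_max() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    // An unprivileged process may raise the soft limit up to the hard limit.
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}

// Hypothetical helper: read the kernel's current core_pattern. Returns an
// empty string if the file cannot be read (for example on a non-Linux system).
std::string read_core_pattern() {
    std::ifstream in("/proc/sys/kernel/core_pattern");
    std::string pattern;
    std::getline(in, pattern);
    return pattern;
}
```

Calling raise_core_limit_to_hard_max() early in main affects only that process and its children, so it avoids changing the limit for the whole shell session. If the hard limit itself is 0, the soft limit stays at 0 and no core file will be written.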

Backtracing the generated core file with GDB

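  • gdb -c my_core_file opens the core file named my_core_file. Passing the executable as well (gdb <executable> -c my_core_file) lets gdb resolve symbols right away.
  • Here is the result of opening my core dump: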
sudo gdb -c /tmp/core-CodeCraft-2019.23637.jl.1554030516
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
[New LWP 23637]
Core was generated by `./bin/CodeCraft-2019 ../1-map-training-1/car.txt ../1-map-training-1/road.txt .'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000556393da88ad in ?? ()
(gdb)
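  • As shown, the program received a SIGSEGV signal while running. This signal means that the process made an invalid memory reference, that is, a segmentation fault occurred.
  • Next, run bt in gdb and walk through the frames to find the line where the segmentation fault occurred and what actually caused it.
    • bt stands for backtrace; it prints the call stack.
    • Some commonly used gdb commands:
      • attach: debug a process that is already running, gdb <program> PID;
      • br: set a breakpoint, br filename:line_num, br namespace::classname::func_name;
      • n: step over; s: step into;
      • finish: run until the current function returns;
      • list: print ten lines of source around the current position (repeating it prints the following ten); list line_number prints ten lines centered on line_number;
      • info locals: list the local variables of the current function;
      • p var: print the value of the variable var;
      • info breakpoints: list all breakpoints;
      • delete breakpoints: delete all breakpoints;
      • delete breakpoints id: delete the breakpoint numbered id;
      • disable/enable breakpoints id: disable/enable a breakpoint;
      • break ... if ...: conditional breakpoint.
  • Running bt on my core showed nothing but question marks. gdb had no symbol information for my program: it was started with only the core file, and the program should also be compiled with the -g option: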
(gdb) bt
#0  0x0000556393da88ad in ?? ()
#1  0x00000009b6f194c0 in ?? ()
#2  0x00005563b686d1b0 in ?? ()
#3  0x00005563b688abe0 in ?? ()
#4  0x00007ffe22b8c070 in ?? ()
#5  0x00005563b5f36460 in ?? ()
#6  0x0000000000002bf9 in ?? ()
#7  0x0000000000000004 in ?? ()
#8  0x00005563b718a580 in ?? ()
#9  0x0000000000000020 in ?? ()
#10 0x00007ffe22b8c510 in ?? ()
#11 0x00007ffe22b8bf50 in ?? ()
#12 0x00005563b6a2ffd0 in ?? ()
#13 0x00007ffe22b8bf50 in ?? ()
#14 0x0000000000000008 in ?? ()
#15 0x00005563b6a30004 in ?? ()
#16 0x00007ffe22b8c450 in ?? ()
#17 0x00007ffe22b8c590 in ?? ()
#18 0x0000556393dabd1e in ?? ()
#19 0x00007f1b3f2da1f0 in ?? ()
#20 0x0000556393dabcd2 in ?? ()
#21 0x00007ffe22b8c610 in ?? ()
#22 0x00007ffe22b8bf00 in ?? ()
#23 0x00007ffe22b8c550 in ?? ()
#24 0x00007ffe22f747d0 in ?? ()
#25 0x00007ffe22b8c220 in ?? ()
#26 0x00007ffe22b8c200 in ?? ()
#27 0x0000000000000032 in ?? ()
#28 0x00007ffe22b8c470 in ?? ()
#29 0x00007ffe22b8c530 in ?? ()
#30 0x00007ffe22b8c070 in ?? ()
#31 0x00007ffe22b8c4f0 in ?? ()
#32 0x00007ffe22b8c510 in ?? ()
#33 0x00000000000211e0 in ?? ()
#34 0x00007ffe22b8c5b0 in ?? ()
#35 0x0000000022b8c490 in ?? ()
#36 0x0000000000000198 in ?? ()
#37 0x00007ffe22b8c490 in ?? ()
#38 0x00007ffe22b8c630 in ?? ()
#39 0x00007ffe22b8befc in ?? ()
#40 0x00003d2400000005 in ?? ()
#41 0x0000000000000000 in ?? ()

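  • In gdb, symbol-file <path> loads symbol information from the given file, and the shared-library search path used while debugging can be set separately:
    • the ldd command lists the shared libraries a binary depends on;
    • set solib-search-path <dirs> tells gdb where to look for those libraries.
  • With that, bt resolved a few of the frames, although most were still unknown: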
#0  0x0000556393da88ad in ?? ()
#1  0x00000009b6f194c0 in ?? ()
#2  0x00005563b686d1b0 in ?? ()
#3  0x00005563b688abe0 in ?? ()
#4  0x00007ffe22b8c070 in ?? ()
#5  0x00005563b5f36460 in ?? ()
#6  0x0000000000002bf9 in ?? ()
#7  0x0000000000000004 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.44] ()
#8  0x00007ffe22b8c510 in ?? ()
#9  0x00007ffe22b8bf50 in ?? ()
#10 0x00005563b6a2ffd0 in ?? ()
#11 0x00007ffe22b8bf50 in ?? ()
#12 0x0000000000000008 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.44] ()
#13 0x0000556393dabd1e in ?? ()
#14 0x00007f1b3f2da1f0 in ?? ()
#15 0x0000556393dabcd2 in ?? ()
#16 0x00007ffe22b8c610 in ?? ()
#17 0x00007ffe22b8bf00 in ?? ()
#18 0x00007ffe22b8c550 in ?? ()
#19 0x00007ffe22f747d0 in ?? ()
#20 0x00007ffe22b8c220 in ?? ()
#21 0x00007ffe22b8c200 in ?? ()
#22 0x0000000000000032 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.44] ()
#23 0x00000000000211e0 in ?? ()
#24 0x00007ffe22b8c5b0 in ?? ()
#25 0x0000000022b8c490 in ?? ()
Backtrace stopped: Cannot access memory at address 0x195
  • The final gdb session looked like this:
[New LWP 5070]
Core was generated by `8, 6238, 6768, 6414, 5857, 6219, 6774, 5642, 5099, 6080)

(gdb) frame 0
#0  0x00007fa69aa8f17c in ___vsnprintf_chk (s=0x7ffcb1275ffa ", 5347"<error: Cannot access memory at address 0x7ffcb1276000>, maxlen=<optimized out>, 
    flags=1, slen=<optimized out>, format=0x55e0aef3a657 ", %d", args=args@entry=0x7ffcb0e28c00) at vsnprintf_chk.c:66
66	in vsnprintf_chk.c
(gdb) frame 1
#1  0x00007fa69aa8f095 in ___snprintf_chk (s=<optimized out>, maxlen=<optimized out>, flags=<optimized out>, slen=<optimized out>, 
    format=<optimized out>) at snprintf_chk.c:34
34	snprintf_chk.c: No such file or directory.
(gdb) frame 2
#2  0x000055e0aef2ee70 in writeResult(std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > > const&, std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >&, std::unordered_map<int, int, std::hash<int>, std::equal_to<int>, std::allocator<std::pair<int const, int> > >&, char*, int) ()
(gdb) frame 3
#3  0x000055e0aef35e5f in scheduling(std::vector<Vehicle, std::allocator<Vehicle> >&, std::vector<Road, std::allocator<Road> >&, std::vector<Cross, std::allocator<Cross> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) ()
(gdb) frame 4
#4  0x35202c3536303620 in ?? ()
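  • The output above shows that the segmentation fault occurred inside writeResult: frames 0 and 1 are the snprintf internals it called with the format ", %d", and the write ran into memory that cannot be accessed (0x7ffcb1276000). The frame 4 "address" 0x35202c3536303620 is really ASCII text (" 6065, 5" when read as little-endian bytes), which suggests that the formatted numbers overran a buffer on the stack.
  • thread apply all bt full prints the full backtrace, including local variables, of every thread.
  • The most important commands in this GDB session were: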
0. gdb core-CodeCraft-2019.5070.jl.1554081713
1. set solib-absolute-prefix /
2. set solib-search-path /
3. file <executable>
4. core-file core-CodeCraft-2019.5070.jl.1554081713 
5. frame 2
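
The backtrace points at snprintf being called from writeResult with the format ", %d" and writing past accessible memory. The original writeResult is not shown, so the following C++ sketch is only an assumption about the kind of fix that applies: a hypothetical helper, format_route, that appends integers to a fixed-size buffer and checks the remaining space after every snprintf call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not the original writeResult): formats `values` as
// "a, b, c" into buf[0..cap) and returns the number of characters written,
// or -1 if the buffer is too small. It never writes past buf + cap.
int format_route(const std::vector<int>& values, char* buf, std::size_t cap) {
    if (cap == 0) return -1;
    buf[0] = '\0';
    std::size_t used = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        const std::size_t left = cap - used;
        // snprintf returns the length it WOULD have written, so the result
        // must be compared with the space that is left before advancing.
        int n = (i == 0)
                    ? std::snprintf(buf + used, left, "%d", values[i])
                    : std::snprintf(buf + used, left, ", %d", values[i]);
        if (n < 0 || static_cast<std::size_t>(n) >= left) return -1;
        used += static_cast<std::size_t>(n);
    }
    return static_cast<int>(used);
}
```

The key point is that the size passed to snprintf shrinks as the buffer fills up; passing the full buffer size on every call, or using sprintf, would allow exactly the kind of overrun seen in the core file. In C++ code, building the line in a std::string or std::ostringstream avoids the fixed-size buffer altogether.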

