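Keywords: coredump, core_pattern, coredump_filter, and so on.
When an application exits because of an exception or a bug while running, the kernel generates a core file if certain conditions are met.
A core file typically contains the program's memory, register state, stack pointer, memory management information, and function call stack information at that moment.
A core file is a snapshot of the program's working state saved to a file. Analyzing it with tools lets you locate the call stack at the point where the program exited abnormally, find the problem, and fix it.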
1. Configuring coredump
Core dumps have to be enabled through ulimit. Run ulimit -c to check whether core dumps are currently enabled; a value of 0 means they are disabled.
Run ulimit -c unlimited to enable them.
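A process can also raise the limit for itself, which helps when it is not started from an interactive shell. The following is a minimal sketch, assuming a POSIX system; enable_core_dumps() is a hypothetical helper name, not something from the original article. It raises the soft RLIMIT_CORE to the hard limit, which is as far as an unprivileged process is allowed to go.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: the in-process counterpart of "ulimit -c".
 * Raises the soft core-size limit to the hard limit.
 * Returns 0 on success, -1 on error.
 */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* An unprivileged process may raise the soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;

    if (setrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return 0;
}
```

Calling enable_core_dumps() early in main() has the same effect for that process as ulimit -c with the hard limit as its argument. If the hard limit itself is 0, core dumps stay disabled.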
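By default the core file is written to the current working directory of the process, with the file name core.
The name and location can be changed through /proc/sys/kernel/core_pattern, which accepts the following specifiers:
%p  PID of the dumping process
%u  UID of the dumping process
%s  number of the signal that caused the dump
%t  time of the dump, in seconds since 1970-01-01 00:00:00
%e  executable name of the dumping process
For example, as root: echo "core-%e-%p-%s-%t" > /proc/sys/kernel/core_pattern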
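To make the effect of a pattern concrete, here is a user-space sketch that expands the five specifiers listed above. It only imitates the idea and is not the kernel's implementation; struct dump_info and expand_core_pattern() are hypothetical names introduced for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for what the kernel knows about the dumping process. */
struct dump_info {
    int pid;          /* %p */
    unsigned uid;     /* %u */
    int sig;          /* %s */
    long long time;   /* %t, seconds since the epoch */
    const char *exe;  /* %e */
};

/*
 * Expand pat into out (capacity outsz).
 * Returns 0 on success, -1 if the result does not fit.
 */
int expand_core_pattern(const char *pat, const struct dump_info *di,
                        char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    while (*pat) {
        size_t room = outsz - used;
        int n = 0;

        if (*pat != '%') {
            n = snprintf(out + used, room, "%c", *pat++);
        } else {
            switch (*++pat) {
            case '\0': return 0;  /* lone % at the end: drop it */
            case '%': n = snprintf(out + used, room, "%%"); break;
            case 'p': n = snprintf(out + used, room, "%d", di->pid); break;
            case 'u': n = snprintf(out + used, room, "%u", di->uid); break;
            case 's': n = snprintf(out + used, room, "%d", di->sig); break;
            case 't': n = snprintf(out + used, room, "%lld", di->time); break;
            case 'e': n = snprintf(out + used, room, "%s", di->exe); break;
            default: n = 0; break;  /* unknown specifier: ignored */
            }
            pat++;
        }
        if (n < 0 || (size_t)n >= room)
            return -1;  /* output truncated */
        used += (size_t)n;
    }
    return 0;
}
```

With pid 8651, signal 11, time 1700000000 and executable name main.out, the pattern core-%e-%p-%s-%t expands to core-main.out-8651-11-1700000000.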
Every process has a coredump_filter node at /proc/<pid>/coredump_filter.
Setting coredump_filter selects which kinds of memory are written to the core file when a dump happens:
- (bit 0) anonymous private memory
- (bit 1) anonymous shared memory
- (bit 2) file-backed private memory
- (bit 3) file-backed shared memory
- (bit 4) ELF header pages in file-backed private memory areas (it is effective only if the bit 2 is cleared)
- (bit 5) hugetlb private memory
- (bit 6) hugetlb shared memory
- (bit 7) DAX private memory
- (bit 8) DAX shared memory
The default value of coredump_filter is 0x33 (bits 0, 1, 4 and 5), so a core dump saves all anonymous memory, ELF header pages, and hugetlb private memory.
coredump_filter is inherited by child processes. To set it for the current process, use echo 0xXX > /proc/self/coredump_filter.
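The bit list above can be turned into a small decoder. This is a sketch under the assumption that the nine bits are exactly those listed; filter_has_bit(), describe_filter() and read_own_filter() are hypothetical helper names.

```c
#include <stdio.h>

/* Names of coredump_filter bits 0..8, in the order listed in the text. */
static const char *const filter_bit_names[] = {
    "anonymous private memory",
    "anonymous shared memory",
    "file-backed private memory",
    "file-backed shared memory",
    "ELF header pages",
    "hugetlb private memory",
    "hugetlb shared memory",
    "DAX private memory",
    "DAX shared memory",
};
#define FILTER_BITS (sizeof(filter_bit_names) / sizeof(filter_bit_names[0]))

/* Return 1 if the filter value selects the given bit, else 0. */
int filter_has_bit(unsigned int filter, unsigned int bit)
{
    return bit < FILTER_BITS && ((filter >> bit) & 1u);
}

/* Print one line per memory type the filter would dump; return how many. */
int describe_filter(unsigned int filter)
{
    unsigned int bit;
    int count = 0;

    for (bit = 0; bit < FILTER_BITS; bit++) {
        if (filter_has_bit(filter, bit)) {
            printf("bit %u: %s\n", bit, filter_bit_names[bit]);
            count++;
        }
    }
    return count;
}

/* Read this process's filter from procfs (Linux only). 0 on success, -1 on error. */
int read_own_filter(unsigned int *value)
{
    FILE *f = fopen("/proc/self/coredump_filter", "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%x", value) == 1);  /* the file holds a hex number */
    fclose(f);
    return ok ? 0 : -1;
}
```

describe_filter(0x33) reports four entries: bits 0, 1, 4 and 5, matching the default described above.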
static ssize_t proc_coredump_filter_write(struct file *file,
                                          const char __user *buf,
                                          size_t count,
                                          loff_t *ppos)
{
    ...
    ret = kstrtouint_from_user(buf, count, 0, &val);  /* convert buf to the integer val */
    if (ret < 0)
        return ret;
    ...
    for (i = 0, mask = 1; i < MMF_DUMP_FILTER_BITS; i++, mask <<= 1) {
        /* map each coredump_filter bit onto mm->flags, used later when dumping */
        if (val & mask)
            set_bit(i + MMF_DUMP_FILTER_SHIFT, &mm->flags);
        else
            clear_bit(i + MMF_DUMP_FILTER_SHIFT, &mm->flags);
    }
    ...
}
MMF_DUMP_FILTER_SHIFT is 2, so bit n of coredump_filter is stored in bit n + 2 of mm->flags:
#define MMF_DUMP_ANON_PRIVATE    2
#define MMF_DUMP_ANON_SHARED     3
#define MMF_DUMP_MAPPED_PRIVATE  4
#define MMF_DUMP_MAPPED_SHARED   5
#define MMF_DUMP_ELF_HEADERS     6
#define MMF_DUMP_HUGETLB_PRIVATE 7
#define MMF_DUMP_HUGETLB_SHARED  8
#define MMF_DUMP_DAX_PRIVATE     9
#define MMF_DUMP_DAX_SHARED      10
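The shift can be checked with a few lines of user-space arithmetic. This is only an illustration of the mapping; filter_to_mm_flags() is a hypothetical function, and the two constants repeat the values given in the text (a shift of 2 and nine filter bits).

```c
#define MMF_DUMP_FILTER_SHIFT 2  /* value stated in the text */
#define MMF_DUMP_FILTER_BITS  9  /* coredump_filter bits 0..8 */

/*
 * Return flags with its dump-filter field replaced by filter,
 * leaving the low MMF_DUMP_FILTER_SHIFT bits (the dumpable mode) untouched.
 */
unsigned long filter_to_mm_flags(unsigned long flags, unsigned int filter)
{
    unsigned long mask =
        ((1UL << MMF_DUMP_FILTER_BITS) - 1) << MMF_DUMP_FILTER_SHIFT;

    return (flags & ~mask) |
           (((unsigned long)filter << MMF_DUMP_FILTER_SHIFT) & mask);
}
```

For the default filter 0x33 and empty flags this gives 0xCC, that is, bits 2, 3, 6 and 7: MMF_DUMP_ANON_PRIVATE, MMF_DUMP_ANON_SHARED, MMF_DUMP_ELF_HEADERS and MMF_DUMP_HUGETLB_PRIVATE.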
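2. How coredump works
do_signal() decides, based on the signal, whether to trigger a core dump. The core dump limit, mm->flags, and other settings also play a part.
Once the conditions are met, do_coredump() generates the core file. The bulk of the work is done by binfmt->core_dump().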
2.1 What triggers a coredump?
When the kernel returns to user space, it calls do_signal() to handle pending signals.
static void do_signal(struct pt_regs *regs, int syscall)
{
    unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
    struct ksignal ksig;
    ...
    if (get_signal(&ksig)) {
        ...
    }
    ...
}

int get_signal(struct ksignal *ksig)
{
    ...
    for (;;) {
        struct k_sigaction *ka;
        ...
        signr = dequeue_signal(current, &current->blocked, &ksig->info);
        ...
        /* Trace actually delivered signals. */
        trace_signal_deliver(signr, &ksig->info, ka);
        ...
        if (sig_kernel_coredump(signr)) {
            /* Enabled with kernel.print-fatal-signals = 1, i.e. /proc/sys/kernel/print-fatal-signals. */
            if (print_fatal_signals)
                print_fatal_signal(ksig->info.si_signo);  /* print the signal and the stack at this point */
            proc_coredump_connector(current);
            do_coredump(&ksig->info);
        }
        ...
    }
    spin_unlock_irq(&sighand->siglock);

    ksig->sig = signr;
    return ksig->sig > 0;
}

#define sig_kernel_coredump(sig) siginmask(sig, SIG_KERNEL_COREDUMP_MASK)

#define SIG_KERNEL_COREDUMP_MASK (\
        rt_sigmask(SIGQUIT) | rt_sigmask(SIGILL)  | \
        rt_sigmask(SIGTRAP) | rt_sigmask(SIGABRT) | \
        rt_sigmask(SIGFPE)  | rt_sigmask(SIGSEGV) | \
        rt_sigmask(SIGBUS)  | rt_sigmask(SIGSYS)  | \
        rt_sigmask(SIGXCPU) | rt_sigmask(SIGXFSZ) | \
        SIGEMT_MASK )
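get_signal() checks whether the signal causes a core dump. The signals that do are SIGQUIT, SIGILL, SIGTRAP, SIGABRT, SIGFPE, SIGSEGV, SIGBUS, SIGSYS, SIGXCPU, and SIGXFSZ, plus SIGEMT on architectures that define it.
"Terminate w/core" means that a memory image of the process is written to a file named core in the process's current working directory (the file name core shows that this feature has been part of UNIX for a very long time).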
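The signal list above can be mirrored in user space with a simple switch. This is a sketch, not kernel code; signal_dumps_core() is a hypothetical name, and it only answers the question for the default action. A process that installs its own handler for one of these signals does not dump core when that handler runs.

```c
#include <signal.h>

/*
 * Return 1 if the default action of sig is "terminate and dump core",
 * following SIG_KERNEL_COREDUMP_MASK; otherwise return 0.
 */
int signal_dumps_core(int sig)
{
    switch (sig) {
    case SIGQUIT:
    case SIGILL:
    case SIGTRAP:
    case SIGABRT:
    case SIGFPE:
    case SIGSEGV:
    case SIGBUS:
    case SIGSYS:
    case SIGXCPU:
    case SIGXFSZ:
#ifdef SIGEMT  /* only defined on some architectures */
    case SIGEMT:
#endif
        return 1;
    default:
        return 0;
    }
}
```

SIGTERM and SIGKILL, for instance, terminate the process without producing a core file, so signal_dumps_core() returns 0 for them.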
void proc_coredump_connector(struct task_struct *task)
{
    struct cn_msg *msg;
    struct proc_event *ev;
    __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);

    if (atomic_read(&proc_event_num_listeners) < 1)
        return;

    msg = buffer_to_cn_msg(buffer);
    ev = (struct proc_event *)msg->data;
    memset(&ev->event_data, 0, sizeof(ev->event_data));
    ev->timestamp_ns = ktime_get_ns();
    ev->what = PROC_EVENT_COREDUMP;
    ev->event_data.coredump.process_pid = task->pid;
    ev->event_data.coredump.process_tgid = task->tgid;

    memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
    msg->ack = 0; /* not used */
    msg->len = sizeof(*ev);
    msg->flags = 0; /* not used */
    send_msg(msg);
}
2.2 How is the coredump generated?
void do_coredump(const siginfo_t *siginfo)
{
    struct core_state core_state;
    struct core_name cn;
    struct mm_struct *mm = current->mm;
    struct linux_binfmt * binfmt;
    const struct cred *old_cred;
    struct cred *cred;
    int retval = 0;
    int ispipe;
    struct files_struct *displaced;
    /* require nonrelative corefile path and be extra careful */
    bool need_suid_safe = false;
    bool core_dumped = false;
    static atomic_t core_dump_count = ATOMIC_INIT(0);
    struct coredump_params cprm = {
        .siginfo = siginfo,
        .regs = signal_pt_regs(),
        .limit = rlimit(RLIMIT_CORE),  /* the core dump size limit in effect */
        /*
         * We must use the same mm->flags while dumping core to avoid
         * inconsistency of bit flags, since this flag is not protected
         * by any locks.
         */
        .mm_flags = mm->flags,
    };

    audit_core_dumps(siginfo->si_signo);

    binfmt = mm->binfmt;  /* the binary format handler (loader) of the current process */
    if (!binfmt || !binfmt->core_dump)
        goto fail;
    /*
     * The low two bits of mm->flags decide whether dumping is allowed:
     * SUID_DUMP_DISABLE (0) forbids it, every other value allows it.
     */
    if (!__get_dumpable(cprm.mm_flags))
        goto fail;

    cred = prepare_creds();
    if (!cred)
        goto fail;
    /*
     * We cannot trust fsuid as being the "true" uid of the process
     * nor do we know its entire history. We only know it was tainted
     * so we dump it as root in mode 2, and only into a controlled
     * environment (pipe handler or fully qualified path).
     */
    if (__get_dumpable(cprm.mm_flags) == SUID_DUMP_ROOT) {  /* distinguish SUID_DUMP_USER from SUID_DUMP_ROOT */
        /* Setuid core dump mode */
        cred->fsuid = GLOBAL_ROOT_UID;  /* Dump root private */
        need_suid_safe = true;
    }

    retval = coredump_wait(siginfo->si_signo, &core_state);
    if (retval < 0)
        goto fail_creds;

    old_cred = override_creds(cred);

    /* Check core_pattern for a pipe, then build the core file name from it. */
    ispipe = format_corename(&cn, &cprm);

    if (ispipe) {  /* hand the core dump to a user-space program through a pipe */
        int dump_count;
        char **helper_argv;
        struct subprocess_info *sub_info;

        if (ispipe < 0) {
            printk(KERN_WARNING "format_corename failed\n");
            printk(KERN_WARNING "Aborting core\n");
            goto fail_unlock;
        }

        if (cprm.limit == 1) {
            printk(KERN_WARNING
                "Process %d(%s) has RLIMIT_CORE set to 1\n",
                task_tgid_vnr(current), current->comm);
            printk(KERN_WARNING "Aborting core\n");
            goto fail_unlock;
        }
        cprm.limit = RLIM_INFINITY;

        dump_count = atomic_inc_return(&core_dump_count);
        if (core_pipe_limit && (core_pipe_limit < dump_count)) {
            printk(KERN_WARNING "Pid %d(%s) over core_pipe_limit\n",
                   task_tgid_vnr(current), current->comm);
            printk(KERN_WARNING "Skipping core dump\n");
            goto fail_dropcount;
        }

        helper_argv = argv_split(GFP_KERNEL, cn.corename, NULL);  /* split cn.corename into arguments */
        if (!helper_argv) {
            printk(KERN_WARNING "%s failed to allocate memory\n",
                   __func__);
            goto fail_dropcount;
        }

        retval = -ENOMEM;
        /* Use usermodehelper to run the user-space program helper_argv[0] named in core_pattern. */
        sub_info = call_usermodehelper_setup(helper_argv[0],
                        helper_argv, NULL, GFP_KERNEL,
                        umh_pipe_setup, NULL, &cprm);
        /*
         * UMH_WAIT_EXEC: return as soon as the kernel has exec'd the
         * user-space program, which then waits for data on the pipe.
         */
        if (sub_info)
            retval = call_usermodehelper_exec(sub_info,
                              UMH_WAIT_EXEC);

        argv_free(helper_argv);
        if (retval) {
            printk(KERN_INFO "Core dump to |%s pipe failed\n",
                   cn.corename);
            goto close_fail;
        }
    } else {
        struct inode *inode;
        int open_flags = O_CREAT | O_RDWR | O_NOFOLLOW |
                 O_LARGEFILE | O_EXCL;

        if (cprm.limit < binfmt->min_coredump)
            goto fail_unlock;

        if (need_suid_safe && cn.corename[0] != '/') {
            printk(KERN_WARNING "Pid %d(%s) can only dump core "\
                "to fully qualified path!\n",
                task_tgid_vnr(current), current->comm);
            printk(KERN_WARNING "Skipping core dump\n");
            goto fail_unlock;
        }

        if (!need_suid_safe) {
            mm_segment_t old_fs;

            old_fs = get_fs();
            set_fs(KERNEL_DS);
            /*
             * If it doesn't exist, that's fine. If there's some
             * other problem, we'll catch it at the filp_open().
             */
            (void) sys_unlink((const char __user *)cn.corename);
            set_fs(old_fs);
        }

        /* create the core file */
        if (need_suid_safe) {
            struct path root;

            task_lock(&init_task);
            get_fs_root(init_task.fs, &root);
            task_unlock(&init_task);
            cprm.file = file_open_root(root.dentry, root.mnt,
                cn.corename, open_flags, 0600);
            path_put(&root);
        } else {
            cprm.file = filp_open(cn.corename, open_flags, 0600);
        }
        if (IS_ERR(cprm.file))
            goto fail_unlock;

        inode = file_inode(cprm.file);
        if (inode->i_nlink > 1)  /* the core file must not have multiple hard links */
            goto close_fail;
        if (d_unhashed(cprm.file->f_path.dentry))
            goto close_fail;
        if (!S_ISREG(inode->i_mode))  /* the core file must be a regular file */
            goto close_fail;
        if (!uid_eq(inode->i_uid, current_fsuid()))
            goto close_fail;
        if ((inode->i_mode & 0677) != 0600)
            goto close_fail;
        if (!(cprm.file->f_mode & FMODE_CAN_WRITE))  /* the core file must be writable */
            goto close_fail;
        if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
            goto close_fail;
    }

    /* get us an unshared descriptor table; almost always a no-op */
    retval = unshare_files(&displaced);
    if (retval)
        goto close_fail;
    if (displaced)
        put_files_struct(displaced);
    if (!dump_interrupted()) {
        file_start_write(cprm.file);
        /* Call the binary format's core_dump handler, which writes the data to cprm.file. */
        core_dumped = binfmt->core_dump(&cprm);
        file_end_write(cprm.file);
    }
    if (ispipe && core_pipe_limit)
        wait_for_dump_helpers(cprm.file);
close_fail:
    if (cprm.file)
        filp_close(cprm.file, NULL);
fail_dropcount:
    if (ispipe)
        atomic_dec(&core_dump_count);
fail_unlock:
    kfree(cn.corename);
    coredump_finish(mm, core_dumped);
    revert_creds(old_cred);
fail_creds:
    put_cred(cred);
fail:
    return;
}
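format_corename() builds the core file name from the core_pattern setting. It also determines how the dump is delivered: if ispipe is true, the dump is sent through a pipe to another program; otherwise it is saved directly to a file.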
static int format_corename(struct core_name *cn, struct coredump_params *cprm)
{
    const struct cred *cred = current_cred();
    const char *pat_ptr = core_pattern;
    int ispipe = (*pat_ptr == '|');  /* a leading | means the dump is handled through a pipe */
    int pid_in_pattern = 0;
    int err = 0;

    cn->used = 0;
    cn->corename = NULL;
    if (expand_corename(cn, core_name_size))
        return -ENOMEM;
    cn->corename[0] = '\0';

    if (ispipe)
        ++pat_ptr;

    /* Repeat as long as we have more pattern to process and more output
       space */
    while (*pat_ptr) {
        if (*pat_ptr != '%') {
            err = cn_printf(cn, "%c", *pat_ptr++);
        } else {
            switch (*++pat_ptr) {
            /* single % at the end, drop that */
            case 0:
                goto out;
            /* Double percent, output one percent */
            case '%':
                err = cn_printf(cn, "%c", '%');
                break;
            /* pid: %p is the process (thread group) ID in its own PID namespace */
            case 'p':
                pid_in_pattern = 1;
                err = cn_printf(cn, "%d",
                          task_tgid_vnr(current));
                break;
            /* global pid: %P is the process (thread group) ID in the initial PID namespace */
            case 'P':
                err = cn_printf(cn, "%d",
                          task_tgid_nr(current));
                break;
            /* %i is the ID of the current thread in its own PID namespace */
            case 'i':
                err = cn_printf(cn, "%d",
                          task_pid_vnr(current));
                break;
            /* %I is the ID of the current thread in the initial PID namespace */
            case 'I':
                err = cn_printf(cn, "%d",
                          task_pid_nr(current));
                break;
            /* uid: %u is the current user ID */
            case 'u':
                err = cn_printf(cn, "%u",
                        from_kuid(&init_user_ns,
                              cred->uid));
                break;
            /* gid: %g is the group ID */
            case 'g':
                err = cn_printf(cn, "%u",
                        from_kgid(&init_user_ns,
                              cred->gid));
                break;
            /* %d is the dump mode: SUID_DUMP_DISABLE/SUID_DUMP_USER/SUID_DUMP_ROOT */
            case 'd':
                err = cn_printf(cn, "%d",
                    __get_dumpable(cprm->mm_flags));
                break;
            /* signal that caused the coredump */
            case 's':
                err = cn_printf(cn, "%d",
                        cprm->siginfo->si_signo);
                break;
            /* UNIX time of coredump */
            case 't': {
                time64_t time;

                time = ktime_get_real_seconds();
                err = cn_printf(cn, "%lld", time);
                break;
            }
            /* hostname */
            case 'h':
                down_read(&uts_sem);
                err = cn_esc_printf(cn, "%s",
                          utsname()->nodename);
                up_read(&uts_sem);
                break;
            /* executable: %e is the comm name of the current task */
            case 'e':
                err = cn_esc_printf(cn, "%s", current->comm);
                break;
            /* %E is the path of the executable file */
            case 'E':
                err = cn_print_exe_file(cn);
                break;
            /* core limit size */
            case 'c':
                err = cn_printf(cn, "%lu",
                          rlimit(RLIMIT_CORE));
                break;
            default:
                break;
            }
            ++pat_ptr;
        }

        if (err)
            return err;
    }

out:
    if (!ispipe && !pid_in_pattern && core_uses_pid) {
        err = cn_printf(cn, ".%d", task_tgid_vnr(current));
        if (err)
            return err;
    }
    return ispipe;
}
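So writing core_%e(%I)_%E(%p)_sig(%s)_time(%t) to core_pattern produces names of the form core_<thread name>(<thread ID>)_<executable path>(<process ID>)_sig(<signal number>)_time(<time of the crash>).
umh_pipe_setup() creates a pipe that bridges the kernel's core dump code and the user-space program.
The kernel writes the core dump data into the pipe, and the user-space program receives and processes it at the other end.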
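On the user-space side, the helper only has to read its standard input until end of file. The following is a minimal sketch of such a helper's core logic, assuming it is registered in core_pattern with a leading |; copy_stream() and save_core_from_stdin() are hypothetical names, and a real helper would add its own policy (naming, compression, size limits).

```c
#include <stdio.h>

/* Copy everything from in to out; return the byte count, or -1 on error. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;
    long total = 0;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}

/*
 * Save the core image that the kernel writes into the pipe.
 * The read end of that pipe is the helper's stdin (fd 0).
 * Returns 0 on success, -1 on error.
 */
int save_core_from_stdin(const char *path)
{
    FILE *out = fopen(path, "wb");
    long n;

    if (!out)
        return -1;
    n = copy_stream(stdin, out);
    if (fclose(out) != 0)
        return -1;
    return n < 0 ? -1 : 0;
}
```

A helper built around this would call save_core_from_stdin() from its main() with a path assembled from the arguments that core_pattern passes on its command line.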
static int umh_pipe_setup(struct subprocess_info *info, struct cred *new)
{
    struct file *files[2];
    struct coredump_params *cp = (struct coredump_params *)info->data;
    /* Create a pipe: files[0] is the read end, files[1] is the write end. */
    int err = create_pipe_files(files, 0);
    if (err)
        return err;

    cp->file = files[1];  /* cp->file is the write end; the core dump is written here later */

    /*
     * Make files[0] the standard input of the program started by
     * usermodehelper, so it receives the core dump data through the pipe.
     */
    err = replace_fd(0, files[0], 0);
    fput(files[0]);
    /* and disallow core files too */
    current->signal->rlim[RLIMIT_CORE] = (struct rlimit){1, 1};

    return err;
}

int create_pipe_files(struct file **res, int flags)
{
    int err;
    struct inode *inode = get_pipe_inode();
    struct file *f;
    struct path path;
    static struct qstr name = { .name = "" };

    if (!inode)
        return -ENFILE;

    err = -ENOMEM;
    path.dentry = d_alloc_pseudo(pipe_mnt->mnt_sb, &name);
    if (!path.dentry)
        goto err_inode;
    path.mnt = mntget(pipe_mnt);

    d_instantiate(path.dentry, inode);

    f = alloc_file(&path, FMODE_WRITE, &pipefifo_fops);  /* create the write end of the pipe */
    if (IS_ERR(f)) {
        err = PTR_ERR(f);
        goto err_dentry;
    }

    f->f_flags = O_WRONLY | (flags & (O_NONBLOCK | O_DIRECT));
    f->private_data = inode->i_pipe;

    res[0] = alloc_file(&path, FMODE_READ, &pipefifo_fops);  /* create the read end of the pipe */
    if (IS_ERR(res[0])) {
        err = PTR_ERR(res[0]);
        goto err_file;
    }

    path_get(&path);
    res[0]->private_data = inode->i_pipe;
    res[0]->f_flags = O_RDONLY | (flags & O_NONBLOCK);
    res[1] = f;
    return 0;

err_file:
    put_filp(f);
err_dentry:
    free_pipe_info(inode->i_pipe);
    path_put(&path);
    return err;

err_inode:
    free_pipe_info(inode->i_pipe);
    iput(inode);
    return err;
}

int replace_fd(unsigned fd, struct file *file, unsigned flags)
{
    int err;
    struct files_struct *files = current->files;

    if (!file)
        return __close_fd(files, fd);

    if (fd >= rlimit(RLIMIT_NOFILE))
        return -EBADF;

    spin_lock(&files->file_lock);
    err = expand_files(files, fd);
    if (unlikely(err < 0))
        goto out_unlock;
    return do_dup2(files, file, fd, flags);

out_unlock:
    spin_unlock(&files->file_lock);
    return err;
}
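The Linux kernel supports several linux_binfmt formats; ELF is the most common.
For an ELF program, binfmt in do_coredump() is therefore elf_format, and binfmt->core_dump() is elf_core_dump().
elf_core_dump() dumps the VMAs of the current process, adds the relevant header information, and writes the result to the file.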
static struct linux_binfmt elf_format = {
    .module       = THIS_MODULE,
    .load_binary  = load_elf_binary,
    .load_shlib   = load_elf_library,
    .core_dump    = elf_core_dump,
    .min_coredump = ELF_EXEC_PAGESIZE,
};

static int elf_core_dump(struct coredump_params *cprm)
{
    int has_dumped = 0;
    mm_segment_t fs;
    int segs, i;
    size_t vma_data_size = 0;
    struct vm_area_struct *vma, *gate_vma;
    struct elfhdr *elf = NULL;
    loff_t offset = 0, dataoff;
    struct elf_note_info info = { };
    struct elf_phdr *phdr4note = NULL;
    struct elf_shdr *shdr4extnum = NULL;
    Elf_Half e_phnum;
    elf_addr_t e_shoff;
    elf_addr_t *vma_filesz = NULL;

    elf = kmalloc(sizeof(*elf), GFP_KERNEL);  /* allocate space for the ELF header */
    if (!elf)
        goto out;

    segs = current->mm->map_count;  /* number of memory mappings of the current process */
    segs += elf_core_extra_phdrs();  /* add the extra segments */

    gate_vma = get_gate_vma(current->mm);  /* add one segment for the gate VMA, if there is one */
    if (gate_vma != NULL)
        segs++;

    /* for notes section */
    segs++;  /* reserve one segment for PT_NOTE */

    /* If segs > PN_XNUM(0xffff), then e_phnum overflows. To avoid
     * this, kernel supports extended numbering. Have a look at
     * include/linux/elf.h for further information. */
    e_phnum = segs > PN_XNUM ? PN_XNUM : segs;

    /*
     * Collect all the non-memory information about the process for the
     * notes.  This also sets up the file header.
     */
    if (!fill_note_info(elf, e_phnum, &info, cprm->siginfo, cprm->regs))  /* fill_note_info() fills in info */
        goto cleanup;

    has_dumped = 1;

    fs = get_fs();
    /*
     * Widen the address limit so that kernel buffers can be passed to the
     * file write path. See "Accessing user-space files from the Linux
     * kernel: using get_fs()/set_fs()".
     */
    set_fs(KERNEL_DS);

    offset += sizeof(*elf);                    /* Elf header */
    offset += segs * sizeof(struct elf_phdr);  /* Program headers */

    /* Write notes phdr entry */
    {
        size_t sz = get_note_info_size(&info);

        sz += elf_coredump_extra_notes_size();

        phdr4note = kmalloc(sizeof(*phdr4note), GFP_KERNEL);
        if (!phdr4note)
            goto end_coredump;

        fill_elf_note_phdr(phdr4note, sz, offset);
        offset += sz;
    }

    dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);

    vma_filesz = kmalloc_array(segs - 1, sizeof(*vma_filesz), GFP_KERNEL);
    if (!vma_filesz)
        goto end_coredump;

    for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
            vma = next_vma(vma, gate_vma)) {
        unsigned long dump_size;

        /* mm_flags carries coredump_filter and decides which VMAs are dumped and which are skipped */
        dump_size = vma_dump_size(vma, cprm->mm_flags);
        vma_filesz[i++] = dump_size;
        vma_data_size += dump_size;
    }

    offset += vma_data_size;
    offset += elf_core_extra_data_size();
    e_shoff = offset;

    if (e_phnum == PN_XNUM) {
        shdr4extnum = kmalloc(sizeof(*shdr4extnum), GFP_KERNEL);
        if (!shdr4extnum)
            goto end_coredump;
        fill_extnum_info(elf, shdr4extnum, e_shoff, segs);
    }

    offset = dataoff;

    /*
     * Write the ELF header to cprm->file. With a pipe, all of this data goes
     * to the user-space program started by usermodehelper.
     */
    if (!dump_emit(cprm, elf, sizeof(*elf)))
        goto end_coredump;

    if (!dump_emit(cprm, phdr4note, sizeof(*phdr4note)))  /* write phdr4note to cprm->file */
        goto end_coredump;

    /* Write program headers for segments dump */
    for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
            vma = next_vma(vma, gate_vma)) {
        struct elf_phdr phdr;

        phdr.p_type = PT_LOAD;
        phdr.p_offset = offset;
        phdr.p_vaddr = vma->vm_start;
        phdr.p_paddr = 0;
        phdr.p_filesz = vma_filesz[i++];
        phdr.p_memsz = vma->vm_end - vma->vm_start;
        offset += phdr.p_filesz;
        phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0;
        if (vma->vm_flags & VM_WRITE)
            phdr.p_flags |= PF_W;
        if (vma->vm_flags & VM_EXEC)
            phdr.p_flags |= PF_X;
        phdr.p_align = ELF_EXEC_PAGESIZE;

        if (!dump_emit(cprm, &phdr, sizeof(phdr)))
            goto end_coredump;
    }

    if (!elf_core_write_extra_phdrs(cprm, offset))
        goto end_coredump;

    /* write out the notes section */
    if (!write_note_info(&info, cprm))
        goto end_coredump;

    if (elf_coredump_extra_notes_write(cprm))
        goto end_coredump;

    /* Align to page */
    if (!dump_skip(cprm, dataoff - cprm->pos))
        goto end_coredump;

    for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
            vma = next_vma(vma, gate_vma)) {
        unsigned long addr;
        unsigned long end;

        end = vma->vm_start + vma_filesz[i++];

        for (addr = vma->vm_start; addr < end; addr += PAGE_SIZE) {
            struct page *page;
            int stop;

            page = get_dump_page(addr);
            if (page) {
                void *kaddr = kmap(page);
                stop = !dump_emit(cprm, kaddr, PAGE_SIZE);
                kunmap(page);
                put_page(page);
            } else
                stop = !dump_skip(cprm, PAGE_SIZE);
            if (stop)
                goto end_coredump;
        }
    }
    dump_truncate(cprm);

    if (!elf_core_write_extra_data(cprm))
        goto end_coredump;

    if (e_phnum == PN_XNUM) {
        if (!dump_emit(cprm, shdr4extnum, sizeof(*shdr4extnum)))
            goto end_coredump;
    }

end_coredump:
    set_fs(fs);

cleanup:
    free_note_info(&info);
    kfree(shdr4extnum);
    kfree(vma_filesz);
    kfree(phdr4note);
    kfree(elf);
out:
    return has_dumped;
}

int dump_emit(struct coredump_params *cprm, const void *addr, int nr)
{
    struct file *file = cprm->file;
    loff_t pos = file->f_pos;
    ssize_t n;

    if (cprm->written + nr > cprm->limit)
        return 0;
    while (nr) {
        if (dump_interrupted())
            return 0;
        n = __kernel_write(file, addr, nr, &pos);
        if (n <= 0)
            return 0;
        file->f_pos = pos;
        cprm->written += n;
        cprm->pos += n;
        nr -= n;
    }
    return 1;
}
To check whether a file is a core file, run readelf -h on it: for a core file the Type field reads CORE (Core file).
The file command can identify it as well.
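The same check can be done programmatically by looking at the ELF header: the magic bytes come first, and the e_type field directly after the 16-byte e_ident array must be ET_CORE. This is a sketch assuming Linux's <elf.h> and a core file written in the host's byte order; is_elf_core() is a hypothetical helper name.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/*
 * Return 1 if buf (len bytes read from the start of a file) begins
 * with an ELF header whose type is ET_CORE, else 0.
 * e_type sits at offset EI_NIDENT in both 32-bit and 64-bit headers.
 */
int is_elf_core(const unsigned char *buf, size_t len)
{
    uint16_t e_type;

    if (len < EI_NIDENT + sizeof(e_type))
        return 0;
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)
        return 0;  /* not an ELF file */

    /* Assumes the file uses the host byte order. */
    memcpy(&e_type, buf + EI_NIDENT, sizeof(e_type));
    return e_type == ET_CORE;
}
```

Reading the first 64 bytes of a candidate file and passing them to is_elf_core() is enough for both ELF32 and ELF64 core files.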
References: "Core file format (structure of a Linux coredump file)". For how GDB parses a core file, see "How GDB recovers shared library information from a coredump file".
3. A coredump example
The following creates a simple program that produces a core dump, then analyzes the dump with gdb.
3.1 Example program
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int myfunc(int i) {
    *(int*)(NULL) = i; /* line 7 */
    return i - 1;
}

int main(int argc, char **argv) {
    /* Setup some memory. */
    char data_ptr[] = "string in data segment";
    char *mmap_ptr;
    char *text_ptr = "string in text segment";
    (void)argv;
    mmap_ptr = (char *)malloc(sizeof(data_ptr) + 1);
    strcpy(mmap_ptr, data_ptr);
    mmap_ptr[10] = 'm';
    mmap_ptr[11] = 'm';
    mmap_ptr[12] = 'a';
    mmap_ptr[13] = 'p';
    printf("text addr: %p\n", text_ptr);
    printf("data addr: %p\n", data_ptr);
    printf("mmap addr: %p\n", mmap_ptr);

    /* Call a function to prepare a stack trace. */
    return myfunc(argc);
}
Compile it with the following command. -ggdb3 produces extra debugging information for GDB; 3 is the highest level.
gcc -ggdb3 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
3.2 Analyzing the coredump with gdb
Enable core dumps with ulimit -c unlimited, then run ./main.out to produce a core file.
text addr: 0x4007d4
data addr: 0x7ffff28fdc30
mmap addr: 0x10bb010
Segmentation fault (core dumped)
Running gdb ./main.out core shows which signal caused the core dump (SIGSEGV), which file (main.c) and which function (myfunc()) it happened in, the exact line of code, and more.
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
...
Reading symbols from ./main.out...done.
[New LWP 8651]
Core was generated by `./main.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000400635 in myfunc (i=1) at main.c:7
7           *(int*)(NULL) = i; /* line 7 */
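For more detailed core+gdb analysis techniques, see "Offline analysis with core+gdb". If shared libraries need to be loaded during analysis, see "GDB shared library search paths".
4. Optimizing coredump usage (for embedded systems)
In /etc/profile, enable core dumps and load the core_pattern configuration: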
sysctl -p -q -e
ulimit -c unlimited
Configure /etc/sysctl.conf:
kernel.core_pattern=|/usr/bin/coredump_helper.sh core_%e_%I_%p_sig_%s_time_%t.gz
kernel.core_uses_pid=1
Add the script that processes the core dump (it reads the dump from standard input and compresses it):
#!/bin/sh
if [ ! -d "/var/coredump" ]; then
    mkdir -p /var/coredump
fi
gzip > "/var/coredump/$1"
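This produces files named core_<thread name>_<thread ID>_<process ID>_sig_<signal number>_time_<coredump time>.gz under /var/coredump.
5. Summary
This article has outlined how to configure core dumps (ulimit/core_pattern/coredump_filter), what triggers a core dump (SIG_KERNEL_COREDUMP_MASK), how the core file is generated (do_coredump()), how gdb recognizes a core file (see "How GDB recovers shared library information from a coredump file"), and how to analyze a core file with gdb to find the problem (gdb, then backtrace).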