References: http://lwn.net/Articles/322666/
http://blog.csdn.net/lcw_202/article/details/7290775
http://m.blog.chinaunix.net/uid-14528823-id-4567325.html
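1. Static probe points are implemented by calling the interfaces that ftrace provides from within the kernel code. They are called static because they are hard-coded in the kernel source and compiled statically into the kernel, so they cannot be changed dynamically once the kernel has been built. With the ftrace-related kernel configuration options enabled, static probe points are already placed at a number of key locations in the kernel, and their output can be inspected whenever it is needed.
2. Dynamic probe points work as follows: using the mcount mechanism, a few bytes are reserved at the entry of every function when the kernel is compiled. When ftrace is in use, those reserved bytes are replaced with the required instructions, for example a jump to the code that performs the probing.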
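The byte replacement described in point 2 can be sketched in user space. This is only a model under stated assumptions: the reserved slot is taken to be 5 bytes, filled with an x86 5-byte NOP while the probe is off and with a `call rel32` instruction (opcode 0xE8) while it is on. The names `make_call`, `enable_site`, and `disable_site` are hypothetical, and the sketch rewrites a plain byte array; page permissions and concurrent execution, which real kernel patching has to handle, are not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SITE_SIZE 5  /* assumed size of the reserved slot, in bytes */

/* Assumed 5-byte x86 NOP that fills the slot while the probe is disabled. */
static const uint8_t nop5[SITE_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Encode "call rel32": opcode 0xE8 followed by a little-endian displacement
 * measured from the end of the instruction (site_addr + SITE_SIZE). */
static void make_call(uint8_t out[SITE_SIZE], uint64_t site_addr, uint64_t target)
{
    uint32_t rel = (uint32_t)(target - (site_addr + SITE_SIZE));

    out[0] = 0xe8;
    out[1] = (uint8_t)(rel & 0xff);
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
}

/* Enable the probe: patch only if the slot still holds the expected NOP. */
static int enable_site(uint8_t *site, uint64_t site_addr, uint64_t tracer_addr)
{
    uint8_t call[SITE_SIZE];

    assert(site != NULL);
    if (memcmp(site, nop5, SITE_SIZE) != 0)
        return -1;                      /* unexpected contents, refuse to patch */
    make_call(call, site_addr, tracer_addr);
    memcpy(site, call, SITE_SIZE);
    return 0;
}

/* Disable the probe: patch only if the slot holds the call we installed. */
static int disable_site(uint8_t *site, uint64_t site_addr, uint64_t tracer_addr)
{
    uint8_t call[SITE_SIZE];

    assert(site != NULL);
    make_call(call, site_addr, tracer_addr);
    if (memcmp(site, call, SITE_SIZE) != 0)
        return -1;
    memcpy(site, nop5, SITE_SIZE);
    return 0;
}
```

Checking the old contents before writing is a design choice of this sketch: it keeps a mistaken address from silently corrupting unrelated bytes.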

ftrace takes advantage of gcc's profiling feature: the gcc -pg option inserts a call to mcount at the entry of every function.

If ftrace provides its own mcount stub function, it can use that hook to implement tracing.
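The stub idea can be modelled in user space as follows. This is a sketch, not the kernel's mcount: the entry call is written by hand through the hypothetical `ENTRY_HOOK()` macro, whereas with gcc -pg the compiler emits the call itself. The hook defaults to a do-nothing stub, and a tracer is installed by swapping a function pointer. All names here (`trace_hook`, `trace_stub`, `counting_tracer`, `register_tracer`) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature of a tracer callback: traced function name and its caller. */
typedef void (*trace_fn)(const char *func, const char *parent);

/* Default stub: does nothing, so untraced runs pay only for an indirect call. */
static void trace_stub(const char *func, const char *parent)
{
    (void)func;
    (void)parent;
}

static trace_fn trace_hook = trace_stub;

/* State kept by the example tracer. */
static int hits;
static const char *last_func;
static const char *last_parent;

/* Example tracer: counts entries and remembers the most recent one. */
static void counting_tracer(const char *func, const char *parent)
{
    assert(func != NULL);
    hits++;
    last_func = func;
    last_parent = parent;
}

/* Install a tracer, or restore the stub when fn is NULL. */
static void register_tracer(trace_fn fn)
{
    trace_hook = fn ? fn : trace_stub;
}

/* Hand-written stand-in for the call the compiler would insert at entry. */
#define ENTRY_HOOK(parent) trace_hook(__func__, (parent))

static int leaf(const char *caller)
{
    ENTRY_HOOK(caller);
    return 1;
}

static int outer(const char *caller)
{
    ENTRY_HOOK(caller);
    return leaf(__func__) + 1;
}
```

Calling `outer("main")` before `register_tracer(counting_tracer)` leaves `hits` at zero; calling it afterwards records one entry for `outer` and one for `leaf`.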
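However, adding trace code at the entry of every kernel function inevitably hurts kernel performance. To reduce that impact, ftrace supports dynamic tracing.
When CONFIG_DYNAMIC_FTRACE is selected, the kernel build invokes the recordmcount.pl script, which writes the address of each function's mcount call site into a special section: __mcount_loc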
1. scripts/Makefile.build:
ifdef CONFIG_FTRACE_MCOUNT_RECORD
ifdef BUILD_C_RECORDMCOUNT
ifeq ("$(origin RECORDMCOUNT_WARN)", "command line")
  RECORDMCOUNT_FLAGS = -w
endif
# Due to recursion, we must skip empty.o.
# The empty.o file is created in the make process in order to determine
# the target endianness and word size. It is made before all other C
# files, including recordmcount.
sub_cmd_record_mcount = \
    if [ $(@) != "scripts/mod/empty.o" ]; then \
        $(objtree)/scripts/recordmcount $(RECORDMCOUNT_FLAGS) "$(@)"; \
    fi;
recordmcount_source := $(srctree)/scripts/recordmcount.c \
    $(srctree)/scripts/recordmcount.h
else
sub_cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
    "$(if $(CONFIG_CPU_BIG_ENDIAN),big,little)" \
    "$(if $(CONFIG_64BIT),64,32)" \
    "$(OBJDUMP)" "$(OBJCOPY)" "$(CC) $(KBUILD_CFLAGS)" \
    "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
    "$(if $(part-of-module),1,0)" "$(@)";
recordmcount_source := $(srctree)/scripts/recordmcount.pl
endif
cmd_record_mcount = \
    if [ "$(findstring -pg,$(_c_flags))" = "-pg" ]; then \
        $(sub_cmd_record_mcount) \
    fi;
endif

define rule_cc_o_c
    $(call echo-cmd,checksrc) $(cmd_checksrc) \
    $(call echo-cmd,cc_o_c) $(cmd_cc_o_c); \
    $(cmd_modversions) \
    $(call echo-cmd,record_mcount) \
    $(cmd_record_mcount) \
    scripts/basic/fixdep $(depfile) $@ '$(call make-cmd,cc_o_c)' > \
        $(dot-target).tmp; \
    rm -f $(depfile); \
    mv -f $(dot-target).tmp $(dot-target).cmd
endef

# Built-in and composite module parts
$(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
    $(call cmd,force_checksrc)
    $(call if_changed_rule,cc_o_c)
2. include/asm-generic/vmlinux.lds.h
#ifdef CONFIG_FTRACE_MCOUNT_RECORD
#define MCOUNT_REC() . = ALIGN(8); \
    VMLINUX_SYMBOL(__start_mcount_loc) = .; \
    *(__mcount_loc) \
    VMLINUX_SYMBOL(__stop_mcount_loc) = .;
#else
#define MCOUNT_REC()
#endif
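The start/stop-symbol pattern used by MCOUNT_REC() has a user-space analogue that may help make it concrete. The sketch below assumes a GNU toolchain on an ELF target (for example gcc on Linux), where the linker defines `__start_<name>` and `__stop_<name>` for any section whose name is a valid C identifier. The section name `demo_mcount_loc`, the `RECORD_FN()` macro, and the helper functions are hypothetical. Unlike the kernel, which records mcount call sites through a build-time tool, this example records function addresses through a macro.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*fn_ptr)(void);

/* Place one pointer-sized record for fn into the custom section.
 * "used" keeps the compiler from discarding the otherwise unreferenced object. */
#define RECORD_FN(fn) \
    static const fn_ptr record_##fn \
    __attribute__((used, section("demo_mcount_loc"))) = fn

static void alpha(void) {}
static void beta(void) {}
static void gamma_fn(void) {}

RECORD_FN(alpha);
RECORD_FN(beta);
RECORD_FN(gamma_fn);

/* Provided by the linker (assumption: GNU ld or a compatible linker). */
extern const fn_ptr __start_demo_mcount_loc[];
extern const fn_ptr __stop_demo_mcount_loc[];

/* Number of records collected between the two boundary symbols. */
static size_t count_records(void)
{
    assert(__stop_demo_mcount_loc >= __start_demo_mcount_loc);
    return (size_t)(__stop_demo_mcount_loc - __start_demo_mcount_loc);
}

/* Walk the table the same way an init routine could walk the recorded entries. */
static int is_recorded(fn_ptr fn)
{
    const fn_ptr *p;

    for (p = __start_demo_mcount_loc; p < __stop_demo_mcount_loc; p++)
        if (*p == fn)
            return 1;
    return 0;
}
```

With the three `RECORD_FN()` lines above, `count_records()` reports three entries, and adding another record requires no change to the code that walks the table.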
ftrace_init initialization:
start_kernel
    |-->ftrace_init();
    |-->rest_init();
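Inside function_trace_call, ftrace records information about the function call (the function's address and its caller's address, as the call trace shows) and writes the result into the ring buffer. Users can read the contents of that ring buffer through the trace file in debugfs.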
function_trace_call
    |-->trace_function(tr, ip, parent_ip, flags, pc);
        |-->struct ftrace_event_call *call = &event_function;
        |   struct ring_buffer *buffer = tr->trace_buffer.buffer;
        |   struct ring_buffer_event *event;
        |   struct ftrace_entry *entry;
        |-->event = trace_buffer_lock_reserve(buffer, TRACE_FN, sizeof(*entry), flags, pc);
        |   |-->struct task_struct *tsk = current;
        |   |   entry->preempt_count = pc & 0xff;
        |   |   entry->pid = (tsk) ? tsk->pid : 0;
        |-->entry = ring_buffer_event_data(event);
        |-->entry->ip = ip;
        |-->entry->parent_ip = parent_ip;
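The reserve-then-fill sequence in the trace above can be illustrated with a small user-space ring buffer. This is a simplified sketch under stated assumptions: a single fixed-size buffer that overwrites its oldest entry when full, with no per-CPU buffers, locking, or timestamps. The types and functions (`struct fn_entry`, `struct ring`, `ring_reserve`, `ring_count`, `ring_peek`) are hypothetical, and `trace_function` here only borrows the kernel function's name and the `ip`/`parent_ip` fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 4  /* deliberately tiny so that wrap-around is easy to see */

/* One recorded function call. */
struct fn_entry {
    uintptr_t ip;             /* address of the traced function */
    uintptr_t parent_ip;      /* address of its caller */
    int       pid;            /* task that made the call */
    uint8_t   preempt_count;  /* low 8 bits of pc, as in the trace above */
};

struct ring {
    struct fn_entry slot[RB_SIZE];
    size_t head;              /* total number of entries ever written */
};

/* Reserve the next slot and fill in the common fields. */
static struct fn_entry *ring_reserve(struct ring *rb, int pid, unsigned int pc)
{
    struct fn_entry *e = &rb->slot[rb->head % RB_SIZE];

    rb->head++;
    e->pid = pid;
    e->preempt_count = (uint8_t)(pc & 0xff);
    return e;
}

/* Record one call: reserve an entry, then store ip and parent_ip. */
static void trace_function(struct ring *rb, uintptr_t ip, uintptr_t parent_ip,
                           int pid, unsigned int pc)
{
    struct fn_entry *e = ring_reserve(rb, pid, pc);

    e->ip = ip;
    e->parent_ip = parent_ip;
}

/* Number of entries currently readable. */
static size_t ring_count(const struct ring *rb)
{
    return rb->head < RB_SIZE ? rb->head : RB_SIZE;
}

/* Read entry i, where i = 0 is the oldest entry still in the buffer. */
static const struct fn_entry *ring_peek(const struct ring *rb, size_t i)
{
    size_t n = ring_count(rb);

    assert(i < n);
    return &rb->slot[(rb->head - n + i) % RB_SIZE];
}
```

After six calls to `trace_function` on a four-slot buffer, only the last four entries remain readable, which mirrors how a reader of a full trace buffer sees the most recent events.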
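Implementation of the irqsoff tracer
The irqsoff tracer is built on IRQ-Flags. When interrupts are disabled, the current timestamp is recorded; when interrupts are later re-enabled, the elapsed time is computed, which gives the length of time interrupts were disabled.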
#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT
#define local_irq_enable() \
    do { trace_hardirqs_on(); raw_local_irq_enable(); } while (0)
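The timing logic behind the irqsoff tracer can be sketched in user space. The sketch assumes timestamps are passed in explicitly rather than read from a clock, and it tracks only the longest interrupts-off interval. The names `struct irqsoff_state`, `trace_irqs_off`, and `trace_irqs_on` are hypothetical stand-ins for the hooks that the irq-flags macros call; the tracer's actual bookkeeping is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Bookkeeping for one CPU in this model. */
struct irqsoff_state {
    int      irqs_off;     /* non-zero while interrupts are considered off */
    uint64_t off_since;    /* timestamp taken when they were turned off */
    uint64_t max_latency;  /* longest interrupts-off interval seen so far */
};

/* Hook called when interrupts are disabled. */
static void trace_irqs_off(struct irqsoff_state *s, uint64_t now)
{
    if (s->irqs_off)
        return;            /* already off: keep the earliest timestamp */
    s->irqs_off = 1;
    s->off_since = now;
}

/* Hook called when interrupts are enabled again. */
static void trace_irqs_on(struct irqsoff_state *s, uint64_t now)
{
    uint64_t delta;

    if (!s->irqs_off)
        return;            /* nothing to measure */
    assert(now >= s->off_since);
    delta = now - s->off_since;
    if (delta > s->max_latency)
        s->max_latency = delta;
    s->irqs_off = 0;
}
```

Ignoring a second "off" event while interrupts are already off is a choice made in this sketch so that nested disables are measured from the outermost one.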
