內核版本:v3.4.xxx
一、前言
內核提供了方便查看當前系統TickDevice、活動的Timer列表以及Timer使用的統計信息。
內核分別用兩個節點來表示TimerList和Timer統計信息。
一個是/proc/timer_list,打印per_cpu的hrtimer_bases信息以及基於此的timer列表,包括三種時鍾MONOTONIC/REALTIME/BOOTTIME;以及Broadcast Tick Device和Per CPU Tick Device信息。
另一個是/proc/timer_stats,需要echo 1 > /proc/timer_stats打開,echo 0 /proc/timer_stats關閉。cat /proc/timer_stats可以獲取統計信息。
二、背景介紹
Per CPU TickDevice是維護當前CPU Tick和hrtimer的基礎,只有工作在OneShot模式下才能實現hrtimer。在CPU進入cpuidle后,Per CPU TickDevice可能會被關閉,這時候可以打開Broadcast Tick作為喚醒源。超時后,就會將系統從cpuidle退出,進入正常工作模式打開Per CPU TickDevice。
hrtimer有三種時鍾基准,這三種時鍾基准對應不同的時間獲取函數,但主要是基於TimeKeeping的統計。
三、認識timer_list
kernel/time/timer_list.c中init_timer_list_procfs創建/proc/timer_list節點。
static int timer_list_show(struct seq_file *m, void *v) { u64 now = ktime_to_ns(ktime_get()); int cpu; SEQ_printf(m, "Timer List Version: v0.6\n");------------------------------------------------------------------(1) SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES); SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now); for_each_online_cpu(cpu)----------------------------------------------------------------------------------------(2) print_cpu(m, cpu, now); SEQ_printf(m, "\n"); timer_list_show_tickdevices(m);--------------------------------------------------------------------------------(3) return 0; } |
(1)打印概況信息,hrtimer基准時鍾個數和系統MONOTONIC時間。
(2)這里是per_cpu信息,按照hrtimer_cpu_base->hrtimer_clock_base->hrtimer層級關系打印。
(3)打印Broadcast TickDevice和per_cpu的TickDevice設備信息。
下面根據一個示例來分析:
Timer List Version: v0.6 HRTIMER_MAX_CLOCK_BASES: 3 now at 86212951044093 nsecs cpu: 0-----------------------print_cpuCPU0的信息如下,從clock0到clock2 clock 0:---------------------clock0作為MONOTONIC使用ktime_get獲取當前時間,是timerkeeper的xtime和wall_to_monotonic之和。 .base: c0531950 .index: 0 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs------------和MONOTONIC相比的offset。 active timers:--------------print_active_timers打印所有的hrtimer,按照時間排列,timerqueue_iterate_next讀取rbtree的next。 --依次是:序列號、hrtimer、timer狀態、hrtimer超時函數、進程名/進程號 --超時絕對時間和相對時間。 #0: <c0532620>, tick_sched_timer, S:01, hrtimer_start_range_ns, swapper/0 # expires at 86212955000000-86212955000000 nsecs [in 3955907 to 3955907 nsecs] #1: <c11d1f20>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, xxx_log_agent/1326 # expires at 86212957402735-86212957452735 nsecs [in 6358642 to 6408642 nsecs] #2: <c1289ab8>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, xxx_ipsec_proxy/1349 # expires at 86213073541650-86213076541650 nsecs [in 122497557 to 125497557 nsecs] #3: <c11d3f20>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, xxx_log_agent/1327 # expires at 86213291925439-86213291975439 nsecs [in 340881346 to 340931346 nsecs] #4: <c105df20>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, xxx_log_agent/1324 # expires at 86213768121728-86213768171728 nsecs [in 817077635 to 817127635 nsecs] #5: <c108bf20>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, xxx_log_agent/1325 # expires at 86213768152245-86213768202245 nsecs [in 817108152 to 817158152 nsecs] #6: <c104fab8>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, at_ctl/785 # expires at 86214382959375-86214387959375 nsecs [in 1431915282 to 1436915282 nsecs] #7: <c12bbf20>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, dm_ci/1365 # expires at 86248421289453-86248421339453 nsecs [in 35470245360 to 35470295360 nsecs] #8: <c11d7ab8>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, udhcpd/1147 # expires at 86409238207327-86409338207327 nsecs [in 196287163234 to 196387163234 nsecs] #9: <c1aadd90>, it_real_fn, S:01, hrtimer_start, SoftTimerLoop/782 # expires at 86421323022854-86421323022854 nsecs [in 208371978761 to 208371978761 nsecs] #10: <c11dff20>, hrtimer_wakeup, S:01, hrtimer_start_range_ns, netmonitor/1313 # expires at 86421323999417-86421324049417 nsecs [in 208372955324 to 208373005324 nsecs] clock 1: .base: c0531988 .index: 1 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 946759810014430770 nsecs active timers: #0: <c0563760>, alarmtimer_fired, S:01, hrtimer_start, sample/1354 # expires at 946846022969655783-946846022969655783 nsecs [in 946759810018611690 to 946759810018611690 nsecs] clock 2: .base: c05319c0 .index: 2 .resolution: 1 nsecs .get_time: ktime_get_boottime .offset: 706858435 nsecs active timers: -------------------------------------------------------------------hrtimer_cpu_base部分,參照結構體解釋---------------------------------------------------------------------- .expires_next : 86212955000000 nsecs---------CPU層次看hrtimer信息,可以在上面找到對應的最近一次expires .hres_active : 1 .nr_events : 4648009 .nr_retries : 4467 .nr_hangs : 0 .max_hang_time : 0 nsecs ---------------------------------------------------------------------tick_sched部分,參照結構體解釋----------------------------------------------------------------------------- .nohz_mode : 2--------------------------------------nohz_mode對應NOHZ_MODE_INACTIVE(0)、NOHZ_MODE_LOWRES(1)、NOHZ_MODE_HIGHRES(2)。 .idle_tick : 86212945000000 nsecs .tick_stopped : 0 .idle_jiffies : 17182588 .idle_calls : 4632242 .idle_sleeps : 4573348 .idle_entrytime : 86212940790186 nsecs .idle_waketime : 86212940790186 nsecs .idle_exittime : 86212940790186 nsecs .idle_sleeptime : 84593571589272 nsecs .iowait_sleeptime: 0 nsecs .last_jiffies : 17182588 .next_jiffies : 17182604 .idle_expires : 86213020000000 nsecs jiffies: 17182590 Tick Device: mode: 1-----------------------Broadcast TickDevice Broadcast device Clock Event Device: <NULL> tick_broadcast_mask: 00000000 tick_broadcast_oneshot_mask: 00000000 Tick Device: mode: 1---------------------per_cpu TickDevice,詳細信息參照clock_event_device結構體解釋。 Per CPU device: 0 Clock Event Device: xxx_ap_timer0 max_delta_ns: 660764185443 min_delta_ns: 15385 mult: 13958644 shift: 32 mode: 3 next_event: 86212955000000 nsecs set_next_event: xxx_timer_set_next_event set_mode: xxx_timer_set_mode event_handler: hrtimer_interrupt retries: 165992 |
四、認識timer_stats
kernel/time/timer_stats.c中init_tstas_procfs創建了/proc/timer_stats節點。統計細心
在每次hrtimer超時函數中都會進行統計,必須打開CONFIG_TIMER_STATS才有效。
__hrtimer_start_range_ns-->raise_softirq_irqoff-->HRTIMER_SOFTIRQ-->run_hrtimer_softirq-->hrtimer_interrupt-->__run_hrtimer-->timer_stats_account_hrtimer |
timer_stats_account_hrtimer是執行統計信息的主體,這里面有一個timer_stats_active開關。是通過echo [0|1] > /proc/timer_stats進行開關。
void timer_stats_update_stats(void *timer, pid_t pid, void *startf, void *timerf, char *comm, unsigned int timer_flag) { /* * It doesn't matter which lock we take: */ raw_spinlock_t *lock; struct entry *entry, input; unsigned long flags; if (likely(!timer_stats_active))----------------------------------------是否打開開關 return; lock = &per_cpu(tstats_lookup_lock, raw_smp_processor_id()); input.timer = timer;---------------------------------------------------填充entry結構體 input.start_func = startf; input.expire_func = timerf; input.pid = pid; input.timer_flag = timer_flag; raw_spin_lock_irqsave(lock, flags); if (!timer_stats_active) goto out_unlock; entry = tstat_lookup(&input, comm);-------------------------------在tstat_hash_table中查找input,找到count遞增;沒找到新增一個entry插入hash表。 if (likely(entry)) entry->count++; else atomic_inc(&overflow_count); out_unlock: raw_spin_unlock_irqrestore(lock, flags); } |
顯示的地方在tstats_show,所有的統計信息存放在static struct entry entries[MAX_ENTRIES]結構體數組中,當前entry數目通過nr_entries來記錄。
static int tstats_show(struct seq_file *m, void *v) { struct timespec period; struct entry *entry; unsigned long ms; long events = 0; ktime_t time; int i; mutex_lock(&show_mutex); /* * If still active then calculate up to now: */ if (timer_stats_active) time_stop = ktime_get(); time = ktime_sub(time_stop, time_start); period = ktime_to_timespec(time); ms = period.tv_nsec / 1000000; seq_puts(m, "Timer Stats Version: v0.2\n"); seq_printf(m, "Sample period: %ld.%03ld s\n", period.tv_sec, ms); if (atomic_read(&overflow_count)) seq_printf(m, "Overflow: %d entries\n", atomic_read(&overflow_count)); for (i = 0; i < nr_entries; i++) {----------------------遍歷entries數組,數組內容在每次hrtimer中斷處理中進行更新。 entry = entries + i; if (entry->timer_flag & TIMER_STATS_FLAG_DEFERRABLE) { seq_printf(m, "%4luD, %5d %-16s ", entry->count, entry->pid, entry->comm); } else { seq_printf(m, " %4lu, %5d %-16s ", entry->count, entry->pid, entry->comm); } print_name_offset(m, (unsigned long)entry->start_func); seq_puts(m, " ("); print_name_offset(m, (unsigned long)entry->expire_func); seq_puts(m, ")\n"); events += entry->count; } ms += period.tv_sec * 1000; if (!ms) ms = 1; if (events && period.tv_sec) seq_printf(m, "%ld total events, %ld.%03ld events/sec\n", events, events * 1000 / ms, (events * 1000000 / ms) % 1000); else seq_printf(m, "%ld total events\n", events); mutex_unlock(&show_mutex); return 0; } |
一個實例分析:
Timer Stats Version: v0.2 Sample period: 42.997 s --依次是:timer count、進程號、進程名稱、超時函數、啟動timer函數 2112, 1300 xxx_log_agent hrtimer_start_range_ns (hrtimer_wakeup) 43, 1299 xxx_log_agent hrtimer_start_range_ns (hrtimer_wakeup) 43, 1298 xxx_log_agent hrtimer_start_range_ns (hrtimer_wakeup) 43, 1301 xxx_log_agent hrtimer_start_range_ns (hrtimer_wakeup) 5, 1881 sample hrtimer_start (alarmtimer_fired) 116, 0 swapper hrtimer_start_range_ns (tick_sched_timer) 14, 1355 xxx_ipsec_proxy hrtimer_start_range_ns (hrtimer_wakeup) 28, 0 swapper hrtimer_start (tick_sched_timer) 17D, 0 swapper cpufreq_interactive_timer_resched.constprop.8 (cpufreq_interactive_timer) 8, 794 at_ctl hrtimer_start_range_ns (hrtimer_wakeup) 6, 0 swapper run_timer_softirq (sync_supers_timer_fn) 4, 0 swapper DWC_TIMER_SCHEDULE (timer_callback) 4, 0 swapper DWC_TIMER_SCHEDULE (timer_callback) 4, 0 swapper DWC_TIMER_SCHEDULE (timer_callback) 4, 0 swapper br_hello_timer_expired (br_hello_timer_expired) 3, 613 zx_wdt_thread msleep (process_timeout) 2D, 4 kworker/0:0 neigh_periodic_work (delayed_work_timer_fn) 2D, 4 kworker/0:0 neigh_periodic_work (delayed_work_timer_fn) 2, 1366 adbd DWC_TIMER_SCHEDULE (timer_callback) 1, 0 swapper igmp6_group_queried (igmp6_timer_handler) 1, 1371 dm_ci hrtimer_start_range_ns (hrtimer_wakeup) 1, 0 swapper igmp6_group_queried (igmp6_timer_handler) 1, 0 swapper igmp6_group_queried (igmp6_timer_handler) 2464 total events, 57.306 events/sec |
五、使用
1.通過timer_stats可以知道系統的timer使用程度,一定程度上反映了進程的活躍狀態。
2.timer_list可以知道系統TickDevice相關設備信息。
3.還可以知道timer狀態,以及即將發生的Timer。
4.系統相關信息尤其是tick_sched(系統調度jiffies、idle、iowait等)和hrtimer_cpu_base(retries、hang)部分。