1. 介紹
Perf is a profiler tool for Linux 2.6+ based systems that abstracts away CPU hardware differences in Linux performance measurements and presents a simple commandline interface. Perf is based on the perf_events interface exported by recent versions of the Linux kernel. This article demonstrates the perf tool through example runs. Output was obtained on a Ubuntu 11.04 system with kernel 2.6.38-8-generic results running on an HP 6710b with dual-core Intel Core2 T7100 CPU). For readability, some output is abbreviated using ellipsis ([...]).
2. perf支持的功能
perf usage: perf [--version] [--help] COMMAND [ARGS] The most commonly used perf commands are: annotate Read perf.data (created by perf record) and display annotated code archive Create archive with object files with build-ids found in perf.data file bench General framework for benchmark suites buildid-cache Manage <tt>build-id</tt> cache. buildid-list List the buildids in a perf.data file diff Read two perf.data files and display the differential profile inject Filter to augment the events stream with additional information kmem Tool to trace/measure kernel memory(slab) properties kvm Tool to trace/measure kvm guest os list List all symbolic event types lock Analyze lock events probe Define new dynamic tracepoints record Run a command and record its profile into perf.data report Read perf.data (created by perf record) and display the profile sched Tool to trace/measure scheduler properties (latencies) script Read perf.data (created by perf record) and display trace output stat Run a command and gather performance counter statistics test Runs sanity tests. timechart Tool to visualize total system behavior during a workload top System profiling tool. See 'perf help COMMAND' for more information on a specific command.
3. perf的編譯方法
編譯ARM版本的perf可以使用下面命令編譯:
cd kernel/tools/perf export CROSS_COMPILE=arm-linux-gnueabihf- /usr/bin/make ARCH=arm LDFLAGS="-static"
4. perf的常用debug方法
# CPU counter statistics for the specified command: perf stat command # Detailed CPU counter statistics (includes extras) for the specified command: perf stat -d command # CPU counter statistics for the specified PID, until Ctrl-C: perf stat -p PID # CPU counter statistics for the entire system, for 5 seconds: perf stat -a sleep 5 # Various basic CPU statistics, system wide, for 10 seconds: perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles -a sleep 10 # Various CPU level 1 data cache statistics for the specified command: perf stat -e L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores command # Various CPU data TLB statistics for the specified command: perf stat -e dTLB-loads,dTLB-load-misses,dTLB-prefetch-misses command # Various CPU last level cache statistics for the specified command: perf stat -e LLC-loads,LLC-load-misses,LLC-stores,LLC-prefetches command # Using raw PMC counters, eg, counting unhalted core cycles: perf stat -e r003c -a sleep 5 # PMCs: counting cycles and frontend stalls via raw specification: perf stat -e cycles -e cpu/event=0x0e,umask=0x01,inv,cmask=0x01/ -a sleep 5 # Count syscalls per-second system-wide: perf stat -e raw_syscalls:sys_enter -I 1000 -a # Count system calls by type for the specified PID, until Ctrl-C: perf stat -e 'syscalls:sys_enter_*' -p PID # Count system calls by type for the entire system, for 5 seconds: perf stat -e 'syscalls:sys_enter_*' -a sleep 5 # Count scheduler events for the specified PID, until Ctrl-C: perf stat -e 'sched:*' -p PID # Count scheduler events for the specified PID, for 10 seconds: perf stat -e 'sched:*' -p PID sleep 10 # Count ext4 events for the entire system, for 10 seconds: perf stat -e 'ext4:*' -a sleep 10 # Count block device I/O events for the entire system, for 10 seconds: perf stat -e 'block:*' -a sleep 10 # Count all vmscan events, printing a report every second: perf stat -e 'vmscan:*' -a -I 1000
reference:
https://perf.wiki.kernel.org/index.php/Tutorial
http://www.brendangregg.com/perf.html#StaticKernelTracing