Using Valgrind's Callgrind

Reposted article. 2018-10-19 20:11. Tags: Linux / Algorithm / Valgrind

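Introduction to Callgrind

Callgrind records the call relationships between a program's functions and the cost attributed to each function (by default, the number of instructions executed rather than wall-clock time).
Callgrind detects calls and returns by relying on the platform's explicit call and return instructions. It works best on x86 and amd64. On PowerPC, ARM Thumb, and MIPS, which lack such explicit instructions, it has to fall back on heuristics and works less well.
gprof2dot can turn the profile results into an image.
There is a good Stack Overflow question about profiling C++ code.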

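Profiling the whole program

First run valgrind --tool=callgrind ./prog_name. When the run finishes, Callgrind writes a profile file named callgrind.out.X, where X is the process ID.
Analyze the result file with KCachegrind: kcachegrind callgrind.out.X. See the KCachegrind documentation.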

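Profiling only part of the program

A major drawback of the approach above is that it profiles the entire program, which makes the run very slow. To profile only one part of the program, proceed as follows:

In one shell, run valgrind --tool=callgrind --dump-instr=yes -v --instr-atstart=no ./prog_name > log.txt. Here --dump-instr=yes collects data at instruction granularity so that the assembly can be annotated, and --instr-atstart=no keeps instrumentation switched off when the program starts, so that you can choose when profiling begins.
When the program reaches the part you want to profile, run callgrind_control -i on in another shell.
When that part has finished, run callgrind_control -k, which terminates the profiled run.
Analyze the resulting callgrind.out.X file with KCachegrind.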

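A more advanced approach

The approach above gives only rough control over the profiled region and is not very practical. According to the documentation, client requests placed in the source code can control exactly where Callgrind starts and stops profiling: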


#include <valgrind/callgrind.h>

//codes...

//request callgrind to start full profile
CALLGRIND_START_INSTRUMENTATION;

//codes...

//request callgrind to stop full profile
CALLGRIND_STOP_INSTRUMENTATION;

See the callgrind.h header file for details. The two requests above are in fact macros defined in that header:

/* Start full callgrind instrumentation if not already switched on.
   When cache simulation is done, it will flush the simulated cache;
   this will lead to an artifical cache warmup phase afterwards with
   cache misses which would not have happened in reality. */
#define CALLGRIND_START_INSTRUMENTATION                              \
  VALGRIND_DO_CLIENT_REQUEST_STMT(VG_USERREQ__START_INSTRUMENTATION, \
                                  0, 0, 0, 0, 0)

/* Stop full callgrind instrumentation if not already switched off.
   This flushes Valgrinds translation cache, and does no additional
   instrumentation afterwards, which effectivly will run at the same
   speed as the "none" tool (ie. at minimal slowdown).
   Use this to bypass Callgrind aggregation for uninteresting code parts.
   To start Callgrind in this mode to ignore the setup phase, use
   the option "--instr-atstart=no". */
#define CALLGRIND_STOP_INSTRUMENTATION                               \
  VALGRIND_DO_CLIENT_REQUEST_STMT(VG_USERREQ__STOP_INSTRUMENTATION,  \
                                  0, 0, 0, 0, 0)
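
To make the start/stop requests concrete, here is a minimal sketch of a translation unit that instruments a single call. The function names (sum_of_squares, setup_input_size, profile_hot_region) are hypothetical and exist only for illustration. The no-op fallback macros are an assumption of this sketch so that the file still compiles on a machine without the Valgrind headers; they are not part of Valgrind.

```c
#include <stdint.h>

/* Use the real client-request macros when the Valgrind headers are
   installed; otherwise fall back to no-ops so the file still compiles
   (an assumption of this sketch, not something Valgrind provides). */
#if defined(__has_include)
#  if __has_include(<valgrind/callgrind.h>)
#    include <valgrind/callgrind.h>
#  endif
#endif
#ifndef CALLGRIND_START_INSTRUMENTATION
#  define CALLGRIND_START_INSTRUMENTATION ((void)0)
#  define CALLGRIND_STOP_INSTRUMENTATION  ((void)0)
#endif

/* Hypothetical hot function: the code we actually want to profile. */
static uint64_t sum_of_squares(uint32_t n)
{
    uint64_t total = 0;
    for (uint32_t i = 1; i <= n; ++i)
        total += (uint64_t)i * i;
    return total;
}

/* Hypothetical setup work that should stay out of the profile. */
static uint32_t setup_input_size(void)
{
    return 1000;
}

/* Instruments only the call to sum_of_squares(). */
uint64_t profile_hot_region(void)
{
    uint32_t n = setup_input_size();     /* before instrumentation starts */

    CALLGRIND_START_INSTRUMENTATION;     /* switch instrumentation on  */
    uint64_t result = sum_of_squares(n);
    CALLGRIND_STOP_INSTRUMENTATION;      /* switch instrumentation off */

    return result;
}
```

Calling profile_hot_region() from main and running the program with valgrind --tool=callgrind --instr-atstart=no ./prog_name should then restrict the profile to the region between the two requests. Outside Valgrind the requests do nothing useful, so the binary can also be run normally.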

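Dumping multiple times in one program run

CALLGRIND_DUMP_STATS; makes Callgrind write a dump immediately. A dump is written every time this request is executed, so placing it inside a for loop, for example, produces as many dumps as there are iterations. Note that the CALLGRIND_START_INSTRUMENTATION and CALLGRIND_STOP_INSTRUMENTATION pair only restricts what Callgrind counts to the code between the two requests: even if the pair sits inside a for loop, only one dump is produced unless CALLGRIND_DUMP_STATS; is also used.
CALLGRIND_ZERO_STATS; resets the counters Callgrind has collected so far.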
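
The per-iteration dump can be sketched as follows. The names process_batch and run_batches are hypothetical, and the no-op fallback macros are again an assumption made only so that the file compiles without the Valgrind headers installed.

```c
#include <stdint.h>

/* Real macros if the Valgrind headers are available, otherwise no-ops
   (an assumption of this sketch). */
#if defined(__has_include)
#  if __has_include(<valgrind/callgrind.h>)
#    include <valgrind/callgrind.h>
#  endif
#endif
#ifndef CALLGRIND_DUMP_STATS
#  define CALLGRIND_DUMP_STATS ((void)0)
#  define CALLGRIND_ZERO_STATS ((void)0)
#endif

/* Hypothetical unit of work: returns (batch + 1) * (1 + 2 + ... + 100). */
static uint64_t process_batch(uint32_t batch)
{
    uint64_t total = 0;
    for (uint32_t i = 1; i <= 100; ++i)
        total += (uint64_t)(batch + 1) * i;
    return total;
}

/* Writes one profile dump per batch when run under Callgrind. */
uint64_t run_batches(uint32_t batches)
{
    uint64_t checksum = 0;
    for (uint32_t b = 0; b < batches; ++b) {
        CALLGRIND_ZERO_STATS;            /* discard cost collected so far */
        checksum += process_batch(b);
        CALLGRIND_DUMP_STATS;            /* one dump for this iteration   */
    }
    return checksum;
}
```

Under valgrind --tool=callgrind, a call such as run_batches(3) should produce three dumps, each covering roughly one call to process_batch. Zeroing the counters at the top of the loop is a design choice of this sketch to keep earlier costs out of each dump.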

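With the profile dump opened in KCachegrind, sorting by the Sl column makes it easy to see which function accounts for the largest share of the cost, so that optimization can be targeted there.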
