SystemTap - Installation


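I installed SystemTap by following the Installation and Setup section of the SystemTap Beginners Guide. It turned out to be less straightforward than expected, so I am recording the process here.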

 

Environment

  • Linux distribution: CentOS Linux release 7.4.1708 (Core)
  • Kernel version: 3.10.0-693.2.2.el7.x86_64
  • uname -a: Linux hostname 3.10.0-693.2.2.el7.x86_64 #1 SMP Tue Sep 12 22:26:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

 

Installation

Installing SystemTap

First install the following two RPM packages:

  • systemtap
  • systemtap-runtime

Run the following command as root:

# yum install systemtap systemtap-runtime

 

Before running SystemTap, the required kernel information packages must also be installed. On modern systems, stap-prep can install them, as follows:

# stap-prep
Need to install the following packages:
kernel-devel-3.10.0-693.2.2.el7.x86_64
kernel-debuginfo-3.10.0-693.2.2.el7.x86_64
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
No package kernel-devel-3.10.0-693.2.2.el7.x86_64 available.
No package kernel-debuginfo-3.10.0-693.2.2.el7.x86_64 available.
Error: Nothing to do
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Could not find debuginfo for main pkg: kernel-3.10.0-693.2.2.el7.x86_64
No debuginfo packages available to install
package kernel-devel-3.10.0-693.2.2.el7.x86_64 is not installed
package kernel-debuginfo-3.10.0-693.2.2.el7.x86_64 is not installed
problem installing rpm(s) kernel-devel-3.10.0-693.2.2.el7.x86_64 kernel-debuginfo-3.10.0-693.2.2.el7.x86_64

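When stap-prep ran, it detected that kernel-devel-3.10.0-693.2.2.el7.x86_64 and kernel-debuginfo-3.10.0-693.2.2.el7.x86_64 still had to be installed (in fact the kernel-debuginfo-common package is needed as well), but the automatic installation failed. The packages can be installed manually as described below.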

 

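Manually installing the required kernel information packages

SystemTap needs the kernel symbol files in order to probe the kernel. The required kernel information is contained in the following three packages: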

  • kernel-debuginfo
  • kernel-debuginfo-common
  • kernel-devel

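The installed packages must match the running kernel version exactly. The kernel version in this environment is 3.10.0-693.2.2.el7.x86_64, so the packages to install are: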

  • kernel-debuginfo-3.10.0-693.2.2.el7.x86_64
  • kernel-debuginfo-common-x86_64-3.10.0-693.2.2.el7.x86_64
  • kernel-devel-3.10.0-693.2.2.el7.x86_64
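
The package names above can also be derived from the running kernel rather than typed by hand. The following is a minimal sketch, not part of the original steps: the function name required_pkgs is made up for illustration, and the architecture infix in the kernel-debuginfo-common name is an assumption based on the RPM file name used later in this post.

```shell
#!/bin/sh
# Print the three package names that match a given kernel release.
# required_pkgs is a hypothetical helper, not a SystemTap or yum command.
required_pkgs() {
    kver="$1"    # kernel release, e.g. 3.10.0-693.2.2.el7.x86_64
    arch="$2"    # machine architecture, e.g. x86_64
    echo "kernel-debuginfo-$kver"
    # Assumption: the common package carries the architecture in its name,
    # as in kernel-debuginfo-common-x86_64-<release>.rpm.
    echo "kernel-debuginfo-common-$arch-$kver"
    echo "kernel-devel-$kver"
}

# Names for the kernel that is running right now.
required_pkgs "$(uname -r)" "$(uname -m)"
```

On the machine described in this post, uname -r reports 3.10.0-693.2.2.el7.x86_64, so the output should correspond to the three names listed above.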

Note: the stap-prep run in the previous step had already printed the exact names and kernel versions of the required packages on screen.

 

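Next, install these three packages. Do not simply run yum install kernel-debuginfo kernel-debuginfo-common kernel-devel: even if yum finds the packages, it installs the latest versions and does not match them to the running kernel. So download the RPM packages and install them one by one with the rpm command.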

 

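For CentOS, kernel symbol files are generally available at http://debuginfo.centos.org, which hosts a very complete set of packages for every version. Downloads from mainland China tend to be slow, though, especially for kernel-debuginfo, which is large. So I downloaded the kernel-debuginfo-common package from debuginfo.centos.org and, after a Google search, fetched the other two packages from two mirrors (both are Scientific Linux mirrors, as their URLs show). Only after downloading did I discover that there is a pitfall here, which is covered later in this post.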

wget https://ftp.sjtu.edu.cn/scientific/7/archive/debuginfo/kernel-debuginfo-3.10.0-693.2.2.el7.x86_64.rpm
wget http://debuginfo.centos.org/7/x86_64/kernel-debuginfo-common-x86_64-3.10.0-693.2.2.el7.x86_64.rpm
wget ftp://mirror.switch.ch/pool/4/mirror/scientificlinux/7.0/x86_64/updates/security/kernel-devel-3.10.0-693.2.2.el7.x86_64.rpm

Once downloaded, install them directly with the rpm command:

# rpm -ivh kernel-debuginfo-common-x86_64-3.10.0-693.2.2.el7.x86_64.rpm
# rpm -ivh kernel-debuginfo-3.10.0-693.2.2.el7.x86_64.rpm
# rpm -ivh kernel-devel-3.10.0-693.2.2.el7.x86_64.rpm

That completes the installation steps. Next, test whether SystemTap runs correctly.

 

Running SystemTap

To test whether stap runs correctly, use the following simple command, which prints a message:

# stap -e 'probe begin{printf("Hello, World"); exit();}'

It failed. The output was:

# stap -e 'probe begin{printf("Hello, World"); exit();}'
ERROR: module version mismatch (#1 SMP Tue Sep 12 10:10:26 CDT 2017 vs #1 SMP Tue Sep 12 22:26:13 UTC 2017), release 3.10.0-693.2.2.el7.x86_64
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

The error message is: "ERROR: module version mismatch (#1 SMP Tue Sep 12 10:10:26 CDT 2017 vs #1 SMP Tue Sep 12 22:26:13 UTC 2017)".

 

Fixing the "ERROR: module version mismatch" problem

Run stap with the -v option to print more information and look for further clues:

# stap -e 'probe begin{printf("Hello, World"); exit();}' -v
Pass 1: parsed user script and 470 library scripts using 228224virt/41280res/3348shr/38020data kb, in 330usr/20sys/346real ms.
Pass 2: analyzed script: 1 probe, 1 function, 0 embeds, 0 globals using 229148virt/42332res/3536shr/38944data kb, in 0usr/0sys/6real ms.
Pass 3: using cached /root/.systemtap/cache/0b/stap_0bc9e27aef7a1de50ea41889a27fc524_1010.c
Pass 4: using cached /root/.systemtap/cache/0b/stap_0bc9e27aef7a1de50ea41889a27fc524_1010.ko
Pass 5: starting run.
ERROR: module version mismatch (#1 SMP Tue Sep 12 10:10:26 CDT 2017 vs #1 SMP Tue Sep 12 22:26:13 UTC 2017), release 3.10.0-693.2.2.el7.x86_64
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run completed in 0usr/10sys/38real ms.
Pass 5: run failed.  [man error::pass5]

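Open the generated C file (vi /root/.systemtap/cache/0b/stap_0bc9e27aef7a1de50ea41889a27fc524_1010.c) and search for the error message "module version mismatch". The error is raised on line 13 of the excerpt below. As for where UTS_RELEASE and UTS_VERSION are set, a quick Google search answers that.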

 1 #ifndef STP_NO_VERREL_CHECK
 2     const char* release = UTS_RELEASE;
 3     #ifdef STAPCONF_GENERATED_COMPILE
 4     const char* version = UTS_VERSION;
 5     #endif
 6     might_sleep();
 7     if (strcmp (release, "3.10.0-693.2.2.el7.x86_64")) {
 8       _stp_error ("module release mismatch (%s vs %s)", release, "3.10.0-693.2.2.el7.x86_64");
 9       rc = -EINVAL;
10     }
11     #ifdef STAPCONF_GENERATED_COMPILE
12     if (strcmp (utsname()->version, version)) {
13       _stp_error ("module version mismatch (%s vs %s), release %s", version, utsname()->version, release);
14       rc = -EINVAL;
15     }
16     #endif
17     #endif
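
Reading the _stp_error call on line 13, the first string in the message is UTS_VERSION as compiled into the module (taken from the kernel headers), and the second is utsname()->version of the running kernel. The sketch below splits an error line into those parts. The function name parse_mismatch and the output labels are made up for illustration; they are not part of SystemTap.

```shell
#!/bin/sh
# Split a "module version mismatch" line into its three fields.
# parse_mismatch is a hypothetical helper written for this post.
parse_mismatch() {
    msg="$1"
    # Literal "(" and ")" delimit the message; \( \) are sed capture groups.
    pattern='.*module version mismatch (\(.*\) vs \(.*\)), release \(.*\)$'
    # First field: UTS_VERSION from the headers the module was built against.
    printf '%s\n' "$msg" | sed -n "s/$pattern/headers: \1/p"
    # Second field: version string of the kernel that is actually running.
    printf '%s\n' "$msg" | sed -n "s/$pattern/running: \2/p"
    # Third field: kernel release, which matched in this case.
    printf '%s\n' "$msg" | sed -n "s/$pattern/release: \3/p"
}

parse_mismatch 'ERROR: module version mismatch (#1 SMP Tue Sep 12 10:10:26 CDT 2017 vs #1 SMP Tue Sep 12 22:26:13 UTC 2017), release 3.10.0-693.2.2.el7.x86_64'
```

For the error shown earlier, this labels the CDT timestamp as coming from the headers and the UTC timestamp as coming from the running kernel, which is the clue followed in the rest of this section.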

 

Two posts mention the same pitfall; the links are in the References at the bottom. Search all files of the kernel-devel package for the variable UTS_VERSION:

# rpm -ql kernel-devel | xargs grep UTS_VERSION
/usr/src/kernels/3.10.0-693.2.2.el7.x86_64/include/generated/compile.h:#define UTS_VERSION "#1 SMP Tue Sep 12 10:10:26 CDT 2017"

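As shown, compile.h contains #define UTS_VERSION "#1 SMP Tue Sep 12 10:10:26 CDT 2017". This should look familiar: compared with the stap error above, the first timestamp in the mismatch message is exactly this one. compile.h is generated automatically and its timestamp probably reflects when the kernel was built. However, stap requires this string to be exactly identical to the build timestamp that uname -a reports on the running system. Installing the native CentOS kernel-devel package should avoid the problem.

Another simple way to solve the problem is to edit this compile.h file directly. The original file looks like this: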
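
The comparison described above can be run before invoking stap at all. The following is a rough sketch under a few assumptions: the helper names uts_version_of and check_headers are made up for illustration, and the sed pattern assumes UTS_VERSION is defined on a single line, as in the grep output above.

```shell
#!/bin/sh
# Print the UTS_VERSION string defined in a compile.h file.
# Both helpers here are hypothetical, written for this post.
uts_version_of() {
    sed -n 's/^#define UTS_VERSION "\(.*\)"[[:space:]]*$/\1/p' "$1"
}

# Compare the headers' UTS_VERSION with a running-kernel version string.
# Prints "match" on success; prints both strings and returns 1 otherwise.
check_headers() {
    compile_h="$1"    # path to include/generated/compile.h
    running="$2"      # normally the output of `uname -v`
    built="$(uts_version_of "$compile_h")"
    if [ "$built" = "$running" ]; then
        echo "match"
    else
        echo "mismatch: headers='$built' running='$running'"
        return 1
    fi
}
```

On the affected machine it could be called as check_headers "/usr/src/kernels/$(uname -r)/include/generated/compile.h" "$(uname -v)", which should report a mismatch for the kernel-devel package installed here.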

# cat /usr/src/kernels/3.10.0-693.2.2.el7.x86_64/include/generated/compile.h
/* This file is auto generated, version 1 */
/* SMP */
#define UTS_MACHINE "x86_64"
#define UTS_VERSION "#1 SMP Tue Sep 12 10:10:26 CDT 2017"
#define LINUX_COMPILE_BY "mockbuild"
#define LINUX_COMPILE_HOST "sl7-uefisign.fnal.gov"
#define LINUX_COMPILER "gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) "

Change the #define UTS_VERSION line as follows:

#define UTS_VERSION "#1 SMP Tue Sep 12 10:10:26 CDT 2017" -> #define UTS_VERSION "#1 SMP Tue Sep 12 22:26:13 UTC 2017"
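
The same one-line edit can be scripted. This is only a sketch of the manual change above, not a recommended general fix: the function name patch_uts_version is made up, the original file is kept with an .orig suffix, and the new version string is assumed to contain no "|", "&" or backslash characters (true for the string used here).

```shell
#!/bin/sh
# Rewrite the UTS_VERSION line of a compile.h file, keeping a backup.
# patch_uts_version is a hypothetical helper written for this post.
patch_uts_version() {
    compile_h="$1"      # path to include/generated/compile.h
    new_version="$2"    # normally the output of `uname -v`
    # Keep the packaged file so the change can be reverted.
    cp -p "$compile_h" "$compile_h.orig"
    # Replace only the UTS_VERSION define; all other lines pass through.
    sed "s|^#define UTS_VERSION .*|#define UTS_VERSION \"$new_version\"|" \
        "$compile_h.orig" > "$compile_h"
}
```

As root, it could be called as patch_uts_version "/usr/src/kernels/$(uname -r)/include/generated/compile.h" "$(uname -v)". Note that this modifies a file owned by the kernel-devel package.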

 

Run stap again:

# stap -e 'probe begin{printf("Hello, World"); exit();}' -v
Pass 1: parsed user script and 470 library scripts using 228220virt/41276res/3348shr/38016data kb, in 350usr/10sys/355real ms.
Pass 2: analyzed script: 1 probe, 1 function, 0 embeds, 0 globals using 229144virt/42328res/3536shr/38940data kb, in 0usr/0sys/6real ms.
Pass 3: using cached /root/.systemtap/cache/0b/stap_0bc9e27aef7a1de50ea41889a27fc524_1010.c
Pass 4: using cached /root/.systemtap/cache/0b/stap_0bc9e27aef7a1de50ea41889a27fc524_1010.ko
Pass 5: starting run.
ERROR: module version mismatch (#1 SMP Tue Sep 12 10:10:26 CDT 2017 vs #1 SMP Tue Sep 12 22:26:13 UTC 2017), release 3.10.0-693.2.2.el7.x86_64
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run completed in 0usr/10sys/38real ms.
Pass 5: run failed.  [man error::pass5]

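It still fails, because the intermediate C file and the .ko module both came from the cache (see the Pass 3 and Pass 4 lines above). After deleting those cached files and running the command again, it succeeds.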

# stap -e 'probe begin{printf("Hello, World"); exit();}'
Hello, World
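
The cache cleanup that made this run succeed can be sketched as follows. The function name clear_stap_cache is made up for illustration. The default path corresponds to the /root/.systemtap/cache directory seen in the Pass 3 and Pass 4 lines when running as root. Removing the whole cache directory rather than the two individual files is a simplification.

```shell
#!/bin/sh
# Remove the SystemTap cache so the next stap run recompiles the module.
# clear_stap_cache is a hypothetical helper written for this post.
clear_stap_cache() {
    # Default to the per-user cache directory; a path may be passed instead.
    cache_dir="${1:-$HOME/.systemtap/cache}"
    if [ -d "$cache_dir" ]; then
        rm -rf "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "no cache at $cache_dir"
    fi
}
```

Calling clear_stap_cache with no argument as root targets /root/.systemtap/cache. The next stap run then has to regenerate the C file and the module instead of reusing the stale ones.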

 

References

https://sourceware.org/systemtap/SystemTap_Beginners_Guide/using-systemtap.html#using-setup

ERROR: module version mismatch

https://groups.google.com/forum/#!topic/openresty/nlEc3qlDyOc

 

