A small pitfall when configuring kdump on Linux 3.10


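Previously, on 2.6-series Linux kernels, when a module should not be loaded in the capture kernel (the reserved kernel that kdump boots after a crash), it could be masked with the blacklist option in /etc/kdump.conf: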

blacklist <list of kernel modules>

Recently I found that a certain SAS driver was causing problems, so I planned to mask it the same way. The result was an error:

[root@localhost ~]# service kdump restart
Redirecting to /bin/systemctl restart kdump.service
Job for kdump.service failed because the control process exited with error code. See "systemctl status kdump.service" and "journalctl -xe" for details.
[root@localhost ~]# systemctl status kdump.service
* kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2017-11-28 11:58:28 UTC; 10s ago
  Process: 60563 ExecStop=/usr/bin/kdumpctl stop (code=exited, status=0/SUCCESS)
  Process: 60572 ExecStart=/usr/bin/kdumpctl start (code=exited, status=1/FAILURE)
 Main PID: 60572 (code=exited, status=1/FAILURE)

Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Deprecated kdump config option: blacklist. Refer to kdump.conf manpage for alternatives.
Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Starting kdump: [FAILED]
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
Nov 28 11:58:28 localhost.localdomain systemd[1]: Failed to start Crash recovery kernel arming.
Nov 28 11:58:28 localhost.localdomain systemd[1]: Unit kdump.service entered failed state.
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service failed.
[root@localhost ~]# journalctl -xe
Nov 28 11:58:28 localhost.localdomain kdumpctl[60563]: kexec: unloaded kdump kernel
Nov 28 11:58:28 localhost.localdomain kdumpctl[60563]: Stopping kdump: [OK]
Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Deprecated kdump config option: blacklist. Refer to kdump.conf manpage for alternatives.
Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Starting kdump: [FAILED]
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
Nov 28 11:58:28 localhost.localdomain systemd[1]: Failed to start Crash recovery kernel arming.
-- Subject: Unit kdump.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kdump.service has failed.
-- 
-- The result is failed.
Nov 28 11:58:28 localhost.localdomain systemd[1]: Unit kdump.service entered failed state.
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service failed.
Nov 28 11:58:28 localhost.localdomain polkitd[2087]: Unregistered Authentication Agent for unix-process:60547:533046 (system bus name :1.5128, object path /org/freedesktop/PolicyKit1/AuthenticationAgent
[root@localhost ~]# 

So blacklist turns out to be a deprecated option. Following the hint in the log:

man kdump.conf shows the following:

blacklist option was recently being used to prevent loading modules in initramfs. General terminology for blacklist has been that module is present in initramfs but it is not actually loaded in kernel. Hence retaining blacklist option creates more confusing behavior. It has been deprecated.
Instead, use rd.driver.blacklist option on second kernel to blacklist a certain module. One can edit /etc/sysconfig/kdump.conf and edit KDUMP_COMMANDLINE_APPEND to pass kernel command line options. Refer to dracut.cmdline man page for more details on module blacklist option.

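Fine. Following the new guidance, I went to edit /etc/sysconfig/kdump.conf, only to find that this file does not exist. The path given in the manpage appears to be a slip: on this system the file that actually carries KDUMP_COMMANDLINE_APPEND is /etc/sysconfig/kdump (not /etc/kdump.conf, which holds the kdump options themselves).

So the file I edited is /etc/sysconfig/kdump, and the line I changed is KDUMP_COMMANDLINE_APPEND. For the exact format of the option, see: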

man  dracut.cmdline

       rd.driver.blacklist=<drivername>[,<drivername>,...]
           do not load kernel module <drivername>. This parameter can be specified multiple times.

       rd.driver.pre=<drivername>[,<drivername>,...]
           force loading kernel module <drivername>. This parameter can be specified multiple times.

       rd.driver.post=<drivername>[,<drivername>,...]
           force loading kernel module <drivername> after all automatic loading modules have been loaded. This parameter can be specified multiple times.
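
The post does not show the edited line itself, so here is a minimal sketch of what the change could look like. The helper name add_kdump_blacklist and the module name mpt3sas are hypothetical, chosen only for illustration; the sketch also assumes the KDUMP_COMMANDLINE_APPEND value is a single double-quoted string on one line, as in the stock file, and that GNU sed is available.

```shell
#!/bin/sh
# add_kdump_blacklist FILE MODULE   (hypothetical helper, not part of kexec-tools)
# Appends rd.driver.blacklist=MODULE to the double-quoted value of
# KDUMP_COMMANDLINE_APPEND in FILE. Assumes the value sits on one line.
add_kdump_blacklist() {
    conf=$1
    module=$2
    # Skip the edit if this exact option is already present in the file.
    if grep -qwF "rd.driver.blacklist=${module}" "$conf"; then
        return 0
    fi
    # Insert the new option just before the closing quote (GNU sed -i).
    sed -i "s/^\(KDUMP_COMMANDLINE_APPEND=\"[^\"]*\)\"/\1 rd.driver.blacklist=${module}\"/" "$conf"
}
```

It would be run as root, for example as add_kdump_blacklist /etc/sysconfig/kdump mpt3sas, followed by a restart of the kdump service. Trying it on a copy of the file first is the safer route.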

 

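Also note that after changing the configuration you have to restart the kdump service, and because the blacklist changed, this restart is noticeably slower. Whenever the configuration changes (for example, the dump path is modified or modules are added to the blacklist), the kdump initramfs image has to be regenerated; otherwise the old image would still be used when a crash occurs. These images are the files under /boot whose names end in kdump.img: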

[root@localhost ~]# ls -l /boot/*kdump*
-rw------- 1 root root 16878919 Nov 29 01:02 /boot/initramfs-3.10.0-693.5.2.el7.x86_64kdump.img
-rw------- 1 root root 35261890 Nov 27 07:04 /boot/initramfs-3.10.0caq1.0kdump.img
-rw------- 1 root root 36508192 Nov 24 06:21 /boot/initramfs-3.10.0kdump.img
[root@localhost ~]# 
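
To tell in advance whether a restart is likely to trigger the slow rebuild, one can compare modification times of the image and the configuration files. The function below is only a sketch of that idea; the name kdump_image_stale is hypothetical, and the real service may apply additional checks beyond timestamps.

```shell
#!/bin/sh
# kdump_image_stale IMAGE CONFIG...   (hypothetical helper)
# Returns 0 (true) if IMAGE is missing or any CONFIG file is newer than it,
# and 1 (false) otherwise.
kdump_image_stale() {
    image=$1
    shift
    # A missing image always has to be built.
    [ -e "$image" ] || return 0
    for conf in "$@"; do
        # -nt compares modification times: true if $conf is newer.
        if [ "$conf" -nt "$image" ]; then
            return 0
        fi
    done
    return 1
}
```

For example: kdump_image_stale "/boot/initramfs-$(uname -r)kdump.img" /etc/kdump.conf /etc/sysconfig/kdump && echo "image will be rebuilt on restart".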

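Finally, note that if the capture kernel itself panics while loading drivers or while running, there is no further kernel to take over from it; the only evidence is what gets printed to the screen or to a serial console. I once hit a case where the reserved memory was too small and the capture kernel itself ran out of memory (OOM), so no crash dump could be collected. To check the currently reserved memory (in bytes), use: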

[root@localhost ~]# cat /sys/kernel/kexec_crash_size
536870912

To check whether the capture kernel is loaded, use:
[root@localhost ~]# cat /sys/kernel/kexec_crash_loaded
1
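
The two sysfs files above can be combined into a quick health check. This is a small sketch; the function name kdump_status and its optional directory argument (which makes it possible to try the function against test files) are assumptions for illustration only.

```shell
#!/bin/sh
# kdump_status [SYSFS_DIR]   (hypothetical helper)
# Prints the reserved crash memory in MiB and whether a capture kernel
# is loaded. SYSFS_DIR defaults to /sys/kernel.
kdump_status() {
    dir=${1:-/sys/kernel}
    size_bytes=$(cat "$dir/kexec_crash_size")
    loaded=$(cat "$dir/kexec_crash_loaded")
    # kexec_crash_size is reported in bytes; 1 MiB = 1048576 bytes.
    echo "reserved: $((size_bytes / 1048576)) MiB"
    if [ "$loaded" = "1" ]; then
        echo "capture kernel: loaded"
    else
        echo "capture kernel: not loaded"
    fi
}
```

With the values shown above (536870912 and 1), it reports 512 MiB reserved and a loaded capture kernel.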

