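Previously, on 2.6-series Linux kernels, when a module should not be loaded in the reserved (capture) kernel, it could be blocked with the blacklist option in /etc/kdump.conf: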
blacklist <list of kernel modules>
I recently found that a certain SAS driver had a problem, so I tried to block it the same way. It failed:
[root@localhost ~]# service kdump restart
Redirecting to /bin/systemctl restart kdump.service
Job for kdump.service failed because the control process exited with error code. See "systemctl status kdump.service" and "journalctl -xe" for details.
[root@localhost ~]# systemctl status kdump.service
* kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2017-11-28 11:58:28 UTC; 10s ago
  Process: 60563 ExecStop=/usr/bin/kdumpctl stop (code=exited, status=0/SUCCESS)
  Process: 60572 ExecStart=/usr/bin/kdumpctl start (code=exited, status=1/FAILURE)
 Main PID: 60572 (code=exited, status=1/FAILURE)

Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Deprecated kdump config option: blacklist. Refer to kdump.conf manpage for alternatives.
Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Starting kdump: [FAILED]
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
Nov 28 11:58:28 localhost.localdomain systemd[1]: Failed to start Crash recovery kernel arming.
Nov 28 11:58:28 localhost.localdomain systemd[1]: Unit kdump.service entered failed state.
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service failed.
[root@localhost ~]# journalctl -xe
Nov 28 11:58:28 localhost.localdomain kdumpctl[60563]: kexec: unloaded kdump kernel
Nov 28 11:58:28 localhost.localdomain kdumpctl[60563]: Stopping kdump: [OK]
Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Deprecated kdump config option: blacklist. Refer to kdump.conf manpage for alternatives.
Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Starting kdump: [FAILED]
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
Nov 28 11:58:28 localhost.localdomain systemd[1]: Failed to start Crash recovery kernel arming.
-- Subject: Unit kdump.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kdump.service has failed.
--
-- The result is failed.
Nov 28 11:58:28 localhost.localdomain systemd[1]: Unit kdump.service entered failed state.
Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service failed.
Nov 28 11:58:28 localhost.localdomain polkitd[2087]: Unregistered Authentication Agent for unix-process:60547:533046 (system bus name :1.5128, object path /org/freedesktop/PolicyKit1/AuthenticationAgent
[root@localhost ~]#
So the blacklist option turns out to be deprecated. Following the hint in the log:
man kdump.conf shows the following:
blacklist option was recently being used to prevent loading modules in initramfs. General terminology for blacklist has been that module is present in initramfs but it is not actually loaded in kernel. Hence retaining blacklist option creates more confusing behavior. It has been deprecated.
Instead, use rd.driver.blacklist option on second kernel to blacklist a certain module. One can edit /etc/sysconfig/kdump.conf and edit KDUMP_COMMANDLINE_APPEND to pass kernel command line options. Refer to dracut.cmdline man page for more details on module blacklist option.
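Fine. Following the new requirement, I went to edit /etc/sysconfig/kdump.conf, only to find that this file does not exist. The exact path is not the important part, of course (settings can also go in /etc/kdump.conf).
Going by the intent of the man page, the file to edit is actually /etc/sysconfig/kdump, and the line to change there is KDUMP_COMMANDLINE_APPEND. For the exact format, see: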
man dracut.cmdline
rd.driver.blacklist=<drivername>[,<drivername>,...]
    do not load kernel module <drivername>. This parameter can be specified multiple times.
rd.driver.pre=<drivername>[,<drivername>,...]
    force loading kernel module <drivername>. This parameter can be specified multiple times.
rd.driver.post=<drivername>[,<drivername>,...]
    force loading kernel module <drivername> after all automatic loading modules have been loaded. This parameter can be specified multiple times.
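As a rough sketch of the edit described above, the following appends an rd.driver.blacklist entry to the KDUMP_COMMANDLINE_APPEND line. The function name add_kdump_blacklist, the module name mpt3sas and the shortened sample value are assumptions for illustration only; the demo works on a scratch file, not on the real /etc/sysconfig/kdump.

```shell
#!/bin/sh
# Hypothetical helper: append rd.driver.blacklist=<modules> inside the
# quoted value of KDUMP_COMMANDLINE_APPEND in a sysconfig-style file.
add_kdump_blacklist() {
    conf_file=$1   # on a real system this would be /etc/sysconfig/kdump
    modules=$2     # comma-separated module list, e.g. "mpt3sas"
    # Insert the option just before the closing quote of the existing value.
    sed -i "s/^\(KDUMP_COMMANDLINE_APPEND=\"[^\"]*\)\"/\1 rd.driver.blacklist=${modules}\"/" "$conf_file"
}

# Demo on a scratch file with a shortened, made-up sample value.
demo_conf=$(mktemp)
printf '%s\n' 'KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices"' > "$demo_conf"
add_kdump_blacklist "$demo_conf" "mpt3sas"
cat "$demo_conf"
rm -f "$demo_conf"
```

On a real machine the same function would be pointed at /etc/sysconfig/kdump (after taking a backup), followed by a restart of the kdump service.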
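Also note that after changing the configuration you must restart the kdump service, and because the blacklist was changed, this restart is fairly slow. Whenever the configuration changes, for example when the dump path is modified or the blacklist grows, the kdump initramfs image has to be regenerated; otherwise the old image would still be used when a crash happens. These images are the files under /boot whose names end in kdump.img: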
[root@localhost ~]# ls -l /boot/*kdump*
-rw------- 1 root root 16878919 Nov 29 01:02 /boot/initramfs-3.10.0-693.5.2.el7.x86_64kdump.img
-rw------- 1 root root 35261890 Nov 27 07:04 /boot/initramfs-3.10.0caq1.0kdump.img
-rw------- 1 root root 36508192 Nov 24 06:21 /boot/initramfs-3.10.0kdump.img
[root@localhost ~]#
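To illustrate the idea that an image older than the configuration is out of date, here is a minimal sketch that compares modification times. The function name img_is_stale and the scratch files are made up for the example; it only demonstrates the timestamp comparison and is not how the kdump service itself decides when to rebuild.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the config file is newer than the image,
# i.e. the image was built before the last configuration change.
img_is_stale() {
    img=$1
    conf=$2
    # find prints the config path only when it is newer than the image.
    [ -n "$(find "$conf" -newer "$img" 2>/dev/null)" ]
}

# Demo with scratch files carrying explicit timestamps.
demo_dir=$(mktemp -d)
touch -t 201711240621 "$demo_dir/initramfs-demo-kdump.img"
touch -t 201711290102 "$demo_dir/kdump.conf"
if img_is_stale "$demo_dir/initramfs-demo-kdump.img" "$demo_dir/kdump.conf"; then
    echo "image is older than the config and should be rebuilt"
else
    echo "image is up to date"
fi
rm -rf "$demo_dir"
```

On a real system the arguments would be one of the /boot/*kdump.img files and the configuration file that was edited.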
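Finally, note that if the reserved kernel itself panics while loading drivers or while running, there is no further kernel to take over. The messages only appear on the screen, or you can read them over a serial console. I once hit a case where the reserved memory was too small and the reserved kernel itself ran out of memory (OOM), so no crash dump could be collected. To check the current reserved memory size, use: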
[root@localhost ~]# cat /sys/kernel/kexec_crash_size
536870912
To check whether the reserved kernel is loaded, use:
[root@localhost ~]# cat /sys/kernel/kexec_crash_loaded
1
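The two checks above can be combined into one small report, sketched below. The function name kdump_status and its optional directory argument are assumptions added so the demo can run against a scratch directory; with no argument it reads the real /sys/kernel files shown above.

```shell
#!/bin/sh
# Hypothetical helper: report reserved crash memory in MiB and whether
# a crash kernel is loaded, reading the two sysfs files shown above.
kdump_status() {
    base=${1:-/sys/kernel}   # overridable only for the demo
    size=$(cat "$base/kexec_crash_size")
    loaded=$(cat "$base/kexec_crash_loaded")
    echo "reserved: $((size / 1024 / 1024)) MiB, loaded: $loaded"
}

# Demo against a scratch directory holding the values from this post.
demo_sys=$(mktemp -d)
echo 536870912 > "$demo_sys/kexec_crash_size"
echo 1 > "$demo_sys/kexec_crash_loaded"
kdump_status "$demo_sys"   # prints: reserved: 512 MiB, loaded: 1
rm -rf "$demo_sys"
```

Calling kdump_status with no argument on a kdump-enabled machine reads the live values.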