Linux系统异常重启后的messages日志分析


https://unix.stackexchange.com/questions/9819/how-to-find-out-from-the-logs-what-caused-system-shutdown

 

TLDR

Use these 2 commands and keep reading for more information.

last -x | head | tac

grep -iv ': starting\|kernel: .*: Power Button\|watching system buttons\|Stopped Cleaning Up\|Started Crash recovery kernel' \
  /var/log/messages /var/log/syslog /var/log/apcupsd* \
  | grep -iw 'recover[a-z]*\|power[a-z]*\|shut[a-z ]*down\|rsyslogd\|ups'

1) Regarding the output of last -x command

Run this command* and compare the output to the examples below:

last -x | head | tac

Normal shutdown examples

A normal shutdown and power-up looks like this (note that you have a shutdown event and then a system boot event):

runlevel (to lvl 0)   2.6.32- Sat Mar 17 08:48 - 08:51  (00:02) 
shutdown system down  ... <-- first the system shuts down   
reboot   system boot  ... <-- afterwards the system boots
runlevel (to lvl 3)       

In some cases you may see this (note that there is no line about the shutdown but the system was at runlevel 0 which is the "halt state"):

runlevel (to lvl 0)   ... <-- first the system shuts down (init level 0)
reboot   system boot  ... <-- afterwards the system boots
runlevel (to lvl 2)   2.6.24-... Fri Aug 10 15:58 - 15:32 (2+23:34)   

Unexpected shutdown examples

An unexpected shutdown from power loss looks like this (note that you have a system boot event without a prior system shutdown event):

runlevel (to lvl 3)   ... <-- the system was running since this momemnt
reboot   system boot  ... <-- then we've a boot WITHOUT a prior shutdown
runlevel (to lvl 3)   3.10.0-693.21.1. Sun Jun 17 15:40 - 09:51  (18:11)    

2) Regarding the logs in /var/log/

A bash command to filter the most interesting log messages is this:

grep -iv ': starting\|kernel: .*: Power Button\|watching system buttons\|Stopped Cleaning Up\|Started Crash recovery kernel' \
  /var/log/messages /var/log/syslog /var/log/apcupsd* \
  | grep -iw 'recover[a-z]*\|power[a-z]*\|shut[a-z ]*down\|rsyslogd\|ups'

When an unexpected power off or hardware failure occurs the filesystems will not be properly unmounted so in the next boot you may get logs like this:

EXT4-fs ... INFO: recovery required ... 
Starting XFS recovery filesystem ...
systemd-fsck: ... recovering journal
systemd-journald: File /var/log/journal/.../system.journal corrupted or uncleanly shut down, renaming and replacing.

When the system powers off because user pressed the power button you get logs like this:

systemd-logind: Power key pressed.
systemd-logind: Powering Off...
systemd-logind: System is powering down.

Only when the system shuts down orderly you get logs like this:

rsyslogd: ... exiting on signal 15

When the system shuts down due to overheating you get logs like this:

critical temperature reached...,shutting down

If you have a UPS and running a daemon to monitor power and shutdown you should obviously check its logs (NUT logs on /var/log/messages but apcupsd logs on /var/log/apcupsd*)


Notes

*: Here's the description of last from its man page:

last [...] prints information about connect times of users. 
Records are printed from most recent to least recent.  
[...]
The special users reboot and shutdown log in when the system reboots
or (surprise) shuts down. 

We use head to keep the latest 10 events and we use tac to invert the ordering so that we don't get confused by the fact that last prints from most recent to least recent event.


免责声明!

本站转载的文章为个人学习借鉴使用,本站对版权不负任何法律责任。如果侵犯了您的隐私权益,请联系本站邮箱yoyou2525@163.com删除。



 
粤ICP备18138465号  © 2018-2025 CODEPRJ.COM