Output of the top command (press M to sort processes by memory usage; press m to switch the memory summary to a bar display):
top - 13:24:35 up 22 days, 23:01, 1 user, load average: 15.64, 18.52, 12.97
Tasks: 358 total, 5 running, 353 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.3 us, 21.7 sy, 0.0 ni, 61.5 id, 1.5 wa, 0.0 hi, 5.1 si, 0.0 st
KiB Mem : 98.5/16247608 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ]
KiB Swap: 0.0/0 [ ]
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 9605472 9.0g 348 S 0.0 57.8 798:26.87 systemd
82720 polkitd 20 0 4978420 276312 0 S 2.3 1.7 29:45.70 mysqld
57250 root 20 0 6290796 193804 11516 S 43.9 1.2 4573:23 kubelet
97595 root 20 0 256856 166472 44 R 30.8 1.0 235:49.03 ruby
57071 root 20 0 1796564 110960 564 S 15.1 0.7 2129:00 dockerd
82641 root 20 0 26.1g 85152 0 S 2.0 0.5 37:26.06 dotnet
82103 root 20 0 26.1g 81612 764 S 3.3 0.5 75:23.47 dotnet
81469 root 20 0 26.1g 78808 852 S 2.6 0.5 37:30.94 dotnet
85327 root 20 0 26.1g 78404 0 S 0.3 0.5 15:03.34 dotnet
15729 polkitd 20 0 1853084 78152 1876 S 2.6 0.5 8:14.68 mongod
89447 root 20 0 26.1g 77728 0 S 3.6 0.5 36:48.23 dotnet
systemctl status
The reported system state is degraded.
systemctl --failed
Output:
UNIT LOAD ACTIVE SUB DESCRIPTION
● chronyd.service loaded failed failed NTP client/server
● irqbalance.service loaded failed failed irqbalance daemon
● postfix.service loaded failed failed Postfix Mail Transport Agent
● tuned.service loaded failed failed Dynamic System Tuning Daemon
● vgauthd.service loaded failed failed VGAuth Service for open-vm-tools
● vmtoolsd.service loaded failed failed Service for virtual machines hosted on VMware
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
6 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
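To see why each unit failed, the failed-unit list can be fed into journalctl one unit at a time. This is only a rough sketch: failed_unit_names is a helper name I made up for illustration, and it assumes the --no-legend and --plain options of systemctl so that the first column is the bare unit name.

```shell
# Read `systemctl --failed --no-legend --plain` output on stdin and print
# only the unit names (first whitespace-separated column).
# NOTE: failed_unit_names is a hypothetical helper, not a systemd tool.
failed_unit_names() {
    awk 'NF > 0 { print $1 }'
}

# Only attempt the real query when both tools are available.
if command -v systemctl >/dev/null 2>&1 && command -v journalctl >/dev/null 2>&1; then
    systemctl --failed --no-legend --plain | failed_unit_names | while read -r unit; do
        echo "== $unit =="
        # Show the 20 most recent journal lines for this unit.
        journalctl -u "$unit" -n 20 --no-pager
    done
fi
```

Reading one unit at a time tends to be easier to follow than the full journal, although it may still not point at the root cause.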
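This looks very much like a memory leak in systemd, but I could not make sense of the journalctl logs, and poking around with ps 1 (1 being the PID of systemd) did not reveal the cause either.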
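One way to confirm the %MEM figure that top shows for PID 1 is to read the numbers straight from /proc. A minimal sketch, assuming a Linux /proc filesystem; proc_field_kib and percent_of are hypothetical helper names introduced here for illustration.

```shell
# Print the numeric value (in kB) of a "Key:  value kB" line from a
# /proc-style file. $1 = field name (e.g. VmRSS), $2 = file path.
proc_field_kib() {
    awk -v key="$1:" '$1 == key { print $2 }' "$2"
}

# Print part/total as a percentage with one decimal place.
percent_of() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", part * 100 / total }'
}

# Share of physical memory resident in PID 1 (systemd on CentOS 7).
if [ -r /proc/1/status ] && [ -r /proc/meminfo ]; then
    rss=$(proc_field_kib VmRSS /proc/1/status)
    total=$(proc_field_kib MemTotal /proc/meminfo)
    if [ -n "$rss" ] && [ -n "$total" ]; then
        percent_of "$rss" "$total"
    fi
fi
```

Running this periodically (for example from cron) would show whether the resident size of PID 1 keeps growing, which is what a leak would look like.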
Searching online, some reports attribute this to a bug in CentOS 7. The suggested solutions, as I summarize them:
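- Restart the virtual machine to reclaim the memory.
  - The memory-reclaim command suggested online, systemctl daemon-reexec, had no effect at all when I tried it.
  - The only option left was to cut the power: in my tests, reboot, poweroff, and init 0 all failed to shut the machine down, reporting: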
Failed to start poweroff.target: Connection timed out
See system logs and 'systemctl status poweroff.target' for details.
Broadcast message from root@xxx-xxx on pts/0 (Tue 2020-11-24 13:45:11 CST):
The system is going down for power-off NOW!
- Upgrade systemd to a newer version, which should include a fix for this bug:
yum update systemd
- Or upgrade the whole system:
yum update
After the upgrade, the problem should be gone.