Troubleshooting a Memory Fault on an HP ProLiant DL380p Gen8 Server


Symptom: the server rebooted for no apparent reason.

Check the system log:

Aug  2 20:26:25 localhost kernel: EDAC MC1: 26 CE memory read error on CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x3bc505 offset:0x9c0 grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0092 socket:1 ha:0 channel_mask:4 rank:0)
Aug  2 20:26:25 localhost kernel: EDAC MC1: 29 CE memory read error on CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x3ba088 offset:0x5c0 grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0092 socket:1 ha:0 channel_mask:4 rank:0)
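The number right after the memory controller name in each EDAC line (26 and 29 above) is the count of corrected errors (CE) reported in that event, and the text after "on" is the DIMM label. As a minimal sketch, assuming the log lines keep the format shown above, the counts can be summed per DIMM with a small script. The names EDAC_LINE, summarize_edac and SAMPLE are made up for this illustration.

```python
import re
from collections import Counter

# Matches kernel EDAC lines of the form shown above, e.g.
#   kernel: EDAC MC1: 26 CE memory read error on <label> (channel:2 ...)
# Assumption: other kernel versions may word these lines differently.
EDAC_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on (?P<label>\S+) \("
)

def summarize_edac(lines):
    """Sum error counts per (controller, error kind, DIMM label)."""
    totals = Counter()
    for line in lines:
        m = EDAC_LINE.search(line)
        if m:
            key = (int(m.group("mc")), m.group("kind"), m.group("label"))
            totals[key] += int(m.group("count"))
    return totals

# The two log lines from this article, copied verbatim.
SAMPLE = [
    "Aug  2 20:26:25 localhost kernel: EDAC MC1: 26 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x3bc505 offset:0x9c0 "
    "grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0092 socket:1 ha:0 "
    "channel_mask:4 rank:0)",
    "Aug  2 20:26:25 localhost kernel: EDAC MC1: 29 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x3ba088 offset:0x5c0 "
    "grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0092 socket:1 ha:0 "
    "channel_mask:4 rank:0)",
]

if __name__ == "__main__":
    for (mc, kind, label), n in summarize_edac(SAMPLE).items():
        print(f"mc{mc} {kind} {label}: {n}")
```

Both lines point at the same DIMM on memory controller 1, so the script reports a single entry with a total of 55 corrected errors.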

 

Install edac-utils:

yum install -y libsysfs edac-utils

Result: 55 corrected errors, all on a single DIMM (mc1, channel 2, DIMM 0):

[root@localhost ~]# edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#2_DIMM#0: 0 Corrected Errors
mc0: csrow0: CPU_SrcID#0_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors
mc1: 0 Uncorrected Errors with no DIMM info
mc1: 0 Corrected Errors with no DIMM info
mc1: csrow0: 0 Uncorrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#0_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: 55 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors
[root@localhost ~]#
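On a machine with many DIMMs, the rows with a non-zero count are easy to miss. The following is a rough sketch, assuming the `edac-util -v` output keeps the row format shown above, that keeps only the labelled DIMM rows reporting at least one error. ROW, failing_dimms and SAMPLE are hypothetical names used only for this example.

```python
import re

# A labelled DIMM row looks like:
#   mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: 55 Corrected Errors
# Summary rows without a DIMM label are deliberately not matched.
ROW = re.compile(
    r"^(?P<mc>mc\d+): (?P<csrow>csrow\d+): (?P<label>\S+): "
    r"(?P<count>\d+) (?P<kind>Corrected|Uncorrected) Errors$"
)

def failing_dimms(report):
    """Return (mc, csrow, label, kind, count) for each DIMM row with count > 0."""
    hits = []
    for line in report.splitlines():
        m = ROW.match(line.strip())
        if m and int(m.group("count")) > 0:
            hits.append((
                m.group("mc"),
                m.group("csrow"),
                m.group("label"),
                m.group("kind"),
                int(m.group("count")),
            ))
    return hits

# The mc1 rows from the output above, copied verbatim.
SAMPLE = """\
mc1: 0 Uncorrected Errors with no DIMM info
mc1: 0 Corrected Errors with no DIMM info
mc1: csrow0: 0 Uncorrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#0_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#1_DIMM#0: 0 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#2_DIMM#0: 55 Corrected Errors
mc1: csrow0: CPU_SrcID#1_Ha#0_Chan#3_DIMM#0: 0 Corrected Errors
"""

if __name__ == "__main__":
    for mc, csrow, label, kind, count in failing_dimms(SAMPLE):
        print(f"{mc} {csrow} {label}: {count} {kind} Errors")
```

To check a live system, the text captured from `edac-util -v` can be passed to failing_dimms in place of SAMPLE.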

 

The server's front panel also reported an error.

 

 

Removed the memory module in slot 1 of the server and powered it back on; the problem was resolved.

 

 

Tested again with the software; the server now works normally.

 

 

