Dell x86 server reports the error: EDAC MC1: CE row 0, channel 0, label "CPU_SrcID#1_Channel#1_DIMM#0"


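1. Checking the messages log and the dmesg output shows many EDAC memory controller (MC) errors. CE stands for corrected error, meaning ECC detected and corrected a memory error. Examples: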

EDAC MC1: CE row 1, channel 0, label "CPU_SrcID#1_Channel#1_DIMM#1": 6317 Unknown error(s): memory read on FATAL area OVERFLOW: cpu=1 Err=0001:0092 (ch=2), addr = 0x7fc3afe40 => socket=1, Channel=1(mask=2), rank=4

EDAC MC1: CE row 2, channel 0, label "CPU_SrcID#1_Channel#2_DIMM#0": 5793 Unknown error(s): memory read on FATAL area OVERFLOW: cpu=1 Err=0001:0092 (ch=2), addr = 0x546da78c0 => socket=1, Channel=2(mask=4), rank=0

EDAC MC1: CE row 3, channel 0, label "CPU_SrcID#1_Channel#2_DIMM#1": 5017 Unknown error(s): memory read on FATAL area OVERFLOW: cpu=1 Err=0001:0092 (ch=2), addr = 0x696e5cbc0 => socket=1, Channel=2(mask=4), rank=5

EDAC MC1: CE row 1, channel 0, label "CPU_SrcID#1_Channel#1_DIMM#1": 3525 Unknown error(s): memory read on FATAL area OVERFLOW: cpu=1 Err=0001:0092 (ch=2), addr = 0x74e70c240 => socket=1, Channel=1(mask=2), rank=4
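
To see which DIMM labels account for most of the reported errors, the log lines can be summed per label. This is a minimal sketch, assuming the message format shown above and assuming that the number printed before "Unknown error(s)" is the error count carried by that message. The function name edac_label_sum is made up for this example; it reads log text from standard input.

```shell
#!/bin/sh
# Sum the per-message error counts of EDAC corrected-error (CE) lines,
# grouped by DIMM label.
# edac_label_sum is a hypothetical helper name; it reads log text on stdin.
edac_label_sum() {
    # From each EDAC CE line, keep the quoted label and the number after it,
    # then add the numbers up per label and sort with the largest total first.
    sed -n 's/.*EDAC MC[0-9]*: CE .*label "\([^"]*\)": \([0-9][0-9]*\) .*/\1 \2/p' \
        | awk '{ sum[$1] += $2 } END { for (l in sum) print sum[l], l }' \
        | sort -rn
}

# Example usage (as root): dmesg | edac_label_sum
```

Fed the four sample lines above, the two messages for CPU_SrcID#1_Channel#1_DIMM#1 add up to 9842 (6317 + 3525), which puts that label first in the output.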

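2. Identify the faulty DIMMs. The output below lists the corrected-error counters for the 8 DIMM slots on cpu0 and cpu1. A non-zero count means errors were recorded; here all four entries under mc1 are affected, while mc0 is clean.

In the path, mc is the memory controller (on this machine, one per CPU socket), csrow is the chip-select row (which here corresponds to one DIMM slot, matching the "row" and "label" fields in the log lines), and ch is the channel within that row.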

[root@localhost ~]#  grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow1/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow3/ch0_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow0/ch0_ce_count:21248125
/sys/devices/system/edac/mc/mc1/csrow1/ch0_ce_count:11360507
/sys/devices/system/edac/mc/mc1/csrow2/ch0_ce_count:18691380
/sys/devices/system/edac/mc/mc1/csrow3/ch0_ce_count:9044537
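
The grep above prints every counter, including the zero ones. The following sketch prints only the non-zero counters and adds the DIMM label next to each. It assumes the csrow-based sysfs layout shown above, where each chN_ce_count file has a chN_dimm_label file beside it. The function name edac_nonzero_ce and its optional base-directory argument are made up for this example.

```shell
#!/bin/sh
# List every EDAC counter with a non-zero corrected-error count, together
# with the DIMM label stored next to it in sysfs.
# edac_nonzero_ce is a hypothetical helper name; the optional argument
# overrides the sysfs base directory (handy for trying it on a copy).
edac_nonzero_ce() {
    base="${1:-/sys/devices/system/edac/mc}"
    for f in "$base"/mc*/csrow*/ch*_ce_count; do
        [ -r "$f" ] || continue                     # glob matched nothing
        count=$(cat "$f")
        [ "$count" -gt 0 ] 2>/dev/null || continue  # skip zero or non-numeric
        label_file="${f%_ce_count}_dimm_label"
        label="unknown"
        [ -r "$label_file" ] && label=$(cat "$label_file")
        # Path relative to the base directory, count, label (tab-separated).
        printf '%s\t%s\t%s\n' "${f#"$base"/}" "$count" "$label"
    done
}

# Example usage: edac_nonzero_ce
```

The label column makes it possible to match a counter to a physical slot without decoding the mc/csrow/ch numbers by hand.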

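Run dmidecode -t memory to view detailed information about the installed memory modules, such as slot locator, size, and part number.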
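
The dmidecode output is long, so it can help to condense it to one line per memory device. This is a minimal sketch, assuming the usual "Memory Device" record layout with indented Locator, Size, and Part Number fields. The function name dimm_summary is made up for this example; it reads dmidecode text from standard input.

```shell
#!/bin/sh
# Condense `dmidecode -t memory` output to one line per memory device.
# dimm_summary is a hypothetical helper name; it reads dmidecode text on stdin.
dimm_summary() {
    awk -F': ' '
        # A new record starts at each "Memory Device" heading.
        /^Memory Device/ { in_dev = 1; loc = size = part = ""; next }
        # The anchored patterns skip "Bank Locator", which starts differently.
        in_dev && /^[ \t]*Locator:/     { loc  = $2 }
        in_dev && /^[ \t]*Size:/        { size = $2 }
        in_dev && /^[ \t]*Part Number:/ { part = $2 }
        # A blank line ends the current record.
        in_dev && /^$/ { print loc " | " size " | " part; in_dev = 0 }
        END { if (in_dev) print loc " | " size " | " part }
    '
}

# Example usage (as root): dmidecode -t memory | dimm_summary
```

The locator printed here can then be compared with the DIMM label from the EDAC messages to find the physical slot to inspect.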

