MegaCLI: Checking Disk Status and Replacing a Disk (Hands-On)


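The remote management console showed that several disks had failed.
Replacing a disk in a RAID 10 setup is straightforward: hot-swap is supported, so the failed drive can simply be pulled and replaced. The steps are below.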
 
Use the disk serial number (SN) to find which disk has failed (the SN is shown in the remote management console)
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll -NoLog | grep -B 25  3SL1KEF2
Take the failed disk offline
/opt/MegaRAID/MegaCli/MegaCli64 -PDOffline -PhysDrv[32:7] -a0
 
The 32, the 7, and the -a0 in the command above correspond to these fields of the -PDList output:
Adapter #0
Enclosure Device ID: 32
Slot Number: 7
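
The lookup from serial number to the `[E:S]` pair can also be scripted rather than read off the `grep -B 25` output. The sketch below is one possible way to do it, assuming each drive block in the `-PDList` output prints `Enclosure Device ID`, then `Slot Number`, then an `Inquiry Data` line that contains the serial number. `find_slot_by_sn` is a hypothetical helper name, not part of MegaCLI.

```shell
# Hypothetical helper: read `MegaCli64 -PDList -aAll -NoLog` output on stdin
# and print [Enclosure:Slot] for the drive whose Inquiry Data contains the SN.
# Assumption: within each drive block, Enclosure Device ID and Slot Number
# appear before the Inquiry Data line.
find_slot_by_sn() {
  awk -v sn="$1" '
    /^Enclosure Device ID:/ { encl = $NF }
    /^Slot Number:/         { slot = $NF }
    /^Inquiry Data:/ && index($0, sn) > 0 { printf "[%s:%s]\n", encl, slot }
  '
}
```

Usage: `/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll -NoLog | find_slot_by_sn 3SL1KEF2`. The printed pair can be pasted after `-PhysDrv` in the commands that follow.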
 
Light up the specified disk (locate it by making its LED blink)
/opt/MegaRAID/MegaCli/MegaCli64 -PdLocate -start -physdrv[32:7] -a0
 
Note: after the disk has been replaced, turn the locate LED off again
/opt/MegaRAID/MegaCli/MegaCli64 -PdLocate -stop -physdrv[32:7] -a0
 
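Replace the failed disk

At this point the failed disk is OFFLINE. On site, the failed disk blinks yellow while healthy disks show green. Pull the failed disk and insert the replacement: its LED blinks green and the drive spins up, which indicates that it is rebuilding. Check the state as follows: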
$ MegaCli -PDList -aAll -NoLog
...
Enclosure Device ID: 32
Slot Number: 7
...
Firmware state: Rebuild
 
 
Check rebuild progress
# /opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg -PhysDrv[32:7] -aAll
Rebuild Progress on Device at Enclosure 32, Slot 7 Completed 16% in 94 Minutes.
Or display it as a live text progress bar
# /opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ProgDsply -PhysDrv[32:7] -a0
      Rebuild progress of physical drives...
  Enclosure:Slot               Percent Complete                       Time Elps
       032 :07     #######****************15 %*********************** 00:24:37
    Press <ESC> key to quit...
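
For unattended monitoring, the one-line `-ShowProg` output is easier to parse than the interactive progress bar. The following is a minimal sketch: `rebuild_percent` and `wait_for_rebuild` are hypothetical helper names, the 60-second interval is arbitrary, and the loop assumes that once the rebuild has finished the command no longer prints a "Completed N% in M Minutes" line.

```shell
MEGACLI=${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}

# Extract the integer percentage from a "-PDRbld -ShowProg" line on stdin.
# Prints nothing if no progress line is present.
rebuild_percent() {
  sed -n 's/.*Completed \([0-9][0-9]*\)% in .*/\1/p'
}

# Hypothetical polling loop: report progress every 60 s and stop when no
# progress line is returned (assumed to mean the rebuild is no longer running).
wait_for_rebuild() {
  drive=$1   # enclosure:slot, e.g. 32:7
  while pct=$("$MEGACLI" -PDRbld -ShowProg "-PhysDrv[$drive]" -a0 -NoLog | rebuild_percent) \
        && [ -n "$pct" ]; do
    echo "rebuild on $drive: ${pct}%"
    sleep 60
  done
}
```

Usage: `wait_for_rebuild 32:7`. Confirm the final state with `-PDList` afterwards, since an empty result only means that no progress line was found.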
 
Disk replacement complete
# /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll -NoLog | grep 'Firmware state'
Firmware state: Copyback
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Hotspare, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Offline
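
A bare list of `Firmware state` lines does not say which slot each state belongs to. As a sketch, the filter below pairs each state with its enclosure and slot and prints only the drives that are not `Online`. It assumes `Enclosure Device ID` and `Slot Number` precede `Firmware state` within each drive block of the `-PDList` output; `list_not_online` is a hypothetical name.

```shell
# Hypothetical helper: read `MegaCli64 -PDList -aAll -NoLog` output on stdin
# and print "enclosure:slot state" for every drive whose state is not Online.
list_not_online() {
  awk '
    /^Enclosure Device ID:/ { encl = $NF }
    /^Slot Number:/         { slot = $NF }
    /^Firmware state:/ {
      state = $0
      sub(/^Firmware state:[ \t]*/, "", state)   # keep only the state text
      if (state !~ /^Online/) printf "%s:%s %s\n", encl, slot, state
    }
  '
}
```

Usage: `/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll -NoLog | list_not_online`. Hot spares are listed too, because their state is `Hotspare` rather than `Online`.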
 
Set up a hot spare
To guard against too many disks failing at once, give the RAID a hot spare disk
 
# /opt/MegaRAID/MegaCli/MegaCli64 -PDHSP  -Set -Dedicated  -Array1 -physdrv[32:9] -a0  # add a dedicated hot spare; Array1 refers to RAID group 1 (Target Id: 1)
After adding it, check where the hot spare is
# /opt/MegaRAID/MegaCli/MegaCli64  -LDInfo -Lall -aALL
 
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :Virtual Disk 0
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 223.0 GB
Sector Size         : 512
Mirror Data         : 223.0 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: No
LD has drives that support T10 power conditions: No
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No
 
Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 1.635 TB
Sector Size         : 512
Mirror Data         : 1.635 TB
State               : Degraded
Strip Size          : 64 KB
Number Of Drives per span:2
Span Depth          : 3
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No
 
Number of Dedicated Hot Spares: 1
    0 : EnclId - 32 SlotId - 9
 
Exit Code: 0x00
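
The `-LDInfo` output is long, and the line that usually matters is `State`. A small filter such as the one below, offered as a sketch with the hypothetical name `degraded_vds`, prints only the virtual drives whose state is not `Optimal`. It relies on the `Virtual Drive: N (Target Id: N)` and `State : ...` line formats shown above.

```shell
# Hypothetical helper: read `MegaCli64 -LDInfo -Lall -aALL` output on stdin
# and print every virtual drive whose State is anything other than Optimal.
degraded_vds() {
  awk '
    /^Virtual Drive:/ { vd = $3 }                 # third field is the VD number
    /^State[ \t]*:/ {
      state = $0
      sub(/^State[ \t]*:[ \t]*/, "", state)       # keep only the state text
      if (state != "Optimal") printf "VD %s: %s\n", vd, state
    }
  '
}
```

Usage: `/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL -NoLog | degraded_vds`. No output means every virtual drive reported `Optimal`.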
 
# View logical drive details
sudo /opt/MegaRAID/MegaCli/MegaCli64 -LdPdInfo -aAll -NoLog
 
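When the RAID has a hot spare, the replacement disk shows Firmware state: Copyback after the swap.
Copyback progress can be read directly from the controller log:
# watch -n 30 'MegaCli -FwTermLog -Dsply -aALL | tail -n 20'
Every 30.0s: MegaCli -FwTermLog -Dsply -aALL | tail -n 20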
07/29/19 13:16:36: Load Balance Statistics Path0PDs d Path1PDs 0
07/29/19 13:16:36: EVT#25896-07/29/19 13:16:36:  91=Inserted: PD 00(e0x20/s0)
07/29/19 13:16:36: EVT#25897-07/29/19 13:16:36: 247=Inserted: PD 00(e0x20/s0) Info: enclPd=20, scsiType=0, portMap=00, sasAddr=5000c500720794fd,0000000000000000
07/29/19 13:16:37: request temp sensor i2c failed
07/29/19 13:16:37: PD_InsertionPostProcess: Setting foreign DDF type on pd=0
07/29/19 13:16:37: EVT#25898-07/29/19 13:16:37: 114=State change on PD 00(e0x20/s0) from UNCONFIGURED_BAD(1) to UNCONFIGURED_GOOD(0)
07/29/19 13:16:37: pdHspHistCheckInsertedPdCallback: Start copy back from sparePd=03 to pd=0, changing entryType to ok
07/29/19 13:16:37: ArDiskTypeMisMatch : NO_MIXING_VIOLATION  array=1  destPD=0
07/29/19 13:16:37: EVT#25899-07/29/19 13:16:37: 281=CopyBack automatically started on PD 00(e0x20/s0) from PD 03(e0x20/s3)
07/29/19 13:16:37: EVT#25900-07/29/19 13:16:37: 114=State change on PD 00(e0x20/s0) from UNCONFIGURED_GOOD(0) to COPYBACK(20)
07/29/19 13:18:18: EVT#25901-07/29/19 13:18:18: 279=CopyBack progress on PD 00(e0x20/s0) is 0.99%(99s)
07/29/19 13:19:57: EVT#25902-07/29/19 13:19:57: 279=CopyBack progress on PD 00(e0x20/s0) is 1.99%(197s)
07/29/19 13:21:37: EVT#25903-07/29/19 13:21:37: 279=CopyBack progress on PD 00(e0x20/s0) is 2.99%(297s)
07/29/19 13:23:17: EVT#25904-07/29/19 13:23:17: 279=CopyBack progress on PD 00(e0x20/s0) is 3.99%(397s)
07/29/19 13:24:57: EVT#25905-07/29/19 13:24:57: 279=CopyBack progress on PD 00(e0x20/s0) is 4.99%(497s)
07/29/19 13:26:39: EVT#25906-07/29/19 13:26:39: 279=CopyBack progress on PD 00(e0x20/s0) is 5.99%(598s)
Exit Code: 0x00
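
To pull just the latest copyback figure out of the firmware log, a filter like the following can be used. It is a sketch that assumes the `CopyBack progress on PD ... is N%` event format shown in the log above; `copyback_progress` is a hypothetical name.

```shell
# Hypothetical helper: read `MegaCli -FwTermLog -Dsply -aALL` output on stdin
# and print the drive and percentage from the most recent CopyBack progress event.
copyback_progress() {
  sed -n 's/.*CopyBack progress on PD \([^ ]*\) is \([0-9.]*\)%.*/\1 \2/p' | tail -n 1
}
```

Usage: `MegaCli -FwTermLog -Dsply -aALL | copyback_progress`. Nothing is printed if the log contains no copyback progress events.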
 
Basic megacli usage
 
# Check the RAID level
$ megacli -LDInfo -Lall -aALL

# View logical drive details
$ /opt/MegaRAID/MegaCli/MegaCli64 -LdPdInfo -aAll -NoLog

# View RAID controller information
$ megacli -AdpAllInfo -aALL

# View physical disk information
$ /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL

# View battery (BBU) information
$ megacli -AdpBbuCmd -aAll

# View the RAID controller log
$ /opt/MegaRAID/MegaCli/MegaCli64 -FwTermLog -Dsply -aALL

# Show the number of adapters
$ megacli -adpCount

# Show the adapter time
$ megacli -AdpGetTime -aALL
 
# Show all adapter information
$ megacli -AdpAllInfo -aAll

# Show information for all logical drive groups
$ megacli -LDInfo -LALL -aAll

# Show information for all physical drives
$ megacli -PDList -aAll

# Check the charging status
$ megacli -AdpBbuCmd -GetBbuStatus -aALL |grep 'Charger Status'

# Show BBU status
$ megacli -AdpBbuCmd -GetBbuStatus -aALL

# Show BBU capacity information
$ megacli -AdpBbuCmd -GetBbuCapacityInfo -aALL

# Show BBU design parameters
$ megacli -AdpBbuCmd -GetBbuDesignInfo -aALL

# Show current BBU properties
$ megacli -AdpBbuCmd -GetBbuProperties -aALL

# Show the RAID controller model, RAID settings, and disk information
$ megacli -cfgdsply -aALL
 
## How the disk states change between pulling the failed disk and inserting the new one
Device         | Normal  | Damage              | Rebuild  | Normal
Virtual Drive  | Optimal | Degraded            | Degraded | Optimal
Physical Drive | Online  | Failed/Unconfigured | Rebuild  | Online
 
# View the state of a physical disk ([E:S] is [Enclosure Device ID:Slot Number]):
$ megacli -PDInfo -PhysDrv[E:S] -a0

## A physical disk that is rebuilding shows "Firmware state: Rebuild"

# Query rebuild progress:
$ megacli -pdrbld -showprog -physdrv[E:S] -aALL

## The output looks like this:
Rebuild Progress on Device at Enclosure 32, Slot 5 Completed 77% in 101 Minutes.

# Show rebuild progress as a text progress bar:
$ megacli -pdrbld -progdsply -physdrv[E:S] -aALL

## The screen shows something like this:
      Rebuild progress of physical drives...
  Enclosure:Slot               Percent Complete                       Time Elps
       032 :05   #######################87 %################*******  01:59:07
    Press <ESC> key to quit...
 
# View the RAID controller's rebuild parameters:
$ megacli -AdpAllinfo -aALL | grep -i rebuild

## The result looks like this:
Rebuild Rate                     : 30%
Auto Rebuild                     : Enabled
Rebuild Rate                     : Yes
Force Rebuild                    : Yes
 
# Set the RAID controller's rebuild rate to 60% (speeds up the rebuild):
$ /opt/MegaRAID/MegaCli/MegaCli64 -AdpSetProp RebuildRate -60 -a0

## On success it returns:
Adapter 0: Set rebuild rate to 60% success.
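
Because a higher rebuild rate takes controller time away from normal I/O, it can help to validate the value before applying it. The following is only a sketch: `rebuild_rate_cmd` is a hypothetical dry-run helper that prints the command instead of running it, and the 0 to 100 range check is an assumption based on the value being a percentage.

```shell
MEGACLI=${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}

# Hypothetical dry-run helper: validate the rate and print the -AdpSetProp
# command in the same form as above, without executing it.
rebuild_rate_cmd() {
  rate=$1
  adapter=${2:-0}
  case "$rate" in
    ''|*[!0-9]*) echo "rate must be a whole number" >&2; return 1 ;;
  esac
  if [ "$rate" -gt 100 ]; then
    echo "rate must be between 0 and 100" >&2
    return 1
  fi
  printf '%s -AdpSetProp RebuildRate -%s -a%s\n' "$MEGACLI" "$rate" "$adapter"
}
```

Usage: `rebuild_rate_cmd 60` prints the command for adapter 0. Review it, then run it by hand or pipe it to `sh`.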
 
# Set a hot spare
Syntax (options in square brackets are optional; the brackets in -PhysDrv[E:S] are literal):
/opt/MegaRAID/MegaCli/MegaCli64 -pdhsp -set [-Dedicated [-Array2]] [-EnclAffinity] [-nonRevertible] -PhysDrv[4:11] -a0
/opt/MegaRAID/MegaCli/MegaCli64 -pdhsp -set [-EnclAffinity] [-nonRevertible] -PhysDrv[32:1] -a0

MegaCli -PDHSP -Set -Dedicated -Array0 -physdrv[E:S] -a0   # add a dedicated hot spare; Array0 refers to RAID group 0 (Target Id: 0)
Example: sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDHSP  -Set -Dedicated  -Array1 -physdrv[32:9] -a0  # add a dedicated hot spare; Array1 refers to RAID group 1 (Target Id: 1)

MegaCli -pdhsp -set -physdrv[E:S] -a0   # add a global hot spare

MegaCli -pdhsp -rmv -physdrv[E:S] -a0   # remove a hot spare (global or dedicated)
Example: sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDHSP  -rmv -physdrv[32:9] -a0
 
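# Delete an array
/opt/MegaRAID/MegaCli/MegaCli64  -cfglddel -L2 -Force -a0   # force-delete the specified RAID group (Target Id: 2); the ID can be found with "View logical drive details" above. (Without the force option the command sometimes fails with: Virtual Disk is associate with Cache Cade. Please Use force option to delete)
/opt/MegaRAID/MegaCli/MegaCli64  -cfgclr  -a0   # clear the configuration of all RAID groups

# Clear the foreign configuration
/opt/MegaRAID/MegaCli/MegaCli64 -cfgforeign -clear -a0

# Scan again for the number of foreign configurations
/opt/MegaRAID/MegaCli/MegaCli64 -cfgforeign -scan -a0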
 
 
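Common problems:

1. Firmware state: Unconfigured(good), Spun Up (iDRAC reports an error: after logging in to iDRAC, the disk shows an exclamation mark and its state is Foreign)

Fix: /opt/MegaRAID/MegaCli/MegaCli64  -CfgForeign -Import -aall

After the import another problem appeared: the disk ended up in a RAID group that contained only that one disk, which conflicted with my original goal of adding it as a hot spare.
So we deleted the newly created RAID group:
/opt/MegaRAID/MegaCli/MegaCli64  -cfglddel -L2 -Force -a0   # force-delete the specified RAID group (Target Id: 2); the ID can be found with "View logical drive details" above. (Without the force option the command sometimes fails with: Virtual Disk is associate with Cache Cade. Please Use force option to delete)
Finally, set the drive as a hot spare:
sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDHSP  -Set -Dedicated  -Array1 -physdrv[32:9] -a0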
 
2. Firmware state: Unconfigured(bad): how to fix it when a new disk is meant to become the hot spare of a disk group
Enclosure Device ID: 32
Slot Number: 9
Enclosure position: 1
Device Id: 9
Firmware state: Unconfigured(bad)
A disk may show Unconfigured Bad because the drive reported an error. The steps are:
1. Use the command line to mark the drive as good.
sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDMakeGood -physdrv[32:9] -a0
2. Check 32:9 again to confirm that it is now in a good state.
Enclosure Device ID: 32
Slot Number: 9
Enclosure position: 1
Device Id: 9
Firmware state: Unconfigured(good), Spun Up
3. Next, the foreign config needs to be dealt with. (Serious pitfall: clearing it hung the whole server. Never run the foreign config clear; if you do, the only way to recover is to import the foreign config from the BIOS.)
### sudo /opt/MegaRAID/MegaCli/MegaCli64 -cfgforeign -clear -a0
/opt/MegaRAID/MegaCli/MegaCli64  -CfgForeign -Import -aall  # use with caution
Reference: http://www.51niux.com/?id=77 (Using the MegaCLI tool)
4. Finally, once the previous foreign configuration has been dealt with, set the drive as a hot spare.
sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDHSP  -Set -Dedicated  -Array1 -physdrv[32:9] -a0
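
The two problems above follow the same pattern: look at the drive's `Firmware state`, then pick the next command. As a rough sketch, that decision can be written as a dry-run helper that only prints the command. `next_hotspare_step` is a hypothetical name, adapter 0 is assumed, and foreign-config handling is deliberately left out because of the pitfall described in step 3.

```shell
MEGACLI=${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}

# Hypothetical dry-run helper: given a drive's Firmware state, its
# enclosure:slot, and the target array number, print the next command to run.
# Nothing is executed; foreign configurations must be handled manually.
next_hotspare_step() {
  state=$1   # e.g. "Unconfigured(bad)"
  drive=$2   # e.g. 32:9
  array=$3   # e.g. 1
  case "$state" in
    "Unconfigured(bad)"*)
      printf '%s -PDMakeGood -physdrv[%s] -a0\n' "$MEGACLI" "$drive" ;;
    "Unconfigured(good)"*)
      printf '%s -PDHSP -Set -Dedicated -Array%s -physdrv[%s] -a0\n' "$MEGACLI" "$array" "$drive" ;;
    "Hotspare"*)
      echo "drive $drive is already a hot spare" ;;
    *)
      echo "no automatic step for state: $state" >&2
      return 1 ;;
  esac
}
```

Usage: `next_hotspare_step "Unconfigured(bad)" 32:9 1` prints the `-PDMakeGood` command. After running it and re-checking the state, calling the helper again with `"Unconfigured(good), Spun Up"` prints the `-PDHSP` command.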
 

