Checking Disk Status and Replacing Disks with MegaCLI


https://my.oschina.net/adailinux/blog/2231519

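I previously wrote an article on the procedure for replacing disks in a production server; in that case every disk in the machine was replaced. Recently some of the disks in another machine failed. Its RAID type is RAID 10, and after checking it turned out that only the failed disks needed to be replaced. This document supplements the earlier one.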

Installing MegaCLI

Installation package: download link.

Installation steps

# Download the installation package first
# Unpack it
$ tar -zxf MegaCli8.07.10.tar.gz
$ cd MegaCli8.07.10/Linux/
$ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm
# Make the binary available on the system PATH
$ ln -s /opt/MegaRAID/MegaCli/MegaCli64 /usr/local/bin/MegaCli
$ MegaCli -v
MegaCLI SAS RAID Management Tool Ver 8.02.21 Oct 21, 2011
(c)Copyright 2011, LSI Corporation, All Rights Reserved.
Exit Code: 0x00
# Installation complete!
  • Handling a package conflict:

    $ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm
    Preparing...                          ################################# [100%]
    file /opt/lsi/3rdpartylibs/x86_64/libsysfs.so.2.0.2 from install of Lib_Utils-1.00-09.noarch conflicts with file from package srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64
  • Cause: Lib_Utils conflicts with the srvadmin package that ships with Dell servers. Uninstall that package, then run the installation again.

    rpm -e srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64 --nodeps
    

Usage guide

Basic usage

# Show RAID levels
$ MegaCli -LDInfo -Lall -aALL

# Show RAID controller information
$ MegaCli -AdpAllInfo -aALL

# Show physical disk information
$ MegaCli -PDList -aALL

# Show battery information
$ MegaCli -AdpBbuCmd -aAll

# Show the RAID controller log
$ MegaCli -FwTermLog -Dsply -aALL

# Show the number of adapters
$ MegaCli -adpCount

# Show the adapter time
$ MegaCli -AdpGetTime -aALL

# Show information for all adapters
$ MegaCli -AdpAllInfo -aAll

# Show information for all logical disk groups
$ MegaCli -LDInfo -LALL -aAll

# Show information for all physical disks
$ MegaCli -PDList -aAll

# Show the charger status
$ MegaCli -AdpBbuCmd -GetBbuStatus -aALL |grep 'Charger Status'

# Show BBU status
$ MegaCli -AdpBbuCmd -GetBbuStatus -aALL

# Show BBU capacity
$ MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aALL

# Show BBU design parameters
$ MegaCli -AdpBbuCmd -GetBbuDesignInfo -aALL

# Show current BBU properties
$ MegaCli -AdpBbuCmd -GetBbuProperties -aALL

# Show the RAID controller model, RAID configuration, and disk information
$ MegaCli -cfgdsply -aALL

## How the states change between pulling a disk and inserting its replacement
Device           |Normal |Damage             |Rebuild |Normal
Virtual Drive    |Optimal|Degraded           |Degraded|Optimal
Physical Drive   |Online |Failed Unconfigured|Rebuild |Online

# Check the rebuild status of a physical disk:
$ MegaCli -PDRbld -ShowProg -PhysDrv[Enclosure Device ID:Slot Number] -a0
## A physical disk that is rebuilding reports "Firmware state: Rebuild"

# Query rebuild progress:
$ MegaCli -pdrbld -showprog -physdrv[E:S] -aALL
## The output looks like this:
Rebuild Progress on Device at Enclosure 32, Slot 5 Completed 77% in 101 Minutes.

# Show rebuild progress as a text progress bar:
$ MegaCli -pdrbld -progdsply -physdrv[E:S] -aALL
## The screen shows something like this:
Rebuild progress of physical drives...
Enclosure:Slot               Percent Complete                       Time Elps
      032 :05   #######################87 %################*******  01:59:07
Press key to quit...

# Show the RAID controller's rebuild parameters:
$ MegaCli -AdpAllinfo -aALL | grep -i rebuild
## The output looks like this:
Rebuild Rate                     : 30%
Auto Rebuild                     : Enabled
Rebuild Rate                     : Yes
Force Rebuild                    : Yes

# Set the RAID controller's rebuild rate to 60%:
$ MegaCli -AdpSetProp { RebuildRate -60} -aALL
## On success it returns:
Adapter 0: Set rebuild rate to 60% success.

More on MegaCLI usage: http://blog.51cto.com/daixuan/1863567

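Important parameters

Parameter                                    | Meaning
Firmware state                               | Disk state
Firmware state: Online, Spun Up              | Disk is healthy
Firmware state: Unconfigured(good), Spun Up  | Disk is installed but not in use
Firmware state: Unconfigured(bad)            | Faulty; corresponds to Non-Critical in hwcheck
Firmware state: Failed                       | Faulty; corresponds to Critical in hwcheck
Firmware state: Rebuild                      | Rebuilding; usually shown while a disk is being replaced
Enclosure Device ID: 32                      | Enclosure device ID
Slot Number: 1                               | Slot the disk occupies in the server
Adapter #0                                   | Adapter number; corresponds to the -a parameter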
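As an illustration of how these fields can be read programmatically, here is a minimal Python sketch. It assumes the PDList output has one "Key: Value" field per line and that each drive record begins with an "Enclosure Device ID" line. The names parse_pdlist, describe_state, and STATE_MEANINGS are hypothetical helpers made up for this example, not part of MegaCLI, and the state descriptions simply restate the table above.

```python
# Meanings taken from the "Important parameters" table above.
STATE_MEANINGS = {
    "Online": "healthy",
    "Unconfigured(good)": "installed but not in use",
    "Unconfigured(bad)": "faulty",
    "Failed": "faulty",
    "Rebuild": "rebuilding",
}


def parse_pdlist(text):
    """Split PDList-style text into one dict per physical drive."""
    drives = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # e.g. "Adapter #0" or blank lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            # Assumption: this field opens every drive record.
            current = {}
            drives.append(current)
        if current is not None:
            # Repeated keys (such as "Port status") keep only the last value.
            current[key] = value
    return drives


def describe_state(firmware_state):
    """Map a 'Firmware state' value to the meaning listed in the table."""
    base = firmware_state.split(",")[0].strip()
    return STATE_MEANINGS.get(base, "unknown")


SAMPLE = """Adapter #0

Enclosure Device ID: 32
Slot Number: 0
Firmware state: Unconfigured(good), Spun Up

Enclosure Device ID: 32
Slot Number: 2
Firmware state: Unconfigured(bad)

Enclosure Device ID: 32
Slot Number: 1
Firmware state: Online, Spun Up
"""

for drive in parse_pdlist(SAMPLE):
    state = describe_state(drive.get("Firmware state", ""))
    print(drive["Enclosure Device ID"], drive["Slot Number"], state)
```

In practice the text would come from the output of MegaCli -PDList -aAll -NoLog rather than from an embedded sample.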

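Hands-on: replacing a disk in a RAID 10 array

Replacing a disk under RAID 10 is straightforward. Hot swapping is supported, so the bad disk can be pulled and a new one inserted directly. The steps are described below.

Environment

Server: R720

OS: CentOS 7

RAID type: RAID 10

Viewing disk information

To show the procedure as clearly as possible, the output below has not been abbreviated.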

$ MegaCli -PDList -aAll -NoLog
                                     
Adapter #0

Enclosure Device ID: 32
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 0
WWN: 5000C50076CD09B4
Sequence Number: 1
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 28
Last Predictive Failure Event Seq Number: 4378
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c50076cd09b5
SAS Address(1): 0x0
Connected Port Number: 5(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL8SASQ
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: Foreign
Foreign Secure: Drive is not secured by a foreign lock key
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :40C (104.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes

Enclosure Device ID: 32
Slot Number: 2
Enclosure position: 0
Device Id: 2
WWN: 5000C50076CD05BC
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 0 KB [0x0 Sectors]
Non Coerced Size: 0 KB [0x0 Sectors]
Coerced Size: 0 KB [0x0 Sectors]
Firmware state: Unconfigured(bad)
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c50076cd05bd
SAS Address(1): 0x0
Connected Port Number: 1(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL8SAVC
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: Unknown
Media Type: Hard Disk Device
Drive: Not Supported
Drive Temperature :0C (32.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: Unknown
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No

Enclosure Device ID: 32
Slot Number: 1
Drive's postion: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: 0
Device Id: 1
WWN: 5000C500983873BC
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: VT31
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c500983873bd
SAS Address(1): 0x0
Connected Port Number: 3(path0)
Inquiry Data: SEAGATE ST600MP0005 VT31S7M1CSLT
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :41C (105.80 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No

Enclosure Device ID: 32
Slot Number: 3
Drive's postion: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: 0
Device Id: 3
WWN: 5000C50076CE2F30
Sequence Number: 2
Media Error Count: 5
Other Error Count: 71
Predictive Failure Count: 15
Last Predictive Failure Event Seq Number: 4379
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c50076ce2f31
SAS Address(1): 0x0
Connected Port Number: 2(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL8SAKA
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :48C (118.40 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes

Enclosure Device ID: 32
Slot Number: 4
Drive's postion: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 4
WWN: 5000C5007E70F0F8
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c5007e70f0f9
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL9F1JB
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :46C (114.80 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No

Enclosure Device ID: 32
Slot Number: 5
Drive's postion: DiskGroup: 1, Span: 0, Arm: 1
Enclosure position: 0
Device Id: 5
WWN: 5000C5007E708E3C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c5007e708e3d
SAS Address(1): 0x0
Connected Port Number: 4(path0)
Inquiry Data: SEAGATE ST3600057SS ES666SL9F2RB
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :45C (113.00 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No

Exit Code: 0x00

From the output above, this server has six disks (Device Id 0 through 5).
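Besides the firmware state, the error counters and the S.M.A.R.T alert flag in this output hint at which disks deserve attention. The following Python sketch picks them out. It assumes one field per line with no indentation, and the choice of "any non-zero counter or a raised alert" as the criterion is an assumption made for this example, not a rule from MegaCLI. The function name suspect_drives is hypothetical; the sample values are copied from slots 0, 1, and 3 above.

```python
import re

COUNTERS = ("Media Error Count", "Other Error Count", "Predictive Failure Count")


def suspect_drives(pdlist_text):
    """Return (slot, reasons) for drives with non-zero counters or a SMART alert."""
    suspects = []
    # Assumption: every drive record starts with an "Enclosure Device ID:" line.
    records = re.split(r"(?m)^(?=Enclosure Device ID:)", pdlist_text)
    for record in records:
        slot = re.search(r"(?m)^Slot Number:\s*(\d+)", record)
        if slot is None:
            continue  # text before the first record, e.g. "Adapter #0"
        reasons = []
        for name in COUNTERS:
            match = re.search(rf"(?m)^{re.escape(name)}:\s*(\d+)", record)
            if match and int(match.group(1)) > 0:
                reasons.append(f"{name}={match.group(1)}")
        if re.search(r"S\.M\.A\.R\.T alert\s*:\s*Yes", record):
            reasons.append("S.M.A.R.T alert")
        if reasons:
            suspects.append((int(slot.group(1)), reasons))
    return suspects


SAMPLE = """Adapter #0

Enclosure Device ID: 32
Slot Number: 0
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 28
Drive has flagged a S.M.A.R.T alert : Yes

Enclosure Device ID: 32
Slot Number: 1
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Drive has flagged a S.M.A.R.T alert : No

Enclosure Device ID: 32
Slot Number: 3
Media Error Count: 5
Other Error Count: 71
Predictive Failure Count: 15
Drive has flagged a S.M.A.R.T alert : Yes
"""

for slot_number, reasons in suspect_drives(SAMPLE):
    print(f"slot {slot_number}: {', '.join(reasons)}")
```

A disk can still report "Online" while its counters climb, as slot 3 does here, so a check like this complements the firmware state rather than replacing it.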

Taking the failed disks offline

$ MegaCli -PDOffline -PhysDrv[32:2] -a0
$ MegaCli -PDOffline -PhysDrv[32:0] -a0

The values 32, 2, and -a0 in the command above correspond to these fields in the PDList output:

Adapter #0
Enclosure Device ID: 32
Slot Number: 2
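The mapping can be written down as a small Python sketch that assembles the argument list from the three values. The helper name physdrv_args is hypothetical, and the sketch only builds the arguments; it does not run MegaCli.

```python
def physdrv_args(action, enclosure, slot, adapter):
    """Build a MegaCli argument list for a per-drive action such as PDOffline.

    enclosure -> "Enclosure Device ID", slot -> "Slot Number",
    adapter   -> the number shown in "Adapter #N".
    """
    return [
        "MegaCli",
        f"-{action}",
        f"-PhysDrv[{enclosure}:{slot}]",
        f"-a{adapter}",
    ]


args = physdrv_args("PDOffline", 32, 2, 0)
print(" ".join(args))
```

Keeping the arguments as a list means they could be handed to something like subprocess.run without passing through a shell, where the unquoted square brackets would be treated as a glob pattern.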

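Replacing the failed disk

The failed disk is now OFFLINE. On site, the failed disk's LED blinks amber, while the healthy disks show green. Pull the failed disk and insert a good one. The new disk's LED blinks green and the disk spins up, which indicates that it is rebuilding. Check the state as follows: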

$ MegaCli -PDList -aAll -NoLog
...
Enclosure Device ID: 32
Slot Number: 3
...
Firmware state: Rebuild
...

Checking rebuild progress

$ MegaCli -PDRbld -ShowProg -PhysDrv[32:3] -aAll

Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 16% in 94 Minutes. 
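A progress line like the one above can be turned into a rough estimate of the remaining time. The Python sketch below assumes the line has exactly this wording and that the rebuild proceeds at a constant pace, which is a simplification since the pace depends on the rebuild rate setting and on server load. The names parse_rebuild_progress and PROGRESS_RE are hypothetical.

```python
import re

PROGRESS_RE = re.compile(
    r"Enclosure (\d+), Slot (\d+) Completed (\d+)% in (\d+) Minutes"
)


def parse_rebuild_progress(line):
    """Extract enclosure, slot, percent, and elapsed minutes from a progress line."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    enclosure, slot, percent, minutes = map(int, match.groups())
    remaining = None
    if percent > 0:
        # Linear extrapolation (assumption): same pace for the rest of the rebuild.
        remaining = minutes * (100 - percent) // percent
    return {
        "enclosure": enclosure,
        "slot": slot,
        "percent": percent,
        "elapsed_minutes": minutes,
        "estimated_remaining_minutes": remaining,
    }


line = "Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 16% in 94 Minutes."
print(parse_rebuild_progress(line))
```

The estimate is only a guide for deciding how often to check back.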

Disk replacement complete

$ MegaCli -PDList -aAll -NoLog | grep 'Firmware state'
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
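The final check can also be automated. This Python sketch counts the firmware states in the grep output and reports whether every disk is Online. It assumes each line has the form "Firmware state: <state>[, ...]"; the names firmware_state_summary and all_online are hypothetical.

```python
from collections import Counter


def firmware_state_summary(grep_output):
    """Count drives per firmware state from "Firmware state: ..." lines."""
    states = Counter()
    for line in grep_output.splitlines():
        _, found, value = line.partition("Firmware state:")
        if found:
            # Keep only the part before the comma, e.g. "Online" from "Online, Spun Up".
            states[value.split(",")[0].strip()] += 1
    return states


def all_online(states):
    """True when at least one drive was seen and every drive is Online."""
    return bool(states) and set(states) == {"Online"}


SAMPLE = "\n".join(["Firmware state: Online, Spun Up"] * 6)

summary = firmware_state_summary(SAMPLE)
print(dict(summary), all_online(summary))
```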

 

