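Replacing a hard disk on a NetApp storage system
1. Status check
Run "disk show -v" to check the state of each disk and which controller (head) owns it. A failed disk is shown with the state "FAILED".
Then run "aggr status -s" to see whether the number of hot spares has dropped. If there is one fewer spare than before, a hot spare has already started standing in for the failed disk.
After a disk fails, do not replace it immediately. Wait until the hot spare has fully taken over from the failed disk before swapping it. You can confirm this from the system logs: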
netapp-db-B> rdfile /etc/messages
netapp-db-B> rdfile /etc/messages.0
netapp-db-B> rdfile /etc/messages.1
netapp-db-B> rdfile /etc/messages.2
netapp-db-B> rdfile /etc/messages.3
netapp-db-B> rdfile /etc/messages.4
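The log will contain an entry similar to the following:
Tue Jan 1 03:51:44 CST [xxzx-netapp-db-B: raid.rg.recons.done:notice]: /aggr0/plex0/rg0: reconstruction completed for 2a.23 in 3:19:43.02
(the timestamp is the time at which reconstruction finished)
Once you see an entry like this, you can start replacing the disk.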
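The log check above can be sketched in Python. This is only an illustration, assuming the rdfile output has been saved as text on a workstation. The function name find_completed_reconstructions is hypothetical, and the pattern is modeled solely on the sample log line shown above.

```python
import re

# Pattern modeled on the sample entry:
# "... raid.rg.recons.done:notice]: /aggr0/plex0/rg0: reconstruction completed for 2a.23 in 3:19:43.02"
RECONS_DONE = re.compile(
    r"raid\.rg\.recons\.done:\w+\]:\s+(?P<raid_group>\S+):\s+"
    r"reconstruction completed for (?P<disk>\S+) in (?P<duration>[\d:.]+)"
)


def find_completed_reconstructions(log_text):
    """Return one dict (raid_group, disk, duration) per completed reconstruction."""
    return [m.groupdict() for m in RECONS_DONE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "Tue Jan 1 03:51:44 CST [xxzx-netapp-db-B: raid.rg.recons.done:notice]: "
        "/aggr0/plex0/rg0: reconstruction completed for 2a.23 in 3:19:43.02\n"
    )
    for event in find_completed_reconstructions(sample):
        print(event["disk"], "rebuilt in", event["duration"])
```

An empty result means no reconstruction-completed entry was found in the text supplied, in which case the failed disk should stay in place for now.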
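2. Physical disk replacement
On the NetApp chassis, pull out the disk whose amber fault LED is lit, wait a few seconds, then insert the new disk. Watch for the LED to change from blinking amber to green.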
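3. System configuration
- Status confirmation
Run "disk show -v" to check disk ownership. Disks 0a.00.18 and 0a.00.19 show "Not Owned" and "NONE", which means neither disk belongs to any controller.
- System configuration
If there is more than one controller, the disks can be split between controllers as required. In that case, log in over the serial console to the controller that should own the disk and assign it with the commands below. (In this example there is only one controller and it manages every disk, so there is no question of which controller to assign to.)
1) Enter advanced privilege mode
cdst-netapp> priv set advanced (switch to the advanced privilege level)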
Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by NetApp
personnel.
cdst-netapp*> (the prompt now carries a "*")
2) Assign the disks
cdst-netapp*> disk assign 0a.00.19 (assign disk 0a.00.19; the captured output below names 0a.00.18)
Sat Jul 16 10:23:21 CST [cdst-netapp:diskown.changingOwner:info]: changing ownership for disk 0a.00.18 (S/N LXWH5A0M) from unowned (ID 4294967295) to cdst-netapp (ID 2014870888)
cdst-netapp*> Sat Jul 16 10:23:21 CST [cdst-netapp:raid.assim.disk.nolabels:error]: Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] has no valid labels. It will be taken out of service to prevent possible data loss.
Sat Jul 16 10:23:21 CST [cdst-netapp:raid.config.disk.bad.label:error]: Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] has bad label.
Sat Jul 16 10:23:21 CST [cdst-netapp:callhome.dsk.label:CRITICAL]: Call home for DISK BAD LABEL
Sat Jul 16 10:23:21 CST [cdst-netapp:sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
At this point disk "0a.00.19" has been assigned to the controller, while "0a.00.18" is still unassigned, so assign it as well:
cdst-netapp*> disk assign 0a.00.18
Sat Jul 16 10:23:21 CST [cdst-netapp:diskown.changingOwner:info]: changing ownership for disk 0a.00.18 (S/N LXWH5A0M) from unowned (ID 4294967295) to cdst-netapp (ID 2014870888)
cdst-netapp*> Sat Jul 16 10:23:21 CST [cdst-netapp:raid.assim.disk.nolabels:error]: Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] has no valid labels. It will be taken out of service to prevent possible data loss.
Sat Jul 16 10:23:21 CST [cdst-netapp:raid.config.disk.bad.label:error]: Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] has bad label.
Sat Jul 16 10:23:21 CST [cdst-netapp:callhome.dsk.label:CRITICAL]: Call home for DISK BAD LABEL
Sat Jul 16 10:23:21 CST [cdst-netapp:sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
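To double-check which disks actually changed owner, the diskown.changingOwner messages can be pulled out of a saved console capture. The Python sketch below is an illustration only: the function name ownership_changes is hypothetical, and the pattern is modeled on the single message format shown in the output above.

```python
import re

# Pattern modeled on:
# "... diskown.changingOwner:info]: changing ownership for disk 0a.00.18
#  (S/N LXWH5A0M) from unowned (ID 4294967295) to cdst-netapp (ID 2014870888)"
OWNER_CHANGE = re.compile(
    r"diskown\.changingOwner:\w+\]:\s+changing ownership for disk (?P<disk>\S+) "
    r"\(S/N (?P<serial>[^)]+)\) from (?P<old_owner>\S+) \(ID (?P<old_id>\d+)\) "
    r"to (?P<new_owner>\S+) \(ID (?P<new_id>\d+)\)"
)


def ownership_changes(log_text):
    """Return one dict per ownership-change message found in log_text."""
    return [m.groupdict() for m in OWNER_CHANGE.finditer(log_text)]


if __name__ == "__main__":
    capture = (
        "Sat Jul 16 10:23:21 CST [cdst-netapp:diskown.changingOwner:info]: "
        "changing ownership for disk 0a.00.18 (S/N LXWH5A0M) from unowned "
        "(ID 4294967295) to cdst-netapp (ID 2014870888)\n"
    )
    for change in ownership_changes(capture):
        print(change["disk"], change["old_owner"], "->", change["new_owner"])
```

Comparing the disk names returned here with the disks you meant to assign is a quick way to spot a disk that was missed.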
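3) Check the disk labels
Run "sysconfig -r" to view the state of each RAID group, including the hot spares.
The two newly installed disks are flagged "bad label" and need to be converted into hot spares.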
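The bad-label disks can also be identified from the console messages printed during assignment. The following Python sketch assumes those messages were saved as text. The helper names disks_with_bad_label and unfail_commands are hypothetical, and the script only prints the "disk unfail -s" commands used in the next step; it does not run anything on the controller.

```python
import re

# Pattern modeled on:
# "... raid.config.disk.bad.label:error]: Disk 0a.00.18 Shelf 0 Bay 18
#  [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] has bad label."
BAD_LABEL = re.compile(
    r"raid\.config\.disk\.bad\.label:\w+\]:\s+Disk (?P<disk>\S+) "
    r"Shelf \d+ Bay \d+ .*?S/N \[(?P<serial>[^\]]+)\] has bad label"
)


def disks_with_bad_label(log_text):
    """Return {disk_name: serial_number}; repeated messages collapse to one entry."""
    found = {}
    for m in BAD_LABEL.finditer(log_text):
        found[m.group("disk")] = m.group("serial")
    return found


def unfail_commands(disks):
    """Build the 'disk unfail -s <disk>' command line for each disk name."""
    return ["disk unfail -s " + name for name in disks]


if __name__ == "__main__":
    capture = (
        "Sat Jul 16 10:23:21 CST [cdst-netapp:raid.config.disk.bad.label:error]: "
        "Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] "
        "S/N [LXWH5A0M] has bad label.\n"
    )
    for command in unfail_commands(disks_with_bad_label(capture)):
        print(command)
```

Keying the result by disk name means a message that appears twice in the capture still yields a single command.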
4) Convert to hot spares
cdst-netapp*> disk unfail -s 0a.00.18 (convert disk 0a.00.18 to a hot spare)
disk unfail: unfailing disk 0a.00.18...
cdst-netapp*> Sat Jul 16 10:37:56 CST [cdst-netapp:raid.disk.unfail.done:info]: Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] unfailed, and is now a spare
Sat Jul 16 10:38:05 CST [cdst-netapp:raid.disk.offline:notice]: Marking Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] offline.
Sat Jul 16 10:38:05 CST [cdst-netapp:bdfu.selected:info]: Disk 0a.00.18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] selected for background disk firmware update.
Sat Jul 16 10:38:05 CST [cdst-netapp:dfu.firmwareDownloading:info]: Now downloading firmware file /etc/disk_fw/X412_HVIPC560A15.NA02.LOD on 1 disk(s) of plex [Pool0]...
Sat Jul 16 10:38:21 CST [cdst-netapp:raid.disk.online:notice]: Onlining Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M].
Sat Jul 16 10:38:21 CST [cdst-netapp:sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
cdst-netapp*> disk unfail -s 0a.00.19 (convert disk 0a.00.19 to a hot spare; the captured output below names 0a.00.18)
disk unfail: unfailing disk 0a.00.18...
cdst-netapp*> Sat Jul 16 10:37:56 CST [cdst-netapp:raid.disk.unfail.done:info]: Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] unfailed, and is now a spare
Sat Jul 16 10:38:05 CST [cdst-netapp:raid.disk.offline:notice]: Marking Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] offline.
Sat Jul 16 10:38:05 CST [cdst-netapp:bdfu.selected:info]: Disk 0a.00.18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M] selected for background disk firmware update.
Sat Jul 16 10:38:05 CST [cdst-netapp:dfu.firmwareDownloading:info]: Now downloading firmware file /etc/disk_fw/X412_HVIPC560A15.NA02.LOD on 1 disk(s) of plex [Pool0]...
Sat Jul 16 10:38:21 CST [cdst-netapp:raid.disk.online:notice]: Onlining Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] S/N [LXWH5A0M].
Sat Jul 16 10:38:21 CST [cdst-netapp:sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
Both replaced disks have now been converted to hot spares, but their state is "not zeroed", so they still need to be zeroed.
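Before zeroing, it can help to confirm that every disk you unfailed really reported "is now a spare". The Python sketch below works on a saved console capture. The function names confirmed_spares and still_pending are hypothetical, and the pattern is modeled on the raid.disk.unfail.done message shown above.

```python
import re

# Pattern modeled on:
# "... raid.disk.unfail.done:info]: Disk 0a.00.18 Shelf 0 Bay 18 [...]
#  S/N [LXWH5A0M] unfailed, and is now a spare"
UNFAIL_DONE = re.compile(
    r"raid\.disk\.unfail\.done:\w+\]:\s+Disk (?P<disk>\S+) "
    r".*?unfailed, and is now a spare"
)


def confirmed_spares(log_text):
    """Return the set of disk names that logged a successful unfail."""
    return {m.group("disk") for m in UNFAIL_DONE.finditer(log_text)}


def still_pending(expected_disks, log_text):
    """Return the expected disks that have no unfail-done message yet."""
    done = confirmed_spares(log_text)
    return [name for name in expected_disks if name not in done]


if __name__ == "__main__":
    capture = (
        "Sat Jul 16 10:37:56 CST [cdst-netapp:raid.disk.unfail.done:info]: "
        "Disk 0a.00.18 Shelf 0 Bay 18 [NETAPP X412_HVIPC560A15 NA01] "
        "S/N [LXWH5A0M] unfailed, and is now a spare\n"
    )
    print(still_pending(["0a.00.18", "0a.00.19"], capture))
```

Any disk left in the returned list has no confirmation in the capture and is worth checking with "sysconfig -r" before continuing.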
5) Zero the hot spares
Enter "disk zero spares". Any hot spare that has not been zeroed will start zeroing, and you can follow the progress with "sysconfig -r".
cdst-netapp*> disk zero spares