NetApp Storage Solutions and Inspection Commands


I. MCC Overview

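Clustered MetroCluster (MCC) is the active-active storage solution provided by NetApp Data ONTAP. The original design took one dual-controller FAS/V-Series system and stretched it between two data centers to form a geographically separated HA pair, so each site had only a single controller node. The two sites were connected through additional FC-VI cluster adapters, and the SAS disk shelves were connected between the data centers through SAS-to-FC FibreBridge units. For distances within 500 m, or inside the same machine room, direct Fibre Channel switch connections were used; beyond 500 m (up to 100 km), Fibre Channel and DWDM switches were used.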

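MetroCluster later evolved from this architecture. One dual-controller FAS/V-Series array is placed at each of sites A and B. Controller A of array A pairs with controller A of array B, and controller B of array A pairs with controller B of array B, each pair forming a cluster. This lets both data centers be fully used and serve storage at the same time. The two controllers inside one array, however, are not clustered with each other. If either controller of a cross-site pair fails, the hosts at the failed site must access the controller at the remote site; if both nodes of a cross-site pair fail at the same time, service is interrupted.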

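NetApp Data ONTAP 8.3 introduced a four-controller active-active solution that supports distances of up to 200 km. The four-controller MetroCluster first builds two local clusters from two HA pairs, then joins the two clusters into a four-node configuration. Each controller keeps its in-memory write log in NVRAM, and the log entries that have not yet been flushed to disk are mirrored, so that after a node failure the HA partner can take over, and after a site failure the remote HA pair can take over. When the log reaches a certain watermark, or a system operation triggers a flush, the data written to disk is synchronized through SyncMirror, which writes to both the primary and the secondary site. If the disks at one site fail, the disks at the other site can still serve I/O, which allows site failover without interrupting service.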
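The write path described above can be pictured with a small toy model. This is only a sketch of the idea (log to NVRAM, mirror the unflushed log to a partner, flush to both plexes at a watermark); the class and function names are invented for illustration, the watermark of 3 is arbitrary, and real ONTAP behaviour is far more involved.

```python
class Node:
    """Toy controller with an NVRAM log and a mirror of its partner's log."""

    def __init__(self, name):
        self.name = name
        self.nvram = []           # acknowledged writes not yet flushed to disk
        self.partner_mirror = []  # copy of the partner's unflushed writes
        self.partner = None


class MirroredAggregate:
    """Toy SyncMirror aggregate: each flush is written to both plexes."""

    def __init__(self):
        self.plex_local = []
        self.plex_remote = []

    def flush(self, entries):
        self.plex_local.extend(entries)
        self.plex_remote.extend(entries)


def make_pair(a, b):
    """Link two toy nodes as partners (hypothetical helper)."""
    a.partner, b.partner = b, a


def write(node, aggr, data, watermark=3):
    """Log a write, mirror it to the partner, flush when the log is full."""
    node.nvram.append(data)
    node.partner.partner_mirror.append(data)
    if len(node.nvram) >= watermark:
        aggr.flush(node.nvram)
        node.nvram.clear()
        node.partner.partner_mirror.clear()


def take_over(survivor, aggr):
    """Replay the failed partner's mirrored log so acknowledged writes survive."""
    aggr.flush(survivor.partner_mirror)
    survivor.partner_mirror.clear()
```

In this model, a write that was acknowledged but not yet flushed is still present in the partner's mirror, so `take_over` can replay it onto both plexes.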

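MetroCluster protects data by using mirroring and clusters at two separate locations. Each cluster synchronously mirrors its data and its Storage Virtual Machine (SVM) configuration to the other cluster. When a disaster strikes one site, the administrator can activate the mirrored SVMs and take over service at the other site. In addition, the nodes in each cluster are configured locally as an HA pair, which provides local failover.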

NetApp MetroCluster is implemented by combining NetApp SyncMirror with Cluster_remote and the controllers' Cluster Failover function.

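      • Clustered Failover: provides high-availability failover between the primary storage and the disaster-recovery storage. The takeover decision is made by the administrator with a single command.

      • SyncMirror: keeps an up-to-date copy of the data on the remote storage, so that after a takeover the data can be accessed from the remote storage alone.

      • ClusterRemote: provides the management mechanism used to determine that a disaster has occurred and to initiate takeover by the remote storage.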

II. Common MCC Inspection Commands

1. Check system health status

cluster1::> system health status show
Status
---------------
ok

2. Check cluster status

cluster1::> cluster show              
Node                  Health  Eligibility
--------------------- ------- ------------
cluster1-01           true    true
cluster1-02           true    true
2 entries were displayed.
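When this check is run regularly, the table can be parsed instead of read by eye. The following is a minimal sketch, assuming the output has been captured as text and keeps the three-column layout shown above; `parse_cluster_show` and `unhealthy_nodes` are hypothetical helper names, not part of any NetApp tool.

```python
def parse_cluster_show(text):
    """Return {node: {"health": bool, "eligibility": bool}} from `cluster show` text."""
    nodes = {}
    in_table = False
    for line in text.splitlines():
        if line.startswith("---"):
            in_table = True  # data rows start after the dashed separator
            continue
        if not in_table:
            continue
        parts = line.split()
        # A data row has exactly three fields; the footer line has four.
        if len(parts) == 3 and parts[1] in ("true", "false"):
            nodes[parts[0]] = {
                "health": parts[1] == "true",
                "eligibility": parts[2] == "true",
            }
    return nodes


def unhealthy_nodes(nodes):
    """List nodes that are not both healthy and eligible."""
    return [n for n, s in nodes.items() if not (s["health"] and s["eligibility"])]
```

An empty list from `unhealthy_nodes` corresponds to every row showing `true` in both columns.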

3. Check cluster statistics

cluster1::> cluster statistics show
         Counter             Value         Delta
---------------- ----------------- -------------
       CPU Busy:                0%             -
     Operations:
          Total:                 0             -
            NFS:                 0             -
           CIFS:                 0             -
   Data Network:
           Busy:                0%             -
       Received:            5.78GB             -
           Sent:            13.7GB             -
Cluster Network:
           Busy:                0%             -
       Received:             967KB             -
           Sent:             979KB             -
   Storage Disk:
           Read:            6.38PB             -
          Write:            6.26PB             -

4. View RAID group (aggregate) information

cluster1::> aggr show
                                                                      

Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0_A1   953.8GB   247.3GB   74% online       1 cluster1-01      raid4,
                                                                   mirrored,
                                                                   normal
aggr0_A2   953.8GB   247.3GB   74% online       1 cluster1-02      raid4,
                                                                   mirrored,
                                                                   normal
aggr_data_A1 
           68.93TB   16.04TB   77% online      32 cluster1-01      mixed_raid_
                                                                   type,
                                                                   mirrored,
                                                                   hybrid,
                                                                   normal
aggr_data_A2 
           68.93TB   14.77TB   79% online      31 cluster1-02      mixed_raid_
                                                                   type,
                                                                   mirrored,
                                                                   hybrid,
                                                                   normal
4 entries were displayed.
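Because long aggregate names wrap onto their own line, picking out the Used% column by hand is error-prone. The sketch below flags aggregates above a usage threshold, assuming the column layout shown above; the helper names and the 75% threshold are illustrative choices, not NetApp recommendations.

```python
import re

# A data row: optional name, size, available, used%, state.
ROW = re.compile(
    r"^(?P<name>\S+)?\s+(?P<size>[\d.]+[KMGTP]B)\s+(?P<avail>[\d.]+[KMGTP]B)"
    r"\s+(?P<used>\d+)%\s+(?P<state>\S+)"
)


def parse_aggr_show(text):
    """Return a list of dicts with name, used_pct, state and available."""
    rows = []
    pending = None  # name seen on its own line when the row is wrapped
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "name": m.group("name") or pending,
                "used_pct": int(m.group("used")),
                "state": m.group("state"),
                "available": m.group("avail"),
            })
            pending = None
        elif line and not line[0].isspace() and len(line.split()) == 1:
            pending = line.strip()
    return rows


def over_threshold(rows, pct=75):
    """Names of aggregates whose Used% is at or above pct."""
    return [r["name"] for r in rows if r["used_pct"] >= pct]
```

The continuation lines that carry the rest of the RAID status (for example `mirrored,`) are indented single tokens and are ignored by this parser.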

5. View node information

cluster1::> node show
Node      Health Eligibility Uptime        Model       Owner    Location  
--------- ------ ----------- ------------- ----------- -------- ---------------
cluster1-01 
          true   true        
                            369 days 19:12 FAS8040              gz_idc
cluster1-02 
          true   true        
                            369 days 19:23 FAS8040              gz_idc
2 entries were displayed.

6. View version information

cluster1::> version
NetApp Release 8.3.2P9: Fri Jan 06 05:54:05 UTC 2017

7. View serial numbers

cluster1::> system license show

Serial Number: 1-80-023992
Owner: cluster1
Package           Type    Description           Expiration
----------------- ------- --------------------- --------------------
Base              license Cluster Base License  -

Serial Number: 1-81-0000000000000451515******
Package           Type    Description           Expiration
----------------- ------- --------------------- --------------------
NFS               license NFS License           -
iSCSI             license iSCSI License         -

Serial Number: 1-81-0000000000000451515******
Owner: cluster1-02
Package           Type    Description           Expiration
----------------- ------- --------------------- --------------------
NFS               license NFS License           -
iSCSI             license iSCSI License         -
5 entries were displayed.

8. View subsystem health status

cluster1::> system health subsystem show
Subsystem         Health
----------------- ------------------
SAS-connect       ok
Environment       ok
Memory            ok
Service-Processor ok
Switch-Health     ok
CIFS-NDO          ok
Motherboard       ok
IO                ok
MetroCluster      ok
MetroCluster_Node ok
FHM-Switch        ok
FHM-Bridge        ok
12 entries were displayed.

9. View MCC cluster status and node status

cluster1::> metrocluster show

Configuration: fabric

Cluster                        Configuration State    Mode
------------------------------ ---------------------- ------------------------
 Local: cluster1               configured             normal
Remote: cluster1_dr            configured             normal

cluster1::> metrocluster node show
DR                               Configuration  DR
Group Cluster Node               State          Mirroring Mode
----- ------- ------------------ -------------- --------- --------------------
1     cluster1
              cluster1-01        configured     enabled   normal
              cluster1-02        configured     enabled   normal
      cluster1_dr
              cluster1_dr-01     configured     enabled   normal
              cluster1_dr-02     configured     enabled   normal
4 entries were displayed.
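The `metrocluster node show` output is grouped by DR group and cluster, so a node row only makes sense together with the cluster line above it. A minimal sketch of reading it, assuming the indentation shown above is preserved in the captured text (the function names are hypothetical):

```python
def parse_metrocluster_node_show(text):
    """Return one dict per node with its DR group, cluster, state, mirroring, mode."""
    nodes = []
    group = cluster = None
    in_table = False
    for line in text.splitlines():
        if line.startswith("-----"):
            in_table = True
            continue
        if not in_table or not line.strip():
            continue
        parts = line.split()
        if line[0].isspace() and len(parts) == 4:
            # Indented row: node, configuration state, DR mirroring, mode.
            nodes.append({
                "group": group, "cluster": cluster, "node": parts[0],
                "state": parts[1], "mirroring": parts[2], "mode": parts[3],
            })
        elif len(parts) == 2 and parts[0].isdigit():
            group, cluster = int(parts[0]), parts[1]  # "1     cluster1"
        elif line[0].isspace() and len(parts) == 1:
            cluster = parts[0]                        # "      cluster1_dr"
    return nodes


def abnormal_nodes(nodes):
    """Nodes that are not configured / enabled / normal."""
    expected = ("configured", "enabled", "normal")
    return [n["node"] for n in nodes
            if (n["state"], n["mirroring"], n["mode"]) != expected]
```

For the output shown above, every node is `configured`, `enabled` and `normal`, so `abnormal_nodes` would return an empty list.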

10. View controller status

cluster1::> system controller show
Controller Name           System ID     Serial Number     Model    Status      
------------------------- ------------- ----------------- -------- ----------- 
cluster1-01               536964819     451515******      FAS8040  ok
cluster1-02               536961600     451515******      FAS8040  ok
2 entries were displayed.

11. View failed disks

cluster1::> storage disk show -broken 
There are no entries matching your query.

12. View spare disks

cluster1::> storage disk show -spare  
Original Owner: cluster1-01                                           
  Checksum Compatibility: block
                                                            Usable Physical
    Disk            HA Shelf Bay Chan   Pool  Type    RPM     Size     Size Owner
    --------------- ------------ ---- ------ ----- ------ -------- -------- --------
    1.30.11         3a    30  11    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.30.13         3a    30  13    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.31.4          3a    31   4    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.32.20         4b    32  20    B  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.32.23         3a    32  23    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.33.0          3a    33   0    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.33.1          3a    33   1    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.33.10         4b    33  10    B  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    2.42.22         3a    42  22    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
    2.42.23         4b    42  23    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
    2.43.2          4b    43   2    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
    2.43.22         3b    43  22    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
    2.43.23         4b    43  23    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
    3.11.21         4b    11  21    B  Pool0   SSD      -  372.4GB  372.6GB cluster1-01
    4.20.21         3a    20  21    A  Pool1   SSD      -  372.4GB  372.6GB cluster1-01
    4.21.14         3a    21  14    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
Original Owner: cluster1-02
  Checksum Compatibility: block
                                                            Usable Physical
    Disk            HA Shelf Bay Chan   Pool  Type    RPM     Size     Size Owner
    --------------- ------------ ---- ------ ----- ------ -------- -------- --------
    2.44.23         3b    44  23    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-02
    3.12.21         4a    12  21    B  Pool0   SSD      -  372.4GB  372.6GB cluster1-02
    4.23.21         3b    23  21    A  Pool1   SSD      -  372.4GB  372.6GB cluster1-02
    5.60.23         3b    60  23    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-02
20 entries were displayed.
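With a mirrored configuration it can be useful to know how many spares each node has per pool and disk type, rather than just the total. The sketch below tallies them, assuming each disk row has the eleven whitespace-separated fields shown above; `count_spares` is a hypothetical helper.

```python
import re
from collections import Counter

DISK_NAME = re.compile(r"^\d+\.\d+\.\d+$")  # e.g. 1.30.11


def count_spares(text):
    """Count spare disks per (owner, pool, type) from `storage disk show -spare` text."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        # Header rows also have 11 fields, so require a disk-style name first.
        if len(parts) == 11 and DISK_NAME.match(parts[0]):
            owner, pool, disk_type = parts[10], parts[5], parts[6]
            counts[(owner, pool, disk_type)] += 1
    return counts
```

A key that is missing from the result, such as a node with no SAS spare in one of the two pools, is then easy to spot.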

13. Check SAS bridge faults

cluster1::> storage bridge show
                                       Is        Monitor
Bridge                   Symbolic Name Monitored Status  Vendor Model                 Bridge WWN
------------------------ ------------- --------- ------- ------ --------------------- ----------------
ATTO_10.0.15.17          BRIDGE_B_1
                                       true      ok      Atto   FibreBridge 6500N     2000001086627bc0
ATTO_10.0.15.18          BRIDGE_B_2
                                       true      ok      Atto   FibreBridge 6500N     2000001086630f0e
ATTO_10.0.15.19          BRIDGE_B_3
                                       true      ok      Atto   FibreBridge 6500N     2000001086630edc
ATTO_10.0.15.20          BRIDGE_B_4
                                       true      ok      Atto   FibreBridge 6500N     2000001086630ed2
ATTO_10.0.15.6           BRIDGE_A_1
                                       true      ok      Atto   FibreBridge 6500N     2000001086630eb4
ATTO_10.0.15.7           BRIDGE_A_2
                                       true      ok      Atto   FibreBridge 6500N     2000001086630efa
ATTO_10.0.15.8           BRIDGE_A_3
                                       true      ok      Atto   FibreBridge 6500N     2000001086630f18
ATTO_10.0.15.9           BRIDGE_A_4
                                       true      ok      Atto   FibreBridge 6500N     2000001086630ef0
ATTO_FibreBridge6500N_10 -
                                       false     -       Atto   FibreBridge6500N      200000108663e514
ATTO_FibreBridge6500N_11 -
                                       false     -       Atto   FibreBridge6500N      200000108663e3f2
ATTO_FibreBridge6500N_12 -
                                       false     -       Atto   FibreBridge6500N      200000108663e488
ATTO_FibreBridge6500N_13 -
                                       false     -       Atto   FibreBridge6500N      20000010866114ec
ATTO_FibreBridge6500N_14 -
                                       false     -       Atto   FibreBridge6500N      2000001086627bc0
ATTO_FibreBridge6500N_7  -
                                       false     -       Atto   FibreBridge6500N      2000001086630e96
ATTO_FibreBridge6500N_9  -
                                       false     -       Atto   FibreBridge6500N      200000108663e4c4
15 entries were displayed.
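Each bridge occupies two lines in this output, and entries that are not monitored show `-` for their status, so they are easy to overlook. The following sketch pairs the two lines of each record and reports unmonitored bridges and repeated WWNs. It assumes the two-line layout shown above, and the function names are hypothetical.

```python
from collections import Counter


def parse_bridge_show(text):
    """Return one dict per bridge: name, monitored, status, wwn."""
    bridges = []
    name = None
    in_table = False
    for line in text.splitlines():
        if line.startswith("-----"):
            in_table = True
            continue
        if not in_table or not line.strip():
            continue
        parts = line.split()
        if not line[0].isspace():
            # First line of a record: bridge name and symbolic name.
            name = parts[0] if len(parts) == 2 else None
        elif name and parts[0] in ("true", "false"):
            # Second line: monitored flag, status, vendor, model, WWN.
            bridges.append({
                "name": name,
                "monitored": parts[0] == "true",
                "status": parts[1],
                "wwn": parts[-1],
            })
            name = None
    return bridges


def unmonitored(bridges):
    return [b["name"] for b in bridges if not b["monitored"]]


def duplicate_wwns(bridges):
    counts = Counter(b["wwn"] for b in bridges)
    return sorted(w for w, n in counts.items() if n > 1)
```

In the listing above, WWN 2000001086627bc0 appears under two different bridge names, which is the kind of repetition `duplicate_wwns` is meant to surface; whether that indicates a stale entry would need to be checked on the system itself.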

14. Check Fibre Channel switch faults

cluster1::> storage switch show
                      Symbolic                                Is        Monitor
Switch                Name     Vendor  Model Switch WWN       Monitored Status
--------------------- -------- ------- ----- ---------------- --------- -------
Brocade_10.0.15.10
                      SW_A_1
                               Brocade Brocade6505
                                             100050eb1a88327f true      ok
Brocade_10.0.15.11
                      SW_A_2
                               Brocade Brocade6505
                                             100050eb1a881582 true      ok
Brocade_10.0.15.21
                      SW_B_3
                               Brocade Brocade6505
                                             100050eb1a882f69 true      ok
Brocade_10.0.15.22
                      SW_B_4
                               Brocade Brocade6505
                                             100050eb1a881522 true      ok
4 entries were displayed.

15. View failover status

cluster1::> storage failover show 
                              Takeover          
Node           Partner        Possible State Description  
-------------- -------------- -------- -------------------------------------
cluster1-01    cluster1-02    true     Connected to cluster1-02
cluster1-02    cluster1-01    true     Connected to cluster1-01
2 entries were displayed.

16. View critical and error event logs

cluster1::> event log show -severity critical 
There are no entries matching your query.

cluster1::> event log show -severity error
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
3/6/2018 02:28:30   cluster1-02      ERROR         asup.post.drop: AutoSupport message (HA Group Notification from cluster1-02 (MANAGEMENT_LOG) INFO) for host (0) was not posted to NetApp. The system will drop the message.
3/6/2018 01:28:18   cluster1-02      ERROR         asup.post.drop: AutoSupport message (HA Group Notification from cluster1-02 (PERFORMANCE DATA) INFO) for host (0) was not posted to NetApp. The system will drop the message.
3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) cluster1, Serial Number 5589765F, Certificate Authority 'cluster1' and type server for Vserver cluster1 has expired.
3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UC_SVM2, Serial Number 55A03966, Certificate Authority 'SVM2' and type server for Vserver SVM2 has expired.
3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UC_SVM, Serial Number 559FFD76, Certificate Authority 'SVM' and type server for Vserver SVM has expired.
3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UCS_SVM_DR, Serial Number 545845C16E278, Certificate Authority 'SVM_DR' and type server for Vserver SVM_DR-mc has expired.
3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UCS_SVM2_DR, Serial Number 545845A7B01FA, Certificate Authority 'SVM2_DR' and type server for Vserver SVM2_DR-mc has expired.
7 entries were displayed.
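Error logs tend to repeat the same event many times, so a per-event count is often easier to review than the raw list. A small sketch, assuming each entry starts with a timestamp, node, severity and an event name followed by a colon, as in the output above (`summarize_events` is a hypothetical name):

```python
import re
from collections import Counter

# timestamp, node, severity, then "event.name:" at the start of the message
EVENT = re.compile(
    r"^\d+/\d+/\d+\s+\d+:\d+:\d+\s+(?P<node>\S+)\s+(?P<severity>\S+)\s+(?P<event>[\w.]+):"
)


def summarize_events(text):
    """Count log entries per (node, event name)."""
    counts = Counter()
    for line in text.splitlines():
        m = EVENT.match(line)
        if m:
            counts[(m.group("node"), m.group("event"))] += 1
    return counts
```

Applied to the seven entries above, this would reduce the list to two distinct events on cluster1-02: `asup.post.drop` and `mgmtgwd.certificate.expired`.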

17. View the status of volumes in a given aggregate

cluster1::> vol show -aggregate aggr_data_A1

18. View LUN information and LUN details

cluster1::> lun show
cluster1::> lun show -v

19. View mapping (igroup) information and details

cluster1::> igroup show
cluster1::> igroup show -v

20. View LUN mappings

cluster1::> lun show -m

21. Enter the shell of a specific node

cluster1::> run -node cluster1-01 
Type 'exit' or 'Ctrl-D' to return to the CLI
cluster1-01>

22. View spare disks from the node

cluster1-01> vol status -s

Local spares

Pool1 spare disks

RAID Disk       Device                  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------                  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           SW_B_3:6.126L41         3a    21  14  FC:A   1   SAS 10000 1142352/2339537408 1144641/2344225968 (not zeroed)
spare           SW_B_3:7.126L75         3a    42  22  FC:A   1   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_B_3:7.126L101        3b    43  22  FC:A   1   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_B_4:7.126L76         4b    42  23  FC:B   1   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_B_4:7.126L29         4b    43  2   FC:B   1   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_B_4:7.126L50         4b    43  23  FC:B   1   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_B_3:6.126L22         3a    20  21  FC:A   1   SSD   N/A 381304/780910592  381554/781422768 

Pool0 spare disks

RAID Disk       Device                  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------                  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           SW_A_1:7.126L12         3a    30  11  FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_1:7.126L14         3a    30  13  FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_1:7.126L31         3a    31  4   FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_1:7.126L76         3a    32  23  FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_1:7.126L79         3a    33  0   FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_1:7.126L80         3a    33  1   FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_2:7.126L73         4b    32  20  FC:B   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_2:7.126L37         4b    33  10  FC:B   0   SAS 10000 1142352/2339537408 1144641/2344225968 
spare           SW_A_2:6.126L74         4b    11  21  FC:B   0   SSD   N/A 381304/780910592  381554/781422768

23. View failed disks from the node

cluster1-01> vol status -f

Broken disks (empty)

24. Show disks that have no ownership

cluster1-01> disk show -n

disk show : No unassigned disks

25. Assign disk ownership (commonly used when replacing a disk)

cluster1-01> disk assign all 

26. View location information for all disks

cluster1-01> storage show disk -p 
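The read-only cluster-level commands from this section can also be collected in one pass for a periodic inspection report. The sketch below shells out to the system `ssh` client, one command per connection. The target `admin@cluster1`, the choice of commands and the helper names are assumptions for illustration, and it presumes that SSH access to the cluster management interface is already set up.

```python
import subprocess

# Read-only clustershell commands taken from this section.
INSPECTION_COMMANDS = [
    "system health status show",
    "cluster show",
    "metrocluster show",
    "metrocluster node show",
    "storage failover show",
    "storage disk show -broken",
    "event log show -severity critical",
]


def run_over_ssh(target, command, timeout=60):
    """Run one command through the local ssh client and return its stdout."""
    result = subprocess.run(
        ["ssh", target, command],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return result.stdout


def collect(target, commands=INSPECTION_COMMANDS, runner=run_over_ssh):
    """Return {command: output}; `runner` can be swapped out for testing."""
    return {cmd: runner(target, cmd) for cmd in commands}


if __name__ == "__main__":
    # "admin@cluster1" is a placeholder target, not taken from a real system.
    for cmd, output in collect("admin@cluster1").items():
        print(f"===== {cmd} =====")
        print(output)
```

Keeping the runner injectable means the collected text can be handed to whatever parsing or archiving step is preferred without touching the SSH part.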

