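I. MAD
Multi-Active Detection (MAD). An IRF link failure can split one IRF into several new IRFs. These IRFs share the same Layer 3 configuration, such as IP addresses, which causes address conflicts and spreads the fault through the network. To improve system availability, a mechanism is needed that detects when multiple IRFs coexist in the network after a split and handles the situation so as to minimize the impact on services. MAD is that detection and handling mechanism. It provides the following functions: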
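(1) Split detection
MAD uses ARP (Address Resolution Protocol), ND (Neighbor Discovery Protocol), LACP (Link Aggregation Control Protocol), or BFD (Bidirectional Forwarding Detection) to detect whether multiple IRFs exist in the network.
(2) Collision handling
After an IRF splits, the split-detection mechanism lets each IRF discover that other IRFs in the Active state (meaning the IRF is working normally) exist in the network.
· With BFD MAD, ARP MAD, or ND MAD, collision handling directly keeps the IRF whose Master has the lowest member ID in the Active state, where it continues to work normally; the other IRFs move to the Recovery state.
· With LACP MAD, collision handling first compares the number of member devices in each IRF: the IRF with more members stays Active and keeps working, and the IRF with fewer members moves to the Recovery state. If the member counts are equal, the IRF whose Master has the lowest member ID stays Active and the other IRFs move to the Recovery state.
When an IRF moves to the Recovery state, it shuts down all physical ports on all of its member devices except the reserved ports (the ports shut down are typically service interfaces), so that the IRF can no longer forward service traffic. By default, only the IRF physical ports are reserved; users can reserve additional ports with the mad exclude interface command.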
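The collision-handling rules above can be sketched in a few lines of Python. This is only an illustrative model, not H3C code: the IrfFragment class, its field names, and the elect_active function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IrfFragment:
    """One IRF left over after a split (hypothetical model, not an H3C API)."""
    master_member_id: int  # member ID of this fragment's Master
    member_count: int      # number of member devices in this fragment


def elect_active(fragments, method):
    """Return (active, recovery_list) according to the rules described above.

    method is "bfd", "arp", "nd" or "lacp".
    """
    if not fragments:
        raise ValueError("at least one IRF fragment is required")
    if method in ("bfd", "arp", "nd"):
        # The fragment whose Master has the lowest member ID wins outright.
        key = lambda f: f.master_member_id
    elif method == "lacp":
        # More members wins; a tie falls back to the lowest Master member ID.
        key = lambda f: (-f.member_count, f.master_member_id)
    else:
        raise ValueError(f"unknown MAD method: {method}")
    active = min(fragments, key=key)
    recovery = [f for f in fragments if f is not active]
    return active, recovery
```

For example, with a one-member fragment led by member 1 and a three-member fragment led by member 2, the "bfd" rule keeps the first fragment Active while the "lacp" rule keeps the second.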
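(3) MAD failure recovery
An IRF link failure splits the IRF and thereby causes the multi-Active collision. Repairing the failed IRF link so that the conflicting IRFs merge back into one IRF therefore clears the MAD failure. If the Active IRF suffers another failure before the MAD failure is cleared, you can first re-enable the Recovery-state IRF from the command line so that it takes over from the original IRF and services are affected as little as possible, and then clear the MAD failure.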
II. BFD
Bidirectional Forwarding Detection (BFD). If MAD is the mechanism, BFD is a means of detection.
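(I) How BFD MAD detection works
BFD MAD detection is implemented with the BFD protocol. For it to work, you must enable BFD MAD detection on a Layer 3 interface and also configure MAD IP addresses on that interface. A MAD IP address differs from an ordinary IP address in that it is bound to a member device: one must be configured for every member device in the IRF, and all members' MAD IP addresses must be in the same subnet.
· While the IRF runs normally, only the MAD IP address configured for the Master takes effect; the MAD IP addresses configured for the Slave devices do not, so the BFD session is down. (Use the display bfd session command to check the session state: a Session State of Up means the session is active, and Down means it is down.)
· When the IRF splits into multiple IRFs, the MAD IP addresses configured for the Masters of the different IRFs all take effect, the BFD session comes up, and a multi-Active collision is detected.
· Once a multi-Active collision is detected, the IRF whose Master has the lowest member ID stays Active and keeps working normally; each other IRF reports a MAD collision event to its IRF module, which moves that IRF to the MAD Recovery state.
Figure 6 BFD MAD interaction flow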
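The relationship between Master roles, MAD IP addresses, and the BFD session state described above can be modeled as follows. This is a simplified sketch: master_of is a hypothetical dictionary mapping each member ID to the member ID of the Master of the IRF it currently belongs to, and the model assumes the BFD link itself is healthy.

```python
def mad_ip_active(member_id, master_of):
    """A member's MAD IP takes effect only while that member is a Master."""
    return master_of[member_id] == member_id


def bfd_session_state(local, peer, master_of):
    """Return "Up" when the MAD IPs at both ends are in effect, else "Down"."""
    if mad_ip_active(local, master_of) and mad_ip_active(peer, master_of):
        return "Up"
    return "Down"


def multi_active_detected(master_of):
    """More than one distinct Master means a multi-Active collision."""
    masters = {m for member, m in master_of.items() if member == m}
    return len(masters) > 1
```

With master_of = {1: 1, 2: 1} (one IRF, member 1 is Master) the session between members 1 and 2 stays Down; with {1: 1, 2: 2} (split, both are Masters) it comes Up and the collision is detected.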
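(II) BFD MAD networking requirements
In the BFD MAD topology shown in Figure 7, an intermediate device is used and every member device must connect to it; these BFD links are dedicated to MAD detection. (As the comparison table below notes, the intermediate device is required when the IRF has more than two members.) The interfaces on these links must belong to the same VLAN, and in that VLAN's interface view each member device is assigned a different IP address in the same subnet.
On the interface used for BFD MAD detection, configure MAD IP addresses with the mad ip address command only; do not configure any other IP address (including ordinary addresses set with the ip address command and VRRP virtual IP addresses), so that MAD detection is not affected.
Figure 7 BFD MAD network diagram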
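The addressing rules above (one MAD IP per member, all in the same subnet, each one different) are easy to check before configuring the devices. The sketch below uses Python's standard ipaddress module; the check_mad_ip_plan function and the plan format are hypothetical, and the /24 prefix length in the usage note is an assumption, since the article does not state the mask.

```python
import ipaddress


def check_mad_ip_plan(plan):
    """Validate a {member_id: "ip/prefix"} MAD IP plan.

    Returns a list of problem descriptions; an empty list means the plan
    satisfies the two rules checked here.
    """
    problems = []
    interfaces = {m: ipaddress.ip_interface(a) for m, a in plan.items()}

    # Rule 1: every member's MAD IP must be in the same subnet.
    networks = {iface.network for iface in interfaces.values()}
    if len(networks) > 1:
        problems.append("MAD IP addresses are not all in the same subnet")

    # Rule 2: every member needs its own, distinct MAD IP.
    seen = {}
    for member, iface in sorted(interfaces.items()):
        if iface.ip in seen:
            problems.append(
                f"members {seen[iface.ip]} and {member} share {iface.ip}")
        else:
            seen[iface.ip] = member
    return problems
```

Calling check_mad_ip_plan({1: "192.168.1.1/24", 2: "192.168.1.2/24"}) returns an empty list, which matches the addresses that appear in the verification logs later in this article.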
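Differences between the four MAD detection methods supported by IRF:
IRF supports four MAD detection methods: LACP MAD, BFD MAD, ARP MAD, and ND MAD. Each has its own characteristics, and users can choose according to their existing network. Because LACP MAD handles collisions by a different rule from BFD MAD, ARP MAD, and ND MAD, do not configure LACP MAD together with any of the other three. BFD MAD, ARP MAD, and ND MAD work independently without interfering with one another and can be configured together.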
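MAD detection method | Advantages | Limitations
LACP MAD | Fast detection; works on the existing aggregation topology with no extra interfaces; the aggregate links carry both ordinary service traffic and MAD detection packets (extended LACP packets) | Requires an H3C device as the intermediate device; every member device must connect to the intermediate device
BFD MAD | Fairly fast detection; flexible topology; no requirements on other devices | When the stack has more than two devices, an intermediate device is required and every member device must connect to it; these BFD links are dedicated to MAD detection
ARP MAD | Suits non-aggregated IPv4 networks; used together with MSTP; needs no extra ports; when an intermediate device is used, there are no requirements on it | Slower detection than LACP MAD and BFD MAD
ND MAD | Suits non-aggregated IPv6 networks; used together with MSTP; needs no extra ports; when an intermediate device is used, there are no requirements on it | Slower detection than LACP MAD and BFD MAD
Table 1 Comparison of MAD detection mechanisms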
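The rule about which detection methods may be configured together can be expressed as a small check. This is a sketch only: the lowercase method names and the mad_combination_ok function are hypothetical labels, not CLI keywords.

```python
INDEPENDENT = {"bfd", "arp", "nd"}  # these may be enabled together


def mad_combination_ok(methods):
    """Return True if the given MAD methods may be configured together."""
    methods = set(methods)
    unknown = methods - INDEPENDENT - {"lacp"}
    if unknown:
        raise ValueError(f"unknown MAD method(s): {sorted(unknown)}")
    # LACP MAD resolves collisions by a different rule, so it must not be
    # combined with any of the other three methods.
    return not ("lacp" in methods and bool(methods & INDEPENDENT))
```

For instance, {"bfd", "arp", "nd"} and {"lacp"} are both acceptable, while {"lacp", "bfd"} is not.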
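III. MAD, BFD, and IRF split verification (important)
《1》 After a stack split, the devices do not reboot automatically; a device reboots only when it joins an existing stack.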
[H3C]interface FortyGigE 1/0/53
[H3C-FortyGigE1/0/53]SHUT
[H3C-FortyGigE1/0/53]%May 21 16:49:23:411 2020 H3C STM/3/STM_LINK_DOWN: IRF port 1 went down.
%May 21 16:49:23:413 2020 H3C DEV/3/BOARD_REMOVED: Board was removed from slot 2, type is H3C S5820V2-54Q.
%May 21 16:49:23:415 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface FortyGigE1/0/53 changed to down.
%May 21 16:49:23:428 2020 H3C LAGG/6/LAGG_INACTIVE_PHYSTATE: Member port GE2/0/2 of aggregation group BAGG10 changed to the inactive state, because the physical state of the port is down.
%May 21 16:49:23:428 2020 H3C LAGG/6/LAGG_INACTIVE_PHYSTATE: Member port GE2/0/3 of aggregation group BAGG1 changed to the inactive state, because the physical state of the port is down.
%May 21 16:49:23:446 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface FortyGigE1/0/53 changed to down.
%May 21 16:49:23:464 2020 H3C IFNET/3/IF_WARN: The jumboframe of the aggregate interface Bridge-Aggregation1 is not supported on the member port GigabitEthernet1/0/3
%May 21 16:49:23:505 2020 H3C BFD/5/BFD_CHANGE_FSM: Sess[192.168.1.1/192.168.1.2, LD/RD:129/129, Interface:Vlan4000, SessType:Ctrl, LinkType:INET], Ver:1, Sta: DOWN->INIT, Diag: 0 (No Diagnostic)
%May 21 16:49:23:798 2020 H3C SHELL/5/SHELL_LOGOUT: Console logged out from con1.
%May 21 16:49:23:879 2020 H3C IFNET/3/IF_WARN: The jumboframe of the aggregate interface Bridge-Aggregation10 is not supported on the member port GigabitEthernet1/0/2
%May 21 16:49:24:407 2020 H3C BFD/5/BFD_MAD_INTERFACE_CHANGE_STATE: BFD MAD function enabled on Vlan-interface4000 changed to the normal state.
%May 21 16:49:24:707 2020 H3C BFD/5/BFD_CHANGE_FSM: Sess[192.168.1.1/192.168.1.2, LD/RD:129/129, Interface:Vlan4000, SessType:Ctrl, LinkType:INET], Ver:1, Sta: INIT->UP, Diag: 0 (No Diagnostic)
%May 21 16:49:24:709 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface GigabitEthernet1/0/1 changed to down.
%May 21 16:49:24:710 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface GigabitEthernet1/0/1 changed to down.
%May 21 16:49:24:710 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface Vlan-interface4000 changed to down.
%May 21 16:49:24:710 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface Vlan-interface4000 changed to down.
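When reading a long capture like the one above, it can help to pull out only the BFD session state changes. The following sketch does that with a regular expression written against the BFD_CHANGE_FSM lines exactly as they appear in this capture; other software versions may format the message differently, and the bfd_transitions function name is hypothetical.

```python
import re

# Matches the BFD_CHANGE_FSM message layout seen in the capture above.
FSM_RE = re.compile(
    r"BFD_CHANGE_FSM: Sess\[(?P<local>[\d.]+)/(?P<peer>[\d.]+),.*?"
    r"Interface:(?P<ifname>[^,\]]+).*?Sta: (?P<old>\w+)->(?P<new>\w+)"
)


def bfd_transitions(log_text):
    """Return (local, peer, interface, old_state, new_state) per FSM line."""
    return [
        (m["local"], m["peer"], m["ifname"], m["old"], m["new"])
        for m in map(FSM_RE.search, log_text.splitlines())
        if m
    ]
```

Applied to the capture above, it yields two transitions for the session on Vlan4000 between 192.168.1.1 and 192.168.1.2: DOWN to INIT, then INIT to UP, which is the session activation that follows the split.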
Check the state of both devices:
Device A:
<H3C>dis irf
MemberID Role Priority CPU-Mac Description
*+1 Master 32 764d-ea56-0104 ---
--------------------------------------------------
* indicates the device is the master.
+ indicates the device through which the user logs in.
The bridge MAC of the IRF is: 764d-ea56-0100
Auto upgrade : yes
Mac persistent : 6 min
Domain ID : 0
Device B:
<H3C>dis irf
MemberID Role Priority CPU-Mac Description
*+2 Master 1 764e-0a91-0204 ---
--------------------------------------------------
* indicates the device is the master.
+ indicates the device through which the user logs in.
The bridge MAC of the IRF is: 764d-ea56-0100
Auto upgrade : yes
Mac persistent : 6 min
Domain ID : 0
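The two outputs above show the signature of a split: each device reports itself as Master, yet both report the same IRF bridge MAC (764d-ea56-0100). A rough way to spot this from collected display irf outputs is sketched below. The regular expressions are written against the output layout shown in this article only, and the function names are hypothetical.

```python
import re

# A member row looks like: "*+1 Master 32 764d-ea56-0104 ---"
ROW_RE = re.compile(
    r"^\s*[*+]*(\d+)\s+(\w+)\s+\d+\s+[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}",
    re.MULTILINE,
)
BRIDGE_RE = re.compile(r"bridge MAC of the IRF is:\s*([0-9a-f-]+)")


def summarize_display_irf(text):
    """Return (master_ids, bridge_mac) parsed from one display irf output."""
    masters = [int(mid) for mid, role in ROW_RE.findall(text) if role == "Master"]
    bridge = BRIDGE_RE.search(text)
    return masters, (bridge.group(1) if bridge else None)


def looks_like_split(outputs):
    """True if different Masters report the same IRF bridge MAC."""
    masters_by_mac = {}
    for text in outputs:
        masters, mac = summarize_display_irf(text)
        masters_by_mac.setdefault(mac, set()).update(masters)
    return any(mac and len(ids) > 1 for mac, ids in masters_by_mac.items())
```

Passing the outputs of Device A and Device B together makes looks_like_split return True, because Masters 1 and 2 both claim the bridge MAC 764d-ea56-0100.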
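《2》 The MAD detection link ensures that, in the dual-Active state after a stack split, only the IRF whose Master has the lower member ID keeps its service ports up; the other members enter the Recovery state.
Device A: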
<H3C>dis int brief
Brief information on interfaces in route mode:
Link: ADM - administratively down; Stby - standby
Protocol: (s) - spoofing
Interface Link Protocol Primary IP Description
InLoop0 UP UP(s) --
MGE0/0/0 DOWN DOWN --
NULL0 UP UP(s) --
REG0 UP -- --
Vlan1 UP UP 10.1.1.2
Vlan4000 DOWN DOWN 192.168.1.1
Device B:
<H3C>dis ip int brief
*down: administratively down
(s): spoofing (l): loopback
Interface Physical Protocol IP Address Description
MGE0/0/0 down down -- --
Vlan1 down down 10.1.1.2 --
Vlan4000 down down 192.168.1.2 --
<H3C>dis int vlan 1
Vlan-interface1
Current state: MAD ShutDown
Line protocol state: DOWN
Description: Vlan-interface1 Interface
Bandwidth: 100000 kbps
Maximum transmission unit: 1500
Internet address: 10.1.1.2/24 (primary)
IP packet frame type: Ethernet II, hardware address: 764d-ea56-0102
IPv6 packet frame type: Ethernet II, hardware address: 764d-ea56-0102
Last clearing of counters: Never
Last 300 seconds input rate: 1 bytes/sec, 8 bits/sec, 0 packets/sec
Last 300 seconds output rate: 0 bytes/sec, 0 bits/sec, 0 packets/sec
Input: 5 packets, 320 bytes, 0 drops
Output: 3 packets, 138 bytes, 0 drops
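《3》 The mad restore command is intended for devices configured with both MAD detection and IRF stacking whose stack link has gone down: if the primary device running services in the Active state then fails unexpectedly, the command lets the standby device in the Recovery state take over the services. (Before using the command, confirm that MAD detection is no longer in effect; otherwise the interfaces come up and then go down again after the command is entered.)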
[H3C]interface GigabitEthernet 2/0/1
[H3C-GigabitEthernet2/0/1]dis th
#
interface GigabitEthernet2/0/1
port link-mode bridge
port access vlan 4000
combo enable fiber
#
return
[H3C-GigabitEthernet2/0/1]shut
[H3C-GigabitEthernet2/0/1]qu
[H3C]mad restore
This command will restore the device from multi-active conflict state. Continue? [Y/N]:y
Restoring from multi-active conflict state, please wait...
%May 21 16:56:33:633 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface GigabitEthernet2/0/2 changed to up.
%May 21 16:56:33:634 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface GigabitEthernet2/0/3 changed to up.
[H3C]%May 21 16:56:33:638 2020 H3C LAGG/6/LAGG_ACTIVE: Member port GE2/0/2 of aggregation group BAGG10 changed to the active state.
%May 21 16:56:33:640 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface GigabitEthernet2/0/2 changed to up.
%May 21 16:56:33:641 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface Bridge-Aggregation10 changed to up.
%May 21 16:56:33:641 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface Bridge-Aggregation10 changed to up.
%May 21 16:56:33:642 2020 H3C LAGG/6/LAGG_ACTIVE: Member port GE2/0/3 of aggregation group BAGG1 changed to the active state.
%May 21 16:56:33:645 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface GigabitEthernet2/0/3 changed to up.
%May 21 16:56:33:646 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface Bridge-Aggregation1 changed to up.
%May 21 16:56:33:646 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface Bridge-Aggregation1 changed to up.
%May 21 16:56:33:660 2020 H3C IFNET/3/PHY_UPDOWN: Physical state on the interface Vlan-interface1 changed to up.
%May 21 16:56:33:660 2020 H3C IFNET/5/LINK_UPDOWN: Line protocol state on the interface Vlan-interface1 changed to up.
[H3C]
[H3C]dis ip int brief
*down: administratively down
(s): spoofing (l): loopback
Interface Physical Protocol IP Address Description
MGE0/0/0 down down -- --
Vlan1 up up 10.1.1.2 --
Vlan4000 down down 192.168.1.2 --
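This article draws on the following articles:
skytwen, H3C IRF MAD detection principles and verification of related issues, https://www.cnblogs.com/sky5hat/p/10481939.html
IRF MAD application models and technical analysis, https://www.h3c.com/cn/d_201510/922083_30005_0.htm
S7500E Virtualization Technology Configuration Guide, http://www.h3c.com/cn/d_201708/1018599_30005_0.htm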