Topic: Channel Bonding/bonding


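  EtherChannel was originally proposed by Cisco. It aggregates several physical links into a single logical link in order to provide high availability and higher throughput. PAgP (Port Aggregation Protocol, Cisco proprietary) and LACP (IEEE 802.3ad) are the two most widely used implementations. The Linux implementation is called Bonding; achieving HA requires close cooperation between Bonding at the system level and the switch at the physical level.

  Reference: http://www.mjmwired.net/kernel/Documentation/networking/bonding.txt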


 1. The 7 Bonding Modes


 mode=0 balance-rr

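  • Round-robin mode: packets are spread evenly across all slaves
  • Pros: offers the highest theoretical bandwidth of the 7 modes; if any slave fails, its share is split evenly among the remaining slaves
  • Cons: sending from different ports in turn easily causes out-of-order delivery, the peer then asks for retransmission, and throughput suffers; the switch must be configured with a port channel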

mode=1 active-backup

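  • Active/backup mode: a backup device becomes primary only when the primary device goes DOWN
  • Pros: no requirements on the switch; can be attached to any link
  • Cons: lowest device utilization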

mode=2 balance-xor

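  • XOR hash policy: all frames sent to the same destination MAC address go out through the same port, so when all traffic is headed for a single peer (for example, one gateway) it behaves like active-backup and adds no bandwidth
  • Pros: in a multi-switch environment it may deliver better throughput than balance-rr; Bonding itself does not cause reordering
  • Cons: no gain when all traffic goes to a single peer; the switch must be configured with a port channel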

mode=3 broadcast

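  • Every packet is copied N times and sent out of every port at once
  • Pros: offers the best fault tolerance; there is none of the packet loss that other modes show while switching ports (the application sees no downtime), which suits fields with very high stability requirements such as finance
  • Cons: consumes N times the network bandwidth, which hurts overall throughput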

mode=4 802.3ad

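  • IEEE standard: any peer that implements 802.3ad can cooperate effectively
  • Pros: the switch usually needs only a little configuration; frames are delivered in order, so reordering does not normally occur
  • Cons: relatively few devices support 802.3ad; all slaves are usually required to have the same speed and duplex mode; as with every mode other than balance-rr, no single connection can use more than one interface's worth of bandwidth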

mode=5 balance-tlb

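  • Balances outgoing traffic across the slaves; suited to multi-switch environments
  • Pros: no special switch configuration needed; where traffic is not funneled through a single router, it balances with a better algorithm than XOR; slaves may run at different speeds
  • Cons: cannot balance incoming traffic; does not support ARP monitoring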

mode=6 balance-alb

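  • An improvement on mode=5 that load-balances incoming IPv4 traffic through ARP negotiation
  • Pros: everything mode=5 offers, plus incoming load balancing
  • Cons: the benefit is significant only in large cluster environments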

  To sum up, the commonly used modes are balance-rr (mode=0), active-backup (mode=1), broadcast (mode=3) and balance-alb (mode=6); balance-alb can be seen as an improved version of balance-xor and balance-tlb.
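
  The mode numbers and names listed in this section are interchangeable when written to sysfs, so a small lookup helper can be handy in scripts. The following is only a sketch; the function name bond_mode_name is made up for illustration.

```shell
# Map a numeric bonding mode (0-6) to its name, following the list in
# this section. bond_mode_name is a hypothetical helper name.
bond_mode_name() {
    case "$1" in
        0) echo balance-rr ;;
        1) echo active-backup ;;
        2) echo balance-xor ;;
        3) echo broadcast ;;
        4) echo 802.3ad ;;
        5) echo balance-tlb ;;
        6) echo balance-alb ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

  For example, bond_mode_name 6 prints balance-alb, and an unknown number makes the function return a non-zero status.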


 2. Bonding Management: create, change, destroy, monitor


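   The three most common ways to manage Bonding are iproute2, sysfs, and the distribution's network configuration files. The first two are more portable, since they work across different Linux distributions, and they are the focus of this article.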

Creating Bonds:

  • modprobe bonding (optional; loading the module creates a master named bond0 by default)
  • ip link add bond0 type bond
  • OR
  • echo +bond0 > /sys/class/net/bonding_masters

Show all existing bonds:

  • ip link | grep MASTER
  • OR
  • cat /sys/class/net/bonding_masters

Show the status of bonds:

  • cat /proc/net/bonding/bondX
  • NOTE: Each bonding device has a read-only file residing in the /proc/net/bonding directory. The file contents include information about the bonding configuration, options and state of each slave. Notice all slaves of bond0 have the same MAC address (HWaddr) as bond0 for all modes except TLB and ALB that require a unique MAC address for each slave.
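
  The status file can also be summarized from a script. The sketch below assumes the file uses the "Slave Interface:" and "MII Status:" labels commonly printed by the bonding driver; the labels may differ between kernel versions, and the function name bond_slave_status is made up for illustration.

```shell
# Print "<slave> <MII status>" for each slave listed in a bonding status file.
# $1: path to the status file (defaults to /proc/net/bonding/bond0).
# The field labels matched here are an assumption about the file layout.
bond_slave_status() {
    awk '
        /^Slave Interface:/ { slave = $3 }
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

  The bond-level "MII Status:" line appears before the first slave section, so it is skipped and only per-slave states are printed.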

Remove an existing bond:

  • ip link del bond0
  • OR
  • echo -bond0 > /sys/class/net/bonding_masters

  NOTE: due to 4K size limitation of sysfs files, this list may be truncated if you have more than a few hundred bonds. This is unlikely to occur under normal operating conditions.

Adding Slaves:

  • ip link set bond0 up
  • ip link set eth0 down
  • ip link set eth0 nomaster
  • ip link set dev eth0 master bond0
  • OR
  • echo +eth0 > /sys/class/net/bond0/bonding/slaves

Show all slaves belong to bond0:

  • ip link | grep -P 'master.*bond0'
  • OR
  • cat /sys/class/net/bond0/bonding/slaves

Free slave eth0 from bond bond0:

  • ip link set eth0 nomaster
  • OR
  • echo -eth0 > /sys/class/net/eth0/master/bonding/slaves
  • #The two operations above will free eth0 from whatever bond it is enslaved to, regardless of the name of the bond interface
  • OR
  • echo -eth0 > /sys/class/net/bond0/bonding/slaves

  NOTE: When an interface is enslaved to a bond, symlinks between the two are created in the sysfs filesystem. In this case, you would get /sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and /sys/class/net/eth0/master pointing to /sys/class/net/bond0. This means that you can tell quickly whether or not an interface is enslaved by looking for the master symlink.
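
  The master symlink check described in the note can be wrapped in a helper. This is a minimal sketch; the function name iface_master and the optional second argument (an alternative sysfs directory, useful for testing) are assumptions made for illustration.

```shell
# Print the bond that an interface is enslaved to, or return 1 if it has none.
# $1: interface name; $2: sysfs class directory (defaults to /sys/class/net).
iface_master() {
    link="${2:-/sys/class/net}/$1/master"
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}
```

  For example, "iface_master eth0 || echo 'eth0 is free'" reports whether eth0 currently belongs to a bond.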

Changing a Bond's Configuration

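  Each bond may be configured individually by manipulating the files located in /sys/class/net/<bond name>/bonding. Most options are changed by echoing a new value into the corresponding file; list-valued files such as arp_ip_target and slaves instead take a "+" or "-" prefix to add or remove an entry. This gives the same result as setting the option in a configuration file or as a module parameter.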

To configure bond0 for balance-alb mode:

  • ip link set bond0 down
  • echo 6 > /sys/class/net/bond0/bonding/mode
  • OR
  • echo balance-alb > /sys/class/net/bond0/bonding/mode
  • NOTE: The bond interface must be down before the mode can be changed

To enable MII monitoring on bond0 with a 1 second interval:

  • echo 1000 > /sys/class/net/bond0/bonding/miimon
  • NOTE: If ARP monitoring is enabled, it will be disabled when MII monitoring is enabled, and vice versa

To add ARP targets:

  • echo +192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target
  • echo +192.168.0.101 > /sys/class/net/bond0/bonding/arp_ip_target
  • NOTE: up to 16 target addresses may be specified
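
  When several targets are managed from a script, it can help to generate the sysfs writes first and review them before applying. The sketch below only prints the commands and enforces the 16-target limit from the note; the function name arp_target_cmds is made up for illustration.

```shell
# Print the sysfs writes that would add each given ARP target to a bond.
# $1: bond name; remaining arguments: target IPv4 addresses (at most 16).
# Nothing is written to sysfs; the output is meant to be reviewed first.
arp_target_cmds() {
    bond=$1
    shift
    if [ "$#" -gt 16 ]; then
        echo "too many ARP targets: $# (maximum is 16)" >&2
        return 1
    fi
    for ip in "$@"; do
        echo "echo +$ip > /sys/class/net/$bond/bonding/arp_ip_target"
    done
}
```

  The printed lines can be piped to a root shell once they look right. The ARP monitor only probes these targets when arp_interval is set to a non-zero value.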

To remove an ARP target:

  • echo -192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target

To configure the interval between learning packet transmits:

  • echo 12 > /sys/class/net/bond0/bonding/lp_interval
  • NOTE: lp_interval is the number of seconds between instances where the bonding driver sends learning packets to each slave's peer switch. The default interval is 1 second. lp_interval has effect only in balance-alb and balance-tlb modes

Script Example:

#!/usr/bin/env bash
modprobe bonding
bG=bond0
iP="10.1.7.66/24"
ip link del "$bG" 2>/dev/null # deleting $bG also releases any slaves it may have
echo +"$bG" > /sys/class/net/bonding_masters # create $bG
ip link set "$bG" down # make sure $bG is down
############################################################
for x in {0..2} # clear the initial IP and route information of the target ports
do
    ip addr flush dev eth"$x"
    ip route flush dev eth"$x"
done
############################################################
PS3="Select bonding mode:"
select i in "balance-rr" "active-backup" "balance-xor" "broadcast" "802.3ad" "balance-tlb" "balance-alb"
# choose the bonding mode interactively; to run at boot, replace this with a fixed mode
do
    [ -n "$i" ] || continue # an invalid choice leaves $i empty, so ask again
    echo "$i" > /sys/class/net/"$bG"/bonding/mode
    break
done
############################################################
echo 1000 > /sys/class/net/"$bG"/bonding/miimon # example: set a monitoring parameter
############################################################
for x in {0..2} # enslave the ports
do
    echo +eth"$x" > /sys/class/net/"$bG"/bonding/slaves
done
############################################################
ip link set "$bG" up # bring up the bonding device and add the IP
ip addr add "$iP" dev "$bG"
############################################################

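Starting at boot:

  Boot-time startup scripts differ between distributions

  • rhel5 and rhel6 series: configure it in /etc/rc.d/rc.local
  • gentoo_openrc: create a foo.start script in the /etc/local.d/ directory
  • systemd environments (such as debian, rhel7, gentoo_systemd, suse12): write a foo.service file, place it in /etc/systemd/system/, and run systemctl enable foo.service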
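
  For the systemd case, a unit file could look roughly like the one printed by the sketch below. The script path /usr/local/sbin/bond-setup.sh, the description text and the function name print_bond_unit are all hypothetical; ordering relative to other network services is deliberately left out and would need to be decided per system.

```shell
# Print a oneshot unit that runs a bonding setup script once at boot.
# $1: absolute path of the setup script (the default is a made-up path).
print_bond_unit() {
    script=${1:-/usr/local/sbin/bond-setup.sh}
    cat <<EOF
[Unit]
Description=Example bonding setup

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=$script

[Install]
WantedBy=multi-user.target
EOF
}
```

  One way to use it: print_bond_unit > /etc/systemd/system/foo.service, then systemctl enable foo.service.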

 3. Switch Configuration


   "switch" refers to whatever system the bonded devices are directly connected to (i.e., where the other end of the cable plugs into). This may be an actual dedicated switch device, or it may be another regular system (e.g., another computer running Linux). The active-backup, balance-tlb and balance-alb modes do not require any specific configuration of the switch.

  The 802.3ad mode requires that the switch have the appropriate ports configured as an 802.3ad aggregation. The precise method used to configure this varies from switch to switch, but, for example, a Cisco 3550 series switch requires that the appropriate ports first be grouped together in a single etherchannel instance, then that etherchannel is set to mode "lacp" to enable 802.3ad (instead of standard EtherChannel).

  The balance-rr, balance-xor and broadcast modes generally require that the switch have the appropriate ports grouped together. The nomenclature for such a group differs between switches; it may be called an "etherchannel" (as in the Cisco example, above), a "trunk group" or some other similar variation. For these modes, each switch will also have its own configuration options for the switch's transmit policy to the bond. Typical choices include XOR of either the MAC or IP addresses. The transmit policy of the two peers does not need to match. For these three modes, the bonding mode really selects a transmit policy for an EtherChannel group; all three will interoperate with another EtherChannel group.


 4. Link Monitoring


   The bonding driver at present supports two schemes for monitoring a slave device's link state: the ARP monitor and the MII monitor. At the present time, due to implementation restrictions in the bonding driver itself, it is not possible to enable both ARP and MII monitoring simultaneously.

  Miimon only checks for the device's carrier state. It has no way to determine the state of devices on or beyond other ports of a switch, or if a switch is refusing to pass traffic while still maintaining carrier on.
  Load the bonding driver before any network drivers participating in a bond.
  When bonding is configured, it is important that the slave devices not have routes that supersede routes of the master (or, generally, not have routes at all).
  The ARP monitor (and ARP itself) may become confused when slaves carry such routes, because ARP requests (generated by the ARP monitor) will be sent on one interface (bond0 etc.), but the corresponding reply will arrive on a different interface (eth0 etc.). This reply looks to ARP as an unsolicited ARP reply (because ARP matches replies on an interface basis), and is discarded. The MII monitor is not affected by the state of the routing table.
  Ensure that slaves do not have routes of their own, and if for some reason they must, those routes do not supersede routes of their master. This should generally be the case, but unusual configurations or errant manual or automatic static route additions may cause trouble.
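
  One way to look for stray slave routes is to filter the routing table by output device. The sketch below reads "ip route show" style text on standard input and prints the routes bound to any of the named slaves; the function name slave_routes is made up for illustration.

```shell
# Read `ip route show` output on stdin and print every route whose output
# device (the word after "dev") is one of the slave names given as arguments.
slave_routes() {
    awk -v slaves="$*" '
        BEGIN {
            n = split(slaves, s, " ")
            for (i = 1; i <= n; i++) is_slave[s[i]] = 1
        }
        {
            for (i = 1; i < NF; i++)
                if ($i == "dev" && ($(i + 1) in is_slave)) { print; break }
        }
    '
}
```

  For example, "ip route show | slave_routes eth0 eth1" should print nothing on a cleanly configured host.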


 5. Promiscuous Mode


   When running network monitoring tools, e.g., tcpdump, it is common to enable promiscuous mode on the device, so that all traffic is seen (instead of seeing only traffic destined for the local host). The bonding driver handles promiscuous mode changes to the bonding master device (e.g., bond0), and propagates the setting to the slave devices.

  • For the balance-rr, balance-xor, broadcast, and 802.3ad modes, the promiscuous mode setting is propagated to all slaves.
  • For the active-backup, balance-tlb and balance-alb modes, the promiscuous mode setting is propagated only to the active slave.
  • For balance-tlb mode, the active slave is the slave currently receiving inbound traffic.
  • For balance-alb mode, the active slave is the slave used as a "primary." This slave is used for mode-specific control traffic, for sending to peers that are unassigned or if the load is unbalanced.
  • For the active-backup, balance-tlb and balance-alb modes, when the active slave changes (e.g., due to a link failure), the promiscuous setting will be propagated to the new active slave.

 6. Configuring Bonding for High Availability


   High Availability refers to configurations that provide maximum network availability by having redundant or backup devices, links or switches between the host and the rest of the world. The goal is to provide the maximum availability of network connectivity (i.e., the network always works), even though other configurations could provide higher throughput.

  High Availability in a Multiple Switch Topology(interface && switch):

  In multiple switch topologies, there is a trade off between network availability and usable bandwidth.

HA Bonding Mode Selection for Multiple Switch Topology:

  • active-backup: This is generally the preferred mode, particularly if the switches have an ISL and play together well. If the network configuration is such that one switch is specifically a backup switch (e.g., has lower capacity, higher cost, etc.), then the primary option can be used to ensure that the preferred link is always used when it is available.
  • broadcast: This mode is really a special purpose mode, and is suitable only for very specific needs. For example, if the two switches are not connected (no ISL), and the networks beyond them are totally independent. In this case, if it is necessary for some specific one-way traffic to reach both independent networks, then the broadcast mode may be suitable.

  HA Link Monitoring Selection for Multiple Switch Topology:
  The choice of link monitoring ultimately depends upon your switch. If the switch can reliably fail ports in response to other failures, then either the MII or ARP monitors should work.
  In general, however, in a multiple switch topology, the ARP monitor can provide a higher level of reliability in detecting end to end connectivity failures (which may be caused by the failure of any individual component to pass traffic for any reason). Additionally, the ARP monitor should be configured with multiple targets (at least one for each switch in the network). This will ensure that, regardless of which switch is active, the ARP monitor has a suitable target to query.


7. Maximum Throughput (MT) Bonding Mode Selection for Single Switch Topology


  • balance-rr: This mode is the only mode that will permit a single TCP/IP connection to stripe traffic across multiple interfaces. It is therefore the only mode that will allow a single TCP/IP stream to utilize more than one interface's worth of throughput. This comes at a cost, however: the striping generally results in peer systems receiving packets out of order, causing TCP/IP's congestion control system to kick in, often by retransmitting segments. This mode requires the switch to have the appropriate ports configured for "etherchannel" or "trunking."
  • active-backup: There is not much advantage in this network topology to the active-backup mode, as the inactive backup devices are all connected to the same peer as the primary. In this case, a load balancing mode (with link monitoring) will provide the same level of network availability, but with increased available bandwidth. On the plus side, active-backup mode does not require any configuration of the switch, so it may have value if the hardware available does not support any of the load balance modes.
  • balance-xor: This mode will limit traffic such that packets destined for specific peers will always be sent over the same interface. Since the destination is determined by the MAC addresses involved, this mode works best in a "local" network configuration (one where most traffic stays between stations on the same local network). This mode is likely to be suboptimal if all your traffic is passed through a single router (i.e., a "gatewayed" network configuration). As with balance-rr, the switch ports need to be configured for "etherchannel" or "trunking."
  • broadcast: Like active-backup, there is not much advantage to this mode in this type of network topology.
  • 802.3ad: This mode can be a good choice for this type of network topology. The 802.3ad mode is an IEEE standard, so all peers that implement 802.3ad should interoperate well. The 802.3ad protocol includes automatic configuration of the aggregates, so minimal manual configuration of the switch is needed (typically only to designate that some set of devices is available for 802.3ad). The 802.3ad standard also mandates that frames be delivered in order (within certain limits), so in general single connections will not see misordering of packets. The 802.3ad mode does have some drawbacks: the standard mandates that all devices in the aggregate operate at the same speed and duplex. Also, as with all bonding load balance modes other than balance-rr, no single connection will be able to utilize more than a single interface's worth of bandwidth. Additionally, the Linux bonding 802.3ad implementation distributes traffic by peer (using an XOR of MAC addresses and packet type ID), so in a "gatewayed" configuration, all outgoing traffic will generally use the same device. Incoming traffic may also end up on a single device, but that is dependent upon the balancing policy of the peer's 802.3ad implementation. In a "local" configuration, traffic will be distributed across the devices in the bond. Finally, the 802.3ad mode mandates the use of the MII monitor; therefore, the ARP monitor is not available in this mode.
  • balance-tlb: The balance-tlb mode balances outgoing traffic by peer. Since the balancing is done according to MAC address, in a "gatewayed" configuration (one where most traffic passes through a single router), this mode will send all traffic across a single device. However, in a "local" network configuration, this mode balances multiple local network peers across devices in a vaguely intelligent manner (not a simple XOR as in balance-xor or 802.3ad mode), so that mathematically unlucky MAC addresses (i.e., ones that XOR to the same value) will not all "bunch up" on a single interface. Unlike 802.3ad, interfaces may be of differing speeds, and no special switch configuration is required. On the down side, in this mode all incoming traffic arrives over a single interface, this mode requires certain ethtool support in the network device driver of the slave interfaces, and the ARP monitor is not available.
  • balance-alb: This mode is everything that balance-tlb is, and more. It has all of the features (and restrictions) of balance-tlb, and will also balance incoming traffic from local network peers. The only additional down side to this mode is that the network device driver must support changing the hardware address while the device is open.
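
  The "by peer" behaviour of the XOR-based modes can be illustrated with a few lines of arithmetic. The sketch below is a deliberate simplification: it XORs only the last octet of the source and destination MAC addresses and takes the result modulo the slave count, ignoring the packet type ID and any other hash policy the driver may apply. The function name xor_slave_index and the MAC addresses in the usage note are made up for illustration.

```shell
# Simplified layer-2 XOR selection (an illustration, not the driver's code):
# (last octet of source MAC XOR last octet of destination MAC) modulo the
# number of slaves. $1: source MAC; $2: destination MAC; $3: slave count.
xor_slave_index() {
    src=${1##*:}
    dst=${2##*:}
    echo $(( (0x$src ^ 0x$dst) % $3 ))
}
```

  With two slaves, xor_slave_index 00:11:22:33:44:55 00:aa:bb:cc:dd:66 2 prints 1 and xor_slave_index 00:11:22:33:44:55 00:aa:bb:cc:dd:67 2 prints 0, so two local peers can land on different slaves. In a "gatewayed" setup the destination MAC is always the router's, so the result never changes and all traffic leaves through one slave.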

  MT Link Monitoring for Single Switch Topology:
  The choice of link monitoring may largely depend upon which mode you choose to use. The more advanced load balancing modes do not support the use of the ARP monitor, and are thus restricted to using the MII monitor (which does not provide as high a level of end to end assurance as the ARP monitor).


8. Maximum Throughput (MT) Bonding Mode Selection for Multiple Switch Topology


   In actual practice, the bonding mode typically employed in configurations of this type is balance-rr.

  MT Link Monitoring for Multiple Switch Topology:
  Again, in actual practice, the MII monitor is most often used in this configuration, as performance is given preference over availability. The ARP monitor will function in this topology, but its advantages over the MII monitor are mitigated by the volume of probes needed as the number of systems involved grows (remember that each host in the network is configured with bonding).


 9. FAQ


 Where does a bonding device get its MAC address from?

  • When using slave devices that have fixed MAC addresses, or when the fail_over_mac option is enabled, the bonding device's MAC address is the MAC address of the active slave. For other configurations, if not explicitly configured (with ifconfig or ip link), the MAC address of the bonding device is taken from its first slave device. This MAC address is then passed to all following slaves and remains persistent (even if the first slave is removed) until the bonding device is brought down or reconfigured. You can change the MAC address with ip link.

Which switches/systems does it work with?

  • The full answer to this depends upon the desired mode. In the basic balance modes (balance-rr and balance-xor), it works with any system that supports etherchannel (also called trunking). Most managed switches currently available have such support, and many unmanaged switches as well. The advanced balance modes (balance-tlb and balance-alb) do not have special switch requirements, but do need device drivers that support specific features. In 802.3ad mode, it works with systems that support IEEE 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged switches currently available support 802.3ad. The active-backup mode should work with any Layer-II switch.

What happens when a slave link dies?

  • If link monitoring is enabled, then the failing device will be disabled. The active-backup mode will fail over to a backup link, and other modes will ignore the failed link. The link will continue to be monitored, and should it recover, it will rejoin the bond (in whatever manner is appropriate for the mode). Link monitoring can be enabled via either the miimon or arp_interval parameters. In general, miimon monitors the carrier state as sensed by the underlying network device, and the arp monitor (arp_interval) monitors connectivity to another host on the local network. If no link monitoring is configured, the bonding driver will be unable to detect link failures, and will assume that all links are always available. This will likely result in lost packets, and a resulting degradation of performance. The precise performance loss depends upon the bonding mode and network configuration.
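
  Since an unmonitored bond cannot detect link failures, it may be worth checking that one of the two monitors is switched on. The sketch below reads the miimon and arp_interval files under the bond's sysfs directory and treats a value greater than 0 as "enabled"; the function name bond_has_monitoring and the optional second argument (an alternative sysfs directory, useful for testing) are assumptions made for illustration.

```shell
# Return 0 if a bond has MII or ARP monitoring enabled, 1 otherwise.
# $1: bond name; $2: sysfs class directory (defaults to /sys/class/net).
# A missing or unreadable file is treated as 0 (monitor disabled).
bond_has_monitoring() {
    dir="${2:-/sys/class/net}/$1/bonding"
    miimon=$(cat "$dir/miimon" 2>/dev/null || echo 0)
    arp=$(cat "$dir/arp_interval" 2>/dev/null || echo 0)
    [ "${miimon:-0}" -gt 0 ] || [ "${arp:-0}" -gt 0 ]
}
```

  For example, "bond_has_monitoring bond0 || echo 'bond0 has no link monitoring'" prints a warning when both values are 0.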

