A network failure caused by bonding: a troubleshooting record


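The 3 key servers in this case:
Physical server: 192.168.6.63, referred to as P (Physical server)
KVM-VM: 192.168.6.150, a KVM virtual machine running on physical server P, referred to as VM
NAS: an external NAS server, used as the target for ping/ARP tests, referred to as NAS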

Configuration of physical server P:

#uname -a
Linux cz63 4.15.18-11-pve #1 SMP PVE 4.15.18-34 (Mon, 25 Feb 2019 14:51:06 +0100) x86_64 GNU/Linux

#cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto enp3s0f0
iface enp3s0f0 inet manual

auto enp3s0f1
iface enp3s0f1 inet manual

auto ens1f0
iface ens1f0 inet manual

auto ens1f1
iface ens1f1 inet manual

auto bond0
iface bond0 inet manual
	bond-slaves enp3s0f0 enp3s0f1
	bond-miimon 100
	bond-mode balance-rr

auto bond1
iface bond1 inet manual
	bond-slaves ens1f0 ens1f1
	bond-miimon 100
	bond-mode balance-rr

auto vmbr0
iface vmbr0 inet static
	address  192.168.6.63
	netmask  255.255.255.0
	gateway  192.168.6.1
	bridge-ports bond0
	bridge-stp off
	bridge-fd 0

auto vmbr1
iface vmbr1 inet static
	address  10.1.1.63
	netmask  255.255.255.0
	bridge-ports bond1
	bridge-stp off
	bridge-fd 0
#brctl show
bridge name	bridge id		STP enabled	interfaces
vmbr0		8000.ac1f6b342094	no		bond0
							tap401000001i0
vmbr1		8000.74a4b500e768	no		bond1

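After I installed the KVM VM, its connection to external servers was very unstable: about 90% of the time, ping reported the target as unreachable. I tested CentOS, Ubuntu, and Windows 7 guests, and all of them were equally unstable.
Inside the VM, run ping 192.168.6.40 (the NAS).
The ping fails.
While the ping is running, execute the following on P:
tcpdump -leni vmbr0 arp
tcpdump -leni tap401000001i0 arp
The results were as follows: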

On P:
#tcpdump -leni vmbr0 arp | grep 2a:f0:5f:ae:c9:8b
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
00:50:56:87:86:b9 > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.40 is-at 00:50:56:87:86:b9, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
00:50:56:87:86:b9 > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.40 is-at 00:50:56:87:86:b9, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
00:50:56:87:86:b9 > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.40 is-at 00:50:56:87:86:b9, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
00:50:56:87:86:b9 > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.40 is-at 00:50:56:87:86:b9, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
00:50:56:87:86:b9 > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.40 is-at 00:50:56:87:86:b9, length 46

On P:
#tcpdump -leni tap401000001i0  arp | grep 2a:f0:5f:ae:c9:8b
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 46
2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28

On the NAS:
# tcpdump -leni vmx0 arp | grep c9:8b
02:03:50.930907 2a:f0:5f:ae:c9:8b > 00:50:56:87:86:b9, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.40 tell 192.168.6.150, length 46
02:03:50.930923 00:50:56:87:86:b9 > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 42: Reply 192.168.6.40 is-at 00:50:56:87:86:b9, length 28
02:04:02.669823 2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.32 tell 192.168.6.150, length 46
02:04:02.670131 00:50:56:87:e3:6d > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.32 is-at 00:50:56:87:e3:6d, length 46
02:04:03.670770 2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.32 tell 192.168.6.150, length 46
02:04:03.671059 00:50:56:87:e3:6d > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.32 is-at 00:50:56:87:e3:6d, length 46
02:04:04.672736 2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.32 tell 192.168.6.150, length 46
02:04:04.672992 00:50:56:87:e3:6d > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.32 is-at 00:50:56:87:e3:6d, length 46
02:04:06.671878 2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.32 tell 192.168.6.150, length 46
02:04:06.672021 00:50:56:87:e3:6d > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.32 is-at 00:50:56:87:e3:6d, length 46
02:04:07.674726 2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.32 tell 192.168.6.150, length 46
02:04:07.674773 00:50:56:87:e3:6d > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.32 is-at 00:50:56:87:e3:6d, length 46
02:04:08.676733 2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.32 tell 192.168.6.150, length 46
02:04:08.676868 00:50:56:87:e3:6d > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.32 is-at 00:50:56:87:e3:6d, length 46
02:04:10.673678 2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has 192.168.6.32 tell 192.168.6.150, length 46
02:04:10.674026 00:50:56:87:e3:6d > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.32 is-at 00:50:56:87:e3:6d, length 46

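On vmbr0, every round shows 2 request packets and 1 reply packet.
On tap401000001i0, only request packets appear, their lengths are inconsistent (42 and 60 bytes), and no reply packet is seen.
Everything looks normal on the NAS.
What we are seeing is this: before the VM can send ICMP, it must know the NAS's MAC address, so it first sends an ARP broadcast to obtain it.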

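  • vmbr0 on P is abnormal: it sees 2 requests and 1 reply
  • vmbr0 on P is abnormal: the reply it receives is not forwarded to tap401000001i0
    Because of these 2 problems, the VM never receives the reply, so ICMP cannot proceed and the ping fails.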
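
Comparing captures by eye is error-prone, so the request/reply tally can also be computed from the tcpdump text. The following is only a minimal sketch, assuming the `tcpdump -leni <iface> arp` line format shown above; the function name `arp_summary` is made up for this illustration and is not part of any tool.

```shell
#!/bin/sh
# Count ARP requests and replies in tcpdump text read from stdin.
# Hypothetical helper: the name arp_summary is an assumption for illustration.
arp_summary() {
    awk '
        / Request who-has / { req++ }
        / Reply /           { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}

# Example: one request and one reply copied from the vmbr0 capture above.
printf '%s\n' \
  '2a:f0:5f:ae:c9:8b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.6.40 (ff:ff:ff:ff:ff:ff) tell 192.168.6.150, length 28' \
  '00:50:56:87:86:b9 > 2a:f0:5f:ae:c9:8b, ethertype ARP (0x0806), length 60: Reply 192.168.6.40 is-at 00:50:56:87:86:b9, length 46' \
  | arp_summary
```

Feeding it a saved capture from vmbr0 and another from tap401000001i0 should make the mismatch obvious: replies are counted on the bridge, while the tap device stays at replies=0.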

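The problem is located at the data link layer, in the ARP protocol.
On the VM, run arping -c 10 192.168.6.40
The symptoms are unchanged, which confirms the 2 observations above once more.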

# brctl show
bridge name	bridge id		STP enabled	interfaces
vmbr0		8000.ac1f6b342094	no		bond0
							tap401000001i0
vmbr1		8000.74a4b500e768	no		bond1
# brctl showstp  vmbr0
vmbr0
 bridge id		8000.ac1f6b342094
 designated root	8000.ac1f6b342094
 root port		   0			path cost		   0
 max age		  20.00			bridge max age		  20.00
 hello time		   2.00			bridge hello time	   2.00
 forward delay		   0.00			bridge forward delay	   0.00
 ageing time		 300.00
 hello timer		   0.00			tcn timer		   0.00
 topology change timer	   0.00			gc timer		   4.18
 flags


bond0 (1)
 port id		8001			state		     forwarding
 designated root	8000.ac1f6b342094	path cost		   4
 designated bridge	8000.ac1f6b342094	message age timer	   0.00
 designated port	8001			forward delay timer	   0.00
 designated cost	   0			hold timer		   0.00
 flags

tap401000001i0 (2)
 port id		8002			state		     forwarding
 designated root	8000.ac1f6b342094	path cost		 100
 designated bridge	8000.ac1f6b342094	message age timer	   0.00
 designated port	8002			forward delay timer	   0.00
 designated cost	   0			hold timer		   0.00
 flags

# brctl showmacs vmbr0
port no	mac addr		is local?	ageing timer
  1	2a:f0:5f:ae:c9:8b	no		   1.13
  1	00:50:56:87:e3:6d	no		   0.00
# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
# ebtables -L
Bridge table: filter

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 0, policy: ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
# ip rule show
0:	from all lookup local
32766:	from all lookup main
32767:	from all lookup default
# ip route show table all
default via 192.168.6.1 dev vmbr0 onlink
10.1.1.0/24 dev vmbr1 proto kernel scope link src 10.1.1.63
192.168.6.0/24 dev vmbr0 proto kernel scope link src 192.168.6.63
broadcast 10.1.1.0 dev vmbr1 table local proto kernel scope link src 10.1.1.63
local 10.1.1.63 dev vmbr1 table local proto kernel scope host src 10.1.1.63
broadcast 10.1.1.255 dev vmbr1 table local proto kernel scope link src 10.1.1.63
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1
broadcast 192.168.6.0 dev vmbr0 table local proto kernel scope link src 192.168.6.63
local 192.168.6.63 dev vmbr0 table local proto kernel scope host src 192.168.6.63
broadcast 192.168.6.255 dev vmbr0 table local proto kernel scope link src 192.168.6.63
# bridge vlan show
port	vlan ids
bond0	 1 PVID Egress Untagged

vmbr0	 1 PVID Egress Untagged

bond1	 1 PVID Egress Untagged

vmbr1	 1 PVID Egress Untagged

tap401000001i0	 1 PVID Egress Untagged

Everything above looked normal and I found nothing wrong in the output. Where should the investigation go from here?
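
One detail in the output above may deserve a second look, offered here as a hypothesis rather than a confirmed diagnosis. In the `brctl showstp` output, bond0 is port 1 and tap401000001i0 is port 2, yet `brctl showmacs` lists the VM's MAC 2a:f0:5f:ae:c9:8b on port 1. That would be consistent with the VM's own broadcast coming back in through the bond (the second, 60-byte request seen on vmbr0), the bridge relearning the VM's MAC on bond0, and then refusing to forward the reply to the tap device. The sketch below looks up the port for a MAC in `brctl showmacs` text; the function name `fdb_port_of` is an assumption for illustration.

```shell
#!/bin/sh
# Print the bridge port number on which a MAC address was learned,
# reading `brctl showmacs <bridge>` output from stdin.
# Hypothetical helper: the name fdb_port_of is an assumption for illustration.
fdb_port_of() {
    awk -v mac="$1" 'tolower($2) == tolower(mac) { print $1 }'
}

# Example using the table shown above (whitespace simplified).
printf '%s\n' \
  'port no  mac addr           is local?  ageing timer' \
  '  1      2a:f0:5f:ae:c9:8b  no         1.13' \
  '  1      00:50:56:87:e3:6d  no         0.00' \
  | fdb_port_of 2a:f0:5f:ae:c9:8b
```

On a live host the same function could be fed with `brctl showmacs vmbr0 | fdb_port_of 2a:f0:5f:ae:c9:8b`. A VM MAC that shows up on the bond's port instead of the tap's port is worth investigating.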

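I googled for an afternoon plus an evening, and nothing I found solved it.
In the end I came across this explanation of Linux bonding:
https://forum.huawei.com/enterprise/zh/thread-282727.html
Then I looked at P's NICs again: they are clearly running mode0!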
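
On Linux, the mode a bond is actually running can be read from /proc/net/bonding/<bond>, which contains a "Bonding Mode:" line. The sketch below extracts that line; the function name `bond_mode_from_proc` is an assumption for illustration, and the sample input is a shortened stand-in for the real file.

```shell
#!/bin/sh
# Extract the bonding mode description from /proc/net/bonding/<bond> text on stdin.
# Hypothetical helper: the name bond_mode_from_proc is an assumption for illustration.
bond_mode_from_proc() {
    sed -n 's/^Bonding Mode: //p'
}

# On a real host:  bond_mode_from_proc < /proc/net/bonding/bond0
# Here a shortened sample stands in for the file so the snippet runs anywhere.
printf '%s\n' \
  'Bonding Mode: load balancing (round-robin)' \
  'MII Status: up' \
  | bond_mode_from_proc
```

For balance-rr (mode0) the kernel reports the mode as "load balancing (round-robin)".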
Then I went and looked at the switch. Aha!

interface GigabitEthernet0/0/5
 port link-type access
 port default vlan 6
#
interface GigabitEthernet0/0/6
 port link-type access
 port default vlan 6
#
interface GigabitEthernet0/0/7
 port link-type access
 port default vlan 6
#
interface GigabitEthernet0/0/8
 port link-type access
 port default vlan 6

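This looked like the problem: all four ports are plain access ports in VLAN 6, with no link aggregation configured on the switch.
I then ran ifdown ens1f1, and the problem went away.
The VM now communicates normally with all external networks, with no symptoms at all.
Reading the article on the Huawei site confirmed that balance-rr requires matching changes on the switch.
I did not want to bother the network admin, so I changed the bond to mode6 myself.
Everything has been calm since.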
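
The relationship between bonding mode and switch configuration can be summarized in a small lookup. This is a sketch based on the switch-configuration notes in the Linux kernel bonding documentation; the function name `bond_switch_requirement` and the wording of the messages are assumptions for illustration.

```shell
#!/bin/sh
# Map a Linux bonding mode (name or number) to the switch-side setup it expects.
# Hypothetical helper: the name bond_switch_requirement is an assumption for illustration.
bond_switch_requirement() {
    case "$1" in
        0|balance-rr|2|balance-xor|3|broadcast)
            echo "ports must be grouped on the switch (static link aggregation)" ;;
        4|802.3ad)
            echo "ports must be configured as an 802.3ad (LACP) aggregation" ;;
        1|active-backup|5|balance-tlb|6|balance-alb)
            echo "no specific switch configuration required" ;;
        *)
            echo "unknown mode" ;;
    esac
}

bond_switch_requirement balance-rr   # the mode P was using
bond_switch_requirement 6            # the mode P was changed to
```

This matches what happened here: balance-rr on P with ungrouped access ports on the switch is the first case, and switching to mode6 moves the bond into the group that needs nothing from the switch.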

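The problem is only solved provisionally; I still do not understand the mechanism behind the symptoms.
Back in 2009 I studied the principles and configuration of Windows teaming and Linux bonding in detail, but I only knew the OS side and had no idea what the switch side should look like, so there was always a gap in my knowledge. I fell into this pit in both 2017 and 2018, but where would I find 2 physical servers and a physical switch to practice on? This time I finally filled in both the OS and the switch configuration.
Use mode6 whenever you can; failing that, use mode4.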

The following notes are for my own reference:

brctl show							# show bridge information
brctl showstp  vmbr0				# show STP information for vmbr0 (forwarding state, etc.)
brctl showmacs vmbr0 | grep c9:8b	# look up a MAC in the bridge's MAC address table
tcpdump -leni vmbr0 arp				# debug ARP packets on vmbr0
tcpdump -leni vmbr0 icmp			# debug ICMP packets on vmbr0
iptables -A FORWARD -i vmbr0 -o vmbr0 -j ACCEPT
iptables -L		# list iptables rules
ebtables -L		# list the data-link-layer (bridge) tables
arping -c 10 192.168.1.1			# ARP ping
ip rule show
ip route show table local
ip route show table all
bridge monitor			# monitor FDB updates
bridge vlan show		# show the bridge's VLAN information

ping -c 1 -I veth1 192.168.3.1		# ping through a specific interface

/proc/sys/net/bridge/bridge-nf-call-iptables	# what is this for?
/proc/sys/net/ipv4/ip_forward					# packet forwarding between multiple NICs

sysctl -a | grep bridge
What do 0 and 1 mean for net.bridge.bridge-nf-call-arptables??

echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore		# what does this mean??
echo 8 > /proc/sys/net/ipv4/conf/eth0/arp_announce		# what does this mean??

