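0. Test environment
Hardware: the same four-node OpenStack deployment as before; see http://www.cnblogs.com/sammyliu/p/4190843.html
OpenStack configuration:
- tenants: three tenants: demo, tenant-one, tenant-two
- network: the three tenants share one public network; each tenant has its own subnet and a router that connects that subnet to the public network
- VMs: three VMs, one in tenant-one and two in tenant-two, all on the compute node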
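1. Network components on the Neutron node
Using the same method as in http://www.cnblogs.com/sammyliu/p/4201143.html, draw the diagram of the network components on the Neutron node:
Observations:
(1). The roles of the three Neutron agents:
- Neutron-OVS-Agent: receives the tunnel and tunnel-flow configuration from the OVS plugin and drives OVS to set up the GRE tunnels
- Neutron-DHCP-Agent: configures dnsmasq for every network/subnet that has DHCP enabled, and writes the MAC address/IP address information into the dnsmasq DHCP lease file
- Neutron-L3-Agent: sets up the iptables, routing, and NAT tables
(2). The Neutron node also has the OVS tunnel bridge br-tun and the OVS integration bridge br-int. In addition it has br-ex, which provides the external network connection and is bound to the physical NIC eth0. One problem that shows up here is that the IP address of eth0 can no longer be pinged. The reason is that a physical Ethernet NIC that is part of an Open vSwitch bridge cannot have an IP address; if it has one, the address has no effect at all. The fix suggested by OVS is to bind the IP address to an Open vSwitch “internal” device (here br-ex) to restore network access: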
ifconfig eth0 0.0.0.0
ifconfig br-ex 192.168.1.19
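(3). Neutron uses Linux network namespaces to isolate the tenants' networks from each other. In this example each of the three networks has a qrouter namespace and a qdhcp namespace, six in total; each namespace holds its own interfaces, routing tables, iptables rules, and so on.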
root@network:/home/s1# ip netns
qdhcp-d24963da-5221-481e-adf5-fe033d6e0b4e
qrouter-e506f8fe-3260-4880-bd06-32246225aeae
qdhcp-d04a0a06-7206-4d05-9432-3443843bc199
qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1
qrouter-9ba04071-f32b-435e-8f44-e32936568102
qdhcp-0a4cd030-d951-401a-8202-937b788bea43
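To see at a glance how the namespaces split into DHCP and router namespaces, the listing can be grouped by prefix. This is only a minimal Python sketch: the names are pasted in from the output above, and the helper name group_namespaces is made up for illustration. The suffix of a qdhcp namespace is the network ID and the suffix of a qrouter namespace is the router ID.

```python
from collections import defaultdict

# Namespace names copied from the `ip netns` output above.
IP_NETNS_OUTPUT = """\
qdhcp-d24963da-5221-481e-adf5-fe033d6e0b4e
qrouter-e506f8fe-3260-4880-bd06-32246225aeae
qdhcp-d04a0a06-7206-4d05-9432-3443843bc199
qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1
qrouter-9ba04071-f32b-435e-8f44-e32936568102
qdhcp-0a4cd030-d951-401a-8202-937b788bea43
"""


def group_namespaces(text):
    """Group namespace names by prefix (hypothetical helper, not a Neutron API)."""
    groups = defaultdict(list)
    for name in text.split():
        # "qdhcp-<net id>" or "qrouter-<router id>": split at the first dash.
        kind, _, uuid = name.partition("-")
        groups[kind].append(uuid)
    return dict(groups)


if __name__ == "__main__":
    for kind, ids in group_namespaces(IP_NETNS_OUTPUT).items():
        print(kind, len(ids), ids)
```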
(4). Neutron assigns every network a local VLAN ID and a DHCP network namespace. The DHCP server in that namespace is attached to br-int through a tap device whose tag is the network's local VLAN ID, so the ports H1/H2/H3 carry different VLAN IDs.
# Tags of the DHCP namespace ports on br-int when several networks exist
Port "tap0f45d165-9f"
    tag: 5
    Interface "tap0f45d165-9f"
        type: internal
Port "tap89874f55-97"
    tag: 4
    Interface "tap89874f55-97"
        type: internal
Port "tap5522533d-fe"
    tag: 3
    Interface "tap5522533d-fe"
        type: internal
Port "tap56c9730c-9c"
    tag: 4095
    Interface "tap56c9730c-9c"
        type: internal
Port "tap1fd04a93-09"
    tag: 4095
    Interface "tap1fd04a93-09"
        type: internal
Port "tap777c1047-ed"
    tag: 2
    Interface "tap777c1047-ed"
        type: internal
Port "tap3fca96e0-c6"
    tag: 1
    Interface "tap3fca96e0-c6"
        type: internal
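The port-to-tag mapping is easier to inspect as a dictionary. The Python sketch below parses text in the style of the listing above; the helper name port_tags is invented here. It also singles out tag 4095, which, as far as I know, the OVS agent uses as a "dead VLAN" tag for ports it does not consider usable. Treat that interpretation as an assumption rather than something this output proves.

```python
import re

# Port/tag pairs copied from the br-int listing above.
OVS_SHOW = """\
Port "tap0f45d165-9f"
    tag: 5
Port "tap89874f55-97"
    tag: 4
Port "tap5522533d-fe"
    tag: 3
Port "tap56c9730c-9c"
    tag: 4095
Port "tap1fd04a93-09"
    tag: 4095
Port "tap777c1047-ed"
    tag: 2
Port "tap3fca96e0-c6"
    tag: 1
"""

# Assumption: 4095 is the agent's dead-VLAN tag, not a real local VLAN.
DEAD_VLAN_TAG = 4095


def port_tags(text):
    """Map each port name to its VLAN tag (hypothetical helper)."""
    tags = {}
    port = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r'Port "?([^"]+)"?$', line)
        if m:
            port = m.group(1)
            continue
        m = re.match(r"tag: (\d+)$", line)
        if m and port is not None:
            tags[port] = int(m.group(1))
    return tags


if __name__ == "__main__":
    for name, tag in sorted(port_tags(OVS_SHOW).items(), key=lambda kv: kv[1]):
        note = " (dead VLAN?)" if tag == DEAD_VLAN_TAG else ""
        print(name, tag, note)
```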
1.1 br-tun OpenFlow rules
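A quick refresher on MAC addresses:
- A MAC address is the 48-bit (6-byte) address used at Ethernet layer 2 to identify a device. It has two parts: the first 24 bits are the Organizationally Unique Identifier (OUI) and the last 24 bits are assigned by the vendor. It is usually written as six hexadecimal bytes, for example XX-XX-XX-XX-XX-XX.
- Broadcast address: FF:FF:FF:FF:FF:FF
- Multicast address: the least significant bit of the first byte is 1. For example, 01:80:C2:00:00:00 is a multicast address that denotes the 802.1D bridge group. Bridges use this address to exchange configuration information with each other, run the distributed spanning tree algorithm, and eliminate loops in the network topology.
- Unicast address: the least significant bit of the first byte is 0. Every NIC is assigned a unique unicast address at the factory: the first 24 bits are the manufacturer's identifier, assigned by the IEEE (Institute of Electrical and Electronics Engineers), and the last 24 bits are the unique number the manufacturer gives the NIC. For example, 8C-70-5A-29-3A-48 is a unicast address (8C = 10001100).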
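The three cases above come down to one bit test on the first byte. A small Python sketch (the function name classify_mac is my own, not from any library):

```python
def classify_mac(mac):
    """Return 'broadcast', 'multicast' or 'unicast' for a MAC address string.

    Accepts both XX-XX-XX-XX-XX-XX and XX:XX:XX:XX:XX:XX notation.
    """
    octets = [int(part, 16) for part in mac.replace("-", ":").split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError("not a 48-bit MAC address: %r" % mac)
    if all(o == 0xFF for o in octets):
        return "broadcast"
    # The least significant bit of the first byte separates
    # multicast (1) from unicast (0).
    return "multicast" if octets[0] & 0x01 else "unicast"


if __name__ == "__main__":
    for mac in ("FF:FF:FF:FF:FF:FF", "01:80:C2:00:00:00", "8C-70-5A-29-3A-48"):
        print(mac, classify_mac(mac))
```

Broadcast is checked first because FF:FF:FF:FF:FF:FF also has the multicast bit set.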
root@network:/home/s1# ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=33.236s, table=0, n_packets=0, n_bytes=0, idle_age=33, priority=1,in_port=1 actions=resubmit(,2) // traffic arriving on port 1 (the patch port from br-int) goes to table 2
cookie=0x0, duration=32.131s, table=0, n_packets=0, n_bytes=0, idle_age=32, priority=1,in_port=2 actions=resubmit(,3) // traffic arriving on the GRE port goes to table 3
cookie=0x0, duration=33.178s, table=0, n_packets=6, n_bytes=480, idle_age=24, priority=0 actions=drop
cookie=0x0, duration=33.121s, table=2, n_packets=0, n_bytes=0, idle_age=33, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20) // unicast destination address: go to table 20
cookie=0x0, duration=33.066s, table=2, n_packets=0, n_bytes=0, idle_age=33, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22) // multicast (including broadcast) destination address: go to table 22
cookie=0x0, duration=30.614s, table=3, n_packets=0, n_bytes=0, idle_age=30, priority=1,tun_id=0x1 actions=mod_vlan_vid:1,resubmit(,10) // traffic of tunnel 1: set the VLAN ID to 1, then go to table 10
cookie=0x0, duration=29.291s, table=3, n_packets=0, n_bytes=0, idle_age=29, priority=1,tun_id=0x2 actions=mod_vlan_vid:3,resubmit(,10) // traffic of tunnel 2: set the VLAN ID to 3, then go to table 10
cookie=0x0, duration=30.241s, table=3, n_packets=0, n_bytes=0, idle_age=30, priority=1,tun_id=0x3 actions=mod_vlan_vid:2,resubmit(,10) // traffic of tunnel 3: set the VLAN ID to 2, then go to table 10
cookie=0x0, duration=33.001s, table=3, n_packets=0, n_bytes=0, idle_age=33, priority=0 actions=drop
cookie=0x0, duration=32.932s, table=4, n_packets=0, n_bytes=0, idle_age=32, priority=0 actions=drop
cookie=0x0, duration=32.874s, table=10, n_packets=0, n_bytes=0, idle_age=32, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1 // learn a new rule and add it to table 20, then output to port 1, into br-int
cookie=0x0, duration=32.815s, table=20, n_packets=0, n_bytes=0, idle_age=32, priority=0 actions=resubmit(,22) // go to table 22
cookie=0x0, duration=29.35s, table=22, n_packets=0, n_bytes=0, idle_age=29, dl_vlan=3 actions=strip_vlan,set_tunnel:0x2,output:2
cookie=0x0, duration=30.293s, table=22, n_packets=0, n_bytes=0, idle_age=30, dl_vlan=2 actions=strip_vlan,set_tunnel:0x3,output:2
cookie=0x0, duration=30.682s, table=22, n_packets=0, n_bytes=0, idle_age=30, dl_vlan=1 actions=strip_vlan,set_tunnel:0x1,output:2 // the three rules above: based on the packet's VLAN ID, set the tunnel ID, strip the VLAN tag, and send the packet out the GRE port, through the GRE tunnel to the compute node
cookie=0x0, duration=32.752s, table=22, n_packets=0, n_bytes=0, idle_age=32, priority=0 actions=drop
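The learn action in table 10 is the hardest rule to read. My understanding is that for every packet arriving from a tunnel it installs a "reverse" rule in table 20, so that later unicast traffic to that packet's source MAC is sent back out the port it came from, with the same tunnel ID. The Python sketch below spells that out; the dict-based packet and flow representation is invented for illustration and is not an OVS data structure, and the MAC address in the example is hypothetical.

```python
def learn_flow(packet):
    """Build the table-20 flow that the table-10 learn action would install.

    `packet` is a plain dict with the keys vlan, eth_src, tun_id and in_port
    (an illustrative model only).
    """
    return {
        "table": 20,
        "priority": 1,
        "hard_timeout": 300,
        "match": {
            # NXM_OF_VLAN_TCI[0..11]: same VLAN ID as the learned packet
            "vlan": packet["vlan"],
            # NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[]: match the learned source MAC
            # as the destination MAC
            "eth_dst": packet["eth_src"],
        },
        "actions": [
            ("load_vlan_tci", 0),              # load:0->NXM_OF_VLAN_TCI[]
            ("set_tunnel", packet["tun_id"]),  # load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[]
            ("output", packet["in_port"]),     # output:NXM_OF_IN_PORT[]
        ],
    }


if __name__ == "__main__":
    # Hypothetical packet: arrived on GRE port 2 with tunnel ID 0x1,
    # already tagged with VLAN 1 by table 3.
    pkt = {"vlan": 1, "eth_src": "fa:16:3e:00:00:01", "tun_id": 0x1, "in_port": 2}
    print(learn_flow(pkt))
```

Because the learned rule has priority 1, it wins over the priority-0 rule in table 20 that resubmits to table 22. With a single GRE port, as in the dump above, both paths end at output port 2.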
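In summary, br-tun:
- tags traffic arriving on the GRE port with the corresponding VLAN ID and sends it to br-int
- strips the VLAN ID from traffic arriving from br-int (patch-int), sets the corresponding tunnel ID, and sends it out the GRE port to the compute node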
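The two directions can be modeled in a few lines of Python. This is a simplified sketch, not how OVS evaluates flows: the VLAN/tunnel pairs and port numbers are read off the table 3 and table 22 rules in the dump, learned table-20 entries are ignored, and the function name br_tun is made up.

```python
# tun_id -> local VLAN, from the table 3 rules; table 22 is the inverse.
VLAN_BY_TUNNEL = {0x1: 1, 0x2: 3, 0x3: 2}
TUNNEL_BY_VLAN = {vlan: tun for tun, vlan in VLAN_BY_TUNNEL.items()}
PATCH_PORT, GRE_PORT = 1, 2


def is_multicast(mac):
    """True if the least significant bit of the first byte is 1."""
    return int(mac.split(":")[0], 16) & 0x01 == 1


def br_tun(in_port, eth_dst, vlan=None, tun_id=None):
    """Return (tables visited, resulting action) for one packet."""
    if in_port == GRE_PORT:
        # table 0 -> table 3: translate the tunnel ID to a local VLAN
        if tun_id not in VLAN_BY_TUNNEL:
            return [0, 3], "drop"
        action = "mod_vlan_vid:%d,output:%d" % (VLAN_BY_TUNNEL[tun_id], PATCH_PORT)
        return [0, 3, 10], action
    if in_port == PATCH_PORT:
        # table 0 -> table 2: multicast goes straight to table 22, unicast
        # goes through table 20 first (no learned entries in this model)
        path = [0, 2, 22] if is_multicast(eth_dst) else [0, 2, 20, 22]
        if vlan not in TUNNEL_BY_VLAN:
            return path, "drop"
        action = "strip_vlan,set_tunnel:0x%x,output:%d" % (TUNNEL_BY_VLAN[vlan], GRE_PORT)
        return path, action
    return [0], "drop"


if __name__ == "__main__":
    print(br_tun(GRE_PORT, "fa:16:3e:00:00:01", tun_id=0x2))
    print(br_tun(PATCH_PORT, "ff:ff:ff:ff:ff:ff", vlan=2))
```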
2. Router Server
2.1 Taking the router of tenant-one (which has one VM) as an example, first look at its interfaces (lo omitted)
root@network:/home/s1# ip netns exec qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1 ip addr
22: qr-d3d3e235-d4: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
link/ether fa:16:3e:b3:06:e8 brd ff:ff:ff:ff:ff:ff
inet 10.0.11.1/24 brd 10.0.11.255 scope global qr-d3d3e235-d4
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:feb3:6e8/64 scope link
valid_lft forever preferred_lft forever
26: qg-6c06581b-bd: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
link/ether fa:16:3e:0b:ac:82 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.114/24 brd 192.168.1.255 scope global qg-6c06581b-bd
valid_lft forever preferred_lft forever
inet 192.168.1.115/32 brd 192.168.1.115 scope global qg-6c06581b-bd
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe0b:ac82/64 scope link
valid_lft forever preferred_lft forever
Observations:
- qg-6c06581b-bd connects to br-ex
- qr-d3d3e235-d4 connects to br-int
Next, look at its routing rules:
root@network:/home/s1# ip netns exec qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.1.1 0.0.0.0 UG 0 0 0 qg-6c06581b-bd
// default route: all traffic whose destination is not on a directly connected network goes out through the qg-6c06581b-bd interface to the external gateway 192.168.1.1
10.0.11.0 0.0.0.0 255.255.255.0 U 0 0 0 qr-d3d3e235-d4
// traffic destined for the tenant subnet 10.0.11.0/24 is delivered directly through qr-d3d3e235-d4, whose address 10.0.11.1 is the subnet gateway
192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 qg-6c06581b-bd
// traffic destined for 192.168.1.0/24 is delivered directly through qg-6c06581b-bd; the gateway 0.0.0.0 means no next hop is needed
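How the kernel picks among these three routes can be illustrated with a longest-prefix match. The Python sketch below uses the standard ipaddress module and the three routes from the table above; the function name lookup is my own, and the real kernel lookup considers more than prefix length (metrics, policy rules), so read it as a simplification.

```python
import ipaddress

# (destination network, gateway or None for directly connected, interface),
# copied from the `route -n` output above.
ROUTES = [
    ("0.0.0.0/0", "192.168.1.1", "qg-6c06581b-bd"),
    ("10.0.11.0/24", None, "qr-d3d3e235-d4"),
    ("192.168.1.0/24", None, "qg-6c06581b-bd"),
]


def lookup(dst):
    """Return (gateway, interface) of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(net), gw, dev)
        for net, gw, dev in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    # The default route 0.0.0.0/0 matches everything, so the list is never
    # empty; the longest prefix wins.
    _, gw, dev = max(candidates, key=lambda route: route[0].prefixlen)
    return gw, dev


if __name__ == "__main__":
    for dst in ("8.8.8.8", "10.0.11.5", "192.168.1.50"):
        print(dst, lookup(dst))
```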
2.2 How Neutron floating IPs are implemented
The netfilter NAT table inside the router namespace implements Neutron floating IPs. Below is the NAT table of the router of tenant-two (which has two VMs):
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N neutron-l3-agent-OUTPUT
-N neutron-l3-agent-POSTROUTING
-N neutron-l3-agent-PREROUTING
-N neutron-l3-agent-float-snat
-N neutron-l3-agent-snat
-N neutron-postrouting-bottom
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 192.168.1.118/32 -j DNAT --to-destination 10.0.22.200
-A neutron-l3-agent-OUTPUT -d 192.168.1.117/32 -j DNAT --to-destination 10.0.22.202
-A neutron-l3-agent-POSTROUTING ! -i qg-cba7b139-04 ! -o qg-cba7b139-04 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 192.168.1.118/32 -j DNAT --to-destination 10.0.22.200
-A neutron-l3-agent-PREROUTING -d 192.168.1.117/32 -j DNAT --to-destination 10.0.22.202
-A neutron-l3-agent-float-snat -s 10.0.22.200/32 -j SNAT --to-source 192.168.1.118
-A neutron-l3-agent-float-snat -s 10.0.22.202/32 -j SNAT --to-source 192.168.1.117
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-l3-agent-snat -s 10.0.22.0/24 -j SNAT --to-source 192.168.1.116
-A neutron-postrouting-bottom -j neutron-l3-agent-snat
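- SNAT (source NAT) rewrites the source IP address of traffic coming from a VM, that is, the fixed IP 10.0.22.200/202, into the floating IP 192.168.1.118/117; the traffic is then routed to br-ex and on to the external network
- DNAT (destination NAT) rewrites the destination IP address of traffic coming from the external network, that is, the floating IP 192.168.1.118/117, into the fixed IP 10.0.22.200/202 used by the VM; the traffic is then routed to br-int and on to the VM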
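The effect of the two translations can be sketched as a pair of lookups. The Python below takes the addresses from the NAT rules above; it only models the address rewriting and deliberately ignores the conntrack ACCEPT rule and the metadata REDIRECT rule. The function names dnat and snat are illustrative.

```python
import ipaddress

# floating IP -> fixed IP, from the neutron-l3-agent-PREROUTING DNAT rules.
FLOATING_TO_FIXED = {
    "192.168.1.118": "10.0.22.200",
    "192.168.1.117": "10.0.22.202",
}
# fixed IP -> floating IP, from the neutron-l3-agent-float-snat rules.
FIXED_TO_FLOATING = {fixed: fl for fl, fixed in FLOATING_TO_FIXED.items()}
# Fallback rule in neutron-l3-agent-snat for VMs without a floating IP.
DEFAULT_SNAT_NET = ipaddress.ip_network("10.0.22.0/24")
DEFAULT_SNAT_SOURCE = "192.168.1.116"


def dnat(dst):
    """Rewrite the destination of inbound traffic (floating -> fixed)."""
    return FLOATING_TO_FIXED.get(dst, dst)


def snat(src):
    """Rewrite the source of outbound traffic (fixed -> floating)."""
    if src in FIXED_TO_FLOATING:
        return FIXED_TO_FLOATING[src]
    if ipaddress.ip_address(src) in DEFAULT_SNAT_NET:
        return DEFAULT_SNAT_SOURCE
    return src


if __name__ == "__main__":
    print(dnat("192.168.1.118"))  # inbound to the first VM
    print(snat("10.0.22.202"))    # outbound from the second VM
    print(snat("10.0.22.50"))     # a subnet address with no floating IP
```

The float-snat chain is consulted before the subnet-wide SNAT rule, which is why the per-VM mapping is checked first.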
3. DHCP Server
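Every network with DHCP enabled has a DHCP service on the Neutron node; each DHCP server is a dnsmasq process running in its own network namespace. dnsmasq is a lightweight DNS and DHCP server for Linux; for details see http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html
3.1 Each DHCP server has one process on the Neutron host, running in the namespace named qdhcp-<net id>: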
nobody 2049 1 0 06:43 ? 00:00:00 dnsmasq --no-hosts --no-resolv --strict-order --bind-interfaces --interface=tap15865c29-9b --except-interface=lo --pid-file=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/host --addn-hosts=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/opts --leasefile-ro --dhcp-range=set:tag0,10.0.22.0,static,86400s --dhcp-lease-max=256 --conf-file= --domain=openstacklocal
Notes:
1. --interface=tap15865c29-9b: the process binds to and listens on a TAP device, H3 in the diagram above
2. --dhcp-hostsfile=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/host:
root@network:/home/s1# cat /var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/host
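fa:16:3e:4d:6b:44,host-10-0-22-201.openstacklocal,10.0.22.201 // MAC address and IP of this subnet's own DHCP server (M3)
fa:16:3e:79:07:5e,host-10-0-22-1.openstacklocal,10.0.22.1 // MAC address, name, and IP of this subnet's router (N3)
fa:16:3e:bf:69:36,host-10-0-22-200.openstacklocal,10.0.22.200 // MAC address, host name, and fixed IP of VM 1 on this subnet
fa:16:3e:19:65:62,host-10-0-22-202.openstacklocal,10.0.22.202 // MAC address, host name, and fixed IP of VM 2 on this subnet
fa:16:3e:88:99:c1,host-10-0-0-116.openstacklocal,10.0.0.116 // MAC address and IP address of the DHCP server of subnet 1 (H1). So why is there no corresponding entry for H2?
While a VM is being created, Neutron writes this information to the file (the IP address is presumably an available one taken from the Neutron DB). When the VM later asks the DHCP server for an IP address using its MAC address, dnsmasq reads the file and returns the IP address.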
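The lookup that dnsmasq performs against this file can be pictured as a MAC-keyed table. The Python sketch below parses the entries shown above (without the comments) and looks up a VM's fixed IP by MAC; it is a simplified model under that assumption, not dnsmasq's actual matching logic, and the helper names are made up.

```python
# Entries copied from the host file above, without the trailing comments.
HOSTS_FILE = """\
fa:16:3e:4d:6b:44,host-10-0-22-201.openstacklocal,10.0.22.201
fa:16:3e:79:07:5e,host-10-0-22-1.openstacklocal,10.0.22.1
fa:16:3e:bf:69:36,host-10-0-22-200.openstacklocal,10.0.22.200
fa:16:3e:19:65:62,host-10-0-22-202.openstacklocal,10.0.22.202
fa:16:3e:88:99:c1,host-10-0-0-116.openstacklocal,10.0.0.116
"""


def parse_hosts(text):
    """Return {mac: (hostname, ip)} for lines of the form mac,hostname,ip."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        mac, name, ip = line.split(",")
        table[mac.lower()] = (name, ip)
    return table


def lookup_ip(table, mac):
    """Return the fixed IP reserved for a MAC address, or None if unknown."""
    entry = table.get(mac.lower())
    return entry[1] if entry else None


if __name__ == "__main__":
    hosts = parse_hosts(HOSTS_FILE)
    print(lookup_ip(hosts, "fa:16:3e:bf:69:36"))  # VM 1
    print(lookup_ip(hosts, "FA:16:3E:19:65:62"))  # VM 2, upper-case MAC
```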
3.2 The DHCP interfaces (lo omitted)
root@network:/home/s1# ip netns exec qdhcp-0a4cd030-d951-401a-8202-937b788bea43 ip addr
18: tap6356d532-32: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
link/ether fa:16:3e:88:99:c1 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.116/24 brd 10.0.0.255 scope global tap6356d532-32
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe88:99c1/64 scope link
valid_lft forever preferred_lft forever
root@network:/home/s1# ip netns exec qdhcp-d04a0a06-7206-4d05-9432-3443843bc199 ip addr
17: tap8dfd0bd8-45: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
link/ether fa:16:3e:82:fd:26 brd ff:ff:ff:ff:ff:ff
inet 10.0.11.101/24 brd 10.0.11.255 scope global tap8dfd0bd8-45
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe82:fd26/64 scope link
valid_lft forever preferred_lft forever
root@network:/home/s1# ip netns exec qdhcp-d24963da-5221-481e-adf5-fe033d6e0b4e ip addr
19: tap15865c29-9b: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
link/ether fa:16:3e:4d:6b:44 brd ff:ff:ff:ff:ff:ff
inet 10.0.22.201/24 brd 10.0.22.255 scope global tap15865c29-9b
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe4d:6b44/64 scope link
valid_lft forever preferred_lft forever
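The DHCP server uses the first available IP address in the fixed IP range as its own address. The MAC address of its interface, fa:16:3e:4d:6b:44, will show up in the br-tun rules.
3.3 A VM requests/looks up its fixed IP from the DHCP server
The detailed steps are described in the next post.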