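A macvlan interface is, loosely speaking, similar to a sub-interface, except that each macvlan interface has its own MAC address, which allows more layer-2 operations. macvlan has four modes: VEPA, bridge, private, and passthru.
A macvlan interface listens on the link and accepts frames addressed to its own MAC address. As a result, a macvlan interface (in every mode except bridge) can only send frames out to the external network and receive frames whose destination is its own MAC address.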
                                    +---------------+
                                    | network stack |
                                    +---------------+
                                        |  |  |  |
                              +---------+  |  |  +------------------+
                              |            |  +------------------+  |
                              |            +------------------+  |  |
                              |                               |  |  |
                              |           aa   +----------+   |  |  |
                              |   eth0   +-----| macvlan0 |---+  |  |
                              |         /      +----------+      |  |
  Wire  +------+       +---------------+  bb   +----------+      |  |
--------| eth0 |------/ if dst mac is /--------| macvlan1 |------+  |
        +------+     +---------------+ \       +----------+         |
                                        \ cc   +----------+         |
                                         +-----| macvlan2 |---------+
                                               +----------+
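Test environment: a Windows host running a CentOS virtual machine under VMware. The CentOS VM serves as the Docker host.
VEPA mode: in this mode a macvlan device cannot directly receive packets from other macvlan devices on the same physical NIC. Instead, every frame is sent out through the physical NIC, and a hairpin-capable device has to reflect it back before it can reach the other macvlan device. The mode is meant for managing traffic between VMs on an external device, and it requires a device that supports hairpin.
The following command creates a macvlan network in VEPA mode named vepamv. 192.168.128.0/24 and 192.168.128.2 are the subnet and gateway of eth0 on the Docker host.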
docker network create -d macvlan --subnet=192.168.128.0/24 --gateway=192.168.128.2 -o parent=eth0 -o macvlan_mode=vepa vepamv
Run two containers on this network:
docker run -itd --net=vepamv --ip=192.168.128.222 --name=centos1-2 f322035379ab /bin/bash
docker run -itd --net=vepamv --ip=192.168.128.233 --name=centos1-3 f322035379ab /bin/bash
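The addresses passed with --ip have to fall inside the --subnet given when the network was created. A minimal sketch of that sanity check in plain shell is shown below; the helper names ip_to_int and ip_in_subnet are made up for this illustration and are not part of Docker.

```shell
#!/bin/sh
# Hypothetical helpers (not a Docker feature): check that an IPv4 address
# lies inside a CIDR block before handing it to `docker run --ip=...`.

# Convert a dotted quad such as 192.168.128.222 to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) when address $1 is inside subnet $2 (a.b.c.d/len).
ip_in_subnet() {
    addr=$1
    net=${2%/*}          # part before the slash
    bits=${2#*/}         # prefix length after the slash
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

for candidate in 192.168.128.222 192.168.128.233 192.168.226.5; do
    if ip_in_subnet "$candidate" 192.168.128.0/24; then
        echo "$candidate is inside 192.168.128.0/24"
    else
        echo "$candidate is outside 192.168.128.0/24"
    fi
done
```

Both container addresses used above pass the check, while an address from another network does not.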
Inspect the network. The driver is macvlan, the macvlan mode is vepa, the two container interfaces have their own MAC addresses, and the parent physical NIC is eth0.
[root@localhost ~]# docker network inspect vepamv
[
    {
        "Name": "vepamv",
        "Id": "84af6a040cf1e1063c122ed9b80b421ef2896d31100c87bec9cde7a0e8690833",
        "Created": "2018-09-16T22:16:23.938521926+08:00",
        "Scope": "local",
        "Driver": "macvlan",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "192.168.128.0/24",
                    "Gateway": "192.168.128.2"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "49eb565de8f9ec41ba69285c6ced2971a861a104247dc10c257ce3dd7a74d006": {
                "Name": "centos1-3",
                "EndpointID": "adc576f3cfa1c5b6649f3d322ba11487e8ef3eadebeed72eb830f55a8a5768f6",
                "MacAddress": "02:42:c0:a8:80:e9",
                "IPv4Address": "192.168.128.233/24",
                "IPv6Address": ""
            },
            "5f0fe3a769ca17717afea9f1d444b00a4380289b2744d02d5ade260e7e687868": {
                "Name": "centos1-2",
                "EndpointID": "caa0766bb243e43986c1ee435b9d2666c615b92c06964c749d5e93ba7ef8849f",
                "MacAddress": "02:42:c0:a8:80:de",
                "IPv4Address": "192.168.128.222/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "macvlan_mode": "vepa",
            "parent": "eth0"
        },
        "Labels": {}
    }
]
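The two MAC addresses in this output follow a visible pattern: 02:42 followed by the four octets of the container's IPv4 address in hexadecimal. The sketch below reproduces that pattern so the values can be checked by hand. The function name ip_to_docker_mac is invented for this example, and the pattern is only assumed to hold for the setup shown here; other Docker versions or an explicit --mac-address may give different addresses.

```shell
#!/bin/sh
# Reproduce the MAC pattern seen in the inspect output above:
# 02:42 followed by the IPv4 octets in hex.
# Assumption: this matches the addresses in this article only; it is not
# guaranteed for every Docker version or configuration.
ip_to_docker_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the dotted quad into $1..$4
    IFS=$old_ifs
    printf '02:42:%02x:%02x:%02x:%02x\n' "$1" "$2" "$3" "$4"
}

ip_to_docker_mac 192.168.128.233   # prints 02:42:c0:a8:80:e9 (centos1-3)
ip_to_docker_mac 192.168.128.222   # prints 02:42:c0:a8:80:de (centos1-2)
```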
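Pinging centos1-2 (192.168.128.222) from centos1-3 (192.168.128.233) fails, because the local environment has no switch or router with hairpin mode enabled: once the frames are on the wire, nothing sends them back. In other words, traffic cannot flow internally between the two containers.
[root@0dd61dcf26f3 /]# ping 192.168.128.222
PING 192.168.128.222 (192.168.128.222) 56(84) bytes of data.
From 192.168.128.233 icmp_seq=1 Destination Host Unreachable
From 192.168.128.233 icmp_seq=2 Destination Host Unreachable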
A machine on the external network (192.168.128.1), however, can reach the container directly, since the container's IP address belongs to the external network.
D:/> ping 192.168.128.222
PING 192.168.128.222 (192.168.128.222) 56(84) bytes of data.
64 bytes from 192.168.128.222: icmp_seq=1 ttl=64 time=0.080 ms
64 bytes from 192.168.128.222: icmp_seq=1 ttl=64 time=0.080 ms
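A packet capture shows that the source MAC address of centos1-2 matches the MAC address listed above.
To emulate a hairpin switch, IPOP was used to craft an ARP request sent from 192.168.128.233 and asking for 192.168.128.222.
The packet capture shows that 192.168.128.222 replied to the ARP request from 192.168.128.233.
private mode: this mode is similar to VEPA but adds one restriction: two macvlan interfaces on the same NIC cannot communicate with each other, even through a hairpin-enabled switch or router. Repeating the test above with a crafted ARP request from 192.168.128.233 to 192.168.128.222, 192.168.128.222 does not reply. Pinging 192.168.128.222 directly from the Windows machine still works. Private mode isolates broadcast frames that come from macvlan interfaces on the same NIC.
passthru mode: this mode allows only one macvlan interface per physical NIC. Any further container that uses the macvlan network fails to start, while containers that do not use macvlan still start normally. To run several macvlan_mode=passthru containers on a single physical NIC, sub-interfaces can be used; see https://blog.csdn.net/daye5465/article/details/77412619.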
[root@localhost home]# docker run -itd --net=passmv f322 /bin/bash
17b0f2c446671f716bcf136e9c9d8c781ec84901c87e1d4ae0a20aa98e5fb710
/usr/bin/docker-current: Error response from daemon: failed to create the macvlan port: invalid argument.
[root@localhost home]# docker run -itd f322 /bin/bash
6aac5b6a284b1d5c2294936d7943007947a602fc7cdcc133c32b5e861ed17865
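The sub-interface workaround mentioned above can be sketched as one passthru network per VLAN sub-interface. The helper below only prints the docker commands instead of running them. The VLAN IDs, subnets, gateways, and network names are hypothetical, and the sketch assumes the macvlan driver's documented handling of dotted parent names such as eth0.10 (treated as an 802.1Q sub-interface of eth0) and an upstream switch that actually carries those VLANs.

```shell
#!/bin/sh
# Print (do not run) a `docker network create` command for a passthru
# macvlan network on a VLAN sub-interface of the given parent.
# All VLAN IDs, subnets and names used below are made-up examples.
passthru_network_cmd() {
    parent=$1
    vlan=$2
    subnet=$3
    gateway=$4
    echo "docker network create -d macvlan" \
         "--subnet=$subnet --gateway=$gateway" \
         "-o parent=$parent.$vlan -o macvlan_mode=passthru" \
         "passmv$vlan"
}

passthru_network_cmd eth0 10 192.168.10.0/24 192.168.10.1
passthru_network_cmd eth0 20 192.168.20.0/24 192.168.20.1
```

Each printed command targets a different parent (eth0.10, eth0.20), so the one-interface-per-parent limit of passthru mode applies to each sub-interface separately.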
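bridge mode (Docker's default): in this mode, macvlan devices attached to the same physical device can communicate with each other directly, without the help of an external hairpin device. The following command creates a macvlan network in bridge mode: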
docker network create -d macvlan --subnet=192.168.226.0/24 --gateway=192.168.226.2 -o parent=eth0 -o macvlan_mode=bridge bridmv
Bridge mode provides connectivity both between the containers and to the external network without a hairpin device. The network information is as follows:
[root@localhost netns]# docker network inspect bridmv
[
    {
        "Name": "bridmv",
        "Id": "b2920c8721701d47ac891aa8528d95f60e6a71a1a7485d0e2f21bae30f8604bf",
        "Created": "2018-09-18T09:16:34.549499448+08:00",
        "Scope": "local",
        "Driver": "macvlan",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "192.168.226.0/24",
                    "Gateway": "192.168.226.2"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "031e1de7ed2cf13c25083e98d9cee131ea00a466fd169a0531c70818a25c7a7f": {
                "Name": "centos2",
                "EndpointID": "b95efe7ddb8d2c4ce9228c06f019601c18daedbf7fc79462939efba128e84936",
                "MacAddress": "02:42:c0:a8:80:e9",
                "IPv4Address": "192.168.226.233/24",
                "IPv6Address": ""
            },
            "8e23e7011f7cbc0962ba975974ae313dd4dab10a4114775b689ba70ae88dac72": {
                "Name": "centos1",
                "EndpointID": "d2fb36b842f89128e3a862fc70624d4946b703bf0bb921fd11839d7f775fa8e0",
                "MacAddress": "02:42:c0:a8:80:de",
                "IPv4Address": "192.168.226.222/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "macvlan_mode": "bridge",
            "parent": "eth0"
        },
        "Labels": {}
    }
]
/var/run/docker/netns contains two namespaces. These are the network namespaces of the containers 192.168.226.222 and 192.168.226.233.
[root@localhost netns]# ll /var/run/docker/netns/
total 0
-r--r--r--. 1 root root 0 Sep 18 09:18 59b305d0d01e
-r--r--r--. 1 root root 0 Sep 18 09:18 a41362fa7ed2
A macvlan bridge does not show up in brctl show. The container interface information is shown below. The IP addresses correspond to the two containers, and each container's eth0 carries the suffix @if2, which means it is tied to the interface with index 2. Given how macvlan works, that interface is eth0 on the Docker host.
[root@localhost netns]# ip netns exec 59b305d0d01e ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
18: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 02:42:c0:a8:80:e9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.226.233/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:c0ff:fea8:e2e9/64 scope link
       valid_lft forever preferred_lft forever
[root@localhost netns]# ip netns exec a41362fa7ed2 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
19: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 02:42:c0:a8:80:de brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.226.222/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:c0ff:fea8:e2de/64 scope link
       valid_lft forever preferred_lft forever
On the host, interface number 2 is indeed eth0, the NIC the macvlan interfaces sit on. In other words, the host's eth0 acts as the bridge (specified with -o parent).
[root@localhost netns]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:f1:38:bf brd ff:ff:ff:ff:ff:ff
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:51:d1:17 brd ff:ff:ff:ff:ff:ff
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:51:d1:17 brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 02:42:71:8b:5a:6e brd ff:ff:ff:ff:ff:ff
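The name@ifN lookup done by eye above can be scripted. The sketch below pulls the interface name and the index after @if out of ip a or ip link output, and then looks up which host interface owns that index under /sys/class/net. Both function names are invented for this example, and the second helper assumes a Linux host with sysfs mounted.

```shell
#!/bin/sh
# Read `ip a` / `ip link` output on stdin and print "<name> <index>" for
# every interface shown as name@ifN (for example "eth0 2" for eth0@if2).
peer_ifindex() {
    sed -n 's/^[0-9][0-9]*: \([^@:]*\)@if\([0-9][0-9]*\):.*/\1 \2/p'
}

# Print the name of the local interface whose ifindex equals $1.
# Assumption: Linux with /sys/class/net available.
ifname_by_index() {
    for f in /sys/class/net/*/ifindex; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "$1" ]; then
            basename "$(dirname "$f")"
        fi
    done
}

# Example with one line copied from the container output above.
printf '%s\n' '18: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default' | peer_ifindex
```

On the Docker host, running ifname_by_index 2 would then be expected to name the parent NIC, which is eth0 in the ip link output above.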
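Compared with Docker's bridge driver, containers on a macvlan network in bridge mode can communicate as long as they are in the same subnet, without having to be attached to the same bridge. In this sense macvlan emulates real physical NICs. Like bridge, macvlan is natively supported by Linux, so macvlan communication can also be set up by hand. For the configuration steps see "linux 網絡虛擬化:macvlan".
Summary: as the examples above show, a macvlan interface can be used like a regular host interface. Communication across networks needs support from a router or switch, such as hairpin or routing.
References: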
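A manual setup along those lines can be sketched with iproute2 and two network namespaces. The namespace names, interface names, and addresses below are made up for illustration. The script prints the commands by default (DRY_RUN=1), since running them for real needs root and changes the host's network configuration.

```shell
#!/bin/sh
# Sketch: two macvlan interfaces in bridge mode on top of eth0, each moved
# into its own network namespace. Names and addresses are hypothetical.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_macvlan_ns NAMESPACE DEVICE ADDRESS/LEN [PARENT]
setup_macvlan_ns() {
    ns=$1
    dev=$2
    addr=$3
    parent=${4:-eth0}
    run ip netns add "$ns"
    run ip link add link "$parent" name "$dev" type macvlan mode bridge
    run ip link set "$dev" netns "$ns"
    run ip netns exec "$ns" ip addr add "$addr" dev "$dev"
    run ip netns exec "$ns" ip link set "$dev" up
}

setup_macvlan_ns ns1 mv1 192.168.226.101/24
setup_macvlan_ns ns2 mv2 192.168.226.102/24
# In bridge mode the two namespaces should reach each other directly.
run ip netns exec ns1 ping -c 3 192.168.226.102
```

Setting DRY_RUN=0 as root would execute the same commands. Replacing mode bridge with vepa or private gives a quick way to repeat the experiments from the earlier sections without Docker.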
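The behaviour of the four modes described in this article can be condensed into a small lookup. This is only a restatement of the observations above as a shell function, not a model of the kernel implementation, and the function name is invented for the example.

```shell
#!/bin/sh
# same_parent_reachable MODE HAIRPIN
# Exit status 0 if two macvlan interfaces on the same parent NIC can talk
# to each other, according to the behaviour described in this article.
# HAIRPIN is "yes" when the upstream switch reflects frames back.
same_parent_reachable() {
    mode=$1
    hairpin=$2
    case "$mode" in
        bridge)   return 0 ;;               # direct, no external device needed
        vepa)     [ "$hairpin" = yes ] ;;   # only via a hairpin switch
        private)  return 1 ;;               # isolated even with hairpin
        passthru) return 1 ;;               # only one macvlan per parent
        *)        echo "unknown mode: $mode" >&2; return 2 ;;
    esac
}

for mode in bridge vepa private passthru; do
    for hairpin in no yes; do
        if same_parent_reachable "$mode" "$hairpin"; then
            result=yes
        else
            result=no
        fi
        echo "$mode hairpin=$hairpin reachable=$result"
    done
done
```

In every mode the containers stay reachable from the external network, as the ping from the Windows machine showed for VEPA and private mode.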
https://blog.csdn.net/daye5465/article/details/77412619
https://blog.csdn.net/dog250/article/details/45788279
https://backreference.org/2014/03/20/some-notes-on-macvlanmacvtap/
https://superuser.com/questions/1205346/macvtap-interface-created-on-top-of-macvlan-interface-of-a-docker-container-cann
https://docs.docker.com/network/macvlan/#8021q-trunk-bridge-mode
https://docs.docker.com/v17.09/engine/userguide/networking/get-started-macvlan/#macvlan-bridge-mode-example-usage
https://hicu.be/bridge-vs-macvlan