When we run multiple Docker containers on a single physical or virtual machine, how do those containers communicate with each other, and how does the outside world reach them? This is the domain of single-host container networking. After installation, Docker creates three types of networks on the host by default. We can list them with docker network ls:
docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8ad1446836a4        bridge              bridge              local
3be441aa5d9f        host                host                local
e0542a92df5c        none                null                local
Each of these three networks is described below.
1. none network
The none network provides only loopback: the container has nothing but the lo virtual interface, so a container on this network cannot communicate with the outside world at all. It is selected with --network=none (or the short form --net=none). Below we create a container on the none network and inspect its network devices.
[root@VM_0_12_centos ~]# docker run -dit --net=none --name=bbox3 busybox
7c97d179f742bcd280d2f436427b18b2381c9588773ca612cd71bf840e0db2ea
[root@VM_0_12_centos ~]# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
7c97d179f742        busybox             "sh"                2 seconds ago       Up 2 seconds                            bbox3
Inside the container there is only the lo interface:
[root@VM_0_12_centos ~]# docker exec -it bbox3 sh
/ # ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
Attempts to reach external networks fail:
/ # ping www.baidu.com
ping: bad address 'www.baidu.com'
The local loopback address is still reachable:
/ # ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.068 ms
64 bytes from 127.0.0.1: seq=1 ttl=64 time=0.051 ms
64 bytes from 127.0.0.1: seq=2 ttl=64 time=0.067 ms
64 bytes from 127.0.0.1: seq=3 ttl=64 time=0.068 ms
64 bytes from 127.0.0.1: seq=4 ttl=64 time=0.071 ms
^C
--- 127.0.0.1 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.051/0.065/0.071 ms
External hosts, however, are unreachable:

/ # ping 192.168.1.201
PING 192.168.1.201 (192.168.1.201): 56 data bytes
ping: sendto: Network is unreachable
The none network is rarely used. It is mainly suited to security-sensitive workloads that have no need to communicate with the outside world.
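One hypothetical use case, sketched below, is a one-off batch job that must not touch the network at all. The image, volume path, and file name are illustrative, not taken from the setup above:

# Run a checksum job with no network access; only lo exists inside the container.
# /data and backup.tar are placeholder names for this sketch.
docker run --rm --network=none -v /data:/data busybox sha256sum /data/backup.tar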
2. host network
A container using the host network shares the host's network stack, including its IP address and ports. The advantage is performance: there is no virtual bridge or NAT in the path. The disadvantage is port conflicts: multiple containers cannot expose the same port.
The host network is selected with --net=host. Start a container in host network mode:
[root@VM_0_12_centos ~]# docker run -dit --net=host --name=nginx1 nginx
48e1df0c535193bfdef92f718de2c76427d88a371f14be1274022288cbe4ece6
The containerized application is now reachable via the host's IP address and port:
[root@VM_0_12_centos ~]# telnet 172.26.0.12 80
Trying 172.26.0.12...
Connected to 172.26.0.12.
Escape character is '^]'.
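Equivalently, a plain HTTP request against the host's address reaches the nginx instance. A minimal check, assuming the same host IP as above:

# Print just the HTTP status code returned by nginx listening on the host's port 80.
curl -s -o /dev/null -w '%{http_code}\n' http://172.26.0.12/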
Inside the container, all of the host's network devices are visible:
[root@VM_0_12_centos ~]# docker exec -it nginx1 ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 0.0.0.0
        ether 02:42:29:18:39:27  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.26.0.12  netmask 255.255.240.0  broadcast 172.26.15.255
        ether 52:54:00:5d:99:43  txqueuelen 1000  (Ethernet)
        RX packets 185960339  bytes 46982274554 (43.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 169979929  bytes 50689492409 (47.2 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1  (Local Loopback)
        RX packets 86  bytes 4827 (4.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 86  bytes 4827 (4.7 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
Because the host network shares the host's network stack, it naturally supports cross-host communication: traffic between containers on different hosts is simply traffic between ports on different hosts. The major drawback is port conflicts: containers on the same host cannot expose the same port. Moreover, this mode gives up the isolation that containers are supposed to provide, since every container shares the network stack with the host. In large-scale deployments, host mode is therefore rarely chosen as the container network mode.
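The port conflict is easy to demonstrate. A sketch, assuming nginx1 from above is still bound to port 80 (the container name nginx2 is illustrative):

# Both containers share the host stack, so both nginx processes try to bind port 80.
docker run -dit --net=host --name=nginx2 nginx
# The second container starts, but its nginx process fails to bind and exits;
# the logs should show a bind error along the lines of "Address already in use".
docker logs nginx2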
3. bridge network
By default, when --net is not specified, containers are created in bridge mode. In this mode, every container gets its own network namespace, with its own IP address, interfaces, and routing table. Since each container has its own IP, containers on the same host need a bridge device in order to communicate. When Docker is installed it creates a Linux bridge named docker0 by default, and every container we create is attached to docker0. For background on Linux virtual bridges, see this post: https://segmentfault.com/a/1190000009491002
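To make the mechanics concrete, here is a minimal sketch of roughly what Docker sets up, built by hand with ip(8). All names (br-demo, ns-demo, veth-host, veth-cont) and the 172.18.0.0/16 subnet are illustrative, not what Docker itself uses:

# Create a Linux bridge playing the role of docker0.
ip link add br-demo type bridge
ip link set br-demo up
# A network namespace stands in for a container.
ip netns add ns-demo
# Create a veth pair and move one end into the "container" namespace.
ip link add veth-host type veth peer name veth-cont
ip link set veth-cont netns ns-demo
# Attach the host end to the bridge and bring everything up.
ip link set veth-host master br-demo
ip link set veth-host up
ip netns exec ns-demo ip link set veth-cont up
# Give the "container" end an address on the bridge's subnet.
ip netns exec ns-demo ip addr add 172.18.0.2/16 dev veth-cont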
Now create two containers on the (default) bridge network:
[root@VM_0_12_centos ~]# docker run -dit --name=bbox2 busybox
7b4221300b296026126d3cf600db39bed68e4048729982d6e09100f59ec900b7
[root@VM_0_12_centos ~]# docker run -dit --name=bbox1 busybox
91dd8c37571d69eacafe562f97a7c476f67a6189a0e58b4a95cc9a8f2ac013df
Inspecting the bridge network shows its configuration: the subnet, the gateway address (the docker0 bridge), and the IP and MAC addresses assigned to each container:
[root@VM_0_12_centos ~]# docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "5b6d64e4b4433bb639b26a3f4a0e828ecb1b2a54984cb01225dd682322fa61d4",
        "Created": "2019-10-15T15:52:52.560070504+08:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "7b4221300b296026126d3cf600db39bed68e4048729982d6e09100f59ec900b7": {
                "Name": "bbox2",
                "EndpointID": "b3745248db52c2ea8e6eb7d05a5e581113105c4979717759d2d49bfebda2be49",
                "MacAddress": "02:42:ac:11:00:03",
                "IPv4Address": "172.17.0.3/16",
                "IPv6Address": ""
            },
            "91dd8c37571d69eacafe562f97a7c476f67a6189a0e58b4a95cc9a8f2ac013df": {
                "Name": "bbox1",
                "EndpointID": "60b01cb330806b9fbbc9b212530b6561f7f56e697ac6ea1040f2e31419ed74d5",
                "MacAddress": "02:42:ac:11:00:04",
                "IPv4Address": "172.17.0.4/16",
                "IPv6Address": ""
            },
            "f2f176f13894b434b4e7f6f6bcc68af1782559295d7ca430561eaf679d288deb": {
                "Name": "nginx1",
                "EndpointID": "d0f39a169126cd8222514eddfbba008a8e7f6727df8d49c25e19e35f6e53e191",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            }
        },
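When only a single container's address is needed, the template flag of docker inspect saves wading through the full JSON. A small sketch:

# Print just bbox1's IP address on the default bridge network.
docker inspect -f '{{.NetworkSettings.IPAddress}}' bbox1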
Running ifconfig on the host shows that two virtual interfaces whose names begin with veth have been created:
[root@VM_0_12_centos ~]# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 0.0.0.0
        ether 02:42:29:18:39:27  txqueuelen 0  (Ethernet)
        RX packets 7  bytes 278 (278.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6  bytes 452 (452.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.26.0.12  netmask 255.255.240.0  broadcast 172.26.15.255
        ether 52:54:00:5d:99:43  txqueuelen 1000  (Ethernet)
        RX packets 185985778  bytes 46985652515 (43.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 170003246  bytes 50693831930 (47.2 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1  (Local Loopback)
        RX packets 86  bytes 4827 (4.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 86  bytes 4827 (4.7 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

veth7614922: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether be:a8:fb:ed:e5:e8  txqueuelen 0  (Ethernet)
        RX packets 5  bytes 378 (378.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6  bytes 420 (420.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

vethe3d2ca0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether f2:ec:f2:ab:55:a1  txqueuelen 0  (Ethernet)
        RX packets 11  bytes 712 (712.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 11  bytes 830 (830.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
One end of each veth pair is attached to the docker0 bridge:

[root@VM_0_12_centos ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.024229183927       no              veth7614922
                                                        vethe3d2ca0
The other end of each veth pair sits inside one of the containers, where it appears as the eth0 interface; each container's eth0 corresponds one-to-one with a veth interface on the host.
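To confirm which host-side veth belongs to which container, one common trick is to compare interface indexes. A sketch, using bbox1 from above (the index 7 in the grep is a placeholder for whatever the first command prints):

# Inside the container, iflink holds the ifindex of eth0's peer interface:
docker exec bbox1 cat /sys/class/net/eth0/iflink
# On the host, find the interface with that index (replace 7 with the value printed above):
ip -o link | grep '^7:'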
Enter the two containers and inspect their network devices:
[root@VM_0_12_centos ~]# docker exec -it bbox1 ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:04   ------------ (the other end of a veth pair)
inet addr:172.17.0.4 Bcast:172.17.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:11 errors:0 dropped:0 overruns:0 frame:0
TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:830 (830.0 B) TX bytes:712 (712.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
[root@VM_0_12_centos ~]# docker exec -it bbox2 ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:03   ------------ (the other end of a veth pair)
inet addr:172.17.0.3 Bcast:172.17.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6 errors:0 dropped:0 overruns:0 frame:0
TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:420 (420.0 B) TX bytes:378 (378.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Both containers have their own eth0 interface, with their own MAC and IP addresses. Now enter bbox1 and ping bbox2's IP to see whether it is reachable.
[root@VM_0_12_centos ~]# docker exec -it bbox1 ping 172.17.0.3
PING 172.17.0.3 (172.17.0.3): 56 data bytes
64 bytes from 172.17.0.3: seq=0 ttl=64 time=0.087 ms
64 bytes from 172.17.0.3: seq=1 ttl=64 time=0.081 ms
64 bytes from 172.17.0.3: seq=2 ttl=64 time=0.063 ms
64 bytes from 172.17.0.3: seq=3 ttl=64 time=0.086 ms
64 bytes from 172.17.0.3: seq=4 ttl=64 time=0.082 ms
64 bytes from 172.17.0.3: seq=5 ttl=64 time=0.095 ms
Using bbox1 pinging bbox2's address 172.17.0.3 as the example, let's trace how the two containers communicate. During the ping, bbox1 sends ICMP echo requests whose IP headers carry source address 172.17.0.4 and destination address 172.17.0.3. The container's kernel network stack then performs route selection; we can use route to see which interface the packet will leave through.
/ # route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         172.17.0.1      0.0.0.0         UG    0      0        0 eth0
172.17.0.0      *               255.255.0.0     U     0      0        0 eth0
According to the routing table, destination IP 172.17.0.3 matches the second entry. Since its Gateway is *, the packet can be sent directly at layer 2 through bbox1's eth0 interface; no layer-3 forwarding is needed. During layer-2 encapsulation, bbox1 must know the MAC address corresponding to the destination IP 172.17.0.3 before it can put the packet on the wire. At the start of the conversation bbox1 does not know that MAC address, so it sends an ARP request for 172.17.0.3 (a broadcast frame whose destination MAC in the Ethernet header is FF:FF:FF:FF:FF:FF), asking for the owner's MAC address. The ARP packet leaves through bbox1's eth0 and emerges at the other end of the veth pair, veth7614922. That interface is a slave of the docker0 bridge, attached to it as a port, so the ARP packet flows through the veth7614922 port into docker0, which processes it as follows:
1. The bridge records the mapping from the frame's source MAC address to its ingress port in the CAM table (the MAC-address-to-port forwarding table); this is how the bridge learns.
2. It checks the destination MAC address to determine whether the frame is a broadcast. If it is a broadcast, go to step 4; otherwise go to step 3.
3. It looks up the destination MAC in the CAM table; if a port mapping exists, it forwards the frame directly out of that port.
4. It floods the frame out of every port except the one it arrived on. In our example, when bbox2 receives the ARP request, it replies to bbox1 with a response carrying bbox2's MAC address (see the tcpdump sketch after this list). The reply leaves through bbox2's eth0, enters docker0 through the vethe3d2ca0 port, and docker0, following rule 1, records the mapping from bbox2's MAC address to that port. Using the mapping it previously learned for bbox1, it sends the reply out of the veth7614922 port back to bbox1. With the ARP reply in hand, bbox1 completes the layer-2 encapsulation, fills in the destination MAC, and sends the ICMP packet as a unicast frame through the docker0 bridge. Since docker0 has already learned which port bbox2's MAC address is behind, it forwards the frame straight out of that port, completing the communication from bbox1 to bbox2.
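The whole exchange can be observed from the host while the ping above is running. A minimal sketch, assuming tcpdump is available on the host:

# Show the ARP request/reply and the ICMP echo traffic crossing the docker0 bridge.
tcpdump -n -i docker0 arp or icmp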
Note: after bbox1 receives bbox2's ARP reply, it caches the IP-to-MAC mapping locally, as shown below:
/ # arp -a
? (172.17.0.1) at 02:42:29:18:39:27 [ether] on eth0
? (172.17.0.3) at 02:42:ac:11:00:03 [ether] on eth0