An Introduction to Linux Namespaces and Cgroups


                                       Author: 尹正傑

Copyright notice: this is an original work. Reproduction is not permitted; violations will be pursued legally.

I. Linux Namespace Technology

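  A namespace is a low-level Linux concept implemented in the kernel. The kernel provides several types of namespaces; all Docker containers on a host are managed by the same Docker daemon and share the host's kernel.

  Docker containers run in the host's user space. Each container needs an isolated runtime environment, much like a virtual machine, but container technology builds the runtime environment for a given service inside a process rather than inside a full virtual machine. It must also protect the host kernel from interference by other processes, which means isolating resources such as the file system, the network, and the process space. Isolation between container runtime environments is currently achieved mainly through the following technologies.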

1>.MNT Namespace 

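  Each container needs its own root file system and user space so that services can be started inside the container and use the container's runtime environment. For example, on a host running Ubuntu you can start a container with a CentOS runtime environment and run an Nginx service inside it; that Nginx then runs against the CentOS directory tree. Processes inside the container cannot access the host's resources, because the container is confined to a designated directory in a chroot-like manner (container runtimes such as runc typically use pivot_root inside the mount namespace for this).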
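
The kernel exposes the namespaces a process belongs to as symbolic links under /proc/<pid>/ns, one per namespace type. The following is a minimal Python sketch of reading them; the helper names parse_ns_link and list_namespaces are illustrative and do not come from the original post. Two processes share a namespace when the inode numbers of the corresponding links match.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "mnt:[4026531840]".
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def list_namespaces(pid="self"):
    """Return {link name: inode} for a process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result


if __name__ == "__main__":
    for name, inode in list_namespaces().items():
        print(f"{name:<20} {inode}")
```

Running the script once on the host and once inside a container (or passing a container process's host PID as root) should show different inode numbers for the namespaces that the container does not share with the host.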

 

2>.IPC Namespace 

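  The IPC namespace isolates inter-process communication resources. Processes inside one container can exchange data with each other (shared memory, message queues, and so on), but they cannot reach the IPC data of another container.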

3>.UTS Namespace 

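  The UTS namespace (UNIX Timesharing System, which holds the name of the running kernel, its version, the underlying architecture type, and similar information) is used for system identification. It contains the hostname and the domain name, so a container can have its own hostname that is independent of the host system and of the other containers running on it.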

 

4>.PID Namespace 

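  On a Linux system, the process with PID 1 (init/systemd) is the ancestor of all other processes. Likewise, each container needs a parent process to manage its child processes. PID namespaces isolate the processes of different containers from one another, which allows PID numbers to repeat across containers and lets the main process of each container spawn and reap its own children.
root@docker101:~# docker images                          # List the existing images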
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
nginx               latest              c7460dfcab50        2 days ago          126MB
centos              latest              0f3e07c0138f        3 months ago        220MB
root@docker101:~# 
root@docker101:~# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
968e7ecc39f2        centos              "/bin/bash"         3 hours ago         Up 3 hours                              keen_meitner
root@docker101:~# 
root@docker101:~# docker run -d -it nginx                  # Run a container from the nginx image
44edd3477c0d7380ab23dc23f00055b7a17eecd483a666c47e11fac6786a2f3e
root@docker101:~# 
root@docker101:~# docker ps                                # List the running containers
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
44edd3477c0d        nginx               "nginx -g 'daemon of…"   4 seconds ago       Up 2 seconds        80/tcp              stupefied_driscoll
968e7ecc39f2        centos              "/bin/bash"              3 hours ago         Up 3 hours                              keen_meitner
root@docker101:~# 
root@docker101:~# ps -ef | grep docker                     # List the Docker-related processes on the host. The container ID in the -workdir path shows that PID 7831 is the containerd-shim of the nginx container we just started.
root       6171      1  0 08:18 ?        00:00:11 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
root       6553   4451  0 08:24 ?        00:00:02 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/968e7ecc39f277a3d3f98b658f8f496de622edccfa4ef45d8ec64c46f5012d4c -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root       7831   4451  0 11:27 ?        00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/44edd3477c0d7380ab23dc23f00055b7a17eecd483a666c47e11fac6786a2f3e -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root       7923   6955  0 11:27 pts/1    00:00:00 grep --color=auto docker
root@docker101:~# 
root@docker101:~# ps -ef | grep 4451
root       4451      1  0 08:03 ?        00:01:40 /usr/bin/containerd
root       6553   4451  0 08:24 ?        00:00:02 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/968e7ecc39f277a3d3f98b658f8f496de622edccfa4ef45d8ec64c46f5012d4c -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root       7831   4451  0 11:27 ?        00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/44edd3477c0d7380ab23dc23f00055b7a17eecd483a666c47e11fac6786a2f3e -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root       8136   6955  0 11:43 pts/1    00:00:00 grep --color=auto 4451
root@docker101:~# 
root@docker101:~# ps -ef | grep 7831                       # Show the processes related to the nginx container.
root       7831   4451  0 11:27 ?        00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/44edd3477c0d7380ab23dc23f00055b7a17eecd483a666c47e11fac6786a2f3e -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root       7862   7831  0 11:27 pts/0    00:00:00 nginx: master process nginx -g daemon off;
root       7925   6955  0 11:29 pts/1    00:00:00 grep --color=auto 7831
root@docker101:~# 
root@docker101:~# pstree -p 7831                           # On the host, containerd-shim (PID 7831) is the parent of the nginx container, which runs the nginx master process (PID 7862) and a worker process (PID 7904).
containerd-shim(7831)─┬─nginx(7862)───nginx(7904)
                      ├─{containerd-shim}(7832)
                      ├─{containerd-shim}(7833)
                      ├─{containerd-shim}(7834)
                      ├─{containerd-shim}(7835)
                      ├─{containerd-shim}(7836)
                      ├─{containerd-shim}(7837)
                      ├─{containerd-shim}(7838)
                      ├─{containerd-shim}(7840)
                      ├─{containerd-shim}(7892)
                      └─{containerd-shim}(8110)
root@docker101:~# 
Viewing PID information on the host
root@docker101:~# docker ps                      # The container based on the nginx image is running normally.
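
On kernels 4.1 and later, /proc/<pid>/status contains an NSpid line that lists the PID of a process in every nested PID namespace, outermost first. The sketch below parses that line; the function names are illustrative, and the sample text is a hypothetical excerpt whose numbers simply mirror the nginx master process in the session above (the original post does not show this file).

```python
def parse_nspid(status_text):
    """Return the PIDs from the NSpid line of /proc/<pid>/status.

    The list is ordered from the outermost PID namespace to the innermost.
    An empty list means the line is absent (kernels older than 4.1).
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def read_nspid(pid):
    """Read and parse /proc/<pid>/status on a Linux host."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_nspid(status_file.read())


if __name__ == "__main__":
    # Hypothetical excerpt: host PID first, then the PID inside the container.
    sample = "Name:\tnginx\nPid:\t7862\nNSpid:\t7862\t1\n"
    print(parse_nspid(sample))
```

On the host, calling read_nspid with the host PID of a containerized process would be one way to confirm which PID the process has inside its container.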
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
44edd3477c0d        nginx               "nginx -g 'daemon of…"   28 minutes ago      Up 28 minutes       80/tcp              stupefied_driscoll
968e7ecc39f2        centos              "/bin/bash"              4 hours ago         Up 4 hours                              keen_meitner
root@docker101:~# 
root@docker101:~# 
root@docker101:~# docker exec -it 44edd3477c0d bash
root@44edd3477c0d:/# 
root@44edd3477c0d:/# cat /etc/issue                 # Check which operating system the nginx image is based on; it reports Debian, as expected.
Debian GNU/Linux 10 \n \l

root@44edd3477c0d:/# 
root@44edd3477c0d:/# uname -a                     # Apart from having its own hostname, the container uses the kernel of the Ubuntu host.
Linux 44edd3477c0d 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 GNU/Linux
root@44edd3477c0d:/# 
root@44edd3477c0d:/# apt-get update                          # Refresh the package index first; otherwise the installs below fail.
root@44edd3477c0d:/# 
root@44edd3477c0d:/# apt-get -y install net-tools           # Install the network tools (ifconfig, netstat) on Debian
root@44edd3477c0d:/# 
root@44edd3477c0d:/# apt-get -y install curl                  # Install curl on Debian
root@44edd3477c0d:/# 
root@44edd3477c0d:/# apt-get -y install procps             # Install procps, which provides top and ps, on Debian
root@44edd3477c0d:/# 
root@44edd3477c0d:/# apt-get -y install iputils-ping         # Install ping on Debian
root@44edd3477c0d:/# 
root@44edd3477c0d:/# top                         # top shows that the process with PID 1 is the nginx master process.
top - 12:09:29 up  6:27,  0 users,  load average: 0.00, 0.00, 0.00
Tasks:   4 total,   1 running,   3 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3921.8 total,   2027.7 free,    359.8 used,   1534.2 buff/cache
MiB Swap:   8192.0 total,   8192.0 free,      0.0 used.   3312.7 avail Mem 

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                                                                                
     1 root      20   0   10632   5468   4760 S   0.0   0.1   0:00.30 nginx                                                                                                                                                                                                  
     7 nginx     20   0   11120   2564   1436 S   0.0   0.1   0:00.00 nginx                                                                                                                                                                                                  
    37 root      20   0    3988   3284   2784 S   0.0   0.1   0:00.32 bash                                                                                                                                                                                                   
  2968 root      20   0    8024   3132   2664 R   0.0   0.1   0:00.00 top                                                                                                                                                                                                    
root@44edd3477c0d:/# 
root@44edd3477c0d:/# ps -ef | grep nginx
root          1      0  0 11:27 pts/0    00:00:00 nginx: master process nginx -g daemon off;
nginx         7      1  0 11:27 pts/0    00:00:00 nginx: worker process
root       2973     37  0 12:11 pts/1    00:00:00 grep nginx
root@44edd3477c0d:/# 
root@44edd3477c0d:/# netstat -untalp          # The nginx process is listening on port 80
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      1/nginx: master pro 
root@44edd3477c0d:/# 
root@44edd3477c0d:/# curl -I 127.0.0.1         # The nginx service can be accessed normally
HTTP/1.1 200 OK
Server: nginx/1.17.7
Date: Sun, 12 Jan 2020 12:17:40 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 24 Dec 2019 13:07:53 GMT
Connection: keep-alive
ETag: "5e020da9-264"
Accept-Ranges: bytes

root@44edd3477c0d:/# 
root@44edd3477c0d:/# pstree -p 1
nginx(1)---nginx(7)
root@44edd3477c0d:/# 
root@44edd3477c0d:/# nginx -s reload                # Reloading nginx does not terminate the running container.
2020/01/12 12:12:22 [notice] 2975#2975: signal process started
root@44edd3477c0d:/# 
root@44edd3477c0d:/# ps -ef | grep nginx
root          1      0  0 11:27 pts/0    00:00:00 nginx: master process nginx -g daemon off;
nginx      2976      1  0 12:12 pts/0    00:00:00 nginx: worker process
root       2978     37  0 12:12 pts/1    00:00:00 grep nginx
root@44edd3477c0d:/# 
root@44edd3477c0d:/# 
root@44edd3477c0d:/# pstree -p 1
nginx(1)---nginx(2976)
root@44edd3477c0d:/# 
root@44edd3477c0d:/# nginx -s stop                # Stopping the nginx process, which is PID 1 in this container, stops the container as well.
2020/01/12 12:14:15 [notice] 2983#2983: signal process started
root@44edd3477c0d:/# root@docker101:~# 
root@docker101:~# 
root@docker101:~# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
968e7ecc39f2        centos              "/bin/bash"         4 hours ago         Up 4 hours                              keen_meitner
root@docker101:~# 
root@docker101:~# docker ps -a                  # The nginx container is now in the exited state.
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                      PORTS               NAMES
44edd3477c0d        nginx               "nginx -g 'daemon of…"   47 minutes ago      Exited (0) 11 seconds ago                       stupefied_driscoll
968e7ecc39f2        centos              "/bin/bash"              4 hours ago         Up 4 hours                                      keen_meitner
root@docker101:~# 
root@docker101:~# docker start 44edd3477c0d          # The container can of course be started again
44edd3477c0d
root@docker101:~# 
root@docker101:~# docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
44edd3477c0d        nginx               "nginx -g 'daemon of…"   48 minutes ago      Up 2 seconds        80/tcp              stupefied_driscoll
968e7ecc39f2        centos              "/bin/bash"              4 hours ago         Up 4 hours                              keen_meitner
root@docker101:~# 
root@docker101:~#
Viewing PID information inside the container
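
The uname -a output inside the container can be compared with the same command on the host docker101 to see what the UTS namespace isolates and what stays shared. The sketch below is only an illustration: the two strings are copied from this article's transcripts, and the Uname tuple and parse_uname helper are made-up names.

```python
from typing import NamedTuple


class Uname(NamedTuple):
    sysname: str
    nodename: str
    release: str


def parse_uname(line):
    """Extract the first three fields of `uname -a` output."""
    fields = line.split()
    return Uname(fields[0], fields[1], fields[2])


# Output captured inside the nginx container.
CONTAINER = ("Linux 44edd3477c0d 4.15.0-74-generic #84-Ubuntu SMP "
             "Thu Dec 19 08:06:28 UTC 2019 x86_64 GNU/Linux")
# Output captured on the host docker101.
HOST = ("Linux docker101.yinzhengjie.org.cn 4.15.0-74-generic #84-Ubuntu SMP "
        "Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux")

if __name__ == "__main__":
    container, host = parse_uname(CONTAINER), parse_uname(HOST)
    print("same kernel release:", container.release == host.release)
    print("same hostname:", container.nodename == host.nodename)
```

The hostname differs because it lives in the container's UTS namespace, while the kernel release is identical because the container has no kernel of its own.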

5>.Net Namespace 

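  Like a virtual machine, each container has its own network interfaces, listening ports, TCP/IP stack, and so on. Docker gives each container a network namespace and creates a veth pair for it: one end appears inside the container (as eth0 in the output below), while the other end (vethX) is attached on the host to a bridge, usually docker0, from whose subnet the container receives its IP address. docker0 is essentially a Linux virtual bridge. A bridge is a data link layer device in the seven-layer OSI model; it forwards frames between network segments based on MAC addresses.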
root@docker101:~# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
44edd3477c0d        nginx               "nginx -g 'daemon of…"   About an hour ago   Up 16 minutes       80/tcp              stupefied_driscoll
968e7ecc39f2        centos              "/bin/bash"              4 hours ago         Up 4 hours                              keen_meitner
root@docker101:~# 
root@docker101:~# docker exec -it 44edd3477c0d bash
root@44edd3477c0d:/# 
root@44edd3477c0d:/# ifconfig 
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.3  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:ac:11:00:03  txqueuelen 0  (Ethernet)
        RX packets 13  bytes 1006 (1006.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 44  bytes 4822 (4.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 44  bytes 4822 (4.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@44edd3477c0d:/# 
root@44edd3477c0d:/# exit 
exit
root@docker101:~# 
root@docker101:~# docker exec -it 968e7ecc39f2 bash
[root@968e7ecc39f2 /]# 
[root@968e7ecc39f2 /]# yum -y install net-tools
[root@968e7ecc39f2 /]# 
[root@968e7ecc39f2 /]# ifconfig 
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.2  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:ac:11:00:02  txqueuelen 0  (Ethernet)
        RX packets 3708  bytes 15399326 (14.6 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3356  bytes 185759 (181.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@968e7ecc39f2 /]# 
[root@968e7ecc39f2 /]# ping 172.17.0.3            # Containers on the same host can reach each other by default.
PING 172.17.0.3 (172.17.0.3) 56(84) bytes of data.
64 bytes from 172.17.0.3: icmp_seq=1 ttl=64 time=0.101 ms
64 bytes from 172.17.0.3: icmp_seq=2 ttl=64 time=0.042 ms
64 bytes from 172.17.0.3: icmp_seq=3 ttl=64 time=0.108 ms
64 bytes from 172.17.0.3: icmp_seq=4 ttl=64 time=0.052 ms
64 bytes from 172.17.0.3: icmp_seq=5 ttl=64 time=0.112 ms
^C
--- 172.17.0.3 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 83ms
rtt min/avg/max/mdev = 0.042/0.083/0.112/0.029 ms
[root@968e7ecc39f2 /]# 
[root@968e7ecc39f2 /]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.17.0.1      0.0.0.0         UG    0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 eth0
[root@968e7ecc39f2 /]# 
[root@968e7ecc39f2 /]# 
Containers on the same host can communicate with each other by default
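
The addresses in the session above can be checked with Python's standard ipaddress module. The sketch below confirms that both container IPs fall into the subnet of the gateway 172.17.0.1 reported by route -n. It also reproduces a pattern visible in the ifconfig output, where each container's MAC address looks like the prefix 02:42 followed by the four octets of its IPv4 address. That pattern is an observation about these transcripts, not a documented guarantee, and the function names are illustrative.

```python
import ipaddress


def bridge_network(address, netmask):
    """Return the network that an address/netmask pair belongs to."""
    return ipaddress.ip_interface(f"{address}/{netmask}").network


def mac_from_ipv4(address, prefix=(0x02, 0x42)):
    """Build a MAC string from a two-byte prefix plus the four IPv4 octets.

    Assumption: this mirrors what the transcripts show for the default
    bridge; other Docker versions or network drivers may behave differently.
    """
    octets = ipaddress.IPv4Address(address).packed
    return ":".join(f"{byte:02x}" for byte in bytes(prefix) + octets)


if __name__ == "__main__":
    docker0 = bridge_network("172.17.0.1", "255.255.0.0")
    for container_ip in ("172.17.0.2", "172.17.0.3"):
        in_subnet = ipaddress.ip_address(container_ip) in docker0
        print(container_ip, in_subnet, mac_from_ipv4(container_ip))
```

Because both containers sit in the same layer 2 segment behind the bridge, they can reach each other directly without going through the default gateway.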

root@docker101:~# ifconfig 
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:87ff:febc:3cd8  prefixlen 64  scopeid 0x20<link>
        ether 02:42:87:bc:3c:d8  txqueuelen 0  (Ethernet)
        RX packets 5364  bytes 225742 (225.7 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6098  bytes 28280444 (28.2 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.6.101  netmask 255.255.248.0  broadcast 192.168.7.255
        inet6 fe80::20c:29ff:fe57:8cb7  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:57:8c:b7  txqueuelen 1000  (Ethernet)
        RX packets 179980  bytes 248819555 (248.8 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 37062  bytes 4513196 (4.5 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 314  bytes 31398 (31.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 314  bytes 31398 (31.3 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth47b028a: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::2ce5:b3ff:fe43:2cc4  prefixlen 64  scopeid 0x20<link>
        ether 2e:e5:b3:43:2c:c4  txqueuelen 0  (Ethernet)
        RX packets 7  bytes 574 (574.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 24  bytes 1832 (1.8 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vethed7471a: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::9cef:51ff:fe7a:fd0c  prefixlen 64  scopeid 0x20<link>
        ether 9e:ef:51:7a:fd:0c  txqueuelen 0  (Ethernet)
        RX packets 3364  bytes 186375 (186.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3719  bytes 15400120 (15.4 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@docker101:~# 
root@docker101:~# 
Network interfaces on the host, as shown above.
root@docker101:~# apt-get -y install bridge-utils                         # Install the bridge utilities; the "brctl" command below requires them
root@docker101:~# 
root@docker101:~# brctl show
bridge name    bridge id        STP enabled    interfaces
docker0        8000.024287bc3cd8    no        veth47b028a
                            vethed7471a
root@docker101:~# 
Bridge devices on the host
root@docker101:~# iptables -t nat -vnL                        # View the host's iptables rules.
Chain PREROUTING (policy ACCEPT 60 packets, 3937 bytes)
 pkts bytes target     prot opt in     out     source               destination         
   52 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 9 packets, 676 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 68 packets, 5251 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 69 packets, 5335 bytes)
 pkts bytes target     prot opt in     out     source               destination         
 3177 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0           

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0           
root@docker101:~# 
root@docker101:~# iptables  -vnL
Chain INPUT (policy ACCEPT 25019 packets, 126M bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
11436   28M DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
11436   28M DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
 6080   28M ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    1    84 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
 5355  225K ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    1    84 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 21954 packets, 3062K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
 5355  225K DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
11436   28M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
 5355  225K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
11436   28M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           
root@docker101:~# 
root@docker101:~# 
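Viewing the host's iptables rules: by default, Docker's network communication is implemented on top of iptables rules. The original post illustrates Docker's logical network with a figure at this point.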
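
The MASQUERADE rule in the nat table's POSTROUTING chain above can be read as a simple predicate: a packet is source-NATed when its source address lies in 172.17.0.0/16 and it leaves through any interface other than docker0. The following sketch only models that single rule in Python to make the logic explicit; it does not query iptables, and the function name is made up for illustration.

```python
import ipaddress

# Values taken from the POSTROUTING rule in the nat table above.
DOCKER_SUBNET = ipaddress.ip_network("172.17.0.0/16")
BRIDGE = "docker0"


def is_masqueraded(source_ip, out_interface):
    """Model the rule: source in 172.17.0.0/16 and out interface != docker0."""
    from_container = ipaddress.ip_address(source_ip) in DOCKER_SUBNET
    return from_container and out_interface != BRIDGE


if __name__ == "__main__":
    # Container traffic leaving through the host NIC is rewritten.
    print(is_masqueraded("172.17.0.3", "ens33"))
    # Container-to-container traffic stays on the bridge and is not.
    print(is_masqueraded("172.17.0.3", "docker0"))
```

This matches the behavior seen earlier: traffic between the two containers stays on docker0 with its original addresses, while traffic to the outside world appears to come from the host.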

6>.User Namespace

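  Different containers may contain identical user and group names, or identical UIDs and GIDs. How are the user spaces of the containers isolated from one another?
  User namespaces allow the same user names, UIDs, and GIDs to be created in every container on every host while limiting the scope of each user to its own container. Containers A and B can each have an account with the same name and ID, but that account is valid only inside its own container and cannot access the file system of the other; the two are isolated and do not affect each other. Note that Docker does not enable user namespace remapping by default, so the per-container /etc/passwd files shown below come from each container's own root file system.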
root@docker101:~# 
root@docker101:~# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
systemd-network:x:100:102:systemd Network Management,,,:/run/systemd/netif:/usr/sbin/nologin
systemd-resolve:x:101:103:systemd Resolver,,,:/run/systemd/resolve:/usr/sbin/nologin
syslog:x:102:106::/home/syslog:/usr/sbin/nologin
messagebus:x:103:107::/nonexistent:/usr/sbin/nologin
_apt:x:104:65534::/nonexistent:/usr/sbin/nologin
lxd:x:105:65534::/var/lib/lxd/:/bin/false
uuidd:x:106:110::/run/uuidd:/usr/sbin/nologin
dnsmasq:x:107:65534:dnsmasq,,,:/var/lib/misc:/usr/sbin/nologin
landscape:x:108:112::/var/lib/landscape:/usr/sbin/nologin
pollinate:x:109:1::/var/cache/pollinate:/bin/false
sshd:x:110:65534::/run/sshd:/usr/sbin/nologin
jason:x:1000:1000:jason:/home/jason:/bin/bash
root@docker101:~# 
root@docker101:~# 
root@docker101:~# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
44edd3477c0d        nginx               "nginx -g 'daemon of…"   3 hours ago         Up 2 hours          80/tcp              stupefied_driscoll
968e7ecc39f2        centos              "/bin/bash"              6 hours ago         Up 6 hours                              keen_meitner
root@docker101:~# 
root@docker101:~# docker exec -it 968e7ecc39f2 bash
[root@968e7ecc39f2 /]# 
[root@968e7ecc39f2 /]# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
[root@968e7ecc39f2 /]# 
[root@968e7ecc39f2 /]# exit 
exit
root@docker101:~# 
root@docker101:~# docker exec -it 44edd3477c0d bash
root@44edd3477c0d:/# 
root@44edd3477c0d:/# 
root@44edd3477c0d:/# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
_apt:x:100:65534::/nonexistent:/usr/sbin/nologin
nginx:x:101:101:nginx user,,,:/nonexistent:/bin/false
root@44edd3477c0d:/# 
root@44edd3477c0d:/# 
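Each container has its own superuser root and other ordinary users whose IDs are identical to those in other containers. They do not affect each other, because the users inside a container apply only to that container, as shown in the figure below.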
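
When a user namespace is in use, the kernel records how IDs inside the namespace map to IDs outside it in /proc/<pid>/uid_map and /proc/<pid>/gid_map. Each line holds three numbers: the first ID inside, the first ID outside, and the length of the range. The sketch below parses that format; the function names are illustrative, and the 100000 offset mentioned in the usage note is a hypothetical remapping, not something shown in the transcripts above.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside_start, outside_start, length)."""
    ranges = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            ranges.append((inside, outside, length))
    return ranges


def to_outside_id(ranges, inside_id):
    """Translate an ID seen inside the namespace to the ID outside it."""
    for inside, outside, length in ranges:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None  # unmapped IDs have no counterpart outside


if __name__ == "__main__":
    # Linux only: print the mapping of the current process.
    with open("/proc/self/uid_map") as uid_map:
        print(parse_id_map(uid_map.read()))
```

With an identity mapping such as "0 0 4294967295", root in the container is root on the host. With a hypothetical mapping of "0 100000 65536", UID 101 (the nginx user above) would correspond to UID 100101 outside the container.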

 

II. Linux Control Groups

1>.What Are Linux Cgroups

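  If no resource limits are placed on a container, the host allows it to consume an unlimited amount of memory. A bug in the code can sometimes make a program keep allocating memory until the host's memory is exhausted.

  To avoid this kind of problem, the host needs to limit the resources allocated to containers, such as CPU and memory. Cgroups is short for Linux Control Groups; its main purpose is to cap the resources that a group of processes can use, including CPU, memory, disk I/O, network bandwidth, and so on.

  In addition, cgroups can set process priorities and suspend and resume processes.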
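
Every process belongs to one cgroup per hierarchy, and the kernel lists these memberships in /proc/<pid>/cgroup as lines of the form hierarchy-ID:controller-list:path. The sketch below parses that format in Python; the function name and the sample paths in the usage note are illustrative assumptions.

```python
def parse_proc_cgroup(text):
    """Parse /proc/<pid>/cgroup into (hierarchy_id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        # The path may itself contain colons, so split at most twice.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), names, path))
    return entries


if __name__ == "__main__":
    # Linux only: show the cgroups of the current process.
    with open("/proc/self/cgroup") as cgroup_file:
        for hierarchy_id, names, path in parse_proc_cgroup(cgroup_file.read()):
            print(hierarchy_id, ",".join(names) or "(unified)", path)
```

For a containerized process, a hypothetical line such as "4:cpu,cpuacct:/docker/<container-id>" would indicate that its CPU usage is accounted and limited under that container's own cgroup.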

2>.Verifying Cgroups Support on the System

[root@computing121.yinzhengjie.org.cn ~]# cat /etc/redhat-release 
CentOS Linux release 7.2.1511 (Core) 
[root@computing121.yinzhengjie.org.cn ~]# 
[root@computing121.yinzhengjie.org.cn ~]# uname -r
3.10.0-327.el7.x86_64
[root@computing121.yinzhengjie.org.cn ~]# 
[root@computing121.yinzhengjie.org.cn ~]# uname -m
x86_64
[root@computing121.yinzhengjie.org.cn ~]# 
[root@computing121.yinzhengjie.org.cn ~]# cat /boot/config-3.10.0-327.el7.x86_64 | grep -i cgroup
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_SCHED=y
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_NETFILTER_XT_MATCH_CGROUP=m
CONFIG_NET_CLS_CGROUP=y
CONFIG_NETPRIO_CGROUP=m
[root@computing121.yinzhengjie.org.cn ~]# 
[root@computing121.yinzhengjie.org.cn ~]# 
[root@computing121.yinzhengjie.org.cn ~]# cat /boot/config-3.10.0-327.el7.x86_64 | grep -i cgroup | wc -l
13
[root@computing121.yinzhengjie.org.cn ~]# 
[root@computing121.yinzhengjie.org.cn ~]# 
CentOS7.2 Cgroups
root@docker101:~# hostname
docker101.yinzhengjie.org.cn
root@docker101:~# 
root@docker101:~# uname -r
4.15.0-74-generic
root@docker101:~# 
root@docker101:~# uname -m
x86_64
root@docker101:~# 
root@docker101:~# uname -a
Linux docker101.yinzhengjie.org.cn 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
root@docker101:~# 
root@docker101:~# 
root@docker101:~# cat /boot/config-4.15.0-74-generic | grep -i cgroup
CONFIG_CGROUPS=y
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
CONFIG_CGROUP_RDMA=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_BPF=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
CONFIG_NETFILTER_XT_MATCH_CGROUP=m
CONFIG_NET_CLS_CGROUP=m
CONFIG_CGROUP_NET_PRIO=y
CONFIG_CGROUP_NET_CLASSID=y
root@docker101:~# 
root@docker101:~# 
root@docker101:~# cat /boot/config-4.15.0-74-generic | grep -i cgroup | wc -l
19
root@docker101:~# 
Ubuntu18.04 Cgroups

 

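  Cgroups are enabled in the kernel by default. Comparing the CentOS and Ubuntu outputs above (13 versus 19 matching configuration lines), the newer Ubuntu kernel clearly supports more cgroup features.

3>.Viewing the System's Cgroups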
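
The grep output above mixes three kinds of lines: options built into the kernel (=y), options built as loadable modules (=m), and options that are not set. A raw line count therefore overstates what is actually enabled. The sketch below breaks the count down by state; the function name is illustrative, and the sample is a four-line excerpt of the CentOS output above.

```python
def summarize_cgroup_options(config_text):
    """Count cgroup-related kernel options by state: built-in, module, unset."""
    summary = {"y": 0, "m": 0, "unset": 0}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if "CGROUP" not in line.upper():
            continue  # same filter as `grep -i cgroup`
        if line.startswith("#") and line.endswith("is not set"):
            summary["unset"] += 1
        elif line.endswith("=y"):
            summary["y"] += 1
        elif line.endswith("=m"):
            summary["m"] += 1
    return summary


SAMPLE = """\
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_NETPRIO_CGROUP=m
"""

if __name__ == "__main__":
    print(summarize_cgroup_options(SAMPLE))
```

Feeding the function the full contents of a /boot/config-* file gives the per-state breakdown for that kernel.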

root@docker101:~# ll /sys/fs/cgroup/
total 0
drwxr-xr-x 15 root root 380 Jan 12 05:41 ./
drwxr-xr-x 10 root root   0 Jan 12 05:41 ../
dr-xr-xr-x  5 root root   0 Jan 12 05:41 blkio/                      # Limits I/O on block devices.
lrwxrwxrwx  1 root root  11 Jan 12 05:41 cpu -> cpu,cpuacct/              # Uses the scheduler to give cgroup tasks access to the CPU.
lrwxrwxrwx  1 root root  11 Jan 12 05:41 cpuacct -> cpu,cpuacct/            # Generates reports on the CPU resources used by cgroup tasks.
dr-xr-xr-x  5 root root   0 Jan 12 05:41 cpu,cpuacct/
dr-xr-xr-x  3 root root   0 Jan 12 05:41 cpuset/                      # On multi-core systems, assigns dedicated CPUs and memory nodes to cgroup tasks.
dr-xr-xr-x  5 root root   0 Jan 12 05:41 devices/                     # Allows or denies access to devices by cgroup tasks.
dr-xr-xr-x  3 root root   0 Jan 12 05:41 freezer/                     # Suspends and resumes cgroup tasks.
dr-xr-xr-x  3 root root   0 Jan 12 05:41 hugetlb/
dr-xr-xr-x  5 root root   0 Jan 12 05:41 memory/                      # Sets memory limits for each cgroup and generates memory usage reports.
lrwxrwxrwx  1 root root  16 Jan 12 05:41 net_cls -> net_cls,net_prio/         # Tags network packets with a class identifier so they can be associated with a cgroup.
dr-xr-xr-x  3 root root   0 Jan 12 05:41 net_cls,net_prio/
lrwxrwxrwx  1 root root  16 Jan 12 05:41 net_prio -> net_cls,net_prio/
dr-xr-xr-x  3 root root   0 Jan 12 05:41 perf_event/                   # Adds per-cgroup monitoring and tracing: it can monitor all threads that belong to a specific cgroup as well as threads running on a specific CPU.
dr-xr-xr-x  5 root root   0 Jan 12 05:41 pids/
dr-xr-xr-x  2 root root   0 Jan 12 05:41 rdma/
dr-xr-xr-x  6 root root   0 Jan 12 05:41 systemd/
dr-xr-xr-x  5 root root   0 Jan 12 05:41 unified/
root@docker101:~# 
root@docker101:~# 
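
The same listing can be produced programmatically. The sketch below walks a cgroup root directory and resolves symbolic links, which makes it easy to see that cpu and cpuacct (or net_cls and net_prio) are mounted together. It assumes a cgroup v1 layout like the one shown above; on a host that uses the unified cgroup v2 hierarchy the directory contains cgroups rather than one directory per controller. The function name is illustrative.

```python
import os


def list_controllers(root="/sys/fs/cgroup"):
    """Map each directory entry under the cgroup root to what it resolves to.

    Symlinks such as cpu -> cpu,cpuacct are followed, so co-mounted
    controllers end up pointing at the same directory name.
    """
    mapping = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            mapping[entry] = os.path.basename(os.path.realpath(path))
    return mapping


if __name__ == "__main__":
    for name, target in list_controllers().items():
        print(f"{name:<18} -> {target}")
```

Each controller directory in turn contains the control files (limits, statistics, task lists) for the cgroups created under it.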

