Today, while setting up a k8s node environment, I happened to run the following commands:
[root@hxin221 ~]# ifconfig docker0 down &>/dev/null
[root@hxin221 ~]# brctl delbr docker0 &>/dev/null
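And just like that, the docker bridge was gone. Don't ask me why I deleted it; all I can say is that my mind was somewhere else at the time.
That is where the trouble started: when I created a pod in k8s, it failed: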
test mywebcalculator-1-0-1-index0 0/1 ImageNotReady 0 4s [cpu:1/1 memory:268435456/268435456] <none> ***.***.***.221
OK, something is wrong, so let's look for the cause. First, check the status of docker:
[root@hxin221 ~]# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/docker.service.d
└─flannel.conf
Active: active (running) since Tue 2018-07-24 14:41:09 CST; 6s ago
Docs: https://docs.docker.com
Process: 3887 ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT (code=exited, status=0/SUCCESS)
Process: 3885 ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT (code=exited, status=0/SUCCESS)
Process: 3883 ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT (code=exited, status=0/SUCCESS)
Main PID: 3190 (dockerd)
Memory: 33.5M
CGroup: /system.slice/docker.service
├─3190 /usr/bin/dockerd --bip=10.0.77.1/24 --mtu=1450 --bip=10.0.77.1/24 --mtu=1450 --bip=10.0.77.1/24 --mtu=1450
├─3210 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc
└─5370 docker-containerd-shim fb19c7c56afcc16e3b08977de9be597cb7cf153fafc998717a0449b3d00f9d27 /var/run/docker/libcontainerd/fb19c7c56afcc16e3b08977de9be597cb7cf153fafc998717a0449b3d00f9d27 docker-runc
Jul 24 14:41:08 hxin221 dockerd[3190]: time="2018-07-24T14:41:08.253923793+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Jul 24 14:41:08 hxin221 dockerd[3190]: time="2018-07-24T14:41:08.254224418+08:00" level=warning msg="mountpoint for pids not found"
Jul 24 14:41:08 hxin221 dockerd[3190]: time="2018-07-24T14:41:08.254460443+08:00" level=info msg="Loading containers: start."
Jul 24 14:41:08 hxin221 dockerd[3190]: time="2018-07-24T14:41:08.273844445+08:00" level=info msg="Firewalld running: false"
Jul 24 14:41:09 hxin221 dockerd[3190]: time="2018-07-24T14:41:09.307222239+08:00" level=info msg="Loading containers: done."
Jul 24 14:41:09 hxin221 dockerd[3190]: time="2018-07-24T14:41:09.322792104+08:00" level=info msg="Daemon has completed initialization"
Jul 24 14:41:09 hxin221 dockerd[3190]: time="2018-07-24T14:41:09.322832435+08:00" level=info msg="Docker daemon" commit=092cba3 graphdriver=devicemapper version=1.13.1
Jul 24 14:41:09 hxin221 dockerd[3190]: time="2018-07-24T14:41:09.332075018+08:00" level=info msg="API listen on /var/run/docker.sock"
Jul 24 14:41:09 hxin221 systemd[1]: Started Docker Application Container Engine.
Jul 24 14:41:12 hxin221 dockerd[3190]: time="2018-07-24T14:41:12.618808849+08:00" level=error msg="Handler for GET /images/registry.wae.haplat.net/test/mywebcalculator:1.0.0/json returned error: No such image: registry.wae.haplat.net/test/mywebcalculator:1.0.0"
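The last line is an error: the image is not on the node, so it apparently could not be downloaded. Why?
I then checked the events on the master: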
[root@wscdn09 ~]# kubectl get events --namespace test
FIRSTSEEN LASTSEEN COUNT NAME KIND SUBOBJECT REASON SOURCE MESSAGE
8s 8s 1 mywebcalculator-1-0-1-index0 Pod FailedSync {kubelet ***.***.***.221} Error syncing pod, skipping: API error (404): {"message":"failed to create endpoint k8s_POD.8c50e42c_mywebcalculator-1-0-1-index0_test_75e67623-8f0d-11e8-8336-d4bed9aa7cbc_e13625a3 on network bridge: adding interface veth721b28f to bridge docker0 failed: could not find bridge docker0: route ip+net: no such network interface"}
So that is the cause: the bridge cannot be found. Ha, what a pit to fall into. Let's check ifconfig to confirm:
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.0.77.0 netmask 255.255.0.0 broadcast 0.0.0.0
inet6 fe80::48ed:42ff:fec3:2cb prefixlen 64 scopeid 0x20<link>
ether 4a:ed:42:c3:02:cb txqueuelen 0 (Ethernet)
RX packets 6496081 bytes 305348102 (291.2 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4819250 bytes 404274861 (385.5 MiB)
TX errors 0 dropped 616845 overruns 0 carrier 0 collisions 0
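The same check can be scripted instead of reading through ifconfig output by eye. This is only a minimal sketch: the function names iface_exists and report_iface are made up for illustration, and it assumes the iproute2 ip command is installed on the node.

```shell
#!/bin/sh
# Return 0 if the named network interface exists, non-zero otherwise.
# "ip link show DEV" exits non-zero when the device does not exist.
iface_exists() {
    ip link show "$1" >/dev/null 2>&1
}

# Hypothetical helper: print a one-line verdict for an interface.
report_iface() {
    if iface_exists "$1"; then
        echo "$1: present"
    else
        echo "$1: missing"
    fi
}
```

Calling report_iface docker0 and report_iface flannel.1 on the node would then show at a glance which of the two devices is there.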
Sure enough, there is no docker0, only flannel.1. Well, if it can't be found, just create one, based on flannel.1:
[root@hxin221 ~]# docker network create --driver bridge --subnet 10.0.77.1/24 --gateway 10.0.77.1 docker0
Error response from daemon: failed to allocate gateway (10.0.77.1): Address already in use
(⊙o⊙)... the subnet is already in use?
[root@hxin221 ~]# docker network inspect bridge
[
{
"Name": "bridge",
"Id": "7cf94d44da578e9ead3aeca12f772ce9bae3c5faedacf870fd4c7da0e33b9d42",
"Created": "2018-07-24T14:45:33.910042834+08:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.77.1/24",
"Gateway": "10.0.77.1"
}
]
},
"Internal": false,
"Attachable": false,
"Containers": {},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1450"
},
"Labels": {}
}
]
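A likely reading of this output, offered as an assumption rather than something verified here: docker network inspect shows the daemon's own record of its default network, while brctl delbr only removed the kernel device, so the two can disagree. The record still holds 10.0.77.1/24, which would also explain why creating another network on that subnet was refused. The sketch below compares both views; default_bridge_device and check_default_bridge are hypothetical names.

```shell
#!/bin/sh
# Print the Linux bridge device that Docker's default "bridge" network
# claims to use, read from the network's Options map.
default_bridge_device() {
    docker network inspect bridge \
        --format '{{ index .Options "com.docker.network.bridge.name" }}'
}

# Compare Docker's record with what the kernel actually has.
check_default_bridge() {
    dev=$(default_bridge_device) || return 2
    if ip link show "$dev" >/dev/null 2>&1; then
        echo "network 'bridge' -> device $dev: present"
    else
        echo "network 'bridge' -> device $dev: missing from the kernel"
    fi
}
```

Running check_default_bridge in the state described above should report the device as missing, even though the network itself is still listed.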
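Fine, you really are still there. Wasn't it deleted? Why is it still here? I can't say for sure. I trawled the web and found one useful hint. Everyone copies from everyone else and everybody claims to be the original author, so I had no idea whether it was true, but never mind, let's try it first: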
[root@hxin221 ~]# systemctl daemon-reload
[root@hxin221 ~]# systemctl restart docker
[root@hxin221 ~]# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 10.0.77.1 netmask 255.255.255.0 broadcast 0.0.0.0
ether 02:42:98:1f:bc:cc txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
OK, it's back. This confirms that docker checks for docker0 and recreates it when the daemon is started again with systemctl restart docker!
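The recovery can be wrapped in a small guard so that docker is only restarted when the bridge is really gone. This is a rough sketch under the assumptions of this post (systemd-managed docker, bridge named docker0); ensure_docker0 is a made-up name, and restarting the daemon will normally interrupt running containers, so use it with care.

```shell
#!/bin/sh
# Restart docker only if the docker0 bridge is missing, then re-check.
ensure_docker0() {
    if ip link show docker0 >/dev/null 2>&1; then
        echo "docker0 present, nothing to do"
        return 0
    fi
    echo "docker0 missing, restarting docker"
    systemctl restart docker || return 1
    # Succeeds only if the restart brought the bridge back.
    ip link show docker0 >/dev/null 2>&1
}
```

Run ensure_docker0 as root on the affected node; a non-zero exit status means the bridge is still missing after the restart.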
That's all.
