References:
https://blog.csdn.net/learner198461/article/details/78036854
https://liyang.pro/solve-k8s-pod-containercreating/
https://blog.csdn.net/golduty2/article/details/80625485
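I made small modifications and added notes to match my actual situation.
When creating the Dashboard, the pod status stayed at ContainerCreating:
[root@MyCentos7 k8s]# kubectl get pod --namespace=kube-system
NAME                                    READY     STATUS              RESTARTS   AGE
kubernetes-dashboard-2094756401-kzhnx   0/1       ContainerCreating   0          10m
Use the kubectl describe command to see the details (or check the log in /var/log/messages):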
[root@MyCentos7 k8s]# kubectl describe pod kubernetes-dashboard-2094756401-kzhnx --namespace=kube-system
Name:           kubernetes-dashboard-2094756401-kzhnx
Namespace:      kube-system
Node:           mycentos7-1/192.168.126.131
Start Time:     Tue, 05 Jun 2018 19:28:25 +0800
Labels:         app=kubernetes-dashboard
                pod-template-hash=2094756401
Status:         Pending
IP:
Controllers:    ReplicaSet/kubernetes-dashboard-2094756401
Containers:
  kubernetes-dashboard:
    Container ID:
    Image:              daocloud.io/megvii/kubernetes-dashboard-amd64:v1.8.0
    Image ID:
    Port:               9090/TCP
    Args:
      --apiserver-host=http://192.168.126.130:8080
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Liveness:           http-get http://:9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Volume Mounts:      <none>
    Environment Variables:      <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
No volumes.
QoS Class:      BestEffort
Tolerations:    <none>
Events:
  FirstSeen  LastSeen  Count  From                   SubObjectPath  Type     Reason      Message
  ---------  --------  -----  ----                   -------------  ----     ------      -------
  11m        11m       1      {default-scheduler }                  Normal   Scheduled   Successfully assigned kubernetes-dashboard-2094756401-kzhnx to mycentos7-1
  11m        49s       7      {kubelet mycentos7-1}                 Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failede:latest, this may be because there are no credentials on this request. details: (open /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt: no such file or directory)"
  11m        11s       47     {kubelet mycentos7-1}                 Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"registry.access.redh
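In output this long, the image that actually failed to pull is easy to miss. As a rough sketch (failed_pull_images is a helper name I made up, and it assumes the kubelet event wording seen in this post), the image names can be filtered out of the describe output like this:

```shell
# Hypothetical helper: reads `kubectl describe pod` output on stdin and prints
# each distinct image named in an "image pull failed for ..." or
# "Back-off pulling image ..." event message.
failed_pull_images() {
  grep -oE '(image pull failed for |Back-off pulling image \\?")[^ ,"\\]+' |
    sed -E 's/^(image pull failed for |Back-off pulling image \\?")//' |
    sort -u
}

# Usage on a machine with kubectl configured (not run here):
# kubectl describe pod kubernetes-dashboard-2094756401-kzhnx --namespace=kube-system | failed_pull_images
```

The point is only to make it obvious that the failing image is the pod infrastructure image rather than the dashboard image itself.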
On the worker node, it turns out that the image registry.access.redhat.com/rhel7/pod-infrastructure:latest is being pulled at this point. When I pulled it manually on the node, I got the following error:
[root@MyCentos7 k8s]# docker pull registry.access.redhat.com/rhel7/pod-infrastructure:latest
Trying to pull repository registry.access.redhat.com/rhel7/pod-infrastructure ...
open /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt: no such file or directory
Following the path in the error message, the file turns out to be a symbolic link whose target lives under /etc/rhsm, and /etc/rhsm does not exist:
[root@MyCentos7 ca]# cd /etc/docker/certs.d/registry.access.redhat.com/
[root@MyCentos7 registry.access.redhat.com]# ll
total 0
lrwxrwxrwx. 1 root root 27 May 11 14:30 redhat-ca.crt -> /etc/rhsm/ca/redhat-uep.pem
[root@MyCentos7 ca]# cd /etc/rhsm
-bash: cd: /etc/rhsm: No such file or directory
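The same check can be done in one step. A minimal sketch, where check_symlink is a helper name I made up for illustration; it relies only on the standard test and readlink utilities:

```shell
# Hypothetical helper: prints where a symlink points and whether the
# target exists. Returns 0 if it resolves, 1 if dangling, 2 if not a symlink.
check_symlink() {
  link=$1
  if [ ! -L "$link" ]; then
    echo "$link: not a symlink"
    return 2
  fi
  target=$(readlink "$link")
  # [ -e ] follows the link, so it is false when the target is missing.
  if [ -e "$link" ]; then
    echo "$link -> $target (ok)"
  else
    echo "$link -> $target (dangling)"
    return 1
  fi
}

# Usage on the node (not run here):
# check_symlink /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt
```

A dangling result here corresponds to the "no such file or directory" error that docker pull reports.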
Install rhsm (on the node):
yum install *rhsm*
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: mirror.lzu.edu.cn
 * extras: mirror.lzu.edu.cn
 * updates: ftp.sjtu.edu.cn
base                                     | 3.6 kB  00:00:00
extras                                   | 3.4 kB  00:00:00
updates                                  | 3.4 kB  00:00:00
Package python-rhsm-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-1.20.11-1.el7.centos.x86_64 which is already installed
Package subscription-manager-rhsm-1.20.11-1.el7.centos.x86_64 already installed and latest version
Package python-rhsm-certificates-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-certificates-1.20.11-1.el7.centos.x86_64 which is already installed
Package subscription-manager-rhsm-certificates-1.20.11-1.el7.centos.x86_64 already installed and latest version
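However, the /etc/rhsm/ca/ directory still contained no certificate file, and repeatedly uninstalling and reinstalling did not help. I later realized that what everyone means by yum install *rhsm* is really installing python-rhsm-1.19.10-1.el7_4.x86_64 and python-rhsm-certificates-1.19.10-1.el7_4.x86_64, yet the actual installation prints the following:
Package python-rhsm-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-1.20.11-1.el7.centos.x86_64 which is already installed
Package subscription-manager-rhsm-1.20.11-1.el7.centos.x86_64 already installed and latest version
Package python-rhsm-certificates-1.19.10-1.el7_4.x86_64 is obsoleted by subscription-manager-rhsm-certificates-1.20.11-1.el7.centos.x86_64 which is already installed
Package subscription-manager-rhsm-certificates-1.20.11-1.el7.centos.x86_64 already installed and latest version
This is the culprit: the rpm packages we wanted to install have been obsoleted. The replacement packages only create the directory when installed and do not ship the certificate file redhat-uep.pem. So I downloaded the two packages manually: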
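wget ftp://ftp.icm.edu.pl/vol/rzm6/linux-scientificlinux/7.4/x86_64/os/Packages/python-rhsm-certificates-1.19.9-1.el7.x86_64.rpm
wget ftp://ftp.icm.edu.pl/vol/rzm6/linux-scientificlinux/7.4/x86_64/os/Packages/python-rhsm-1.19.9-1.el7.x86_64.rpm
Note: this step sometimes fails with an error saying the two rpm files cannot be found. In that case, log in to the FTP server manually to download them. The file listing takes a while to load, after which you can download the two rpm files you need (possibly a network issue; the server is unstable at times).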
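Before installing, it may be worth confirming that a downloaded package really ships the certificate, since that was the whole problem with the replacement packages. A small sketch: ships_redhat_cert is a name I made up, and it just looks for the certificate path in a file list such as the one printed by rpm -qlp (query, list files, from a package file):

```shell
# Hypothetical helper: succeeds if the file list on stdin contains the
# Red Hat CA certificate path that the docker symlink points to.
ships_redhat_cert() {
  grep -qx '/etc/rhsm/ca/redhat-uep\.pem'
}

# Usage on the node (not run here): list the files of a downloaded,
# not yet installed rpm and check for the certificate.
# rpm -qlp python-rhsm-certificates-1.19.9-1.el7.x86_64.rpm | ships_redhat_cert \
#   && echo "certificate included"
```

For an already installed package, rpm -ql with the package name gives the same kind of list.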
Make sure the versions match, then remove the wrongly installed packages:
yum remove *rhsm*
Then run the install command:
rpm -ivh *.rpm
warning: python-rhsm-1.19.9-1.el7.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 192a7d7d: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:python-rhsm-certificates-1.19.9-1################################# [ 50%]
   2:python-rhsm-1.19.9-1.el7         ################################# [100%]
I hit another error at this step:
[root@neal registry.access.redhat.com]# rpm -ivh *.rpm
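warning: python-rhsm-1.19.9-1.el7.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 192a7d7d: NOKEY
error: Failed dependencies:
        python-rhsm <= 1.20.3-1 is obsoleted by (installed) subscription-manager-rhsm-1.20.11-1.el7.centos.x86_64
        python-rhsm-certificates <= 1.20.3-1 is obsoleted by (installed) subscription-manager-rhsm-certificates-1.20.11-1.el7.centos.x86_64
If this happens, jump to the part below the divider line, remove the already installed packages using the method from the article there, and then rerun the install command above.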
Next, verify by pulling the image manually:
docker pull registry.access.redhat.com/rhel7/pod-infrastructure:latest
Trying to pull repository registry.access.redhat.com/rhel7/pod-infrastructure ...
latest: Pulling from registry.access.redhat.com/rhel7/pod-infrastructure
26e5ed6899db: Pull complete
66dbe984a319: Pull complete
9138e7863e08: Pull complete
Digest: sha256:92d43c37297da3ab187fc2b9e9ebfb243c1110d446c783ae1b989088495db931
Status: Downloaded newer image for registry.access.redhat.com/rhel7/pod-infrastructure:latest
Problem solved.
--------------------------------------------------------------------------------------------------------------------------------
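In one of the introductory examples from 《Kubernetes權威指南》 (Kubernetes: The Definitive Guide), I found that the pod stayed in the ContainerCreating state. Running kubectl describe pod mysql showed the following errors: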
Events:
  FirstSeen  LastSeen  Count  From                 SubObjectPath           Type     Reason             Message
  ---------  --------  -----  ----                 -------------           ----     ------             -------
  1h         24m       17     {kubelet 127.0.0.1}                          Warning  FailedSync         Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (open /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt: no such file or directory)"
  1h         19m       291    {kubelet 127.0.0.1}                          Warning  FailedSync         Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/rhel7/pod-infrastructure:latest\""
  15m        15m       1      {kubelet 127.0.0.1}                          Warning  MissingClusterDNS  kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
  15m        15m       1      {kubelet 127.0.0.1}  spec.containers{mysql}  Normal   Pulling            pulling image "mysql"
  7m         7m        1      {kubelet 127.0.0.1}                          Warning  MissingClusterDNS  kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
  7m         7m        1      {kubelet 127.0.0.1}  spec.containers{mysql}  Normal   Pulling            pulling image "mysql"
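The problem is fairly obvious: the file /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt is missing. ls -l shows that it is a symbolic link to /etc/rhsm/ca/redhat-uep.pem, but that target file does not exist. Search for the related packages with the yum search *rhsm* command:
- Install the python-rhsm-certificates package: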
# yum install python-rhsm-certificates -y
Here another problem shows up:
python-rhsm-certificates <= 1.20.3-1 is obsoleted by (installed) subscription-manager-rhsm-certificates-1.20.11-1.el7.centos.x86_64
So what to do? Simply uninstall the subscription-manager-rhsm-certificates package with yum remove subscription-manager-rhsm-certificates -y, and then download the python-rhsm-certificates package:
# wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
Then install the rpm package manually:
# rpm -ivh python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
After this, the file /etc/rhsm/ca/redhat-uep.pem exists.
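To double-check that the docker symlink now resolves to something that looks like a certificate, a quick sketch follows. looks_like_pem_cert is a made-up helper name, and it assumes the file is a PEM file with the usual BEGIN CERTIFICATE header line:

```shell
# Hypothetical helper: succeeds if the path resolves (through any symlink)
# to a regular file containing a PEM certificate header line.
looks_like_pem_cert() {
  [ -f "$1" ] && grep -q -e '^-----BEGIN CERTIFICATE-----' "$1"
}

# Usage on the node (not run here):
# looks_like_pem_cert /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt \
#   && echo "certificate reachable through the symlink"
```

This only checks the header, not the validity of the certificate itself.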
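- Download the image with docker pull registry.access.redhat.com/rhel7/pod-infrastructure:latest. This can be very slow. You can register an account at https://dashboard.daocloud.io, click the accelerator, then copy and run the code shown there; after docker is restarted, pulls will be accelerated. If the docker service fails to start on restart, check it with systemctl status docker: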
# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2018-05-28 22:13:37 CST; 13s ago
     Docs: http://docs.docker.com
  Process: 79849 ExecStart=/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY $REGISTRIES (code=exited, status=1/FAILURE)
 Main PID: 79849 (code=exited, status=1/FAILURE)

May 28 22:13:37 kube.example.com systemd[1]: Starting Docker Application Container Engine...
May 28 22:13:37 kube.example.com dockerd-current[79849]: unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '}' loo...y string
May 28 22:13:37 kube.example.com systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
May 28 22:13:37 kube.example.com systemd[1]: Failed to start Docker Application Container Engine.
May 28 22:13:37 kube.example.com systemd[1]: Unit docker.service entered failed state.
May 28 22:13:37 kube.example.com systemd[1]: docker.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
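At this point, delete /etc/docker/seccomp.json and restart docker again. (Note that the log above actually complains about a syntax error in /etc/docker/daemon.json, so that file is worth checking as well.)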
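Because dockerd refuses to start when /etc/docker/daemon.json cannot be parsed, a quick syntax check before restarting may save a round trip. A minimal sketch, assuming a Python interpreter is installed on the node; json_is_valid is a name I made up, while python -m json.tool is the JSON checker from the Python standard library:

```shell
# Hypothetical helper: succeeds if the given file parses as JSON.
# Returns 2 if no python3/python interpreter is found (an assumption of
# this sketch; any other JSON validator would do).
json_is_valid() {
  py=$(command -v python3 || command -v python) || return 2
  "$py" -m json.tool "$1" >/dev/null 2>&1
}

# Usage on the node (not run here):
# json_is_valid /etc/docker/daemon.json || echo "daemon.json is not valid JSON"
```

A stray trailing comma left behind after editing the file is one common way to end up with JSON that fails this check.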
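- Now delete the rc, svc and pod created earlier and recreate them; after a short while the pod starts successfully.
My guess at the cause: according to the error message, starting a pod requires the registry.access.redhat.com/rhel7/pod-infrastructure:latest image, which has to be downloaded from the Red Hat registry. Without the certificate the pull fails; once the certificate is installed, it works.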