CentOS 7: Pods Stuck in ContainerCreating After Installing k8s with yum (Problem and Fix)


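Problem description

I installed a Kubernetes cluster with the CentOS 7 yum package manager. After kubectl created the service successfully, I ran kubectl get pods and found that although AGE kept increasing, the Pod status never moved past ContainerCreating.

In this post

  • Analyze the cause of the problem
  • Give a direct fix for it (not a perfect one)
  • Give an alternative approach

Let me walk you through it~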

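Problem analysis and fix

kubectl provides the describe subcommand, which prints detailed information about one or more specified resources.

Run kubectl describe pod mytomcat-9lcq5 to view the status of the problem Pod. The output is as follows: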

[root@kube-master app]# kubectl describe pod mytomcat-9lcq5
Name:		mytomcat-9lcq5
Namespace:	default
Node:		kube-node-2/192.168.87.145
Start Time:	Fri, 17 Apr 2020 15:53:50 +0800
Labels:		app=mytomcat
Status:		Pending
IP:		
Controllers:	ReplicationController/mytomcat
Containers:
  mytomcat:
    Container ID:		
    Image:			tomcat:9-jre8-alpine
    Image ID:			
    Port:			8080/TCP
    State:			Waiting
      Reason:			ContainerCreating
    Ready:			False
    Restart Count:		0
    Volume Mounts:		<none>
    Environment Variables:	<none>
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
No volumes.
QoS Class:	BestEffort
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  5m		5m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned mytomcat-9lcq5 to kube-node-2
  4m		4m		1	{kubelet kube-node-2}			Warning		FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.  details: (Get https://registry.access.redhat.com/v1/_ping: net/http: TLS handshake timeout)"

  3m	3m	1	{kubelet kube-node-2}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.  details: (Network timed out while trying to connect to https://registry.access.redhat.com/v1/repositories/rhel7/pod-infrastructure/images. You may want to check your internet connection or if you are behind a proxy.)"

  2m	2m	1	{kubelet kube-node-2}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.  details: (Error: image rhel7/pod-infrastructure:latest not found)"

  3m	1m	3	{kubelet kube-node-2}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/rhel7/pod-infrastructure:latest\""

Reading the events at the bottom of the output, Successfully assigned mytomcat-9lcq5 to kube-node-2 shows that the Pod was scheduled onto the host kube-node-2. Creating the Pod on that host then failed.

The reason given is: image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.
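When the Events table is long, it can help to pull out just the image names that failed. The following is a minimal sketch, assuming the kubelet message format shown in the output above ("image pull failed for <image>,"); the function name failed_pull_images is made up for this illustration.

```shell
# Read `kubectl describe pod` output on stdin and print each distinct image
# that appears in an "image pull failed for <image>," event message.
# failed_pull_images is a hypothetical helper name, not part of kubectl.
failed_pull_images() {
  grep -o 'image pull failed for [^,]*' |
    sed 's/^image pull failed for //' |
    sort -u
}

# Usage on the master node:
#   kubectl describe pod mytomcat-9lcq5 | failed_pull_images
```

The sketch only matches the wording seen in this output; other kubelet versions may phrase the event differently.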

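From this we learn that pulling an image from Red Hat's own docker registry requires a CA certificate for verification before the pull can succeed.

docker keeps its certificates under /etc/docker/certs.d. The domain in the error message is registry.access.redhat.com, so the certificate belongs in the subdirectory of that name.

Listing that directory with ll shows that /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt is a soft link (symbolic link) pointing to /etc/rhsm/ca/redhat-uep.pem.

Anyone familiar with soft links knows that a link shown in blinking red has a target that does not exist, so the certificate file /etc/rhsm/ca/redhat-uep.pem needs to be generated.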
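The blinking-red check depends on terminal colors, so here is a small sketch that tests the same condition in a script. The function name is_dangling_symlink is an assumption for this example; the path is the one from the text above.

```shell
# Succeed only if $1 is a symbolic link whose target does not exist.
# `-L` tests the link itself; `-e` follows the link to its target.
is_dangling_symlink() {
  [ -L "$1" ] && [ ! -e "$1" ]
}

cert=/etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt
if is_dangling_symlink "$cert"; then
  echo "dangling link: $cert -> $(readlink "$cert")"
fi
```

On a node where the target is missing, this prints the link and the path it points to; on a healthy node it prints nothing.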

Generate the certificate:

# openssl s_client -showcerts -servername registry.access.redhat.com -connect registry.access.redhat.com:443 </dev/null 2>/dev/null | openssl x509 -text > /etc/rhsm/ca/redhat-uep.pem

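This command sometimes fails with unable to load certificate 139930742028176:error:0906D06C:PEM routines:PEM_read_bio:no start line:pem_lib.c:707:Expecting: TRUSTED CERTIFICATE. That error means openssl x509 received no certificate on its input, which happens when the s_client connection did not complete. Simply run the command again.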
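Since the advice is to rerun the command until it works, the retry can be scripted. The following is a sketch under stated assumptions: retry and fetch_cert are hypothetical helper names, the one-second pause and the attempt count are arbitrary, and the openssl pipeline is the same one used above.

```shell
# Run a command up to $1 times, pausing one second between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  retry_max=$1
  shift
  retry_n=1
  while [ "$retry_n" -le "$retry_max" ]; do
    "$@" && return 0
    echo "attempt $retry_n/$retry_max failed" >&2
    retry_n=$((retry_n + 1))
    sleep 1
  done
  return 1
}

# Fetch the certificate presented by host $1 and write it to file $2.
# The output goes to a temporary file first, so a failed attempt does not
# leave an empty certificate file behind.
fetch_cert() {
  fc_tmp=$(mktemp) || return 1
  if openssl s_client -showcerts -servername "$1" -connect "$1:443" </dev/null 2>/dev/null |
      openssl x509 -text >"$fc_tmp" 2>/dev/null; then
    chmod 644 "$fc_tmp" && mv "$fc_tmp" "$2"
  else
    rm -f "$fc_tmp"
    return 1
  fi
}

# Usage on the worker node (5 attempts is an arbitrary choice):
#   retry 5 fetch_cert registry.access.redhat.com /etc/rhsm/ca/redhat-uep.pem
```

Writing to a temporary file is a design choice of this sketch: the plain redirect creates the target file even when openssl fails, which makes a later existence check misleading.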

After the command finishes, check the certificate file that the soft link points to:

[root@kube-node-2 registry.access.redhat.com]# ll /etc/rhsm/ca/redhat-uep.pem
-rw-r--r-- 1 root root 9233 Apr 17 16:55 /etc/rhsm/ca/redhat-uep.pem

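Now that the certificate file exists, go to the k8s master node kube-master, delete the Pods created earlier, and wait for them to be recreated successfully. (On the second node the image pull did not succeed because of network problems...)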
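To see which Pods still need attention after deleting them, the STATUS column of kubectl get pods can be filtered. This is a minimal sketch, assuming the default column layout (NAME, READY, STATUS, RESTARTS, AGE); pods_not_running is a made-up helper name.

```shell
# Read `kubectl get pods` output on stdin and print the names of Pods whose
# STATUS column (third field) is not "Running". The header row is skipped.
pods_not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Usage on kube-master (the label app=mytomcat comes from the describe output):
#   kubectl delete pod mytomcat-9lcq5
#   kubectl get pods -l app=mytomcat | pods_not_running
```

An empty result means every listed Pod has reached Running.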

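This completes creation of the Pod.

Some problems remain, though. Access to overseas networks from within China is occasionally unreliable, which can still make Pod creation fail. In that case describe shows the same messages as before, even though the certificate file exists and has content.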

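Root cause and an alternative approach

The k8s master node schedules the Pod onto a worker node. Once it arrives there, the worker pulls the Pod base image pod-infrastructure:latest from Red Hat's docker registry. Because that registry uses https and the certificate must be verified, the pull fails when the certificate is missing.

In addition, because the image lives in Red Hat's docker registry, the handshake can fail from networks inside China, so the image cannot be downloaded.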
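To tell a network problem apart from a certificate problem, the registry endpoint can be probed directly from the worker node. The sketch below assumes curl is installed; can_reach is a hypothetical helper name and the 10-second limit is an arbitrary choice. The _ping URL is the one that appears in the Events output.

```shell
# Return 0 if a request to URL $1 completes within 10 seconds
# (connection and TLS handshake succeeded and a response arrived).
# Any HTTP status counts as reachable, since only connectivity is being tested.
can_reach() {
  curl --silent --show-error --output /dev/null --max-time 10 "$1"
}

# Usage on a worker node:
#   if can_reach https://registry.access.redhat.com/v1/_ping; then
#     echo "registry reachable"
#   else
#     echo "registry unreachable from this node"
#   fi
```

If the probe times out while the certificate file is present, the failure is more likely on the network side.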

So the problem becomes how to fix the failed pull of the k8s pod-infrastructure image. Here is one approach, with the following steps:

  • Pull a pod-infrastructure image that someone else has uploaded to the official docker registry: docker pull tianyebj/pod-infrastructure

  • Add a tag that points to your private registry, for example: docker tag tianyebj/pod-infrastructure 10.2.7.70:5000/dev/pod-infrastructure

  • Push the image to the private registry, for example: docker push 10.2.7.70:5000/dev/pod-infrastructure

  • On every worker node, edit /etc/kubernetes/kubelet and change registry.access.redhat.com/rhel7/pod-infrastructure to the tag you just set

    sed -i "s#registry.access.redhat.com/rhel7/pod-infrastructure#<private-registry pod-infrastructure image tag>#" /etc/kubernetes/kubelet

  • Restart kubelet on every worker node with systemctl restart kubelet, and you are done
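The steps above can be sketched as a script. The config-file edit is wrapped in a function that keeps a backup; rewrite_infra_image is a made-up name for this illustration, and the registry address 10.2.7.70:5000 is the example address from the steps, not a value to copy.

```shell
OLD_IMAGE='registry.access.redhat.com/rhel7/pod-infrastructure'

# Replace the Red Hat image repository in a kubelet config file.
#   $1 = config file
#   $2 = new repository, without a tag (the tag suffix already in the file is kept)
# The original file is preserved as "$1.bak". $2 must not contain '#' or '&',
# because both are special in this sed expression.
rewrite_infra_image() {
  cp -p "$1" "$1.bak" &&
    sed "s#${OLD_IMAGE}#$2#" "$1.bak" >"$1"
}

# Usage, on a machine that can reach Docker Hub and the private registry:
#   docker pull tianyebj/pod-infrastructure
#   docker tag tianyebj/pod-infrastructure 10.2.7.70:5000/dev/pod-infrastructure
#   docker push 10.2.7.70:5000/dev/pod-infrastructure
#
# Then on every worker node:
#   rewrite_infra_image /etc/kubernetes/kubelet 10.2.7.70:5000/dev/pod-infrastructure
#   systemctl restart kubelet
```

Keeping the .bak copy makes it easy to diff the change or roll it back if kubelet does not come up after the restart.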

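Notes:

  • The uploaded image must be public; otherwise kubelet itself has no permission to pull it. Alternatively, ssh into each worker node, log in to the registry, and run docker pull <private-registry pod-infrastructure image tag>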

References

https://github.com/CentOS/sig-atomic-buildscripts/issues/329
https://cloud.tencent.com/developer/article/1156329

This article is licensed under CC BY 4.0. When reposting, please credit the author and the source.
https://www.cnblogs.com/hellxz/p/k8s-pod-always-container-creating-status-problem.html