Handling expired certificates in a k8s cluster


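1. The problem

   I wanted to create a Pod on a Kubernetes cluster (test environment) that had not been used for a long time, and found that the cluster was unusable. The error was: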

   kubectl get ns

   The connection to the server 10.21.4.113:6443 was refused - did you specify the right host or port?  

   My first thought was that kube-apiserver had a problem, so I ran systemctl status kubelet -l to look for the cause:

   

   systemctl status kubelet.service
   ● kubelet.service - kubelet: The Kubernetes Node Agent
      Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
     Drop-In: /usr/lib/systemd/system/kubelet.service.d
              └─10-kubeadm.conf
      Active: active (running) since 二 2021-12-07 15:20:44 CST; 35s ago
        Docs: https://kubernetes.io/docs/
    Main PID: 28356 (kubelet)
       Tasks: 55
      Memory: 48.5M
      CGroup: /system.slice/kubelet.service
              └─28356 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kub...

12月 07 15:21:18 k8smaster kubelet[28356]: E1207 15:21:18.985804 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.086034 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.186265 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.286503 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.386722 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.486971 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.587204 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.687460 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.787686 28356 kubelet.go:2263] node "k8smaster" not found
12月 07 15:21:19 k8smaster kubelet[28356]: E1207 15:21:19.888025 28356 kubelet.go:2263] node "k8smaster" not found

Judging from these errors, one of two components is at fault: either kube-apiserver or etcd.

2. Fixing the problem

My test cluster was installed with kubeadm, so I used docker ps and docker logs to look at the logs of kube-apiserver and etcd and find out where the problem actually was.
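The log check can be sketched roughly as below. This assumes the usual kubeadm-on-Docker container naming (names starting with `k8s_kube-apiserver` and `k8s_etcd`); adjust the filter if yours differ. The helper names `find_cert_errors` and `latest_container` are made up for illustration, and the grep patterns are only a guess at typical certificate messages.

```shell
#!/bin/sh
# Filter log text on stdin down to lines that look like certificate problems.
# The patterns are illustrative, not an exhaustive list.
find_cert_errors() {
    grep -iE 'x509|certificate has expired|bad certificate'
}

# Print the ID of the first container listed whose name matches $1.
# -a includes stopped containers, since a crashing apiserver may have exited.
latest_container() {
    docker ps -a -q --filter "name=$1" | head -n 1
}

# Usage on the master node (not run here):
#   docker logs --tail 200 "$(latest_container k8s_kube-apiserver)" 2>&1 | find_cert_errors
#   docker logs --tail 200 "$(latest_container k8s_etcd)" 2>&1 | find_cert_errors
```

If nothing is printed, the cause is probably something other than certificates and the full logs are worth reading.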

 

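 The logs show that the cluster's certificates have expired, so renewing them should fix the problem.

The kubelet component certificate is valid for 1 year by default. Once the cluster has been running for a year, certificate has expired or is not yet valid errors appear and the cluster's Nodes can no longer communicate with the Master. If you restart at that point, k8s will not come back up.

Regenerating the certificates
Check whether the certificate has expired: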

openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt
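The -text output is long; when only the expiry matters, openssl's -enddate and -checkend options are enough. Below is a minimal sketch. The function name `cert_status` and the 30-day default warning window are my own choices, not part of kubeadm or openssl.

```shell
#!/bin/sh
# cert_status FILE [WARN_DAYS]
# Prints "expired", "expiring" (within WARN_DAYS, default 30) or "ok".
# Note: an unreadable file is also reported as "expired", because openssl
# fails in both cases.
cert_status() {
    cert=$1
    warn_days=${2:-30}
    # -checkend N exits non-zero if the certificate expires within N seconds.
    if ! openssl x509 -noout -checkend 0 -in "$cert" >/dev/null 2>&1; then
        echo expired
    elif ! openssl x509 -noout -checkend "$((warn_days * 86400))" -in "$cert" >/dev/null 2>&1; then
        echo expiring
    else
        echo ok
    fi
}

# Usage on the master node (not run here):
#   for c in /etc/kubernetes/pki/*.crt; do
#       printf '%s  %s  %s\n' "$(cert_status "$c")" "$(openssl x509 -noout -enddate -in "$c")" "$c"
#   done
```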

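In versions after 1.14 you can check the expiry dates with the command below (from v1.20 on, the alpha prefix is dropped: kubeadm certs check-expiration):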

kubeadm alpha certs check-expiration

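Certificates issued by kubeadm are valid for 1 year by default. Note that the original certificate files must still be on the server for the renewal to work; otherwise new ones are generated from scratch and the cluster may not be recoverable.
First back up the existing configuration and certificates: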

cp -rp /etc/kubernetes /etc/kubernetes.bak
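One caveat with the plain cp: if /etc/kubernetes.bak already exists from an earlier attempt, cp -rp places the new copy inside it rather than replacing it. A timestamped destination avoids that. This is only a sketch; `backup_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# backup_dir SRC
# Copies SRC to SRC.bak.<timestamp>, preserving permissions and ownership,
# and prints the path of the new copy.
backup_dir() {
    src=${1%/}                                   # drop a trailing slash, if any
    dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
    cp -rp "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage (not run here):
#   backup_dir /etc/kubernetes
```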

If the kubeadm configuration file can no longer be found, generate a default one first and then edit it:

kubeadm config print init-defaults > kubeadm.yaml

Then adjust it to your own environment. The main fields to change are kubernetesVersion, advertiseAddress, imageRepository and serviceSubnet:

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.21.4.113
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8smaster
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
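Before running anything against this file, it may help to print just the four fields mentioned above and compare them with the real cluster. A small sketch follows; `show_kubeadm_fields` is a made-up helper that matches on key names only and does not parse YAML properly.

```shell
#!/bin/sh
# show_kubeadm_fields FILE
# Prints the four settings the text says to review, one "key: value" per
# line, with leading indentation removed.
show_kubeadm_fields() {
    sed -nE 's/^[[:space:]]*((kubernetesVersion|advertiseAddress|imageRepository|serviceSubnet):.*)$/\1/p' "$1"
}

# Usage (not run here; the path is wherever you saved the file):
#   show_kubeadm_fields kubeadm.yaml
```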

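Once the file is edited, renew the certificates with this configuration (the commands below assume it was saved as /data/kubeadm.yaml):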

kubeadm alpha certs renew all --config=/data/kubeadm.yaml

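After renewing the certificates, the kubeconfig files need to be regenerated:

# Note: before regenerating, back up the old config files by moving them away (or delete them)
mkdir -p /data/kubeconfback
mv /etc/kubernetes/*.conf /data/kubeconfback/
kubeadm init phase kubeconfig all --config=/data/kubeadm.yaml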
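To confirm that a regenerated kubeconfig really carries a fresh client certificate, the embedded certificate can be decoded and inspected. The sketch below assumes the certificate is embedded as client-certificate-data (as kubeadm does for admin.conf); files that point to a certificate path instead are not handled. `kubeconfig_cert_enddate` is a hypothetical helper name.

```shell
#!/bin/sh
# kubeconfig_cert_enddate FILE
# Decodes the first client-certificate-data entry in a kubeconfig and
# prints its notAfter date via openssl.
kubeconfig_cert_enddate() {
    awk '/client-certificate-data:/ {print $2; exit}' "$1" \
        | base64 -d \
        | openssl x509 -noout -enddate
}

# Usage (not run here):
#   kubeconfig_cert_enddate /etc/kubernetes/admin.conf
```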

Then restart the kube-apiserver, etcd, scheduler and controller containers:

docker ps | grep -v pause | grep -E "etcd|scheduler|controller|apiserver" | awk '{print "docker","restart",$1}' | bash

Or restart kubelet:

systemctl restart kubelet
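After a restart the API server can take a little while to start listening on port 6443 again, so the first kubectl call may still be refused. A small retry loop saves re-running it by hand. This is a generic sketch; `retry` is my own helper, not a kubectl or kubeadm feature.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
# Returns 0 on success, 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Usage (not run here): try for up to about a minute
#   retry 30 2 kubectl get ns
```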

 

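If, after restarting kubelet, kubectl reports error: You must be logged in to the server (Unauthorized), the credentials kubectl uses also have to be updated (because the components were issued new certificates):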

 echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile                 

source ~/.bash_profile

For a non-root user:

mkdir -p $HOME/.kube          

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config         

sudo chown $(id -u):$(id -g) $HOME/.kube/config

 

Finally, check the cluster status:
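The original output is not available here, so as a rough sketch of what such a check could look like: list the nodes and the kube-system pods, and flag any node that is not Ready. `not_ready_nodes` is a made-up helper that only reads the STATUS column of kubectl's table output.

```shell
#!/bin/sh
# not_ready_nodes
# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# names of nodes whose STATUS column does not start with "Ready".
not_ready_nodes() {
    awk 'NF && $2 !~ /^Ready/ {print $1}'
}

# Usage (not run here):
#   kubectl get nodes --no-headers | not_ready_nodes
#   kubectl get pods -n kube-system
```

An empty result from the first command means every node reports Ready.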

 

