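Kubernetes certificate expiry
The Kubernetes cluster was installed with the kubeadm tool.
Symptoms of expired certificates:
- kubectl no longer works properly
- The logs of kube-apiserver, kube-controller-manager, and kube-scheduler contain errors with the keywords certificate and Unauthorized: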
# kubectl logs -n kube-system kube-apiserver-vonedaomaster1 --tail=10 -f
E0819 05:25:16.691962 1 authentication.go:53] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
# kubectl logs -f --tail=100 kube-scheduler-vonedaomaster1 -n kube-system
E0819 05:49:52.909861 1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.PersistentVolume: Unauthorized
E0819 05:49:59.011448 1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.StorageClass: Unauthorized
E0819 05:50:02.003645 1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.PersistentVolumeClaim: Unauthorized
E0819 05:50:02.352984 1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.CSINode: Unauthorized
E0819 05:50:04.750558 1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Service: Unauthorized
E0819 05:50:11.741815 1 reflector.go:178] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:233: Failed to list *v1.Pod: Unauthorized
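The two keywords named above can be filtered out of a larger log with a small helper. This is a minimal sketch: cert_error_lines is a hypothetical function name, and the pattern only covers the two keywords seen in the logs above.

```shell
#!/usr/bin/env bash
# Read log text on stdin and print only the lines that mention
# "certificate" or "Unauthorized" (the keywords seen in the logs above).
# Like grep, the exit status is 0 only when at least one line matches.
cert_error_lines() {
  grep -E 'certificate|Unauthorized'
}

# Example usage (the container id is a placeholder to fill in):
#   docker logs <container-id> 2>&1 | cert_error_lines
```

An empty result does not prove the certificates are fine; it only means these two keywords did not appear in the lines that were scanned.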
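Certificate renewal steps
This cluster has only one master. (The procedure has not been verified with multiple masters.)
All steps are performed on the master.
1. Back up the old data
Whatever the operation, a backup is a must.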
# cp -rf /etc/kubernetes /etc/kubernetes.bak
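If the procedure may be run more than once, a timestamped copy avoids overwriting an earlier backup. A minimal sketch, where backup_dir is a hypothetical helper name and the timestamp suffix is an assumption rather than anything kubeadm requires:

```shell
#!/usr/bin/env bash
# Copy a directory to a timestamped sibling (e.g. /etc/kubernetes.bak.20210819052516)
# and print the path of the backup. cp -a preserves ownership, modes and symlinks.
backup_dir() {
  local src="${1%/}" dest
  dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
  cp -a "$src" "$dest" && echo "$dest"
}

# Example usage:
#   backup_dir /etc/kubernetes
```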
2. Export the kubeadm configuration
# kubeadm config view > cluster.yaml
3. Regenerate the certificates
# kubeadm alpha certs renew all --config cluster.yaml
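After the renewal it can be worth confirming that the certificate files on disk are actually valid. A minimal sketch, assuming openssl is installed and the certificates live in kubeadm's default directory /etc/kubernetes/pki; check_certs is a hypothetical helper name, not part of kubeadm.

```shell
#!/usr/bin/env bash
# Report, for each certificate file given as an argument, whether it is still valid.
# `openssl x509 -checkend 0` exits non-zero when the certificate has already expired;
# a file that cannot be read or parsed is reported the same way.
check_certs() {
  local crt status=0
  for crt in "$@"; do
    if openssl x509 -checkend 0 -noout -in "$crt" >/dev/null 2>&1; then
      echo "OK $crt"
    else
      echo "EXPIRED_OR_UNREADABLE $crt"
      status=1
    fi
  done
  return "$status"
}

# Example usage (default kubeadm certificate directory assumed):
#   check_certs /etc/kubernetes/pki/*.crt
```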
4. Replace ~/.kube/config
# cp -i /etc/kubernetes/admin.conf /root/.kube/config
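The copy matters because the kubeconfig carries its own client certificate, so a stale ~/.kube/config keeps failing even after the renewal. A minimal sketch for inspecting it, assuming the certificate is embedded as client-certificate-data (as in the admin.conf that kubeadm generates) and that openssl and base64 are available; kubeconfig_cert_enddate is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the expiry date (notAfter=...) of the first client certificate
# embedded in a kubeconfig file as base64 "client-certificate-data".
kubeconfig_cert_enddate() {
  local kubeconfig="$1"
  awk '/client-certificate-data:/ {print $2; exit}' "$kubeconfig" \
    | base64 -d \
    | openssl x509 -noout -enddate
}

# Example usage:
#   kubeconfig_cert_enddate /root/.kube/config
```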
5. Restart kubelet
# systemctl restart kubelet
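6. Restart the kube-apiserver, kube-controller-manager, and kube-scheduler component pods
The wrong way to restart: kubectl delete pods
Delete the component pods and let them start again automatically, as shown in the figure:
For the pods in the red boxes, the value in the last column, AGE, now reflects the time since the restart.
Check the logs:
Inspect the components' containers: they were not restarted and are still the ones started 4 weeks ago:
Cause: the certificates have already expired, so the automatic restart that kubectl delete pods relies on cannot take place for these containers.
If you create resources while the components' certificates have not yet taken effect, kubectl get DaemonSet -n ingress-nginx reports 0 in every column, and kubectl get pods -n ingress-nginx shows no pods:
The correct way to restart the kube-apiserver, kube-controller-manager, and kube-scheduler component containers is as follows: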
# docker ps | grep kube-apiserver | grep -v pause | awk '{print $1}' | xargs -I {} docker restart {}
# docker ps | grep kube-controller-manager | grep -v pause | awk '{print $1}' | xargs -I {} docker restart {}
# docker ps | grep kube-scheduler | grep -v pause | awk '{print $1}' | xargs -I {} docker restart {}
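The three commands above differ only in the component name, so they can be folded into one helper. This is a sketch under an assumption: it relies on the k8s_<container>_<pod>_... naming that the kubelet gives Docker containers, under which pause containers are named k8s_POD_... and therefore do not match. restart_component is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Restart the running Docker container(s) of one control-plane component.
# The name filter matches "k8s_<component>_", which skips the pause
# containers because those are named "k8s_POD_...".
restart_component() {
  local component="$1" ids
  ids=$(docker ps -q --filter "name=k8s_${component}_")
  if [ -z "$ids" ]; then
    echo "no running container found for ${component}" >&2
    return 1
  fi
  # Word splitting is intentional: one argument per container id.
  # shellcheck disable=SC2086
  docker restart $ids
}

# Example usage:
#   for c in kube-apiserver kube-controller-manager kube-scheduler; do
#     restart_component "$c"
#   done
```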
Check the logs of the kube-apiserver, kube-controller-manager, and kube-scheduler components again; they are back to normal: