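1. Overview
Question: when a node in a k8s cluster fails, what happens to the pods running on it?
This post walks through how pod eviction is defined when a node fails.
2. An experiment
In this experiment we take down one node in the k8s cluster, then observe how the node's information changes and how the pods on it behave.
2.1. Run a Deployment
Make sure that pods are running on the node under test.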
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-taints
  namespace: default
spec:
  progressDeadlineSeconds: 600
  selector:
    matchLabels:
      app: nginx-taints
  replicas: 5
  template:
    metadata:
      labels:
        app: nginx-taints
    spec:
      containers:
      - image: 172.20.58.152/middleware/nginx:1.21.4
        imagePullPolicy: IfNotPresent
        name: nginx
      dnsPolicy: ClusterFirst
      restartPolicy: Always
Create a Deployment from the configuration above.
[root@nccztsjb-node-23 ~]# kubectl apply -f nginx-taints.yaml
deployment.apps/nginx-taints created
[root@nccztsjb-node-23 ~]# kubectl get pod -l app=nginx-taints -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-taints-6698889db5-j546r 1/1 Running 0 12s 172.39.157.212 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-tpmb2 1/1 Running 0 12s 172.39.209.124 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-w7rdm 1/1 Running 0 12s 172.39.209.123 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-w7zjm 1/1 Running 0 12s 172.39.157.211 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-x9mdz 1/1 Running 0 12s 172.39.21.67 nccztsjb-node-25 <none> <none>
OK, the pods are running.
We use node nccztsjb-node-24 for this test.
2.2. Stop the kubelet process on the node
Stop the kubelet process on node nccztsjb-node-24:
systemctl stop kubelet
After stopping the service, wait a few minutes...
Check the status of the nodes in the cluster:
[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
nccztsjb-node-23 Ready control-plane,master 36d v1.23.2
nccztsjb-node-24 NotReady <none> 36d v1.23.2
nccztsjb-node-25 Ready ingress,prometheus-server 36d v1.23.2
[root@nccztsjb-node-23 ~]#
Node nccztsjb-node-24 is now NotReady.
Check how the node's information has changed:
[root@nccztsjb-node-23 ~]# kubectl describe nodes nccztsjb-node-24 | more
Name: nccztsjb-node-24
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=nccztsjb-node-24
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 172.20.58.65/24
projectcalico.org/IPv4IPIPTunnelAddr: 172.39.157.192
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 25 Jan 2022 12:07:13 +0800
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
The following taints have been added automatically:
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
Check the pods:
[root@nccztsjb-node-23 ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-taints-6698889db5-j546r 1/1 Running 0 2m5s
nginx-taints-6698889db5-tpmb2 1/1 Running 0 2m5s
nginx-taints-6698889db5-w7rdm 1/1 Running 0 2m5s
nginx-taints-6698889db5-w7zjm 1/1 Running 0 2m5s
nginx-taints-6698889db5-x9mdz 1/1 Running 0 2m5s
[root@nccztsjb-node-23 ~]#
kubectl get pod nginx-taints-6698889db5-x9mdz -o yaml
The output shows that the pod carries the following tolerations:
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
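With these tolerations in place, when the node is tainted not-ready or unreachable, tolerationSeconds: 300 lets the pod stay on the node for another 5 minutes instead of being evicted immediately.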
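To make the rule concrete, here is a minimal Python sketch of how a NoExecute taint and a pod's tolerations combine into an eviction delay. It is a simplified model written for this post, not Kubernetes code: the names tolerates and eviction_delay are made up here, and the matching rules follow the taints-and-tolerations documentation as we understand it.

```python
import math

# The two tolerations seen on the pod in the experiment.
POD_TOLERATIONS = [
    {"key": "node.kubernetes.io/not-ready", "operator": "Exists",
     "effect": "NoExecute", "tolerationSeconds": 300},
    {"key": "node.kubernetes.io/unreachable", "operator": "Exists",
     "effect": "NoExecute", "tolerationSeconds": 300},
]


def tolerates(toleration, taint):
    """Return True if one toleration matches one taint (simplified model)."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    key = toleration.get("key", "")
    if toleration.get("operator", "Equal") == "Exists":
        # An empty key with Exists matches every taint key.
        return key == "" or key == taint["key"]
    # Equal: both key and value must match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def eviction_delay(tolerations, taint):
    """Seconds a pod may stay on a node that carries a NoExecute taint.

    0.0      -> no toleration matches, the pod is evicted right away
    math.inf -> a toleration matches and none sets tolerationSeconds
    """
    matching = [t for t in tolerations if tolerates(t, taint)]
    if not matching:
        return 0.0
    bounded = [t["tolerationSeconds"] for t in matching
               if t.get("tolerationSeconds") is not None]
    if not bounded:
        return math.inf
    # Assumption: the shortest bounded toleration decides.
    return float(max(0, min(bounded)))


unreachable = {"key": "node.kubernetes.io/unreachable", "effect": "NoExecute"}
print(eviction_delay(POD_TOLERATIONS, unreachable))  # 300.0
```

A pod with no matching toleration gets 0.0 here, which corresponds to being evicted as soon as the taint appears.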
Check the Docker containers on node nccztsjb-node-24:
[root@nccztsjb-node-24 ~]# docker ps | grep nginx-taints
efc6733b1866 ea335eea17ab "/docker-entrypoint.…" 6 minutes ago Up 5 minutes k8s_nginx_nginx-taints-6698889db5-j546r_default_c67a09b1-cb53-4f98-b2b6-c6e7ad45b818_0
ed4dce36693c ea335eea17ab "/docker-entrypoint.…" 6 minutes ago Up 5 minutes k8s_nginx_nginx-taints-6698889db5-w7zjm_default_3eb2dbcf-ee55-420b-8758-0512016747b4_0
c5a78f9b2459 gotok8s/pause:3.6 "/pause" 6 minutes ago Up 5 minutes k8s_POD_nginx-taints-6698889db5-j546r_default_c67a09b1-cb53-4f98-b2b6-c6e7ad45b818_0
e78370d4fcf6 gotok8s/pause:3.6 "/pause" 6 minutes ago Up 5 minutes k8s_POD_nginx-taints-6698889db5-w7zjm_default_3eb2dbcf-ee55-420b-8758-0512016747b4_0
[root@nccztsjb-node-24 ~]#
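They are still running, because the kubelet is stopped and nothing has instructed the container runtime to stop these containers.
Watch the pod status. After about 5 minutes: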
[root@nccztsjb-node-23 ~]# kubectl get pod -o wide -w
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-taints-6698889db5-j546r 1/1 Running 0 4m40s 172.39.157.212 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-tpmb2 1/1 Running 0 4m40s 172.39.209.124 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-w7rdm 1/1 Running 0 4m40s 172.39.209.123 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-w7zjm 1/1 Running 0 4m40s 172.39.157.211 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-x9mdz 1/1 Running 0 4m40s 172.39.21.67 nccztsjb-node-25 <none> <none>
nginx-taints-6698889db5-w7zjm 1/1 Terminating 0 6m26s 172.39.157.211 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-j546r 1/1 Terminating 0 6m26s 172.39.157.212 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-dlqht 0/1 Pending 0 0s <none> <none> <none> <none>
nginx-taints-6698889db5-msdnh 0/1 Pending 0 0s <none> <none> <none> <none>
nginx-taints-6698889db5-dlqht 0/1 Pending 0 0s <none> nccztsjb-node-25 <none> <none>
nginx-taints-6698889db5-msdnh 0/1 Pending 0 0s <none> nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-dlqht 0/1 ContainerCreating 0 0s <none> nccztsjb-node-25 <none> <none>
nginx-taints-6698889db5-msdnh 0/1 ContainerCreating 0 0s <none> nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-msdnh 0/1 ContainerCreating 0 1s <none> nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-dlqht 0/1 ContainerCreating 0 1s <none> nccztsjb-node-25 <none> <none>
nginx-taints-6698889db5-dlqht 1/1 Running 0 2s 172.39.21.68 nccztsjb-node-25 <none> <none>
nginx-taints-6698889db5-msdnh 1/1 Running 0 2s 172.39.209.125 nccztsjb-node-23 <none> <none>
The pods on nccztsjb-node-24 are in the Terminating state, and two replacement replicas have been started on other nodes.
[root@nccztsjb-node-23 ~]# kubectl get pods --sort-by=.spec.nodeName -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-taints-6698889db5-msdnh 1/1 Running 0 5m57s 172.39.209.125 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-tpmb2 1/1 Running 0 12m 172.39.209.124 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-w7rdm 1/1 Running 0 12m 172.39.209.123 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-j546r 1/1 Terminating 0 12m 172.39.157.212 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-w7zjm 1/1 Terminating 0 12m 172.39.157.211 nccztsjb-node-24 <none> <none>
nginx-taints-6698889db5-dlqht 1/1 Running 0 5m57s 172.39.21.68 nccztsjb-node-25 <none> <none>
nginx-taints-6698889db5-x9mdz 1/1 Running 0 12m 172.39.21.67 nccztsjb-node-25 <none> <none>
So what state are the Docker containers on node nccztsjb-node-24 in at this point?
[root@nccztsjb-node-24 ~]# docker ps | grep nginx-taints
efc6733b1866 ea335eea17ab "/docker-entrypoint.…" 13 minutes ago Up 13 minutes k8s_nginx_nginx-taints-6698889db5-j546r_default_c67a09b1-cb53-4f98-b2b6-c6e7ad45b818_0
ed4dce36693c ea335eea17ab "/docker-entrypoint.…" 13 minutes ago Up 13 minutes k8s_nginx_nginx-taints-6698889db5-w7zjm_default_3eb2dbcf-ee55-420b-8758-0512016747b4_0
c5a78f9b2459 gotok8s/pause:3.6 "/pause" 13 minutes ago Up 13 minutes k8s_POD_nginx-taints-6698889db5-j546r_default_c67a09b1-cb53-4f98-b2b6-c6e7ad45b818_0
e78370d4fcf6 gotok8s/pause:3.6 "/pause" 13 minutes ago Up 13 minutes k8s_POD_nginx-taints-6698889db5-w7zjm_default_3eb2dbcf-ee55-420b-8758-0512016747b4_0
[root@nccztsjb-node-24 ~]#
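They are still running.
The reason is simple: the kubelet has lost contact with the API server, so it cannot receive the instruction to stop the pods.
2.3. Restart the kubelet service on the node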
systemctl start kubelet
Now check the node status again:
[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
nccztsjb-node-23 Ready control-plane,master 36d v1.23.2
nccztsjb-node-24 Ready <none> 36d v1.23.2
nccztsjb-node-25 Ready ingress,prometheus-server 36d v1.23.2
[root@nccztsjb-node-23 ~]#
The node is back to normal, in the Ready state.
Check the pod status:
[root@nccztsjb-node-23 ~]# kubectl get pods --sort-by=.spec.nodeName -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-taints-6698889db5-msdnh 1/1 Running 0 8m47s 172.39.209.125 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-tpmb2 1/1 Running 0 15m 172.39.209.124 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-w7rdm 1/1 Running 0 15m 172.39.209.123 nccztsjb-node-23 <none> <none>
nginx-taints-6698889db5-dlqht 1/1 Running 0 8m47s 172.39.21.68 nccztsjb-node-25 <none> <none>
nginx-taints-6698889db5-x9mdz 1/1 Running 0 15m 172.39.21.67 nccztsjb-node-25 <none> <none>
[root@nccztsjb-node-23 ~]#
The pods that were stuck in Terminating have now been deleted.
Check the Docker containers on node nccztsjb-node-24:
[root@nccztsjb-node-24 ~]# docker ps | grep nginx-taints
[root@nccztsjb-node-24 ~]#
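They have been stopped. The reason is simple: the kubelet can talk to the API server again, so it picked up the pending deletions and stopped the pods on the node.
Check the pod's description: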
kubectl get pod nginx-taints-6698889db5-x9mdz -o yaml
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
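The tolerations are still there and have not been removed. They are part of the pod spec and do not affect how the pod runs.
OK, that is the whole experiment: a simulated node failure in a k8s cluster.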
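3. Questions and explanations
- 1. How are the taints added to the node?
- 2. How are the tolerations added to the pod?
- 3. When a node fails, how long do its pods keep running?
Let's go through these questions one by one.
1. How are the taints added to the node?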
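The node controller automatically adds taints to a node when certain conditions hold. In this experiment the control plane stopped receiving status updates from the kubelet, so the node was tainted with node.kubernetes.io/unreachable.
For details, see: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/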
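As a rough illustration of that mapping, the sketch below turns a node's Ready condition status into the taints we would expect to see. This is our reading of the linked documentation expressed as a lookup, not the controller's code, and the function name taints_for_ready_status is made up for this post.

```python
def taints_for_ready_status(ready_status):
    """Map the status of a node's Ready condition to (key, effect) taints.

    "True"    -> healthy node, no failure taints
    "False"   -> the kubelet reports the node as not ready
    "Unknown" -> the control plane has stopped hearing from the kubelet,
                 which is what stopping the kubelet produces
    """
    keys = {
        "True": None,
        "False": "node.kubernetes.io/not-ready",
        "Unknown": "node.kubernetes.io/unreachable",
    }
    if ready_status not in keys:
        raise ValueError(f"unexpected Ready status: {ready_status!r}")
    key = keys[ready_status]
    if key is None:
        return []
    # Both effects appeared on the node in the experiment:
    # NoExecute drives eviction, NoSchedule keeps new pods away.
    return [(key, "NoExecute"), (key, "NoSchedule")]


print(taints_for_ready_status("Unknown"))
```

Note that kubectl get nodes prints NotReady for both the "False" and the "Unknown" case, which is why the taint key is the more precise signal.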
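2. How are the tolerations added to the pod?
The tolerations on the pod are added by an admission controller.
The DefaultTolerationSeconds plugin, one of the default admission controllers, automatically adds tolerations for node.kubernetes.io/not-ready and node.kubernetes.io/unreachable when the pod is created, with a default tolerationSeconds of 300 (in seconds).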
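The plugin's effect can be modelled in a few lines of Python. This is a simplified sketch under our own assumptions, not the plugin's source: add_default_tolerations is a hypothetical name, and the rule "skip a key the pod already tolerates for NoExecute" is how we understand the documented behavior.

```python
import copy

NOT_READY = "node.kubernetes.io/not-ready"
UNREACHABLE = "node.kubernetes.io/unreachable"


def add_default_tolerations(pod_spec, not_ready_seconds=300, unreachable_seconds=300):
    """Return a copy of pod_spec with the two default tolerations filled in."""
    spec = copy.deepcopy(pod_spec)
    tolerations = list(spec.get("tolerations") or [])
    for key, seconds in ((NOT_READY, not_ready_seconds),
                         (UNREACHABLE, unreachable_seconds)):
        # Assumption: an existing toleration for this key (or for every key)
        # that covers NoExecute (or every effect) is left untouched.
        already_tolerated = any(
            t.get("key", "") in (key, "") and t.get("effect", "") in ("NoExecute", "")
            for t in tolerations
        )
        if not already_tolerated:
            tolerations.append({
                "key": key,
                "operator": "Exists",
                "effect": "NoExecute",
                "tolerationSeconds": seconds,
            })
    spec["tolerations"] = tolerations
    return spec


# A pod template without tolerations, like the Deployment in section 2.1.
admitted = add_default_tolerations({"containers": [{"name": "nginx"}]})
for toleration in admitted["tolerations"]:
    print(toleration["key"], toleration["tolerationSeconds"])
```

In this model, a pod template that sets its own toleration for one of the two keys keeps its own tolerationSeconds, which is one way to shorten or lengthen the 5 minutes per workload. As far as we know, the cluster-wide defaults can also be changed with the kube-apiserver flags --default-not-ready-toleration-seconds and --default-unreachable-toleration-seconds.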
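3. When a node fails, how long do its pods keep running?
As the experiment shows, tolerationSeconds=300 is the default. When a node fails, the node is tainted automatically, and the pods already carry the matching tolerations with a default toleration time of 300 s, i.e. 5 minutes.
In other words, once the failed node has been tainted, its pods are tolerated for another 5 minutes before they are evicted.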
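The 5 minutes start when the taint is applied, not when the kubelet stops, so the delay seen from the moment of failure is a bit longer. The arithmetic below is a back-of-the-envelope sketch: the 40-second detection delay is an assumption (we believe it is the default --node-monitor-grace-period of kube-controller-manager in v1.23, but check your own cluster), and expected_eviction_delay is a made-up helper name.

```python
from datetime import timedelta

# Assumption: time before the node controller marks a silent node as unreachable.
ASSUMED_DETECTION_SECONDS = 40
# Default added by the DefaultTolerationSeconds admission plugin.
DEFAULT_TOLERATION_SECONDS = 300


def expected_eviction_delay(detection_seconds=ASSUMED_DETECTION_SECONDS,
                            toleration_seconds=DEFAULT_TOLERATION_SECONDS):
    """Rough time from node failure until its pods are marked for deletion."""
    return timedelta(seconds=detection_seconds + toleration_seconds)


print(expected_eviction_delay())  # 0:05:40
```

This gives a delay of roughly five to six minutes under these assumptions. Lowering tolerationSeconds on a workload shortens only the second term.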