Kubernetes Pod Lifecycle and Probing with ExecAction, TCPSocketAction, and HTTPGetAction
Host configuration plan
Server name (hostname) | OS version | Specs | Internal IP | External IP (simulated) |
---|---|---|---|---|
k8s-master | CentOS7.7 | 2C/4G/20G | 172.16.1.110 | 10.0.0.110 |
k8s-node01 | CentOS7.7 | 2C/4G/20G | 172.16.1.111 | 10.0.0.111 |
k8s-node02 | CentOS7.7 | 2C/4G/20G | 172.16.1.112 | 10.0.0.112 |
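Pod container lifecycle
The Pause container
Every Pod runs a special container called the Pause container; all other containers are application containers. The application containers share the Pause container's network stack and Volume mounts, which makes communication and data exchange between them more efficient. You can take advantage of this at design time by placing a group of closely related service processes in the same Pod; containers in the same Pod can reach each other through localhost alone.
- PID namespace: applications in the Pod can see each other's process IDs (in current Kubernetes versions this requires shareProcessNamespace: true in the Pod spec).
- Network namespace: the containers in the Pod share the same IP address and port range.
- IPC namespace: the containers in the Pod can communicate using SystemV IPC or POSIX message queues.
- UTS namespace: the containers in the Pod share one hostname.
- Volumes (shared storage volumes): every container in the Pod can access the Volumes defined at the Pod level.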
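Container probes
A probe is a diagnostic that the kubelet performs periodically on a container. To run the diagnostic, the kubelet calls a Handler implemented by the container. There are three types of handler:
- ExecAction: executes the specified command inside the container. The diagnostic succeeds if the command exits with status code 0.
- TCPSocketAction: performs a TCP check against the container's IP address on the specified port. The diagnostic succeeds if the port is open.
- HTTPGetAction: performs an HTTP GET request against the container's IP address on the specified port and path. The diagnostic succeeds if the response status code is greater than or equal to 200 and less than 400.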
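The success rules for the three handlers can be sketched in a few lines of Python. This is only an illustration of the criteria listed above, not how the kubelet implements them; the function names exec_check, tcp_check, and http_status_ok are made up for this example.

```python
import socket
import subprocess
import sys


def exec_check(command):
    """ExecAction analogue: success means the command exits with status 0."""
    return subprocess.run(command, capture_output=True).returncode == 0


def tcp_check(host, port, timeout=1.0):
    """TCPSocketAction analogue: success means the TCP connection opens."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_status_ok(status_code):
    """HTTPGetAction rule: success means 200 <= status code < 400."""
    return 200 <= status_code < 400
```

For instance, http_status_ok(404) is False, which matches the "HTTP probe failed with statuscode: 404" events that appear in the readiness example later in this article.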
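Each probe has one of three results:
- Success: the container passed the diagnostic.
- Failure: the container failed the diagnostic.
- Unknown: the diagnostic itself failed, so no action is taken.
The kubelet can optionally run three kinds of probe on a running container and react to them:
- livenessProbe: indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and the container is subject to its restart policy. If a container does not provide a liveness probe, the default state is Success.
- readinessProbe: indicates whether the container is ready to serve requests (that is, to accept external traffic). If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod. The readiness state before the initial delay defaults to Failure. If a container does not provide a readiness probe, the default state is Success.
- startupProbe: indicates whether the application in the container has started. If a startup probe is provided, all other probes are disabled until it succeeds. If the startup probe fails, the kubelet kills the container, and the container is restarted according to its restart policy. If a container does not provide a startup probe, the default state is Success.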
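As a rough mental model of the three probe kinds, the sketch below encodes two of the rules stated above: a configured startup probe holds back the other probes until it has succeeded, and each probe kind has its own reaction on failure. The helper names active_probes and on_failure are hypothetical and exist only in this example.

```python
def active_probes(configured, startup_succeeded):
    """Return the probe kinds that would currently be running.

    configured: set of probe names defined on the container.
    startup_succeeded: whether the startup probe has passed at least once.
    """
    if "startupProbe" in configured and not startup_succeeded:
        # Liveness and readiness checks are held back until startup succeeds.
        return {"startupProbe"}
    return configured - {"startupProbe"}


def on_failure(probe):
    """Map a failed probe kind to the reaction described in the text."""
    reactions = {
        "livenessProbe": "kill container; restartPolicy applies",
        "readinessProbe": "remove Pod IP from Service endpoints",
        "startupProbe": "kill container; restartPolicy applies",
    }
    return reactions[probe]
```

Note that only a readiness failure leaves the container running; the other two lead to the container being killed.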
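Note: a Tomcat web service, which needs a long time to initialize, is a typical example.
Container restart policy
PodSpec has a restartPolicy field whose possible values are Always, OnFailure, and Never. The default is Always.
Always means the kubelet restarts the container no matter how it terminated; OnFailure means the container is restarted only when it exits with a non-zero exit code; Never means the container is never restarted.
restartPolicy applies to all containers in the Pod. It only refers to restarts of containers by the kubelet on the same node. Failed containers are restarted by the kubelet with an exponential back-off delay (10 seconds, 20 seconds, 40 seconds, ...) capped at five minutes, and the delay is reset after ten minutes of successful execution. As described in the Pod documentation, once a Pod is bound to a node, it is never rebound to another node.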
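The back-off sequence described above (10 s, 20 s, 40 s, ... with a five-minute cap) is easy to tabulate. The function below is an idealized sketch of that arithmetic, with a made-up name restart_delays; it ignores the reset after ten minutes of successful execution.

```python
def restart_delays(restarts, base=10, cap=300):
    """Idealized back-off: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(base * 2 ** i, cap) for i in range(restarts)]


print(restart_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

Under these assumptions the delay reaches the cap on the sixth restart and stays there.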
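When to use liveness and readiness probes
If the process in your container is able to crash on its own whenever it hits a problem or becomes unhealthy, you do not necessarily need a liveness probe; the kubelet automatically takes the correct action according to the Pod's restartPolicy.
If you want the container to be killed and restarted when a probe fails, specify a liveness probe and set restartPolicy to Always or OnFailure.
If you want traffic to be sent to the Pod only once a probe succeeds, specify a readiness probe. In this case the readiness probe may be identical to the liveness probe, but the presence of a readiness probe in the spec means the Pod starts without receiving any traffic and only begins receiving traffic after the probe succeeds.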
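Pod phase
A Pod's status is defined in a PodStatus object, which has a phase field.
The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. It is not intended to be a comprehensive rollup of container or Pod state, nor is it meant to be a comprehensive state machine.
The number and meaning of Pod phase values are strictly defined. Other than the values listed here, nothing should be assumed about a Pod's phase.
The possible values of phase are:
- Pending: the Pod has been accepted by the Kubernetes system, but one or more of its containers has not been created yet. This includes the time spent scheduling the Pod and downloading images over the network, which can take a while.
- Running: the Pod has been bound to a node and all of its containers have been created. At least one container is running, or is in the process of starting or restarting.
- Succeeded: all containers in the Pod have terminated successfully and will not be restarted.
- Failed: all containers in the Pod have terminated, and at least one container terminated in failure; that is, the container exited with a non-zero status or was terminated by the system.
- Unknown: the state of the Pod could not be obtained for some reason, typically because of a communication failure with the Pod's host.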
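Because the set of phase values is closed, client code can model it as an enumeration and reject anything else. The sketch below is one possible way to do that in Python; PodPhase, parse_phase, and is_terminal are names invented for this example and are not part of any Kubernetes client library.

```python
from enum import Enum


class PodPhase(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    UNKNOWN = "Unknown"


def parse_phase(value):
    """Convert a phase string; raises ValueError for undocumented values."""
    return PodPhase(value)


def is_terminal(phase):
    """Succeeded and Failed mean every container in the Pod has terminated."""
    return phase in (PodPhase.SUCCEEDED, PodPhase.FAILED)
```

Keep in mind that the STATUS column printed by kubectl get pod is not always the phase; it can show more detailed reasons such as CrashLoopBackOff.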
Probe checks: readiness probe
Pod YAML manifest
[root@k8s-master lifecycle]# pwd
/root/k8s_practice/lifecycle
[root@k8s-master lifecycle]# cat readinessProbe-httpget.yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-httpdget-pod
  namespace: default
  labels:
    test: readiness-httpdget
spec:
  containers:
  - name: readiness-httpget
    image: registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17
    imagePullPolicy: IfNotPresent
    readinessProbe:
      httpGet:
        path: /index1.html
        port: 80
      initialDelaySeconds: 5  # After the container starts, the kubelet waits 5 seconds before the first probe. Default 0, minimum 0.
      periodSeconds: 3        # The kubelet runs the readiness probe every 3 seconds. Default 10, minimum 1.
Create the Pod and check its status
[root@k8s-master lifecycle]# kubectl apply -f readinessProbe-httpget.yaml
pod/readiness-httpdget-pod created
[root@k8s-master lifecycle]# kubectl get pod -n default -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
readiness-httpdget-pod   0/1     Running   0          5s    10.244.2.25   k8s-node02   <none>           <none>
View the Pod details
[root@k8s-master lifecycle]# kubectl describe pod readiness-httpdget-pod
Name:         readiness-httpdget-pod
Namespace:    default
Priority:     0
Node:         k8s-node02/172.16.1.112
Start Time:   Sat, 23 May 2020 16:10:04 +0800
Labels:       test=readiness-httpdget
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"test":"readiness-httpdget"},"name":"readiness-httpdget-pod","names...
Status:       Running
IP:           10.244.2.25
IPs:
  IP:  10.244.2.25
Containers:
  readiness-httpget:
    Container ID:   docker://066d66aaef191b1db08e1b3efba6a9be75378d2fe70e99400fc513b91242089c
    ………………
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Sat, 23 May 2020 16:10:05 +0800
    Ready:          False    ##### status is False
    Restart Count:  0
    Readiness:      http-get http://:80/index1.html delay=5s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-v48g4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False    ##### False
  ContainersReady   False    ##### False
  PodScheduled      True
Volumes:
  default-token-v48g4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-v48g4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                From                 Message
  ----     ------     ----               ----                 -------
  Normal   Scheduled  <unknown>          default-scheduler    Successfully assigned default/readiness-httpdget-pod to k8s-node02
  Normal   Pulled     49s                kubelet, k8s-node02  Container image "registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17" already present on machine
  Normal   Created    49s                kubelet, k8s-node02  Created container readiness-httpget
  Normal   Started    49s                kubelet, k8s-node02  Started container readiness-httpget
  Warning  Unhealthy  2s (x15 over 44s)  kubelet, k8s-node02  Readiness probe failed: HTTP probe failed with statuscode: 404
As shown above, the container is not ready: /index1.html does not exist, so the probe receives a 404.
Enter the Pod's first container and create the file the probe requests:
[root@k8s-master lifecycle]# kubectl exec -it readiness-httpdget-pod -c readiness-httpget -- bash
root@readiness-httpdget-pod:/# cd /usr/share/nginx/html
root@readiness-httpdget-pod:/usr/share/nginx/html# ls
50x.html  index.html
root@readiness-httpdget-pod:/usr/share/nginx/html# echo "readiness-httpdget info" > index1.html
root@readiness-httpdget-pod:/usr/share/nginx/html# ls
50x.html  index.html  index1.html
Then check the Pod status and details again
[root@k8s-master lifecycle]# kubectl get pod -n default -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE   READINESS GATES
readiness-httpdget-pod   1/1     Running   0          2m30s   10.244.2.25   k8s-node02   <none>           <none>
[root@k8s-master lifecycle]# kubectl describe pod readiness-httpdget-pod
Name:         readiness-httpdget-pod
Namespace:    default
Priority:     0
Node:         k8s-node02/172.16.1.112
Start Time:   Sat, 23 May 2020 16:10:04 +0800
Labels:       test=readiness-httpdget
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"test":"readiness-httpdget"},"name":"readiness-httpdget-pod","names...
Status:       Running
IP:           10.244.2.25
IPs:
  IP:  10.244.2.25
Containers:
  readiness-httpget:
    Container ID:   docker://066d66aaef191b1db08e1b3efba6a9be75378d2fe70e99400fc513b91242089c
    ………………
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Sat, 23 May 2020 16:10:05 +0800
    Ready:          True    ##### status is True
    Restart Count:  0
    Readiness:      http-get http://:80/index1.html delay=5s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-v48g4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True    ##### True
  ContainersReady   True    ##### True
  PodScheduled      True
Volumes:
  default-token-v48g4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-v48g4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From                 Message
  ----     ------     ----                  ----                 -------
  Normal   Scheduled  <unknown>             default-scheduler    Successfully assigned default/readiness-httpdget-pod to k8s-node02
  Normal   Pulled     2m33s                 kubelet, k8s-node02  Container image "registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17" already present on machine
  Normal   Created    2m33s                 kubelet, k8s-node02  Created container readiness-httpget
  Normal   Started    2m33s                 kubelet, k8s-node02  Started container readiness-httpget
  Warning  Unhealthy  85s (x22 over 2m28s)  kubelet, k8s-node02  Readiness probe failed: HTTP probe failed with statuscode: 404
As shown above, the container is now ready.
Probe checks: liveness probe
Liveness probe: executing a command
Pod YAML manifest
[root@k8s-master lifecycle]# pwd
/root/k8s_practice/lifecycle
[root@k8s-master lifecycle]# cat livenessProbe-exec.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-pod
  labels:
    test: liveness
spec:
  containers:
  - name: liveness-exec
    image: registry.cn-beijing.aliyuncs.com/google_registry/busybox:1.24
    imagePullPolicy: IfNotPresent
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5   # wait 5 seconds before the first probe
      periodSeconds: 3         # probe every 3 seconds
For the first 30 seconds of the container's life, the file /tmp/healthy exists, so during those 30 seconds the command cat /tmp/healthy returns a success code. After 30 seconds, cat /tmp/healthy returns a failure code.
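To estimate when the restart should happen, the timeline can be replayed in an idealized model: probes fire exactly at initialDelaySeconds and then every periodSeconds, and the kubelet gives up after failureThreshold consecutive failures (3 by default). The function first_restart_time below is a hypothetical helper for this calculation; real probe timing is not this exact.

```python
def first_restart_time(file_removed_at=30, initial_delay=5, period=3,
                       failure_threshold=3, horizon=120):
    """Return the second at which the liveness probe gives up, or None.

    A probe at time t succeeds while the file still exists (t < file_removed_at).
    """
    consecutive_failures = 0
    for t in range(initial_delay, horizon + 1, period):
        if t < file_removed_at:
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return t
    return None


print(first_restart_time())  # 38
```

With the manifest's values the probes at 32 s, 35 s, and 38 s fail, so in this model the container is killed roughly 38 seconds after it starts.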
Create the Pod
[root@k8s-master lifecycle]# kubectl apply -f livenessProbe-exec.yaml
pod/liveness-exec-pod created
Within the first 30 seconds, view the Pod's description:
[root@k8s-master lifecycle]# kubectl get pod -o wide
NAME                READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
liveness-exec-pod   1/1     Running   0          17s   10.244.2.21   k8s-node02   <none>           <none>
[root@k8s-master lifecycle]# kubectl describe pod liveness-exec-pod
Name:         liveness-exec-pod
Namespace:    default
Priority:     0
Node:         k8s-node02/172.16.1.112
………………
Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  25s   default-scheduler    Successfully assigned default/liveness-exec-pod to k8s-node02
  Normal  Pulled     24s   kubelet, k8s-node02  Container image "registry.cn-beijing.aliyuncs.com/google_registry/busybox:1.24" already present on machine
  Normal  Created    24s   kubelet, k8s-node02  Created container liveness-exec
  Normal  Started    24s   kubelet, k8s-node02  Started container liveness-exec
The output shows that the liveness probe is succeeding: no failure events have been recorded.
After 35 seconds, view the Pod's description again:
[root@k8s-master lifecycle]# kubectl get pod -o wide    # RESTARTS has increased by 1
NAME                READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
liveness-exec-pod   1/1     Running   1          89s   10.244.2.22   k8s-node02   <none>           <none>
[root@k8s-master lifecycle]# kubectl describe pod liveness-exec-pod
………………
Events:
  Type     Reason     Age              From                 Message
  ----     ------     ----             ----                 -------
  Normal   Scheduled  42s              default-scheduler    Successfully assigned default/liveness-exec-pod to k8s-node02
  Normal   Pulled     41s              kubelet, k8s-node02  Container image "registry.cn-beijing.aliyuncs.com/google_registry/busybox:1.24" already present on machine
  Normal   Created    41s              kubelet, k8s-node02  Created container liveness-exec
  Normal   Started    41s              kubelet, k8s-node02  Started container liveness-exec
  Warning  Unhealthy  2s (x3 over 8s)  kubelet, k8s-node02  Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
  Normal   Killing    2s               kubelet, k8s-node02  Container liveness-exec failed liveness probe, will be restarted
As the last lines of the output show, the liveness probe failed, so the container was killed and restarted.
Liveness probe: HTTP request
Pod YAML manifest
[root@k8s-master lifecycle]# pwd
/root/k8s_practice/lifecycle
[root@k8s-master lifecycle]# cat livenessProbe-httpget.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-httpget-pod
  labels:
    test: liveness
spec:
  containers:
  - name: liveness-httpget
    image: registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      httpGet:          # Any status code >= 200 and < 400 means success; any other code means failure.
        path: /index.html
        port: 80
        httpHeaders:    # Custom HTTP headers for the request. Repeated header fields are allowed.
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 5
      periodSeconds: 3
Create the Pod and check its status
[root@k8s-master lifecycle]# kubectl apply -f livenessProbe-httpget.yaml
pod/liveness-httpget-pod created
[root@k8s-master lifecycle]# kubectl get pod -n default -o wide
NAME                   READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
liveness-httpget-pod   1/1     Running   0          3s    10.244.2.27   k8s-node02   <none>           <none>
View the Pod details
[root@k8s-master lifecycle]# kubectl describe pod liveness-httpget-pod
Name:         liveness-httpget-pod
Namespace:    default
Priority:     0
Node:         k8s-node02/172.16.1.112
Start Time:   Sat, 23 May 2020 16:45:25 +0800
Labels:       test=liveness
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"test":"liveness"},"name":"liveness-httpget-pod","namespace":"defau...
Status:       Running
IP:           10.244.2.27
IPs:
  IP:  10.244.2.27
Containers:
  liveness-httpget:
    Container ID:   docker://4b42a351414667000fe94d4f3166d75e72a3401e549fed723126d2297124ea1a
    ………………
    Port:           80/TCP
    Host Port:      8080/TCP
    State:          Running
      Started:      Sat, 23 May 2020 16:45:26 +0800
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:80/index.html delay=5s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-v48g4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
………………
Events:
  Type    Reason     Age        From                 Message
  ----    ------     ----       ----                 -------
  Normal  Scheduled  <unknown>  default-scheduler    Successfully assigned default/liveness-httpget-pod to k8s-node02
  Normal  Pulled     5m52s      kubelet, k8s-node02  Container image "registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17" already present on machine
  Normal  Created    5m52s      kubelet, k8s-node02  Created container liveness-httpget
  Normal  Started    5m52s      kubelet, k8s-node02  Started container liveness-httpget
As shown above, the Pod's liveness probe is passing.
Enter the Pod's first container and delete the file the probe requests:
[root@k8s-master lifecycle]# kubectl exec -it liveness-httpget-pod -c liveness-httpget -- bash
root@liveness-httpget-pod:/# cd /usr/share/nginx/html/
root@liveness-httpget-pod:/usr/share/nginx/html# ls
50x.html  index.html
root@liveness-httpget-pod:/usr/share/nginx/html# rm -f index.html
root@liveness-httpget-pod:/usr/share/nginx/html# ls
50x.html
Check the Pod status and details again; the Pod's RESTARTS count has gone from 0 to 1.
[root@k8s-master lifecycle]# kubectl get pod -n default -o wide    # RESTARTS went from 0 to 1
NAME                   READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE   READINESS GATES
liveness-httpget-pod   1/1     Running   1          8m16s   10.244.2.27   k8s-node02   <none>           <none>
[root@k8s-master lifecycle]# kubectl describe pod liveness-httpget-pod
Name:         liveness-httpget-pod
Namespace:    default
Priority:     0
Node:         k8s-node02/172.16.1.112
Start Time:   Sat, 23 May 2020 16:45:25 +0800
Labels:       test=liveness
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"test":"liveness"},"name":"liveness-httpget-pod","namespace":"defau...
Status:       Running
IP:           10.244.2.27
IPs:
  IP:  10.244.2.27
Containers:
  liveness-httpget:
    Container ID:   docker://5d0962d383b1df5e59cd3d1100b259ff0415ac37c8293b17944034f530fb51c8
    ………………
    Port:           80/TCP
    Host Port:      8080/TCP
    State:          Running
      Started:      Sat, 23 May 2020 16:53:38 +0800
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sat, 23 May 2020 16:45:26 +0800
      Finished:     Sat, 23 May 2020 16:53:38 +0800
    Ready:          True
    Restart Count:  1
    Liveness:       http-get http://:80/index.html delay=5s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-v48g4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-v48g4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-v48g4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                From                 Message
  ----     ------     ----               ----                 -------
  Normal   Scheduled  <unknown>          default-scheduler    Successfully assigned default/liveness-httpget-pod to k8s-node02
  Normal   Pulled     7s (x2 over 8m19s) kubelet, k8s-node02  Container image "registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17" already present on machine
  Normal   Created    7s (x2 over 8m19s) kubelet, k8s-node02  Created container liveness-httpget
  Normal   Started    7s (x2 over 8m19s) kubelet, k8s-node02  Started container liveness-httpget
  Warning  Unhealthy  7s (x3 over 13s)   kubelet, k8s-node02  Liveness probe failed: HTTP probe failed with statuscode: 404
  Normal   Killing    7s                 kubelet, k8s-node02  Container liveness-httpget failed liveness probe, will be restarted
As shown above, when the liveness-httpget probe failed, the container was restarted. Because the restarted container starts again from the image's file system, index.html is back and the probe passes again.
Liveness probe: TCP port
Pod YAML manifest
[root@k8s-master lifecycle]# pwd
/root/k8s_practice/lifecycle
[root@k8s-master lifecycle]# cat livenessProbe-tcp.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp-pod
  labels:
    test: liveness
spec:
  containers:
  - name: liveness-tcp
    image: registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 3
TCP probe: the normal case
Create the Pod and check its status
[root@k8s-master lifecycle]# kubectl apply -f livenessProbe-tcp.yaml
pod/liveness-tcp-pod created
[root@k8s-master lifecycle]# kubectl get pod -o wide
NAME               READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
liveness-tcp-pod   1/1     Running   0          50s   10.244.4.23   k8s-node01   <none>           <none>
View the Pod details
[root@k8s-master lifecycle]# kubectl describe pod liveness-tcp-pod
Name:         liveness-tcp-pod
Namespace:    default
Priority:     0
Node:         k8s-node01/172.16.1.111
Start Time:   Sat, 23 May 2020 18:02:46 +0800
Labels:       test=liveness
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"test":"liveness"},"name":"liveness-tcp-pod","namespace":"default"}...
Status:       Running
IP:           10.244.4.23
IPs:
  IP:  10.244.4.23
Containers:
  liveness-tcp:
    Container ID:   docker://4de13e7c2e36c028b2094bf9dcf8e2824bfd15b8c45a0b963e301b91ee1a926d
    ………………
    Port:           80/TCP
    Host Port:      8080/TCP
    State:          Running
      Started:      Sat, 23 May 2020 18:03:04 +0800
    Ready:          True
    Restart Count:  0
    Liveness:       tcp-socket :80 delay=5s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-v48g4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-v48g4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-v48g4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age        From                 Message
  ----    ------     ----       ----                 -------
  Normal  Scheduled  <unknown>  default-scheduler    Successfully assigned default/liveness-tcp-pod to k8s-node01
  Normal  Pulling    74s        kubelet, k8s-node01  Pulling image "registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17"
  Normal  Pulled     58s        kubelet, k8s-node01  Successfully pulled image "registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17"
  Normal  Created    57s        kubelet, k8s-node01  Created container liveness-tcp
  Normal  Started    57s        kubelet, k8s-node01  Started container liveness-tcp
That is the normal case; the liveness probe succeeds.
Simulating a TCP probe failure
Change the probed TCP port in the YAML file above as follows:
    livenessProbe:
      tcpSocket:
        port: 8090   # previously 80
Delete the previous Pod, create it again, and check the Pod status after a short while:
[root@k8s-master lifecycle]# kubectl apply -f livenessProbe-tcp.yaml
pod/liveness-tcp-pod created
[root@k8s-master lifecycle]# kubectl get pod -o wide    # RESTARTS is now 1; a little later it becomes 2 and keeps increasing
NAME               READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
liveness-tcp-pod   1/1     Running   1          25s   10.244.2.28   k8s-node02   <none>           <none>
Pod details
[root@k8s-master lifecycle]# kubectl describe pod liveness-tcp-pod
Name:         liveness-tcp-pod
Namespace:    default
Priority:     0
Node:         k8s-node02/172.16.1.112
Start Time:   Sat, 23 May 2020 18:08:32 +0800
Labels:       test=liveness
………………
Events:
  Type     Reason     Age                From                 Message
  ----     ------     ----               ----                 -------
  Normal   Scheduled  <unknown>          default-scheduler    Successfully assigned default/liveness-tcp-pod to k8s-node02
  Normal   Pulled     12s (x2 over 29s)  kubelet, k8s-node02  Container image "registry.cn-beijing.aliyuncs.com/google_registry/nginx:1.17" already present on machine
  Normal   Created    12s (x2 over 29s)  kubelet, k8s-node02  Created container liveness-tcp
  Normal   Started    12s (x2 over 28s)  kubelet, k8s-node02  Started container liveness-tcp
  Normal   Killing    12s                kubelet, k8s-node02  Container liveness-tcp failed liveness probe, will be restarted
  Warning  Unhealthy  0s (x4 over 18s)   kubelet, k8s-node02  Liveness probe failed: dial tcp 10.244.2.28:8090: connect: connection refused
As shown above, the liveness-tcp probe failed because nothing listens on port 8090, and the container was restarted.
Probe checks: startup probe
Sometimes you have to deal with existing applications that need a long initialization time on startup (for example, Tomcat). In such cases it is tricky to set liveness probe parameters without compromising the fast response to the deadlocks that motivated the probe in the first place.
The trick is to set up a startup probe with the same command, HTTP, or TCP check as the liveness probe, and to choose failureThreshold * periodSeconds long enough to cover the worst-case startup time.
An example:
Pod YAML file
[root@k8s-master lifecycle]# pwd
/root/k8s_practice/lifecycle
[root@k8s-master lifecycle]# cat startupProbe-httpget.yaml
apiVersion: v1
kind: Pod
metadata:
  name: startup-pod
  labels:
    test: startup
spec:
  containers:
  - name: startup
    image: registry.cn-beijing.aliyuncs.com/google_registry/tomcat:7.0.94-jdk8-openjdk
    imagePullPolicy: IfNotPresent
    ports:
    - name: web-port
      containerPort: 8080
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /index.jsp
        port: web-port
      initialDelaySeconds: 5
      periodSeconds: 10
      failureThreshold: 1
    startupProbe:
      httpGet:
        path: /index.jsp
        port: web-port
      periodSeconds: 10       # The kubelet runs the startup probe every 10 seconds. Default 10, minimum 1.
      failureThreshold: 30    # Maximum number of failures
Start the Pod and check its status
[root@k8s-master lifecycle]# kubectl apply -f startupProbe-httpget.yaml
pod/startup-pod created
[root@k8s-master lifecycle]# kubectl get pod -o wide
NAME          READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE   READINESS GATES
startup-pod   1/1     Running   0          8m46s   10.244.4.26   k8s-node01   <none>           <none>
View the Pod details
[root@k8s-master ~]# kubectl describe pod startup-pod
With the startup probe in place, the application has at most 5 minutes (30 * 10 = 300s) to finish starting. Once the startup probe has succeeded once, the liveness probe takes over and can respond quickly to container deadlocks. If the startup probe never succeeds, the container is killed after 300 seconds and is then handled according to the Pod's restartPolicy.
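The budget arithmetic generalizes to any combination of failureThreshold and periodSeconds. The helper below is a small sketch with made-up names (probe_budget_seconds, survives_startup); it treats the budget as exactly failureThreshold * periodSeconds and ignores initial delays and probe timeouts.

```python
def probe_budget_seconds(failure_threshold, period_seconds):
    """Time a probe tolerates consecutive failures before the kubelet acts."""
    return failure_threshold * period_seconds


def survives_startup(startup_seconds, failure_threshold, period_seconds):
    """True if the application becomes healthy within the startup budget."""
    return startup_seconds <= probe_budget_seconds(failure_threshold, period_seconds)


print(probe_budget_seconds(30, 10))  # 300, the startup probe in the manifest
print(probe_budget_seconds(1, 10))   # 10, the liveness probe in the manifest
```

This shows the point of splitting the two probes: startup gets a generous 300-second window, while the liveness probe keeps a tight 10-second budget once the application is up.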
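Probe configuration in detail
The following fields give precise control over the behavior of liveness and readiness checks:
- initialDelaySeconds: number of seconds to wait after the container has started before liveness and readiness probes are initiated. The default is 0 seconds; the minimum is 0.
- periodSeconds: how often (in seconds) to perform the probe. The default is 10 seconds; the minimum is 1.
- timeoutSeconds: number of seconds after which the probe times out. The default is 1 second; the minimum is 1.
- successThreshold: minimum number of consecutive successes for the probe to be considered successful after having failed. The default is 1. It must be 1 for liveness probes. The minimum is 1.
- failureThreshold: number of times Kubernetes retries when a probe fails before giving up. Giving up on a liveness probe means restarting the container; giving up on a readiness probe means the Pod is marked as not ready. The default is 3; the minimum is 1.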
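How successThreshold and failureThreshold interact is easiest to see by replaying a sequence of probe results. The function below is a simplified model written for this article (readiness_timeline is a made-up name), assuming the container starts as not ready and that the state only flips after the required number of consecutive results.

```python
def readiness_timeline(results, success_threshold=1, failure_threshold=3):
    """Replay probe results (True = pass) and return the Ready state after each.

    The state flips to ready after success_threshold consecutive passes and
    back to not ready after failure_threshold consecutive failures.
    """
    ready = False
    passes = fails = 0
    timeline = []
    for ok in results:
        if ok:
            passes, fails = passes + 1, 0
            if passes >= success_threshold:
                ready = True
        else:
            fails, passes = fails + 1, 0
            if fails >= failure_threshold:
                ready = False
        timeline.append(ready)
    return timeline
```

With the defaults, a single passing probe makes the container ready, and one or two isolated failures are tolerated; only the third consecutive failure marks it not ready again.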
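HTTP probes accept additional fields under httpGet:
- host: host name to connect to; defaults to the Pod's IP. You can set "Host" in the HTTP headers instead.
- scheme: scheme used to connect to the host (HTTP or HTTPS). Defaults to HTTP.
- path: path to access on the HTTP server.
- httpHeaders: custom headers to set in the request. HTTP allows repeated headers.
- port: number or name of the port to access on the container. A number must be in the range 1 to 65535.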
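To see how these fields combine, the sketch below assembles the URL that an HTTP probe would request. It is an illustration only: probe_url is a hypothetical helper, it handles numeric ports only (not named ports such as web-port), and it does not send any request.

```python
def probe_url(pod_ip, port, path="/", scheme="HTTP", host=None):
    """Compose the URL an HTTP probe would request from the httpGet fields."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    target = host or pod_ip  # host defaults to the Pod's IP
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme.lower()}://{target}:{port}{path}"


print(probe_url("10.244.2.27", 80, "/index.html"))
# http://10.244.2.27:80/index.html
```

The example values correspond to the liveness-httpget-pod shown earlier, whose probe requests /index.html on port 80 of the Pod IP.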