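Background:
When a node goes down, the pods running on it should be evicted quickly to other nodes so they can keep serving traffic. In testing, however, the pods were only evicted after a 5-minute wait.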
Guides online suggest editing /etc/kubernetes/manifests/kube-controller-manager.yaml and setting:
- --node-monitor-grace-period=10s
- --node-monitor-period=2s
- --pod-eviction-timeout=10s
When I tried this, however, it had no effect.
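Solution:
Fix it in the Deployment instead. By default each pod tolerates the node.kubernetes.io/not-ready and node.kubernetes.io/unreachable NoExecute taints for 300 seconds, which matches the 5-minute delay. Setting a smaller tolerationSeconds in the pod template shortens it.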
[root@node-01 testnginx]# kubectl describe pod nginx-deployment|grep -i toleration -A 2
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
--
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
[root@node-01 testnginx]# cat test-nginx.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 443
Tested and confirmed to work!
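Note for newer clusters: the extensions/v1beta1 Deployment API was removed in Kubernetes 1.16, so the manifest above will be rejected there. The following is a minimal sketch of the same Deployment adapted to apps/v1, which additionally requires spec.selector. It is an adaptation, not the file that was tested above, so treat it as a starting point and verify it on your own cluster.

```yaml
# Sketch for Kubernetes 1.16+ (assumption: same names and values as test-nginx.yaml above).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  selector:                 # required by apps/v1; must match the template labels
    matchLabels:
      app: my-nginx
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      tolerations:
      # Tolerate each NoExecute taint for only 2 seconds instead of the default 300.
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 443
```

After applying it with kubectl apply -f, the same kubectl describe pod check used earlier should list both tolerations with "for 2s" rather than "for 300s".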