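Attempted solutions:
- Upgraded Docker, because a check showed that the Docker versions on the machines in the cluster were not all the same. After the upgrade, I restarted the Docker daemon.
- Checked the pod's describe output, which showed the following: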
State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Mon, 23 Mar 2020 16:24:15 +0800
Finished: Mon, 23 Mar 2020 16:24:27 +0800
Ready: False
Restart Count: 29
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreatePodSandBox 48m (x43134 over 17h) kubelet, master Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-flannel-ds-amd64-d7xxk": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:303: getting the final child's pid from pipe caused \"read init-p: connection reset by peer\"": unknown
Warning FailedCreatePodSandBox 13m (x8967 over 16h) kubelet, master Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-flannel-ds-amd64-d7xxk": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:299: copying bootstrap data to pipe caused \"write init-p: broken pipe\"": unknown
Normal SandboxChanged 3m40s (x54265 over 22h) kubelet, master Pod sandbox changed, it will be killed and re-created.
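OOMKilled: is the pod running out of memory? Only the flannel pod on the master shows this error. The flannel pods on the nodes show nothing similar, even though they all run with the same memory and CPU limits.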
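The Exit Code: 137 in the output above follows the usual shell convention of 128 plus the signal number, so it corresponds to signal 9 (SIGKILL), which is what the kernel OOM killer sends. The sketch below only reproduces that arithmetic with a throwaway child shell. It does not touch the cluster, and the variable names are illustrative.

```shell
# Start a child shell that kills itself with signal 9 (SIGKILL),
# then capture the exit status the parent shell reports for it.
code=0
sh -c 'kill -9 $$' || code=$?

# By convention the status is 128 + signal number.
signal=$((code - 128))

echo "exit code:     $code"
echo "signal number: $signal"
echo "signal name:   $(kill -l "$signal")"
```

An exit code of 137 on its own only says the process received SIGKILL. It is the Reason: OOMKilled field that attributes the kill to memory.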
kubectl patch ds -n=kube-system kube-flannel-ds-amd64 -p '{"spec": {"template":{"spec":{"containers": [{"name":"kube-flannel", "resources": {"limits": {"cpu": "250m","memory": "550Mi"},"requests": {"cpu": "100m","memory": "100Mi"}}}]}}}}'
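Even so, I used the command above to raise the memory and CPU limits a little, and I will watch whether the problem comes back. If it does not, the resource limits were probably the cause.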
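To confirm that the patch took effect and to keep an eye on restarts, something like the following could be used. It is a minimal sketch: the helper name check_flannel_resources is made up for illustration, and the namespace, DaemonSet name, and container name are taken from the patch command above.

```shell
# Names copied from the kubectl patch command; adjust if yours differ.
NS=kube-system
DS=kube-flannel-ds-amd64

# Hypothetical helper, not part of kubectl or flannel.
check_flannel_resources() {
    # Print the resources block now set on the kube-flannel container
    # in the DaemonSet's pod template.
    kubectl -n "$NS" get ds "$DS" \
        -o jsonpath='{.spec.template.spec.containers[?(@.name=="kube-flannel")].resources}'
    echo

    # List the pods with their restart counts and the node each one runs on,
    # so the pod on the master can be compared with the ones on the nodes.
    kubectl -n "$NS" get pods -o wide
}
```

Running check_flannel_resources after the patch should show the new limits, and the RESTARTS column for the flannel pod on the master should stop growing if the limits were the cause.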