Common Kubernetes Errors

Reposted article. 2019-11-24 19:26. Tags: error roundup / kubernetes / Bug / Kubernetes

Pod problems

OOMKilled: the Pod's memory usage exceeded the limit set in resources.limits, so it was forcibly killed.
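CrashLoopBackOff: the Pod is stuck in a crash-restart loop. The restart delay doubles each time (10, 20, 40, 80 seconds, and so on) up to a cap of 300 seconds, after which the Pod keeps restarting every 300 seconds indefinitely.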
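The doubling schedule described for CrashLoopBackOff can be sketched in a few lines. This is only an illustration of the arithmetic, not kubelet code; the function name crashloop_delays and its parameters are made up for this example.

```python
from itertools import islice


def crashloop_delays(initial=10, factor=2, cap=300):
    """Yield restart delays in seconds: start at `initial`, double each time, never exceed `cap`."""
    delay = initial
    while True:
        yield delay
        # Double the delay for the next restart, but clamp it to the cap.
        delay = min(delay * factor, cap)


# First eight delays of the schedule.
first_eight = list(islice(crashloop_delays(), 8))
print(first_eight)  # [10, 20, 40, 80, 160, 300, 300, 300]
```

With these defaults the cap is reached on the sixth restart, and every later restart waits the full 300 seconds.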
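Pod stuck in Pending: no node can satisfy the Pod's requirements, so it cannot be scheduled. Typical causes: the port is already taken by another container through hostPort, or the nodes are tainted.
FailedCreatePodSandBox: Failed create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded: very likely a problem with the CNI network plugin (for example, the IP address pool is exhausted).
SandboxChanged: Pod sandbox changed, it will be killed and re-created: very likely a memory limit caused the container to be OOMKilled, or some other resource ran short.
FailedSync: error determining status: rpc error: code = DeadlineExceeded desc = context deadline exceeded: often appears shortly before or after the previous two errors, and is very likely a problem with the CNI network plugin.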
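On a development cluster, deploying all services at once makes the Pods compete for resources, so liveness probes fail and Pods restart repeatedly. The restarts push resource usage even higher, creating a vicious cycle.
- Add resources.requests to every Pod. When resources run short, the remaining Pods then stay unscheduled until resources recover.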
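As a rough illustration of the advice above, the sketch below builds a minimal Pod manifest whose container declares resources.requests. The helper name pod_with_requests, the Pod name, the image, and the CPU and memory values are all placeholders chosen for this example; pick values that match what your services really need.

```python
import json


def pod_with_requests(name, image, cpu="100m", memory="128Mi"):
    """Build a minimal Pod manifest whose single container declares resources.requests."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # The scheduler only places the Pod on a node with this much unreserved capacity.
                    "resources": {"requests": {"cpu": cpu, "memory": memory}},
                }
            ]
        },
    }


# Hypothetical Pod name and image, for illustration only.
manifest = pod_with_requests("demo-api", "example.com/demo-api:latest")
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output can be saved to a file and applied with kubectl apply -f.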
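A Pod accumulates many Failed records and the Deployment keeps recreating it: inspect the Pod's Events and Status with kubectl describe/edit pod <pod-name>. The failure reason is usually visible there, for example a node problem that caused the Pod to be evicted.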
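When there are many Failed Pods, tallying them by reason can be quicker than describing each one. The sketch below assumes input shaped like the output of kubectl get pods -o json and reads the status.phase and status.reason fields; the function name failed_pod_reasons and the sample data are made up for illustration.

```python
from collections import Counter


def failed_pod_reasons(pod_list):
    """Count Pods in the Failed phase, grouped by status.reason (for example "Evicted")."""
    reasons = Counter()
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Failed":
            # Some Failed Pods carry no reason; group those under "Unknown".
            reasons[status.get("reason", "Unknown")] += 1
    return reasons


# Made-up sample shaped like `kubectl get pods -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "web-1"}, "status": {"phase": "Failed", "reason": "Evicted"}},
        {"metadata": {"name": "web-2"}, "status": {"phase": "Failed", "reason": "Evicted"}},
        {"metadata": {"name": "web-3"}, "status": {"phase": "Running"}},
    ]
}

reasons = failed_pod_reasons(sample)
print(dict(reasons))  # {'Evicted': 2}
```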

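404: the Service or Istio Gateway does not exist.
503: the Pods behind the Service are NotReady.
504: there are two main possibilities.
1. The Ingress Controller's IP table may not have been updated, so requests are proxied to a Pod IP that no longer exists and never get a response.
2. The Pod responds too slowly, which is a problem in the application code.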

Troubleshooting flow for Ingress-related network problems:

Which ingress controller?
Timeout between client and ingress controller, or between ingress controller and backend service/pod?
HTTP/504 generated by the ingress controller, proven by logs from the ingress controller?
If you port-forward to skip the internet between client and ingress controller, does the timeout still happen?
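One possible reading of the last two questions is a small decision function. The mapping from observations to conclusions below is an interpretation made for this sketch, not an official procedure, and the names locate_timeout, controller_logged_504, and times_out_via_port_forward are hypothetical.

```python
def locate_timeout(controller_logged_504, times_out_via_port_forward):
    """Guess where a timeout originates from two observations.

    controller_logged_504: the ingress controller's own logs show it returned HTTP 504.
    times_out_via_port_forward: the timeout still occurs when the ingress controller
        is reached through a port-forward, bypassing the internet.
    """
    if controller_logged_504:
        # The controller gave up waiting for the backend.
        return "between ingress controller and backend service/pod"
    if times_out_via_port_forward:
        # The internet is out of the picture, yet the request still hangs.
        return "at the ingress controller or behind it"
    return "between client and ingress controller"


print(locate_timeout(controller_logged_504=True, times_out_via_port_forward=True))
print(locate_timeout(controller_logged_504=False, times_out_via_port_forward=False))
```

The answer only narrows down where to look next; the controller logs and the backend's response times still need to be checked to confirm it.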
