Common Kubernetes Errors

Reposted article, dated 2019-11-24 19:26.

Pod failures

OOMKilled: A container in the Pod used more memory than the limit set in resources.limits, so it was forcibly killed.
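CrashLoopBackOff: The Pod is stuck in a crash-restart loop. The delay between restarts doubles each time (10, 20, 40, 80 seconds, and so on) until it reaches the cap of 300 seconds, after which the Pod keeps restarting every 300 seconds indefinitely.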
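The doubling-then-capped restart delay described above can be sketched in a few lines. This is only an illustration of the schedule stated in the text, not kubelet code; the function name and its parameters are made up for the example.

```python
from itertools import islice


def crash_loop_delays(initial=10, factor=2, cap=300):
    """Yield successive restart delays in seconds: doubling, then capped."""
    delay = initial
    while True:
        yield min(delay, cap)
        # Double the delay for the next restart, never exceeding the cap.
        delay = min(delay * factor, cap)


# First eight delays of the schedule described in the text.
print(list(islice(crash_loop_delays(), 8)))
# [10, 20, 40, 80, 160, 300, 300, 300]
```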
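Pod stuck in Pending: No node can satisfy the Pod's requirements, so it cannot be scheduled. Typical causes: the port it needs is already claimed by another container via hostPort, or the nodes carry taints that the Pod does not tolerate.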
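For the taint case, the sketch below shows roughly how a toleration is matched against a taint, following the matching rules in the Kubernetes documentation. It is a simplification (it ignores tolerationSeconds and everything else the scheduler checks), and the function names and sample data are hypothetical.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key combined with Exists tolerates every taint.
        return key == "" or key == taint["key"]
    # Equal: both key and value must match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def blocking_taints(pod_tolerations, node_taints):
    """List the NoSchedule/NoExecute taints that no toleration covers."""
    return [
        taint
        for taint in node_taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
        and not any(tolerates(tol, taint) for tol in pod_tolerations)
    ]


# Hypothetical node taint and Pod toleration.
node_taints = [{"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}]
toleration = {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}

print(blocking_taints([], node_taints))            # the taint blocks scheduling
print(blocking_taints([toleration], node_taints))  # nothing blocks the Pod
```

A non-empty result for every node is one reason a Pod would stay Pending.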
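FailedCreatePodSandBox: Failed create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded: Most likely a problem with the CNI network plugin (for example, the IP address pool is exhausted).
SandboxChanged: Pod sandbox changed, it will be killed and re-created: Most likely the container was OOMKilled because of its memory limit, or some other resource ran short.
FailedSync: error determining status: rpc error: code = DeadlineExceeded desc = context deadline exceeded: Often appears together with the two errors above; most likely a CNI network plugin problem.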
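On a development cluster, deploying all services at once makes the Pods compete for resources. Liveness probes then fail and the Pods restart repeatedly, and every restart pushes resource usage even higher: a vicious cycle.
- Set resources.requests on every Pod. When resources run short, the scheduler then stops placing further Pods until resources free up.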
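Why requests break the cycle can be seen in a toy model: the scheduler admits a Pod only if its request still fits, so later Pods wait instead of overloading the node. The sketch below assumes a single node and CPU only, with made-up service names and numbers; the real scheduler also considers memory, multiple nodes, and many other constraints.

```python
def schedule(pods, allocatable_millicpu):
    """Greedy single-node sketch: place a Pod only if its CPU request fits."""
    placed, pending, used = [], [], 0
    for name, request_millicpu in pods:
        if used + request_millicpu <= allocatable_millicpu:
            placed.append(name)
            used += request_millicpu
        else:
            # Not enough unreserved CPU: the Pod stays Pending.
            pending.append(name)
    return placed, pending


# Hypothetical services with their resources.requests.cpu in millicores.
pods = [("api", 500), ("db", 1000), ("cache", 500), ("worker", 500)]
placed, pending = schedule(pods, allocatable_millicpu=2000)
print(placed)   # ['api', 'db', 'cache']
print(pending)  # ['worker']
```

Without requests, every Pod counts as needing nothing, so all of them would be placed at once and left to fight over the node.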
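Many Failed Pod records, with the Deployment recreating Pods over and over: Inspect the Pod's Events and Status with kubectl describe pod <pod-name> (or kubectl edit pod <pod-name>). The failure reason is usually shown there, for example that the Pod was evicted because its node became unhealthy.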
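Several of the conditions above can also be read from the Pod object itself, for example from the JSON printed by kubectl get pod <pod-name> -o json. The following is a minimal sketch that checks a few well-known status fields; the diagnose function and the sample Pod are hypothetical, and it covers only the cases discussed in this section.

```python
def diagnose(pod):
    """Return human-readable findings from a Pod object (parsed JSON)."""
    status = pod.get("status", {})
    findings = []
    # Evicted Pods end up in phase Failed with reason Evicted.
    if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
        findings.append("Evicted: " + status.get("message", ""))
    # A Pending Pod that cannot be scheduled has PodScheduled=False.
    for cond in status.get("conditions", []):
        if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
            findings.append("Unschedulable: " + cond.get("message", ""))
    for cs in status.get("containerStatuses", []):
        waiting = cs.get("state", {}).get("waiting", {})
        if waiting.get("reason") == "CrashLoopBackOff":
            findings.append(cs["name"] + ": CrashLoopBackOff")
        terminated = cs.get("lastState", {}).get("terminated", {})
        if terminated.get("reason") == "OOMKilled":
            findings.append(cs["name"] + ": OOMKilled")
    return findings


# Hypothetical Pod whose container keeps getting OOMKilled.
sample = {
    "status": {
        "phase": "Running",
        "containerStatuses": [
            {
                "name": "app",
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ],
    }
}
print(diagnose(sample))
```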

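404: The Service/Istio Gateway does not exist.
503: The Pods behind the Service are NotReady.
504: Two main possibilities:
1. The Ingress Controller's IP table may not have been updated, so requests are proxied to a Pod IP that no longer exists and never get a response.
2. The Pod responds too slowly, which points to a problem in the application code.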
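The first 504 cause comes down to a set difference: any address the proxy still targets that no Ready Pod owns is stale. A minimal sketch, assuming the two address lists have already been collected by hand (from the ingress controller's configuration and from the Service's endpoints); the function name and the sample IPs are hypothetical.

```python
def stale_upstreams(upstream_ips, ready_pod_ips):
    """Return upstream IPs the proxy still targets that no Ready Pod owns."""
    return sorted(set(upstream_ips) - set(ready_pod_ips))


# Hypothetical sample data.
upstreams = ["10.244.1.7", "10.244.2.9", "10.244.3.4"]
ready_pods = ["10.244.1.7", "10.244.3.4"]
print(stale_upstreams(upstreams, ready_pods))  # ['10.244.2.9']
```

An empty result makes the second cause, a slow Pod, the more likely one.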

Troubleshooting flow for Ingress-related network problems:

Which ingress controller?
Timeout between client and ingress controller, or between ingress controller and backend service/pod?
HTTP/504 generated by the ingress controller, proven by logs from the ingress controller?
If you port-forward to skip the internet between client and ingress controller, does the timeout still happen?
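One way to read these questions is as a small decision procedure. The sketch below is an interpretation, not an official flow: it assumes two observations gathered by hand, whether the timeout still occurs when port-forwarding straight to the ingress controller, and whether the controller's logs show that it generated the 504 itself. The function name and return strings are made up.

```python
def locate_timeout(times_out_via_port_forward, controller_logged_504):
    """Guess which hop is timing out from two manual observations."""
    if not times_out_via_port_forward:
        # Bypassing the internet made the timeout disappear.
        return "between client and ingress controller"
    if controller_logged_504:
        # The controller gave up waiting for its upstream.
        return "between ingress controller and backend service/pod"
    return "inconclusive: check the ingress controller itself and its logs"


print(locate_timeout(times_out_via_port_forward=True, controller_logged_504=True))
```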

