Kubernetes常用错误总结


1、Kubernetes: requesting flag for "kubectl logs" to avoid 5-minute timeout if no stdout/stderr 

When running kubectl logs --follow on a pod, after 5 minutes of no stdout/stderr, we received:

$ kubectl --kubeconfig=config --namespace=foo logs --follow foo-oneoff-w8npn --container bar

######################################
#                                    #
#  /system_tests/test_derp.py (1/4)  #
#                                    #
######################################

RESULTS: [/system_tests/test_derp/TestDerp] Ran 4 tests in 130.044 s, ALL TESTS PASSED


######################################
#                                    #
#  /system_tests/test_fuzz.py (2/4)  #
#                                    #
######################################

error: unexpected EOF

 解决方案:

For posterity, the problem here is that we're using haproxy to serve a VIP to balance HA Kubernetes masters.

Specifically we're using HA-Proxy v1.4.21, and we have this in our haproxy cfg:

defaults
  timeout client  500000
  timeout server  500000

2、The connection to the server localhost:8080 was refused - did you specify the right host or port? 错误

K8s集群初始化成功后,kubectl get nodes 查看节点信息时报错

报错信息:The connection to the server localhost:8080 was refused - did you specify the right host or port?

解决方法:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

3、master节点的calico组件是0/1 Running状态,其他worker节点的calico组件是1/1 Running状态,describe pod发现是

Readiness probe failed: calico/node is not ready: BIRD is not ready: BGP not established with 10.244.0.1,10.244.2.12020-04-13 06:29:59.582 [INFO][682] health.go 156: Number of node(s) with BGP peering established = 0

解决办法:
修改的calico.yaml,新增两行:

            - name: IP_AUTODETECTION_METHOD
              value: "interface=eth0"

value指向从ip a看到的实际网卡名。结果如下:

            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            - name: IP_AUTODETECTION_METHOD
              value: "interface=eth0"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"

等一会就正常了。

参考:https://blog.csdn.net/majixiang1996/article/details/105438506/

4、使用kubeadm安装kubernetes集群etcd一直卡在starting状态,查看etcd日志报2020-12-08 17:11:10.741954 I | embed: rejected connection from "192.168.100.179:47288" (error "tls: failed to verify client‘s certificate: x509: certificate has expired or is not yet valid", ServerName "") 错误

错误分析:原因是生成证书的机器时间要比服务器时间快,导致服务器验证时,证书超出了时间使用范围。

解决办法:

  1)、服务器和生成证书机器进行时间同步更新(高可用k8s集群需要master节点之间时间同步)。

  2)、或者直接调整生成证书的机器时间,小于服务器的时间。

参考:http://www.bubuko.com/infodetail-3670080.html

5、如果使用nfs作为存储方案所有节点都需要保证安装好nfs-client

检查服务器是否已安装好nfs-utils、rpcbind、libtirpc包。

参考:http://www.bubuko.com/infodetail-3546759.html

6、报错 cannot allocate memory 或者 no space left on device ,修复K8S内存泄露问题

链接:https://www.cnblogs.com/zhangmingcheng/p/14309962.html

7、执行kubectl命令时报错 error: You must be logged in to the server (Unauthorized)

链接:https://www.cnblogs.com/zhangmingcheng/p/14317551.html

8、kubectl delete kube-scheduler-xxxx  -n=kube-system 后,这个pod没有重启直接没了,重启kubelet服务调度pod重启

 

 


免责声明!

本站转载的文章为个人学习借鉴使用,本站对版权不负任何法律责任。如果侵犯了您的隐私权益,请联系本站邮箱yoyou2525@163.com删除。



 
粤ICP备18138465号  © 2018-2025 CODEPRJ.COM