Summary of Common Kubernetes Errors

Reprinted article. 2020-09-30 11:09. Mesos&Kubernetes

1. kubectl logs --follow ends with "error: unexpected EOF" after 5 minutes with no stdout/stderr

When running kubectl logs --follow on a pod, after 5 minutes of no stdout/stderr, we received:

$ kubectl --kubeconfig=config --namespace=foo logs --follow foo-oneoff-w8npn --container bar

######################################
#                                    #
#  /system_tests/test_derp.py (1/4)  #
#                                    #
######################################

RESULTS: [/system_tests/test_derp/TestDerp] Ran 4 tests in 130.044 s, ALL TESTS PASSED


######################################
#                                    #
#  /system_tests/test_fuzz.py (2/4)  #
#                                    #
######################################

error: unexpected EOF

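Solution:

For posterity, the problem here is that we're using haproxy to serve a VIP to balance HA Kubernetes masters. HAProxy applies its client and server inactivity timeouts to the proxied connection, so a log stream that produces no output is closed by the proxy rather than by Kubernetes.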

Specifically we're using HA-Proxy v1.4.21, and we have this in our haproxy cfg:

defaults
  timeout client  500000
  timeout server  500000

2. Error: The connection to the server localhost:8080 was refused - did you specify the right host or port?

After the K8s cluster was initialized successfully, running kubectl get nodes to view node information failed.

Error message: The connection to the server localhost:8080 was refused - did you specify the right host or port?

Solution:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
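kubectl falls back to localhost:8080 when it finds no kubeconfig, which is why copying admin.conf fixes the error. The following is a minimal sketch for checking which file kubectl would look for by default. The function name kubeconfig_path is hypothetical, and the sketch assumes KUBECONFIG holds a single path (kubectl also accepts a colon-separated list, which is not handled here).

```shell
# Print the kubeconfig path kubectl looks for by default:
# $KUBECONFIG if it is set, otherwise $HOME/.kube/config.
# Assumption: KUBECONFIG contains one path, not a colon-separated list.
kubeconfig_path() {
  printf '%s\n' "${KUBECONFIG:-$HOME/.kube/config}"
}

cfg=$(kubeconfig_path)
if [ -r "$cfg" ]; then
  echo "kubeconfig found: $cfg"
else
  echo "kubeconfig missing or unreadable: $cfg"
fi
```

When working as root, setting export KUBECONFIG=/etc/kubernetes/admin.conf is an alternative to copying the file.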

3. The calico component on the master node is 0/1 Running while the calico components on the worker nodes are 1/1 Running; describe pod shows:

Readiness probe failed: calico/node is not ready: BIRD is not ready: BGP not established with 10.244.0.1,10.244.2.1
2020-04-13 06:29:59.582 [INFO][682] health.go 156: Number of node(s) with BGP peering established = 0

Solution:
Edit calico.yaml and add two lines:

            - name: IP_AUTODETECTION_METHOD
              value: "interface=eth0"

Set value to the actual network interface name shown by ip a. The result looks like this:

            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            - name: IP_AUTODETECTION_METHOD
              value: "interface=eth0"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"

After a short wait, the pod returns to normal.
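To pick the right interface name for IP_AUTODETECTION_METHOD, it can help to list each interface next to its IPv4 address. The following is a rough sketch that parses the one-line output of ip -o -4 addr show. The helper name list_ipv4_interfaces is hypothetical, and the sketch assumes the iproute2 ip command is available on the node.

```shell
# Read `ip -o -4 addr show` output on stdin and print "interface address" pairs.
# In that one-line format, field 2 is the interface name, field 3 is "inet",
# and field 4 is the address with its prefix length.
list_ipv4_interfaces() {
  awk '$3 == "inet" { print $2, $4 }'
}

# Run on the node itself; skipped when the iproute2 `ip` command is absent.
if command -v ip >/dev/null 2>&1; then
  ip -o -4 addr show | list_ipv4_interfaces
fi
```

The interface that carries the node's cluster-facing address is the one to put after interface=. On a running cluster, the same variable can also be set without editing the manifest via kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=eth0, assuming the DaemonSet is named calico-node and lives in kube-system.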

Reference: https://blog.csdn.net/majixiang1996/article/details/105438506/

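4. When installing a Kubernetes cluster with kubeadm, etcd stays stuck in the starting state and the etcd log reports a certificate error

2020-12-08 17:11:10.741954 I | embed: rejected connection from "192.168.100.179:47288" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")

Error analysis: the clock on the machine that generated the certificate is ahead of the server's clock, so when the server verifies the certificate, its current time falls outside the certificate's validity period.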

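Solution:

  1) Synchronize the clocks of the servers and the machine that generated the certificate (in a highly available k8s cluster, the master nodes must keep their clocks in sync with each other).

  2) Alternatively, set the clock on the machine that generates the certificate so that it is earlier than the server's time.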
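The clock-skew failure can be illustrated with a small sketch that compares a clock reading with a certificate's validity window. The function name cert_validity is hypothetical and all inputs are Unix timestamps. On a real node, the window can be read with openssl x509 -noout -dates -in followed by the certificate path.

```shell
# Classify a certificate's validity window against a clock reading.
# Arguments: current time, notBefore, notAfter (all Unix timestamps in seconds).
cert_validity() {
  cv_now=$1
  cv_not_before=$2
  cv_not_after=$3
  if [ "$cv_now" -lt "$cv_not_before" ]; then
    echo "not yet valid"
  elif [ "$cv_now" -gt "$cv_not_after" ]; then
    echo "expired"
  else
    echo "valid"
  fi
}

# Example: the issuing machine's clock runs 10 minutes ahead of this server,
# so notBefore lies 600 seconds in the server's future.
now=$(date +%s)
cert_validity "$now" $((now + 600)) $((now + 31536000))   # prints "not yet valid"
```

Once the clocks are synchronized (or the 10 minutes have passed), the same certificate falls inside its window and verification succeeds.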

Reference: http://www.bubuko.com/infodetail-3670080.html

5. If NFS is used as the storage solution, every node must have the NFS client installed

Check whether each server has the nfs-utils, rpcbind, and libtirpc packages installed.
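One possible way to run that check on each node is sketched below. It assumes an RPM-based distribution (the package names above are the RPM ones), and the function name missing_rpm_packages is hypothetical.

```shell
# Print every package from the argument list that `rpm -q` does not report as installed.
# Assumption: an RPM-based distribution; if rpm itself is absent, all names are printed.
missing_rpm_packages() {
  for pkg in "$@"; do
    rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
  done
}

missing_rpm_packages nfs-utils rpcbind libtirpc
```

Empty output means all three packages are present. Any name that is printed still needs to be installed on that node.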

Reference: http://www.bubuko.com/infodetail-3546759.html

6. Error cannot allocate memory or no space left on device: fixing the K8S memory leak problem

Link: https://www.cnblogs.com/zhangmingcheng/p/14309962.html

7. Running a kubectl command reports error: You must be logged in to the server (Unauthorized)

Link: https://www.cnblogs.com/zhangmingcheng/p/14317551.html

8. After kubectl delete pod kube-scheduler-xxxx -n kube-system, the pod did not restart and simply disappeared; restarting the kubelet service brought the pod back
