1. Adding a node: error 1
Note: the fix below can be applied on each node in advance (skip it if no error occurs).
yang@node2:~$ sudo kubeadm join 192.168.1.101:6443 --token 6131nu.8ohxo1ttgwiqlwmp --discovery-token-ca-cert-hash sha256:9bece23d1089b6753a42ce4dab3fa5ac7d2d4feb260a0f682bfb06ccf1eb4fe2
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
(the two kubelet-check lines above repeat several more times)
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
Cause:
1. Docker's default cgroup driver is cgroupfs. cgroupfs is a virtual filesystem type developed as the user-facing interface to cgroups: like sysfs and proc, it exposes the cgroup hierarchy to users and passes their changes on to the kernel. Querying and modifying cgroups is done through the cgroupfs filesystem.
2. Kubernetes recommends systemd instead of cgroupfs. systemd is the host's own cgroup manager and already allocates cgroups for every process, so when Docker keeps its default cgroupfs driver, two cgroup managers end up running side by side. Under resource pressure this can make the node unstable.
Fix:
① Create /etc/docker/daemon.json with the following content:
   {"exec-opts": ["native.cgroupdriver=systemd"]}
② Restart docker:
   sudo systemctl restart docker
③ Restart the kubelet:
   sudo systemctl restart kubelet
   sudo systemctl status kubelet
④ Before re-running the join command, clear the failed join state:
   sudo kubeadm reset
⑤ Join the cluster again:
   sudo kubeadm join 192.168.1.101:6443 --token 6131nu.8ohxo1ttgwiqlwmp --discovery-token-ca-cert-hash sha256:9bece23d1089b6753a42ce4dab3fa5ac7d2d4feb260a0f682bfb06ccf1eb4fe2
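The mismatch can be verified directly before and after applying the fix. The sketch below compares the driver Docker reports with the one in the kubelet config; `docker info --format` is a standard Docker flag, and /var/lib/kubelet/config.yaml is the path kubeadm writes (shown in the join log above).

```shell
# Read the cgroup driver each side is using (empty if the daemon/file is absent)
docker_driver="$(docker info --format '{{.CgroupDriver}}' 2>/dev/null)"
kubelet_driver="$(grep -o 'cgroupDriver: .*' /var/lib/kubelet/config.yaml 2>/dev/null | awk '{print $2}')"

# The join only succeeds when both report the same driver (ideally "systemd")
drivers_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

if drivers_match "$docker_driver" "$kubelet_driver"; then
  echo "cgroup drivers agree: $docker_driver"
else
  echo "mismatch or missing: docker='$docker_driver' kubelet='$kubelet_driver'"
fi
```

If this prints a mismatch after step ③, re-check that daemon.json was written correctly and that Docker actually restarted.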
2. Error 2
Note: each node needs the kube-proxy image.
Problem:
Normal BackOff 106s (x249 over 62m) kubelet Back-off pulling image "k8s.gcr.io/kube-proxy:v1.22.4"
Cause:
The k8s.gcr.io/kube-proxy:v1.22.4 image is missing on the node.
Fix:
① Pull the image from a mirror
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4
② Retag the image
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4 k8s.gcr.io/kube-proxy:v1.22.4
3. Error 3
Note: each node needs the pause:3.5 image.
Problem:
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Normal   Scheduled               17m                   default-scheduler  Successfully assigned kube-system/kube-proxy-dv2sw to node2
  Warning  FailedCreatePodSandBox  4m48s (x26 over 16m)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.5": Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  FailedCreatePodSandBox  93s (x2 over 16m)     kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.5": Error response from daemon: Get "https://k8s.gcr.io/v2/": dial tcp 74.125.195.82:443: i/o timeout
Cause:
The k8s.gcr.io/pause:3.5 image is missing on the node.
Fix:
① Pull the image from a mirror
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5
② Retag the image
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5
4. Error 4
Note: each node needs the coredns:v1.8.4 image.
Problem:
Warning Failed 28m (x11 over 73m) kubelet Failed to pull image "k8s.gcr.io/coredns/coredns:v1.8.4" : rpc error: code = Unknown desc = Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
This is a pitfall: the image tag is v1.8.4, not 1.8.4, so pulling with the wrong tag is what made the download time out.
Check the list of images kubeadm expects:
yang@master:/etc/docker$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.4
k8s.gcr.io/kube-controller-manager:v1.22.4
k8s.gcr.io/kube-scheduler:v1.22.4
k8s.gcr.io/kube-proxy:v1.22.4
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4
Fix:
① Pull the image from a mirror
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4
② Retag the image
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4
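Errors 2 through 4 are all the same problem: the node cannot reach k8s.gcr.io, and each fix is pull-from-mirror-then-retag. The sketch below automates that for every image `kubeadm config images list` reports. It assumes the same Aliyun mirror used above, and that the mirror hosts images flat, so the coredns/coredns repository path collapses to coredns while the tag (including its "v" prefix) stays unchanged.

```shell
# Map a k8s.gcr.io image name to its Aliyun mirror equivalent.
# The mirror hosts images flat, so "coredns/coredns:v1.8.4" collapses
# to "coredns:v1.8.4"; the tag itself is left untouched.
mirror_name() {
  path="${1#k8s.gcr.io/}"                 # drop the registry prefix
  echo "registry.cn-hangzhou.aliyuncs.com/google_containers/${path##*/}"
}

# Pull each image kubeadm expects from the mirror, then retag it back
# to the k8s.gcr.io name that the kubelet actually looks for.
for img in $(kubeadm config images list 2>/dev/null); do
  src="$(mirror_name "$img")"
  sudo docker pull "$src"
  sudo docker tag "$src" "$img"
done
```

Running this once on each node covers kube-proxy, pause, and coredns in one pass instead of fixing the images one failed pod at a time.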