Forgot to change the hostname, so the master node's name shows up wrong
- An everyday pitfall
- And flannel on that node would not come up either
[root@master01 ~]# kubectl get node
NAME                    STATUS     ROLES    AGE   VERSION
localhost.localdomain   NotReady   master   44h   v1.19.3
node01                  Ready      <none>   44h   v1.19.3
node02                  Ready      <none>   44h   v1.19.3
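The root cause is that the machine still had the default CentOS hostname when kubeadm init was run; the kubelet registers the node under whatever the hostname is at that moment. For reference, a minimal sketch of setting the name up front (master01 and 192.168.1.70 match this environment, substitute your own):
hostnamectl set-hostname master01
# optional: make the name resolvable from every node
echo "192.168.1.70 master01" >> /etc/hosts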
1. Check the flannel status
- The pod was indeed scheduled to the localhost.localdomain node, but it would not start
[root@master01 ~]# kubectl get po -n kube-system kube-flannel-ds-w4mwc -owide
NAME                    READY   STATUS    RESTARTS   AGE    IP       NODE                    NOMINATED NODE   READINESS GATES
kube-flannel-ds-w4mwc   0/1     Pending   0          102s   <none>   localhost.localdomain   <none>           <none>
- Ran describe on the pod; the Events section is below
- Nothing useful beyond the scheduling event, and restarting did not help either, so I decided to fix the master node's name first
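For completeness, the describe command referred to above looks like this:
kubectl describe po -n kube-system kube-flannel-ds-w4mwc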
Events:
  Type    Reason     Age   From               Message
  ----    ------     ---   ----               -------
  Normal  Scheduled  3m7s  default-scheduler  Successfully assigned kube-system/kube-flannel-ds-w4mwc to localhost.localdomain
2. Fix procedure (test environment; use with caution in production)
- I went through a few articles first; since this is a test environment and a single-master cluster anyway, I simply decided to rebuild from scratch
2.1 Delete the nodes
[root@master01 ~]# kubectl delete node node01 node02 localhost.localdomain
node "node01" deleted
node "node02" deleted
node "localhost.localdomain" deleted
2.2 Confirm the nodes are gone
[root@master01 ~]# kubectl get nodes
No resources found
# Meanwhile the CSRs from the old node name are all stuck in Pending
[root@master01 ~]# kubectl get csr
NAME        AGE    SIGNERNAME                                    REQUESTOR                           CONDITION
csr-6pqpf   43m    kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-8vj2f   90m    kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-9cs9r   59m    kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-9gh45   105m   kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-9ln7v   151m   kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-9zbnw   74m    kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-d7jdl   162m   kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-dh8hs   136m   kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-ktbl6   12m    kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-m7d22   121m   kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
csr-zsfgr   28m    kubernetes.io/kube-apiserver-client-kubelet   system:node:localhost.localdomain   Pending
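These CSRs are just the kubelet under the old localhost.localdomain identity repeatedly trying to re-register, and they go away once everything is reset below. If you ever want to clear such a backlog by hand, something like this works (standard kubectl, nothing specific to this setup):
kubectl delete csr --all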
2.3 Wipe the cluster configuration (on all three nodes)
[root@master01 ~]# kubeadm reset
[root@master01 ~]# rm -rf $HOME/.kube/config        # only needed on nodes where kubectl was configured
# the ifconfig commands below need net-tools: yum install -y net-tools
[root@master01 ~]# ifconfig cni0 down && ip link delete cni0        # remove cni0 by hand if it is still hanging around
[root@master01 ~]# ifconfig flannel.1 down && ip link delete flannel.1
[root@master01 ~]# ifconfig kube-ipvs0 down && ip link delete kube-ipvs0
[root@master01 ~]# ifconfig dummy0 down && ip link delete dummy0
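kubeadm reset also prints a reminder that it does not touch iptables or IPVS state; if you want a completely clean slate, the cleanup it suggests looks roughly like this (ipvsadm only matters if kube-proxy runs in IPVS mode):
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear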
2.4 Fix kubeadm.yaml (the key step)
- The InitConfiguration in this file has a nodeRegistration.name field; check whether it is set to the hostname, and if not, change it (see the fragment below)
- The file itself came from kubeadm config print init-defaults > kubeadm.yaml. A heads-up for anyone new to this: if you already had a customized init config, track down the original instead, so you don't lose those changes
[root@master01 ~]# grep -C2 nodeRegistration kubeadm.yaml | grep name
name: master01
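For context, the relevant piece of the InitConfiguration looks roughly like the following. The criSocket and advertiseAddress values are the ones you would expect for a Docker-based v1.19 cluster on this host and are shown only as an illustration; the part that matters is that name matches the machine's hostname:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.70   # the master's real IP, not the init-defaults placeholder
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master01                   # must match the hostname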
2.5 Re-initialize the cluster
[root@master01 ~]# kubeadm init --config kubeadm.yaml
2.5.1 Once init succeeds, just join the worker nodes
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.70:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:f4d24e1b28d4dcb1ebaa9c4847221fda503bb08627175dbdacb589cc4ebfaa8a
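Run the kubeadm join command printed above as root on node01 and node02. If that output gets lost, an equivalent join command can be regenerated on the master at any time:
kubeadm token create --print-join-command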
2.6 Verify
- As the output below shows, the master node's name is now correct (all nodes are NotReady only because the network plugin has not been deployed yet; that comes in 2.7)
- Also noting down the following pitfall so nobody else steps into it; one careless moment dug me this hole
- After rejoining, flannel on a worker node may complain: "cni0" already has an IP address different from 10.244.1.1/24. Find the affected node with -owide, then restart networking on that node (systemctl restart network)
- Alternatively, remove the stale bridge: ifconfig cni0 down && ip link delete cni0
- Then restart coredns; if it still misbehaves, just reset and rejoin that node
- In my case the cni0 interface on that node had the address 10.244.2.1, which obviously did not match the 10.244.1.1 in the error. After changing it to 10.244.1.1 and restarting the network service, back on the master the containers were running normally
- The full reset-and-rejoin sequence on a worker node:
kubeadm reset
systemctl stop kubelet && rm -rf /etc/cni/
ifconfig cni0 down && ifconfig flannel.1 down
ip link delete cni0 && ip link delete flannel.1
systemctl start kubelet
# get a fresh join command from the master
kubeadm token create --print-join-command
# rejoin the node
kubeadm join 192.168.1.70:6443 --token 1ni0cy.frcpumeb2bdmscqu --discovery-token-ca-cert-hash sha256:f4d24e1b28d4dcb1ebaa9c4847221fda503bb08627175dbdacb589cc4ebfaa8a
[root@master01 ~]# kubectl get node
NAME       STATUS     ROLES    AGE     VERSION
master01   NotReady   master   2m37s   v1.19.3
node01     NotReady   <none>   34s     v1.19.3
node02     NotReady   <none>   38s     v1.19.3
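Before deleting anything for the cni0 mismatch described above, it helps to see what the node actually has versus what it was allocated; a quick check on the affected node (the flannel subnet file path is the standard one, so treat this as a sketch):
ip addr show cni0                                         # current bridge address, e.g. 10.244.2.1/24
cat /run/flannel/subnet.env                               # the FLANNEL_SUBNET flannel expects on this node
kubectl get node node01 -o jsonpath='{.spec.podCIDR}'     # podCIDR assigned by the controller (run on the master)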
2.7 Deploy the network plugin
[root@master01 ~]# kubectl apply -f kube-flannel.yml
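kube-flannel.yml here is the standard manifest from the flannel project. If it is not already on the machine it can be fetched first; the URL below was the usual location for this era of flannel and may have moved since, so treat it as an example:
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml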
2.8 Cluster verification
[root@master01 ~]# kubectl get node
NAME       STATUS   ROLES    AGE   VERSION
master01   Ready    master   32m   v1.19.3
node01     Ready    <none>   30m   v1.19.3
node02     Ready    <none>   30m   v1.19.3
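Beyond node status, it is worth confirming that the flannel and coredns pods are actually Running on every node; a quick check:
kubectl get po -n kube-system -owide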
kubeadm reference: https://blog.csdn.net/qq_24794401/article/details/106654710
Binary installation reference: https://blog.csdn.net/chenshm/article/details/118718644