The architecture is 3 master nodes plus 2 worker nodes, with haproxy + keepalived as the load-balancing layer.
There are plenty of online write-ups covering the preliminary setup, so refer to those for the basics; this post focuses on some of the less common errors encountered along the way.
1. Errors in /var/log/messages
```
Sep 15 16:47:32 k8s-master-1 kubelet: E0915 16:47:32.945920 2757 kubelet_node_status.go:92] Unable to register node "vip-k8s-master" with API server: Post https://vip-k8s-master:8443/api/v1/nodes: http: server gave HTTP response to HTTPS client
Sep 15 16:47:33 k8s-master-1 kubelet: E0915 16:47:33.511767 2757 event.go:269] Unable to write event: 'Post https://vip-k8s-master:8443/api/v1/namespaces/default/events: http: server gave HTTP response to HTTPS client' (may retry after sleeping)
```
Fix: the haproxy configuration is wrong. Change every `mode` in the config file to `tcp`, restart haproxy, and check the haproxy log to confirm the ports are up.
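For reference, a minimal sketch of the relevant haproxy.cfg sections, assuming the frontend listens on 8443 (per the error above) and forwards to the three masters' apiservers; the server names and IPs are placeholders:

```
# Minimal sketch: plain TCP passthrough to the kube-apiservers.
# Server names/IPs below are placeholder assumptions.
frontend k8s-apiserver
    bind *:8443
    mode tcp                   # must be tcp, not http: the apiserver speaks TLS
    default_backend k8s-masters

backend k8s-masters
    mode tcp                   # tcp here as well
    balance roundrobin
    server k8s-master-1 192.168.1.11:6443 check
    server k8s-master-2 192.168.1.12:6443 check
    server k8s-master-3 192.168.1.13:6443 check
```

With `mode http`, haproxy answers the kubelet's TLS handshake with a plain HTTP response, which is exactly the "server gave HTTP response to HTTPS client" error above.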
2. After deploying the flannel plugin, coredns is still NotReady, and /var/log/messages shows matching errors
```
Sep 16 10:13:42 k8s-master-1 kubelet: : [plugin flannel does not support config version "" plugin portmap does not support config version ""]
Sep 16 10:13:42 k8s-master-1 kubelet: W0916 10:13:42.168412 9539 cni.go:237] Unable to update cni config: no valid networks found in /etc/cni/net.d
Sep 16 10:13:42 k8s-master-1 kubelet: E0916 10:13:42.460568 9539 kubelet.go:2188] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
```
Fix: the `cni-conf.json` data in the flannel.yaml ConfigMap is missing the `cniVersion` field, which is why the plugin returns errors like `plugin flannel does not support config version ""`. The flannel version used here is v0.11.0. Add a new line `"cniVersion": "0.3.1",` above the `"name": "cbr0"` line, then re-apply the manifest. If coredns is still unhealthy afterwards, kill the pod and let it be recreated (see the commands after the config below).
```
  cni-conf.json: |
    {
      "cniVersion": "0.3.1",
      "name": "cbr0",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
```
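To apply the change and recycle coredns, something like this works (coredns pods carry the standard `k8s-app=kube-dns` label):

```
# Re-apply the patched flannel manifest, then delete the coredns pods
# so they are recreated with the valid CNI config.
kubectl apply -f flannel.yaml
kubectl -n kube-system delete pod -l k8s-app=kube-dns
kubectl -n kube-system get pods -o wide    # coredns should become Running/Ready
```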
3. kube-scheduler and kube-controller-manager component status abnormal, shown as Unhealthy
```
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0               Healthy     {"health":"true"}
```
Fix: `ss -ntl` confirms these ports are indeed not listening. Check whether the kube-scheduler and kube-controller-manager manifests disable the insecure ports: comment out `--port=0`, then restart the kubelet with `systemctl restart kubelet`.
Manifest paths: /etc/kubernetes/manifests/kube-scheduler.yaml and /etc/kubernetes/manifests/kube-controller-manager.yaml
```
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
#   - --port=0
    image: k8s.gcr.io/kube-scheduler:v1.18.8
    imagePullPolicy: IfNotPresent
```
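Once the kubelet has recreated the static pods, the fix can be verified with:

```
# Both insecure ports should be listening again on localhost.
ss -ntl | grep -E '10251|10252'
# And the component statuses should report Healthy.
kubectl get cs
```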
4. Set the kernel parameters Docker requires
```
cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
```
Then add the following to /etc/docker/daemon.json so Docker uses the systemd cgroup driver:

```
"exec-opts": ["native.cgroupdriver=systemd"],
```
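For context, a sketch of what a complete /etc/docker/daemon.json might look like; only `exec-opts` is required for the cgroup-driver fix, the log settings are illustrative assumptions:

```
# Only exec-opts is required here; log-driver/log-opts are assumptions.
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF
systemctl restart docker
```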
5. Load the ipvs kernel modules
```
cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules
```
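To confirm the modules actually loaded:

```
# All five modules should show up in the output.
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
```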
6. etcd must have 3 or more nodes; this is a matter of failure tolerance. An etcd cluster needs a majority (quorum) of members to keep serving, so a 3-node cluster tolerates the loss of one node while a 2-node cluster tolerates none. See the official etcd documentation for details, summarized in the table below.
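The fault-tolerance numbers from the etcd docs make this concrete (quorum is ⌊n/2⌋+1):

| Cluster size | Quorum | Failure tolerance |
|---|---|---|
| 1 | 1 | 0 |
| 3 | 2 | 1 |
| 5 | 3 | 2 |
| 7 | 4 | 3 |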