Container Cloud Platform No.2 ~ Creating an HA Cluster with kubeadm, v1.19.1


This is part two of building a container cloud platform on Kubernetes. The official v1.19.0 release just landed, so this post walks through installing a highly available Kubernetes cluster with kubeadm on the latest version.
There are plenty of tools on the market that will install k8s for you, but for learning I still recommend installing it step by step, so you understand the components running inside the cluster and troubleshooting is easier later on...

Environment for this post:
Servers: 3
OS: CentOS 7
I won't draw my own topology diagram; the one here is copied straight from the official docs.

Overview

A quick explanation of the diagram: the three servers act as master nodes, keepalived + haproxy load-balance the apiserver, and worker nodes talk to the apiserver through the VIP. As mentioned in part one, all cluster state lives in the etcd cluster.
Let's get started...

Configure package repositories

Three repositories are configured here, all switched to domestic (China) mirrors to speed up package downloads.

# Base OS repo (save it into /etc/yum.repos.d so yum actually picks it up)
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

# Docker repo
curl -o /etc/yum.repos.d/docker-ce.repo https://mirrors.ustc.edu.cn/docker-ce/linux/centos/docker-ce.repo
sed -i 's/download.docker.com/mirrors.ustc.edu.cn\/docker-ce/g' /etc/yum.repos.d/docker-ce.repo

# Kubernetes repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Configure system parameters

With the repos in place, a few system parameters need to be set. These are all official recommendations; more tuning will be covered later.

# Disable SELinux temporarily
# To disable it permanently, edit /etc/sysconfig/selinux
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/sysconfig/selinux
setenforce 0

# Turn off swap temporarily
# To turn it off permanently, comment out the swap line(s) in /etc/fstab
swapoff -a

# Enable forwarding
# Since version 1.13, Docker adjusts the default firewall rules and
# sets the FORWARD chain policy of the iptables filter table to DROP,
# which breaks cross-node pod communication in Kubernetes
iptables -P FORWARD ACCEPT

# Configure forwarding-related kernel parameters; kubeadm preflight checks fail otherwise
cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
sysctl --system

# Load the IPVS-related kernel modules
# They must be reloaded after a reboot
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4   # on kernels >= 4.19, use nf_conntrack instead
lsmod | grep ip_vs
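To avoid reloading these modules by hand after every reboot, one common approach (a sketch; the file name ipvs.conf is arbitrary) is to list them in a systemd modules-load config file, which systemd-modules-load reads at boot:

```
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
```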

Install kubeadm and related packages

# Optionally pin the versions, e.g. yum install -y kubelet-1.19.1 kubeadm-1.19.1 kubectl-1.19.1
yum install -y kubelet kubeadm kubectl ipvsadm

Configure Docker

The goals here are to speed up pulls of public images via mirrors and to allow pulls from an insecure private registry.
Replace hub.xxx.com with your own private registry address; if you don't have one, delete the insecure-registries line.
vim /etc/docker/daemon.json

{
  "registry-mirrors": ["https://ci7pm4nx.mirror.aliyuncs.com","https://registry.docker-cn.com","http://hub-mirror.c.163.com"],
  "insecure-registries":["hub.xxx.com"]
}

Once the config is written, restart Docker:

systemctl  restart docker
systemctl  enable docker.service

Check docker info; the output should include:

 Insecure Registries:
  hub.xxx.com
  127.0.0.0/8
 Registry Mirrors:
  https://ci7pm4nx.mirror.aliyuncs.com/
  https://registry.docker-cn.com/
  http://hub-mirror.c.163.com/
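Optionally, you can also switch Docker's cgroup driver to systemd at this point, which silences the preflight warning kubeadm prints about the default cgroupfs driver (you will see that warning in the init output later). A sketch of the combined daemon.json, assuming the same mirror list as above:

```json
{
  "registry-mirrors": ["https://ci7pm4nx.mirror.aliyuncs.com","https://registry.docker-cn.com","http://hub-mirror.c.163.com"],
  "insecure-registries": ["hub.xxx.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
```

Restart Docker after changing this; note that switching the cgroup driver on a node that already runs containers will restart them.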

Start kubelet

systemctl enable --now kubelet

kubelet will now restart every few seconds, crash-looping as it waits for kubeadm to tell it what to do.

Install and configure haproxy and keepalived (on all three machines)

Install the packages: yum install -y haproxy keepalived

Configure haproxy

Note that haproxy logs via syslog: the log directive in the global section expects a syslog socket or address (for example /dev/log or 127.0.0.1), not a plain file, so to actually get a /var/log/haproxy.log file you also need rsyslog configured to write the local0 facility there.

[root@k8s-master001 ~]# cat /etc/haproxy/haproxy.cfg 
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s

listen admin_stats
    mode                    http
    bind                    0.0.0.0:1080
    log                     127.0.0.1 local0 err
    stats refresh           30s
    stats uri               /haproxy-status
    stats realm             Haproxy\ Statistics
    stats auth              admin:admin
    stats hide-version
    stats admin if TRUE
#---------------------------------------------------------------------
# apiserver frontend which proxies to the masters
#---------------------------------------------------------------------
frontend apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
    server k8s-master001  10.26.25.20:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
    server k8s-master002  10.26.25.21:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
    server k8s-master003  10.26.25.22:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3

Start haproxy

systemctl start haproxy
systemctl enable haproxy

Configure keepalived. The config below is for the first node (state MASTER, priority 100); on the other two nodes set state to BACKUP and give them lower priorities (for example 99 and 98) so the VIP fails over cleanly.

[root@k8s-master001 ~]# cat /etc/keepalived/keepalived.conf 
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_K8S
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens18
    virtual_router_id 51
    priority 100
    authentication {
        auth_type PASS
        auth_pass kubernetes
    }
    virtual_ipaddress {
        10.26.25.23
    }
    track_script {
        check_apiserver
    }
}

Add the keepalived health-check script

[root@k8s-master001 ~]# cat /etc/keepalived/check_apiserver.sh 
#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:8443/ -o /dev/null || errorExit "Error GET https://localhost:8443/"
if ip addr | grep -q 10.26.25.23; then
    curl --silent --max-time 2 --insecure https://10.26.25.23:8443/ -o /dev/null || errorExit "Error GET https://10.26.25.23:8443/"
fi

chmod +x  /etc/keepalived/check_apiserver.sh

Start keepalived

systemctl  start  keepalived
systemctl  enable keepalived
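Before moving on, it's worth a quick sanity check that the VIP landed on exactly one node and that haproxy is listening on the frontend port (the interface name ens18 and VIP 10.26.25.23 are taken from the configs above; adjust to your environment):

```
ip -4 addr show ens18 | grep 10.26.25.23   # should match on exactly one of the three nodes
ss -lnt | grep 8443                        # haproxy listening on the apiserver frontend port
```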

You can now open the haproxy admin UI at master IP:1080/haproxy-status; the username and password are set in the config file (admin/admin in this post; change them as you like).
At first the apiserver rows will all be red, since those services aren't running yet; my screenshot was taken later, which is why they're green.


Next, initialize the Kubernetes cluster.

Initialize the first control-plane node, master001

[root@k8s-master001 ~]# kubeadm init --control-plane-endpoint 10.26.25.23:8443 --upload-certs --image-repository registry.aliyuncs.com/google_containers  --pod-network-cidr 10.244.0.0/16 
W0910 05:09:41.166260   29186 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.1
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
........ (some output omitted)
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
............ (some output omitted)
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
  kubeadm join 10.26.25.23:8443 --token f28iti.c5fgj45u28332ga7 \
    --discovery-token-ca-cert-hash sha256:81ec8f1d1db0bb8a31d64ae31091726a92b9294bcfa0e2b4309b9d8c5245db41 \
    --control-plane --certificate-key 93f9514164e2ecbd85293a9c671344e06a1aa811faf1069db6f678a1a5e6f38b
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.26.25.23:8443 --token f28iti.c5fgj45u28332ga7 \
    --discovery-token-ca-cert-hash sha256:81ec8f1d1db0bb8a31d64ae31091726a92b9294bcfa0e2b4309b9d8c5245db41

Output like the above means initialization succeeded.
Notes on the init command:
kubeadm init --control-plane-endpoint 10.26.25.23:8443 --upload-certs --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr 10.244.0.0/16

  • --control-plane-endpoint 10.26.25.23:8443 — 10.26.25.23 is the VIP configured in keepalived
  • --image-repository registry.aliyuncs.com/google_containers — overrides the default image registry, k8s.gcr.io, which is unreachable from mainland China (unless you have a proxy)
  • --pod-network-cidr 10.244.0.0/16 — defines the pod network; it must be consistent with the network defined in flannel's config, otherwise the flannel pods may restart endlessly. More on this when we install flannel.
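The same flags can also be expressed declaratively in a kubeadm config file and passed via kubeadm init --config kubeadm-config.yaml. A minimal sketch equivalent to the command above (field names per the kubeadm v1beta2 API; --upload-certs stays a command-line flag):

```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.1
controlPlaneEndpoint: "10.26.25.23:8443"
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: "10.244.0.0/16"
```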

What initialization does, briefly:

  • pulls the required images
  • generates the certificates
  • writes the yaml manifests for the control-plane services
  • starts them as static pods

With initialization complete, you can follow the printed instructions to configure the kubectl client and start using Kubernetes, even though the cluster has only one master node so far.

Start using the cluster

[root@k8s-master001 ~]#  mkdir -p $HOME/.kube
[root@k8s-master001 ~]#   sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-master001 ~]#   sudo chown $(id -u):$(id -g) $HOME/.kube/config
[root@k8s-master001 ~]#   kubectl  get no
NAME            STATUS     ROLES    AGE    VERSION
k8s-master001   NotReady   master   105s   v1.19.0

The cluster currently has a single node in NotReady state, because no network plugin has been installed yet.
Next, install the Flannel network plugin.

Install Flannel

Download the yaml manifest: wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel.yml
Since we're installing the latest Kubernetes, the rbac API version in the manifest must be changed to rbac.authorization.k8s.io/v1 and the DaemonSet API version to apps/v1, with a selector added. Only part of the manifest is shown here.

    [root@k8s-master001 ~]# cat kube-flannel.yml 
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: kube-flannel-ds
      namespace: kube-system
      labels:
        tier: node
        app: flannel
    spec:
      selector:
        matchLabels:
          tier: node
          app: flannel
      template:
        metadata:
          labels:
            tier: node
            app: flannel

Next, apply the manifest with kubectl and check whether the flannel pod is running:

    kubectl apply -f kube-flannel.yml
    [root@k8s-master001 ~]# kubectl  get no
    NAME            STATUS   ROLES    AGE     VERSION
    k8s-master001   Ready    master   6m35s   v1.19.0
    [root@k8s-master001 ~]# kubectl  get po -n kube-system
    NAME                                    READY   STATUS    RESTARTS   AGE
    coredns-6d56c8448f-9cr5l                1/1     Running   0          6m51s
    coredns-6d56c8448f-wsjwx                1/1     Running   0          6m51s
    etcd-k8s-master001                      1/1     Running   0          7m
    kube-apiserver-k8s-master001            1/1     Running   0          7m
    kube-controller-manager-k8s-master001   1/1     Running   0          7m
    kube-flannel-ds-nmfwd                   1/1     Running   0          4m36s
    kube-proxy-pqrnl                        1/1     Running   0          6m51s
    kube-scheduler-k8s-master001            1/1     Running   0          7m

A pod named kube-flannel-ds-nmfwd is in Running state, which means flannel is installed.
Since there is only one node so far, there is only one flannel pod; more will appear as the other two nodes join.
Next, add the remaining master nodes.

Add the other control-plane nodes, master002 and master003

Since the first control-plane node is up, the cluster already exists; the remaining machines just need to join it. The join command was printed in the initialization output above.
The output is long, so some of the less important lines are trimmed below.
On master002:

    [root@k8s-master002 ~]#   kubeadm join 10.26.25.23:8443 --token f28iti.c5fgj45u28332ga7     --discovery-token-ca-cert-hash sha256:81ec8f1d1db0bb8a31d64ae31091726a92b9294bcfa0e2b4309b9d8c5245db41     --control-plane --certificate-key 93f9514164e2ecbd85293a9c671344e06a1aa811faf1069db6f678a1a5e6f38b
    [preflight] Running pre-flight checks
            [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
    [preflight] Running pre-flight checks before initializing the new control plane instance
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
    ..............
    To start administering your cluster from this node, you need to run the following as a regular user:
            mkdir -p $HOME/.kube
            sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
            sudo chown $(id -u):$(id -g) $HOME/.kube/config
    Run 'kubectl get nodes' to see this node join the cluster.

Output like this means the node joined successfully.
Now check the cluster nodes:

    [root@k8s-master002 ~]# kubectl  get no 
    NAME            STATUS   ROLES    AGE     VERSION
    k8s-master001   Ready    master   21m     v1.19.0
    k8s-master002   Ready    master   6m5s    v1.19.0

The output now shows two master nodes. Joining master003 works exactly like master002, so I won't repeat it.
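One caveat when joining nodes later: the bootstrap token from kubeadm init expires after 24 hours (and the uploaded certs after two, as the init output noted). You can regenerate the join command at any time with kubeadm token create --print-join-command, and the --discovery-token-ca-cert-hash can be recomputed from the cluster CA with openssl. A sketch of the hash computation (it generates a throwaway self-signed cert so it can be run anywhere; on a real master, point CA_CRT at /etc/kubernetes/pki/ca.crt instead):

```shell
# On a real control-plane node: CA_CRT=/etc/kubernetes/pki/ca.crt
# Here we generate a throwaway CA cert purely for illustration
CA_CRT=$(mktemp)
openssl req -x509 -newkey rsa:2048 -nodes -keyout /dev/null \
  -out "$CA_CRT" -subj "/CN=demo-ca" -days 1 2>/dev/null

# SHA-256 of the DER-encoded public key, the format expected
# by kubeadm's --discovery-token-ca-cert-hash flag
hash=$(openssl x509 -pubkey -noout -in "$CA_CRT" \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 \
  | sed 's/^.* //')
echo "sha256:${hash}"
```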

Once all three nodes have joined, kubectl shows the full cluster:

    [root@k8s-master003 ~]# kubectl  get no 
    NAME            STATUS   ROLES    AGE   VERSION
    k8s-master001   Ready    master   25m   v1.19.0
    k8s-master002   Ready    master   10m   v1.19.0
    k8s-master003   Ready    master   26s   v1.19.0

Finally, list all running pods:

    [root@k8s-master003 ~]# kubectl  get po -n kube-system
    NAME                                    READY   STATUS    RESTARTS   AGE
    coredns-6d56c8448f-9cr5l                1/1     Running   0          27m
    coredns-6d56c8448f-wsjwx                1/1     Running   0          27m
    etcd-k8s-master001                      1/1     Running   0          27m
    etcd-k8s-master002                      1/1     Running   0          8m19s
    etcd-k8s-master003                      1/1     Running   0          83s
    kube-apiserver-k8s-master001            1/1     Running   0          27m
    kube-apiserver-k8s-master002            1/1     Running   0          12m
    kube-apiserver-k8s-master003            1/1     Running   0          85s
    kube-controller-manager-k8s-master001   1/1     Running   1          27m
    kube-controller-manager-k8s-master002   1/1     Running   0          12m
    kube-controller-manager-k8s-master003   1/1     Running   0          81s
    kube-flannel-ds-2lh42                   1/1     Running   0          2m31s
    kube-flannel-ds-nmfwd                   1/1     Running   0          25m
    kube-flannel-ds-w276b                   1/1     Running   0          11m
    kube-proxy-dzpdz                        1/1     Running   0          2m39s
    kube-proxy-hd5tb                        1/1     Running   0          12m
    kube-proxy-pqrnl                        1/1     Running   0          27m
    kube-scheduler-k8s-master001            1/1     Running   1          27m
    kube-scheduler-k8s-master002            1/1     Running   0          12m
    kube-scheduler-k8s-master003            1/1     Running   0          76s

As you can see, the core Kubernetes services (apiserver, controller-manager, and scheduler) each run as three pods.

With that, the highly available Kubernetes control plane is deployed.
The haproxy web UI should now show all three masters as available.

Troubleshooting

If initializing a master or joining a node fails, run kubeadm reset to reset the node, then install again.

Reset a node
    [root@k8s-node003 haproxy]# kubeadm  reset 
    [reset] Reading configuration from the cluster...
    [reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
    W0910 05:31:57.345399   20386 reset.go:99] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get node registration: node k8s-node003 doesn't have kubeadm.alpha.kubernetes.io/cri-socket annotation
    [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
    [reset] Are you sure you want to proceed? [y/N]: y
    [preflight] Running pre-flight checks
    W0910 05:31:58.580982   20386 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
    [reset] No etcd config found. Assuming external etcd
    [reset] Please, manually reset etcd to prevent further issues
    [reset] Stopping the kubelet service
    [reset] Unmounting mounted directories in "/var/lib/kubelet"
    [reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
    [reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
    [reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
    The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
    The reset process does not reset or clean up iptables rules or IPVS tables.
    If you wish to reset iptables, you must do so manually by using the "iptables" command.
    If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
    to reset your system's IPVS tables.
    The reset process does not clean your kubeconfig files and you must remove them manually.
    Please, check the contents of the $HOME/.kube/config file.
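As the reset output says, kubeadm reset does not clean up everything. A sketch of the manual cleanup it suggests (destructive; run only on a node you are actually resetting):

```
rm -rf /etc/cni/net.d        # CNI configuration
ipvsadm --clear              # IPVS tables, if kube-proxy ran in ipvs mode
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
rm -f $HOME/.kube/config     # stale kubeconfig
```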

That's enough for one post; the rest continues in the next part...
Tips: for more articles, follow the WeChat official account "菜鳥運維雜談"!

