Installing a Highly Available Kubernetes Cluster with Kubespray
Environment
OS | Hostname / IP Address | Role | Kernel Version |
---|---|---|---|
CentOS 7.6.1810 | master1 / 192.168.181.252 | master && node | 5.4 |
CentOS 7.6.1810 | master2 / 192.168.181.253 | master && node | 5.4 |
Tools
Tool | Version | Source / Download | Installed On |
---|---|---|---|
ansible | 2.9.16 | Aliyun's epel.repo | master1 |
kubespray | 2.15.0 | https://github.com/kubernetes-sigs/kubespray | master1 |
chronyd | 3.2 | the version shipped with the OS is fine | master1 && master2 |
Aliyun yum repos | - | https://developer.aliyun.com/mirror/ | master1 && master2 |
Environment preparation (required on all machines)
1. Disable the firewall and SELinux
```
## Firewall
systemctl stop firewalld.service
systemctl disable firewalld.service
## SELinux
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
```
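A quick check that both are really off:

```
## Should print "inactive"
systemctl is-active firewalld
## SELinux reports Permissive right after setenforce 0, and Disabled after a reboot
getenforce
```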
2. Edit the /etc/hosts file
```
## Add name resolution for every host to /etc/hosts
192.168.181.252 master1
192.168.181.253 master2
```
3. Passwordless SSH
```
## Generate a key pair
ssh-keygen -t rsa
## Copy the public key to every host
ssh-copy-id master1
ssh-copy-id master2
## Test that passwordless login works
ssh master2
```
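With more hosts this gets repetitive; a small loop does the same job (the host list here just mirrors this two-node setup, so adjust it to yours):

```
## Copy the public key to each host in turn; each run prompts for that host's password
for h in master1 master2; do
    ssh-copy-id "$h"
done
```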
4. Upgrade the kernel to 5.4
Check the current kernel version:
```
[root@master1 data]# uname -r
3.10.0-957.el7.x86_64
```
Set up the ELRepo repository:
```
## Import the GPG public key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
## Install the yum repo
yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
```
List the available kernels:
```
[root@master1 data]# yum --disablerepo \* --enablerepo elrepo-kernel list available
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * elrepo-kernel: mirrors.tuna.tsinghua.edu.cn
Available Packages
kernel-lt.x86_64                        5.4.95-1.el7.elrepo     elrepo-kernel
kernel-lt-devel.x86_64                  5.4.95-1.el7.elrepo     elrepo-kernel
kernel-lt-doc.noarch                    5.4.95-1.el7.elrepo     elrepo-kernel
kernel-lt-headers.x86_64                5.4.95-1.el7.elrepo     elrepo-kernel
kernel-lt-tools.x86_64                  5.4.95-1.el7.elrepo     elrepo-kernel
kernel-lt-tools-libs.x86_64             5.4.95-1.el7.elrepo     elrepo-kernel
kernel-lt-tools-libs-devel.x86_64       5.4.95-1.el7.elrepo     elrepo-kernel
kernel-ml.x86_64                        5.10.13-1.el7.elrepo    elrepo-kernel
kernel-ml-devel.x86_64                  5.10.13-1.el7.elrepo    elrepo-kernel
kernel-ml-doc.noarch                    5.10.13-1.el7.elrepo    elrepo-kernel
kernel-ml-headers.x86_64                5.10.13-1.el7.elrepo    elrepo-kernel
kernel-ml-tools.x86_64                  5.10.13-1.el7.elrepo    elrepo-kernel
kernel-ml-tools-libs.x86_64             5.10.13-1.el7.elrepo    elrepo-kernel
kernel-ml-tools-libs-devel.x86_64       5.10.13-1.el7.elrepo    elrepo-kernel
perf.x86_64                             5.10.13-1.el7.elrepo    elrepo-kernel
python-perf.x86_64                      5.10.13-1.el7.elrepo    elrepo-kernel
```
Install the lt (long-term support) kernel:
```
## Install
yum --enablerepo elrepo-kernel -y install kernel-lt
## List all installed kernels
grubby --info=ALL
## Set the 5.4 kernel as the default boot kernel
grub2-set-default 0
grub2-reboot 0
## or:
grep menuentry /boot/efi/EFI/centos/grub.cfg
grub2-set-default 'CentOS Linux (5.4.95-1.el7.elrepo.x86_64) 7 (Core)'
## Check the result
grub2-editenv list
## Reboot the server
systemctl reboot
```
Verify the kernel version:
```
[root@master1 ~]# uname -r
5.4.95-1.el7.elrepo.x86_64
[root@master2 ~]# uname -r
5.4.95-1.el7.elrepo.x86_64
```
As a side note, here is how to download all the packages yum would install, together with their dependencies, to a local directory for later use on machines without internet access:
```
## Install the yumdownloader tool
yum -y install yum-utils
## Download the kernel package and all of its dependencies
yumdownloader --resolve --destdir /data/kernel/ --enablerepo elrepo-kernel kernel-lt
# --resolve     download dependencies as well
# --destdir     directory to download the packages into
# --enablerepo  which repo to use
```
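On the machine without internet access, the downloaded RPMs can then be installed straight from that directory; a minimal sketch, assuming /data/kernel/ was copied over:

```
## Install all RPMs in the directory; dependencies are resolved among the local packages
yum -y localinstall /data/kernel/*.rpm
```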
5. Enable kernel IP forwarding
```
## Temporary: write directly to the running kernel
echo 1 > /proc/sys/net/ipv4/ip_forward
## Permanent: persist the kernel parameter
echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
## Load the configuration
sysctl -p
## Verify it took effect
[root@master2 ~]# sysctl -a | grep 'ip_forward'
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_update_priority = 1
net.ipv4.ip_forward_use_pmtu = 0
```
6. Disable the swap partition
```
## Temporary
swapoff -a
## Permanent
sed -i "s/.*swap.*//" /etc/fstab
```
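Afterwards, verify that no swap is active:

```
## The Swap line should show all zeros
free -h
## The swap summary should print nothing
swapon -s
```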
Tool preparation
1. Install the Aliyun yum repos (required on both machines)
CentOS-Base.repo
```
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
## On machines that are not Aliyun ECS instances, also run:
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
```
epel.repo
```
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
```
docker-ce.repo
```
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sed -i 's+download.docker.com+mirrors.aliyun.com/docker-ce+' /etc/yum.repos.d/docker-ce.repo
```
kubernetes.repo
```
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```
With the yum repos in place, rebuild the metadata cache:
```
[root@master1 data]# yum clean all
[root@master1 data]# yum makecache
```
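A quick check that all the repos above are enabled:

```
## base, epel, docker-ce-stable, and kubernetes should all appear in the list
yum repolist enabled
```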
2. Update Python's pip, jinja2, etc.
```
## gcc is needed for compiling, and zlib* for compression/decompression;
## libffi-devel is required by Python, otherwise ansible fails during the K8s install with
## something like: ModuleNotFoundError: No module named '_ctypes'
## python2-pip provides pip
yum -y install gcc zlib* libffi-devel python2-pip-8.1.2-14.el7.noarch
## Point pip at a mirror; the Aliyun pip index is used here
vim ~/.pip/pip.conf
...
[global]
index-url = http://mirrors.aliyun.com/pypi/simple
trusted-host = mirrors.aliyun.com
...
## Upgrade pip and jinja2. Without upgrading jinja2, the K8s install fails with:
## AnsibleError: template error while templating string: expected token '=', got 'end of statement block'.
pip install --upgrade pip
pip install --upgrade jinja2
```
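To make sure the upgrade landed in the interpreter ansible actually uses (python 2.7 here), check the module version directly:

```
## Should print 2.11.x (or newer) after the upgrade
python2 -c 'import jinja2; print(jinja2.__version__)'
```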
3. Install ansible
```
## Aliyun's epel.repo ships ansible, so a plain yum install is enough
yum -y install ansible
## Then upgrade jinja2; be sure to use a domestic pip index, here Aliyun's
pip install jinja2 --upgrade
```
4. Configure the time service
Here master1 acts as the time server and master2 as its client.
```
[on the master]
vim /etc/chrony.conf
...
# The key directives:
server 192.168.181.252      # which server to sync from
allow 192.168.181.0/24      # serve time to this subnet (makes this host the server)
...
[on the slave]
vim /etc/chrony.conf
...
server 192.168.181.252      # which server to sync from
...
## Then restart the service and check its status
systemctl enable chronyd
systemctl restart chronyd
timedatectl
chronyc sources -v
```
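On master2 you can additionally confirm that it is really synchronizing from master1:

```
## "Reference ID" should point at 192.168.181.252 and "Leap status" should read Normal
chronyc tracking
```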
Configuring kubespray
1. Install requirements.txt
```
## First configure the pip index; skip this if it was already done above
mkdir -p ~/.pip/
cat > ~/.pip/pip.conf << EOF
[global]
index-url = http://mirrors.aliyun.com/pypi/simple
trusted-host = mirrors.aliyun.com
EOF
## Upgrade pip
python3 -m pip install --upgrade pip
## Install requirements.txt
pip install -r requirements.txt
```
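Since requirements.txt pins a specific ansible release (2.9.16 in the tool table above), it's worth confirming which version ended up on the PATH:

```
## Should report the ansible version pinned by kubespray's requirements.txt
ansible --version
```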
2. Modify the inventory
```
## Copy inventory/sample to inventory/mycluster
cd /data/kubespray-master
cp -rfp inventory/sample inventory/mycluster
## Use the inventory builder to generate the Ansible inventory file
declare -a IPS=(192.168.181.252 192.168.181.253)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
## In the generated hosts.yaml, node1 and node2 will become the hosts' hostnames;
## adjust them to suit your environment
[root@master1 kubespray]# cat inventory/mycluster/hosts.yaml
all:
  hosts:
    node1:
      ansible_host: 192.168.181.252
      ip: 192.168.181.252
      access_ip: 192.168.181.252
    node2:
      ansible_host: 192.168.181.253
      ip: 192.168.181.253
      access_ip: 192.168.181.253
  children:
    kube-master:
      hosts:
        node1:
        node2:
    kube-node:
      hosts:
        node1:
        node2:
    etcd:
      hosts:
        node1:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
```
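Before going further, it's worth confirming ansible can reach every host in the new inventory:

```
## Every host should reply with "pong"
ansible -i inventory/mycluster/hosts.yaml all -m ping
```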
3. Adjust the default configuration as needed
```
## Optional components such as helm, registry, local_path_provisioner, ingress, etc. are
## disabled by default; enable and configure them here if you need them
vim inventory/mycluster/group_vars/k8s-cluster/addons.yml
...
# Helm deployment
helm_enabled: true
# Registry deployment
registry_enabled: true
# Rancher Local Path Provisioner
local_path_provisioner_enabled: false
...
## The network plugin (calico by default), the network CIDRs, the kube-proxy mode, and more
## can also be changed here
vim inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
...
kube_network_plugin: flannel
kube_network_plugin_multus: false
kube_service_addresses: 10.233.0.0/18
kube_pods_subnet: 10.233.64.0/18
kube_proxy_mode: ipvs
...
## Docker settings such as the storage location and ports can likewise be changed in the
## config files; they are not covered one by one here
```
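A quick grep shows the effective network settings without opening the file:

```
## Review the network-related settings in one shot
grep -E '^(kube_network_plugin|kube_service_addresses|kube_pods_subnet|kube_proxy_mode)' \
    inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
```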
4. Switch the image registries to domestic mirrors
```
cd /data/kubespray
find ./ -type f | xargs sed -i 's/k8s.gcr.io/registry.cn-hangzhou.aliyuncs.com/g'
find ./ -type f | xargs sed -i 's/gcr.io/registry.cn-hangzhou.aliyuncs.com/g'
find ./ -type f | xargs sed -i 's/google-containers/google_containers/g'
```
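You can spot-check that the replacement caught everything (excluding .git, which still holds the old strings):

```
## Should print nothing if no gcr.io references remain in the roles
grep -rn --exclude-dir=.git 'gcr.io' roles/ | head
```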
Alternatively, download all the required images in advance and import them; either approach works.
5. Start the deployment
```
## Install the netaddr package before deploying, otherwise the run fails with:
## {"failed": true, "msg": "The ipaddr filter requires python-netaddr be installed on the ansible controller"}
yum -y install python-netaddr
## If ansible-playbook still reports the error after the yum install, install it via pip
## (the PyPI package is named netaddr)
pip install netaddr --upgrade

## Deploy
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml

## Error 1: kubeadm-v1.20.2-amd64 and kubectl-v1.20.2-amd64 cannot be downloaded (network issue).
## Download them manually and upload them to /tmp/releases/ on every machine
## (see the sketch after this block):
## https://storage.googleapis.com/kubernetes-release/release/v1.20.2/bin/linux/amd64/kubeadm
## https://storage.googleapis.com/kubernetes-release/release/v1.20.2/bin/linux/amd64/kubectl

## Error 2: two base images fail to pull; I pulled these manually:
## cluster-proportional-autoscaler-amd64:1.8.3
## k8s-dns-node-cache:1.16.0

## Error 3: AnsibleError: template error while templating string: expected token '=', got 'end of statement block'.
## Caused by an old jinja2 version, as mentioned in step 2 of "Tool preparation";
## just run pip install --upgrade jinja2 (I upgraded to 2.11).
## Note that pip must be pip2 here, because ansible defaults to python 2.7.

## Error 4: error running kubectl (/usr/local/bin/kubectl apply --force --filename=/etc/kubernetes/k8s-cluster-critical-pc.yml)
## command (rc=1), out='', err='Unable to connect to the server: net/http: TLS handshake timeout
## This one is caused by insufficient memory; I had given the machines only 2 GB,
## and adding more memory fixed it.
```
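For error 1, a small loop can fetch the two binaries and push them into place on each node. A sketch, assuming the two-node host list from this setup and the file names kubespray expects:

```
## Download kubeadm/kubectl v1.20.2 and copy them to /tmp/releases/ on every node
for bin in kubeadm kubectl; do
    curl -L -o /tmp/${bin}-v1.20.2-amd64 \
        "https://storage.googleapis.com/kubernetes-release/release/v1.20.2/bin/linux/amd64/${bin}"
    for host in master1 master2; do
        ssh "$host" "mkdir -p /tmp/releases/"
        scp "/tmp/${bin}-v1.20.2-amd64" "${host}:/tmp/releases/"
    done
done
```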
Those are the errors I ran into. If you hit one that isn't resolved here, leave a comment and we can work through it together.
6. Deployment complete
```
[root@node1 kubespray]# kubectl get nodes
NAME    STATUS   ROLES                  AGE   VERSION
node1   Ready    control-plane,master   24m   v1.20.2
node2   Ready    control-plane,master   23m   v1.20.2
[root@node1 kubespray]# kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:28:09Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
[root@node1 ~]# kubectl get pod -n kube-system
NAME                              READY   STATUS    RESTARTS   AGE
coredns-5bfb6bc97d-g7j4v          1/1     Running   0          18m
coredns-5bfb6bc97d-vqz2m          1/1     Running   0          17m
dns-autoscaler-74877b64cd-gjnpb   1/1     Running   0          17m
kube-apiserver-node1              1/1     Running   0          40m
kube-apiserver-node2              1/1     Running   0          39m
kube-controller-manager-node1     1/1     Running   0          40m
kube-controller-manager-node2     1/1     Running   0          39m
kube-flannel-82x65                1/1     Running   0          19m
kube-flannel-cps8x                1/1     Running   0          19m
kube-proxy-4bzmh                  1/1     Running   0          19m
kube-proxy-h8xqx                  1/1     Running   0          19m
kube-scheduler-node1              1/1     Running   0          40m
kube-scheduler-node2              1/1     Running   0          39m
nodelocaldns-nfmjz                1/1     Running   0          17m
nodelocaldns-ngn6z                1/1     Running   0          17m
registry-proxy-qk24l              1/1     Running   0          17m
registry-rwb9k                    1/1     Running   0          17m
```
One more note: each machine needs at least 2 CPU cores, otherwise some of the core pods will not even start.
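As a final smoke test, a throwaway nginx deployment confirms the cluster can actually schedule and run workloads:

```
## Create a test deployment and watch the pod come up
kubectl create deployment nginx-test --image=nginx
kubectl get pods -o wide
## Clean up once satisfied
kubectl delete deployment nginx-test
```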