1. cluster deployment plan
ip | machine name | role | components |
192.168.84.37 | lenmom-0 | master | docker-ce-18.06 kubeadm-1.16.0 kubectl-1.16.0 kubelet-1.16.0 |
192.168.84.46 | lenmom-1 | node | docker-ce-18.06 |
192.168.84.41 | lenmom-2 | node | docker-ce-18.06 |
2. prerequisite setup
2.1 turn off swap on the linux operating system
swap can greatly degrade kubernetes performance, and since kubernetes 1.8 the kubelet by default refuses to start while swap is enabled, so swap has to be turned off.
turn swap off temporarily:
swapoff -a # turn off swap until the next reboot
to turn swap off permanently, disable the swap mount:
vi /etc/fstab
and comment out the swap line.
use free -h to verify that swap is off; if it is, the swap size shows 0.
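the manual edit of /etc/fstab can also be scripted. the following is a minimal sketch, not a tested production script: disable_swap_entries is a hypothetical helper name, the device names in the sample file are made up, and GNU sed (as shipped with CentOS 7) is assumed. it is demonstrated on a temporary copy rather than on the real /etc/fstab.

```shell
# Comment out every active swap entry in an fstab-style file.
# Real use (as root): disable_swap_entries /etc/fstab
disable_swap_entries() {
    # keep a backup next to the file, then prefix uncommented swap lines with '#'
    cp "$1" "$1.bak" &&
    sed -i -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# Demonstration on a throw-away file with made-up device names.
demo_fstab=$(mktemp)
cat > "$demo_fstab" <<'EOF'
/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
EOF
disable_swap_entries "$demo_fstab"
cat "$demo_fstab"
```

the backup file makes it easy to restore the original if the pattern matches a line it should not.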
2.2 turn off selinux
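set SELinux to permissive mode for the running system and disable it permanently in the config file (the config change takes effect after a reboot):
sed -i 's@SELINUX=enforcing@SELINUX=disabled@g' /etc/selinux/config # turn off selinux permanently
setenforce 0 # switch selinux to permissive mode until the next reboot
getenforce # show the current mode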
2.3 turn off firewall
systemctl stop firewalld.service # stop the firewall service
systemctl disable firewalld.service # keep the firewall service from starting at boot
let iptables process bridged traffic, so that packets crossing the container bridge are handled by the iptables rules instead of bypassing them:
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
EOF
sysctl --system
2.4 dependency software install
yum install -y vim wget curl telnet net-tools ntp yum-utils device-mapper-persistent-data lvm2
note:
- ntp is used to synchronize the clocks of all cluster machines. for detailed usage of ntp, please refer to my blog post CentOS7 设置集群时间同步 (setting up cluster time synchronization on CentOS 7).
- device-mapper-persistent-data and lvm2 are required when installing docker.
- vim, telnet, net-tools and yum-utils are general-purpose tools that make working on linux easier.
2.5 add aliyun kubernetes repo source
the official k8s repo source is packages.cloud.google.com, which is not reachable from China, so we configure a repo source that is reachable, such as the aliyun mirror.
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2.6 set host name and static ip and /etc/hosts
the following operations should be done on every machine in the cluster. let's take 192.168.1.101 as an example.
set host name:
hostnamectl set-hostname lenmom-0
hostnamectl --static set-hostname lenmom-0
set static ip address
sudo vim /etc/sysconfig/network-scripts/ifcfg
set the following content in this file:
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eth0
UUID=e925ab50-04af-4d73-8103-4621d24e0038
DEVICE=eth0
ONBOOT=yes
IPADDR=192.168.1.101
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=114.114.114.114
restart network service to make the network setting take effect
sudo service network restart
set /etc/hosts
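# append with tee so that the write itself runs as root and the existing localhost entries are kept
sudo tee -a /etc/hosts <<EOF
192.168.1.101 lenmom-0
192.168.1.102 lenmom-1
192.168.1.103 lenmom-2
EOF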
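when the setup is repeated on a machine, the same host entries can end up in /etc/hosts more than once. the sketch below shows one way to add an entry only when the host name is missing. add_host_entry is a hypothetical helper, and the demonstration writes to a temporary file instead of the real /etc/hosts.

```shell
# Append "ip name" to a hosts-style file unless some non-comment line
# already lists that exact host name.
# Real use (as root): add_host_entry 192.168.1.101 lenmom-0 /etc/hosts
add_host_entry() (
    ip=$1 name=$2 file=$3
    if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
)

# Demonstration on a throw-away file.
demo_hosts=$(mktemp)
printf '127.0.0.1 localhost\n' > "$demo_hosts"
add_host_entry 192.168.1.101 lenmom-0 "$demo_hosts"
add_host_entry 192.168.1.102 lenmom-1 "$demo_hosts"
add_host_entry 192.168.1.101 lenmom-0 "$demo_hosts"   # already present, nothing is added
cat "$demo_hosts"
```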
2.7 set up password-less ssh access between cluster machines
2.7.1 install ssh-server if not already installed
# install ssh client and ssh server
sudo yum install -y openssl openssh-clients openssh-server
# enable the ssh server to start at system start up
sudo systemctl enable sshd.service
# start the ssh server service
sudo systemctl start sshd.service
2.7.2 generate ssh key
execute the following command on each machine in the cluster to generate an ssh public/private key pair.
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
then execute the following commands on each machine to copy its public key to every machine in the cluster:
ssh-copy-id -i ~/.ssh/id_rsa.pub lenmom-0 # copy the local public key to lenmom-0; by default it is stored in ~/.ssh/authorized_keys on the remote machine, and the file is created if it does not exist
ssh-copy-id -i ~/.ssh/id_rsa.pub lenmom-1 # same for lenmom-1
ssh-copy-id -i ~/.ssh/id_rsa.pub lenmom-2 # same for lenmom-2
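with more than a few machines, the ssh-copy-id calls can be generated from a host list. this is a minimal sketch: CLUSTER_HOSTS, DRY_RUN and copy_key_to_cluster are names made up for illustration, and the dry-run mode only prints the commands so they can be reviewed first.

```shell
# Host names taken from the cluster plan; adjust to your own cluster.
CLUSTER_HOSTS="lenmom-0 lenmom-1 lenmom-2"

# Copy the local public key to every host in CLUSTER_HOSTS.
# With DRY_RUN=1 the commands are only printed, not executed.
copy_key_to_cluster() {
    for host in $CLUSTER_HOSTS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub $host"
        else
            ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$host"
        fi
    done
}

# Print the commands first; set DRY_RUN=0 to actually run them.
DRY_RUN=1
copy_key_to_cluster
```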
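set the access permissions for the key files:
chmod 755 ~ # permissions of the current user's home directory
chmod 700 ~/.ssh/ # permissions of the .ssh directory
chmod 600 ~/.ssh/id_rsa # permissions of id_rsa
chmod 644 ~/.ssh/id_rsa.pub # permissions of id_rsa.pub
chmod 644 ~/.ssh/authorized_keys # permissions of authorized_keys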
2.8 update yum cache
yum clean all
yum makecache
yum repolist
3. install docker
the highest docker version validated for this kubernetes release is 18.09, so we install docker 18.06 for this deployment.
to install a specific docker version, follow the steps below:
3.1 uninstall docker which has been installed on the target machine
query installed docker components:
sudo yum list installed | grep docker
containerd.io.x86_64 1.2.10-3.2.el7 @docker-ce-stable
docker-ce.x86_64 3:19.03.5-3.el7 @docker-ce-stable
docker-ce-cli.x86_64 1:19.03.5-3.el7 @docker-ce-stable
uninstall the components listed above:
sudo yum -y remove docker-ce.x86_64
sudo yum -y remove docker-ce-cli.x86_64
sudo yum -y remove containerd.io.x86_64
sudo yum -y remove docker-ce-selinux.noarch
delete docker images and containers. note: if you have defined a custom storage location, handle that directory as well!
sudo rm -rf /var/lib/docker
3.2 query the available docker versions in the yum repository
yum list docker-ce --showduplicates | sort -r
3.3 install the specified docker version
in our case, we choose to install docker 18.06.3
yum install docker-ce-18.06.3.ce-3.el7
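the name passed to yum install is the package name joined with the version column of the list from step 3.2, without any epoch prefix such as "3:". the sketch below derives it from the list output. pick_docker_package is a hypothetical helper, and the sample lines are illustrative input in the usual three-column layout, not captured output.

```shell
# Read `yum list docker-ce --showduplicates` style lines from stdin and print
# the install name of the highest listed version starting with the prefix $1.
pick_docker_package() {
    awk -v want="$1" '
        $1 ~ /^docker-ce\./ {
            ver = $2
            sub(/^[0-9]+:/, "", ver)    # drop an epoch prefix such as "3:"
            if (index(ver, want) == 1) print "docker-ce-" ver
        }' | sort -r | head -n 1
}

# Real use: yum list docker-ce --showduplicates | pick_docker_package 18.06
# Demonstration with illustrative sample lines:
printf '%s\n' \
    'docker-ce.x86_64   3:19.03.5-3.el7    docker-ce-stable' \
    'docker-ce.x86_64   18.06.3.ce-3.el7   docker-ce-stable' \
    'docker-ce.x86_64   18.06.1.ce-3.el7   docker-ce-stable' |
    pick_docker_package 18.06
```

the plain text sort is good enough for picking between patch releases of one series; it is not a general version comparison.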
for detailed install instructions, please refer to my blog post Docker 系列01: Centos7.3 上安装docker (Docker series 01: installing docker on CentOS 7.3).
then configure the registry mirrors and the systemd cgroup driver for docker:
mkdir -p /etc/docker
vi /etc/docker/daemon.json
{
  "registry-mirrors": ["https://5f2jam6c.mirror.aliyuncs.com", "http://hub-mirror.c.163.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
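instead of editing the file in vi, daemon.json can be written with a heredoc like the other config files in this post. a minimal sketch follows: write_daemon_json is a hypothetical helper, and it is demonstrated on a temporary directory; on a real host the target would be /etc/docker and the call would need root. the JSON check only runs if python3 happens to be installed.

```shell
# Write daemon.json with the mirrors and cgroup driver used in this post
# into the directory given as $1.
write_daemon_json() (
    dir=$1
    mkdir -p "$dir"
    cat > "$dir/daemon.json" <<'EOF'
{
  "registry-mirrors": [
    "https://5f2jam6c.mirror.aliyuncs.com",
    "http://hub-mirror.c.163.com"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
)

# Demonstration on a throw-away directory.
demo_dir=$(mktemp -d)
write_daemon_json "$demo_dir"

# Optional syntax check, only when python3 is available.
if command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$demo_dir/daemon.json" >/dev/null && echo "daemon.json is valid JSON"
fi
```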
check if the cgroupdriver is systemd
# docker info | grep Cgroup
Cgroup Driver: systemd
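the same check can be made scriptable, for example to stop a setup script early when docker still uses cgroupfs. this is a sketch only: cgroup_driver is a hypothetical helper that parses the "Cgroup Driver:" line, and the demonstration feeds it the output line shown above instead of calling docker.

```shell
# Print the cgroup driver named in `docker info` style text read from stdin.
cgroup_driver() {
    sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p'
}

# Real use: driver=$(docker info 2>/dev/null | cgroup_driver)
# Demonstration with the output line shown above:
driver=$(printf 'Cgroup Driver: systemd\n' | cgroup_driver)
if [ "$driver" = "systemd" ]; then
    echo "cgroup driver OK"
else
    echo "cgroup driver is '$driver', expected systemd" >&2
fi
```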
enable docker to start at boot and restart the docker service.
systemctl enable docker && systemctl restart docker
4. install kubernetes
4.1. install kubeadm,kubectl,kubelet
yum install -y kubelet-1.16.0 kubeadm-1.16.0 kubectl-1.16.0 --disableexcludes=kubernetes
- Kubelet is the agent that runs on every machine in the cluster. it communicates with the control plane and is responsible for the lifecycle management of the pods and containers on the local machine.
- Kubeadm is a deployment tool that automates setting up kubernetes.
- Kubectl is the command line client for managing a kubernetes cluster.
alternatively, docker and the kubernetes components can be installed with a single command:
yum install -y docker-ce-18.06.3.ce-3.el7 kubelet-1.16.0 kubeadm-1.16.0 kubectl-1.16.0
4.2 kubeadm init
kubeadm init --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.16.0 --pod-network-cidr=192.168.84.0/24 --service-cidr=192.168.84.0/24 \
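--apiserver-advertise-address=192.168.84.37 --ignore-preflight-errors=Swap --ignore-preflight-errors=NumCPU
note: the pod and service CIDRs should not overlap with each other or with the host network (192.168.84.0/24 in this cluster). if they do, as in the command above, pick separate ranges, for example 10.244.0.0/16 for pods and 10.96.0.0/12 (the kubeadm default) for services.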
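candidate ranges for --pod-network-cidr and --service-cidr can be compared before running kubeadm init. the sketch below is a plain shell overlap test for IPv4 CIDR blocks. ip_to_int and cidr_overlap are hypothetical helpers, not part of kubeadm, and the input is assumed to be well-formed.

```shell
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed (exit status 0) when two IPv4 CIDR blocks overlap.
cidr_overlap() (
    a_ip=${1%/*}; a_len=${1#*/}
    b_ip=${2%/*}; b_len=${2#*/}
    # two blocks overlap exactly when they agree on the shorter of the two prefixes
    len=$a_len
    [ "$b_len" -lt "$len" ] && len=$b_len
    mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    [ $(( $(ip_to_int "$a_ip") & mask )) -eq $(( $(ip_to_int "$b_ip") & mask )) ]
)

# The host network against two candidate ranges:
for range in 192.168.84.0/24 10.244.0.0/16; do
    if cidr_overlap 192.168.84.0/24 "$range"; then
        echo "$range overlaps with the host network"
    else
        echo "$range does not overlap with the host network"
    fi
done
```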
some common errors during kubeadm init:
E01:
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
solution:
we can add
{ "exec-opts": [ "native.cgroupdriver=systemd" ] }
to the file /etc/docker/daemon.json and restart docker (systemctl restart docker) so that the change takes effect.
after the change, the content of /etc/docker/daemon.json is as follows:
{ "registry-mirrors": [ "https://5f2jam6c.mirror.aliyuncs.com", "http://hub-mirror.c.163.com" ], "exec-opts": [ "native.cgroupdriver=systemd" ] }
E02:
[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
solution:
add argument --ignore-preflight-errors=NumCPU in the kubeadm init command.
E03:
[root@lenmom-0 lenmom]# kubeadm init --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.17.0 --pod-network-cidr=192.168.84.0/24 --service-cidr=192.168.84.0/24 --apiserver-advertise-address=192.168.84.37 --ignore-preflight-errors=Swap --ignore-preflight-errors=docker --ignore-preflight-errors=NumCPU
[init] Using Kubernetes version: v1.17.0
[preflight] Running pre-flight checks
[WARNING NumCPU]: the number of available CPUs 1 is less than the required 2
[WARNING KubernetesVersion]: Kubernetes version is greater than kubeadm version. Please consider to upgrade kubeadm. Kubernetes version: 1.17.0. Kubeadm version: 1.16.x
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-apiserver:v1.17.0: output: Error response from daemon: Get https://registry.aliyuncs.com/v2/google_containers/kube-apiserver/manifests/v1.17.0: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fkube-apiserver%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 192.168.84.33:53: read udp 192.168.84.37:54293->192.168.84.33:53: i/o timeout , error: exit status 1
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-controller-manager:v1.17.0: output: Error response from daemon: Get https://registry.aliyuncs.com/v2/google_containers/kube-controller-manager/manifests/v1.17.0: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fkube-controller-manager%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 192.168.84.33:53: read udp 192.168.84.37:46837->192.168.84.33:53: i/o timeout , error: exit status 1
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-scheduler:v1.17.0: output: Error response from daemon: Get https://registry.aliyuncs.com/v2/google_containers/kube-scheduler/manifests/v1.17.0: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fkube-scheduler%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 192.168.84.33:53: read udp 192.168.84.37:32815->192.168.84.33:53: i/o timeout , error: exit status 1
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-proxy:v1.17.0: output: Error response from daemon: Get https://registry.aliyuncs.com/v2/google_containers/kube-proxy/manifests/v1.17.0: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fkube-proxy%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 192.168.84.33:53: read udp 192.168.84.37:55244->192.168.84.33:53: i/o timeout , error: exit status 1
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/pause:3.1: output: Error response from daemon: Get https://registry.aliyuncs.com/v2/google_containers/pause/manifests/3.1: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fpause%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 192.168.84.33:53: read udp 192.168.84.37:55865->192.168.84.33:53: i/o timeout , error: exit status 1
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/etcd:3.3.15-0: output: Error response from daemon: Get https://registry.aliyuncs.com/v2/google_containers/etcd/manifests/3.3.15-0: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fetcd%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 192.168.84.33:53: read udp 192.168.84.37:56733->192.168.84.33:53: i/o timeout , error: exit status 1
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/coredns:1.6.2: output: Error response from daemon: Get https://registry.aliyuncs.com/v2/google_containers/coredns/manifests/1.6.2: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fcoredns%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 192.168.84.33:53: read udp 192.168.84.37:39608->192.168.84.33:53: i/o timeout , error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
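this happens because the docker images cannot be pulled from the repository (in the log above, the DNS lookup for the registry's auth endpoint timed out). we can specify a reachable image repository on the command line, or pull the docker images manually, for example: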
--image-repository=registry.aliyuncs.com/google_containers
--image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers
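for the manual route, one common approach is to pull each image from a mirror and retag it with the name kubeadm expects. the sketch below only prints the docker commands so they can be reviewed before running. MIRROR and mirror_pull_commands are names made up for illustration, and the retag step is only needed when kubeadm init is run without --image-repository, so that it looks for the default k8s.gcr.io names.

```shell
MIRROR=registry.cn-hangzhou.aliyuncs.com/google_containers

# Read image names (one per line, e.g. k8s.gcr.io/kube-apiserver:v1.16.0) from
# stdin and print the docker commands that pull each one from the mirror and
# tag it with the original name.
mirror_pull_commands() {
    while IFS= read -r image; do
        [ -n "$image" ] || continue
        name=${image##*/}    # strip the registry prefix, keep name:tag
        echo "docker pull $MIRROR/$name"
        echo "docker tag $MIRROR/$name $image"
    done
}

# Real use (assumes kubeadm and docker are installed):
#   kubeadm config images list --kubernetes-version v1.16.0 | mirror_pull_commands | sh
# Demonstration with two image names:
printf '%s\n' k8s.gcr.io/kube-apiserver:v1.16.0 k8s.gcr.io/pause:3.1 | mirror_pull_commands
```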
4.3 start kubelet
systemctl enable kubelet && systemctl start kubelet