k8s Deployment Guide
I. About This Document
Author: lanjx
Email: lanheader@163.com
Blog: https://www.cnblogs.com/lanheader/
Last updated: 2021-07-09
II. Deploying with kubeadm
Note: unless stated otherwise, every step must be executed on all nodes (k8s-master and k8s-node).
1. Environment preparation
Prepare three hosts (adjust the addresses to your own environment):
192.168.8.158 master
192.168.8.159 node1
192.168.8.160 node2
1.1 Set hostnames
Run the matching command on each host:
$ hostname master   # on 192.168.8.158
$ hostname node1    # on 192.168.8.159
$ hostname node2    # on 192.168.8.160
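The hostname command only changes the name for the current boot. A minimal sketch of making it persistent on systemd hosts such as CentOS 7 (run once per host with its own name):
$ hostnamectl set-hostname master   # use node1/node2 on the other hosts
$ hostnamectl status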
1.2 Disable the firewall
$ systemctl stop firewalld.service
$ systemctl disable firewalld.service
$ yum upgrade
1.3 Disable swap
Note: since Kubernetes 1.8, kubelet refuses to start if swap is enabled.
$ swapoff -a
$ cp /etc/fstab /etc/fstab_bak
$ cat /etc/fstab_bak |grep -v swap > /etc/fstab
$ cat /etc/fstab
# /etc/fstab
# Created by anaconda on Tue Jul 21 11:51:16 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos_virtual--machine-root / xfs defaults 0 0
UUID=1694f89b-5c62-4a4a-9c86-46c3f202e4f6 /boot xfs defaults 0 0
/dev/mapper/centos_virtual--machine-home /home xfs defaults 0 0
#/dev/mapper/centos_virtual--machine-swap swap swap defaults 0 0
1.4 Adjust kernel parameters for iptables
Some users on RHEL/CentOS 7 have reported traffic being routed incorrectly because iptables was bypassed. Create /etc/sysctl.d/k8s.conf with the following content:
$ cat <<EOF > /etc/sysctl.d/k8s.conf
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
Apply the settings:
$ modprobe br_netfilter
$ sysctl -p /etc/sysctl.d/k8s.conf
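modprobe only loads br_netfilter for the current boot. A small sketch of loading it automatically at boot on systemd hosts (assuming /etc/modules-load.d is honoured, as it is on CentOS 7):
$ cat <<EOF > /etc/modules-load.d/k8s.conf
br_netfilter
EOF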
1.5 Load the IPVS modules
$ cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
The next command is a bit long:
$ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
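If kube-proxy is later switched to IPVS mode, the ipset and ipvsadm packages make it easier to inspect the rules it creates. Installing them is optional; a minimal sketch:
$ yum install -y ipset ipvsadm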
1.6 Install Docker
# Clean up any old Docker containers and images
$ docker stop `docker ps -a -q`
$ docker rm `docker ps -a -q`
$ docker rmi -f `docker images -a -q`   # this force-removes all images
# Remove old Docker packages
$ yum -y remove docker docker-common container-selinux
# Configure the Docker repository for the latest stable release
$ yum-config-manager \
--add-repo \
https://docs.docker.com/v1.13/engine/installation/linux/repo_files/centos/docker.repo
# Install Docker
# Refresh the yum cache
$ yum makecache fast
# Pick the Docker version you want
$ yum list docker-engine.x86_64 --showduplicates |sort -r
$ yum -y install docker-engine-<VERSION_STRING>
$ docker -v
# Start Docker and enable it at boot
$ systemctl start docker
$ systemctl enable docker
# Uninstall (only if you ever need to remove Docker)
$ yum -y remove docker-engine docker-engine-selinux
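kubeadm expects the kubelet and the container runtime to use the same cgroup driver. A sketch of /etc/docker/daemon.json that switches Docker to the systemd driver; whether this is needed depends on the Docker version installed above, so treat it as an assumption to verify:
$ mkdir -p /etc/docker
$ cat <<EOF > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "100m" }
}
EOF
$ systemctl daemon-reload && systemctl restart docker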
1.7 Create shared storage
If you choose to use an NFS server, perform the following steps.
Note: the export must include no_root_squash, i.e. /data/k8s *(rw,sync,no_root_squash); without it, pods will fail to start with permission errors.
# Install the NFS packages
$ yum -y install nfs-utils rpcbind
# Create the NFS export path
$ mkdir -p /data/k8s/
# Set permissions on the path
$ chmod 755 /data/k8s/
# Configure the NFS export
$ vim /etc/exports
/data/k8s *(rw,sync,no_root_squash)
# Start the services
$ systemctl start rpcbind.service
$ systemctl enable rpcbind
$ systemctl status rpcbind
$ systemctl start nfs.service
$ systemctl enable nfs
$ systemctl status nfs
# On each node server, check that the export is visible
$ showmount -e 192.168.1.109
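A minimal sketch of verifying that a node can actually mount the export (the NFS server address is the one used above; the mount point /mnt is only an example):
$ yum -y install nfs-utils
$ mount -t nfs 192.168.1.109:/data/k8s /mnt
$ umount /mnt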
2. Install Helm
2.1 Installation
Download the binary from the Helm releases page (v2.10.0 is used here), extract it, copy the helm executable to /usr/local/bin, and make it executable (mode 755). That completes the Helm client installation on this machine.
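A minimal sketch of those steps (the download URL is an assumption; check the Helm releases page for the exact file for your platform):
$ wget https://get.helm.sh/helm-v2.10.0-linux-amd64.tar.gz
$ tar -zxvf helm-v2.10.0-linux-amd64.tar.gz
$ cp linux-amd64/helm /usr/local/bin/helm
$ chmod 755 /usr/local/bin/helm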
2.2 Verification
Check the version with the helm command; it will report that it cannot reach the server-side component, Tiller:
$ helm version
Client: &version.Version{SemVer:"v2.10.0", GitCommit:"9ad53aac42165a5fadc6c87be0dea6b115f93090", GitTreeState:"clean"}
Error: could not find tiller
Note: installing Helm's server-side component requires kubectl, so first make sure kubectl can reach the Kubernetes cluster's apiserver.
Then run the initialization:
helm init
Helm pulls the Tiller image from gcr.io by default; if the machine you are running on cannot reach it, the following command can be used instead (it points at a mirrored image):
$ helm init --upgrade --tiller-image cnych/tiller:v2.10.0
$HELM_HOME has been configured at /root/.helm.
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!
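kubeadm clusters have RBAC enabled by default, so Tiller usually also needs a service account with sufficient permissions. A sketch (cluster-admin is convenient for a lab cluster but far too broad for production):
$ kubectl -n kube-system create serviceaccount tiller
$ kubectl create clusterrolebinding tiller --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
$ helm init --upgrade --service-account tiller --tiller-image cnych/tiller:v2.10.0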
2.3 Tiller (Helm server) image address
Change the image used by the tiller-deploy deployment:
$ kubectl edit deployment tiller-deploy -n kube-system
Replace the image line as shown below:
...
spec:
automountServiceAccountToken: true
containers:
- env:
- name: TILLER_NAMESPACE
value: kube-system
- name: TILLER_HISTORY_MAX
value: "0"
############################ Tiller (server) image address ############################
image: registry.cn-hangzhou.aliyuncs.com/hlc-k8s-gcr-io/tiller:v2.16.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /liveness
port: 44135
scheme: HTTP
initialDelaySeconds: 1
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: tiller
...
Image address: registry.cn-hangzhou.aliyuncs.com/hlc-k8s-gcr-io/tiller:v2.16.0
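Instead of editing the deployment interactively, the image can also be swapped with a single command (same image address; the container name tiller is taken from the manifest shown above):
$ kubectl -n kube-system set image deployment/tiller-deploy tiller=registry.cn-hangzhou.aliyuncs.com/hlc-k8s-gcr-io/tiller:v2.16.0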
3. Deploying Kubernetes with kubeadm
3.1 Install kubeadm, kubelet and kubectl
Note: pay attention to the Kubernetes version yum installs; you will need it later for kubeadm init.
$ cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF
> **Note: double-check the version number here; the version passed to kubeadm init must not be lower than the installed kubeadm/kubelet version (the version is shown during installation).**
$ yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
Note: to pin a specific version, use the kubelet-<version> form, for example:
$ yum install kubelet-1.19.2 kubeadm-1.19.2 kubectl-1.19.2 --disableexcludes=kubernetes
3.2 Start kubelet
$ systemctl enable kubelet.service && systemctl start kubelet.service
After starting kubelet.service, its status shows it has not actually started; the cause is that /var/lib/kubelet/config.yaml does not exist yet. This can be ignored for now: kubeadm init will create the file.
On k8s-master, initialize Kubernetes with kubeadm init.
Note: the kubernetes-version below must match the version installed above, otherwise init will fail.
kubeadm init \
--apiserver-advertise-address=192.168.8.158 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.21.2 \
--pod-network-cidr=10.244.0.0/16
--apiserver-advertise-address   # the k8s-master IP
--image-repository              # registry to pull the control-plane images from
--kubernetes-version            # disables version detection; the default is stable-1, which fetches the latest version number from https://storage.googleapis.com/kubernetes-release/release/stable-1.txt, so specifying the version skips that network request. Again, it must match the installed Kubernetes version.
kubeadm init output: note that the init automatically creates /var/lib/kubelet/config.yaml. (The node hosts do not run kubeadm init, so this file is copied to them manually; a sketch follows after the output below.)
[init] Using Kubernetes version: v1.13.1
[preflight] Running pre-flight checks
...
certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
# ====== Commands to run before you start using the cluster ======
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
# ===== Command for joining nodes; if the token has expired, see the troubleshooting notes =====
kubeadm join 10.211.55.6:6443 --token sfaff2.iet15233unw5jzql --discovery-token-ca-cert-hash sha256:f798c5be53416ca3b5c7475ee0a4199eb26f9e31ee7106699729c0660a70f8d7
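As noted above, the node hosts do not run kubeadm init, so /var/lib/kubelet/config.yaml can be copied to them by hand. A minimal sketch, assuming root SSH access and the node IPs from section 1:
$ scp /var/lib/kubelet/config.yaml root@192.168.8.159:/var/lib/kubelet/config.yaml
$ scp /var/lib/kubelet/config.yaml root@192.168.8.160:/var/lib/kubelet/config.yaml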
After a successful init, the output tells you what to configure before using the cluster, and it also prints a temporary token and the command for joining nodes.
A regular user needs to run the following to use the cluster:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
If you are root, you can instead simply run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Either of the two options works; since I am root here, I use the export above.
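To make the variable survive new shell sessions, it can be appended to root's profile (a sketch; adjust the file to whichever shell profile you actually use):
$ echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> ~/.bash_profile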
Now check kubelet again: it is in the running state, so it started successfully.
Check component status and confirm every component is Healthy:
kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health": "true"}
Check node status:
kubectl get node
NAME STATUS ROLES AGE VERSION
centos NotReady master 11m v1.19.2
After installing the cluster you may well see the following:
$ kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager Unhealthy Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0 Healthy {"health":"true"}
This happens because kube-controller-manager.yaml and kube-scheduler.yaml set the insecure port to 0 by default; commenting that flag out in both files fixes it. (Do this on every master node.)
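For reference, the same change can be applied non-interactively (a sketch; the manual edits are shown in the steps below). The kubelet picks up changes to static pod manifests automatically:
$ sed -i 's/^\( *\)- --port=0/\1# - --port=0/' /etc/kubernetes/manifests/kube-scheduler.yaml
$ sed -i 's/^\( *\)- --port=0/\1# - --port=0/' /etc/kubernetes/manifests/kube-controller-manager.yaml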
1. Edit kube-scheduler.yaml
Comment out the line `- --port=0`:
vim /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-scheduler
tier: control-plane
name: kube-scheduler
namespace: kube-system
spec:
containers:
- command:
- kube-scheduler
- --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
- --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
- --bind-address=127.0.0.1
- --kubeconfig=/etc/kubernetes/scheduler.conf
- --leader-elect=true
# - --port=0   ## comment out this line
image: k8s.gcr.io/kube-scheduler:v1.18.6
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10259
scheme: HTTPS
initialDelaySeconds: 15
timeoutSeconds: 15
name: kube-scheduler
resources:
requests:
cpu: 100m
volumeMounts:
- mountPath: /etc/kubernetes/scheduler.conf
name: kubeconfig
readOnly: true
hostNetwork: true
priorityClassName: system-cluster-critical
volumes:
- hostPath:
path: /etc/kubernetes/scheduler.conf
type: FileOrCreate
name: kubeconfig
status: {}
2. Edit kube-controller-manager.yaml (comment out `- --port=0` here as well):
vim /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-controller-manager
tier: control-plane
name: kube-controller-manager
namespace: kube-system
spec:
containers:
- command:
- kube-controller-manager
- --allocate-node-cidrs=true
- --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
- --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
- --bind-address=127.0.0.1
- --client-ca-file=/etc/kubernetes/pki/ca.crt
- --cluster-cidr=10.244.0.0/16
- --cluster-name=kubernetes
- --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
- --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
- --controllers=*,bootstrapsigner,tokencleaner
- --kubeconfig=/etc/kubernetes/controller-manager.conf
- --leader-elect=true
- --node-cidr-mask-size=24
# - --port=0   ## comment out this line
- --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
- --root-ca-file=/etc/kubernetes/pki/ca.crt
- --service-account-private-key-file=/etc/kubernetes/pki/sa.key
- --service-cluster-ip-range=10.96.0.0/12
- --use-service-account-credentials=true
image: k8s.gcr.io/kube-controller-manager:v1.18.6
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10257
scheme: HTTPS
initialDelaySeconds: 15
timeoutSeconds: 15
name: kube-controller-manager
resources:
requests:
cpu: 200m
volumeMounts:
- mountPath: /etc/ssl/certs
name: ca-certs
readOnly: true
- mountPath: /etc/pki
name: etc-pki
readOnly: true
- mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
name: flexvolume-dir
- mountPath: /etc/kubernetes/pki
name: k8s-certs
readOnly: true
- mountPath: /etc/kubernetes/controller-manager.conf
name: kubeconfig
readOnly: true
hostNetwork: true
priorityClassName: system-cluster-critical
volumes:
- hostPath:
path: /etc/ssl/certs
type: DirectoryOrCreate
name: ca-certs
- hostPath:
path: /etc/pki
type: DirectoryOrCreate
name: etc-pki
- hostPath:
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
type: DirectoryOrCreate
name: flexvolume-dir
- hostPath:
path: /etc/kubernetes/pki
type: DirectoryOrCreate
name: k8s-certs
- hostPath:
path: /etc/kubernetes/controller-manager.conf
type: FileOrCreate
name: kubeconfig
status: {}
3. Restart kubelet on every master:
$ systemctl restart kubelet.service
4. Check the status again:
$ kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
3.3 Install a Pod network add-on (flannel)
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
If the URL is unreachable from your network, download the manifest on a machine that can reach it (or use a mirror) and apply it locally.
3.4 Install a StorageClass
$ git clone https://github.com/helm/charts.git
$ cd charts/
$ helm install stable/nfs-client-provisioner --set nfs.server=192.168.1.109 --set nfs.path=/data/k8s
Note: as above, if the GitHub address is unreachable, obtain the chart some other way.
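If the NFS-backed StorageClass should become the cluster default, it can be marked as such (a sketch; nfs-client is the chart's default StorageClass name, adjust it if you overrode it):
$ kubectl get storageclass
$ kubectl patch storageclass nfs-client -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'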
3.5 kubectl command completion
$ yum install -y bash-completion
$ source /usr/share/bash-completion/bash_completion
$ source <(kubectl completion bash)
$ echo "source <(kubectl completion bash)" >> ~/.bashrc
3.6 Problems encountered during installation
3.6.1 Cluster DNS (CoreDNS) image pull failure
Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.13-0
failed to pull image "registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0": output: Error response from daemon: pull access denied for registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
, error: exit status 1
Cause:
When installing Kubernetes v1.21.1, kubeadm pulls images from k8s.gcr.io, which is blocked in mainland China, so the installation fails.
The workaround below pulls the image from Docker Hub and re-tags it so that the reference kubeadm expects exists locally.
Workaround:
Pull the image manually:
$ docker pull coredns/coredns
List the images kubeadm needs (to work out the expected name):
$ kubeadm config images list --config new.yaml
List local images:
$ docker images
Tag the image with the expected name:
$ docker tag coredns/coredns:latest registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0
Remove the now-redundant tag:
$ docker rmi coredns/coredns:latest
3.6.2 kubelet will not start
Check kubelet status:
systemctl status kubelet.service
Output:
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset:disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since 日 2019-03-31 16:18:55 CST;7s ago
Docs: https://kubernetes.io/docs/
Process: 4564 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
Main PID: 4564 (code=exited, status=255)
3月 31 16:18:55 k8s-node systemd[1]: Unit kubelet.service entered failed state.
3月 31 16:18:55 k8s-node systemd[1]: kubelet.service failed.
Check the error details:
journalctl -xefu kubelet
3月 31 16:19:46 k8s-node systemd[1]: kubelet.service holdoff time over,scheduling restart.
3月 31 16:19:46 k8s-node systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit kubelet.service has finished shutting down.
3月 31 16:19:46 k8s-node systemd[1]: Started kubelet: The Kubernetes Node Agent
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit kubelet.service has finished starting up.
-- The start-up result is done.
###### Note the following error:
3月 31 16:19:46 k8s-node kubelet[4611]: F0331 16:19:46.989588 4611 server.go:193] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
3月 31 16:19:46 k8s-node systemd[1]: kubelet.service: main process exited,code=exited,status=255/n/a
#######
3月 31 16:19:46 k8s-node systemd[1]: Unit kubelet.service entered failed state.
3月 31 16:19:46 k8s-node systemd[1]: kubelet.service failed.
The error says /var/lib/kubelet/config.yaml does not exist; run the kubeadm init step described in 3.2.
III. Installing Rancher
Note: on cloud servers use the NodePort approach; after the Service starts, change its type to NodePort (see the sketch below).
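A sketch of switching the Service type after Rancher is installed (this assumes the Service is named rancher in the cattle-system namespace, which is the chart default):
$ kubectl -n cattle-system patch svc rancher -p '{"spec": {"type": "NodePort"}}'
$ kubectl -n cattle-system get svc rancher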
1. Create the namespace
$ kubectl create namespace cattle-system
2. Install cert-manager
# Install the CustomResourceDefinition resources
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.0/cert-manager.crds.yaml
# **Important:** if you are running Kubernetes v1.15 or lower,
# you need to add the --validate=false flag to the kubectl apply command above;
# otherwise you will get validation errors about the
# x-kubernetes-preserve-unknown-fields field in cert-manager's CustomResourceDefinitions.
# This is a benign error caused by the way kubectl performs resource validation.
# Create a namespace for cert-manager
kubectl create namespace cert-manager
# Add the Jetstack Helm repository
helm repo add jetstack https://charts.jetstack.io
# Refresh the local Helm chart repository cache
helm repo update
# Install the cert-manager Helm chart
helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--version v0.15.0
3. Install Rancher
# Add the Rancher chart repository (stable channel, so the repo name matches the install command below)
$ helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
# Install the latest stable Rancher
$ helm install rancher rancher-stable/rancher --namespace cattle-system
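Once the chart is installed, the rollout can be watched like this (the deployment name rancher is the chart default):
$ kubectl -n cattle-system rollout status deploy/rancher
$ kubectl -n cattle-system get pods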