I. Environment Preparation
Cluster version: Kubernetes 1.22.4
| OS | Node IP | Node role | Memory/CPU | hostname |
| Ubuntu 20.04 | 192.168.1.101 | Control-plane (master) | 2 GB / 4 cores | master |
| Ubuntu 20.04 | 192.168.1.102 | Worker node 1 | 2 GB / 4 cores | node1 |
| Ubuntu 20.04 | 192.168.1.104 | Worker node 2 | 2 GB / 4 cores | node2 |
II. Pre-installation Checks
Note: run the following on all three machines ------------------------ BEGIN ----------------------------
1. The three machines can reach each other over the network.
2. Hostname, MAC address and product_uuid (check with sudo cat /sys/class/dmi/id/product_uuid) must be unique on every machine.
3. Check that the required ports are reachable -- for convenience I simply disabled the firewall here; do NOT do this in production!
4. Disable swap (important!)
Check swap:
yang@master:/etc/docker$ sudo free -m
[sudo] password for yang:
total used free shared buff/cache available
Mem: 1959 1222 86 3 649 548
Swap: 2047 0 2047
Temporarily disable swap:
yang@master:/etc/docker$ sudo swapoff -a
Check swap again:
yang@master:/etc/docker$ sudo free -m
[sudo] password for yang:
total used free shared buff/cache available
Mem: 1959 1222 86 3 649 548
Swap: 0 0 0
5. Remount the root filesystem read-write
yang@master:/etc/docker$ sudo mount -n -o remount,rw /
6. Comment out the swap line in /etc/fstab with # to disable the swap partition permanently (a sed one-liner alternative is shown after the listing below)
The swap partition carves out disk space to act as extra memory; it is far slower than real physical RAM.
Docker containers run in memory -> Kubernetes does not allow containers to be swapped out, so swap must be turned off -> disabling swap is how Kubernetes keeps performance predictable.
yang@master:/etc/docker$ sudo nano /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda2 during curtin installation
/dev/disk/by-uuid/7006dc64-4b4b-41e7-a1ea-857c98683977 / ext4 defaults 0 1
#/swap.img none swap sw 0 0
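If you prefer not to edit /etc/fstab by hand, here is a minimal sketch that does the same thing (it assumes the swap entries look like the /swap.img line above, i.e. contain a whitespace-separated "swap" field):
sudo swapoff -a                                                      # turn swap off immediately
sudo sed -i.bak '/[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab     # comment out swap entries, keep a .bak copy
free -m                                                              # the Swap line should now show 0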
7. Reboot the machine
yang@master:/etc/docker$ sudo reboot
III. Install a CRI (Docker is used here)
1. Install the latest Docker on Ubuntu 20.04 server
# Install Docker Engine on Ubuntu
# Remove old versions. Older releases were called docker, docker.io or docker-engine; uninstall them if they are installed
sudo apt-get remove docker docker-engine docker.io containerd runc
# Set up the repository
# Update the apt package index and install packages that allow apt to use a repository over HTTPS
sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Set up the stable repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine
# Update the apt package index and install the latest Docker Engine and containerd, or go to the next step to install a specific version
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
# Install a specific Docker version (not needed in this article)
# List the versions available in the repository
apt-cache madison docker-ce
# Install a specific version using the version string from the second column, e.g. 5:18.09.1~3-0~ubuntu-xenial
sudo apt-get install docker-ce=<VERSION_STRING> docker-ce-cli=<VERSION_STRING> containerd.io
# Note: replace <VERSION_STRING> with the desired version.
2. Allow a regular user to run docker commands
sudo groupadd docker          # create the docker group
sudo gpasswd -a $USER docker  # add the login user to the docker group
newgrp docker                 # activate the new group membership
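To confirm the group change took effect, a quick optional check (assuming the machine can reach Docker Hub for the hello-world image):
docker run --rm hello-world   # should print "Hello from Docker!" without needing sudo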
3. Verify the Docker installation
yang@master:/etc/docker$ docker version
Client: Docker Engine - Community
Version: 20.10.11
API version: 1.41
Go version: go1.16.9
Git commit: dea9396
Built: Thu Nov 18 00:37:06 2021
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.11
API version: 1.41 (minimum version 1.12)
Go version: go1.16.9
Git commit: 847da18
Built: Thu Nov 18 00:35:15 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.12
GitCommit: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
runc:
Version: 1.0.2
GitCommit: v1.0.2-0-g52b36a2
docker-init:
Version: 0.19.0
GitCommit: de40ad0
4. Check the Docker socket
Check the socket (kubernetes looks under this path to detect the CRI):
yang@master:/etc/docker$ ls /var/run/docker.sock
/var/run/docker.sock
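It is also worth checking which cgroup driver Docker is using at this point. kubeadm 1.22 defaults the kubelet to the systemd driver, and a mismatch with Docker's default cgroupfs driver is exactly what shows up as error 1 during the join step later. A quick check:
docker info --format '{{.CgroupDriver}}'   # prints cgroupfs (Docker default) or systemd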
IV. Install kubectl, kubeadm and kubelet
1. Install curl and apt-transport-https
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
2. Download the GPG key
sudo wget https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg
3. Add the GPG key
sudo apt-key add apt-key.gpg
4. Write the mirror source file
Note: if this file does not exist yet, simply create it with the following content.
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
5. Update the package index
sudo apt-get update
6. Install kubectl, kubeadm and kubelet
sudo apt-get install -y kubelet kubeadm kubectl
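Optionally, pin these packages so a routine apt upgrade does not move the cluster components to a newer version; this is a common recommendation rather than a required step:
sudo apt-mark hold kubelet kubeadm kubectl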
Note: run the above on all three machines ------------------------ END ----------------------------
V. Install the Cluster
Note: run on the master node
1. Check which images are needed
yang@master:/etc/docker$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.4
k8s.gcr.io/kube-controller-manager:v1.22.4
k8s.gcr.io/kube-scheduler:v1.22.4
k8s.gcr.io/kube-proxy:v1.22.4
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4
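As an alternative to the pull/tag scripts below, kubeadm can also pull everything from a mirror registry directly. A sketch using the same Aliyun mirror (note: on that mirror the coredns image lives at google_containers/coredns rather than google_containers/coredns/coredns, so the coredns pull may still fail and need the manual pull-and-retag step anyway):
sudo kubeadm config images pull \
  --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
  --kubernetes-version v1.22.4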
2. Pull the image files
Create a script and make it executable for convenience:
sudo nano pull
sudo chmod +x pull

#!/bin/sh
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4
./pull
3. Check the downloaded images
yang@master:/etc/docker$ docker images
REPOSITORY                                                                     TAG       IMAGE ID       CREATED        SIZE
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver            v1.22.4   8a5cc299272d   11 days ago    128MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager   v1.22.4   0ce02f92d3e4   11 days ago    122MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler            v1.22.4   721ba97f54a6   11 days ago    52.7MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy                v1.22.4   edeff87e4802   11 days ago    104MB
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd                      3.5.0-0   004811815584   5 months ago   295MB
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns                   v1.8.4    8d147537fb7d   6 months ago   47.6MB
registry.cn-hangzhou.aliyuncs.com/google_containers/pause                     3.5       ed210e3e4a5b   8 months ago   683kB
4. Re-tag the images
sudo nano tag
sudo chmod +x tag

#!/bin/sh
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.4 k8s.gcr.io/kube-apiserver:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.4 k8s.gcr.io/kube-controller-manager:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.4 k8s.gcr.io/kube-scheduler:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4 k8s.gcr.io/kube-proxy:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0 k8s.gcr.io/etcd:3.5.0-0
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4
./tag
5. Check the re-tagged images
yang@master:/etc/docker$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-apiserver v1.22.4 8a5cc299272d 11 days ago 128MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver v1.22.4 8a5cc299272d 11 days ago 128MB
k8s.gcr.io/kube-controller-manager v1.22.4 0ce02f92d3e4 11 days ago 122MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager v1.22.4 0ce02f92d3e4 11 days ago 122MB
k8s.gcr.io/kube-scheduler v1.22.4 721ba97f54a6 11 days ago 52.7MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler v1.22.4 721ba97f54a6 11 days ago 52.7MB
k8s.gcr.io/kube-proxy v1.22.4 edeff87e4802 11 days ago 104MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy v1.22.4 edeff87e4802 11 days ago 104MB
k8s.gcr.io/etcd 3.5.0-0 004811815584 5 months ago 295MB
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd 3.5.0-0 004811815584 5 months ago 295MB
k8s.gcr.io/coredns/coredns v1.8.4 8d147537fb7d 6 months ago 47.6MB
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns v1.8.4 8d147537fb7d 6 months ago 47.6MB
k8s.gcr.io/pause 3.5 ed210e3e4a5b 8 months ago 683kB
registry.cn-hangzhou.aliyuncs.com/google_containers/pause 3.5 ed210e3e4a5b 8 months ago 683kB
6. Configure the cluster
Initialize the control-plane node.
Run on master:
kubeadm init --apiserver-advertise-address=192.168.1.101 --kubernetes-version=v1.22.4 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=all --v=6
Output on successful initialization:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.101:6443 --token zbsafp.yliab4onvxpwdmxx \
    --discovery-token-ca-cert-hash sha256:5e2e9d7c76cce5e14897138979d7b397311fd632a1618920a29582ca6d2523b3
Copy the kubeconfig as instructed above:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
7. Join the worker nodes to the cluster
Note:
a. Load the kube-proxy:v1.22.4, pause:3.5 and coredns:v1.8.4 images on each node and re-tag them
b. Modify daemon.json
Join command:
kubeadm join 192.168.1.101:6443 --token zbsafp.yliab4onvxpwdmxx \
    --discovery-token-ca-cert-hash sha256:5e2e9d7c76cce5e14897138979d7b397311fd632a1618920a29582ca6d2523b3
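The bootstrap token printed by kubeadm init is only valid for 24 hours. If it has expired by the time a node joins, a fresh join command can be generated on the master with:
sudo kubeadm token create --print-join-command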
Error 1 when adding a node
Note: the fix can be applied on each node in advance (skip this if there is no error)
yang@node2:~$ sudo kubeadm join 192.168.1.101:6443 --token 6131nu.8ohxo1ttgwiqlwmp --discovery-token-ca-cert-hash sha256:9bece23d1089b6753a42ce4dab3fa5ac7d2d4feb260a0f682bfb06ccf1eb4fe2
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
Cause:
1. Docker's default cgroup driver is cgroupfs. cgroupfs is a virtual filesystem that exposes the cgroup interface to user space;
like sysfs and proc, it presents the cgroup hierarchy to users and passes cgroup changes to the kernel,
and cgroups can only be queried and modified through this filesystem.
2. Kubernetes recommends the systemd driver instead of cgroupfs,
because systemd already manages cgroups for every process on the host.
If Docker keeps using cgroupfs, two cgroup managers end up running side by side,
which can become unstable when the system is under resource pressure.
Fix:
①. Create /etc/docker/daemon.json with the following content:
cat <<EOF | sudo tee /etc/docker/daemon.json
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF
②. Restart docker
sudo systemctl restart docker
③. Restart kubelet
sudo systemctl restart kubelet
sudo systemctl status kubelet
④. Before re-running the join command, clear the previous state
sudo kubeadm reset
⑤. Join the cluster again
sudo kubeadm join 192.168.1.101:6443 --token 6131nu.8ohxo1ttgwiqlwmp --discovery-token-ca-cert-hash sha256:9bece23d1089b6753a42ce4dab3fa5ac7d2d4feb260a0f682bfb06ccf1eb4fe2
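After restarting Docker it is worth verifying that the driver actually changed before retrying the join:
docker info --format '{{.CgroupDriver}}'   # should now print: systemd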
Error 2
Note: each node needs the kube-proxy image
Problem:
Normal BackOff 106s (x249 over 62m) kubelet Back-off pulling image "k8s.gcr.io/kube-proxy:v1.22.4"
Cause:
The k8s.gcr.io/kube-proxy:v1.22.4 image is missing.
Fix:
① Pull the image
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4
② Re-tag the image
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4 k8s.gcr.io/kube-proxy:v1.22.4
Error 3
Note: each node needs the pause:3.5 image
Problem:
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Normal   Scheduled               17m                   default-scheduler  Successfully assigned kube-system/kube-proxy-dv2sw to node2
  Warning  FailedCreatePodSandBox  4m48s (x26 over 16m)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.5": Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  FailedCreatePodSandBox  93s (x2 over 16m)     kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.5": Error response from daemon: Get "https://k8s.gcr.io/v2/": dial tcp 74.125.195.82:443: i/o timeout
Cause:
The k8s.gcr.io/pause:3.5 image is missing.
Fix:
① Pull the image
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5
② Re-tag the image
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5
Error 4
Note: each node needs the coredns:v1.8.4 image
Problem:
Warning Failed 28m (x11 over 73m) kubelet Failed to pull image "k8s.gcr.io/coredns/coredns:v1.8.4" : rpc error: code = Unknown desc = Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
This is a gotcha: the coredns image tag is v1.8.4, not 1.8.4, so pulling with the wrong tag simply times out.
Check the image list kubeadm expects:
yang@master:/etc/docker$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.4
k8s.gcr.io/kube-controller-manager:v1.22.4
k8s.gcr.io/kube-scheduler:v1.22.4
k8s.gcr.io/kube-proxy:v1.22.4
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4
Fix:
① Pull the image
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4
② Re-tag the image
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4
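Errors 2-4 share the same root cause: each worker node needs the kube-proxy, pause and coredns images locally because it cannot reach k8s.gcr.io. A small sketch that pulls and re-tags all three on a node in one go (same mirror registry as above):
#!/bin/sh
# run on each worker node
for img in kube-proxy:v1.22.4 pause:3.5 coredns:v1.8.4; do
  sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$img
done
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4 k8s.gcr.io/kube-proxy:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4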
8. Check component health
At this point controller-manager and scheduler are unhealthy:
yang@master:/etc/docker$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager UnHealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0 Healthy {"health":"true","reason":""}
scheduler UnHealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
9. Edit the following two manifest files:
Comment out this line in each: # - --port=0
yang@master:/etc/docker$ sudo nano /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    #- --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: k8s.gcr.io/kube-controller-manager:v1.22.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
yang@master:/etc/docker$ sudo nano /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    #- --port=0
    image: k8s.gcr.io/kube-scheduler:v1.22.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
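If you prefer a one-liner over editing with nano, a sed sketch that comments out the --port=0 line in both manifests (the kubelet watches this directory and restarts the static pods automatically when the files change):
sudo sed -i 's/^\([[:space:]]*\)- --port=0/\1#- --port=0/' \
  /etc/kubernetes/manifests/kube-controller-manager.yaml \
  /etc/kubernetes/manifests/kube-scheduler.yaml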
10. Check component health again
Ports 10251 and 10252 are now open and the health checks report Healthy.
yang@master:/etc/docker$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
etcd-0 Healthy {"health":"true","reason":""}
controller-manager Healthy ok
11. Node status
yang@master:~$ kubectl get node
NAME STATUS ROLES AGE VERSION
master NotReady control-plane,master 21m v1.22.4
node1 NotReady <none> 15m v1.22.4
node2 NotReady <none> 14m v1.22.4
The nodes are still NotReady, so check the pods:
yang@master:~$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-78fcd69978-drgk6 0/1 pending 0 4d1h
kube-system coredns-78fcd69978-tz4kt 0/1 pending 0 4d1h
kube-system etcd-master 1/1 Running 1 (3d19h ago) 4d1h
kube-system kube-apiserver-master 1/1 Running 1 (3d19h ago) 4d1h
kube-system kube-controller-manager-master 1/1 Running 3 3d19h
kube-system kube-flannel-ds-ksn8l 1/1 Running 0 3d20h
kube-system kube-flannel-ds-ld5jr 1/1 Running 1 (3d19h ago) 3d20h
kube-system kube-flannel-ds-wf2t2 1/1 Running 0 3d20h
kube-system kube-proxy-dv2sw 1/1 Running 0 4d
kube-system kube-proxy-jl7f5 1/1 Running 0 4d
kube-system kube-proxy-xn96j 1/1 Running 2 (3d19h ago) 4d1h
kube-system kube-scheduler-master 1/1 Running 3 3d19h
kube-system rancher-6fbc899b67-mzcvb 1/1 Running 0 2d21h
Cause: the coredns pods are stuck in Pending because no network add-on is installed.
Check the kubelet log with: journalctl -f -u kubelet
Nov 06 15:37:21 jupiter kubelet[86177]: W1106 15:37:21.482574 86177 cni.go:237] Unable to update cni config: no valid networks found in /etc/cni
Nov 06 15:37:25 jupiter kubelet[86177]: E1106 15:37:25.075839 86177 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Fix:
yang@master:/etc/docker$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

yang@master:~$ kubectl get pod --all-namespaces
NAMESPACE     NAME                              READY   STATUS    RESTARTS        AGE
kube-system   coredns-78fcd69978-drgk6          1/1     Running   0               4d1h
kube-system   coredns-78fcd69978-tz4kt          1/1     Running   0               4d1h
kube-system   etcd-master                       1/1     Running   1 (3d19h ago)   4d1h
kube-system   kube-apiserver-master             1/1     Running   1 (3d19h ago)   4d1h
kube-system   kube-controller-manager-master    1/1     Running   3               3d19h
kube-system   kube-flannel-ds-ksn8l             1/1     Running   0               3d20h
kube-system   kube-flannel-ds-ld5jr             1/1     Running   1 (3d19h ago)   3d20h
kube-system   kube-flannel-ds-wf2t2             1/1     Running   0               3d20h
kube-system   kube-proxy-dv2sw                  1/1     Running   0               4d
kube-system   kube-proxy-jl7f5                  1/1     Running   0               4d
kube-system   kube-proxy-xn96j                  1/1     Running   2 (3d19h ago)   4d1h
kube-system   kube-scheduler-master             1/1     Running   3               3d19h
kube-system   rancher-6fbc899b67-mzcvb          1/1     Running   0               2d21h
12. Check node status
yang@master:~$ kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 4d1h v1.22.4
node1 Ready <none> 4d v1.22.4
node2 Ready <none> 4d v1.22.4
The cluster deployment is now complete!
Kubernetes adoption in cloud-native environments keeps accelerating, but managing Kubernetes clusters that run anywhere in a unified, secure way is a real challenge, and good management tooling greatly reduces that burden. So how do you manage a cluster, and what graphical tools are available?
For example:
1. kubernetes-dashboard
Dashboard is a web-based Kubernetes user interface. You can use it to deploy containerized applications to a Kubernetes cluster, troubleshoot them, and manage cluster resources. Dashboard gives you an overview of the applications running in the cluster and lets you create or modify Kubernetes resources (Deployments, Jobs, DaemonSets and so on). For example, you can scale a Deployment, start a rolling update, restart a Pod, or deploy a new application with a wizard.
Dashboard also shows the state of the cluster's resources and any errors.
Official docs: https://kubernetes.io/zh/docs/tasks/access-application-cluster/web-ui-dashboard/
2. Kuboard
Kuboard is a free Kubernetes management tool with a rich feature set. Combined with an existing or new code repository, image registry and CI/CD tooling, it makes it easy to build a production-grade Kubernetes container platform and run cloud-native applications. You can also install Kuboard directly into an existing Kubernetes cluster and use its Kubernetes RBAC management UI to expose the cluster's capabilities to your development and test teams. Kuboard provides:
- Basic Kubernetes management
- Kubernetes troubleshooting
- Kubernetes storage management
- Authentication and authorization (paid)
- Kuboard-specific features
There is an online demo on the official site, so you can try it before installing.
Official site: https://www.kuboard.cn/
3. KubeSphere
KubeSphere is a distributed operating system for cloud-native applications built on top of Kubernetes. It supports multi-cloud and multi-cluster management, provides full-stack IT automation and operations capabilities, and simplifies enterprise DevOps workflows. Its architecture makes it easy to integrate third-party applications with cloud-native ecosystem components in a plug-and-play fashion.
As a full-stack container platform with multi-tenant management, KubeSphere offers an operations-friendly, wizard-style UI that helps enterprises quickly build a powerful, feature-rich container cloud platform. It covers the features most commonly required for enterprise Kubernetes, such as resource management, DevOps, multi-cluster deployment and management, application lifecycle management, microservice governance, log query and collection, services and networking, multi-tenancy, monitoring and alerting, event auditing, storage, access control, GPU support, network policies, image registry management and security management.
Official site: https://kubesphere.io/zh/
4. Rancher
Rancher is a container management platform built for companies that run containers. It simplifies working with Kubernetes so that developers can run Kubernetes everywhere, meet IT requirements, and empower DevOps teams.
These are currently the most popular cluster management tools; pick whichever you prefer.
I used Kuboard here.
When installing Kuboard into Kubernetes, the main decision is how to provide a persistent volume for its etcd. The two suggested options are:
- Use hostPath for persistence: schedule the etcd that Kuboard depends on onto the master node and map its data directory to a local directory on the master (recommended);
- Use a StorageClass to dynamically create a PV for etcd (not recommended).
1. Install
Install Kuboard v3 into the Kubernetes cluster:
kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
# You can also use the command below; the only difference is that it pulls the Kuboard images
# from the Huawei Cloud registry instead of Docker Hub
# kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3-swr.yaml
2. Wait for Kuboard v3 to become ready
Run watch kubectl get pods -n kuboard and wait until all pods in the kuboard namespace are ready.
If no kuboard-etcd-xxxxx container appears in the output, the installation has not completed correctly.
3. Access
Open http://your-node-ip-address:30080 in a browser.
- Enter the initial username and password and log in
  - Username: admin
  - Password: Kuboard123
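If the page does not load, it can help to confirm that the NodePort service is actually exposed (a quick check; the exact service names are whatever the kuboard-v3.yaml manifest creates):
kubectl get svc -n kuboard    # look for a NodePort service exposing 30080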
After logging in, create a cluster entry and add the deployed Kubernetes cluster to the management list as follows:
Fill in the cluster name, description and the join instructions (follow the prompts in the dialog), set Context to kubernetes and the ApiServer address to https://masterIP:6443, then confirm to add the cluster.
yang@master:~$ sudo cat /etc/kubernetes/admin.conf
Deploy a rancher workload; it gets scheduled onto node1 and is reachable as expected.
You can also open a terminal inside the container and query its logs, as shown below:
At this point the deployed Kubernetes cluster has been added to Kuboard, shows as healthy, and can schedule workloads normally.
Browser compatibility
- Use Chrome / Firefox / Safari / Edge or a similar browser
- IE and IE-based browsers are not supported
4. Add another cluster
- Kuboard v3 supports managing multiple Kubernetes clusters. On the Kuboard v3 home page, click the "Add Cluster" button and follow the wizard to add a cluster.
- When adding a new Kubernetes cluster to Kuboard v3, make sure that:
  - The new cluster can reach the internal IP of the current cluster's master node on ports 30080/TCP, 30081/TCP and 30081/UDP;
  - If the new cluster is not on the same LAN as the current cluster, contact the Kuboard team for help.
5. Uninstall
- Uninstall Kuboard v3:
kubectl delete -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
- Clean up leftover data:
On the master node and on every node labeled k8s.kuboard.cn/role=etcd, run rm -rf /usr/share/kuboard
Common commands:
1. List cluster nodes
kubectl get node
2. List all pods
kubectl get pod --all-namespaces
3. Show details of a pod
kubectl describe pod <pod-name> --namespace=NAMESPACE
4. Check component health
kubectl get cs
5. List pods in a specific namespace
kubectl get pods --namespace=test
6. List namespaces:
kubectl get namespace
7. Create a namespace named test
kubectl create namespace test
8. Set the preferred namespace for the current context
kubectl config set-context --current --namespace=test
9. Create resources in an existing namespace
kubectl apply -f pod.yaml --namespace=test
10. Open a shell inside a running pod
kubectl exec -it admin-frontend-server-74497cb64f-8fxk8 -- bash
11. Check component statuses
kubectl get componentstatuses
12. Let the master node also act as a worker node (remove the master taint)
kubectl taint nodes --all node-role.kubernetes.io/master-
13. Other commands
kubectl get pods -n kube-system
kubectl get pod --all-namespaces
kubectl get csr
kubectl get deployments
kubectl get pods -n kube-system -o wide --watch
kubectl describe pods weave-net-87t7g -n kube-system
14. Create resources from a deployment file
kubectl create -f deployment.yaml
15. Create a service from a file
kubectl create -f services.yaml
16. Check what was created
kubectl describe service **
kubectl describe deployment **
17. Delete a secret or a deployment
kubectl delete secret **
kubectl delete deployment **
18. Filter for specific resources:
kubectl get po | grep display
kubectl get svc | grep data
19. View service logs
kubectl logs ** -f --tail=20
kubectl logs ** --since=1h
20. Tear down the cluster
# To undo what kubeadm set up, first drain the node and make sure it is empty before removing it.
# Run on the master node:
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>
21. Then, on the node being removed, reset the kubeadm installation state:
kubeadm reset
# Reset Kubernetes
# Reference: https://www.jianshu.com/p/31f7dda9ccf7
sudo kubeadm reset
# After the reset, remove the CNI configuration
rm -rf /etc/cni/net.d
# Reset iptables
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
sysctl net.bridge.bridge-nf-call-iptables=1
# Remove the leftover CNI network interfaces
sudo ip link del cni0
sudo ip link del flannel.1