Deploying a Kubernetes 1.22.4 cluster on Ubuntu 20.04 with kubeadm, plus cluster management tools


I. Environment preparation:

Cluster version: Kubernetes 1.22.4

OS            Node IP        Role           RAM/CPU       Hostname
Ubuntu 20.04  192.168.1.101  Control plane  2 GB/4 cores  master
Ubuntu 20.04  192.168.1.102  Worker node 1  2 GB/4 cores  node1
Ubuntu 20.04  192.168.1.104  Worker node 2  2 GB/4 cores  node2

II. Pre-installation checks:

Note: run on all three machines ------------------------BEGIN----------------------------

1. The three machines can reach each other over the network.

2. The hostname, MAC address, and product_uuid (check with sudo cat /sys/class/dmi/id/product_uuid) must be unique on each of the three machines.
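Once the values are collected from each node, the uniqueness requirement can be checked mechanically. A minimal sketch; the hostnames and UUIDs below are made-up sample data, and in practice you would gather the real values with e.g. `ssh <node> sudo cat /sys/class/dmi/id/product_uuid`:

```shell
#!/bin/sh
# Sketch: verify collected product_uuid values are unique across nodes.
# Sample data only; replace with values gathered from your machines.
cat > /tmp/uuids.txt <<'EOF'
master 4C4C4544-004D-3510-8054-B7C04F565432
node1  4C4C4544-004D-3510-8054-B7C04F565433
node2  4C4C4544-004D-3510-8054-B7C04F565434
EOF
# Any value printed by `uniq -d` appears on more than one node.
dups=$(awk '{print $2}' /tmp/uuids.txt | sort | uniq -d)
if [ -z "$dups" ]; then
  echo "all product_uuids unique"
else
  echo "duplicate product_uuids: $dups"
fi
```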

3. Check connectivity on the required ports. For convenience I simply turned the firewall off here; do NOT do this in production!
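A quick reachability check can be scripted with bash's /dev/tcp; the port list below follows the control-plane ports kubeadm documents (6443 for the API server, 2379-2380 for etcd, 10250-10252 for kubelet/scheduler/controller-manager). This is a sketch: the loop targets localhost so it runs anywhere, and you would substitute each node's IP in your own topology:

```shell
#!/bin/bash
# Sketch: probe the ports kubeadm needs; "closed" before the cluster is up
# is expected, the point is to detect firewalled ports between real nodes.
check_port() {
  local host=$1 port=$2
  if timeout 1 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port open"
  else
    echo "$host:$port closed"
  fi
}
for port in 6443 2379 2380 10250 10251 10252; do
  check_port 127.0.0.1 "$port"   # replace 127.0.0.1 with each node's IP
done
```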

4. Disable swap (important!)

  Check swap:

yang@master:/etc/docker$ sudo free -m
[sudo] password for yang: 
              total        used        free      shared  buff/cache   available
Mem:           1959        1222          86           3         649         548
Swap:          2047           0        2047

  Temporarily disable swap:

yang@master:/etc/docker$ sudo swapoff -a

  Check swap again:

yang@master:/etc/docker$ sudo free -m
[sudo] password for yang: 
              total        used        free      shared  buff/cache   available
Mem:           1959        1222          86           3         649         548
Swap:             0           0           0

5. Remount the root filesystem read-write

yang@master:/etc/docker$ sudo mount -n -o remount,rw /

6. Comment out the swap line in /etc/fstab with # to disable the swap partition permanently

The swap partition borrows disk space to use as memory, so it performs much worse than real physical RAM.
Docker containers run in memory, and Kubernetes does not allow containers to run from swap, so the swap partition has to be turned off. Disabling swap is a performance requirement of Kubernetes.

yang@master:/etc/docker$ sudo nano /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda2 during curtin installation
/dev/disk/by-uuid/7006dc64-4b4b-41e7-a1ea-857c98683977 / ext4 defaults 0 1
#/swap.img none swap sw 0 0
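Instead of editing the file by hand, the swap entry can be commented out non-interactively with sed. A sketch, demonstrated on a sample copy so nothing real is modified; point FSTAB at /etc/fstab (run with sudo) to apply it for real. It assumes, as is standard, that the swap entry has "swap" as its filesystem-type field:

```shell
#!/bin/sh
# Sketch: comment out any active swap entry in an fstab-style file.
FSTAB=/tmp/fstab.sample
cat > "$FSTAB" <<'EOF'
/dev/disk/by-uuid/7006dc64-4b4b-41e7-a1ea-857c98683977 / ext4 defaults 0 1
/swap.img none swap sw 0 0
EOF
# Prefix '#' to uncommented lines that contain a whitespace-delimited "swap" field
sed -i -E '/^[^#].*\sswap\s/ s/^/#/' "$FSTAB"
cat "$FSTAB"
```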

7. Reboot the machine

yang@master:/etc/docker$ sudo reboot

III. Install a container runtime (Docker in this guide)

1. Install the latest Docker on Ubuntu 20.04 server

# Install Docker Engine on Ubuntu
# Remove old versions. Older releases of Docker were called docker, docker.io or docker-engine; uninstall them if present
sudo apt-get remove docker docker-engine docker.io containerd runc
# Set up the repository
# Update the apt package index and install packages that let apt use a repository over HTTPS
sudo apt-get update
sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg \
lsb-release
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Set up the stable repository
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine
# Update the apt package index and install the latest Docker Engine and containerd, or see the next step for a specific version
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

# Installing a specific Docker version (not needed in this guide)
# List the versions available in the repository
apt-cache madison docker-ce
# Install a specific version using the version string from the second column, e.g. 5:18.09.1~3-0~ubuntu-xenial
sudo apt-get install docker-ce=<VERSION_STRING> docker-ce-cli=<VERSION_STRING> containerd.io
# Note: replace <VERSION_STRING> with the desired version

2. Let non-root users run docker commands

sudo groupadd docker         # create the docker group
sudo gpasswd -a $USER docker # add the login user to the docker group
newgrp docker                # refresh group membership
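You can confirm the group change took effect (after re-login or `newgrp docker`) with a small check; a sketch:

```shell
#!/bin/sh
# Sketch: check whether the current user's effective groups include "docker".
if id -nG | grep -qw docker; then
  echo "current user is in the docker group"
else
  echo "current user is NOT in the docker group yet"
fi
```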

3. Verify the Docker installation

yang@master:/etc/docker$ docker version
Client: Docker Engine - Community
 Version:           20.10.11
 API version:       1.41
 Go version:        go1.16.9
 Git commit:        dea9396
 Built:             Thu Nov 18 00:37:06 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.11
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.9
  Git commit:       847da18
  Built:            Thu Nov 18 00:35:15 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

4. Check the socket

Check the Docker socket (Kubernetes searches this directory to detect the CRI):

yang@master:/etc/docker$ ls /var/run/docker.sock
/var/run/docker.sock

IV. Install kubectl, kubeadm and kubelet

1. Install curl and apt-transport-https

sudo apt-get update && sudo apt-get install -y apt-transport-https curl

2. Download the GPG key

sudo wget https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg

3. Add the GPG key

sudo apt-key add apt-key.gpg

4. Write the package source file

Note: if the file does not exist, create it with the content below. (With `sudo cat <<EOF > file` the redirection runs without root privileges, so pipe through sudo tee instead.)

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

5. Update the package index

sudo apt-get update

6. Install kubectl, kubeadm and kubelet

sudo apt-get install -y kubelet kubeadm kubectl

Note: run on all three machines ------------------------END----------------------------

V. Install the cluster

Note: run on the master

1. Check which images are needed

yang@master:/etc/docker$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.4
k8s.gcr.io/kube-controller-manager:v1.22.4
k8s.gcr.io/kube-scheduler:v1.22.4
k8s.gcr.io/kube-proxy:v1.22.4
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4

2. Pull the image files

Create a script and make it executable to keep things convenient:

sudo nano pull        # paste in the script below
sudo chmod +x pull
#!/bin/sh
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4
./pull

3. Check the downloaded images

yang@master:/etc/docker$ docker images
REPOSITORY                                                                    TAG        IMAGE ID       CREATED        SIZE
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver            v1.22.4    8a5cc299272d   11 days ago    128MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager   v1.22.4    0ce02f92d3e4   11 days ago    122MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler            v1.22.4    721ba97f54a6   11 days ago    52.7MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy                v1.22.4    edeff87e4802   11 days ago    104MB
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd                      3.5.0-0    004811815584   5 months ago   295MB
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns                   v1.8.4      8d147537fb7d   6 months ago   47.6MB
registry.cn-hangzhou.aliyuncs.com/google_containers/pause                     3.5        ed210e3e4a5b   8 months ago   683kB

4. Retag the images

sudo nano tag         # paste in the script below
sudo chmod +x tag
#!/bin/sh
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.4 k8s.gcr.io/kube-apiserver:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.4 k8s.gcr.io/kube-controller-manager:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.4 k8s.gcr.io/kube-scheduler:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4 k8s.gcr.io/kube-proxy:v1.22.4
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0 k8s.gcr.io/etcd:3.5.0-0
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4
./tag
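The pull and tag scripts above can also be collapsed into one loop, which avoids repeating the mirror prefix fourteen times. A sketch; the `echo` makes it a dry run that just prints each command, so delete `echo` to execute for real:

```shell
#!/bin/sh
# Sketch: pull from the mirror and retag to k8s.gcr.io in one pass (dry run).
MIRROR=registry.cn-hangzhou.aliyuncs.com/google_containers
for img in kube-apiserver:v1.22.4 kube-controller-manager:v1.22.4 \
           kube-scheduler:v1.22.4 kube-proxy:v1.22.4 \
           pause:3.5 etcd:3.5.0-0 coredns:v1.8.4; do
  # coredns is the one image whose target lives under k8s.gcr.io/coredns/
  case $img in
    coredns:*) target=k8s.gcr.io/coredns/coredns:${img#*:} ;;
    *)         target=k8s.gcr.io/$img ;;
  esac
  echo sudo docker pull "$MIRROR/$img"
  echo sudo docker tag "$MIRROR/$img" "$target"
done
```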

5. Check the retagged images

yang@master:/etc/docker$ docker images
REPOSITORY                                                                    TAG        IMAGE ID       CREATED        SIZE
k8s.gcr.io/kube-apiserver                                                     v1.22.4    8a5cc299272d   11 days ago    128MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver            v1.22.4    8a5cc299272d   11 days ago    128MB
k8s.gcr.io/kube-controller-manager                                            v1.22.4    0ce02f92d3e4   11 days ago    122MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager   v1.22.4    0ce02f92d3e4   11 days ago    122MB
k8s.gcr.io/kube-scheduler                                                     v1.22.4    721ba97f54a6   11 days ago    52.7MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler            v1.22.4    721ba97f54a6   11 days ago    52.7MB
k8s.gcr.io/kube-proxy                                                         v1.22.4    edeff87e4802   11 days ago    104MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy                v1.22.4    edeff87e4802   11 days ago    104MB
k8s.gcr.io/etcd                                                               3.5.0-0    004811815584   5 months ago   295MB
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd                      3.5.0-0    004811815584   5 months ago   295MB
k8s.gcr.io/coredns/coredns                                                    v1.8.4     8d147537fb7d   6 months ago   47.6MB
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns                   v1.8.4      8d147537fb7d   6 months ago   47.6MB
k8s.gcr.io/pause                                                              3.5        ed210e3e4a5b   8 months ago   683kB
registry.cn-hangzhou.aliyuncs.com/google_containers/pause                     3.5        ed210e3e4a5b   8 months ago   683kB

6. Configure the cluster

Initialize the control plane.

Run on the master (note that --ignore-preflight-errors=all silences every preflight failure; use it with care):

sudo kubeadm init --apiserver-advertise-address=192.168.1.101 --kubernetes-version=v1.22.4 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=all --v=6

When initialization completes, the output shows:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube

  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.

Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:

  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.101:6443 --token zbsafp.yliab4onvxpwdmxx \
    --discovery-token-ca-cert-hash sha256:5e2e9d7c76cce5e14897138979d7b397311fd632a1618920a29582ca6d2523b3

Copy the kubeconfig into place as instructed above:

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

7. Join the worker nodes to the cluster

Notes:

a. Each node needs the kube-proxy:v1.22.4, pause:3.5 and coredns:v1.8.4 images, retagged as above.

b. Update daemon.json on each node.

Join command:

kubeadm join 192.168.1.101:6443 --token zbsafp.yliab4onvxpwdmxx \
    --discovery-token-ca-cert-hash sha256:5e2e9d7c76cce5e14897138979d7b397311fd632a1618920a29582ca6d2523b3

Error 1 when adding a node

Note: the fix can be applied on each node in advance (skip if no error occurs).

yang@node2:~$ sudo kubeadm join 192.168.1.101:6443 --token 6131nu.8ohxo1ttgwiqlwmp --discovery-token-ca-cert-hash sha256:9bece23d1089b6753a42ce4dab3fa5ac7d2d4feb260a0f682bfb06ccf1eb4fe2
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher

Cause:

1. Docker's default cgroup driver is cgroupfs, a virtual filesystem type developed as the user-facing interface to cgroups.

 Like sysfs and proc, it exposes the cgroup hierarchy to user space and relays the user's cgroup changes to the kernel;

 cgroup queries and modifications go through the cgroupfs filesystem.

2. Kubernetes recommends systemd instead of cgroupfs.

 When systemd acts as the cgroup manager, assigning cgroups to every process,

 while Docker's cgroup driver stays on cgroupfs, two cgroup managers run side by side,

 which can lead to instability when resources come under pressure.

Solution:

①. Create /etc/docker/daemon.json with the following content:
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF
②. Restart docker
sudo systemctl restart docker
③. Restart kubelet
sudo systemctl restart kubelet
sudo systemctl status kubelet
④. Re-run the join; first clear the state left by the previous attempt
sudo kubeadm reset
⑤. Join the cluster
sudo kubeadm join 192.168.1.101:6443 --token 6131nu.8ohxo1ttgwiqlwmp --discovery-token-ca-cert-hash sha256:9bece23d1089b6753a42ce4dab3fa5ac7d2d4feb260a0f682bfb06ccf1eb4fe2
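A syntax error in daemon.json prevents the docker daemon from starting at all, so it is worth validating the file before the restart. A sketch, run here against a sample copy (assuming python3 is available for the JSON check); point it at /etc/docker/daemon.json on a real node:

```shell
#!/bin/sh
# Sketch: validate daemon.json before restarting docker.
FILE=/tmp/daemon.json.sample
cat > "$FILE" <<'EOF'
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF
if python3 -m json.tool "$FILE" > /dev/null 2>&1; then
  echo "daemon.json: valid JSON"
else
  echo "daemon.json: INVALID JSON"
fi
```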

 

Error 2

Note: each node needs the kube-proxy image.

Problem:

Normal BackOff 106s (x249 over 62m) kubelet Back-off pulling image "k8s.gcr.io/kube-proxy:v1.22.4"

Cause:
The k8s.gcr.io/kube-proxy:v1.22.4 image is missing.

Solution:

① Pull the image

sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4

② Retag it

sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.4 k8s.gcr.io/kube-proxy:v1.22.4

 

Error 3

Note: each node needs the pause:3.5 image.

Problem:

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17m default-scheduler Successfully assigned kube-system/kube-proxy-dv2sw to node2
Warning FailedCreatePodSandBox 4m48s (x26 over 16m) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.5": Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning FailedCreatePodSandBox 93s (x2 over 16m) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.5": Error response from daemon: Get "https://k8s.gcr.io/v2/": dial tcp 74.125.195.82:443: i/o timeout

Cause:
The k8s.gcr.io/pause:3.5 image is missing.

Solution:

① Pull the image

sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5

② Retag it

sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5

 

Error 4

Note: each node needs the coredns:v1.8.4 image.

Problem:

Warning Failed 28m (x11 over 73m) kubelet Failed to pull image "k8s.gcr.io/coredns/coredns:v1.8.4"
: rpc error: code = Unknown desc = Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

A pitfall here: the image tag is v1.8.4, not 1.8.4, so using the wrong tag makes the pull time out.
Check the upstream image list:

yang@master:/etc/docker$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.4
k8s.gcr.io/kube-controller-manager:v1.22.4
k8s.gcr.io/kube-scheduler:v1.22.4
k8s.gcr.io/kube-proxy:v1.22.4
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4

Solution:

① Pull the image

sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4

② Retag it

sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4


8. Check component health

At this point controller-manager and scheduler report Unhealthy:

yang@master:/etc/docker$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0               Healthy     {"health":"true","reason":""}
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused

9. Edit the two manifests below

In each file, comment out this line: # - --port=0

yang@master:/etc/docker$ sudo nano /etc/kubernetes/manifests/kube-controller-manager.yaml                                            
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
      #- --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: k8s.gcr.io/kube-controller-manager:v1.22.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
yang@master:/etc/docker$ sudo nano /etc/kubernetes/manifests/kube-scheduler.yaml                                                  
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
      #- --port=0
    image: k8s.gcr.io/kube-scheduler:v1.22.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10

10. Check health again

Ports 10251 and 10252 are now open and the health checks pass.

yang@master:/etc/docker$ kubectl get cs 
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
scheduler            Healthy   ok                              
etcd-0               Healthy   {"health":"true","reason":""}   
controller-manager   Healthy   ok  

11. Node status

yang@master:~$ kubectl get node
NAME     STATUS     ROLES                  AGE   VERSION
master   NotReady   control-plane,master   21m   v1.22.4
node1    NotReady   <none>                 15m   v1.22.4
node2    NotReady   <none>                 14m   v1.22.4

The nodes are still NotReady, so check the pods:

yang@master:~$ kubectl get pod --all-namespaces
NAMESPACE     NAME                               READY   STATUS    RESTARTS        AGE
kube-system   coredns-78fcd69978-drgk6           0/1     Pending   0               4d1h
kube-system   coredns-78fcd69978-tz4kt           0/1     Pending   0               4d1h
kube-system   etcd-master                        1/1     Running   1 (3d19h ago)   4d1h
kube-system   kube-apiserver-master              1/1     Running   1 (3d19h ago)   4d1h
kube-system   kube-controller-manager-master     1/1     Running   3               3d19h
kube-system   kube-flannel-ds-ksn8l              1/1     Running   0               3d20h
kube-system   kube-flannel-ds-ld5jr              1/1     Running   1 (3d19h ago)   3d20h
kube-system   kube-flannel-ds-wf2t2              1/1     Running   0               3d20h
kube-system   kube-proxy-dv2sw                   1/1     Running   0               4d
kube-system   kube-proxy-jl7f5                   1/1     Running   0               4d
kube-system   kube-proxy-xn96j                   1/1     Running   2 (3d19h ago)   4d1h
kube-system   kube-scheduler-master              1/1     Running   3               3d19h
kube-system   rancher-6fbc899b67-mzcvb           1/1     Running   0               2d21h

Cause: the coredns pods are Pending because the cluster has no network add-on yet.

Check with: journalctl -f -u kubelet
Nov 06 15:37:21 jupiter kubelet[86177]: W1106 15:37:21.482574 86177 cni.go:237] Unable to update cni config: no valid networks found in /etc/cni
Nov 06 15:37:25 jupiter kubelet[86177]: E1106 15:37:25.075839 86177 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Solution:

yang@master:/etc/docker$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
yang@master:~$ kubectl get pod --all-namespaces
NAMESPACE     NAME                               READY   STATUS    RESTARTS        AGE
kube-system   coredns-78fcd69978-drgk6           1/1     Running   0               4d1h
kube-system   coredns-78fcd69978-tz4kt           1/1     Running   0               4d1h
kube-system   etcd-master                        1/1     Running   1 (3d19h ago)   4d1h
kube-system   kube-apiserver-master              1/1     Running   1 (3d19h ago)   4d1h
kube-system   kube-controller-manager-master     1/1     Running   3               3d19h
kube-system   kube-flannel-ds-ksn8l              1/1     Running   0               3d20h
kube-system   kube-flannel-ds-ld5jr              1/1     Running   1 (3d19h ago)   3d20h
kube-system   kube-flannel-ds-wf2t2              1/1     Running   0               3d20h
kube-system   kube-proxy-dv2sw                   1/1     Running   0               4d
kube-system   kube-proxy-jl7f5                   1/1     Running   0               4d
kube-system   kube-proxy-xn96j                   1/1     Running   2 (3d19h ago)   4d1h
kube-system   kube-scheduler-master              1/1     Running   3               3d19h
kube-system   rancher-6fbc899b67-mzcvb           1/1     Running   0               2d21h

12. Check node status again

yang@master:~$ kubectl get node
NAME     STATUS   ROLES                  AGE    VERSION
master   Ready    control-plane,master   4d1h   v1.22.4
node1    Ready    <none>                 4d     v1.22.4
node2    Ready    <none>                 4d     v1.22.4

With that, the cluster deployment is complete!

 

Kubernetes adoption in cloud-native environments keeps accelerating, but managing clusters that run anywhere in a unified, secure way remains a challenge, and a good management tool greatly reduces the difficulty. So how do you manage a cluster, and what graphical tools are available?

For example:

1. kubernetes-dashboard

Dashboard is a web-based Kubernetes user interface. You can use it to deploy containerized applications to a Kubernetes cluster, troubleshoot them, and manage cluster resources. It gives you an overview of the applications running in the cluster and lets you create or modify Kubernetes resources (Deployments, Jobs, DaemonSets and so on). For example, you can scale a Deployment, trigger a rolling update, restart a Pod, or deploy a new application with a wizard.

Dashboard also surfaces the state of the cluster's resources and any errors.

Official docs: https://kubernetes.io/zh/docs/tasks/access-application-cluster/web-ui-dashboard/

2. kuboard

Kuboard is a free Kubernetes management tool with a rich feature set. Combined with existing or new code repositories, image registries and CI/CD tooling, it makes it easy to build a production-ready Kubernetes container platform and run cloud-native applications. You can also install Kuboard directly into an existing cluster and use its Kubernetes RBAC management UI to expose cluster capabilities to your dev/test teams. Kuboard provides:

  1. Basic Kubernetes management
  2. Kubernetes troubleshooting
  3. Kubernetes storage management
  4. Authentication and authorization (paid)
  5. Kuboard-specific features

There is an online demo on the official site, so you can try it before installing.

Website: https://www.kuboard.cn/

3. kubesphere

KubeSphere is a distributed operating system for cloud-native applications built on top of Kubernetes. It supports multi-cloud and multi-cluster management, provides full-stack IT automation and operations capabilities, and simplifies enterprise DevOps workflows. Its architecture makes plug-and-play integration of third-party applications with cloud-native ecosystem components straightforward.

As a full-stack container platform with multi-tenant management, KubeSphere offers an operations-friendly, wizard-style UI that helps enterprises quickly build a powerful, feature-rich container platform. It covers the functionality most commonly needed for enterprise Kubernetes: resource management, DevOps, multi-cluster deployment and management, application lifecycle management, microservice governance, log query and collection, services and networking, multi-tenancy, monitoring and alerting, event auditing, storage, access control, GPU support, network policy, image registry management, security management and more.

Website: https://kubesphere.io/zh/

4. Rancher

Rancher is a container management platform built for companies that run containers. It simplifies working with Kubernetes: developers can run Kubernetes everywhere, meet IT requirements, and empower DevOps teams.

Website: https://www.rancher.cn/

These are among the most popular cluster management tools today; pick whichever you like.

 

I used Kuboard here.

When installing Kuboard into Kubernetes, the main question is how to provide a persistent volume for its etcd. The two suggested options are:

  1. Use hostPath for persistence: deploy the etcd that Kuboard depends on to the master node and map etcd's data directory to a local directory on the master (recommended).
  2. Use a StorageClass to dynamically create a PV as etcd's data volume (not recommended).

1. Install

Install Kuboard v3 into the cluster:

kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
# Alternatively, use the command below; the only difference is that it serves Kuboard's images from Huawei Cloud's registry instead of Docker Hub
# kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3-swr.yaml

2. Wait for Kuboard v3 to become ready

Run watch kubectl get pods -n kuboard and wait until every Pod in the kuboard namespace is ready.

If no kuboard-etcd-xxxxx pod appears in the output, revisit the etcd storage options above.

3. Access

Open http://your-node-ip-address:30080 in a browser.

  • Log in with the initial credentials:

    • Username: admin
    • Password: Kuboard123

After logging in, create a cluster entry to bring the freshly deployed Kubernetes cluster under management:

Fill in the cluster name, a description, and the import instruction (follow the prompts in the dialog), with Context: kubernetes and the ApiServer address https://masterIP:6443, then confirm to add the cluster.

The admin kubeconfig needed for the import can be printed with:

yang@master:~$ sudo cat /etc/kubernetes/admin.conf

 

I then deployed a Rancher workload; it was scheduled to node1 and is reachable.

You can also open a shell in a container, use the web terminal, and query logs, as shown below:

At this point the deployed Kubernetes cluster is registered in Kuboard, reports healthy, and schedules workloads normally.

 

Browser compatibility

  • Use Chrome / Firefox / Safari / Edge or a similar browser
  • IE and IE-based browsers are not supported

4. Add another cluster

  • Kuboard v3 supports multi-cluster Kubernetes management. On the Kuboard v3 home page, click the "Add Cluster" button and follow the wizard.
  • When adding a new Kubernetes cluster to Kuboard v3, make sure:
    • the new cluster can reach ports 30080/TCP, 30081/TCP and 30081/UDP on the current cluster master node's internal IP;
    • if the new cluster is not on the same LAN as the current cluster, contact the Kuboard team for help.

5. Uninstall

  • Uninstall Kuboard v3:

    kubectl delete -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
  • Clean up leftover data

    On the master node and every node labeled k8s.kuboard.cn/role=etcd, run

    rm -rf /usr/share/kuboard

Common commands:

1. List nodes
kubectl get node
2. List all pods
kubectl get pod --all-namespaces
3. Show pod details
kubectl describe pod <pod-name> --namespace=<namespace>

4. Check component health

kubectl get cs

5. List pods in a given namespace
kubectl get pods --namespace=test

6. List namespaces
kubectl get namespace

7. Create a namespace named test
kubectl create namespace test

8. Set the preferred namespace for the current context
kubectl config set-context --current --namespace=test

9. Create resources in an existing namespace
kubectl apply -f pod.yaml --namespace=test

10. Open a shell in a running pod
kubectl exec -it admin-frontend-server-74497cb64f-8fxk8 -- bash

11. Show the status of each component
kubectl get componentstatuses

12. Allow the master node to also act as a worker node
kubectl taint nodes --all node-role.kubernetes.io/master-

13. Other commands
kubectl get pods -n kube-system
kubectl get pod --all-namespaces
kubectl get csr
kubectl get deployments
kubectl get pods -n kube-system -o wide --watch
kubectl describe pods weave-net-87t7g -n kube-system

14. Create pods from a manifest
kubectl create -f deployment.yaml

15. Create services from a manifest
kubectl create -f services.yaml

16. Inspect what was created
kubectl describe service **
kubectl describe deployment **


17. Delete a secret or deployment
kubectl delete secret **
kubectl delete deployment **

18. Filter for specific resources
kubectl get po | grep display
kubectl get svc | grep data

19. View service logs
kubectl logs ** -f --tail=20
kubectl logs ** --since=1h

20. Tear down the cluster
# To undo what kubeadm did, first drain each node and make sure it is empty before shutting it down.
# On the master, run:
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>

21. Then, on the node being removed, reset kubeadm's installed state:
kubeadm reset

# Reset Kubernetes
# Reference: https://www.jianshu.com/p/31f7dda9ccf7
sudo kubeadm reset
# After the reset, remove the CNI configuration
rm -rf /etc/cni/net.d
# Reset iptables
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
sysctl net.bridge.bridge-nf-call-iptables=1
# Remove the CNI network interfaces
sudo ip link del cni0
sudo ip link del flannel.1

 

 

 

