Installing Kubernetes (k8s) v1.16.0 on CentOS 7 in a mainland-China network environment


Note: Kubernetes itself deploys successfully with these steps, but managing the cluster through the Dashboard UI never quite worked for me; if you need the Dashboard, consider installing 1.16.3 instead.

1. Why k8s v1.16.0?

I tried the latest v1.16.2 and could never complete the installation: after running kubeadm init it threw many errors, such as "node xxx not found". I reinstalled CentOS 7 several times and still could not solve it; after a full day I almost gave up. Most of the installation tutorials I could find online targeted v1.16.0. I did not really believe v1.16.2 itself was the problem, so I had not planned to downgrade; with no other options I tried v1.16.0, and it worked. I am recording the process here so that others can avoid the same pitfalls.

The major installation steps in this article:

  • Install docker-ce 18.09.9 (all machines)
  • Set up the k8s prerequisites (all machines)
  • Install the k8s v1.16.0 master (management) node
  • Install the k8s v1.16.0 worker node
  • Install flannel (master)

One important step: note the IP addresses your master and node use to talk to each other. My master is 192.168.237.143 and my node is 192.168.237.144. Make sure the two machines can ping each other on these IPs; the master IP 192.168.237.143 is needed later when configuring k8s.
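A quick sanity check before continuing (these are my IPs; substitute your own):

# on the master (192.168.237.143)
ping -c 3 192.168.237.144

# on the node (192.168.237.144)
ping -c 3 192.168.237.143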

My environment:

  • Host OS: Windows 10
  • Hypervisor: VirtualBox
  • Linux distribution: CentOS 7
  • Linux kernel (check with uname -r): 3.10.0-957.el7.x86_64
  • IP the master and node communicate over (master): 192.168.237.143
  • Set the hostname on the master: hostnamectl --static set-hostname k8s-master
  • Set the hostname on the node: hostnamectl --static set-hostname k8s-node1
  • Edit /etc/hosts (vi /etc/hosts) on both machines and add the entries below (see the consolidated sketch after this list):
  • 192.168.237.143   k8s-master
  • 192.168.237.144   k8s-node1
  • 127.0.0.1   k8s-master
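A consolidated, runnable sketch of the host setup above (hostnames and IPs are from my environment; substitute your own):

# on the master
hostnamectl --static set-hostname k8s-master

# on the node
hostnamectl --static set-hostname k8s-node1

# on both machines, append the name-resolution entries
cat <<EOF >> /etc/hosts
192.168.237.143   k8s-master
192.168.237.144   k8s-node1
EOF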

2. Install docker-ce 18.09.9 (all machines)

Every machine that will run k8s needs Docker. The commands:

# Install the tools Docker needs

yum install -y yum-utils device-mapper-persistent-data lvm2

# Configure the Aliyun Docker repo

yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# Install this specific docker-ce version

yum install -y docker-ce-18.09.9-3.el7

# Enable and start Docker

systemctl enable docker && systemctl start docker
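To confirm Docker is installed and running, a quick optional check (the cgroup line matters later if you hit the systemd/cgroupfs driver mismatch described in the troubleshooting sections):

docker --version

systemctl is-active docker

docker info | grep -i cgroup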

3. Set up the k8s prerequisites (all machines)

Machines running k8s need at least 2 CPUs and 2 GB of RAM, which is easy to arrange in the VM settings. Then run the commands below to prepare the system. Every machine that will run k8s needs this step.

# Turn off the firewall

systemctl disable firewalld

systemctl stop firewalld

# Turn off SELinux

# Disable SELinux temporarily

setenforce 0

# Disable it permanently by editing /etc/sysconfig/selinux

sed -i 's/SELINUX=permissive/SELINUX=disabled/' /etc/sysconfig/selinux

sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

# Turn off swap

swapoff -a

# Disable swap permanently by commenting out the swap line in /etc/fstab

sed -i 's/.*swap.*/#&/' /etc/fstab

# Adjust kernel parameters

cat <<EOF >  /etc/sysctl.d/k8s.conf

net.bridge.bridge-nf-call-ip6tables = 1

net.bridge.bridge-nf-call-iptables = 1

EOF

sysctl --system
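On some CentOS 7 systems the two net.bridge settings only take effect once the br_netfilter kernel module is loaded; if sysctl --system complains that the keys do not exist, load the module first (a hedged extra step, not in the original write-up):

modprobe br_netfilter

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables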

4. Install the k8s v1.16.0 master (management) node

If Docker is not installed yet, go back to step 2, Install docker-ce 18.09.9 (all machines).
If the k8s prerequisites are not set up, go back to step 3, Set up the k8s prerequisites (all machines).
With both checked, continue with the steps below.

1. Install kubeadm, kubelet, and kubectl

kubeadm — the command-line tool that bootstraps the k8s cluster

kubelet — the agent that runs on every node and manages its pods and containers

kubectl — the command-line tool for operating the cluster

The official k8s repo is hosted on Google's servers and unreachable from mainland China, so we use the Aliyun yum mirror instead.

# Configure the Aliyun k8s repo

cat <<EOF > /etc/yum.repos.d/kubernetes.repo

[kubernetes]

name=Kubernetes

baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/

enabled=1

gpgcheck=1

repo_gpgcheck=1

gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

EOF
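To confirm the mirror actually serves the pinned version before installing (an optional check):

yum list --showduplicates kubeadm kubelet kubectl | grep 1.16.0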

# Install kubeadm, kubectl, and kubelet

yum install -y kubectl-1.16.0-0 kubeadm-1.16.0-0 kubelet-1.16.0-0

# Enable and start the kubelet service

systemctl enable kubelet && systemctl start kubelet

2. Initialize k8s

The command below pulls the Docker images k8s needs. Since the official registry is unreachable from China, it uses the Aliyun mirror (registry.aliyuncs.com/google_containers). Another crucial point: --apiserver-advertise-address must be the IP on which the master and nodes can ping each other; mine is 192.168.237.143, and getting this wrong cost me a whole evening at first, so substitute your own IP before running. The command will look stuck at `[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'` for about two minutes; be patient.

# Pulls the 6 Docker images the management node uses; check them later with docker images

# Takes roughly two minutes; it will look stuck at [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'

kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.16.0 --apiserver-advertise-address 192.168.237.143 --pod-network-cidr=10.244.0.0/16 --token-ttl 0
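As the preflight message says, you can also pre-pull the images beforehand so kubeadm init itself goes faster. A sketch using the same mirror and version as the init command above:

kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.16.0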

When the command above finishes, k8s prints the following commands; copy, paste, and run them.

# After init completes, k8s prompts you to run these

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

3. Save the command for nodes to join the cluster

When kubeadm init succeeds it prints the command a node must run to join the cluster. Save it for the node step later; if you lose it, regenerate it with:

kubeadm token create --print-join-command

That completes the master installation. A kubectl get nodes check will show the master in NotReady status; ignore that for now.
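The output at this point looks roughly like this (age will differ):

kubectl get nodes

NAME         STATUS     ROLES    AGE   VERSION
k8s-master   NotReady   master   2m    v1.16.0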

 

5. Install the k8s v1.16.0 worker node

If Docker is not installed yet, go back to step 2, Install docker-ce 18.09.9 (all machines).
If the k8s prerequisites are not set up, go back to step 3, Set up the k8s prerequisites (all machines).
With both checked, continue with the steps below.

1. Install kubeadm and kubelet

# Configure the Aliyun k8s repo

cat <<EOF > /etc/yum.repos.d/kubernetes.repo

[kubernetes]

name=Kubernetes

baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/

enabled=1

gpgcheck=1

repo_gpgcheck=1

gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

EOF

# Install kubeadm and kubelet

yum install -y  kubeadm-1.16.0-0 kubelet-1.16.0-0

# Enable and start the kubelet service

systemctl enable kubelet && systemctl start kubelet

2. Join the cluster

The join command is unique per cluster; log in to the master and run kubeadm token create --print-join-command to get yours, then execute it like this:

# Join the cluster; if you don't have the command, run kubeadm token create --print-join-command on the master to get it

kubeadm join 192.168.237.143:6443 --token ncfrid.7ap0xiseuf97gikl \

    --discovery-token-ca-cert-hash sha256:47783e9851a1a517647f1986225f104e81dbfd8fb256ae55ef6d68ce9334c6a2

After a successful join, you can see the new node on the master with kubectl get nodes.

3. Removing a node

First drain the bogon node to release its resources:

kubectl drain bogon --delete-local-data --force --ignore-daemonsets

Then delete the bogon node:

kubectl delete node bogon

Check the nodes:

kubectl get nodes

No resources found.
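If you plan to re-join the removed node later, it is worth wiping its local kubeadm state first. A hedged cleanup sketch, to run on the removed node itself (not part of the original write-up):

# run on the removed node
kubeadm reset

rm -rf /etc/cni/net.d $HOME/.kube/config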

 

6. Install flannel (master machine)

With the steps above the cluster machines are up, but the nodes are still in NotReady status; the master machine needs flanneld installed.

 

1. Download the official flannel config file

Download it with wget from https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml. That address is not reachable from mainland China, so to keep this section short I copied the content into the appendix (section 8) at the end of this article. The yml also references an image registry unreachable from China (quay.io); I have already changed it to a mirror that works domestically (quay-mirror.qiniu.com). Create a new kube-flannel.yml file and paste in that content.
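If your network can reach raw.githubusercontent.com (for example through a proxy), the same result can be produced directly; a sketch of the download plus the mirror substitution described above:

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

sed -i 's#quay.io#quay-mirror.qiniu.com#g' kube-flannel.yml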

2. Install flannel

kubectl apply -f kube-flannel.yml
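To watch flannel come up and the nodes turn Ready (the app=flannel label comes from the manifest in the appendix):

kubectl get pods -n kube-system -l app=flannel

kubectl get nodes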

7. Done

With that, the k8s cluster is built: the nodes now show Ready status. Done!

 

8. Appendix

This is the content of the kube-flannel.yml file, with every unreachable address (quay.io) already replaced by a mirror reachable from China (quay-mirror.qiniu.com). Create a new kube-flannel.yml file and paste this content in.

---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:

  name: psp.flannel.unprivileged

  annotations:

    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default

    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default

    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default

    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:

  privileged: false

  volumes:

    - configMap

    - secret

    - emptyDir

    - hostPath

  allowedHostPaths:

    - pathPrefix: "/etc/cni/net.d"

    - pathPrefix: "/etc/kube-flannel"

    - pathPrefix: "/run/flannel"

  readOnlyRootFilesystem: false

  # Users and groups

  runAsUser:

    rule: RunAsAny

  supplementalGroups:

    rule: RunAsAny

  fsGroup:

    rule: RunAsAny

  # Privilege Escalation

  allowPrivilegeEscalation: false

  defaultAllowPrivilegeEscalation: false

  # Capabilities

  allowedCapabilities: ['NET_ADMIN']

  defaultAddCapabilities: []

  requiredDropCapabilities: []

  # Host namespaces

  hostPID: false

  hostIPC: false

  hostNetwork: true

  hostPorts:

  - min: 0

    max: 65535

  # SELinux

  seLinux:

    # SELinux is unused in CaaSP

    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:

  name: flannel
rules:

  - apiGroups: ['extensions']

    resources: ['podsecuritypolicies']

    verbs: ['use']

    resourceNames: ['psp.flannel.unprivileged']

  - apiGroups:

      - ""

    resources:

      - pods

    verbs:

      - get

  - apiGroups:

      - ""

    resources:

      - nodes

    verbs:

      - list

      - watch

  - apiGroups:

      - ""

    resources:

      - nodes/status

    verbs:

      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:

  name: flannel
roleRef:

  apiGroup: rbac.authorization.k8s.io

  kind: ClusterRole

  name: flannel
subjects:
- kind: ServiceAccount

  name: flannel

  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:

  name: flannel

  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:

  name: kube-flannel-cfg

  namespace: kube-system

  labels:

    tier: node

    app: flannel
data:

  cni-conf.json: |

    {

      "name": "cbr0",

      "cniVersion": "0.3.1",

      "plugins": [

        {

          "type": "flannel",

          "delegate": {

            "hairpinMode": true,

            "isDefaultGateway": true

          }

        },

        {

          "type": "portmap",

          "capabilities": {

            "portMappings": true

          }

        }

      ]

    }

  net-conf.json: |

    {

      "Network": "10.244.0.0/16",

      "Backend": {

        "Type": "vxlan"

      }

    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:

  name: kube-flannel-ds-amd64

  namespace: kube-system

  labels:

    tier: node

    app: flannel
spec:

  selector:

    matchLabels:

      app: flannel

  template:

    metadata:

      labels:

        tier: node

        app: flannel

    spec:

      affinity:

        nodeAffinity:

          requiredDuringSchedulingIgnoredDuringExecution:

            nodeSelectorTerms:

              - matchExpressions:

                  - key: beta.kubernetes.io/os

                    operator: In

                    values:

                      - linux

                  - key: beta.kubernetes.io/arch

                    operator: In

                    values:

                      - amd64

      hostNetwork: true

      tolerations:

      - operator: Exists

        effect: NoSchedule

      serviceAccountName: flannel

      initContainers:

      - name: install-cni

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64

        command:

        - cp

        args:

        - -f

        - /etc/kube-flannel/cni-conf.json

        - /etc/cni/net.d/10-flannel.conflist

        volumeMounts:

        - name: cni

          mountPath: /etc/cni/net.d

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      containers:

      - name: kube-flannel

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64

        command:

        - /opt/bin/flanneld

        args:

        - --ip-masq

        - --kube-subnet-mgr

        resources:

          requests:

            cpu: "100m"

            memory: "50Mi"

          limits:

            cpu: "100m"

            memory: "50Mi"

        securityContext:

          privileged: false

          capabilities:

            add: ["NET_ADMIN"]

        env:

        - name: POD_NAME

          valueFrom:

            fieldRef:

              fieldPath: metadata.name

        - name: POD_NAMESPACE

          valueFrom:

            fieldRef:

              fieldPath: metadata.namespace

        volumeMounts:

        - name: run

          mountPath: /run/flannel

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      volumes:

        - name: run

          hostPath:

            path: /run/flannel

        - name: cni

          hostPath:

            path: /etc/cni/net.d

        - name: flannel-cfg

          configMap:

            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:

  name: kube-flannel-ds-arm64

  namespace: kube-system

  labels:

    tier: node

    app: flannel
spec:

  selector:

    matchLabels:

      app: flannel

  template:

    metadata:

      labels:

        tier: node

        app: flannel

    spec:

      affinity:

        nodeAffinity:

          requiredDuringSchedulingIgnoredDuringExecution:

            nodeSelectorTerms:

              - matchExpressions:

                  - key: beta.kubernetes.io/os

                    operator: In

                    values:

                      - linux

                  - key: beta.kubernetes.io/arch

                    operator: In

                    values:

                      - arm64

      hostNetwork: true

      tolerations:

      - operator: Exists

        effect: NoSchedule

      serviceAccountName: flannel

      initContainers:

      - name: install-cni

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-arm64

        command:

        - cp

        args:

        - -f

        - /etc/kube-flannel/cni-conf.json

        - /etc/cni/net.d/10-flannel.conflist

        volumeMounts:

        - name: cni

          mountPath: /etc/cni/net.d

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      containers:

      - name: kube-flannel

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-arm64

        command:

        - /opt/bin/flanneld

        args:

        - --ip-masq

        - --kube-subnet-mgr

        resources:

          requests:

            cpu: "100m"

            memory: "50Mi"

          limits:

            cpu: "100m"

            memory: "50Mi"

        securityContext:

          privileged: false

          capabilities:

             add: ["NET_ADMIN"]

        env:

        - name: POD_NAME

          valueFrom:

            fieldRef:

              fieldPath: metadata.name

        - name: POD_NAMESPACE

          valueFrom:

            fieldRef:

              fieldPath: metadata.namespace

        volumeMounts:

        - name: run

          mountPath: /run/flannel

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      volumes:

        - name: run

          hostPath:

            path: /run/flannel

        - name: cni

          hostPath:

            path: /etc/cni/net.d

        - name: flannel-cfg

          configMap:

            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:

  name: kube-flannel-ds-arm

  namespace: kube-system

  labels:

    tier: node

    app: flannel
spec:

  selector:

    matchLabels:

      app: flannel

  template:

    metadata:

      labels:

        tier: node

        app: flannel

    spec:

      affinity:

        nodeAffinity:

          requiredDuringSchedulingIgnoredDuringExecution:

            nodeSelectorTerms:

              - matchExpressions:

                  - key: beta.kubernetes.io/os

                    operator: In

                    values:

                      - linux

                  - key: beta.kubernetes.io/arch

                    operator: In

                    values:

                      - arm

      hostNetwork: true

      tolerations:

      - operator: Exists

        effect: NoSchedule

      serviceAccountName: flannel

      initContainers:

      - name: install-cni

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-arm

        command:

        - cp

        args:

        - -f

        - /etc/kube-flannel/cni-conf.json

        - /etc/cni/net.d/10-flannel.conflist

        volumeMounts:

        - name: cni

          mountPath: /etc/cni/net.d

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      containers:

      - name: kube-flannel

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-arm

        command:

        - /opt/bin/flanneld

        args:

        - --ip-masq

        - --kube-subnet-mgr

        resources:

          requests:

            cpu: "100m"

            memory: "50Mi"

          limits:

            cpu: "100m"

            memory: "50Mi"

        securityContext:

          privileged: false

          capabilities:

             add: ["NET_ADMIN"]

        env:

        - name: POD_NAME

          valueFrom:

            fieldRef:

              fieldPath: metadata.name

        - name: POD_NAMESPACE

          valueFrom:

            fieldRef:

              fieldPath: metadata.namespace

        volumeMounts:

        - name: run

          mountPath: /run/flannel

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      volumes:

        - name: run

          hostPath:

            path: /run/flannel

        - name: cni

          hostPath:

            path: /etc/cni/net.d

        - name: flannel-cfg

          configMap:

            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:

  name: kube-flannel-ds-ppc64le

  namespace: kube-system

  labels:

    tier: node

    app: flannel
spec:

  selector:

    matchLabels:

      app: flannel

  template:

    metadata:

      labels:

        tier: node

        app: flannel

    spec:

      affinity:

        nodeAffinity:

          requiredDuringSchedulingIgnoredDuringExecution:

            nodeSelectorTerms:

              - matchExpressions:

                  - key: beta.kubernetes.io/os

                    operator: In

                    values:

                      - linux

                  - key: beta.kubernetes.io/arch

                    operator: In

                    values:

                      - ppc64le

      hostNetwork: true

      tolerations:

      - operator: Exists

        effect: NoSchedule

      serviceAccountName: flannel

      initContainers:

      - name: install-cni

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-ppc64le

        command:

        - cp

        args:

        - -f

        - /etc/kube-flannel/cni-conf.json

        - /etc/cni/net.d/10-flannel.conflist

        volumeMounts:

        - name: cni

          mountPath: /etc/cni/net.d

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      containers:

      - name: kube-flannel

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-ppc64le

        command:

        - /opt/bin/flanneld

        args:

        - --ip-masq

        - --kube-subnet-mgr

        resources:

          requests:

            cpu: "100m"

            memory: "50Mi"

          limits:

            cpu: "100m"

            memory: "50Mi"

        securityContext:

          privileged: false

          capabilities:

             add: ["NET_ADMIN"]

        env:

        - name: POD_NAME

          valueFrom:

            fieldRef:

              fieldPath: metadata.name

        - name: POD_NAMESPACE

          valueFrom:

            fieldRef:

              fieldPath: metadata.namespace

        volumeMounts:

        - name: run

          mountPath: /run/flannel

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      volumes:

        - name: run

          hostPath:

            path: /run/flannel

        - name: cni

          hostPath:

            path: /etc/cni/net.d

        - name: flannel-cfg

          configMap:

            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:

  name: kube-flannel-ds-s390x

  namespace: kube-system

  labels:

    tier: node

    app: flannel
spec:

  selector:

    matchLabels:

      app: flannel

  template:

    metadata:

      labels:

        tier: node

        app: flannel

    spec:

      affinity:

        nodeAffinity:

          requiredDuringSchedulingIgnoredDuringExecution:

            nodeSelectorTerms:

              - matchExpressions:

                  - key: beta.kubernetes.io/os

                    operator: In

                    values:

                      - linux

                  - key: beta.kubernetes.io/arch

                    operator: In

                    values:

                      - s390x

      hostNetwork: true

      tolerations:

      - operator: Exists

        effect: NoSchedule

      serviceAccountName: flannel

      initContainers:

      - name: install-cni

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-s390x

        command:

        - cp

        args:

        - -f

        - /etc/kube-flannel/cni-conf.json

        - /etc/cni/net.d/10-flannel.conflist

        volumeMounts:

        - name: cni

          mountPath: /etc/cni/net.d

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      containers:

      - name: kube-flannel

        image: quay-mirror.qiniu.com/coreos/flannel:v0.11.0-s390x

        command:

        - /opt/bin/flanneld

        args:

        - --ip-masq

        - --kube-subnet-mgr

        resources:

          requests:

            cpu: "100m"

            memory: "50Mi"

          limits:

            cpu: "100m"

            memory: "50Mi"

        securityContext:

          privileged: false

          capabilities:

             add: ["NET_ADMIN"]

        env:

        - name: POD_NAME

          valueFrom:

            fieldRef:

              fieldPath: metadata.name

        - name: POD_NAMESPACE

          valueFrom:

            fieldRef:

              fieldPath: metadata.namespace

        volumeMounts:

        - name: run

          mountPath: /run/flannel

        - name: flannel-cfg

          mountPath: /etc/kube-flannel/

      volumes:

        - name: run

          hostPath:

            path: /run/flannel

        - name: cni

          hostPath:

            path: /etc/cni/net.d

        - name: flannel-cfg

          configMap:

            name: kube-flannel-cfg


9. k8s cluster deployment problems roundup

 


 

1. hostname "master" could not be reached

The hostname has no entry in /etc/hosts; add one.

 

2. curl -sSL http://localhost:10248/healthz returns curl: (7) Failed connect to localhost:10248; Connection refused

There is no entry for localhost in /etc/hosts; add one.

 

3. Error starting daemon: SELinux is not supported with the overlay2 graph driver on this kernel. Either boot into a newer kernel or…abled=false)

Edit /etc/sysconfig/docker and set --selinux-enabled=false.

 

4. Persisting the bridge-nf-call-iptables settings:

# bridge-related settings:
net.bridge.bridge-nf-call-ip6tables = 0
# packets forwarded across the layer-2 bridge are filtered by iptables FORWARD rules:
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 0

 

5. The connection to the server localhost:8080 was refused - did you specify the right host or port?

unable to recognize "kube-flannel.yml": Get http://localhost:8080/api?timeout=32s: dial tcp [::1]:8080: connect: connection refused

kubectl has no kubeconfig yet. Run the commands below (the ones kubeadm init prints); after that the error no longer occurs, even under root:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

6. error: unable to recognize "mycronjob.yml": no matches for kind "CronJob" in version "batch/v2alpha1"

Add - --runtime-config=batch/v2alpha1=true to kube-apiserver.yaml, then restart the kubelet service.
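A sketch of where the flag goes, assuming a kubeadm-managed static pod manifest at /etc/kubernetes/manifests/kube-apiserver.yaml (surrounding flags omitted):

spec:
  containers:
  - command:
    - kube-apiserver
    - --runtime-config=batch/v2alpha1=true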

 

7. Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized Unable to update cni config: No networks found in /etc/cni/net.d Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"

docker pull quay.io/coreos/flannel:v0.10.0-amd64

mkdir -p /etc/cni/net.d/

cat <<EOF> /etc/cni/net.d/10-flannel.conf

{"name":"cbr0","type":"flannel","delegate": {"isDefaultGateway": true}}

EOF

mkdir /usr/share/oci-umount/oci-umount.d -p

mkdir /run/flannel/

cat <<EOF> /run/flannel/subnet.env

FLANNEL_NETWORK=172.100.0.0/16

FLANNEL_SUBNET=172.100.1.0/24

FLANNEL_MTU=1450

FLANNEL_IPMASQ=true

EOF

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml

 

8. Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

export KUBECONFIG=/etc/kubernetes/kubelet.conf

 

9. Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

vim /etc/sysconfig/kubelet
# add: --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice
systemctl restart kubelet

 

Roughly: the --cgroup-driver and --kubelet-cgroups flags are deprecated; these parameters should be set through the kubelet's config file instead.

 

10. The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused.

vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false"

 

11. failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

kubelet: Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
docker:  vi /lib/systemd/system/docker.service and add --exec-opt native.cgroupdriver=systemd

 

12. [ERROR CRI]: unable to check if the container runtime at "/var/run/dockershim.sock" is running: exit status 1

rm -f /usr/bin/crictl

 

13. Warning FailedScheduling 2s (x7 over 33s) default-scheduler 0/4 nodes are available: 4 node(s) didn't match node selector.

If the specified label matches no node in the cluster, the Pod cannot be scheduled and creation fails with this warning.

 

14. Adding a node to the cluster after the kubeadm-generated token has expired:

 kubeadm token create

 

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null |
openssl dgst -sha256 -hex | sed 's/^.* //'

 

kubeadm join --token aa78f6.8b4cafc8ed26c34f --discovery-token-ca-cert-hash sha256:0fd95a9bc67a7bf0ef42da968a0d55d92e52898ec37c971bd77ee501d845b538  172.16.6.79:6443 --skip-preflight-checks

 

15. systemctl status kubelet warnings

cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d

May 29 06:30:28 fnode kubelet[4136]: E0529 06:30:28.935309 4136 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Deleting KUBELET_NETWORK_ARGS from /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and restarting kubelet only works around the symptom; it doesn't really help.

The root cause is a missing image: k8s.gcr.io/pause-amd64:3.1

 

16. Removing the flannel network:

ifconfig cni0 down

ifconfig flannel.1 down

ifconfig del flannel.1

ifconfig del cni0

 

ip link del flannel.1

ip link del cni0

 

yum install bridge-utils

brctl delbr  flannel.1

brctl delbr cni0

rm -rf /var/lib/cni/flannel/* && rm -rf /var/lib/cni/networks/cbr0/* && ip link delete cni0 &&  rm -rf /var/lib/cni/network/cni0/*

 

17. E0906 15:10:55.415662 1 leaderelection.go:234] error retrieving resource lock default/ceph.com-rbd: endpoints "ceph.com-rbd" is forbidden: User "system:serviceaccount:default:rbd-provisioner" cannot get endpoints in the namespace "default"

Add the rule below to ceph/rbd/deploy/rbac/clusterrole.yaml and re-apply it (the resources are requested anew): kubectl apply -f ceph/rbd/deploy/rbac/clusterrole.yaml

 

- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]

 

18. Pinning flannel to a specific network interface:

- --iface=eth0
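In kube-flannel.yml this goes in the kube-flannel container's args (eth0 is an example; use your actual device name):

        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=eth0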

 

21. Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "957541888b8a0e5b9ad65da932f688eb02cc182808e10d1a89a6e8db2132c253" network for pod "coredns-7655b945bc-6hgj9": NetworkPlugin cni failed to set up pod "coredns-7655b945bc-6hgj9_kube-system" network: failed to find plugin "loopback" in path [/opt/cni/bin], failed to clean up sandbox container "957541888b8a0e5b9ad65da932f688eb02cc182808e10d1a89a6e8db2132c253" network for pod "coredns-7655b945bc-6hgj9": NetworkPlugin cni failed to teardown pod "coredns-7655b945bc-6hgj9_kube-system" network: failed to find plugin "portmap" in path [/opt/cni/bin]]

https://kubernetes.io/docs/setup/independent/troubleshooting-kubeadm/#coredns-pods-have-crashloopbackoff-or-error-state

If your network provider does not support the portmap CNI plugin, you may need to use a Service's NodePort feature or set hostNetwork: true.

 

22. Problem: kubelet was configured with system-reserved (800m), kube-reserved (500m), and eviction-hard (800), so the memory actually available to the cluster should be the total minus 800m minus 800m minus 500m, yet system-level OOM kills were still being triggered.

Investigation: top showed etcd using more than 500 MB of memory, kubelet about 200 MB, and the ceph processes about 200 MB combined; together over 900 MB of overhead outside k8s, well beyond the system reservation, so system-level kills could still fire.

 

23. How do you access the api-server?

Use kubectl proxy.
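A minimal usage sketch (8001 is kubectl proxy's default port):

kubectl proxy --port=8001 &

curl http://127.0.0.1:8001/api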

 

24. When using a Service's Endpoints to proxy a service outside the cluster, the endpoints kept disappearing.

Solution: remove the service.spec.selector field.
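For reference, a minimal sketch of a selector-less Service with manually managed Endpoints (the names and the external IP here are hypothetical); with no selector, the endpoints controller leaves the Endpoints object alone:

apiVersion: v1
kind: Service
metadata:
  name: external-db        # hypothetical name
spec:                      # note: no selector
  ports:
  - port: 3306
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db        # must match the Service name
subsets:
- addresses:
  - ip: 192.168.237.200    # hypothetical external address
  ports:
  - port: 3306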

 

25. Working through a cluster-avalanche incident: nodes occasionally went NotReady.

Investigation: CPU usage on the affected node was too high.

 

1. The node's CPUPressure condition was never triggered, so the excessive CPU was not from k8s-managed pods; it had to come from the system/kube reserved share.

2. Inspecting the cpu and mem cgroup hierarchy showed kubelet and friends all under system.slice, so the kube resource reservation was evidently not taking effect.

3. The relevant kubelet flags:

--enforce-node-allocatable=pods,kube-reserved,system-reserved  # hard limits; exceeding them triggers OOM
--system-reserved-cgroup=/system.slice  # which cgroup the system reservation is enforced against
--kube-reserved-cgroup=/system.slice/kubelet.service  # which service cgroup the kube reservation is enforced against
--system-reserved=memory=1Gi,cpu=500m
--kube-reserved=memory=500Mi,cpu=500m,ephemeral-storage=10Gi

26. [etcd] Checking Etcd cluster health

etcd cluster is not healthy: context deadline exceeded

 


10. k8s deployment problem solving

Initialization problems caused by installing via snap

Because I had not set up my package mirrors properly at the start, apt could not find the three k8s packages, and since Ubuntu helpfully suggested try sudo snap install kubelet ..., I installed them via snap with:

snap install kubelet --classic

snap install kubeadm --classic

snap install kubectl --classic

Although I found plenty of examples online of successful snap-based deployments, my skills weren't up to solving the problems that appeared; after switching to apt, the installation went through smoothly. Below is a record of the problems hit with the snap install:

kubelet isn't running or healthy

During kubeadm init the following error appeared, and after four repetitions it timed out:

[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.

The official advice is to check kubelet's status with systemctl status kubelet. But that reported no kubelet.service found, so I listed the failed units instead:

systemctl list-units --failed

It turned out a unit named snap.kubelet.daemon.service had failed to start, and nothing I tried would revive it, so I had to give up on the snap install. If anyone knows how to fix this, I'd be grateful to hear it. On to the other problems encountered.

Warnings during initialization

At the very start of kubeadm init there is a preflight phase that runs checks. If the WARNINGs below show up there and initialization then fails, come back and investigate them. The two warnings below are resolved in the following subsections:

# kubeadm init ...
[init] Using Kubernetes version: v1.15.0
[preflight] Running pre-flight checks

        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/

        [WARNING FileExisting-socat]: socat not found in system path

WARNING IsDockerSystemdCheck

Edit or create /etc/docker/daemon.json and add the following:

{
  "exec-opts": ["native.cgroupdriver=systemd"]
}

Restart Docker:

systemctl restart docker

Check the resulting state:

docker info | grep Cgroup

WARNING FileExisting-socat

socat is a networking tool that k8s uses for pod data exchange; simply install it:

apt-get install socat

Node status is NotReady

kubectl get nodes showed the joined nodes with Status NotReady:

root@master1:~# kubectl get nodes
NAME      STATUS      ROLES    AGE    VERSION
master1   NotReady    master   152m   v1.15.0
worker1   NotReady    <none>   94m    v1.15.0

This happens because some critical pods are not running. First check the kube-system pod status with:

kubectl get pod -n kube-system

NAME                              READY   STATUS             RESTARTS   AGE

coredns-bccdc95cf-792px           1/1     Pending            0          3h11m

coredns-bccdc95cf-bc76j           1/1     Pending            0          3h11m

etcd-master1                      1/1     Running            2          3h10m

kube-apiserver-master1            1/1     Running            2          3h11m

kube-controller-manager-master1   1/1     Running            2          3h10m

kube-flannel-ds-amd64-9trbq       0/1     ImagePullBackoff   0          133m

kube-flannel-ds-amd64-btt74       0/1     ImagePullBackoff   0          174m

kube-proxy-27zfk                  1/1     Pending            2          3h11m

kube-proxy-lx4gk                  1/1     Pending            0          133m

kube-scheduler-master1            1/1     Running            2          3h11m

As shown, the kube-flannel pods are in ImagePullBackOff: the image pull failed, so we need to pull the image by hand. Some pods show up twice because there are two nodes.

You can also run kubectl describe pod -n kube-system <pod-name> to inspect a pod in detail; if the pod has a problem you will find an Events section at the bottom of the output, like this:

root@master1:~# kubectl describe pod kube-flannel-ds-amd64-9trbq -n kube-system

...

 

Events:

  Type     Reason                  Age                 From              Message

  ----     ------                  ----                ----              -------

  Normal   Killing                 29m                 kubelet, worker1  Stopping container kube-flannel

  Warning  FailedCreatePodSandBox  27m (x12 over 29m)  kubelet, worker1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "kube-flannel-ds-amd64-9trbq": Error response from daemon: cgroup-parent for systemd cgroup should be a valid slice named as "xxx.slice"

  Normal   SandboxChanged          19m (x48 over 29m)  kubelet, worker1  Pod sandbox changed, it will be killed and re-created.

  Normal   Pulling                 42s                 kubelet, worker1  Pulling image "quay.io/coreos/flannel:v0.11.0-amd64"

Pulling the image manually

The flannel image can be pulled with the command below. If a different image failed to pull for you, a quick search will turn up a domestic mirror address; remember to change the version tag at the end to your own, which the kubectl describe command above will show:

docker pull quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64

Once the pull finishes, retag the image to the name k8s failed to pull (the image name and version here may differ from yours; adjust as needed):

docker tag quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.11.0-amd64

A few minutes later k8s retries automatically; flannel recovers, the other pods move to Running as well, and the node status confirms the problem is solved:

root@master1:~# kubectl get nodes
NAME      STATUS   ROLES    AGE     VERSION
master1   Ready    master   3h27m   v1.15.0
worker1   Ready    <none>   149m    v1.15.0

 

Worker node failed to join

Running kubeadm join on a worker returned a timeout error:

root@worker2:~# kubeadm join 192.168.56.11:6443 --token wbryr0.am1n476fgjsno6wa --discovery-token-ca-cert-hash sha256:7640582747efefe7c2d537655e428faa6275dbaff631de37822eb8fd4c054807
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s

Run kubeadm token create --print-join-command on the master to regenerate the join command, then execute the newly printed command on the worker.

