This guide uses the fairly recent combination kubeadm-1.17.4 + docker-19.03.5 + flannel-0.12, with ipvs enabled.
Prerequisites:
- One or more machines
- OS: CentOS 7.x, x86_64
- RAM: 2GB or more
- CPU: 2 cores or more
- Disk: 30GB or more
- Full network connectivity between all machines in the cluster, plus outbound internet access (needed to pull images)
- No duplicate hostnames, MAC addresses, or product_uuids among the nodes
- Swap disabled
Kubernetes architecture diagram: (image not reproduced here)
1) Set the hostname and hosts-file resolution
hostnamectl set-hostname k8s-128
cat >> /etc/hosts <<EOF
192.168.17.128 k8s-128
192.168.17.129 k8s-129
192.168.17.130 k8s-130
192.168.17.200 myregistry.com
EOF
2) Install dependencies and common tools
yum -y install vim curl wget unzip ntpdate net-tools ipvsadm ipset sysstat conntrack libseccomp
ntpdate ntp1.aliyun.com    # sync the system time
3) Disable swap, SELinux, and firewalld
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
systemctl stop firewalld && systemctl disable firewalld
Notes:
- Kubernetes requires swap to be disabled; otherwise containers may end up running in virtual memory, which badly degrades performance.
- In production, configure firewall rules according to your actual needs rather than disabling firewalld outright; see: Enabling Firewall on a Kubernetes Cluster.
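A quick sanity check that swap really is off (standard CentOS tooling):

free -m       # the Swap line should show 0 total / 0 used
swapon -s     # prints nothing once swap is disabled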
4) Tune kernel parameters
# br_netfilter must be loaded first, or the net.bridge.* parameters below fail to apply
modprobe br_netfilter
cat > /etc/sysctl.d/kubernetes.conf <<EOF
# Enable bridge-netfilter, disable IPv6 (required)
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv6.conf.all.disable_ipv6=1
# Enable IP forwarding, disable fast recycling of TIME_WAIT sockets
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0
# Avoid swap; use it only when the system would otherwise OOM
vm.swappiness=0
# File-handle limits (optional)
fs.file-max=2000000
fs.nr_open=2000000
fs.inotify.max_user_instances=512
fs.inotify.max_user_watches=1280000
# Maximum number of tracked connections (optional)
net.netfilter.nf_conntrack_max=524288
EOF
sysctl -p /etc/sysctl.d/kubernetes.conf
Note: the parameters marked "(required)" are the important ones; those marked "(optional)" are optional tuning.
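A quick check that the required values took effect:

sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
# expected output:
# net.bridge.bridge-nf-call-iptables = 1
# net.ipv4.ip_forward = 1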
5) Load the IPVS kernel modules
modprobe br_netfilter
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules
sh /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_
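The /etc/sysconfig/modules mechanism is a legacy hook; on CentOS 7, systemd's modules-load.d is a more reliable way to reload the same modules on every boot. A sketch, using the same module list as above:

cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
br_netfilter
EOF
systemctl restart systemd-modules-load && lsmod | grep ip_vs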
6) Other optional system optimizations
1. Switch the yum repositories

# Back up the original repo file
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
# Download the Aliyun mirror repo files
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
# Rebuild the yum cache
yum clean all && yum makecache
2. Install the NFS service

# Install nfs and rpcbind
yum install -y nfs-common nfs-utils rpcbind
# Create the shared directory and set its permissions
mkdir /nfsdata
chmod 666 /nfsdata
chown nfsnobody /nfsdata
# /etc/exports should contain the export entry:
cat /etc/exports
/nfsdata *(rw,no_root_squash,no_all_squash,sync)
# Start the services and enable them at boot
systemctl start nfs && systemctl enable nfs
systemctl start rpcbind && systemctl enable rpcbind
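To verify the export from another machine in the cluster (assuming the NFS server is 192.168.17.128, as in the hosts file above):

showmount -e 192.168.17.128                            # should list /nfsdata
mount -t nfs 192.168.17.128:/nfsdata /mnt && touch /mnt/test && ls /mnt
umount /mnt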
3. Adjust journald logging rules

mkdir /var/log/journal    # directory for persistent logs
mkdir /etc/systemd/journald.conf.d
cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
[Journal]
# Persist logs to disk
Storage=persistent
# Compress historical logs
Compress=yes
SyncIntervalSec=5m
RateLimitInterval=30s
RateLimitBurst=1000
# Maximum disk usage: 10G
SystemMaxUse=10G
# Maximum size of a single log file: 200M
SystemMaxFileSize=200M
# Retain logs for 2 weeks
MaxRetentionSec=2week
# Do not forward logs to syslog
ForwardToSyslog=no
EOF
systemctl restart systemd-journald
4. Upgrade the kernel

rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# After installation, check that the kernel menuentry in /boot/grub2/grub.cfg
# contains an initrd16 line; if it does not, install again!
yum --enablerepo=elrepo-kernel install -y kernel-lt
# Boot from the new kernel by default
grub2-set-default "CentOS Linux (4.4.182-1.el7.elrepo.x86_64) 7 (Core)"
# After rebooting, install the kernel source packages
yum --enablerepo=elrepo-kernel install kernel-lt-devel-$(uname -r) kernel-lt-headers-$(uname -r)
5. Disable NUMA

cp /etc/default/grub{,.bak}
vim /etc/default/grub
# Add `numa=off` to the GRUB_CMDLINE_LINUX line, as shown:
diff /etc/default/grub.bak /etc/default/grub
6c6
< GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet"
---
> GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet numa=off"
cp /boot/grub2/grub.cfg{,.bak}
grub2-mkconfig -o /boot/grub2/grub.cfg
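After rebooting, it is worth confirming the kernel actually picked up the parameter:

grep -o numa=off /proc/cmdline    # prints numa=off if the new grub.cfg is active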
7) Install and deploy Docker
# Install Docker dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2
# Configure the Docker package repository
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Pin to v19.03.5
yum install -y docker-ce-19.03.5 docker-ce-cli-19.03.5
# Configure the registry mirror, registry addresses, cgroup driver, and log rules
mkdir /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": ["https://jc3y13r3.mirror.aliyuncs.com"],
  "insecure-registries": [""],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF
# Restart Docker and enable it at boot
mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload && systemctl restart docker && systemctl enable docker
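To confirm the daemon picked up daemon.json (in particular the systemd cgroup driver, which must match the kubelet's), the output of docker info should contain lines along these lines:

docker info | grep -i cgroup          # Cgroup Driver: systemd
docker info | grep -i -A1 mirrors     # the aliyuncs mirror should be listed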

{ "authorization-plugins": [], //訪問授權插件 "data-root": "", //docker數據持久化存儲的根目錄 "dns": [], //DNS服務器 "dns-opts": [], //DNS配置選項,如端口等 "dns-search": [], //DNS搜索域名 "exec-opts": [], //執行選項 "exec-root": "", //執行狀態的文件的根目錄 "experimental": false, //是否開啟試驗性特性 "storage-driver": "", //存儲驅動器 "storage-opts": [], //存儲選項 "labels": [], //鍵值對式標記docker元數據 "live-restore": true, //dockerd掛掉是否保活容器(避免了docker服務異常而造成容器退出) "log-driver": "", //容器日志的驅動器 "log-opts": {}, //容器日志的選項 , //設置容器網絡MTU(最大傳輸單元) "pidfile": "", //daemon PID文件的位置 "cluster-store": "", //集群存儲系統的URL "cluster-store-opts": {}, //配置集群存儲 "cluster-advertise": "", //對外的地址名稱 , //設置每個pull進程的最大並發 , //設置每個push進程的最大並發 "default-shm-size": "64M", //設置默認共享內存的大小 , //設置關閉的超時時限(who?) "debug": true, //開啟調試模式 "hosts": [], //監聽地址(?) "log-level": "", //日志級別 "tls": true, //開啟傳輸層安全協議TLS "tlsverify": true, //開啟輸層安全協議並驗證遠程地址 "tlscacert": "", //CA簽名文件路徑 "tlscert": "", //TLS證書文件路徑 "tlskey": "", //TLS密鑰文件路徑 "swarm-default-advertise-addr": "", //swarm對外地址 "api-cors-header": "", //設置CORS(跨域資源共享-Cross-origin resource sharing)頭 "selinux-enabled": false, //開啟selinux(用戶、進程、應用、文件的強制訪問控制) "userns-remap": "", //給用戶命名空間設置 用戶/組 "group": "", //docker所在組 "cgroup-parent": "", //設置所有容器的cgroup的父類(?) "default-ulimits": {}, //設置所有容器的ulimit "init": false, //容器執行初始化,來轉發信號或控制(reap)進程 "init-path": "/usr/libexec/docker-init", //docker-init文件的路徑 "ipv6": false, //開啟IPV6網絡 "iptables": false, //開啟防火牆規則 "ip-forward": false, //開啟net.ipv4.ip_forward "ip-masq": false, //開啟ip掩蔽(IP封包通過路由器或防火牆時重寫源IP地址或目的IP地址的技術) "userland-proxy": false, //用戶空間代理 "userland-proxy-path": "/usr/libexec/docker-proxy", //用戶空間代理路徑 "ip": "0.0.0.0", //默認IP "bridge": "", //將容器依附(attach)到橋接網絡上的橋標識 "bip": "", //指定橋接ip "fixed-cidr": "", //(ipv4)子網划分,即限制ip地址分配范圍,用以控制容器所屬網段實現容器間(同一主機或不同主機間)的網絡訪問 "fixed-cidr-v6": "", //(ipv6)子網划分 "default-gateway": "", //默認網關 "default-gateway-v6": "", //默認ipv6網關 "icc": false, //容器間通信 "raw-logs": false, //原始日志(無顏色、全時間戳) "allow-nondistributable-artifacts": [], //不對外分發的產品提交的registry倉庫 "registry-mirrors": [], //registry倉庫鏡像 "seccomp-profile": "", //seccomp配置文件 "insecure-registries": [], //非https的registry地址 "no-new-privileges": false, //禁止新優先級(??) "default-runtime": "runc", //OCI聯盟(The Open Container Initiative)默認運行時環境 , //內存溢出被殺死的優先級(-1000~1000) "node-generic-resources": ["NVIDIA-GPU=UUID1", "NVIDIA-GPU=UUID2"], //對外公布的資源節點 "runtimes": { //運行時 "cc-runtime": { "path": "/usr/bin/cc-runtime" }, "custom": { "path": "/usr/local/bin/my-runc-replacement", "runtimeArgs": [ "--debug" ] } } }
8) Install and deploy Kubernetes
# Configure the Kubernetes package repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Install the k8s components, pinned to v1.17.4
yum -y install kubeadm-1.17.4 kubectl-1.17.4 kubelet-1.17.4
# Enable kubelet at boot
systemctl enable kubelet.service
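Before initializing, a quick check of the installed versions:

kubeadm version -o short    # v1.17.4
kubelet --version           # Kubernetes v1.17.4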
9) Initialize the control-plane (master) node
# Generate the default init configuration, then edit the relevant fields
kubeadm config print init-defaults > kubeadm-config.yaml
vim kubeadm-config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.17.137      # master node address
  bindPort: 6443                        # default apiserver port
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master                      # master node name
  taints:                               # default master taint
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki    # certificate directory
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers   # switch to the Aliyun image mirror
kind: ClusterConfiguration
kubernetesVersion: v1.17.4              # set the k8s version
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"            # pod network CIDR; must match the flannel config
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
# Enable ipvs in kube-proxy
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
mode: ipvs
---
kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.log
# As the init log instructs, copy the generated admin.conf to ~/.kube/config
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

The initialization log looks like this:

[init] Using Kubernetes version: v1.15.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.17.137]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.17.137 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.17.137 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 21.005629 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key: 48ed9ff6d019a6f4ce9d854b42146d0085432d28bd2671cccd6eb69382d427d2
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.17.137:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4ed1fe2c50eff803e7348c7f9229081839b598092d03effe1417e2fda9f340d2

What each phase does:
[init]: initialize with the specified version.
[preflight]: pre-flight checks and pulling the required Docker images.
[kubelet-start]: generate the kubelet config file "/var/lib/kubelet/config.yaml"; kubelet cannot start without it, so kubelet actually fails to start until initialization reaches this point.
[certificates]: generate the certificates Kubernetes uses, stored under /etc/kubernetes/pki.
[kubeconfig]: generate the kubeconfig files, stored in /etc/kubernetes; components use them to talk to each other.
[control-plane]: install the master components from the YAML files under /etc/kubernetes/manifests.
[etcd]: install the etcd service from /etc/kubernetes/manifests/etcd.yaml.
[wait-control-plane]: wait for the master components, deployed as static Pods, to start.
[apiclient]: check the health of the master components.
[uploadconfig]: upload the configuration.
[kubelet]: configure kubelet via a ConfigMap.
[patchnode]: record CNI information on the Node via annotations.
[mark-control-plane]: label the current node with the master role and the NoSchedule taint, so by default no Pods are scheduled onto the master.
[bootstrap-token]: generate the token; record it, since kubeadm join uses it later to add nodes to the cluster.
[addons]: install the CoreDNS and kube-proxy add-ons.
10) Join the worker nodes
# As the init log instructs, run kubeadm join on each worker to join the cluster
kubeadm join 192.168.17.137:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:260796226d38de54c3c851ad48abf40ff97228cda68ce892cb813d9104c9a914

Note: the default token is valid for 24 hours; once it has expired, regenerate the join command on the master node with:
kubeadm token create --print-join-command
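kubeadm token create --print-join-command already prints the full join command including the hash; if you ever need to recompute the CA cert hash by hand, the standard openssl pipeline from the kubeadm docs is:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex | sed 's/^.* //'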
11) Deploy the network plugin
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
Notes:
- If your network connection is poor, download the file first, change the image addresses (around lines 106 and 120), then apply it.
- On hosts with multiple NICs, DNS resolution may fail; in that case pass --iface=<iface-name> to flanneld, as sketched below.
- Alternatively, you can use my modified manifest below (image source already switched to aliyuncs).
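For the multi-NIC case, the extra argument goes on the flanneld container in the kube-flannel DaemonSet manifest below (a sketch; ens33 is a hypothetical NIC name):

        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=ens33     # bind flanneld to this specific interface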

# You can get more information at:
# https://www.cnblogs.com/leozhanggg/p/12571957.html
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: beta.kubernetes.io/os
                operator: In
                values:
                - linux
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: registry.cn-shanghai.aliyuncs.com/leozhanggg/flannel:v0.12.0-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: registry.cn-shanghai.aliyuncs.com/leozhanggg/flannel:v0.12.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg

The manifest also defines four more DaemonSets -- kube-flannel-ds-arm64, kube-flannel-ds-arm, kube-flannel-ds-ppc64le and kube-flannel-ds-s390x -- identical to the amd64 one above except for the beta.kubernetes.io/arch selector value and the matching image tag (v0.12.0-arm64, v0.12.0-arm, v0.12.0-ppc64le, v0.12.0-s390x). Note that net-conf.json's "Network" must match the podSubnet set in kubeadm-config.yaml (10.244.0.0/16 here).
12) Check cluster health
[root@k8s-master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS   ROLES    AGE     VERSION
k8s-master   Ready    master   37m     v1.15.1
k8s-node01   Ready    <none>   5m22s   v1.15.1
k8s-node02   Ready    <none>   5m18s   v1.15.1
[root@k8s-master ~]# kubectl get pod -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-bccdc95cf-h2ngj              1/1     Running   0          14m
coredns-bccdc95cf-m78lt              1/1     Running   0          14m
etcd-k8s-master                      1/1     Running   0          13m
kube-apiserver-k8s-master            1/1     Running   0          13m
kube-controller-manager-k8s-master   1/1     Running   0          13m
kube-flannel-ds-amd64-j774f          1/1     Running   0          9m48s
kube-flannel-ds-amd64-t8785          1/1     Running   0          9m48s
kube-flannel-ds-amd64-wgbtz          1/1     Running   0          9m48s
kube-proxy-ddzdx                     1/1     Running   0          14m
kube-proxy-nwhzt                     1/1     Running   0          14m
kube-proxy-p64rw                     1/1     Running   0          13m
kube-scheduler-k8s-master            1/1     Running   0          13m
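Since kube-proxy was switched to ipvs mode, it is also worth confirming that the virtual-server table exists (ipvsadm was installed in step 2); the output should look roughly like:

ipvsadm -Ln
# IP Virtual Server version 1.2.1 (size=4096)
# Prot LocalAddress:Port Scheduler Flags
#   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
# TCP  10.96.0.1:443 rr
#   -> 192.168.17.137:6443          Masq    1      0          0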
13) Other operations
# Enable kubectl command auto-completion
yum install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
# Remove the default taint from the master node
kubectl taint nodes --all node-role.kubernetes.io/master-
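To check the result:

kubectl describe node k8s-master | grep -i taint    # should now show <none>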
# Mark a node unschedulable without affecting its existing Pods
kubectl cordon node-name
# Evict the Pods on that node (--ignore-daemonsets is needed because flannel and kube-proxy run as DaemonSets)
kubectl drain node-name --ignore-daemonsets
# When maintenance is done, put the node back into service
kubectl uncordon node-name
# The default NodePort range for K8S Services is 30000-32767; it can be changed as follows:
vim /etc/kubernetes/manifests/kube-apiserver.yaml
---
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --service-node-port-range=2-65535   # add this flag
    - --advertise-address=192.168.17.128
......
# Then restart kube-apiserver
docker restart $(docker ps | grep k8s_kube-apiserver | awk '{print $1}')
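A minimal Service to confirm the widened range works (a sketch; the name and selector label are hypothetical, and nodePort 2000 would be rejected under the default range):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: test-nodeport        # hypothetical name
spec:
  type: NodePort
  selector:
    app: test-nginx          # hypothetical label
  ports:
  - port: 80
    nodePort: 2000           # only accepted once the range was widened
EOF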
14) Common errors and how to handle them
1. [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
# Fix: echo "1" > /proc/sys/net/bridge/bridge-nf-call-iptables
2. [ERROR Swap]: running with swap on is not supported. Please disable swap
# Fix: disable the swap partition
swapoff -a
vim /etc/fstab
#/dev/mapper/rhel-swap swap swap defaults 0 0
3. [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
# Fix: simply delete the /var/lib/etcd directory
rm -rf /var/lib/etcd
4. The connection to the server localhost:8080 was refused - did you specify the right host or port?
# Fix: so that kubectl can reach the apiserver, append this environment variable to ~/.bash_profile:
export KUBECONFIG=/etc/kubernetes/admin.conf
source ~/.bash_profile
Then run kubectl again.
5. Error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10251]: Port 10251 is in use
[ERROR Port-10252]: Port 10252 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
# Fix: as the message suggests, skip the checks with --ignore-preflight-errors=all
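Errors 3 and 5 usually mean the machine has leftovers from a previous kubeadm run. A sketch of the clean way to start over (this wipes the node's cluster state):

kubeadm reset -f
rm -rf /etc/cni/net.d $HOME/.kube/config
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear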
Further reading: a roundup of assorted problems encountered while tinkering with Kubernetes.
For kubectl commands, see: the Kubernetes kubectl command reference, and a digest of commonly used kubernetes commands.
Author: Leozhanggg
Source: https://www.cnblogs.com/leozhanggg/p/12571957.html
This article is jointly copyrighted by the author and Cnblogs. You are welcome to repost it, but this notice must be retained without the author's consent and a clearly visible link to the original article must be provided on the page; otherwise the author reserves the right to pursue legal action.