The kubernetes worker nodes run the following components:
- docker
- kubelet
- kube-proxy
- flanneld
- kube-nginx
kube-nginx is deployed to provide access to the kube-apiserver cluster: kubelet and kube-proxy talk to kube-apiserver through the local nginx instance (listening on 127.0.0.1), which gives kube-apiserver high availability.
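A quick way to confirm the local proxy works before going further (a minimal sketch; assumes kube-nginx already listens on 127.0.0.1:8443 as set in environment.sh below; even an Unauthorized reply proves the proxy forwards to kube-apiserver):

```bash
# verify something is listening on the proxy port
ss -lnpt | grep 8443

# -k skips server certificate verification; a 401/403 JSON body still proves forwarding works
curl -k https://127.0.0.1:8443/
```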
For the kube-nginx deployment, see the "install and configure the nginx proxy" part of https://www.cnblogs.com/deny/p/12260717.html
For the CA certificates, see: https://www.cnblogs.com/deny/p/12259778.html
Node information:

+ zhangjun-k8s01: 192.168.1.201
+ zhangjun-k8s02: 192.168.1.202
+ zhangjun-k8s03: 192.168.1.203
The variables required below live in /opt/k8s/bin/environment.sh:
```bash
#!/usr/bin/bash

# Encryption key for generating the EncryptionConfig
export ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)

# Array of cluster machine IPs
export NODE_IPS=(192.168.1.201 192.168.1.202 192.168.1.203)

# Array of hostnames corresponding to the cluster IPs
export NODE_NAMES=(zhangjun-k8s01 zhangjun-k8s02 zhangjun-k8s03)

# etcd cluster service endpoints
export ETCD_ENDPOINTS="https://192.168.1.201:2379,https://192.168.1.202:2379,https://192.168.1.203:2379"

# IPs and ports used for communication between etcd cluster members
export ETCD_NODES="zhangjun-k8s01=https://192.168.1.201:2380,zhangjun-k8s02=https://192.168.1.202:2380,zhangjun-k8s03=https://192.168.1.203:2380"

# Address and port of the kube-apiserver reverse proxy (kube-nginx)
export KUBE_APISERVER="https://127.0.0.1:8443"

# Name of the network interface used for inter-node communication
export IFACE="ens33"

# etcd data directory
export ETCD_DATA_DIR="/data/k8s/etcd/data"

# etcd WAL directory; ideally an SSD partition, or at least a different partition from ETCD_DATA_DIR
export ETCD_WAL_DIR="/data/k8s/etcd/wal"

# Data directory for the k8s components
export K8S_DIR="/data/k8s/k8s"

# docker data directory
export DOCKER_DIR="/data/k8s/docker"

## The parameters below normally do not need to be changed

# Token used for TLS Bootstrapping; can be generated with:
# head -c 16 /dev/urandom | od -An -t x | tr -d ' '
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"

# Preferably use currently unused ranges for the service and Pod networks
# Service network; unroutable before deployment, routable inside the cluster afterwards (guaranteed by kube-proxy)
SERVICE_CIDR="10.254.0.0/16"

# Pod network; a /16 range is recommended; unroutable before deployment, routable inside the cluster afterwards (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"

# Service port range (NodePort Range)
export NODE_PORT_RANGE="30000-32767"

# flanneld network configuration prefix
export FLANNEL_ETCD_PREFIX="/kubernetes/network"

# kubernetes service IP (usually the first IP in SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"

# Cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"

# Cluster DNS domain (without the trailing dot)
export CLUSTER_DNS_DOMAIN="cluster.local"

# Add the binary directory /opt/k8s/bin to PATH
export PATH=/opt/k8s/bin:$PATH
```
Note: unless stated otherwise, all operations in this document are performed **on the zhangjun-k8s01 node**, which then distributes files and runs commands remotely.
Install dependencies
```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "yum install -y epel-release"
    ssh root@${node_ip} "yum install -y conntrack ipvsadm ntp ntpdate ipset jq iptables curl sysstat libseccomp && modprobe ip_vs"
  done
```
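Note that `modprobe ip_vs` does not survive a reboot. A sketch of making the module load persistent (assumes a systemd distribution that reads /etc/modules-load.d, such as CentOS 7):

```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    # have systemd-modules-load pull in ip_vs on every boot
    ssh root@${node_ip} "echo ip_vs > /etc/modules-load.d/ip_vs.conf"
  done
```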
I. Install and configure flanneld

For the flannel cluster deployment, see: https://www.cnblogs.com/deny/p/12260072.html

II. Install and configure kube-nginx

For the kube-nginx deployment, see the "install and configure the nginx proxy" part of https://www.cnblogs.com/deny/p/12260717.html

III. Deploy docker
docker runs and manages containers; kubelet interacts with it through the Container Runtime Interface (CRI).
1. Install docker

1) Download and distribute the docker binaries

Download the latest release from the [docker download page](https://download.docker.com/linux/static/stable/x86_64/):
```bash
cd /opt/k8s/work
wget https://download.docker.com/linux/static/stable/x86_64/docker-19.03.5.tgz
tar -xvf docker-19.03.5.tgz
```
2) Distribute the binaries to all worker nodes
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp docker/* root@${node_ip}:/opt/k8s/bin/
    ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
  done
```
2. Create and distribute the systemd unit file

1) Create the template file
```bash
cd /opt/k8s/work
cat > docker.service <<"EOF"
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io

[Service]
WorkingDirectory=##DOCKER_DIR##
Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF
```
- The EOF delimiter is quoted, so bash does not expand variables inside the here-document, such as `$DOCKER_NETWORK_OPTIONS` (these environment variables are substituted by systemd);
- dockerd invokes other docker binaries at runtime, such as docker-proxy, so the directory containing the docker binaries must be added to the PATH environment variable;
- When flanneld starts, it writes its network configuration to `/run/flannel/docker`; before starting, dockerd reads the `DOCKER_NETWORK_OPTIONS` environment variable from that file and uses it to set the docker0 bridge subnet;
- If multiple `EnvironmentFile` options are specified, `/run/flannel/docker` must come last (to ensure docker0 uses the bip parameter generated by flanneld);
- docker needs to run as root;
- Starting with version 1.13, docker may set the **default policy of the iptables FORWARD chain to DROP**, which makes pings to Pod IPs on other Nodes fail. If you hit this, manually set the policy to `ACCEPT` with `iptables -P FORWARD ACCEPT`, and write `/sbin/iptables -P FORWARD ACCEPT` into `/etc/rc.local` so a node reboot does not **reset the FORWARD chain default policy back to DROP** (see the sketch after this list).
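A sketch that applies the FORWARD fix on every node at once (uses the NODE_IPS array from environment.sh; note the append duplicates the line if run twice, and on CentOS 7 /etc/rc.d/rc.local must also be executable):

```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    # set the policy now, and persist it across reboots via rc.local
    ssh root@${node_ip} "iptables -P FORWARD ACCEPT"
    ssh root@${node_ip} "echo '/sbin/iptables -P FORWARD ACCEPT' >> /etc/rc.local"
  done
```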
2) Distribute the systemd unit file to all worker machines
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
sed -i -e "s|##DOCKER_DIR##|${DOCKER_DIR}|" docker.service
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp docker.service root@${node_ip}:/etc/systemd/system/
  done
```
3. Configure and distribute the docker configuration file

1) Configure docker-daemon.json

Use registry mirror servers inside China to speed up image pulls, and raise the download concurrency (dockerd must be restarted for the changes to take effect):
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > docker-daemon.json <<EOF
{
    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn","https://hub-mirror.c.163.com"],
    "insecure-registries": ["docker02:35000"],
    "max-concurrent-downloads": 20,
    "live-restore": true,
    "max-concurrent-uploads": 10,
    "debug": true,
    "data-root": "${DOCKER_DIR}/data",
    "exec-root": "${DOCKER_DIR}/exec",
    "log-opts": {
      "max-size": "100m",
      "max-file": "5"
    }
}
EOF
```
2) Distribute the docker configuration file to all worker nodes
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /etc/docker/ ${DOCKER_DIR}/{data,exec}"
    scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
  done
```
4. Start the docker service
```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
  done
```
1) Check the service status
```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl status docker|grep Active"
  done
```
Make sure the status is `active (running)`; otherwise inspect the logs to find the cause: `journalctl -u docker`
2) Check the docker0 bridge
```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "/usr/sbin/ip addr show flannel.1 && /usr/sbin/ip addr show docker0"
  done
```
Confirm that on each worker node the docker0 bridge and the flannel.1 interface are in the same subnet (for example, 172.30.128.0/32 lies within 172.30.128.1/21).

Note: if the services were installed in the wrong order, or the machine environment is complicated and docker was installed before flanneld, the docker0 bridge and the flannel.1 interface on a worker node may end up in different subnets. In that case, stop the docker service, delete the docker0 interface manually, and restart docker to fix it:
```bash
systemctl stop docker
ip link delete docker0
systemctl start docker
```
3) Inspect docker status information
```bash
ps -elfH|grep docker
docker info
```
IV. Deploy the kubelet component

kubelet runs on every worker node: it receives requests from kube-apiserver to manage Pod containers, and it executes interactive commands such as exec, run, and logs.

On startup, kubelet automatically registers node information with kube-apiserver, and its built-in cadvisor collects and monitors the node's resource usage.

For security, this deployment closes kubelet's insecure http port, authenticates and authorizes every request, and rejects unauthorized access (such as requests from apiserver or heapster without proper credentials).
1. Download and distribute the kubelet binaries

1) Download the binary tarball from the [CHANGELOG page](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md) and unpack it
```bash
cd /opt/k8s/work
wget https://dl.k8s.io/v1.14.2/kubernetes-server-linux-amd64.tar.gz
tar -xzvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes
tar -xzvf kubernetes-src.tar.gz
```
Alternative download links:
```bash
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes-node-linux-amd64.tar.gz
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes-server-linux-amd64.tar.gz
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes-client-linux-amd64.tar.gz
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes.tar.gz
```
2) Copy the binaries (kubelet, kubectl, kubeadm, kube-proxy) to all worker nodes
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp kubernetes/server/bin/{kubelet,kubectl,kubeadm,kube-proxy} root@${node_ip}:/opt/k8s/bin/
    ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
  done
```
2. Create the kubelet bootstrap kubeconfig files
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"

    # Create a token
    export BOOTSTRAP_TOKEN=$(kubeadm token create \
      --description kubelet-bootstrap-token \
      --groups system:bootstrappers:${node_name} \
      --kubeconfig ~/.kube/config)

    # Set cluster parameters
    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/cert/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

    # Set client credentials
    kubectl config set-credentials kubelet-bootstrap \
      --token=${BOOTSTRAP_TOKEN} \
      --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

    # Set context parameters
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kubelet-bootstrap \
      --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

    # Set the default context
    kubectl config use-context default --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
  done
```
- What gets written into the kubeconfig is the token; once bootstrapping completes, kube-controller-manager creates the client and server certificates for the kubelet.
1) View the tokens kubeadm created for each node
```
$ kubeadm token list --kubeconfig ~/.kube/config
TOKEN                     TTL   EXPIRES                     USAGES                   DESCRIPTION               EXTRA GROUPS
3gzd53.ahl5unc2d09yjid9   23h   2019-05-27T11:29:57+08:00   authentication,signing   kubelet-bootstrap-token   system:bootstrappers:zhangjun-k8s02
82jfrm.um1mkjkr7w2c7ex9   23h   2019-05-27T11:29:56+08:00   authentication,signing   kubelet-bootstrap-token   system:bootstrappers:zhangjun-k8s01
b1f7np.lwnnzur3i8ymtkur   23h   2019-05-27T11:29:57+08:00   authentication,signing   kubelet-bootstrap-token   system:bootstrappers:zhangjun-k8s03
```
- Tokens are valid for 1 day; after expiry they can no longer be used to bootstrap a kubelet and are cleaned up by kube-controller-manager's token cleaner;
- When kube-apiserver accepts a kubelet bootstrap token, it sets the request's user to `system:bootstrap:<Token ID>` and its group to `system:bootstrappers`; a ClusterRoleBinding will be created for this group later;
2) View the Secret associated with each token
```
$ kubectl get secrets -n kube-system|grep bootstrap-token
bootstrap-token-3gzd53   bootstrap.kubernetes.io/token   7   33s
bootstrap-token-82jfrm   bootstrap.kubernetes.io/token   7   34s
bootstrap-token-b1f7np   bootstrap.kubernetes.io/token   7   33s
```
3) Distribute the bootstrap kubeconfig files to all worker nodes
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"
    scp kubelet-bootstrap-${node_name}.kubeconfig root@${node_name}:/etc/kubernetes/kubelet-bootstrap.kubeconfig
  done
```
3. Create and distribute the kubelet parameter configuration file

Since v1.10, some kubelet parameters must be set in a **configuration file**; `kubelet --help` warns:

```
DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag
```

Create the kubelet parameter configuration template (for the available options, see the [comments in the source](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/types.go)).
1) Create the file
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kubelet-config.yaml.template <<EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: "##NODE_IP##"
staticPodPath: ""
syncFrequency: 1m
fileCheckFrequency: 20s
httpCheckFrequency: 20s
staticPodURL: ""
port: 10250
readOnlyPort: 0
rotateCertificates: true
serverTLSBootstrap: true
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/cert/ca.pem"
authorization:
  mode: Webhook
registryPullQPS: 0
registryBurst: 20
eventRecordQPS: 0
eventBurst: 20
enableDebuggingHandlers: true
enableContentionProfiling: true
healthzPort: 10248
healthzBindAddress: "##NODE_IP##"
clusterDomain: "${CLUSTER_DNS_DOMAIN}"
clusterDNS:
  - "${CLUSTER_DNS_SVC_IP}"
nodeStatusUpdateFrequency: 10s
nodeStatusReportFrequency: 1m
imageMinimumGCAge: 2m
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
volumeStatsAggPeriod: 1m
kubeletCgroups: ""
systemCgroups: ""
cgroupRoot: ""
cgroupsPerQOS: true
cgroupDriver: cgroupfs
runtimeRequestTimeout: 10m
hairpinMode: promiscuous-bridge
maxPods: 220
podCIDR: "${CLUSTER_CIDR}"
podPidsLimit: -1
resolvConf: /etc/resolv.conf
maxOpenFiles: 1000000
kubeAPIQPS: 1000
kubeAPIBurst: 2000
serializeImagePulls: false
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
evictionSoft: {}
enableControllerAttachDetach: true
failSwapOn: true
containerLogMaxSize: 20Mi
containerLogMaxFiles: 10
systemReserved: {}
kubeReserved: {}
systemReservedCgroup: ""
kubeReservedCgroup: ""
enforceNodeAllocatable: ["pods"]
EOF
```
- address: the address the kubelet secure port (https, 10250) listens on; it must not be 127.0.0.1, otherwise kube-apiserver, heapster, etc. cannot call the kubelet API;
- readOnlyPort=0: closes the read-only port (default 10255), equivalent to leaving it unset;
- authentication.anonymous.enabled: set to false; anonymous access to port 10250 is not allowed;
- authentication.x509.clientCAFile: specifies the CA certificate that signs client certificates, enabling HTTPS certificate authentication;
- authentication.webhook.enabled=true: enables HTTPS bearer token authentication;
- Requests that pass neither x509 certificate nor webhook authentication (from kube-apiserver or any other client) are rejected with Unauthorized;
- authorization.mode=Webhook: kubelet uses the SubjectAccessReview API to ask kube-apiserver whether a given user or group has permission to operate on a resource (RBAC);
- featureGates.RotateKubeletClientCertificate and featureGates.RotateKubeletServerCertificate: rotate certificates automatically; the certificate lifetime is determined by kube-controller-manager's --experimental-cluster-signing-duration parameter (the openssl check after this list shows how to inspect the resulting lifetime);
- kubelet needs to run as root;
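A quick way to inspect the lifetime of the rotated certificates (a sketch; assumes the kubelet has already bootstrapped, so the symlinks shown later in this document exist):

```bash
# print the validity window of the current kubelet client certificate
openssl x509 -noout -dates -in /etc/kubernetes/cert/kubelet-client-current.pem
```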
2) Create and distribute the kubelet configuration file for each node
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    sed -e "s/##NODE_IP##/${node_ip}/" kubelet-config.yaml.template > kubelet-config-${node_ip}.yaml.template
    scp kubelet-config-${node_ip}.yaml.template root@${node_ip}:/etc/kubernetes/kubelet-config.yaml
  done
```
4. Create and distribute the kubelet systemd unit file

1) Create the kubelet systemd unit template
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kubelet.service.template <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=${K8S_DIR}/kubelet
ExecStart=/opt/k8s/bin/kubelet \\
  --allow-privileged=true \\
  --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \\
  --cert-dir=/etc/kubernetes/cert \\
  --cni-conf-dir=/etc/cni/net.d \\
  --container-runtime=docker \\
  --container-runtime-endpoint=unix:///var/run/dockershim.sock \\
  --root-dir=${K8S_DIR}/kubelet \\
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
  --config=/etc/kubernetes/kubelet-config.yaml \\
  --hostname-override=##NODE_NAME## \\
  --pod-infra-container-image=registry.cn-beijing.aliyuncs.com/images_k8s/pause-amd64:3.1 \\
  --image-pull-progress-deadline=15m \\
  --volume-plugin-dir=${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/ \\
  --logtostderr=true \\
  --v=2
Restart=always
RestartSec=5
StartLimitInterval=0

[Install]
WantedBy=multi-user.target
EOF
```
- If the `--hostname-override` option is set, `kube-proxy` must set it too, otherwise the Node will not be found;
- `--bootstrap-kubeconfig`: points to the bootstrap kubeconfig file; kubelet uses the username and token in this file to send its TLS Bootstrapping request to kube-apiserver;
- After K8S approves the kubelet's CSR, it writes the certificate and private key into the `--cert-dir` directory and then writes the `--kubeconfig` file;
- `--pod-infra-container-image` does not use Red Hat's `pod-infrastructure:latest` image, which cannot reap zombie processes in containers;
2) Create and distribute the kubelet systemd unit file for each node
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"
    sed -e "s/##NODE_NAME##/${node_name}/" kubelet.service.template > kubelet-${node_name}.service
    scp kubelet-${node_name}.service root@${node_name}:/etc/systemd/system/kubelet.service
  done
```
5. Bootstrap Token Auth and granting permissions

On startup, kubelet checks whether the file specified by `--kubeconfig` exists; if it does not, kubelet uses the kubeconfig specified by `--bootstrap-kubeconfig` to send a certificate signing request (CSR) to kube-apiserver.

When kube-apiserver receives the CSR, it authenticates the embedded token; if valid, it sets the request's user to `system:bootstrap:<Token ID>` and its group to `system:bootstrappers`. This process is called Bootstrap Token Auth.

By default this user and group have no permission to create CSRs, so kubelet fails to start with errors like:
```
$ sudo journalctl -u kubelet -a |grep -A 2 'certificatesigningrequests'
May 26 12:13:41 zhangjun-k8s01 kubelet[128468]: I0526 12:13:41.798230  128468 certificate_manager.go:366] Rotating certificates
May 26 12:13:41 zhangjun-k8s01 kubelet[128468]: E0526 12:13:41.801997  128468 certificate_manager.go:385] Failed while requesting a signed certificate from the master: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:bootstrap:82jfrm" cannot create resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.044828  128468 kubelet.go:2244] node "zhangjun-k8s01" not found
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.078658  128468 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.079873  128468 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.082683  128468 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.084473  128468 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.088466  128468 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Unauthorized
```
The fix is to create a clusterrolebinding that binds the group system:bootstrappers to the clusterrole system:node-bootstrapper:
```
$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers
```
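To double-check that the binding works (a sketch; `kubectl auth can-i` supports impersonation, and the token ID 82jfrm is taken from the earlier token list — substitute one of your own):

```bash
# show the binding
kubectl describe clusterrolebinding kubelet-bootstrap

# impersonate a bootstrap identity and ask whether it may now create CSRs (expect "yes")
kubectl auth can-i create certificatesigningrequests.certificates.k8s.io \
  --as=system:bootstrap:82jfrm --as-group=system:bootstrappers
```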
6. Start the kubelet service
```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/"
    ssh root@${node_ip} "/usr/sbin/swapoff -a"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
  done
```
- The working directory must be created before starting the service;
- Swap must be turned off, otherwise kubelet fails to start;
```
$ journalctl -u kubelet |tail
Aug 15 12:16:49 zhangjun-k8s01 kubelet[7807]: I0815 12:16:49.578598    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]}
Aug 15 12:16:49 zhangjun-k8s01 kubelet[7807]: I0815 12:16:49.578698    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]}
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.205871    7807 mount_linux.go:214] Detected OS with systemd
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.205939    7807 server.go:408] Version: v1.11.2
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206013    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]}
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206101    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletServerCertificate:true RotateKubeletClientCertificate:true]}
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206217    7807 plugins.go:97] No cloud provider specified.
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206237    7807 server.go:524] No cloud provider specified: "" from the config file: ""
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206264    7807 bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
Aug 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.208628    7807 bootstrap.go:86] No valid private key and/or certificate found, reusing existing private key or creating a new one
```
After kubelet starts, it uses --bootstrap-kubeconfig to send a CSR to kube-apiserver; once the CSR is approved, kube-controller-manager creates the kubelet's TLS client certificate and private key and writes the file referenced by --kubeconfig.

Note: kube-controller-manager must be configured with the `--cluster-signing-cert-file` and `--cluster-signing-key-file` parameters, otherwise it will not create certificates and private keys for TLS Bootstrap.

For TLS bootstrapping concepts, see: https://www.cnblogs.com/deny/p/12268224.html
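One way to verify those flags on a running control-plane node (a sketch; assumes the kube-controller-manager flags are visible on its command line):

```bash
# list the signing flags of the running kube-controller-manager process
ps -ef | grep '[k]ube-controller-manager' | tr ' ' '\n' | grep cluster-signing
```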
```
$ kubectl get csr
NAME        AGE   REQUESTOR                 CONDITION
csr-5f4vh   31s   system:bootstrap:82jfrm   Pending
csr-5rw7s   29s   system:bootstrap:b1f7np   Pending
csr-m29fm   31s   system:bootstrap:3gzd53   Pending

$ kubectl get nodes
No resources found.
```
- The CSRs of all three worker nodes are in Pending state;
1) Automatically approve CSR requests

Create three ClusterRoleBindings, used respectively to auto-approve client certificates and to renew client and server certificates:
```bash
cd /opt/k8s/work
cat > csr-crb.yaml <<EOF
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
---
# To let a node of the group "system:nodes" renew its own credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-client-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
---
# A ClusterRole which instructs the CSR approver to approve a node requesting a
# serving cert matching its client cert.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: approve-node-server-renewal-csr
rules:
- apiGroups: ["certificates.k8s.io"]
  resources: ["certificatesigningrequests/selfnodeserver"]
  verbs: ["create"]
---
# To let a node of the group "system:nodes" renew its own server credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-server-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: approve-node-server-renewal-csr
  apiGroup: rbac.authorization.k8s.io
EOF

kubectl apply -f csr-crb.yaml
```
- auto-approve-csrs-for-group: automatically approves a node's first CSR; note that the first CSR is requested with the group system:bootstrappers;
- node-client-cert-renewal: automatically approves renewals of a node's expiring client certificates; the auto-generated certificates have the group system:nodes;
- node-server-cert-renewal: automatically approves renewals of a node's expiring server certificates; the auto-generated certificates have the group system:nodes;
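To confirm the bindings exist and watch the approvals happen (optional):

```bash
# the three bindings created above
kubectl get clusterrolebinding auto-approve-csrs-for-group node-client-cert-renewal node-server-cert-renewal

# watch CSR state transitions as the controllers approve them (Ctrl-C to stop)
kubectl get csr --watch
```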
2) Check on the kubelets

After a while (1-10 minutes), the CSRs of all three nodes have been auto-approved:
```
$ kubectl get csr
NAME        AGE     REQUESTOR                    CONDITION
csr-5f4vh   7m59s   system:bootstrap:82jfrm      Approved,Issued
csr-5r7j7   4m45s   system:node:zhangjun-k8s03   Pending
csr-5rw7s   7m57s   system:bootstrap:b1f7np      Approved,Issued
csr-9snww   6m37s   system:bootstrap:82jfrm      Approved,Issued
csr-c7z56   4m46s   system:node:zhangjun-k8s02   Pending
csr-j55lh   4m46s   system:node:zhangjun-k8s01   Pending
csr-m29fm   7m59s   system:bootstrap:3gzd53      Approved,Issued
csr-rc8w7   6m37s   system:bootstrap:3gzd53      Approved,Issued
csr-vd52r   6m36s   system:bootstrap:b1f7np      Approved,Issued
```
The Pending CSRs are for the kubelet server certificates and must be approved manually; see below.

All nodes are Ready:
```
[root@zhangjun-k8s01 ~]# kubectl get nodes
NAME             STATUS   ROLES    AGE    VERSION
zhangjun-k8s01   Ready    <none>   2d1h   v1.14.2
zhangjun-k8s02   Ready    <none>   2d1h   v1.14.2
zhangjun-k8s03   Ready    <none>   2d1h   v1.14.2
```
kube-controller-manager generated a kubeconfig file and a key pair for each node:
```
[root@zhangjun-k8s01 ~]# ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2310 Feb  3 14:44 /etc/kubernetes/kubelet.kubeconfig

[root@zhangjun-k8s01 ~]# ls -l /etc/kubernetes/cert/|grep kubelet
-rw------- 1 root root 1281 Feb  3 14:45 kubelet-client-2020-02-03-14-45-30.pem
lrwxrwxrwx 1 root root   59 Feb  3 14:45 kubelet-client-current.pem -> /etc/kubernetes/cert/kubelet-client-2020-02-03-14-45-30.pem
-rw------- 1 root root 1330 Feb  3 14:47 kubelet-server-2020-02-03-14-47-02.pem
lrwxrwxrwx 1 root root   59 Feb  3 14:47 kubelet-server-current.pem -> /etc/kubernetes/cert/kubelet-server-2020-02-03-14-47-02.pem
```
- No kubelet server certificate is generated automatically (the server files in the listing above only appear after the manual approval described next);
3) Manually approve the server cert CSRs

For [security reasons](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/#kubelet-configuration), the CSR approving controllers do not auto-approve kubelet server certificate signing requests; they have to be approved manually:
```
$ kubectl get csr
NAME        AGE     REQUESTOR                    CONDITION
csr-5f4vh   9m25s   system:bootstrap:82jfrm      Approved,Issued
csr-5r7j7   6m11s   system:node:zhangjun-k8s03   Pending
csr-5rw7s   9m23s   system:bootstrap:b1f7np      Approved,Issued
csr-9snww   8m3s    system:bootstrap:82jfrm      Approved,Issued
csr-c7z56   6m12s   system:node:zhangjun-k8s02   Pending
csr-j55lh   6m12s   system:node:zhangjun-k8s01   Pending
csr-m29fm   9m25s   system:bootstrap:3gzd53      Approved,Issued
csr-rc8w7   8m3s    system:bootstrap:3gzd53      Approved,Issued
csr-vd52r   8m2s    system:bootstrap:b1f7np      Approved,Issued

$ kubectl certificate approve csr-5r7j7
certificatesigningrequest.certificates.k8s.io/csr-5r7j7 approved

$ kubectl certificate approve csr-c7z56
certificatesigningrequest.certificates.k8s.io/csr-c7z56 approved

$ kubectl certificate approve csr-j55lh
certificatesigningrequest.certificates.k8s.io/csr-j55lh approved

$ ls -l /etc/kubernetes/cert/kubelet-*
-rw------- 1 root root 1281 May 26 12:19 /etc/kubernetes/cert/kubelet-client-2019-05-26-12-19-25.pem
lrwxrwxrwx 1 root root   59 May 26 12:19 /etc/kubernetes/cert/kubelet-client-current.pem -> /etc/kubernetes/cert/kubelet-client-2019-05-26-12-19-25.pem
-rw------- 1 root root 1326 May 26 12:26 /etc/kubernetes/cert/kubelet-server-2019-05-26-12-26-39.pem
lrwxrwxrwx 1 root root   59 May 26 12:26 /etc/kubernetes/cert/kubelet-server-current.pem -> /etc/kubernetes/cert/kubelet-server-2019-05-26-12-26-39.pem
```
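With many nodes, approving each CSR by name gets tedious. A sketch of batch approval (assumes every Pending CSR at this point really is a kubelet server CSR you intend to approve — review the list first):

```bash
# approve every CSR currently in Pending state
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
```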
7. API endpoints provided by kubelet

After startup, kubelet listens on several ports that receive requests from kube-apiserver and other clients:
```
[root@zhangjun-k8s01 ~]# netstat -lnpt|grep kubelet
tcp        0      0 127.0.0.1:46758         0.0.0.0:*               LISTEN      1505/kubelet
tcp        0      0 192.168.1.201:10248     0.0.0.0:*               LISTEN      1505/kubelet
tcp        0      0 192.168.1.201:10250     0.0.0.0:*               LISTEN      1505/kubelet
```
- 10248: the healthz http service;
- 10250: the https service; requests to this port must be authenticated and authorized (even requests to /healthz);
- the read-only port 10255 is not opened;
- since K8S v1.10 the `--cadvisor-port` parameter (default port 4194) has been removed; accessing the cAdvisor UI & API is no longer supported.
For example, when you run `kubectl exec -it nginx-ds-5rmws -- sh`, kube-apiserver sends the following request to kubelet:

```
POST /exec/default/nginx-ds-5rmws/my-nginx?command=sh&input=1&output=1&tty=1
```
kubelet serves https requests on port 10250 and exposes the following resources:
+ /pods, /runningpods
+ /metrics, /metrics/cadvisor, /metrics/probes
+ /spec
+ /stats, /stats/container
+ /logs
+ /run/, /exec/, /attach/, /portForward/, /containerLogs/
For details, see: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/server/server.go#L434:3
Because anonymous authentication is disabled and webhook authorization is enabled, all requests to the https API on port 10250 must be authenticated and authorized.

The predefined ClusterRole system:kubelet-api-admin grants access to all kubelet APIs (the User in the kubernetes certificate used by kube-apiserver is granted this permission):
```
[root@zhangjun-k8s01 ~]# kubectl describe clusterrole system:kubelet-api-admin
Name:         system:kubelet-api-admin
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources      Non-Resource URLs  Resource Names  Verbs
  ---------      -----------------  --------------  -----
  nodes/log      []                 []              [*]
  nodes/metrics  []                 []              [*]
  nodes/proxy    []                 []              [*]
  nodes/spec     []                 []              [*]
  nodes/stats    []                 []              [*]
  nodes          []                 []              [get list watch proxy]
```
8. kubelet API authentication and authorization

kubelet is configured with the following authentication parameters:

+ authentication.anonymous.enabled: set to false; anonymous access to port 10250 is not allowed;
+ authentication.x509.clientCAFile: specifies the CA certificate that signs client certificates, enabling HTTPS certificate authentication;
+ authentication.webhook.enabled=true: enables HTTPS bearer token authentication;

and with the following authorization parameter:

+ authorization.mode=Webhook: enables RBAC authorization;
When kubelet receives a request, it authenticates the client certificate against clientCAFile, or queries whether the bearer token is valid. If both fail, the request is rejected with Unauthorized:
```
$ curl -s --cacert /etc/kubernetes/cert/ca.pem https://192.168.1.201:10250/metrics
Unauthorized

$ curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer 123456" https://192.168.1.201:10250/metrics
Unauthorized
```
Once authentication succeeds, kubelet sends a SubjectAccessReview request to kube-apiserver to ask whether the user and group behind the certificate or token have permission to operate on the resource (RBAC).
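You can preview the outcome of that check from the command line (a sketch; `kubectl auth can-i` asks the same question the webhook does, and impersonation requires admin credentials):

```bash
# would kube-controller-manager be allowed to read kubelet metrics? (expect "no")
kubectl auth can-i get nodes/metrics --as=system:kube-controller-manager
```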
1) Certificate authentication and authorization
A certificate whose user lacks kubelet API permissions passes authentication but fails authorization:

```
$ curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /etc/kubernetes/cert/kube-controller-manager.pem --key /etc/kubernetes/cert/kube-controller-manager-key.pem https://192.168.1.201:10250/metrics
Forbidden (user=system:kube-controller-manager, verb=get, resource=nodes, subresource=metrics)
```
2) Use the admin certificate with full privileges that was created when deploying the kubectl command-line tool:
```bash
curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.1.201:10250/metrics|head
```
- The values of `--cacert`, `--cert`, and `--key` must be file paths; a relative path such as `./admin.pem` must keep the `./` prefix, otherwise the request returns `401 Unauthorized`;
3) Bearer token authentication and authorization

Create a ServiceAccount and bind it to the ClusterRole system:kubelet-api-admin so that it is allowed to call the kubelet API:
```bash
kubectl create sa kubelet-api-test
kubectl create clusterrolebinding kubelet-api-test --clusterrole=system:kubelet-api-admin --serviceaccount=default:kubelet-api-test
SECRET=$(kubectl get secrets | grep kubelet-api-test | awk '{print $1}')
TOKEN=$(kubectl describe secret ${SECRET} | grep -E '^token' | awk '{print $2}')
echo ${TOKEN}
```
```bash
curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer ${TOKEN}" https://192.168.1.201:10250/metrics|head
```
9. cadvisor and metrics

cadvisor is embedded in the kubelet binary; it collects resource usage statistics (CPU, memory, disk, network) for all containers on its node.

Opening https://192.168.1.201:10250/metrics and https://192.168.1.201:10250/metrics/cadvisor in a browser returns the kubelet and cadvisor metrics respectively.

Note:
+ kubelet-config.yaml sets authentication.anonymous.enabled to false, so the https service on port 10250 cannot be accessed anonymously;
+ see "accessing the kube-apiserver secure port from a browser" (https://www.cnblogs.com/deny/p/12264757.html) to create and import the required certificates, then access port 10250 as above;
10. Fetching the kubelet configuration

Fetch each node's kubelet configuration from kube-apiserver, again using the full-privilege admin certificate created when deploying kubectl:
```bash
source /opt/k8s/bin/environment.sh
curl -sSL --cacert /etc/kubernetes/cert/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem ${KUBE_APISERVER}/api/v1/nodes/zhangjun-k8s01/proxy/configz | jq '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubelet.config.k8s.io/v1beta1"'
```
kubelet authentication and authorization: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/
V. Deploy the kube-proxy component

1. Create the kube-proxy certificate

1) Create the certificate signing request
```bash
cd /opt/k8s/work
cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
EOF
```
- CN: sets the certificate User to `system:kube-proxy`;
- the predefined ClusterRoleBinding `system:node-proxier` binds User `system:kube-proxy` to ClusterRole `system:node-proxier`, which grants the permissions to call the kube-apiserver Proxy-related APIs;
- kube-proxy uses this certificate only as a client certificate, so the hosts field is empty;
2) Generate the certificate and private key
```bash
cd /opt/k8s/work
cfssl gencert -ca=/opt/k8s/work/ca.pem \
  -ca-key=/opt/k8s/work/ca-key.pem \
  -config=/opt/k8s/work/ca-config.json \
  -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

ls kube-proxy*
```
2. Create and distribute the kubeconfig file

1) Create the file
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh

kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/k8s/work/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig

kubectl config set-credentials kube-proxy \
  --client-certificate=kube-proxy.pem \
  --client-key=kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig

kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig

kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
```
- `--embed-certs=true`: embeds the contents of ca.pem and kube-proxy.pem into the generated kube-proxy.kubeconfig file (without it, only the certificate file paths are written);
2) Distribute the kubeconfig file
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"
    scp kube-proxy.kubeconfig root@${node_name}:/etc/kubernetes/
  done
```
3. Create the kube-proxy configuration file

Since v1.10, **some kube-proxy parameters** can be set in a configuration file. You can generate this file with the `--write-config-to` option, or consult the [comments in the source](https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/proxy/apis/config/types.go).

1) Create the kube-proxy config template
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kube-proxy-config.yaml.template <<EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  burst: 200
  kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
  qps: 100
bindAddress: ##NODE_IP##
healthzBindAddress: ##NODE_IP##:10256
metricsBindAddress: ##NODE_IP##:10249
enableProfiling: true
clusterCIDR: ${CLUSTER_CIDR}
hostnameOverride: ##NODE_NAME##
mode: "ipvs"
portRange: ""
iptables:
  masqueradeAll: false
ipvs:
  scheduler: rr
  excludeCIDRs: []
EOF
```
- `bindAddress`: the listening address;
- `clientConnection.kubeconfig`: the kubeconfig file used to connect to the apiserver;
- `clusterCIDR`: kube-proxy uses this to tell cluster-internal traffic from external traffic; kube-proxy only applies SNAT to requests that access Service IPs when `--cluster-cidr` or `--masquerade-all` is set;
- `hostnameOverride`: must match the kubelet's value, otherwise kube-proxy will not find its Node after starting and will not create any ipvs rules;
- `mode`: use ipvs mode (the module check after this list verifies the kernel support);
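ipvs mode only works if the ip_vs kernel modules are available. A quick check across the nodes (a sketch using the NODE_IPS array; module names can vary slightly between kernel versions):

```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    # list the ipvs-related modules currently loaded into the kernel
    ssh root@${node_ip} "lsmod | grep -e ip_vs -e nf_conntrack"
  done
```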
2) Create and distribute the kube-proxy configuration file for each node
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for (( i=0; i < 3; i++ ))
  do
    echo ">>> ${NODE_NAMES[i]}"
    sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-proxy-config.yaml.template > kube-proxy-config-${NODE_NAMES[i]}.yaml.template
    scp kube-proxy-config-${NODE_NAMES[i]}.yaml.template root@${NODE_NAMES[i]}:/etc/kubernetes/kube-proxy-config.yaml
  done
```
4. Create and distribute the kube-proxy systemd unit file

1) Create the template file
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=${K8S_DIR}/kube-proxy
ExecStart=/opt/k8s/bin/kube-proxy \\
  --config=/etc/kubernetes/kube-proxy-config.yaml \\
  --logtostderr=true \\
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
```
2) Distribute the kube-proxy systemd unit file
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"
    scp kube-proxy.service root@${node_name}:/etc/systemd/system/
  done
```
5. Start the kube-proxy service
```bash
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-proxy"
    ssh root@${node_ip} "modprobe ip_vs_rr"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"
  done
```
- The working directory must be created before starting the service.

1) Check the startup result
```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl status kube-proxy|grep Active"
  done
```
Make sure the status is `active (running)`; otherwise inspect the logs to find the cause: `journalctl -u kube-proxy`

2) Check the listening ports
```
[root@zhangjun-k8s01 ~]# netstat -lnpt|grep kube-prox
tcp        0      0 192.168.1.201:10249     0.0.0.0:*               LISTEN      899/kube-proxy
tcp        0      0 192.168.1.201:10256     0.0.0.0:*               LISTEN      899/kube-proxy
tcp6       0      0 :::31424                :::*                    LISTEN      899/kube-proxy
tcp6       0      0 :::31205                :::*                    LISTEN      899/kube-proxy
tcp6       0      0 :::31789                :::*                    LISTEN      899/kube-proxy
```
- 10249: the http prometheus metrics port;
- 10256: the http healthz port;
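Both ports speak plain http, so they are easy to probe (a sketch; run from a machine that can reach the node):

```bash
# expect an HTTP 200 response with a small timestamp payload
curl http://192.168.1.201:10256/healthz

# the first few prometheus metrics exposed by kube-proxy
curl -s http://192.168.1.201:10249/metrics | head
```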
3) Check the ipvs routing rules
```bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "/usr/sbin/ipvsadm -ln"
  done
```
- You can see that all https requests to the K8S SVC kubernetes are forwarded to port 6443 on the kube-apiserver nodes (the targeted query below narrows the output to that one service).
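To inspect just the kubernetes Service VIP (a sketch; assumes the default Service IP 10.254.0.1 from SERVICE_CIDR):

```bash
# list only the virtual service for the cluster's kubernetes Service (VIP:443)
/usr/sbin/ipvsadm -Ln -t 10.254.0.1:443
```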