In the previous post I walked through the Kubernetes container cluster management environment - complete deployment (part 2). This post continues the series with the deployment of the Kubernetes cluster add-ons:
11. Kubernetes Cluster Add-ons
Add-ons are supplementary components that enrich and extend the functionality of a Kubernetes cluster. The add-ons covered here are CoreDNS, Dashboard, and Metrics Server. Note that the manifest yaml files bundled with kubernetes reference the gcr.io docker registry, which is blocked in mainland China, so the registry address must be replaced by hand, or the images downloaded in advance on a server that can reach gcr.io and then copied over to the k8s deployment machines.
11.1 - Kubernetes Cluster Add-ons - CoreDNS
The blocked images can be downloaded from the free gcr.io proxy provided by Microsoft China. All deployment commands below are executed on the k8s-master01 node.
1) Modify the configuration file
After extracting the downloaded kubernetes-server-linux-amd64.tar.gz, extract the kubernetes-src.tar.gz file found inside it.
[root@k8s-master01 ~]# cd /opt/k8s/work/kubernetes
[root@k8s-master01 kubernetes]# tar -xzvf kubernetes-src.tar.gz
After extraction, the coredns directory is cluster/addons/dns.
[root@k8s-master01 kubernetes]# cd /opt/k8s/work/kubernetes/cluster/addons/dns/coredns
[root@k8s-master01 coredns]# cp coredns.yaml.base coredns.yaml
[root@k8s-master01 coredns]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 coredns]# sed -i -e "s/__PILLAR__DNS__DOMAIN__/${CLUSTER_DNS_DOMAIN}/" -e "s/__PILLAR__DNS__SERVER__/${CLUSTER_DNS_SVC_IP}/" coredns.yaml
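Before moving on, it is worth a quick sanity check that sed actually replaced both placeholders (a small sketch; it only checks the two placeholders handled above):
[root@k8s-master01 coredns]# grep -E "__PILLAR__DNS__(DOMAIN|SERVER)__" coredns.yaml || echo "domain/server placeholders replaced"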
2) Create coredns
[root@k8s-master01 coredns]# fgrep "image" ./*
./coredns.yaml: image: k8s.gcr.io/coredns:1.3.1
./coredns.yaml: imagePullPolicy: IfNotPresent
./coredns.yaml.base: image: k8s.gcr.io/coredns:1.3.1
./coredns.yaml.base: imagePullPolicy: IfNotPresent
./coredns.yaml.in: image: k8s.gcr.io/coredns:1.3.1
./coredns.yaml.in: imagePullPolicy: IfNotPresent
./coredns.yaml.sed: image: k8s.gcr.io/coredns:1.3.1
./coredns.yaml.sed: imagePullPolicy: IfNotPresent
Download the "k8s.gcr.io/coredns:1.3.1" image in advance from a machine that can reach gcr.io, upload it to the node machines, and run "docker load ..." to import it into each node's local images.
Alternatively, pull the blocked image through the free gcr.io proxy provided by Microsoft China and re-tag it, or update the coredns image address in the yaml file; a sketch of the pull-and-retag route follows below.
Either way, make sure the image pull policy in the yaml file is IfNotPresent, i.e. use the local image if present instead of pulling.
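For example, pulling through the Azure China gcr.io proxy and re-tagging back to the name referenced in the yaml might look like this, run on each node (gcr.azk8s.cn/google_containers is the commonly cited proxy path and is an assumption here; adjust if it has changed):
# pull via the China mirror, then re-tag to the original gcr.io name so the yaml needs no edits
docker pull gcr.azk8s.cn/google_containers/coredns:1.3.1
docker tag gcr.azk8s.cn/google_containers/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
docker rmi gcr.azk8s.cn/google_containers/coredns:1.3.1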
Then proceed with creating coredns:
[root@k8s-master01 coredns]# kubectl create -f coredns.yaml
3) Check CoreDNS functionality (after running the command below, wait a little while and make sure every READY state shows available)
[root@k8s-master01 coredns]# kubectl get all -n kube-system
NAME READY STATUS RESTARTS AGE
pod/coredns-5b969f4c88-pd5js 1/1 Running 0 55s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.254.0.2 <none> 53/UDP,53/TCP,9153/TCP 56s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 1/1 1 1 57s
NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-5b969f4c88 1 1 1 56s
Check the status of the coredns pod that was created and make sure there are no errors:
[root@k8s-master01 coredns]# kubectl describe pod/coredns-5b969f4c88-pd5js -n kube-system
.............
.............
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m12s default-scheduler Successfully assigned kube-system/coredns-5b969f4c88-pd5js to k8s-node03
Normal Pulled 2m11s kubelet, k8s-node03 Container image "k8s.gcr.io/coredns:1.3.1" already present on machine
Normal Created 2m10s kubelet, k8s-node03 Created container coredns
Normal Started 2m10s kubelet, k8s-node03 Started container coredns
4) Create a new Deployment
[root@k8s-master01 coredns]# cd /opt/k8s/work
[root@k8s-master01 work]# cat > my-nginx.yaml <<EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
EOF
Then create the Deployment:
[root@k8s-master01 work]# kubectl create -f my-nginx.yaml
Expose the Deployment, generating the my-nginx service:
[root@k8s-master01 work]# kubectl expose deploy my-nginx
[root@k8s-master01 work]# kubectl get services --all-namespaces |grep my-nginx
default my-nginx ClusterIP 10.254.170.246 <none> 80/TCP 19s
Create another Pod and check whether its /etc/resolv.conf contains the --cluster-dns and --cluster-domain values configured on the kubelet,
and whether it can resolve the service name my-nginx to the Cluster IP 10.254.170.246 shown above.
[root@k8s-master01 work]# cd /opt/k8s/work
[root@k8s-master01 work]# cat > dnsutils-ds.yml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: dnsutils-ds
  labels:
    app: dnsutils-ds
spec:
  type: NodePort
  selector:
    app: dnsutils-ds
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: dnsutils-ds
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        app: dnsutils-ds
    spec:
      containers:
      - name: my-dnsutils
        image: tutum/dnsutils:latest
        command:
          - sleep
          - "3600"
        ports:
        - containerPort: 80
EOF
Then create the pods:
[root@k8s-master01 work]# kubectl create -f dnsutils-ds.yml
Check the status of the pods created above (wait a moment and make sure STATUS is "Running"; if a pod fails, run "kubectl describe pod ..." to find the reason):
[root@k8s-master01 work]# kubectl get pods -lapp=dnsutils-ds
NAME READY STATUS RESTARTS AGE
dnsutils-ds-5sc4z 1/1 Running 0 52s
dnsutils-ds-h546r 1/1 Running 0 52s
dnsutils-ds-jx5kx 1/1 Running 0 52s
[root@k8s-master01 work]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dnsutils-ds NodePort 10.254.185.211 <none> 80:32767/TCP 7m14s
kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 7d13h
my-nginx ClusterIP 10.254.170.246 <none> 80/TCP 9m11s
nginx-ds NodePort 10.254.41.83 <none> 80:30876/TCP 27h
Now verify CoreDNS functionality.
First log in to each of the dnsutils pods created above in turn and confirm that the nameserver address in the container's /etc/resolv.conf is the "CLUSTER_DNS_SVC_IP" value (as defined in the environment.sh script); a loop over all the pods is sketched at the end of this subsection.
[root@k8s-master01 work]# kubectl -it exec dnsutils-ds-5sc4z bash
root@dnsutils-ds-5sc4z:/# cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local localdomain
options ndots:5
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kubernetes
Server: 10.254.0.2
Address: 10.254.0.2#53
Name: kubernetes.default.svc.cluster.local
Address: 10.254.0.1
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup www.baidu.com
Server: 10.254.0.2
Address: 10.254.0.2#53
Non-authoritative answer:
www.baidu.com canonical name = www.a.shifen.com.
www.a.shifen.com canonical name = www.wshifen.com.
Name: www.wshifen.com
Address: 103.235.46.39
The service my-nginx indeed resolves to its Cluster IP 10.254.170.246 shown above:
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup my-nginx
Server: 10.254.0.2
Address: 10.254.0.2#53
Non-authoritative answer:
Name: my-nginx.default.svc.cluster.local
Address: 10.254.170.246
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc.cluster
Server: 10.254.0.2
Address: 10.254.0.2#53
** server can't find kube-dns.kube-system.svc.cluster: NXDOMAIN
command terminated with exit code 1
(This fails because "cluster" is not a complete zone: neither the name itself nor the name with any of the resolv.conf search domains appended exists; the zone is cluster.local, as the following queries show.)
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc
Server: 10.254.0.2
Address: 10.254.0.2#53
Name: kube-dns.kube-system.svc.cluster.local
Address: 10.254.0.2
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc.cluster.local
Server: 10.254.0.2
Address: 10.254.0.2#53
Name: kube-dns.kube-system.svc.cluster.local
Address: 10.254.0.2
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc.cluster.local.
Server: 10.254.0.2
Address: 10.254.0.2#53
Name: kube-dns.kube-system.svc.cluster.local
Address: 10.254.0.2
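Rather than exec'ing into each dnsutils pod by hand, the same spot checks can be looped over every pod via the label selector from dnsutils-ds.yml (a convenience sketch):
for p in $(kubectl get pods -l app=dnsutils-ds -o jsonpath='{.items[*].metadata.name}'); do
  echo "=== ${p} ==="
  kubectl exec ${p} -- cat /etc/resolv.conf | grep nameserver
  kubectl exec ${p} -- nslookup my-nginx
done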
11.2 - Kubernetes Cluster Add-ons - dashboard
The blocked images can be downloaded from the free gcr.io proxy provided by Microsoft China. All deployment commands below are executed on the k8s-master01 node.
1) Modify the configuration file
After extracting the downloaded kubernetes-server-linux-amd64.tar.gz, extract the kubernetes-src.tar.gz file inside it (already done above during the coredns deployment).
[root@k8s-master01 ~]# cd /opt/k8s/work/kubernetes/
[root@k8s-master01 kubernetes]# ls -d cluster/addons/dashboard
cluster/addons/dashboard
The directory for dashboard is: cluster/addons/dashboard
[root@k8s-master01 kubernetes]# cd /opt/k8s/work/kubernetes/cluster/addons/dashboard
Modify the service definition to set the port type to NodePort, so the dashboard can be reached from outside at NodeIP:NodePort:
[root@k8s-master01 dashboard]# vim dashboard-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  type: NodePort      # add this line
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 443
    targetPort: 8443
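Alternatively, the type can be patched after the service already exists, instead of editing the file (equivalent effect; shown only as a sketch):
kubectl -n kube-system patch svc kubernetes-dashboard -p '{"spec":{"type":"NodePort"}}'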
2) Apply all the definition files
Download the k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 image in advance from a machine that can reach gcr.io, upload it to the node machines, and run "docker load ..." to import it into each node's local images;
or pull the blocked image from the free gcr.io proxy provided by Microsoft China and update the dashboard image address in the yaml file.
[root@k8s-master01 dashboard]# fgrep "image" ./*
./dashboard-controller.yaml: image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
[root@k8s-master01 dashboard]# ls *.yaml
dashboard-configmap.yaml dashboard-controller.yaml dashboard-rbac.yaml dashboard-secret.yaml dashboard-service.yaml
[root@k8s-master01 dashboard]# kubectl apply -f .
3) Check the assigned NodePort
[root@k8s-master01 dashboard]# kubectl get deployment kubernetes-dashboard -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 1/1 1 1 48s
[root@k8s-master01 dashboard]# kubectl --namespace kube-system get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5b969f4c88-pd5js 1/1 Running 0 33m 172.30.72.3 k8s-node03 <none> <none>
kubernetes-dashboard-85bcf5dbf8-8s7hm 1/1 Running 0 63s 172.30.72.6 k8s-node03 <none> <none>
[root@k8s-master01 dashboard]# kubectl get services kubernetes-dashboard -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.254.164.208 <none> 443:30284/TCP 104s
As shown, NodePort 30284 is mapped to the dashboard pod's port 443.
4) Check the command-line flags supported by dashboard
[root@k8s-master01 dashboard]# kubectl exec --namespace kube-system -it kubernetes-dashboard-85bcf5dbf8-8s7hm -- /dashboard --help
2019/06/25 16:54:04 Starting overwatch
Usage of /dashboard:
--alsologtostderr log to standard error as well as files
--api-log-level string Level of API request logging. Should be one of 'INFO|NONE|DEBUG'. Default: 'INFO'. (default "INFO")
--apiserver-host string The address of the Kubernetes Apiserver to connect to in the format of protocol://address:port, e.g., http://localhost:8080. If not specified, the assumption is that the binary runs inside a Kubernetes cluster and local discovery is attempted.
--authentication-mode strings Enables authentication options that will be reflected on login screen. Supported values: token, basic. Default: token.Note that basic option should only be used if apiserver has '--authorization-mode=ABAC' and '--basic-auth-file' flags set. (default [token])
--auto-generate-certificates When set to true, Dashboard will automatically generate certificates used to serve HTTPS. Default: false.
--bind-address ip The IP address on which to serve the --secure-port (set to 0.0.0.0 for all interfaces). (default 0.0.0.0)
--default-cert-dir string Directory path containing '--tls-cert-file' and '--tls-key-file' files. Used also when auto-generating certificates flag is set. (default "/certs")
--disable-settings-authorizer When enabled, Dashboard settings page will not require user to be logged in and authorized to access settings page.
--enable-insecure-login When enabled, Dashboard login view will also be shown when Dashboard is not served over HTTPS. Default: false.
--enable-skip-login When enabled, the skip button on the login page will be shown. Default: false.
--heapster-host string The address of the Heapster Apiserver to connect to in the format of protocol://address:port, e.g., http://localhost:8082. If not specified, the assumption is that the binary runs inside a Kubernetes cluster and service proxy will be used.
--insecure-bind-address ip The IP address on which to serve the --port (set to 0.0.0.0 for all interfaces). (default 127.0.0.1)
--insecure-port int The port to listen to for incoming HTTP requests. (default 9090)
--kubeconfig string Path to kubeconfig file with authorization and master location information.
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--logtostderr log to standard error instead of files
--metric-client-check-period int Time in seconds that defines how often configured metric client health check should be run. Default: 30 seconds. (default 30)
--port int The secure port to listen to for incoming HTTPS requests. (default 8443)
--stderrthreshold severity logs at or above this threshold go to stderr (default 2)
--system-banner string When non-empty displays message to Dashboard users. Accepts simple HTML tags. Default: ''.
--system-banner-severity string Severity of system banner. Should be one of 'INFO|WARNING|ERROR'. Default: 'INFO'. (default "INFO")
--tls-cert-file string File containing the default x509 Certificate for HTTPS.
--tls-key-file string File containing the default x509 private key matching --tls-cert-file.
--token-ttl int Expiration time (in seconds) of JWE tokens generated by dashboard. Default: 15 min. 0 - never expires (default 900)
-v, --v Level log level for V logs
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
pflag: help requested
command terminated with exit code 2
5) Access the dashboard
Since version 1.7 the dashboard only allows access over https, and when using kubectl proxy it must listen on localhost or 127.0.0.1.
NodePort has no such restriction, but it is only recommended for development environments.
For access that does not meet these conditions, the browser does not redirect after a successful login and stays on the login page.
There are three ways to access the dashboard:
-> the kubernetes-dashboard service exposes a NodePort, so the dashboard can be reached at https://NodeIP:NodePort;
-> through kube-apiserver;
-> through kubectl proxy.
Method 1:
The kubernetes-dashboard service exposes a NodePort, so the dashboard can be reached at https://NodeIP:NodePort:
[root@k8s-master01 dashboard]# kubectl get services kubernetes-dashboard -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.254.164.208 <none> 443:30284/TCP 14m
The dashboard UI can therefore be opened at https://172.16.60.244:30284, https://172.16.60.245:30284, or https://172.16.60.246:30284.
Method 2: access the dashboard through kubectl proxy
Start the proxy (the command below keeps running in the foreground; a tmux session is handy for this):
[root@k8s-master01 dashboard]# kubectl proxy --address='localhost' --port=8086 --accept-hosts='^*$'
Starting to serve on 127.0.0.1:8086
Note:
--address must be localhost or 127.0.0.1;
the --accept-hosts option is required, otherwise the browser shows "Unauthorized" when opening the dashboard page;
the dashboard can then be opened in a browser on this server at: http://127.0.0.1:8086/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Method 3: access the dashboard through kube-apiserver
Get the list of cluster service addresses:
[root@k8s-master01 dashboard]# kubectl cluster-info
Kubernetes master is running at https://172.16.60.250:8443
CoreDNS is running at https://172.16.60.250:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://172.16.60.250:8443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Note:
the dashboard must be accessed through kube-apiserver's secure (https) port, and the browser needs the custom certificate imported, otherwise kube-apiserver refuses the connection.
Creating and importing the custom certificate was covered earlier in the "deploy node worker nodes" part, so it is skipped here.
Open the URL https://172.16.60.250:8443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy in a browser to reach the dashboard UI.
6) Create a login token and kubeconfig file for the Dashboard
The dashboard only supports token authentication by default (client certificate authentication is not supported), so when a kubeconfig file is used the token must be written into it.
Method 1: create a login token
[root@k8s-master01 ~]# kubectl create sa dashboard-admin -n kube-system
serviceaccount/dashboard-admin created
[root@k8s-master01 ~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
[root@k8s-master01 ~]# ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}')
[root@k8s-master01 ~]# DASHBOARD_LOGIN_TOKEN=$(kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}')
[root@k8s-master01 ~]# echo ${DASHBOARD_LOGIN_TOKEN}
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tcmNicnMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZGQ1Njg0OGUtOTc2Yi0xMWU5LTkwZDQtMDA1MDU2YWM3YzgxIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.Kwh_zhI-dA8kIfs7DRmNecS_pCXQ3B2ujS_eooR-Gvoaz29cJTzD_Z67bRDS1qlJ8oyIQjW2_m837EkUCpJ8LRiOnTMjwBPMeBPHHomDGdSmdj37UEc7YQa5AmkvVWIYiUKgTHJjgLaKlk6eH7Ihvcez3IBHWTFXlULu24mlMt9XP4J7M5fIg7I5-ctfLIbV2NsvWLwiv6JAECocbGX1w0fJTmn9LlheiDQP1ByxU_WavsFYWOYPEqdUQbqcZ7iovT1ZUVyFuGS5rxzSHm86tcK_ptEinYO1dGLjMrLRZ3tB1OAOW8_u-VnHqsNwKjbZJNUljfzCGy1YoI2xUB7V4w
The token printed above can now be used to log in to the Dashboard.
Method 2: create a kubeconfig file that carries the token (recommended)
[root@k8s-master01 ~]# source /opt/k8s/bin/environment.sh
Set the cluster parameters:
[root@k8s-master01 ~]# kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/cert/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=dashboard.kubeconfig
Set the client credentials, using the token created above:
[root@k8s-master01 ~]# kubectl config set-credentials dashboard_user \
--token=${DASHBOARD_LOGIN_TOKEN} \
--kubeconfig=dashboard.kubeconfig
Set the context parameters:
[root@k8s-master01 ~]# kubectl config set-context default \
--cluster=kubernetes \
--user=dashboard_user \
--kubeconfig=dashboard.kubeconfig
Set the default context:
[root@k8s-master01 ~]# kubectl config use-context default --kubeconfig=dashboard.kubeconfig
Copy the generated dashboard.kubeconfig file to your local machine and use it to log in to the Dashboard.
[root@k8s-master01 ~]# ll dashboard.kubeconfig
-rw------- 1 root root 3025 Jun 26 01:14 dashboard.kubeconfig
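Before handing the kubeconfig to a browser, it can be sanity-checked from the shell (this assumes the cluster-admin binding created above is in place):
[root@k8s-master01 ~]# kubectl --kubeconfig=dashboard.kubeconfig get pods -n kube-system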
Because the Heapster or metrics-server add-on is still missing at this point, the dashboard cannot yet display CPU, memory and other statistics or charts for Pods and Nodes.
11.3 - Deploying the metrics-server add-on
metrics-server discovers all nodes through kube-apiserver, then calls the kubelet APIs (over https) to obtain the CPU, memory and other resource usage of each Node and Pod. Starting with Kubernetes 1.12 the installation scripts dropped Heapster, from 1.13 Heapster support was removed entirely, and Heapster is no longer maintained. The replacements are:
-> CPU/memory HPA metrics for autoscaling: metrics-server;
-> general monitoring: a third-party monitoring system that can consume Prometheus-format metrics, such as Prometheus Operator;
-> event shipping: third-party tools that ship and archive kubernetes events.
Since Kubernetes 1.8, resource usage metrics (such as container CPU and memory usage) have been available through the Metrics API, and metrics-server replaced heapster. Metrics Server implements the Resource Metrics API and is a cluster-wide aggregator of resource usage data; it collects metrics from the Summary API exposed by the Kubelet on each node.
Before looking at Metrics Server itself, it helps to understand the Metrics API. Compared with the previous collection approach (heapster), the Metrics API is a new way of thinking: upstream wants core-metric monitoring to be stable, versioned, directly accessible to users (for example via the kubectl top command) and usable by controllers in the cluster (such as HPA), just like the other Kubernetes APIs. Deprecating the heapster project was precisely about treating core resource monitoring as a first-class citizen, accessed directly through the api-server or a client the way pods and services are, rather than installing a heapster that aggregates everything and is managed on its own.
Suppose 10 metrics are collected per pod and per node. From k8s 1.6 onward a cluster supports 5000 nodes with 30 pods each, so at a granularity of one sample per minute that is 10 x 5000 x 30 / 60 = 25000 samples per second on average. The k8s api-server persists all of its data to etcd, so k8s itself clearly cannot handle collection at that rate; moreover this monitoring data changes quickly and is all transient, so a separate component is needed to handle it and keep only part of it in memory. This is how the metrics-server concept was born. Heapster did already expose an API, but users and other Kubernetes components could only reach it through the master proxy, and heapster's interface, unlike the api-server's, lacked complete authentication/authorization and client integration.
With the Metrics Server component in place the data is collected and an API is exposed, but since the API must be unified, how does a request to the api-server's /apis/metrics path get forwarded to Metrics Server?
The answer is kube-aggregator, finished in k8s 1.7; in fact Metrics Server's release was held up waiting for exactly this kube-aggregator step. kube-aggregator (the aggregation API) mainly provides:
-> Provide an API for registering API servers;
-> Summarize discovery information from all the servers;
-> Proxy client requests to individual servers;
Using the Metrics API:
-> the Metrics API only serves current measurements and stores no historical data;
-> the Metrics API URI is /apis/metrics.k8s.io/, maintained under k8s.io/metrics;
-> metrics-server must be deployed before the API can be used; metrics-server obtains the data by calling the Kubelet Summary API.
Metrics server periodically scrapes metrics from the Kubelet Summary API (along the lines of /api/v1/nodes/nodename/stats/summary); the aggregated data is kept in memory and exposed in metric-api form. Metrics server reuses the api-server's libraries for its own needs, such as authentication and versioning; to keep the data in memory it drops the default etcd storage and introduces an in-memory store (its own implementation of the Storage interface). Because the data lives in memory it is not persisted, although third-party storage can be plugged in to extend that, same as with heapster.
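The Summary API that metrics-server scrapes can also be inspected by hand through the apiserver's node proxy, which is useful when metrics come back empty (k8s-node01 below is just an example node name):
[root@k8s-master01 ~]# kubectl get --raw "/api/v1/nodes/k8s-node01/proxy/stats/summary" | head -20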
Kubernetes Dashboard does not yet support metrics-server, so with metrics-server replacing Heapster the dashboard cannot graph Pod memory and CPU; a monitoring stack such as Prometheus with Grafana has to fill that gap. The manifest yaml files bundled with kubernetes use the gcr.io docker registry, which is blocked in mainland China, and need to be pointed at another registry by hand (not done in this document); the blocked images can be downloaded from the free gcr.io proxy provided by Microsoft China. All deployment commands below are executed on the k8s-master01 node.
Monitoring architecture (diagram omitted)

1) Install metrics-server
Clone the source from github:
[root@k8s-master01 ~]# cd /opt/k8s/work/
[root@k8s-master01 work]# git clone https://github.com/kubernetes-incubator/metrics-server.git
[root@k8s-master01 work]# cd metrics-server/deploy/1.8+/
[root@k8s-master01 1.8+]# ls
aggregated-metrics-reader.yaml auth-reader.yaml metrics-server-deployment.yaml resource-reader.yaml
auth-delegator.yaml metrics-apiservice.yaml metrics-server-service.yaml
Modify the metrics-server-deployment.yaml file to add the following command-line flags for metrics-server (below the "imagePullPolicy" line):
[root@k8s-master01 1.8+]# cp metrics-server-deployment.yaml metrics-server-deployment.yaml.bak
[root@k8s-master01 1.8+]# vim metrics-server-deployment.yaml
.........
        args:
        - --metric-resolution=30s
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
Note:
--metric-resolution=30s: the interval at which data is scraped from the kubelets;
--kubelet-preferred-address-types: prefer InternalIP when talking to a kubelet, which avoids failures calling a node's kubelet API via the node name when that name has no DNS record (the default behaviour when the flag is unset).
Also:
download the k8s.gcr.io/metrics-server-amd64:v0.3.3 image in advance from a machine that can reach gcr.io, upload it to the node machines, and run "docker load ..." to import it into each node's local images;
or pull the blocked image from the free gcr.io proxy provided by Microsoft China and update the metrics-server image address in the yaml file.
[root@k8s-master01 1.8+]# fgrep "image" metrics-server-deployment.yaml
# mount in tmp so we can safely use from-scratch images and/or read-only containers
image: k8s.gcr.io/metrics-server-amd64:v0.3.3
imagePullPolicy: Always
Since the image has already been imported into each node's local images, the image pull policy in metrics-server-deployment.yaml must be changed to "IfNotPresent",
i.e. use the local image if present instead of pulling.
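The edit can be scripted rather than done by hand, for example (a one-liner sketch; verify the result with fgrep as below):
[root@k8s-master01 1.8+]# sed -i 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/' metrics-server-deployment.yaml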
[root@k8s-master01 1.8+]# fgrep "image" metrics-server-deployment.yaml
# mount in tmp so we can safely use from-scratch images and/or read-only containers
image: k8s.gcr.io/metrics-server-amd64:v0.3.3
imagePullPolicy: IfNotPresent
Deploy metrics-server:
[root@k8s-master01 1.8+]# kubectl create -f .
2) Check that it is running
[root@k8s-master01 1.8+]# kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-54997795d9-4cv6h 1/1 Running 0 50s
[root@k8s-master01 1.8+]# kubectl get svc -n kube-system metrics-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metrics-server ClusterIP 10.254.238.208 <none> 443/TCP 65s
3) metrics-server command-line flags (run the command below on any node)
[root@k8s-node01 ~]# docker run -it --rm k8s.gcr.io/metrics-server-amd64:v0.3.3 --help
4) Check the metrics exposed by metrics-server
-> via kube-apiserver or kubectl proxy (the <node-name>, <namespace> and <pod-name> placeholders stand in for concrete object names):
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/nodes
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/nodes/<node-name>
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/pods
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods/<pod-name>
-> or directly with kubectl:
# kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes
# kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes/<node-name>
# kubectl get --raw apis/metrics.k8s.io/v1beta1/pods
# kubectl get --raw apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods/<pod-name>
[root@k8s-master01 1.8+]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "nodes",
      "singularName": "",
      "namespaced": false,
      "kind": "NodeMetrics",
      "verbs": [
        "get",
        "list"
      ]
    },
    {
      "name": "pods",
      "singularName": "",
      "namespaced": true,
      "kind": "PodMetrics",
      "verbs": [
        "get",
        "list"
      ]
    }
  ]
}
[root@k8s-master01 1.8+]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "k8s-node01",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/k8s-node01",
        "creationTimestamp": "2019-06-27T17:11:43Z"
      },
      "timestamp": "2019-06-27T17:11:36Z",
      "window": "30s",
      "usage": {
        "cpu": "47615396n",
        "memory": "2413536Ki"
      }
    },
    {
      "metadata": {
        "name": "k8s-node02",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/k8s-node02",
        "creationTimestamp": "2019-06-27T17:11:43Z"
      },
      "timestamp": "2019-06-27T17:11:38Z",
      "window": "30s",
      "usage": {
        "cpu": "42000411n",
        "memory": "2496152Ki"
      }
    },
    {
      "metadata": {
        "name": "k8s-node03",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/k8s-node03",
        "creationTimestamp": "2019-06-27T17:11:43Z"
      },
      "timestamp": "2019-06-27T17:11:40Z",
      "window": "30s",
      "usage": {
        "cpu": "54095172n",
        "memory": "3837404Ki"
      }
    }
  ]
}
Note: the usage returned by /apis/metrics.k8s.io/v1beta1/nodes and /apis/metrics.k8s.io/v1beta1/pods includes both CPU and Memory.
5) Check cluster node resource usage with kubectl top
[root@k8s-master01 1.8+]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-node01 45m 1% 2357Mi 61%
k8s-node02 44m 1% 2437Mi 63%
k8s-node03 54m 1% 3747Mi 47%
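Pod-level usage can be checked the same way once metrics-server is serving data:
[root@k8s-master01 1.8+]# kubectl top pod -n kube-system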
=======================================================================================================================================
Troubleshooting:
[root@k8s-master01 1.8+]# kubectl top node
Error from server (Forbidden): nodes.metrics.k8s.io is forbidden: User "aggregator" cannot list resource "nodes" in API group "metrics.k8s.io" at the cluster scope
The error above occurs because the aggregator user has not been granted RBAC authorization.
The lazy fix is to bind it straight to cluster-admin, which works but violates the principle of least privilege (a narrower sketch follows the command):
[root@k8s-master01 1.8+]# kubectl create clusterrolebinding custom-metric-with-cluster-admin --clusterrole=cluster-admin --user=aggregator
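A less privileged alternative would be to grant only read access to the metrics resources, roughly like this (a sketch, not part of the original deployment; the resource names follow the metrics.k8s.io group):
cat > aggregator-metrics-rbac.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes", "pods"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: aggregator-metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: aggregator
EOF
kubectl apply -f aggregator-metrics-rbac.yaml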


11.4 - Deploying the kube-state-metrics add-on
metrics-server, deployed above, collects most of the metrics about running containers, but it cannot answer questions like these:
-> How many replicas were scheduled? How many are available right now?
-> How many Pods are in running/stopped/terminated state?
-> How many times has a Pod restarted?
-> How many jobs are currently running?
That is what kube-state-metrics provides. It is an add-on service for K8S, built on client-go: it polls the Kubernetes API and turns Kubernetes' structured information into metrics. kube-state-metrics can collect data for the vast majority of built-in k8s resources, such as pods, deployments and services. It also exposes data about itself, mainly counts of collected resources and of collection errors.
The kube-state-metrics metric families include:
CronJob Metrics
DaemonSet Metrics
Deployment Metrics
Job Metrics
LimitRange Metrics
Node Metrics
PersistentVolume Metrics
PersistentVolumeClaim Metrics
Pod Metrics
Pod Disruption Budget Metrics
ReplicaSet Metrics
ReplicationController Metrics
ResourceQuota Metrics
Service Metrics
StatefulSet Metrics
Namespace Metrics
Horizontal Pod Autoscaler Metrics
Endpoint Metrics
Secret Metrics
ConfigMap Metrics
Taking pods as an example, the metrics include:
kube_pod_info
kube_pod_owner
kube_pod_status_running
kube_pod_status_ready
kube_pod_status_scheduled
kube_pod_container_status_waiting
kube_pod_container_status_terminated_reason
..............
kube-state-metrics compared with metrics-server (or heapster):
1) metrics-server fetches monitoring metrics such as cpu and memory usage from the api-server and sends them to a storage backend such as influxdb or a cloud vendor; its core role today is to provide decision metrics for components such as HPA.
2) kube-state-metrics focuses on obtaining the latest state of the various k8s resources, such as deployments or daemonsets. It was not folded into metrics-server's capabilities because their concerns are fundamentally different: metrics-server only fetches and formats existing data and writes it to specific storage, essentially acting as a monitoring system, whereas kube-state-metrics takes an in-memory snapshot of the cluster's running state and derives new metrics from it, but has no ability to export those metrics itself.
3) Put another way, kube-state-metrics is itself a possible data source for metrics-server, even though it is not used that way today.
4) Also, a monitoring system like Prometheus does not consume metrics-server's data: it does its own metric collection and integration (Prometheus subsumes metrics-server's capabilities). Prometheus can, however, monitor the health of the metrics-server component itself and alert on it, and that monitoring can be implemented through kube-state-metrics, e.g. the running state of the metrics-server pod.
kube-state-metrics essentially keeps polling the api-server; its performance optimizations:
Earlier versions of kube-state-metrics exposed two problems:
1) the /metrics endpoint responded slowly (10-20s);
2) memory consumption was too high, causing the pod to exceed its limit and be killed.
The fix for problem 1 was a local cache built on client-go's cache tool, structured as: var cache = map[uuid][]byte{}
The fix for problem 2: time-series strings contain many repeated characters (such as namespace prefixes used for filtering), which can be deduplicated with pointers or by structuring the repeated parts.
kube-state-metrics optimizations and open issues:
1) Since kube-state-metrics listens for the add, delete and update events of resources, does that mean data for resources already running before kube-state-metrics was deployed is lost? No: kube-state-metrics uses client-go to initialize from all already-existing resource objects, so nothing is missed;
2) kube-state-metrics currently does not output metadata (such as help and description);
3) the cache is implemented as a golang map, and concurrent reads are currently handled with a simple mutex, which should be adequate; golang's thread-safe sync.Map may be considered later;
4) kube-state-metrics guarantees event ordering by comparing resource versions;
5) kube-state-metrics does not guarantee that it covers all resources.
All deployment commands below are executed on the k8s-master01 node.
1) Modify the configuration files
Put the downloaded kube-state-metrics.tar.gz in the /opt/k8s/work directory and extract it:
[root@k8s-master01 ~]# cd /opt/k8s/work/
[root@k8s-master01 work]# tar -zvxf kube-state-metrics.tar.gz
[root@k8s-master01 work]# cd kube-state-metrics
The kube-state-metrics directory holds all the required files:
[root@k8s-master01 kube-state-metrics]# ll
total 32
-rw-rw-r-- 1 root root 362 May 6 17:31 kube-state-metrics-cluster-role-binding.yaml
-rw-rw-r-- 1 root root 1076 May 6 17:31 kube-state-metrics-cluster-role.yaml
-rw-rw-r-- 1 root root 1657 Jul 1 17:35 kube-state-metrics-deployment.yaml
-rw-rw-r-- 1 root root 381 May 6 17:31 kube-state-metrics-role-binding.yaml
-rw-rw-r-- 1 root root 508 May 6 17:31 kube-state-metrics-role.yaml
-rw-rw-r-- 1 root root 98 May 6 17:31 kube-state-metrics-service-account.yaml
-rw-rw-r-- 1 root root 404 May 6 17:31 kube-state-metrics-service.yaml
[root@k8s-master01 kube-state-metrics]# fgrep -R "image" ./*
./kube-state-metrics-deployment.yaml: image: quay.io/coreos/kube-state-metrics:v1.5.0
./kube-state-metrics-deployment.yaml: imagePullPolicy: IfNotPresent
./kube-state-metrics-deployment.yaml: image: k8s.gcr.io/addon-resizer:1.8.3
./kube-state-metrics-deployment.yaml: imagePullPolicy: IfNotPresent
[root@k8s-master01 kube-state-metrics]# cat kube-state-metrics-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: kube-system
  labels:
    k8s-app: kube-state-metrics
  annotations:
    prometheus.io/scrape: 'true'
spec:
  ports:
  - name: http-metrics
    port: 8080
    targetPort: http-metrics
    protocol: TCP
  - name: telemetry
    port: 8081
    targetPort: telemetry
    protocol: TCP
  type: NodePort      # add this line
  selector:
    k8s-app: kube-state-metrics
Note two points:
the "k8s.gcr.io/addon-resizer:1.8.3" image cannot be pulled from mainland China for the usual reasons; it can be swapped for "ist0ne/addon-resizer", which works fine, or downloaded via a proxy;
the service must be changed to NodePort if access from outside the cluster is required.
2) Apply all the definition files
Download the quay.io/coreos/kube-state-metrics:v1.5.0 and k8s.gcr.io/addon-resizer:1.8.3 images in advance from a machine that can reach those registries, upload them to the node machines, and run "docker load ..." to import them into each node's local images;
or pull the blocked images from the free gcr.io proxy provided by Microsoft China and update the image addresses in the yaml file. Since the images have already been imported into each node's local images here,
change the image pull policy in kube-state-metrics-deployment.yaml to "IfNotPresent", i.e. use the local image if present instead of pulling.
[root@k8s-master01 kube-state-metrics]# kubectl create -f .
Check after applying:
[root@k8s-master01 kube-state-metrics]# kubectl get pod -n kube-system|grep kube-state-metrics
kube-state-metrics-5dd55c764d-nnsdv 2/2 Running 0 9m3s
[root@k8s-master01 kube-state-metrics]# kubectl get svc -n kube-system|grep kube-state-metrics
kube-state-metrics NodePort 10.254.228.212 <none> 8080:30978/TCP,8081:30872/TCP 9m14s
[root@k8s-master01 kube-state-metrics]# kubectl get pod,svc -n kube-system|grep kube-state-metrics
pod/kube-state-metrics-5dd55c764d-nnsdv 2/2 Running 0 9m12s
service/kube-state-metrics NodePort 10.254.228.212 <none> 8080:30978/TCP,8081:30872/TCP 9m18s
3) Verify kube-state-metrics data collection
From the check above, the NodePort mapped for external access is 30978; it can be verified against any node:
[root@k8s-master01 kube-state-metrics]# curl http://172.16.60.244:30978/metrics|head -10
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0# HELP kube_configmap_info Information about configmap.
# TYPE kube_configmap_info gauge
kube_configmap_info{namespace="kube-system",configmap="extension-apiserver-authentication"} 1
kube_configmap_info{namespace="kube-system",configmap="coredns"} 1
kube_configmap_info{namespace="kube-system",configmap="kubernetes-dashboard-settings"} 1
# HELP kube_configmap_created Unix creation timestamp
# TYPE kube_configmap_created gauge
kube_configmap_created{namespace="kube-system",configmap="extension-apiserver-authentication"} 1.560825764e+09
kube_configmap_created{namespace="kube-system",configmap="coredns"} 1.561479528e+09
kube_configmap_created{namespace="kube-system",configmap="kubernetes-dashboard-settings"} 1.56148146e+09
100 73353 0 73353 0 0 9.8M 0 --:--:-- --:--:-- --:--:-- 11.6M
curl: (23) Failed writing body (0 != 2048)
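The trailing "curl: (23) Failed writing body" is harmless here: head exits after printing 10 lines and closes the pipe, so curl can no longer write the rest of the body. A quieter variant of the same check:
[root@k8s-master01 kube-state-metrics]# curl -s http://172.16.60.244:30978/metrics | head -10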


11.5 - Deploying a harbor private registry
For installation, refer to the earlier post introducing and deploying a Docker private Harbor registry. Harbor needs to be installed on both machines, 172.16.60.247 and 172.16.60.248, with Nginx+Keepalived on top to give Harbor load balancing and high availability, and the two Harbors replicating to each other (master-master). Remote replication in harbor works as follows: 1) create a target under registry management, after which the connection to the target can be tested; 2) create a rule under replication management that references the target created above; 3) replicate manually or on a schedule.
For example, images have already been pushed to the library and kevin_img projects in the private registry on the harbor node 172.16.60.247, as shown:

Now the images under these two projects in 172.16.60.247's harbor are to be replicated to the harbor on the other node, 172.16.60.248. Replication direction: 247 -> 248 or 247 <- 248.
(Screenshots of the harbor replication target, rule configuration and manual replication steps omitted.)
The above is manual replication; scheduled replication can also be chosen, with cron-style fields for "second minute hour day month weekday". Set as below to replicate every two minutes, and changes are synced over automatically within two minutes.

11.6 - Kubernetes cluster management tests
[root@k8s-master01 ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-2 Healthy {"health":"true"}
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
[root@k8s-master01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-node01 Ready <none> 20d v1.14.2
k8s-node02 Ready <none> 20d v1.14.2
k8s-node03 Ready <none> 20d v1.14.2
Deploy a test instance:
[root@k8s-master01 ~]# kubectl run kevin-nginx --image=nginx --replicas=3
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
deployment.apps/kevin-nginx created
[root@k8s-master01 ~]# kubectl run --generator=run-pod/v1 kevin-nginx --image=nginx --replicas=3
pod/kevin-nginx created
Wait a moment, then check the created kevin-nginx pods (the nginx image is pulled automatically during creation, so this takes a little while):
[root@k8s-master01 ~]# kubectl get pods --all-namespaces|grep "kevin-nginx"
default kevin-nginx 1/1 Running 0 98s
default kevin-nginx-569dcd559b-6h4nn 1/1 Running 0 106s
default kevin-nginx-569dcd559b-7f2b4 1/1 Running 0 106s
default kevin-nginx-569dcd559b-7tds2 1/1 Running 0 106s
Check the details:
[root@k8s-master01 ~]# kubectl get pods --all-namespaces -o wide|grep "kevin-nginx"
default kevin-nginx 1/1 Running 0 2m13s 172.30.72.12 k8s-node03 <none> <none>
default kevin-nginx-569dcd559b-6h4nn 1/1 Running 0 2m21s 172.30.56.7 k8s-node02 <none> <none>
default kevin-nginx-569dcd559b-7f2b4 1/1 Running 0 2m21s 172.30.72.11 k8s-node03 <none> <none>
default kevin-nginx-569dcd559b-7tds2 1/1 Running 0 2m21s 172.30.88.8 k8s-node01 <none> <none>
[root@k8s-master01 ~]# kubectl get deployment|grep kevin-nginx
kevin-nginx 3/3 3 3 2m57s
Create an svc:
[root@k8s-master01 ~]# kubectl expose deployment kevin-nginx --port=8080 --target-port=80 --type=NodePort
[root@k8s-master01 ~]# kubectl get svc|grep kevin-nginx
kevin-nginx NodePort 10.254.111.50 <none> 8080:32177/TCP 33s
Within the cluster, pods can reach kevin-nginx at:
[root@k8s-master01 ~]# curl http://10.254.111.50:8080
The external address for kevin-nginx is http://node_ip:32177, i.e.:
http://172.16.60.244:32177
http://172.16.60.245:32177
http://172.16.60.246:32177
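A quick reachability check from any machine that can reach the nodes (-I asks nginx for the headers only):
# curl -I http://172.16.60.244:32177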


11.7 - Tearing down the kubernetes cluster
1) Clean up the Node machines (same steps on every node)
Stop the related services:
[root@k8s-node01 ~]# systemctl stop kubelet kube-proxy flanneld docker kube-nginx
Clean up files:
[root@k8s-node01 ~]# source /opt/k8s/bin/environment.sh
Unmount the directories mounted by kubelet and docker:
[root@k8s-node01 ~]# mount | grep "${K8S_DIR}" | awk '{print $3}'|xargs sudo umount
Delete the kubelet working directory:
[root@k8s-node01 ~]# sudo rm -rf ${K8S_DIR}/kubelet
Delete the docker working directory:
[root@k8s-node01 ~]# sudo rm -rf ${DOCKER_DIR}
Delete the network configuration file written by flanneld:
[root@k8s-node01 ~]# sudo rm -rf /var/run/flannel/
Delete docker's runtime files:
[root@k8s-node01 ~]# sudo rm -rf /var/run/docker/
Delete the systemd unit files:
[root@k8s-node01 ~]# sudo rm -rf /etc/systemd/system/{kubelet,docker,flanneld,kube-nginx}.service
Delete the binaries:
[root@k8s-node01 ~]# sudo rm -rf /opt/k8s/bin/*
Delete the certificate files:
[root@k8s-node01 ~]# sudo rm -rf /etc/flanneld/cert /etc/kubernetes/cert
Clean up the iptables rules created by kube-proxy and docker:
[root@k8s-node01 ~]# iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat
Delete the network interfaces created by flanneld and docker:
[root@k8s-node01 ~]# ip link del flannel.1
[root@k8s-node01 ~]# ip link del docker0
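Since every node needs the same cleanup, the steps can be wrapped into a script and pushed out over ssh, roughly like this (a sketch; it assumes a NODE_IPS array in environment.sh, as used elsewhere in this series):
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}; do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "systemctl stop kubelet kube-proxy flanneld docker kube-nginx"
done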
2) Clean up the Master machines (same steps on every master)
Stop the related services:
[root@k8s-master01 ~]# systemctl stop kube-apiserver kube-controller-manager kube-scheduler kube-nginx
Clean up files:
Delete the systemd unit files:
[root@k8s-master01 ~]# rm -rf /etc/systemd/system/{kube-apiserver,kube-controller-manager,kube-scheduler,kube-nginx}.service
Delete the binaries:
[root@k8s-master01 ~]# rm -rf /opt/k8s/bin/{kube-apiserver,kube-controller-manager,kube-scheduler}
Delete the certificate files:
[root@k8s-master01 ~]# rm -rf /etc/flanneld/cert /etc/kubernetes/cert
Clean up the etcd cluster:
[root@k8s-master01 ~]# systemctl stop etcd
Clean up files:
[root@k8s-master01 ~]# source /opt/k8s/bin/environment.sh
Delete etcd's working and data directories:
[root@k8s-master01 ~]# rm -rf ${ETCD_DATA_DIR} ${ETCD_WAL_DIR}
Delete the systemd unit file:
[root@k8s-master01 ~]# rm -rf /etc/systemd/system/etcd.service
Delete the binary:
[root@k8s-master01 ~]# rm -rf /opt/k8s/bin/etcd
Delete the x509 certificate files:
[root@k8s-master01 ~]# rm -rf /etc/etcd/cert/*
The dashboard deployed above uses https certificates; to access the kubernetes cluster web UI over plain http instead, do the following:
1) Configure kubernetes-dashboard.yaml (the "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1" image it references has already been downloaded to the node machines):
[root@k8s-master01 ~]# cd /opt/k8s/work/
[root@k8s-master01 work]# cat kubernetes-dashboard.yaml
# ------------------- Dashboard Secret ------------------- #
apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kube-system
type: Opaque
---
# ------------------- Dashboard Service Account ------------------- #
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
---
# ------------------- Dashboard Role & Role Binding ------------------- #
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
rules:
  # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create"]
  # Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
  verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kubernetes-dashboard-settings"]
  verbs: ["get", "update"]
  # Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
  resources: ["services"]
  resourceNames: ["heapster"]
  verbs: ["proxy"]
- apiGroups: [""]
  resources: ["services/proxy"]
  resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
# ------------------- Dashboard Deployment ------------------- #
kind: Deployment
apiVersion: apps/v1beta2
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
      - name: kubernetes-dashboard
        image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
        ports:
        - containerPort: 9090
          protocol: TCP
        args:
          #- --auto-generate-certificates
          # Uncomment the following line to manually specify Kubernetes API server Host
          # If not specified, Dashboard will attempt to auto discover the API server and connect
          # to it. Uncomment only if the default does not work.
          #- --apiserver-host=http://10.0.1.168:8080
        volumeMounts:
        - name: kubernetes-dashboard-certs
          mountPath: /certs
          # Create on-disk volume to store exec logs
        - mountPath: /tmp
          name: tmp-volume
        livenessProbe:
          httpGet:
            scheme: HTTP
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
      volumes:
      - name: kubernetes-dashboard-certs
        secret:
          secretName: kubernetes-dashboard-certs
      - name: tmp-volume
        emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
---
# ------------------- Dashboard Service ------------------- #
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  ports:
  - port: 9090
    targetPort: 9090
  selector:
    k8s-app: kubernetes-dashboard
---
# ------------------------------------------------------------
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-external
  namespace: kube-system
spec:
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30090
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
Create the yaml file:
[root@k8s-master01 work]# kubectl create -f kubernetes-dashboard.yaml
Wait a little, then check the kubernetes-dashboard pod (as shown below, the pod landed on the k8s-node03 node, i.e. 172.16.60.246):
[root@k8s-master01 work]# kubectl get pods -n kube-system -o wide|grep "kubernetes-dashboard"
kubernetes-dashboard-7976c5cb9c-q7z2w 1/1 Running 0 10m 172.30.72.6 k8s-node03 <none> <none>
[root@k8s-master01 work]# kubectl get svc -n kube-system|grep "kubernetes-dashboard"
kubernetes-dashboard-external NodePort 10.254.227.142 <none> 9090:30090/TCP 10m
The web UI can then be opened over plain http at http://NodeIP:30090, for example http://172.16.60.246:30090.