- cluster-autoscaler component
- HPA (Horizontal Pod Autoscaler): automatically scales the number of Pods up or down
- VPA (Vertical Pod Autoscaler): automatically scales a Pod's resource configuration up or down, mainly CPU and memory
Automatic node scale-out and scale-in
- AWS: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
- Azure:
Ansible-based node scale-out workflow
1. Trigger the addition of a new node
2. Run the Ansible playbook to deploy the components
3. Check that the services are healthy
4. Call the API to join the new node to the cluster, or enable automatic node registration
5. Watch the new node's status
6. Node scale-out complete; the node starts receiving new Pods
Node scale-in workflow:
# List the nodes
kubectl get node
# Mark the node unschedulable
kubectl cordon $node_name
# Evict the pods on the node
kubectl drain $node_name --ignore-daemonsets
# Remove the node
kubectl delete node $node_name
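The scale-in sequence above can be scripted. The sketch below is a hypothetical helper (not from the original) that only assembles the kubectl command list, so the procedure can be reviewed before anything is executed:

```python
def drain_commands(node_name, ignore_daemonsets=True):
    """Build the ordered kubectl commands that remove a node:
    cordon (stop scheduling), drain (evict pods), delete (deregister)."""
    drain = f"kubectl drain {node_name}"
    if ignore_daemonsets:
        drain += " --ignore-daemonsets"
    return [
        f"kubectl cordon {node_name}",
        drain,
        f"kubectl delete node {node_name}",
    ]

for cmd in drain_commands("k8s-node3"):
    print(cmd)
```

Printing the commands instead of running them makes the helper a safe dry-run; piping the output to a shell would execute the real scale-in.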
Automatic Pod scale-out and scale-in (HPA)
The Horizontal Pod Autoscaler (HPA) automatically adjusts the replica count of a replication controller, deployment, or replica set based on resource utilization or custom metrics, growing and shrinking the workload so that its size tracks the actual load. HPA does not apply to objects that cannot be scaled, such as DaemonSets.
How HPA works
The Metrics Server in Kubernetes continuously collects metrics from all Pod replicas. The HPA controller fetches this data through the Metrics Server API (the legacy Heapster API or the aggregated API) and computes a target Pod replica count from the user-defined scaling rules. Whenever the target differs from the current replica count, the HPA controller issues a scale operation against the Pod's replica controller (Deployment, RC, or ReplicaSet) to adjust the number of replicas, completing the scale-out or scale-in, as shown in the figure.
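The controller's target-replica computation follows the documented HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the min/max bounds. A minimal Python sketch:

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas):
    """HPA core formula: scale replicas proportionally to metric usage,
    then clamp the result to the [minReplicas, maxReplicas] range."""
    desired = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(desired, max_replicas))

# CPU at 100% against a 50% target doubles the replica count.
print(desired_replicas(3, 100, 50, 1, 10))   # 6
# A burst like 1038%/50% on 3 replicas wants 63 but hits the max cap.
print(desired_replicas(3, 1038, 50, 3, 10))  # 10
```

The clamp explains why, later in this walkthrough, the load test drives the deployment to exactly maxReplicas (10) rather than further.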
In HPA, the default scale-up cooldown period is 3 minutes and the default scale-down cooldown period is 5 minutes. Both can be tuned via kube-controller-manager flags:
- --horizontal-pod-autoscaler-upscale-delay: scale-up cooldown
- --horizontal-pod-autoscaler-downscale-delay: scale-down cooldown
Evolution of the HPA API:
Most people are familiar with autoscaling/v1, which supports scaling on only one metric, CPU.
autoscaling/v2beta1 added support for custom metrics, and autoscaling/v2beta2 additionally added support for external metrics.
# v1 version:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50

# v2beta2 version:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: worker_tasks
      target:
        type: AverageValue
        averageValue: 30
Scaling on the CPU metric
# vi /opt/kubernetes/cfg/kube-apiserver.conf
...
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \
--proxy-client-cert-file=/opt/kubernetes/ssl/server.pem \
--proxy-client-key-file=/opt/kubernetes/ssl/server-key.pem \
--requestheader-allowed-names=kubernetes \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--enable-aggregator-routing=true \
...
After changing these settings, restart the kube-apiserver service to enable API aggregation.
Metrics Server collects metrics from the Summary API exposed by the kubelet on each node.
# git clone https://github.com/kubernetes-incubator/metrics-server
Modify deployment.yaml to work around the following cluster issues:
- Issue 1: by default metrics-server uses the node hostname to reach the kubelet on port 10250, but CoreDNS (10.96.0.10:53) has no records for node hostnames and cannot resolve them. Add the startup flag --kubelet-preferred-address-types=InternalIP so the node IP address is used directly.
- Issue 2: the kubelet's port 10250 serves HTTPS, so connecting requires TLS certificate verification. Add the startup flag --kubelet-insecure-tls to skip client certificate verification.
- Issue 3: the image address in the YAML, k8s.gcr.io/metrics-server-amd64:v0.3.0, is not reachable from mainland China without a proxy; replace it with a mirror that is, e.g. Aliyun's. Here the Google mirror on hub.docker.com is used: image: mirrorgooglecontainers/metrics-server-amd64:v0.3.1

kubectl apply -f .
kubectl get pod -n kube-system
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl top node
kubectl get apiservice |grep metrics
kubectl describe apiservice v1beta1.metrics.k8s.io
Create an HPA policy:
# kubectl get pod
NAME                         READY   STATUS    RESTARTS   AGE
java-demo-8548998c57-d4wkp   1/1     Running   0          12m
java-demo-8548998c57-w24x6   1/1     Running   0          11m
java-demo-8548998c57-wbnrs   1/1     Running   0          11m
# kubectl autoscale deployment java-demo --cpu-percent=50 --min=3 --max=10 --dry-run -o yaml > hpa-v1.yaml
# cat hpa-v1.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: java-demo
spec:
  maxReplicas: 10
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: java-demo
  targetCPUUtilizationPercentage: 50
# kubectl apply -f hpa-v1.yaml
# kubectl get hpa
NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
java-demo   Deployment/java-demo   1%/50%    3         10        3          10m
# kubectl describe hpa java-demo
# yum install httpd-tools -y
# kubectl get svc
NAME        TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
java-demo   ClusterIP   10.0.0.215   <none>        80/TCP    171m
# ab -n 100000 -c 100 http://10.0.0.215/index
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 10.0.0.215 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
apr_socket_recv: Connection refused (111)
Total of 85458 requests completed
Check the scale-out status
# kubectl get hpa
NAME        REFERENCE              TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
java-demo   Deployment/java-demo   1038%/50%   3         10        10         165m
# kubectl get pod
NAME                         READY   STATUS    RESTARTS   AGE
java-demo-77d4f5cdcf-4chv4   1/1     Running   0          56s
java-demo-77d4f5cdcf-9bkz7   1/1     Running   0          56s
java-demo-77d4f5cdcf-bk9mk   1/1     Running   0          156m
java-demo-77d4f5cdcf-bv68j   1/1     Running   0          41s
java-demo-77d4f5cdcf-khhlv   1/1     Running   0          41s
java-demo-77d4f5cdcf-nvdjh   1/1     Running   0          56s
java-demo-77d4f5cdcf-pqxvb   1/1     Running   0          41s
java-demo-77d4f5cdcf-pxgl9   1/1     Running   0          41s
java-demo-77d4f5cdcf-qqk6q   1/1     Running   0          156m
java-demo-77d4f5cdcf-tkct6   1/1     Running   0          156m
# kubectl top pod
NAME                         CPU(cores)   MEMORY(bytes)
java-demo-77d4f5cdcf-4chv4   2m           269Mi
java-demo-77d4f5cdcf-bk9mk   2m           246Mi
java-demo-77d4f5cdcf-cwzwz   2m           177Mi
java-demo-77d4f5cdcf-cz7hj   3m           220Mi
java-demo-77d4f5cdcf-fb9zl   3m           197Mi
java-demo-77d4f5cdcf-ftjht   3m           194Mi
java-demo-77d4f5cdcf-qdxqf   2m           174Mi
java-demo-77d4f5cdcf-qx52w   2m           175Mi
java-demo-77d4f5cdcf-rfrlh   3m           220Mi
java-demo-77d4f5cdcf-xjzjt   2m           176Mi
Workflow: hpa -> apiserver -> kube aggregation -> metrics-server -> kubelet (cAdvisor)
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: java-demo
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 2k
- Object: a metric of a specific Kubernetes object; the data must be provided by a third-party adapter, and only the Value and AverageValue target types are supported.
- Pods: a metric of the Pods being scaled; the data must be provided by a third-party adapter, and only the AverageValue target type is allowed.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: java-demo
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: java-demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k
Resource metrics cover only CPU and memory, which is usually enough. But to drive HPA from custom metrics, such as request QPS or the number of 5xx errors, you need the custom metrics pipeline. The most mature implementation today is Prometheus Custom Metrics: the custom metrics are provided by Prometheus and aggregated into the apiserver through k8s-prometheus-adapter, achieving the same effect as the core metrics pipeline (metrics-server).
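Counters such as http_requests_total only ever increase, so a QPS-style metric is derived from them as a per-second rate, conceptually what PromQL's rate() computes. A simplified sketch of that idea over two counter samples (ignoring counter resets and window extrapolation, which real Prometheus handles):

```python
def simple_rate(samples):
    """Approximate PromQL rate(): per-second increase of a counter
    across a window of (timestamp_seconds, value) samples."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (v1 - v0) / (t1 - t0)

# A counter that grew from 86 to 206 requests over 120 s -> 1.0 req/s.
print(simple_rate([(0, 86), (120, 206)]))  # 1.0
```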
Workflow: hpa -> apiserver -> kube aggregation -> prometheus-adapter -> prometheus -> pods
https://github.com/DirectXMan12/k8s-prometheus-adapter
PrometheusAdapter has a stable Helm chart, which we use directly.
wget https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
tar zxvf helm-v3.0.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/bin/
helm repo add stable http://mirror.azure.cn/kubernetes/charts
helm repo update
helm repo list
Deploy prometheus-adapter, pointing it at the Prometheus address:
# helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus.kube-system,prometheus.port=9090
# helm list -n kube-system
NAME                 NAMESPACE     REVISION   UPDATED                                   STATUS     CHART                      APP VERSION
prometheus-adapter   kube-system   1          2020-05-28 11:38:35.156622425 +0800 CST   deployed   prometheus-adapter-2.3.1   v0.6.0
Make sure the adapter is registered with the APIServer:
# kubectl get apiservices |grep custom
# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
Scaling on a QPS metric in practice
Deploy an application that exposes a Prometheus metrics endpoint; the metrics can be seen by requesting the Service:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: metrics-app
  name: metrics-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: metrics-app
  template:
    metadata:
      labels:
        app: metrics-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - image: lizhenliang/metrics-app
        name: metrics-app
        ports:
        - name: web
          containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-app
  labels:
    app: metrics-app
spec:
  ports:
  - name: web
    port: 80
    targetPort: 80
  selector:
    app: metrics-app

# curl 10.99.15.240/metrics
# HELP http_requests_total The amount of requests in total
# TYPE http_requests_total counter
http_requests_total 86
# HELP http_requests_per_second The amount of requests per second the latest ten seconds
# TYPE http_requests_per_second gauge
http_requests_per_second 0.5
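The /metrics response above is Prometheus's plain-text exposition format. As a rough illustration (real deployments use a Prometheus client library; this toy parser handles only unlabeled samples), the values can be extracted like this:

```python
def parse_metrics(text):
    """Parse unlabeled samples from Prometheus text exposition format
    into a dict, skipping # HELP / # TYPE comment lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = line.split()[:2]
        metrics[name] = float(value)
    return metrics

payload = """\
# HELP http_requests_total The amount of requests in total
# TYPE http_requests_total counter
http_requests_total 86
# HELP http_requests_per_second The amount of requests per second the latest ten seconds
# TYPE http_requests_per_second gauge
http_requests_per_second 0.5
"""
print(parse_metrics(payload)["http_requests_per_second"])  # 0.5
```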
Create the HPA policy
Use the metric provided by Prometheus to test autoscaling on a custom metric (QPS).
# vi app-hpa-v2.yml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: metrics-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: metrics-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 800m   # 800m = 0.8 requests per second
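The averageValue of 800m is Kubernetes quantity notation: the m suffix means milli, i.e. 1/1000. A tiny sketch covering just the suffixes used in this document (not the full quantity grammar):

```python
def parse_quantity(q):
    """Convert a Kubernetes quantity string to a float.
    Only the milli (m) and kilo (k) suffixes seen here are handled."""
    if q.endswith("m"):
        return float(q[:-1]) / 1000.0
    if q.endswith("k"):
        return float(q[:-1]) * 1000.0
    return float(q)

print(parse_quantity("800m"))  # 0.8 -> 0.8 requests per second per pod
print(parse_quantity("1k"))    # 1000.0
```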
# kubectl edit cm prometheus-adapter -n kube-system
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: prometheus-adapter
    chart: prometheus-adapter-v0.1.2
    heritage: Tiller
    release: prometheus-adapter
  name: prometheus-adapter
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
...
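The name.matches / name.as pair in the adapter rule rewrites the Prometheus series name into the metric name the HPA references. The same rewrite expressed in Python (using \1 where the adapter config uses ${1}):

```python
import re

def rename_metric(series_name,
                  matches=r"^(.*)_total",
                  as_pattern=r"\1_per_second"):
    """Apply the adapter's naming rule, e.g.
    http_requests_total -> http_requests_per_second."""
    return re.sub(matches, as_pattern, series_name)

print(rename_metric("http_requests_total"))  # http_requests_per_second
```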
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second"
Load test
ab -n 100000 -c 100 http://10.99.15.240/metrics
Check the HPA status
kubectl get hpa
kubectl describe hpa metrics-app-hpa
Summary
- Prometheus aggregates the metrics it has collected;
- the APIServer (through prometheus-adapter) periodically queries Prometheus for the request_per_second data;
- HPA periodically queries the APIServer to decide whether the configured autoscaler rules are met.