Horizontal Pod Autoscaling means that workloads (Pods) running on Kubernetes can automatically scale out and back in based on resource utilization; it relies on the metrics-server service to collect Pod resource metrics. Application resource usage typically has peaks and troughs, which is exactly why the k8s HPA feature exists; it is also one of the clearest advantages over traditional operations: not just elastic scaling, but fully automated scaling!
In production the most common setup is to autoscale the Pod count based on the Pods' CPU utilization metric, so let's run a production-style hands-on test. (Note: before using HPA, make sure the cluster's DNS service and metrics service are running normally, and that the workloads we create have resource requests configured.)
GitHub address:
https://github.com/kubernetes/kubernetes/blob/release-1.19/cluster/addons/metrics-server/
Working around the blocked Google image registry:
Since the k8s.gcr.io images cannot be pulled directly, use a domestic mirror site that syncs them instead.
For example, replace:
docker pull k8s.gcr.io/metrics-server-amd64:v0.3.6
with the Aliyun mirror:
docker pull registry.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6
*** For production use it is recommended to push the image to your own Harbor registry.
1 Before installing, edit the kube-apiserver.yaml manifest:
vim /etc/kubernetes/manifests/kube-apiserver.yaml
...... (omitted)
    - --enable-bootstrap-token-auth=true
    - --enable-aggregator-routing=true # add this line
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
...... (omitted)
*** Make absolutely sure the line "- --enable-aggregator-routing=true" is present.
2 Install metrics-server v0.3.6
mkdir metrics-server
cd metrics-server
#2.1 Download the yaml file:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
#2.2 Edit components.yaml
vim components.yaml
......(everything else unchanged, omitted)......
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        #image: k8s.gcr.io/metrics-server-amd64:v0.3.6 # comment out the original line
        image: registry.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6 # replace with the Aliyun proxy image
        imagePullPolicy: IfNotPresent
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP # added
        - --kubelet-insecure-tls # added
        ports:
        - name: main-port
......(everything else unchanged, omitted)......
## 2.3 Install metrics-server:
kubectl apply -f components.yaml
## 2.4 Check the metrics-server pod status:
[root@k8s-master1 data]# kubectl get pod -n kube-system | grep metrics-server
metrics-server-6ddbc8ff55-zwqdm 1/1 Running 0 38m
## 2.5 Check that the API endpoint is healthy:
[root@k8s-master1 data]# kubectl api-versions
......
events.k8s.io/v1beta1
extensions/v1beta1
metrics.k8s.io/v1beta1 # this line must be present
networking.k8s.io/v1
networking.k8s.io/v1beta1
......
[root@k8s-master1 data]# kubectl describe apiservice v1beta1.metrics.k8s.io # the API endpoint is healthy if you see output like the following
Name: v1beta1.metrics.k8s.io
Namespace:
Labels: <none>
Annotations: <none>
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2021-11-11T09:22:34Z
Managed Fields:
API Version: apiregistration.k8s.io/v1
......(partially omitted)
3 Create a Deployment and add resource limit settings:
## The configuration format for a Pod's resource allocation is shown below.
## By default you can configure only requests, but based on production experience it is advisable to set limits as well: Kubernetes gives a Pod the highest priority only when both are configured with equal values. When a node runs short of resources, Pods with no resource configuration at all are killed first, then Pods with only requests set, and Pods with both set last. Worth thinking about.
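The eviction order described above corresponds to Kubernetes QoS classes (BestEffort, Burstable, Guaranteed). A minimal sketch of how the class is derived from requests/limits, in plain Python (a simplified illustration for a single container, not part of any k8s client library; real Kubernetes evaluates every container and every resource in the Pod):

```python
def qos_class(requests: dict, limits: dict) -> str:
    """Simplified Kubernetes QoS classification for one container.

    Guaranteed: requests are set and equal to limits.
    Burstable:  some requests or limits are set, but not Guaranteed.
    BestEffort: no requests and no limits at all.
    """
    if not requests and not limits:
        return "BestEffort"
    if requests and requests == limits:
        return "Guaranteed"
    return "Burstable"

# Under node pressure, eviction roughly targets BestEffort first, Guaranteed last.
print(qos_class({}, {}))                              # BestEffort
print(qos_class({"cpu": "10m"}, {}))                  # Burstable
print(qos_class({"cpu": "1", "memory": "20Mi"},
                {"cpu": "1", "memory": "20Mi"}))      # Guaranteed
```

This is why the tutorial recommends setting requests and limits to the same values: the Pod lands in the Guaranteed class and is the last to be evicted.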
vim deployment-nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ng-web
  name: ng-web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ng-web
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: ng-web
    spec:
      containers:
      - image: 192.168.10.19/devops/nginx:1.14.0
        name: nginx
        resources: # resource configuration
          limits: # (ceiling) this Pod may use at most 1 CPU core (1000m) and 20Gi of memory
            cpu: '1'
            memory: 20Gi
          requests: # (floor) the Pod is guaranteed at least this much when scheduled
            cpu: '1'
            memory: 20Mi
        ports:
        - containerPort: 80
        volumeMounts:
        - name: my-nfs
          mountPath: /usr/share/nginx/html
        - name: k8snfs-db
          mountPath: /data/nginx/html
      volumes:
      - name: my-nfs
        nfs:
          server: 192.168.10.19
          path: /data/k8sdata
      - name: k8snfs-db
        nfs:
          server: 192.168.10.19
          path: /data/k8sdb
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ng-web
  name: ng-web
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: ng-web
  type: ClusterIP
****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ******
# Install the test deployment:
kubectl apply -f deployment-nginx.yaml
# Check pod resource usage:
[root@k8s-master1 data]# kubectl top pod
NAME CPU(cores) MEMORY(bytes)
ng-web-59f56fdfb6-58ch6 0m 1Mi
ng-web-59f56fdfb6-9ds9x 0m 1Mi
ng-web-59f56fdfb6-xmnks 0m 2Mi
# Check node usage:
[root@k8s-master1 data]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master1.example.com 242m 12% 1172Mi 61%
node1.example.com 66m 3% 505Mi 56%
node2.example.com 56m 2% 520Mi 58%
node3.example.com 80m 4% 534Mi 60%
4 Create an HPA for the test Deployment:
# Create an HPA for the ng-web Deployment: at most 3 pods, at least 1, scaling out once average pod CPU reaches 50%
kubectl autoscale deployment ng-web --max=3 --min=1 --cpu-percent=50
# The events below complain that the HPA is missing the minimum resource request parameter:
[root@k8s-master1 linux39_ng]# kubectl describe hpa web1
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetScale 3m30s (x3 over 142m) horizontal-pod-autoscaler deployments/scale.apps "ng-web" not found
Warning FailedGetResourceMetric 3m (x2 over 3m15s) horizontal-pod-autoscaler unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedComputeMetricsReplicas 3m (x2 over 3m15s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedGetResourceMetric 2m15s (x3 over 2m45s) horizontal-pod-autoscaler missing request for cpu
Warning FailedComputeMetricsReplicas 2m15s (x3 over 2m45s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
***** So the resources limits/requests parameters must be added in deployment-nginx.yaml.
5 Check the HPA resource information:
## Re-apply after adding the resource limits:
kubectl apply -f deployment-nginx.yaml
[root@k8s-master1 linux39_ng]# kubectl autoscale deployment web1 --max=3 --min=1 --cpu-percent=1
horizontalpodautoscaler.autoscaling/web1 autoscaled
# --cpu-percent=1 sets the CPU percentage threshold; keep it low for testing
## Wait a moment and the hpa information appears (the metrics service on k8s scrapes all pod metrics at roughly 60s intervals)
[root@k8s-master1 web1]# kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
web1 Deployment/web1 0%/1% 1 3 1 48m
## Check the HPA again:
[root@k8s-master1 linux39_ng]# kubectl describe hpa web1
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
6 Simulate growing business traffic and watch the HPA autoscale:
# Get the ClusterIP of ng-web:
[root@k8s-master2 ng]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
web1 ClusterIP 172.16.1.108 <none> 80/TCP 16d
# Simulate traffic from another node; a simple html page was thrown together for testing:
[root@k8s-master2 ~]# while :;do curl 172.16.1.108 ;done
<h1>this is NFS PV PVC pod show</h1>
<img src='123.png' >
<h1>this is NFS PV PVC pod show</h1>
If one node is not enough, drive traffic from 2 nodes.....
# Check the HPA information again after a while. It does not scale out immediately; the metric has to stay above the threshold for some time before scale-out starts.
[root@k8s-master1 web1]# kubectl get hpa -w
web1 Deployment/web1 10%/1% 1 3 1 48m
web1 Deployment/web1 10%/1% 1 3 3 48m
web1 Deployment/web1 0%/1% 1 3 3 49m
# Stop the concurrent traffic from the nodes, then check whether the HPA slowly scales back in:
[root@k8s-master1 web1]# kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
web1 Deployment/web1 10%/1% 1 3 1 48m
web1 Deployment/web1 0%/1% 1 3 3 54m
web1 Deployment/web1 0%/1% 1 3 1 54m
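The replica counts seen above follow the HPA controller's documented formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), clamped to the min/max bounds. A quick sketch in plain Python (an illustration of the formula, not the controller's actual code, which also applies tolerances and stabilization windows):

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int) -> int:
    """HPA scaling formula: ceil(current * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))

# 1 replica at 10% CPU against the 1% target wants ceil(1 * 10 / 1) = 10,
# clamped to the --max=3 ceiling, matching the scale-out in the watch output.
print(desired_replicas(1, 10, 1, 1, 3))  # 3
# Once the load is gone, 3 replicas at 0% collapse back to the --min=1 floor.
print(desired_replicas(3, 0, 1, 1, 3))   # 1
```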
7 Creating the HPA from a file:
vim hpa_web1.yaml
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web1
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 50
## Apply the file:
kubectl apply -f hpa_web1.yaml
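On newer clusters where the stable autoscaling/v2 API is available (Kubernetes v1.23+), the same HPA would be written with a metrics list instead of targetCPUUtilizationPercentage. A sketch, assuming the same web1 Deployment as the target:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web1
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```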
Errors:
1. kubectl get hpa -w does not show resource utilization:
Problem 1:
[root@k8s-master1 web1]# kubectl delete ingress web1
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Error from server (NotFound): ingresses.extensions "web1" not found
[root@k8s-master1 web1]# kubectl describe hpa web1
Name: web1
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 23 Nov 2021 16:13:41 +0800
Reference: Deployment/web1
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 1%
Min replicas: 1
Max replicas: 3
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: missing request for cpu
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedComputeMetricsReplicas 10s (x12 over 2m58s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
Warning FailedGetResourceMetric <invalid> (x20 over 2m58s) horizontal-pod-autoscaler missing request for cpu
Problem 2:
[root@k8s-master1 web1]# kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
web1 Deployment/web1 <unknown>/1% 1 3 0 12s
web1 Deployment/web1 <unknown>/1% 1 3 1 15s
*************************************************************************************************************************************
Solution:
Both problems above are caused by the resources section of the Deployment: either the limits exceed what is available, or the resource-limit configuration is in the wrong format. Carefully check how the resource-limit section is written in the Deployment manifest.
The correct resources format is as follows:
......
        # resources: {}  # this line matters: `kubectl` exports include it by default. Either 1) delete it, or 2) write the resource limits inside it. I just commented it out; you may delete it instead.
        resources: # resource configuration
          limits: # (ceiling) this Pod may use at most 20m CPU and 50Mi of memory
            cpu: "20m"
            memory: "50Mi"
          requests: # (floor) the Pod is guaranteed at least this much when scheduled
            cpu: "10m"
            memory: "20Mi"
......
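The quantity strings above ("20m", "50Mi") use Kubernetes resource notation: "m" is milli-CPU (1/1000 of a core) and Mi/Gi are binary mebibytes/gibibytes. A small sketch that converts the common forms to plain numbers (a hypothetical helper for illustration, not a function from any k8s library; the real quantity grammar supports more suffixes):

```python
def parse_cpu(q: str) -> float:
    """CPU quantity to cores: '20m' -> 0.02, '1' -> 1.0."""
    return float(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_memory(q: str) -> int:
    """Memory quantity to bytes, for the common binary suffixes only."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, factor in units.items():
        if q.endswith(suffix):
            return int(float(q[:-2]) * factor)
    return int(q)  # no suffix: plain bytes

print(parse_cpu("20m"))      # 0.02
print(parse_memory("50Mi"))  # 52428800
```

Mixing these up (e.g. "20G" decimal vs "20Gi" binary, or "20Mi" where "20Gi" was intended) is a common source of the format errors described above.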