Horizontal Pod Autoscaling means that workloads (Pods) running on Kubernetes can automatically scale out and back in based on resource utilization; it relies on the metrics-server service to collect Pod resource metrics. Application resource usage typically has peaks and troughs, which is exactly why the k8s HPA feature exists; it is also one of the clearest advantages over traditional operations: not just elastic scaling, but fully automated scaling!
In production the most common setup is to autoscale the Pod count based on the Pods' CPU utilization metric, so let's run a production-style hands-on test. (Note: before using HPA, make sure the cluster's DNS service and metrics service are running normally, and that the workloads we create have resource requests configured.)
GitHub address:
https://github.com/kubernetes/kubernetes/blob/release-1.19/cluster/addons/metrics-server/
Working around the blocked Google image registry:
Since the k8s.gcr.io images cannot be pulled directly, use a domestic mirror site that syncs them instead.
For example, replace:
docker pull k8s.gcr.io/metrics-server-amd64:v0.3.6
with the Aliyun mirror:
docker pull registry.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6
*** For production use it is recommended to push the image to your own Harbor registry.
1 Before installing, edit the kube-apiserver.yaml manifest:
vim /etc/kubernetes/manifests/kube-apiserver.yaml
...... (omitted)
    - --enable-bootstrap-token-auth=true
    - --enable-aggregator-routing=true # add this line
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
...... (omitted)
*** Make absolutely sure the line "- --enable-aggregator-routing=true" is present.
2 Install metrics-server v0.3.6
mkdir metrics-server
cd metrics-server
#2.1 Download the yaml file:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
#2.2 Edit components.yaml
vim components.yaml
......(everything else unchanged, omitted)......
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        #image: k8s.gcr.io/metrics-server-amd64:v0.3.6 # comment out the original line
        image: registry.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6 # replace with the Aliyun proxy image
        imagePullPolicy: IfNotPresent
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP # added
        - --kubelet-insecure-tls # added
        ports:
        - name: main-port
......(everything else unchanged, omitted)......
## 2.3 Install metrics-server:
kubectl apply -f components.yaml
## 2.4 Check the metrics-server pod status:
[root@k8s-master1 data]# kubectl get pod -n kube-system | grep metrics-server
metrics-server-6ddbc8ff55-zwqdm 1/1 Running 0 38m
## 2.5 Check that the API endpoint is healthy:
[root@k8s-master1 data]# kubectl api-versions
......
events.k8s.io/v1beta1
extensions/v1beta1
metrics.k8s.io/v1beta1 # this line must be present
networking.k8s.io/v1
networking.k8s.io/v1beta1
......
[root@k8s-master1 data]# kubectl describe apiservice v1beta1.metrics.k8s.io # the API endpoint is healthy if you see output like the following
Name: v1beta1.metrics.k8s.io
Namespace:
Labels: <none>
Annotations: <none>
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2021-11-11T09:22:34Z
Managed Fields:
API Version: apiregistration.k8s.io/v1
......(partially omitted)
3 Create a Deployment and add resource limit settings:
## The configuration format for a Pod's resource allocation is shown below.
## By default you can configure only requests, but based on production experience it is advisable to set limits as well: Kubernetes gives a Pod the highest priority only when both are configured with equal values. When a node runs short of resources, Pods with no resource configuration at all are killed first, then Pods with only requests set, and Pods with both set last. Worth thinking about.
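The eviction order described above corresponds to Kubernetes QoS classes (BestEffort, Burstable, Guaranteed). A minimal sketch of how the class is derived from requests/limits, in plain Python (a simplified illustration for a single container, not part of any k8s client library; real Kubernetes evaluates every container and every resource in the Pod):

```python
def qos_class(requests: dict, limits: dict) -> str:
    """Simplified Kubernetes QoS classification for one container.

    Guaranteed: requests are set and equal to limits.
    Burstable:  some requests or limits are set, but not Guaranteed.
    BestEffort: no requests and no limits at all.
    """
    if not requests and not limits:
        return "BestEffort"
    if requests and requests == limits:
        return "Guaranteed"
    return "Burstable"

# Under node pressure, eviction roughly targets BestEffort first, Guaranteed last.
print(qos_class({}, {}))                              # BestEffort
print(qos_class({"cpu": "10m"}, {}))                  # Burstable
print(qos_class({"cpu": "1", "memory": "20Mi"},
                {"cpu": "1", "memory": "20Mi"}))      # Guaranteed
```

This is why the tutorial recommends setting requests and limits to the same values: the Pod lands in the Guaranteed class and is the last to be evicted.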
vim deployment-nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ng-web
  name: ng-web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ng-web
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: ng-web
    spec:
      containers:
      - image: 192.168.10.19/devops/nginx:1.14.0
        name: nginx
        resources: # resource configuration
          limits: # (ceiling) this Pod may use at most 1 CPU core (1000m) and 20Gi of memory
            cpu: '1'
            memory: 20Gi
          requests: # (floor) the Pod is guaranteed at least this much when scheduled
            cpu: '1'
            memory: 20Mi
        ports:
        - containerPort: 80
        volumeMounts:
        - name: my-nfs
          mountPath: /usr/share/nginx/html
        - name: k8snfs-db
          mountPath: /data/nginx/html
      volumes:
      - name: my-nfs
        nfs:
          server: 192.168.10.19
          path: /data/k8sdata
      - name: k8snfs-db
        nfs:
          server: 192.168.10.19
          path: /data/k8sdb
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ng-web
  name: ng-web
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: ng-web
  type: ClusterIP
****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ****** ******
# Install the test deployment:
kubectl apply -f deployment-nginx.yaml
# Check pod resource usage:
[root@k8s-master1 data]# kubectl top pod
NAME CPU(cores) MEMORY(bytes)
ng-web-59f56fdfb6-58ch6 0m 1Mi
ng-web-59f56fdfb6-9ds9x 0m 1Mi
ng-web-59f56fdfb6-xmnks 0m 2Mi
# Check node usage:
[root@k8s-master1 data]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master1.example.com 242m 12% 1172Mi 61%
node1.example.com 66m 3% 505Mi 56%
node2.example.com 56m 2% 520Mi 58%
node3.example.com 80m 4% 534Mi 60%
4 Create an HPA for the test Deployment:
# Create an HPA for the ng-web Deployment: at most 3 pods, at least 1, scaling out once average pod CPU reaches 50%
kubectl autoscale deployment ng-web --max=3 --min=1 --cpu-percent=50
# The events below complain that the HPA is missing the minimum resource request parameter:
[root@k8s-master1 linux39_ng]# kubectl describe hpa web1
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetScale 3m30s (x3 over 142m) horizontal-pod-autoscaler deployments/scale.apps "ng-web" not found
Warning FailedGetResourceMetric 3m (x2 over 3m15s) horizontal-pod-autoscaler unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedComputeMetricsReplicas 3m (x2 over 3m15s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedGetResourceMetric 2m15s (x3 over 2m45s) horizontal-pod-autoscaler missing request for cpu
Warning FailedComputeMetricsReplicas 2m15s (x3 over 2m45s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
***** So the resources limits/requests parameters must be added in deployment-nginx.yaml.
5 Check the HPA resource information:
## Re-apply after adding the resource limits:
kubectl apply -f deployment-nginx.yaml
[root@k8s-master1 linux39_ng]# kubectl autoscale deployment web1 --max=3 --min=1 --cpu-percent=1
horizontalpodautoscaler.autoscaling/web1 autoscaled
# --cpu-percent=1 sets the CPU percentage threshold; keep it low for testing
## Wait a moment and the hpa information appears (the metrics service on k8s scrapes all pod metrics at roughly 60s intervals)
[root@k8s-master1 web1]# kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
web1 Deployment/web1 0%/1% 1 3 1 48m
## Check the HPA again:
[root@k8s-master1 linux39_ng]# kubectl describe hpa web1
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
6 Simulate growing business traffic and watch the HPA autoscale:
# Get the ClusterIP of ng-web:
[root@k8s-master2 ng]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
web1 ClusterIP 172.16.1.108 <none> 80/TCP 16d
# Simulate traffic from another node; a simple html page was thrown together for testing:
[root@k8s-master2 ~]# while :;do curl 172.16.1.108 ;done
<h1>this is NFS PV PVC pod show</h1>
<img src='123.png' >
<h1>this is NFS PV PVC pod show</h1>
If one node is not enough, drive traffic from 2 nodes.....
# Check the HPA information again after a while. It does not scale out immediately; the metric has to stay above the threshold for some time before scale-out starts.
[root@k8s-master1 web1]# kubectl get hpa -w
web1 Deployment/web1 10%/1% 1 3 1 48m
web1 Deployment/web1 10%/1% 1 3 3 48m
web1 Deployment/web1 0%/1% 1 3 3 49m
# Stop the concurrent traffic from the nodes, then check whether the HPA slowly scales back in:
[root@k8s-master1 web1]# kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
web1 Deployment/web1 10%/1% 1 3 1 48m
web1 Deployment/web1 0%/1% 1 3 3 54m
web1 Deployment/web1 0%/1% 1 3 1 54m
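The replica counts seen above follow the HPA controller's documented formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), clamped to the min/max bounds. A quick sketch in plain Python (an illustration of the formula, not the controller's actual code, which also applies tolerances and stabilization windows):

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int) -> int:
    """HPA scaling formula: ceil(current * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))

# 1 replica at 10% CPU against the 1% target wants ceil(1 * 10 / 1) = 10,
# clamped to the --max=3 ceiling, matching the scale-out in the watch output.
print(desired_replicas(1, 10, 1, 1, 3))  # 3
# Once the load is gone, 3 replicas at 0% collapse back to the --min=1 floor.
print(desired_replicas(3, 0, 1, 1, 3))   # 1
```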
7 Creating the HPA from a file:
vim hpa_web1.yaml
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web1
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 50
## Apply the file:
kubectl apply -f hpa_web1.yaml
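On newer clusters where the stable autoscaling/v2 API is available (Kubernetes v1.23+), the same HPA would be written with a metrics list instead of targetCPUUtilizationPercentage. A sketch, assuming the same web1 Deployment as the target:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web1
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```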
Errors:
1. kubectl get hpa -w does not show resource utilization:
Problem 1:
[root@k8s-master1 web1]# kubectl delete ingress web1
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Error from server (NotFound): ingresses.extensions "web1" not found
[root@k8s-master1 web1]# kubectl describe hpa web1
Name: web1
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 23 Nov 2021 16:13:41 +0800
Reference: Deployment/web1
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 1%
Min replicas: 1
Max replicas: 3
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: missing request for cpu
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedComputeMetricsReplicas 10s (x12 over 2m58s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
Warning FailedGetResourceMetric <invalid> (x20 over 2m58s) horizontal-pod-autoscaler missing request for cpu
Problem 2:
[root@k8s-master1 web1]# kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
web1 Deployment/web1 <unknown>/1% 1 3 0 12s
web1 Deployment/web1 <unknown>/1% 1 3 1 15s
*************************************************************************************************************************************
Solution:
Both problems above are caused by the resources section of the Deployment: either the limits exceed what is available, or the resource-limit configuration is in the wrong format. Carefully check how the resource-limit section is written in the Deployment manifest.
The correct resources format is as follows:
......
        # resources: {}  # this line matters: `kubectl` exports include it by default. Either 1) delete it, or 2) write the resource limits inside it. I just commented it out; you may delete it instead.
        resources: # resource configuration
          limits: # (ceiling) this Pod may use at most 20m CPU and 50Mi of memory
            cpu: "20m"
            memory: "50Mi"
          requests: # (floor) the Pod is guaranteed at least this much when scheduled
            cpu: "10m"
            memory: "20Mi"
......
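The quantity strings above ("20m", "50Mi") use Kubernetes resource notation: "m" is milli-CPU (1/1000 of a core) and Mi/Gi are binary mebibytes/gibibytes. A small sketch that converts the common forms to plain numbers (a hypothetical helper for illustration, not a function from any k8s library; the real quantity grammar supports more suffixes):

```python
def parse_cpu(q: str) -> float:
    """CPU quantity to cores: '20m' -> 0.02, '1' -> 1.0."""
    return float(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_memory(q: str) -> int:
    """Memory quantity to bytes, for the common binary suffixes only."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, factor in units.items():
        if q.endswith(suffix):
            return int(float(q[:-2]) * factor)
    return int(q)  # no suffix: plain bytes

print(parse_cpu("20m"))      # 0.02
print(parse_memory("50Mi"))  # 52428800
```

Mixing these up (e.g. "20G" decimal vs "20Gi" binary, or "20Mi" where "20Gi" was intended) is a common source of the format errors described above.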