When multiple users or teams share a Kubernetes cluster with a fixed number of nodes, there is a concern that one team or user could consume more than its fair share of resources. Resource quotas are a tool administrators use to address this concern.
A resource quota, defined by a ResourceQuota object, provides an overall constraint on resource consumption in a namespace. It can limit the number of objects that can be created in that namespace, and it can limit the total amount of compute resources that may be consumed there (as mentioned earlier, compute resources include CPU, memory, and disk space).
Resource quotas work like this:
- Different teams work in different namespaces. Currently Kubernetes does not enforce this; it is entirely voluntary, but the Kubernetes team plans to make it enforceable through ACL authorization.
- The administrator creates one ResourceQuota for each namespace.
- Users create resources (pods, services, etc.) in a namespace, and the quota system tracks usage to ensure it does not exceed the amounts defined in the ResourceQuota.
- If creating or updating a resource violates a quota constraint, the request fails with HTTP status code 403 FORBIDDEN and a message explaining which constraint was violated.
- If quota is enabled in a namespace for compute resources such as CPU and memory, users must specify requests or limits for those resources; otherwise the quota system may reject pod creation.
Examples of policies that could be created with namespaces and resource quotas:
- In a cluster with 32 GiB of memory and 16 cores, let team A use 20 GiB and 10 cores, let team B use 10 GiB and 4 cores, and hold the remaining 2 GiB and 2 cores in reserve for future allocation.
- Limit the testing namespace to 1 core and 1 GiB, and let the production namespace use all remaining resources.
When the capacity of the cluster is less than the sum of the quotas across its namespaces, there may be contention for resources; Kubernetes handles this on a first-come-first-served basis.
Neither contention nor changes to quota will affect resources that have already been created.
Enabling Resource Quota
Resource quota support is enabled by default in many Kubernetes distributions. It is enabled when ResourceQuota is one of the values of the apiserver's --enable-admission-plugins= flag.
A resource quota is enforced in a particular namespace when that namespace contains a ResourceQuota object.
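For example, assuming you manage the kube-apiserver invocation directly, the relevant fragment of the command line would look something like the following sketch (NamespaceLifecycle stands in for whichever other plugins your distribution enables):
kube-apiserver --enable-admission-plugins=NamespaceLifecycle,ResourceQuota [other flags]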
Compute Resource Quota
You can limit the total sum of compute resources that can be requested in a namespace.
Kubernetes supports the following resource types:
| Resource Name | Description |
|---|---|
| cpu | Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value. |
| limits.cpu | Across all pods in a non-terminal state, the sum of CPU limits cannot exceed this value. |
| limits.memory | Across all pods in a non-terminal state, the sum of memory limits cannot exceed this value. |
| memory | Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value. |
| requests.cpu | Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value. |
| requests.memory | Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value. |
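As a minimal sketch, a ResourceQuota using these resource names might look like this (the object name and values are illustrative; a full walkthrough appears under "Viewing and Setting Quotas" below):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi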
Resource Quota for Extended Resources
In addition to the resources mentioned above, Kubernetes 1.10 added quota support for extended resources.
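For example, assuming the cluster exposes a GPU extended resource named nvidia.com/gpu (as in the GPU example further below), a quota limiting GPU requests in a namespace could be sketched as:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota
spec:
  hard:
    requests.nvidia.com/gpu: 4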
Storage Resource Quota
You can limit the total amount of storage that can be requested in a namespace.
In addition, you can limit storage consumption based on the associated storage class.
| Resource Name | Description |
|---|---|
| requests.storage | Across all persistent volume claims, the sum of storage requests cannot exceed this value. |
| persistentvolumeclaims | The total number of persistent volume claims that can exist in the namespace. |
| <storage-class-name>.storageclass.storage.k8s.io/requests.storage | Across all persistent volume claims associated with the storage-class-name, the sum of storage requests cannot exceed this value. |
| <storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims | Across all persistent volume claims associated with the storage-class-name, the total number of persistent volume claims that can exist in the namespace. |
For example, if an operator wants storage with the gold storage class quotaed separately from the bronze storage class, the operator can define the quota as follows:
gold.storageclass.storage.k8s.io/requests.storage: 500Gi
bronze.storageclass.storage.k8s.io/requests.storage: 100Gi
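Wrapped in a complete object, that might look like the following sketch (the object name is illustrative):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
spec:
  hard:
    gold.storageclass.storage.k8s.io/requests.storage: 500Gi
    bronze.storageclass.storage.k8s.io/requests.storage: 100Gi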
In release 1.8, quota support for local ephemeral storage was added as an alpha feature:
| Resource Name | Description |
|---|---|
| requests.ephemeral-storage | Across all pods in the namespace, the sum of local ephemeral storage requests cannot exceed this value. |
| limits.ephemeral-storage | Across all pods in the namespace, the sum of local ephemeral storage limits cannot exceed this value. |
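A minimal sketch using these names (the values are illustrative, and since this is alpha in 1.8 the feature may first need to be enabled on your cluster):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-storage-quota
spec:
  hard:
    requests.ephemeral-storage: 10Gi
    limits.ephemeral-storage: 20Gi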
Object Count Quota
Release 1.9 added support for quota on all standard namespaced resource types, using the following syntax:
count/<resource>.<group>
Here are examples of resources that users may want to put under object count quota:
- count/persistentvolumeclaims
- count/services
- count/secrets
- count/configmaps
- count/replicationcontrollers
- count/deployments.apps
- count/replicasets.apps
- count/statefulsets.apps
- count/jobs.batch
- count/cronjobs.batch
- count/deployments.extensions
When a count/* resource quota is used, every matching object that exists in server storage is charged against the quota. This helps protect the server's storage from being exhausted. For example, you may want to limit the number of secrets in a server given their large size; too many secrets in a cluster can actually prevent servers from starting. You may also want to limit the number of jobs, to protect against a poorly designed cronjob creating so many jobs that it causes a denial of service.
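For instance, a sketch of a quota guarding against both of those failure modes (the object name and values are illustrative):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-count-quota
spec:
  hard:
    count/secrets: "10"
    count/jobs.batch: "20"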
Quota is supported for the following resource types:
| Resource Name | Description |
|---|---|
| configmaps | The total number of config maps that can exist in the namespace. |
| persistentvolumeclaims | The total number of persistent volume claims that can exist in the namespace. |
| pods | The total number of pods in a non-terminal state that can exist in the namespace. A pod is in a terminal state if .status.phase in (Failed, Succeeded) is true. |
| replicationcontrollers | The total number of replication controllers that can exist in the namespace. |
| resourcequotas | The total number of resource quotas that can exist in the namespace. |
| services | The total number of services that can exist in the namespace. |
| services.loadbalancers | The total number of services of type load balancer that can exist in the namespace. |
| services.nodeports | The total number of services of type node port that can exist in the namespace. |
| secrets | The total number of secrets that can exist in the namespace. |
For example, pods quota limits the total number of pods in a non-terminal state in a namespace. This can prevent a user from creating so many small pods that they exhaust the cluster's supply of pod IPs.
Quota Scopes
Each quota can have an associated set of scopes. A quota only measures usage for a resource if it matches the intersection of the enumerated scopes.
When a scope is added to a quota, it limits the quota to the resources it supports that pertain to that scope; specifying a resource outside the supported set results in a validation error.
| Scope | Description |
|---|---|
| Terminating | Match pods where .spec.activeDeadlineSeconds >= 0 |
| NotTerminating | Match pods where .spec.activeDeadlineSeconds is nil |
| BestEffort | Match pods that have best effort quality of service. |
| NotBestEffort | Match pods that do not have best effort quality of service. |
The BestEffort scope restricts a quota to tracking only the pods resource.
The Terminating, NotTerminating, and NotBestEffort scopes restrict a quota to tracking the following resources:
- cpu
- limits.cpu
- limits.memory
- memory
- pods
- requests.cpu
- requests.memory
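As a sketch, a quota restricted to BestEffort pods is declared via the scopes field (the object name and pod count are illustrative):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-pods
spec:
  hard:
    pods: "10"
  scopes: ["BestEffort"]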
Resource Quota Per PriorityClass
This feature is beta as of version 1.12.
Pods can be created at a specific priority. You can control a pod's consumption of system resources based on its priority, via the scopeSelector field in the quota spec.
A quota is matched and consumed only if scopeSelector in the quota spec selects the pod.
Before using quota per PriorityClass, the ResourceQuotaScopeSelectors feature gate needs to be enabled.
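Assuming you control the apiserver flags directly, that means starting kube-apiserver with something like:
--feature-gates=ResourceQuotaScopeSelectors=true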
The following example creates a quota object that is matched by pods at specific priorities:
- Pods in the cluster have one of three priority classes: low, medium, high.
- One quota object is created for each priority class.
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-high
  spec:
    hard:
      cpu: "1000"
      memory: 200Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["high"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-medium
  spec:
    hard:
      cpu: "10"
      memory: 20Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["medium"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-low
  spec:
    hard:
      cpu: "5"
      memory: 10Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["low"]
Apply the above YAML with kubectl create:
kubectl create -f ./quota.yml
resourcequota/pods-high created
resourcequota/pods-medium created
resourcequota/pods-low created
Inspect the result with kubectl describe quota:
kubectl describe quota
Name: pods-high
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 1k
memory 0 200Gi
pods 0 10
Name: pods-low
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 5
memory 0 10Gi
pods 0 10
Name: pods-medium
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 10
memory 0 20Gi
pods 0 10
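Note that this walkthrough assumes PriorityClass objects named high, medium, and low already exist in the cluster. If they do not, one could be sketched as follows (the value is illustrative, and the apiVersion depends on your cluster version):
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: high
value: 1000000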
Create a pod with priority high by saving the following to high-priority-pod.yml:
apiVersion: v1
kind: Pod
metadata:
  name: high-priority
spec:
  containers:
  - name: high-priority
    image: ubuntu
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo hello; sleep 10; done"]
    resources:
      requests:
        memory: "10Gi"
        cpu: "500m"
      limits:
        memory: "10Gi"
        cpu: "500m"
  priorityClassName: high
Apply it with kubectl create:
kubectl create -f ./high-priority-pod.yml
Now run kubectl describe quota again; the usage has been charged against the pods-high quota:
Name: pods-high
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 500m 1k
memory 10Gi 200Gi
pods 1 10
Name: pods-low
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 5
memory 0 10Gi
pods 0 10
Name: pods-medium
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 10
memory 0 20Gi
pods 0 10
scopeSelector supports the following values in the operator field:
- In
- NotIn
- Exists
- DoesNotExist
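For example, a scopeSelector using Exists matches any pod that has a priority class at all. As a sketch (note that values must be omitted when the operator is Exists or DoesNotExist):
scopeSelector:
  matchExpressions:
  - scopeName: PriorityClass
    operator: Exists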
Requests vs Limits
When allocating compute resources, each container may specify a request and a limit value for CPU or memory. The quota can be configured to quota either value.
That is, a given quota tracks either requests or limits for a resource, not both.
If the quota has a value specified for requests.cpu or requests.memory, then it requires that every incoming container makes an explicit request for those resources. If the quota has a value specified for limits.cpu or limits.memory, then it requires that every incoming container specifies an explicit limit for those resources.
Viewing and Setting Quotas
kubectl supports creating, updating, and viewing quotas:
kubectl create namespace myspace
cat <<EOF > compute-resources.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    pods: "4"
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    requests.nvidia.com/gpu: 4
EOF
kubectl create -f ./compute-resources.yaml --namespace=myspace
cat <<EOF > object-counts.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
spec:
  hard:
    configmaps: "10"
    persistentvolumeclaims: "4"
    replicationcontrollers: "20"
    secrets: "10"
    services: "10"
    services.loadbalancers: "2"
EOF
kubectl create -f ./object-counts.yaml --namespace=myspace
kubectl get quota --namespace=myspace
NAME AGE
compute-resources 30s
object-counts 32s
kubectl describe quota compute-resources --namespace=myspace
Name: compute-resources
Namespace: myspace
Resource Used Hard
-------- ---- ----
limits.cpu 0 2
limits.memory 0 2Gi
pods 0 4
requests.cpu 0 1
requests.memory 0 1Gi
requests.nvidia.com/gpu 0 4
kubectl describe quota object-counts --namespace=myspace
Name: object-counts
Namespace: myspace
Resource Used Hard
-------- ---- ----
configmaps 0 10
persistentvolumeclaims 0 4
replicationcontrollers 0 20
secrets 1 10
services 0 10
services.loadbalancers 0 2
kubectl also supports object count quota for all standard namespaced resources, using the count/<resource>.<group> syntax:
kubectl create namespace myspace
kubectl create quota test --hard=count/deployments.extensions=2,count/replicasets.extensions=4,count/pods=3,count/secrets=4 --namespace=myspace
kubectl run nginx --image=nginx --replicas=2 --namespace=myspace
kubectl describe quota --namespace=myspace
Name: test
Namespace: myspace
Resource Used Hard
-------- ---- ----
count/deployments.extensions 1 2
count/pods 2 3
count/replicasets.extensions 1 4
count/secrets 1 4
配額和集群容量
ResourceQuotas獨立於集群的容量,它們通過絕對的單位表示.因此,如果你向集群添加了節點,這並不會給集群中的每個名稱空間賦予消費更多資源的能力.
有時候需要更為復雜的策略,比如:
- Proportionally divide the cluster's total resources among several teams.
- Allow each tenant to grow its resource usage as needed, but keep an overall cap to prevent resources from being exhausted.
- Detect demand in one namespace, add nodes, and increase its quota.
Such policies can be implemented on top of ResourceQuotas by writing a controller that watches quota usage and adjusts each namespace's quota based on other signals.
Limit Priority Class Consumption by Default
It may be desired that pods at a particular priority, e.g. cluster-services, should be allowed in a namespace if, and only if, a matching quota object exists.
With this mechanism, operators can restrict certain high-priority classes to a limited number of namespaces, so that not every namespace can consume them by default.
To enforce this, the kube-apiserver flag --admission-control-config-file should be passed the path to the following configuration file:
apiVersion: apiserver.k8s.io/v1alpha1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: resourcequota.admission.k8s.io/v1beta1
    kind: Configuration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: PriorityClass
        operator: In
        values: ["cluster-services"]
Now, cluster-services pods are allowed only in namespaces that have a quota object with a matching scopeSelector, for example:
scopeSelector:
  matchExpressions:
  - scopeName: PriorityClass
    operator: In
    values: ["cluster-services"]
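Put together, a complete quota object carrying that selector might be sketched as follows (the object name and the pods value are illustrative):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cluster-services-quota
spec:
  hard:
    pods: "10"
  scopeSelector:
    matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values: ["cluster-services"]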
