Kubernetes: Scheduling Pods onto Specific Nodes, and Node Taints and Tolerations

Reposted article. 2021-11-23 21:03. Tag: K8S

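Taints and Tolerations

The nodes in a cluster are often not interchangeable. They may be:

1. In different data centers
2. In different cities
3. Configured differently, for example SSD disks or CPU type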

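How to control placement
1. nodeSelector
Label the machines that have a given characteristic, for example:
Gpu-server: true
ssd-server: true
Normal-server: true

2. Taints and tolerations
Put a taint on a class of servers so that pods which do not tolerate that taint cannot be deployed onto the tainted servers.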

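Node affinity is a property of pods (either a preference or a hard requirement) that attracts them to a particular class of nodes. A taint is the opposite: it lets a node repel a particular class of pods.

Taints and tolerations work together to keep pods off unsuitable nodes. One or more taints can be applied to a node, which means the node will not accept any pod that does not tolerate those taints. A toleration applied to a pod allows (but does not require) the pod to be scheduled onto nodes with matching taints.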

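Taints

Composition of a taint

The kubectl taint command sets a taint on a Node. Once tainted, the Node repels Pods: it can refuse to have Pods scheduled onto it, and can even evict Pods that are already running there.

Each taint has a key and a value that act as its label (the value may be empty), plus an effect that describes what the taint does.

If a node has several taints, a pod must tolerate every one of them whose effect is NoSchedule or NoExecute before it can be scheduled onto that node.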

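A taint effect currently takes one of three values:

NoSchedule: only pods with a toleration matching this taint can be scheduled onto the node. Pods already running on the node are not affected.

PreferNoSchedule: the scheduler tries to avoid placing a pod on a node whose taint it does not tolerate, but this is not enforced.

NoExecute: any pod that does not tolerate this taint is evicted immediately, and any pod that tolerates it is not evicted. A pod may set tolerationSeconds on its toleration to limit how long it keeps running on the node after the taint is added.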

tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
  tolerationSeconds: 3600  # how long the pod may keep running on this node
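
To compare the three effects side by side, here is a minimal Python sketch of how a node's taints affect a pod that is not yet running on it. The Taint class and the placement function are hypothetical names for illustration, and the pod's tolerations are simplified to "the set of taints it tolerates"; the real scheduler matches tolerations by key, operator, value and effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # "NoSchedule", "PreferNoSchedule" or "NoExecute"


def placement(node_taints, tolerated):
    """Classify a node for a pod, given the set of taints the pod tolerates."""
    untolerated = [t for t in node_taints if t not in tolerated]
    # NoSchedule and NoExecute are hard: one untolerated taint rejects the pod.
    if any(t.effect in ("NoSchedule", "NoExecute") for t in untolerated):
        return "rejected"
    # PreferNoSchedule is soft: the node stays usable but is avoided if possible.
    if any(t.effect == "PreferNoSchedule" for t in untolerated):
        return "allowed but avoided"
    return "allowed"


gpu = Taint("gpu", "true", "NoSchedule")
slow = Taint("disk", "hdd", "PreferNoSchedule")

print(placement([gpu, slow], tolerated=set()))        # rejected
print(placement([gpu, slow], tolerated={gpu}))        # allowed but avoided
print(placement([gpu, slow], tolerated={gpu, slow}))  # allowed
```

NoExecute additionally evicts pods that are already running, which this sketch does not model.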

Add a taint to a node, with key <key>, value <value> and effect NoSchedule:

kubectl taint nodes <node_name> <key>=<value>:NoSchedule

Remove a taint from a node (note the trailing "-"):

kubectl taint nodes node1 key:NoSchedule-
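
The argument syntax of the two commands above can be broken down with a short Python sketch. This is a simplified illustration, not kubectl's actual parser: parse_taint_arg is a hypothetical helper and it performs no validation of keys, values or effects.

```python
def parse_taint_arg(arg):
    """Split one `kubectl taint` argument into (action, key, value, effect)."""
    action = "add"
    # A trailing "-" means "remove this taint" instead of "add it".
    if arg.endswith("-"):
        action, arg = "remove", arg[:-1]
    # The effect, when given, follows the last part after ":".
    key_value, sep, effect = arg.partition(":")
    # The value is optional and may be empty ("key=:Effect").
    key, _, value = key_value.partition("=")
    return action, key, value, (effect if sep else None)


print(parse_taint_arg("key1=value1:NoSchedule"))
# ('add', 'key1', 'value1', 'NoSchedule')
print(parse_taint_arg("key:NoSchedule-"))
# ('remove', 'key', '', 'NoSchedule')
```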

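Tolerations

A tainted Node repels Pods according to the taint's effect (NoSchedule, PreferNoSchedule or NoExecute), so to some degree Pods will not be scheduled onto it. A toleration set on a Pod lets that Pod tolerate the taint, so it can be scheduled onto the tainted Node.

For example, define the pod's tolerations in the Pod spec. The operator decides how a taint is matched:
operator: Equal compares both the key and the value (this is the default)
operator: Exists tolerates the taint as long as the key is present; no value is given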

tolerations:
- key: "key"
  operator: "Equal"
  value: "value"        # exact match on key and value
  effect: "NoSchedule"

tolerations:
- key: "key"
  operator: "Exists"    # matches on key and effect only
  effect: "NoSchedule"

Tolerate every taint on every node (no key, operator Exists):

tolerations:
- operator: "Exists"

Tolerate every taint with this key, regardless of effect:

tolerations:
- key: "key"
  operator: "Exists"
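
The matching rules behind the four toleration forms above can be summarized in a few lines of Python. This is a sketch of the documented rules using plain dicts, not code taken from Kubernetes; tolerates is a hypothetical function name.

```python
def tolerates(toleration, taint):
    """Return True if one toleration matches one taint (both plain dicts)."""
    # An empty effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key (only meaningful with Exists) matches every key.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        return True  # the value is ignored
    return toleration.get("value", "") == taint.get("value", "")


taint = {"key": "key", "value": "value", "effect": "NoSchedule"}

exact = {"key": "key", "operator": "Equal", "value": "value", "effect": "NoSchedule"}
wrong_value = {"key": "key", "operator": "Equal", "value": "other", "effect": "NoSchedule"}
any_effect = {"key": "key", "operator": "Exists"}
everything = {"operator": "Exists"}

print(tolerates(exact, taint))        # True
print(tolerates(wrong_value, taint))  # False
print(tolerates(any_effect, taint))   # True
print(tolerates(everything, taint))   # True
```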

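When the cluster has several master nodes, you can taint them with PreferNoSchedule so that their capacity is not wasted: ordinary pods avoid the masters but can still land on them when no other node fits.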

kubectl taint nodes Node-Name node-role.kubernetes.io/master=:PreferNoSchedule

2.3. Use cases

2.3.1. Dedicated nodes

kubectl taint nodes <nodename> dedicated=<groupName>:NoSchedule

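Taint the Node first, then add the corresponding toleration to the Pod. The Pod can then be scheduled onto the tainted Node, but it can still be scheduled onto other nodes as well.

If the Pod should run only on those nodes, and those nodes should accept only such Pods, also add a label to the Node (for example dedicated=groupName) and put the same label in the Pod's nodeSelector.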
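
The difference between "toleration only" and "toleration plus nodeSelector" can be seen in a small Python model. It is a sketch under simplifying assumptions: the node names n1 and n2 are hypothetical, taints and tolerations are compared as exact (key, value, effect) tuples, and only hard constraints are checked.

```python
def feasible(pod, node):
    """Hard checks only: nodeSelector labels, then untolerated taints."""
    selector_ok = all(node["labels"].get(k) == v
                      for k, v in pod.get("nodeSelector", {}).items())
    taints_ok = all(t in pod.get("tolerations", []) for t in node["taints"])
    return selector_ok and taints_ok


dedicated_taint = ("dedicated", "groupName", "NoSchedule")
nodes = {
    "n1": {"labels": {"dedicated": "groupName"}, "taints": [dedicated_taint]},
    "n2": {"labels": {}, "taints": []},
}

plain_pod = {}
tolerating_pod = {"tolerations": [dedicated_taint]}
pinned_pod = {"tolerations": [dedicated_taint],
              "nodeSelector": {"dedicated": "groupName"}}

for name, pod in [("plain", plain_pod), ("tolerating", tolerating_pod),
                  ("pinned", pinned_pod)]:
    print(name, [n for n, node in nodes.items() if feasible(pod, node)])
# plain ['n2']
# tolerating ['n1', 'n2']
# pinned ['n1']
```

Only the pod that carries both the toleration and the nodeSelector is confined to the dedicated node.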

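2.3.2. Nodes with special hardware

If some nodes have special hardware (for example GPUs or a particular CPU type), you usually want pods that do not need that hardware to stay off those nodes, so that the capacity is kept for pods that do. Taint and label those Nodes, then give the Pods that need the hardware a matching toleration and nodeSelector, so that the Nodes are reserved for those Pods.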

kubectl taint nodes nodename special=true:NoSchedule

or

kubectl taint nodes nodename special=true:PreferNoSchedule

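2.3.3. Taint-based eviction

The NoExecute effect also applies to pods that are already running on the node: they are evicted according to the following rules.

If a pod does not tolerate a taint with effect NoExecute, it is evicted immediately.
If a pod tolerates a taint with effect NoExecute and its toleration does not set tolerationSeconds, it keeps running on the node indefinitely.
If a pod tolerates a taint with effect NoExecute and its toleration sets tolerationSeconds, it keeps running on the node for that length of time and is then evicted.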
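
The three rules above amount to a small decision function, sketched here in Python. eviction_delay is a hypothetical helper, and matching is simplified to the taint key only (the full rules also consider operator, value and effect).

```python
def eviction_delay(noexecute_taint_keys, tolerations):
    """Seconds a running pod may stay: 0 = evict now, None = stay indefinitely."""
    delays = []
    for taint_key in noexecute_taint_keys:
        # Simplification: use the first toleration with the same key.
        match = next((t for t in tolerations if t["key"] == taint_key), None)
        if match is None:
            return 0  # an untolerated NoExecute taint evicts immediately
        if "tolerationSeconds" in match:
            delays.append(match["tolerationSeconds"])
    # With several bounded tolerations, the shortest one decides.
    return min(delays) if delays else None


not_ready = "node.kubernetes.io/not-ready"

print(eviction_delay([not_ready], []))                                            # 0
print(eviction_delay([not_ready], [{"key": not_ready}]))                          # None
print(eviction_delay([not_ready], [{"key": not_ready, "tolerationSeconds": 300}]))  # 300
```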

Scheduling a pod onto a specific node

1. Look up the node's labels

]# kubectl get node -A --show-labels
NAME STATUS ROLES AGE VERSION LABELS
master Ready master 326d v1.17.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master,kubernetes.io/os=linux,node-role.kubernetes.io/master=
node1 Ready <none> 326d v1.17.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
node2 Ready <none> 326d v1.17.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2,kubernetes.io/os=linux

2. You can also set a label of your own

]# kubectl label nodes node1 project=linux40
node/node1 labeled

]# kubectl get node -A --show-labels
NAME STATUS ROLES AGE VERSION LABELS
master Ready master 326d v1.17.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master,kubernetes.io/os=linux,node-role.kubernetes.io/master=
node1 Ready <none> 326d v1.17.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux,project=linux40
node2 Ready <none> 326d v1.17.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2,kubernetes.io/os=linux

3. Add the following under the pod spec in the manifest

spec:
  nodeSelector:
    project: linux40

4. Delete the custom node label

]# kubectl  label nodes node1 project-
node/node1 labeled

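After applying the YAML file, some pods stayed in the Pending state. Running kubectl describe on one of them showed "Insufficient cpu" under Events: the node that matched the selector did not have enough CPU left, so the pod could not be scheduled.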
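
The scheduler decides this from CPU requests, not from live usage: a node fits only if the requests already placed on it plus the new pod's request stay within the node's allocatable CPU. The following Python sketch shows that arithmetic; millicores and fits_cpu are hypothetical helpers, and the node size and existing requests are made-up numbers (only the 200m request mirrors the pod shown later in this article).

```python
def millicores(quantity):
    """Convert a CPU quantity such as "200m" or "1" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def fits_cpu(allocatable, requests_on_node, pod_request):
    """True if the pod's CPU request still fits on the node."""
    used = sum(millicores(q) for q in requests_on_node)
    return used + millicores(pod_request) <= millicores(allocatable)


# Assumed node with 2 allocatable CPUs.
print(fits_cpu("2", ["1", "500m"], "200m"))  # True  (1700m of 2000m)
print(fits_cpu("2", ["1", "900m"], "200m"))  # False (2100m of 2000m)
```

In the second case the pod stays Pending with "Insufficient cpu" even if the node is mostly idle, because the check counts requests rather than actual load.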

Pod.spec.nodeName

Setting nodeName places the Pod directly on the named Node and bypasses the Scheduler's policies entirely; the binding is mandatory.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myweb
spec:
  replicas: 7
  selector:
    matchLabels:
      app: myweb
  template:
    metadata:
      labels:
        app: myweb
    spec:
      nodeName: k8s-node01    # force the pods onto k8s-node01
      containers:
      - name: myweb
        image: wangyanglinux/myapp:v1
        ports:
        - containerPort: 80

Pod.spec.nodeSelector

nodeSelector chooses nodes through the Kubernetes label-selector mechanism: the scheduler matches node labels and then places the Pod on a matching node. This is a hard constraint.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myweb
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myweb
  template:
    metadata:
      labels:
        app: myweb
    spec:
      nodeSelector:
        type: backEndNode1   # node label to match
      containers:
      - name: myweb
        image: harbor/tomcat:8.5-jre8
        ports:
        - containerPort: 80

View the pod details

[root@master-1 ~]# kubectl  describe pod -n linux39         magedu-nginx-deployment-84c4cb9fdd-ssw27
Name:         magedu-nginx-deployment-84c4cb9fdd-ssw27
Namespace:    linux39
Priority:     0
Node:         node1/192.168.64.113
Start Time:   Sat, 07 May 2022 20:03:17 +0800
Labels:       app=magedu-nginx-selector
              pod-template-hash=84c4cb9fdd
Annotations:  cni.projectcalico.org/podIP: 100.66.209.206/32
              cni.projectcalico.org/podIPs: 100.66.209.206/32
Status:       Running
IP:           100.66.209.206
IPs:
  IP:           100.66.209.206
Controlled By:  ReplicaSet/magedu-nginx-deployment-84c4cb9fdd
Containers:
  magedu-nginx-container:
    Container ID:   docker://58619204579f6522eeead49823659334d9e9feb58f9e2712b49a693cc9c14cc8
    Image:          nginx-web:v1
    Image ID:       docker://sha256:8874c3873369c14572f5cfd9e9ee49bb012eca927f1ac4024c787caf7a3bcb38
    Ports:          80/TCP, 443/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Sun, 15 May 2022 21:38:42 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 15 May 2022 21:38:12 +0800
      Finished:     Sun, 15 May 2022 21:38:27 +0800
    Ready:          True
    Restart Count:  6
    Limits:
      cpu:     1
      memory:  512Mi
    Requests:
      cpu:     200m
      memory:  246Mi
    Environment:
      password:  123456
      age:       20
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bpkjb (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-bpkjb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bpkjb
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s   # eviction wait when the node is down or not ready
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s  # eviction wait when the node is unreachable
Events:          <none>

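Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s

This is how long the pod waits before eviction when its node is down or not ready. For example, if unstable networking keeps the kubelet from reporting the node status to the apiserver in time, the control plane considers the node NotReady and taints it with node.kubernetes.io/not-ready and effect NoExecute. A pod that does not tolerate that taint is evicted immediately; the 300-second buffer keeps pods from being evicted over a brief loss of contact caused by such flapping.

node.kubernetes.io/unreachable:NoExecute op=Exists for 300s

This is the same eviction wait, applied when the node is unreachable.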
