Using securityContext and sysctl in Kubernetes


Introduction

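When running a container, you sometimes need to change kernel parameters with sysctl, for example parameters under net.*, vm.*, and kernel.*. Doing so requires elevated privileges inside the container; with a plain container runtime, starting the container with the --privileged flag (for example, docker run --privileged) is enough. So how is this done in Kubernetes?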

Security Context

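Kubernetes has a field called securityContext. It defines the privilege and access-control settings for a Pod or a Container. These settings include:

  • Discretionary Access Control: permission to access an object (such as a file) is based on the user ID (UID) and group ID (GID)

Set at the Pod level (applies to every container in the Pod):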

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
  volumes:
  - name: sec-ctx-vol
    emptyDir: {}
  containers:
  - name: sec-ctx-demo
    image: gcr.io/google-samples/node-hello:1.0
    volumeMounts:
    - name: sec-ctx-vol
      mountPath: /data/demo
    securityContext:
      allowPrivilegeEscalation: false

Set at the container level (container settings override the Pod-level ones when both are given):

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo-2
spec:
  securityContext:
    runAsUser: 1000
  containers:
  - name: sec-ctx-demo-2
    image: gcr.io/google-samples/node-hello:1.0
    securityContext:
      runAsUser: 2000
      allowPrivilegeEscalation: false

  • Security Enhanced Linux (SELinux): assign SELinux labels to a container

...
securityContext:
  seLinuxOptions:
    level: "s0:c123,c456"
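
The fragment above elides the surrounding manifest. As a minimal sketch of where seLinuxOptions sits in a full Pod, assuming an SELinux-enabled node, it might look like this. The Pod and container names are hypothetical and the image is reused from the earlier examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: selinux-demo            # hypothetical name
spec:
  securityContext:
    seLinuxOptions:             # Pod-level: applies to every container in the Pod
      level: "s0:c123,c456"
  containers:
  - name: selinux-demo-ctr      # hypothetical name
    image: gcr.io/google-samples/node-hello:1.0
```

The same seLinuxOptions block can also be placed under an individual container's securityContext.
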

  • Running as privileged or unprivileged: run the container in privileged or unprivileged mode

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo-4
spec:
  containers:
  - name: sec-ctx-4
    image: gcr.io/google-samples/node-hello:1.0
    securityContext:
      privileged: true

  • Linux Capabilities: give a process some privileges, but not all the privileges of the root user

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo-4
spec:
  containers:
  - name: sec-ctx-4
    image: gcr.io/google-samples/node-hello:1.0
    securityContext:
      capabilities:
        add: ["NET_ADMIN", "SYS_TIME"]
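
A common variation, sketched here under the assumption that the workload only needs to bind a low port, is to drop every capability first and add back only the one required. The Pod and container names are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: capabilities-demo         # hypothetical name
spec:
  containers:
  - name: capabilities-demo-ctr   # hypothetical name
    image: gcr.io/google-samples/node-hello:1.0
    securityContext:
      capabilities:
        drop: ["ALL"]               # start from an empty capability set
        add: ["NET_BIND_SERVICE"]   # re-add only what the process needs
```
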
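
  • AppArmor: use program profiles to restrict the capabilities of individual programs

  • Seccomp: filter the system calls a process is allowed to make

  • AllowPrivilegeEscalation: controls whether a process can gain more privileges than its parent process. allowPrivilegeEscalation is a boolean; it is always true when the container runs as privileged or has the CAP_SYS_ADMIN capability.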

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo-2
spec:
  securityContext:
    runAsUser: 1000
  containers:
  - name: sec-ctx-demo-2
    image: gcr.io/google-samples/node-hello:1.0
    securityContext:
      runAsUser: 2000
      allowPrivilegeEscalation: false
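
The AppArmor and Seccomp items in the list come without a manifest. The sketch below shows one way they might be set. It assumes a cluster version that still accepts the beta AppArmor annotation and already supports the securityContext.seccompProfile field (older clusters used a seccomp annotation instead), and nodes with AppArmor enabled. The Pod and container names are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: profiles-demo           # hypothetical name
  annotations:
    # AppArmor profile for the container named "profiles-demo-ctr"
    container.apparmor.security.beta.kubernetes.io/profiles-demo-ctr: runtime/default
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault      # use the container runtime's default syscall filter
  containers:
  - name: profiles-demo-ctr     # hypothetical name
    image: gcr.io/google-samples/node-hello:1.0
    securityContext:
      allowPrivilegeEscalation: false
```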

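Note: to enable privileged containers, kube-apiserver and kubelet must be started with --allow-privileged=true; this flag is already added by default. Newer kubelet releases no longer have this flag, so the kubelet part applies only to the older versions this article targets.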

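Using sysctl

sysctl -a lists all sysctl parameters.

Since v1.4, Kubernetes divides sysctls into safe and unsafe. A safe sysctl is one that:

  • does not affect other pods on the node
  • does not harm the node's health
  • does not allow gaining CPU or memory resources beyond the pod's resource limits

Currently the safe sysctls are: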

  • kernel.shm_rmid_forced
  • net.ipv4.ip_local_port_range
  • net.ipv4.tcp_syncookies

All other sysctls are unsafe. The safe list will be extended in future Kubernetes versions as the kubelet gains better isolation mechanisms.

Example of using a safe sysctl:

apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example
  annotations:
    security.alpha.kubernetes.io/sysctls: kernel.shm_rmid_forced=1
spec:
  ...
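
The annotation above is the alpha form. In later Kubernetes releases (v1.11 onward, to my knowledge) sysctls are declared as a field in the Pod's securityContext instead. A minimal sketch of the equivalent, with a hypothetical container name:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example
spec:
  securityContext:
    sysctls:
    - name: kernel.shm_rmid_forced
      value: "1"                # sysctl values are given as strings
  containers:
  - name: sysctl-example-ctr    # hypothetical name
    image: gcr.io/google-samples/node-hello:1.0
```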

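To use an unsafe sysctl, it must be allowed explicitly, on every node that should run such pods, through the kubelet startup flag --experimental-allowed-unsafe-sysctls, for example --experimental-allowed-unsafe-sysctls=net.core.somaxconn. The steps are as follows.

Edit the kubelet unit configuration and append --experimental-allowed-unsafe-sysctls=net.core.somaxconn to the ExecStart=/usr/bin/kubelet line, for example:

ExecStart=/usr/bin/kubelet --experimental-allowed-unsafe-sysctls=net.core.somaxconn

Because I installed Kubernetes with kubeadm, I added the third line from the bottom (KUBELET_EXTRA_ARGS) to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf: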

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
Environment="KUBELET_EXTRA_ARGS=--experimental-allowed-unsafe-sysctls=net.core.somaxconn"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS

Restart the kubelet:

systemctl daemon-reload
systemctl restart kubelet
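
On newer clusters the flag is named --allowed-unsafe-sysctls (without the experimental- prefix), and a kubelet driven by a configuration file can carry the same allowlist there. A sketch, assuming the kubelet reads a KubeletConfiguration file (on kubeadm installs this is typically /var/lib/kubelet/config.yaml):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Unsafe sysctls that Pods scheduled to this node may request
allowedUnsafeSysctls:
- "net.core.somaxconn"
```

The kubelet still has to be restarted after this file is changed.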

Use the unsafe sysctl in a Pod, with privileged mode enabled:

apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example
  annotations:
    security.alpha.kubernetes.io/unsafe-sysctls: net.core.somaxconn=65535    # unsafe sysctl: raise the maximum listen backlog
spec:
  containers:
  - securityContext:
      privileged: true    # enable privileged mode (a container-level field, not a Pod-level one)
    ...
  ...
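
With the field-based API of later releases, the same unsafe sysctl would be requested roughly as follows. This is a sketch: it assumes the node's kubelet already allows net.core.somaxconn, and the container name is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example
spec:
  securityContext:
    sysctls:
    - name: net.core.somaxconn  # must be on the node kubelet's allowlist
      value: "65535"
  containers:
  - name: sysctl-example-ctr    # hypothetical name
    image: gcr.io/google-samples/node-hello:1.0
```

Sysctls declared this way are applied by the container runtime when the Pod is set up, so privileged mode should not be needed for the sysctl itself.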

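Summary

Use privileged mode with caution in production: careless use can bring down the whole container. See the references below for details.

References: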
https://kubernetes.io/docs/concepts/cluster-administration/sysctl-cluster/
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

