1. Background
- Current cluster K8s version: v1.17.4
- Target K8s version: v1.19.3
- During the upgrade, one Pod got stuck in the ContainerCreating state
    api-9flnb   0/1   ContainerCreating   0   4d19h
    api-bb8th   1/1   Running             0   4d20h
    api-zwtpp   1/1   Running             0   4d20h
2. Analysis
- Describing the Pod shows the error "hostPath type check failed: /var/run/docker.sock is not a file"
    Events:
      Type     Reason       Age                       From     Message
      ----     ------       ----                      ----     -------
      Warning  FailedMount  11m (x3543 over 4d18h)    kubelet  (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[docker-socket], unattached volumes=[xxxx-service-account-token-rjqz7 nginx-certs host-timezone docker-socket helm-home etc-pki kubernetes-root-ca-file root-ca-file bcmt-home etcd-client-certs]: timed out waiting for the condition
      Warning  FailedMount  2m39s (x2889 over 4d19h)  kubelet  MountVolume.SetUp failed for volume "docker-socket" : hostPath type check failed: /var/run/docker.sock is not a file
- Check how the Pod declares the volume "docker-socket": Path is /var/run/docker.sock and HostPathType is File
    Volumes:
      etcd-client-certs:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  etcd-client-certs
        Optional:    false
      nginx-certs:
        Type:          HostPath (bare host directory volume)
        Path:          /opt/bcmt/config/bcmt-api/certs
        HostPathType:  Directory
      docker-socket:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/docker.sock
        HostPathType:  File
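- Review what changed in the K8s type-checking code between v1.17.4 and v1.19.3
First, the function that raises the error is checkTypeInternal(), defined in host_path.go. It compares the declared HostPathType with the actual type of the path and returns an error when they do not match.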
    func checkTypeInternal(ftc hostPathTypeChecker, pathType *v1.HostPathType) error {
        switch *pathType {
        case v1.HostPathDirectoryOrCreate:
            if !ftc.Exists() {
                return ftc.MakeDir()
            }
            fallthrough
        case v1.HostPathDirectory:
            if !ftc.IsDir() {
                return fmt.Errorf("hostPath type check failed: %s is not a directory", ftc.GetPath())
            }
        case v1.HostPathFileOrCreate:
            if !ftc.Exists() {
                return ftc.MakeFile()
            }
            fallthrough
        case v1.HostPathFile:
            if !ftc.IsFile() {
                return fmt.Errorf("hostPath type check failed: %s is not a file", ftc.GetPath())
            }
        case v1.HostPathSocket:
            if !ftc.IsSocket() {
                return fmt.Errorf("hostPath type check failed: %s is not a socket file", ftc.GetPath())
            }
        case v1.HostPathCharDev:
            if !ftc.IsChar() {
                return fmt.Errorf("hostPath type check failed: %s is not a character device", ftc.GetPath())
            }
        case v1.HostPathBlockDev:
            if !ftc.IsBlock() {
                return fmt.Errorf("hostPath type check failed: %s is not a block device", ftc.GetPath())
            }
        default:
            return fmt.Errorf("%s is an invalid volume type", *pathType)
        }
        return nil
    }
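Next, compare this with our Pod definition: HostPathType is File, but the actual path /var/run/docker.sock is a socket, so the error is correct. The open question is why v1.17.4 did not report it (I verified that v1.18.x does not either), while v1.19.3 does.
- Check whether checkTypeInternal() itself changed. Result: no change.
- Check whether the hostPathTypeChecker passed in as an argument changed. Result: it did.
- In v1.17.x and v1.18.x, IsFile() is defined as follows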
    func (ftc *fileTypeChecker) IsFile() bool {
        if !ftc.Exists() {
            return false
        }
        return !ftc.IsDir()
    }
- Starting with v1.19.x, IsFile() is updated as follows
    func (ftc *fileTypeChecker) IsFile() bool {
        if !ftc.Exists() {
            return false
        }
        pathType, err := ftc.hu.GetFileType(ftc.path)
        if err != nil {
            return false
        }
        return string(pathType) == string(v1.HostPathFile)
    }
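The difference between the two IsFile() versions can be reproduced outside Kubernetes. The following is a minimal sketch using only the Go standard library, not the kubelet code itself: legacyIsFile, strictIsFile and newUnixSocket are hypothetical names introduced for illustration, and strictIsFile assumes that "is a file" in v1.19.x amounts to "is a regular file". It creates a Unix domain socket in a temporary directory and runs both checks against it.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// legacyIsFile mirrors the v1.17.x/v1.18.x logic quoted above:
// anything that exists and is not a directory counts as a file.
func legacyIsFile(path string) bool {
	info, err := os.Stat(path)
	if err != nil {
		return false
	}
	return !info.IsDir()
}

// strictIsFile approximates the v1.19.x behaviour (an assumption for
// this sketch): only regular files pass, so sockets are rejected.
func strictIsFile(path string) bool {
	info, err := os.Stat(path)
	if err != nil {
		return false
	}
	return info.Mode().IsRegular()
}

// newUnixSocket creates a listening Unix domain socket in a fresh
// temporary directory and returns its path and a cleanup function.
func newUnixSocket() (string, func(), error) {
	dir, err := os.MkdirTemp("", "hostpath-demo")
	if err != nil {
		return "", nil, err
	}
	path := filepath.Join(dir, "demo.sock")
	ln, err := net.Listen("unix", path)
	if err != nil {
		os.RemoveAll(dir)
		return "", nil, err
	}
	cleanup := func() {
		ln.Close()
		os.RemoveAll(dir)
	}
	return path, cleanup, nil
}

func main() {
	path, cleanup, err := newUnixSocket()
	if err != nil {
		fmt.Println("could not create socket:", err)
		return
	}
	defer cleanup()

	fmt.Println("legacy IsFile on socket:", legacyIsFile(path))
	fmt.Println("strict IsFile on socket:", strictIsFile(path))
}
```

On a socket path the legacy check accepts and the strict check rejects, which matches the behaviour seen before and after the upgrade.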
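3. Conclusion
- Starting with v1.19.x, K8s fixed a bug that made the IsFile() check incomplete.
- Our Pod declared the wrong HostPathType when mounting the .sock file (it should be Socket, not File). Before v1.19.x the K8s bug happened to mask this mistake, so the mount succeeded. Once v1.19.x fixed the bug, the volume mount started to fail.
4. Solution: set the HostPathType of the .sock file to Socket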
    Volumes:
      docker-socket:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/docker.sock
        HostPathType:  Socket