Preface
An earlier article, 《Rancher中基於Kubernetes的CRD實現》 (Rancher's CRD implementation on Kubernetes), introduced and analyzed how Rancher extends Kubernetes for its own business logic, explaining the implementation mechanism in terms of the Kubernetes controller programming paradigm, mostly by walking through the overall code structure. This article instead shows, through a concrete example, how to write a custom Kubernetes controller. The sample code lives at https://github.com/alena1108/kubecon2018. It is the demo that Alena Prokharchyk, principal software engineer at Rancher Labs, presented in her talk "Writing Kubernetes controllers for CRDs" at the CNCF-hosted KubeCon 2018. The demo uses Rancher's Cluster custom resource as an example of how to write a controller (it is not the controller implementation found in the Rancher codebase itself). Because of time constraints, Alena Prokharchyk kept the walkthrough brief and skipped many details; actually getting the demo to run raises quite a few issues, which are covered later in this article.
Custom Resource Structure
A typical Custom Resource is structured as shown in the figure below:
A Custom Resource generally contains Metadata, Spec, and Status. Metadata holds object metadata such as name, namespace, and uid. Spec describes the state the user desires for the resource and is required. Status reflects the actual state of the resource; the Conditions inside Status represent the latest available observations of the resource's current state. Through the Kubernetes API we can create, update, and delete resource objects. Whenever the actual state of an object diverges from the desired state, something must react and drive the actual state toward the desired one. For example, a Kubernetes Deployment object can represent an application running in the cluster. When creating a Deployment, you might set its spec to declare that three replicas of the application should be running. The Kubernetes system reads the Deployment spec and starts the three instances you asked for, updating the status to match the spec. If one of those instances fails (a state change), the system responds to the mismatch between spec and status by making a correction, in this case starting a replacement instance. In the figure, the blue parts are Kubernetes-provided fields, while the red Custom field parts are written by the user and implemented and consumed by the custom controller.
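To make that pattern concrete, here is a minimal, self-contained Go sketch of the observe-and-converge loop; reconcile, startInstance, and stopInstance are illustrative names for this article, not part of any Kubernetes API:

package main

import "fmt"

// startInstance and stopInstance stand in for real actions such as
// creating or deleting a Pod.
func startInstance() { fmt.Println("starting an instance") }
func stopInstance()  { fmt.Println("stopping an instance") }

// reconcile drives the observed (actual) state toward the desired
// state from Spec, which is exactly what a controller does on every sync.
func reconcile(desired, actual int) {
	for actual < desired {
		startInstance()
		actual++
	}
	for actual > desired {
		stopInstance()
		actual--
	}
}

func main() {
	// Spec asks for 3 replicas but only 1 is observed, so two more are started.
	reconcile(3, 1)
}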
Using Code Generation
Why code generation? When implementing a Kubernetes CRD (CustomResourceDefinition) with client-go, code generation is required. More specifically, client-go requires that runtime.Object types (CustomResources written in Go must implement the runtime.Object interface) provide a DeepCopy method. Code generation produces this implementation via deepcopy-gen.
When defining your own CRD, the following code generators can be used (a sketch of typical deepcopy-gen output follows the list):
- deepcopy-gen: creates a method func (t *T) DeepCopy() *T for each type T
- client-gen: creates typed clientsets for CustomResource APIGroups
- informer-gen: creates informers for CustomResources which offer an event based interface to react on changes of CustomResources on the server
- lister-gen: creates listers for CustomResources which offer a read-only caching layer for GET and LIST requests
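To give a sense of what deepcopy-gen produces, here is roughly what the generated zz_generated.deepcopy.go looks like for the Cluster type defined later in this article (abridged and slightly simplified; the real generated file covers every type in the package):

package v1alpha1

import (
	runtime "k8s.io/apimachinery/pkg/runtime"
)

// DeepCopyInto copies the receiver into out; in must be non-nil.
func (in *Cluster) DeepCopyInto(out *Cluster) {
	*out = *in
	out.TypeMeta = in.TypeMeta
	in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
	out.Spec = in.Spec
	in.Status.DeepCopyInto(&out.Status)
}

// DeepCopy creates a new, deeply copied Cluster.
func (in *Cluster) DeepCopy() *Cluster {
	if in == nil {
		return nil
	}
	out := new(Cluster)
	in.DeepCopyInto(out)
	return out
}

// DeepCopyObject satisfies the runtime.Object interface that client-go requires.
func (in *Cluster) DeepCopyObject() runtime.Object {
	if c := in.DeepCopy(); c != nil {
		return c
	}
	return nil
}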
First, here is the directory layout we will run code generation against:
[root@localhost kubecon2018]# tree ./pkg
./pkg
└── apis
    └── clusterprovisioner
        ├── register.go
        └── v1alpha1
            ├── doc.go
            ├── register.go
            └── types.go
Start with the register.go file under pkg/apis/clusterprovisioner, which defines only the constant GroupName:
package clusterprovisioner

const (
	GroupName = "clusterprovisioner.rke.io"
)
The pkg/apis/clusterprovisioner/v1alpha1/doc.go file contains the following:
// +k8s:deepcopy-gen=package,register

// Package v1alpha1 is the v1alpha1 version of the API.
// +groupName=clusterprovisioner.rke.io
package v1alpha1
What matters here are the comments: they contain the tags used during code generation. Tags generally take the form // +tag-name or // +tag-name=value, and come in two kinds:
- Global tags, placed above package in doc.go
- Local tags, placed above a type that is processed
Global tags are written in the doc.go file, whose typical path is pkg/apis/<apigroup>/<version>/doc.go. In our example:
// +k8s:deepcopy-gen=package,register tells deepcopy-gen to create a deepcopy method for every type in the package by default; a type that does not need the method can opt out with // +k8s:deepcopy-gen=false in its local tags.
// +groupName=clusterprovisioner.rke.io defines the API group name.
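For example, a package-internal helper type that should be skipped by the generator would carry the local tag directly above its definition (a hypothetical type, shown only to illustrate tag placement):

// +k8s:deepcopy-gen=false
type provisioningScratch struct {
	inFlight map[string]bool
}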
In pkg/apis/clusterprovisioner/v1alpha1/register.go, the scheme and API types are registered, as shown below:
package v1alpha1

import (
	"github.com/rancher/kubecon2018/pkg/apis/clusterprovisioner"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

var (
	SchemeBuilder = runtime.NewSchemeBuilder(addKnownTypes)
	AddToScheme   = SchemeBuilder.AddToScheme
)

var SchemeGroupVersion = schema.GroupVersion{Group: clusterprovisioner.GroupName, Version: "v1alpha1"}

func Resource(resource string) schema.GroupResource {
	return SchemeGroupVersion.WithResource(resource).GroupResource()
}

func addKnownTypes(scheme *runtime.Scheme) error {
	scheme.AddKnownTypes(SchemeGroupVersion,
		&Cluster{},
		&ClusterList{},
	)
	metav1.AddToGroupVersion(scheme, SchemeGroupVersion)
	return nil
}
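AddToScheme is what consumers call to make these types known to a runtime.Scheme; the generated clientset does this internally. A minimal sketch of using it directly:

package main

import (
	"fmt"

	clusterv1alpha1 "github.com/rancher/kubecon2018/pkg/apis/clusterprovisioner/v1alpha1"
	"k8s.io/apimachinery/pkg/runtime"
)

func main() {
	// Register Cluster and ClusterList into a fresh scheme so that
	// clients can serialize and deserialize these objects.
	scheme := runtime.NewScheme()
	if err := clusterv1alpha1.AddToScheme(scheme); err != nil {
		panic(err)
	}
	// Prints the GroupVersionKinds now known to the scheme.
	fmt.Println(scheme.AllKnownTypes())
}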
Finally, the resource types themselves are defined in pkg/apis/clusterprovisioner/v1alpha1/types.go:
package v1alpha1

import (
	"github.com/rancher/norman/condition"
	"k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type ClusterConditionType string

const (
	// ClusterConditionReady Cluster ready to serve API (healthy when true, unhealthy when false)
	ClusterConditionReady condition.Cond = "Ready"
	// ClusterConditionProvisioned Cluster is provisioned by RKE
	ClusterConditionProvisioned condition.Cond = "Provisioned"
)

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +resource:path=cluster
// +genclient:noStatus
// +genclient:nonNamespaced
type Cluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ClusterSpec   `json:"spec"`
	Status ClusterStatus `json:"status"`
}

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +resource:path=kubeconfig
// +genclient:noStatus
// +genclient:nonNamespaced
type Kubeconfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec KubeconfigSpec `json:"spec"`
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +resource:path=clusters
type ClusterList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata"`

	Items []Cluster `json:"items"`
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +resource:path=kubeconfigs
type KubeconfigList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata"`

	Items []Kubeconfig `json:"items"`
}

type KubeconfigSpec struct {
	ConfigPath string `json:"configPath,omitempty"`
}

type ClusterSpec struct {
	ConfigPath string `json:"configPath,omitempty"`
}

type ClusterStatus struct {
	AppliedConfig string `json:"appliedConfig,omitempty"`
	// Conditions represent the latest available observations of an object's current state:
	// More info: https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#typical-status-properties
	Conditions []ClusterCondition `json:"conditions,omitempty"`
}

type ClusterCondition struct {
	// Type of cluster condition.
	Type ClusterConditionType `json:"type"`
	// Status of the condition, one of True, False, Unknown.
	Status v1.ConditionStatus `json:"status"`
	// The last time this condition was updated.
	LastUpdateTime string `json:"lastUpdateTime,omitempty"`
	// Last time the condition transitioned from one status to another.
	LastTransitionTime string `json:"lastTransitionTime,omitempty"`
	// The reason for the condition's last transition.
	Reason string `json:"reason,omitempty"`
	// Human-readable message indicating details about last transition
	Message string `json:"message,omitempty"`
}
Taking the Cluster resource as an example, we can see that its definition includes Metadata, Spec, and Status, along with the conditions used inside Status. This file also demonstrates the use of local tags.
Next comes the code generation itself. The generation script, scripts/update-codegen.sh in the repository, uses the code generators introduced above:
#!/bin/bash

APIS_DIR="github.com/rancher/kubecon2018/pkg/apis/clusterprovisioner"
VERSION="v1alpha1"
APIS_VERSION_DIR="${APIS_DIR}/${VERSION}"
OUTPUT_DIR="github.com/rancher/kubecon2018/pkg/client"
CLIENTSET_DIR="${OUTPUT_DIR}/clientset"
LISTERS_DIR="${OUTPUT_DIR}/listers"
INFORMERS_DIR="${OUTPUT_DIR}/informers"

echo Generating deepcopy
deepcopy-gen --input-dirs ${APIS_VERSION_DIR} \
    -O zz_generated.deepcopy --bounding-dirs ${APIS_DIR}

echo Generating clientset
client-gen --clientset-name versioned \
    --input-base '' --input ${APIS_VERSION_DIR} \
    --clientset-path ${CLIENTSET_DIR}

echo Generating lister
lister-gen --input-dirs ${APIS_VERSION_DIR} \
    --output-package ${LISTERS_DIR}

echo Generating informer
informer-gen --input-dirs ${APIS_VERSION_DIR} \
    --versioned-clientset-package "${CLIENTSET_DIR}/versioned" \
    --listers-package ${LISTERS_DIR} \
    --output-package ${INFORMERS_DIR}
How do you obtain these code generation tools? Create a k8s.io directory under $GOPATH/src, cd into it, and run git clone https://github.com/kubernetes/code-generator.git. Inside the code-generator directory, check out the branch matching your Kubernetes series, release-1.8 here (corresponding to the Kubernetes 1.8 line), then run the install command go install ./cmd/{defaulter-gen,client-gen,lister-gen,informer-gen,deepcopy-gen}. The corresponding executables are produced under $GOPATH/bin.
With the tools in place, the generation script can be run. Executing update-codegen.sh may fail with errors about missing dependencies on the kubernetes and apimachinery projects; download both from GitHub into $GOPATH/src/k8s.io, switch them to the release-1.8 branch, and run the script again. The generated code ends up under pkg/client (clientset, listers, informers), as laid out by the script.
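Once generation succeeds, the versioned clientset can be used like any typed Kubernetes client. The following is a sketch under the assumption that the package layout matches the script above; the exact group method name, here ClusterprovisionerV1alpha1, follows from the generated package names, and the kubeconfig path is just an example:

package main

import (
	"fmt"

	clientset "github.com/rancher/kubecon2018/pkg/client/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build client config from an existing kubeconfig file.
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	cs, err := clientset.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// Clusters() takes no namespace argument because the type is
	// tagged +genclient:nonNamespaced.
	clusters, err := cs.ClusterprovisionerV1alpha1().Clusters().List(metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, c := range clusters.Items {
		fmt.Println(c.Name, c.Spec.ConfigPath)
	}
}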
Installing RKE
Running the demo also requires the RKE tool. What is RKE? In the official words: Rancher Kubernetes Engine, an extremely simple, lightning fast Kubernetes installer that works everywhere.
Under $GOPATH/src/github.com/rancher/, run git clone https://github.com/rancher/rke.git, then go build to produce the rke executable.
Running the Code
With the tooling ready, we can run the code. Alena Prokharchyk's demo code is concise and easy to follow, so it is not analyzed in depth here. The main thing to look at is the yml file for the Cluster CRD; when the code runs it effectively performs kubectl apply -f on this file, creating the Cluster resource definition:
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: clusters.clusterprovisioner.rke.io
spec:
  group: clusterprovisioner.rke.io
  version: v1alpha1
  names:
    kind: Cluster
    plural: clusters
  scope: Cluster
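The demo registers this CRD from code when it starts. Doing the same by hand with the apiextensions clientset looks roughly like this (a sketch; it assumes k8s.io/apiextensions-apiserver is vendored at a release-1.8-compatible version, and the kubeconfig path is an example):

package main

import (
	apiextv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
	apiextcs "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	cs, err := apiextcs.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// The in-Go equivalent of the yml above.
	crd := &apiextv1beta1.CustomResourceDefinition{
		ObjectMeta: metav1.ObjectMeta{Name: "clusters.clusterprovisioner.rke.io"},
		Spec: apiextv1beta1.CustomResourceDefinitionSpec{
			Group:   "clusterprovisioner.rke.io",
			Version: "v1alpha1",
			Names: apiextv1beta1.CustomResourceDefinitionNames{
				Kind:   "Cluster",
				Plural: "clusters",
			},
			Scope: apiextv1beta1.ClusterScoped,
		},
	}
	if _, err := cs.ApiextensionsV1beta1().CustomResourceDefinitions().Create(crd); err != nil {
		panic(err)
	}
}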
Next, look at the controller-related code. The interesting parts are the callbacks registered on the informer and the processing a worker performs once an object lands in the workqueue. The callbacks look like this:
......
controller.syncQueue = util.NewTaskQueue(controller.sync)
controller.clusterInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
	AddFunc: func(obj interface{}) {
		controller.syncQueue.Enqueue(obj)
	},
	UpdateFunc: func(old, cur interface{}) {
		controller.syncQueue.Enqueue(cur)
	},
})
......
The worker is shown below: after getting a key from the workqueue, it checks the cluster's DeletionTimestamp metadata to decide between removing and adding the Cluster; both the remove and add paths end up invoking RKE.
func (c *Controller) sync(key string) {
	cluster, err := c.clusterLister.Get(key)
	if err != nil {
		c.syncQueue.Requeue(key, err)
		return
	}
	if cluster.DeletionTimestamp != nil {
		err = c.handleClusterRemove(cluster)
	} else {
		err = c.handleClusterAdd(cluster)
	}
	if err != nil {
		c.syncQueue.Requeue(key, err)
		return
	}
}
Running the demo requires a few code changes to match your own environment:
1) config/rkespec/cluster_aws.yml

nodes:
- address: 10.18.74.175
  user: tht
  role: [controlplane,worker,etcd]

2) config/spec/cluster.yml

apiVersion: clusterprovisioner.rke.io/v1alpha1
kind: Cluster
metadata:
  name: clusteraws
  labels:
    rke: "true"
spec:
  # example of cluster.yml file for RKE can be found here:
  # https://github.com/rancher/rke/blob/master/cluster.yml
  configPath: /root/go/src/github.com/rancher/kubecon2018/config/rkespec/cluster_aws.yml

3) controllers/provisioner/provisioner.go

func removeCluster(cluster *types.Cluster) (err error) {
	cmdName := "/root/go/src/github.com/rancher/rke/rke"
	cmdArgs := []string{"remove", "--force", "--config", cluster.Spec.ConfigPath}
	return executeCommand(cmdName, cmdArgs)
}

func provisionCluster(cluster *types.Cluster) (err error) {
	cmdName := "/root/go/src/github.com/rancher/rke/rke"
	cmdArgs := []string{"up", "--config", cluster.Spec.ConfigPath}
	return executeCommand(cmdName, cmdArgs)
}
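executeCommand itself is a thin wrapper around os/exec. A minimal sketch of what such a helper can look like (the demo's actual implementation may differ in its logging and error handling):

package provisioner

import (
	"os"
	"os/exec"
)

// executeCommand runs an external binary (here: rke) and streams its
// output to the controller's stdout/stderr so provisioning progress is visible.
func executeCommand(cmdName string, cmdArgs []string) error {
	cmd := exec.Command(cmdName, cmdArgs...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}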
After go build, the program can be run; its --kubeconfig argument can point at the kube config file of an existing Kubernetes environment. The demo mainly shows the creation of the CRD, the running of the controller, and the controller's handling when a cluster object is created or deleted through the Kubernetes API, including the invocation of RKE (which sets up an all-in-one Kubernetes environment on a single node).
You can then verify the result, for example the registered CRD with kubectl get crd, and the created cluster object with kubectl get clusters.
Problems Encountered
1) When user was set to root in config/rkespec/cluster_aws.yml, invoking rke failed with an ssh tunnel error. Some research showed this only happens on CentOS; you must use a non-root user and perform the following steps (the official reply from Rancher):
If you run rke on CentOS 7, you should not use the root user to open the ssh tunnel. You can try the following steps to run rke. On all nodes:
- update openssh to 7.4, and use docker v1.12.6
- set "AllowTcpForwarding yes" and "PermitTunnel yes" in /etc/ssh/sshd_config, then restart the sshd service
- make sure the host which runs rke can ssh to all nodes without a password
- run "groupadd docker" to create the docker group, if it does not already exist
- run "useradd -g docker yourusername" to create the yourusername user with docker as its group
- set MountFlags=shared in docker.service (vi /xxx/xxx/docker.service)
- run "su yourusername" to switch to that user, then restart the docker service, so that in the yourusername session docker.sock is created at /var/run/docker.sock
- in cluster.yml, set the ssh user to yourusername (in setup hosts)
nodes:
- address: x.x.x.x
  ...
  user: yourusername
- address: x.x.x.x
  ...
  user: yourusername
- in cluster.yml, set the kubelet to use the systemd cgroup driver (in setup hosts)
services:
  kubelet:
    image: rancher/k8s:v1.8.3-rancher2
    extra_args: {"cgroup-driver":"systemd","fail-swap-on":"false"}
Now you can run "rke -d up" to set up your k8s cluster.
2) The first attempt to create a Cluster object (kubectl create -f ./config/spec/cluster.yml) failed because RKE was not set up correctly. Deleting the object (kubectl delete -f ./config/spec/cluster.yml) reported success, but the program kept looping, printing "Removing cluster clusteraws". Adding log output to the code revealed the error: "error removing cluster clusteraws the server does not allow this method on the requested resource (put clusters.clusterprovisioner.rke.io clusteraws)". Running kubectl create -f ./config/spec/cluster.yml again produced the same kind of error; the cause is still unclear.
3) While the error in (2) was occurring, on the node provisioned by RKE a stream of volumes kept being created under /var/lib/docker/volumes/, until the filesystem filled up.
Using Rancher RKE
First, create the configuration file RKE needs. Here we deploy an all-in-one environment on a single CentOS node:
nodes:
- address: 10.18.74.175
  user: tht
  ssh_key_path: /home/tht/.ssh/id_rsa
  ssh_key: |-
    -----BEGIN RSA PRIVATE KEY-----
    MIIEowIBAAKCAQEA2OvoeczZdB2/3gvY+sHKVp5AFDBBJ6Gc2TcNqn1PuahtuZR2
    zdV5Q32x2VUz/H2qXkcQVNcfCnWH4j557Aj580acRHzjCPjVJbW5E3++QW8KXWMS
    XP+BfVqE4XuJAlXaK0utBBcPuPvzB7Y71tpFFFWIHZZQs9x1xDwwB3DNouI/JP8H
    dHAICuFm4vZkfpWNvOeAL1hvTCg6okv5ChPEP3Rk83Gq4TfiFDtMF/QrgtkKFTCS
    039/mlxeKpeJENyfzKK2DrmVpFxd1MWrUCVjvV2Ht8fzeOMXqlbccWAthnrlNCbn
    jU0GdThvRKL0azBxkiuNpIs0+oqG0HSVmg99PwIDAQABAoIBAHmfY3wPIAkbuPz9
    dY265ADGv7TSDWX0FiYv2OizU+ULi2HW3PmxbEksC3CIdhpmNwSfIYgACXZqyWJP
    lzqBGeuNtoYr43ufUJrRFdDZ+clkQdJ0ftJHq8ml3AU0p2/4xNcrmflGGNml4fB7
    +3cOcFbjUesM4XjG7fy1plQ1qgZdfXW6rODw1u2aQKt6GDmcw2L2mAUWlWuTre5k
    iZCTuhkdCCgQpeW3ZFZ83h4EYAGhZPXFR4V5Y3WWBnjimfKhdkOB7ajDxWkBoKr7
    fiwArFSo5y3e2RWysDj+wqLX4JZqiUmNHzzlFAHNgPEK/hj5UQMepNxtd4ND8F9p
    y1us/fkCgYEA9pSUxWK10NexNdyRBb4nPjGoqzLayRAu+aLGZ5ZFo8vhr1QCCP8f
    S9tnPXyaT90ufEuylVubo1NFGbtDTWUVxGRc2HgjjpN4uPaXTcD5pof3mDe923co
    1hIOxWSAroXoy5+Y+l0YRq94oS/tI9DJSQ51ORI5xCJn1etjO5TA680CgYEA4TVH
    l+XILaNjs/+OW9IVz0xq8lxxM+8CXPS4QL8hMFZRadmI4XtWteiPlPGujEJF7SKp
    B6C5UiFq1A1IaU9NQ45lBwsq8AjrNl45q/YxJmvAqPuAEnigeN9qA72e3ilasadn
    cVDrqqnZnojkKjZKzY0ZXY7zaJOOjHLEYCX6uTsCgYB8Ar3Ph5VpMxEsxYEqIjga
    T19Euo7OEBWP9w1Ri4H6ns8iHl3nqGdU/0Ms6T2ybMq0OF3YP/pGadqW1ldC1VPd
    MZyAQeugCQrt+xadRDBKUJd1NpOFjKg9AVfsbl9JZo9t2RZW0/shkZ5ZcoERQi/5
    TgwmZ8QloCgYrgl6LZXZAQKBgDHo2OD075QNrb7qV+ZJfMPgL6NekUftJBztrxfK
    Q9SujIRkzU0LRIAz9f4QQZqb5VtUXxltqSRme4JbHz0XcgwStpkFBJMFpvr5jtZp
    TSMyphPNCOkPCqE/AgOqNlcN2yeb7fTS9idwVOYpeEdSmOlM594wHAmFCgZeON8G
    C7aZAoGBAI1DD0bSjdErvb/Aq4PzLxkj/JF5GMbphX9eOJG5RpEuqW6AmKyDFJKp
    1RL/X7Md8MqI5KfsLPUp9N3SWrqln46NfPi0dFZwqAE2/OMQQd5L9kkPrwMnDz1e
    CpPDrwMxRa75FbZ6i1WzpqTb3cvTRrndGC5jrmU5mWbkoOY4oRlR
    -----END RSA PRIVATE KEY-----
  role:
  - controlplane
  - etcd
  - worker
services:
  kubelet:
    extra_args: {"cgroup-driver":"systemd","fail-swap-on":"false"}
Running ./rke --debug up --config ../kubecon2018/config/rkespec/cluster_rke.yml completed the single-node Kubernetes deployment in very little time, which is genuinely convenient. Note, though, that this configuration is minimal and the environment is a single node; RKE can do far more than this, and there is detailed documentation on GitHub covering it.
Summary
Using the demo provided by Rancher Labs principal software engineer Alena Prokharchyk, this article walked through a concrete example of implementing a Kubernetes CRD and writing a custom controller, and briefly exercised the Rancher RKE tool.
References
https://blog.openshift.com/kubernetes-deep-dive-code-generation-customresources/
https://kubernetes.io/cn/docs/concepts/overview/working-with-objects/kubernetes-objects/
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md