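In chapter 20 of the column “深入剖析Kubernetes”, we learned that giving the Pods of a StatefulSet DNS records is easy. If a StatefulSet is named memcached and its serviceName is memcached-cluster, kube-dns serves a DNS A record for each of its pods:
- memcached-0.memcached-cluster.<namespace>.svc.cluster.local
- memcached-1.memcached-cluster.<namespace>.svc.cluster.local
- ...
This assumes the cluster domain is the default cluster.local; <namespace> is the namespace the StatefulSet runs in. The official documentation also briefly describes pod DNS for StatefulSets.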
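The naming rule can be spelled out as a small shell sketch. The function name statefulset_pod_fqdns is hypothetical, and the namespace default and the cluster domain cluster.local are assumptions for illustration; it only formats names and does not talk to a cluster.

```shell
#!/bin/sh
# Print the DNS names that the pods of a StatefulSet are expected to get.
# Usage: statefulset_pod_fqdns <statefulset> <serviceName> <namespace> <replicas> [cluster-domain]
# (hypothetical helper, not part of kubectl or Kubernetes)
statefulset_pod_fqdns() (
  name=$1
  service=$2
  namespace=$3
  replicas=$4
  domain=${5:-cluster.local}   # assumes the default cluster domain
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Pod hostname is <statefulset>-<ordinal>; the subdomain is the serviceName.
    printf '%s-%s.%s.%s.svc.%s\n' "$name" "$i" "$service" "$namespace" "$domain"
    i=$((i + 1))
  done
)

statefulset_pod_fqdns memcached memcached-cluster default 2
```

With two replicas in the default namespace this prints memcached-0.memcached-cluster.default.svc.cluster.local and memcached-1.memcached-cluster.default.svc.cluster.local, one per line.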
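So can pods that are not managed by a StatefulSet have DNS records too? In chapter 27 of the same column, we saw that the etcd operator uses pod DNS names rather than IPs when it generates etcd's startup command. That suggests the answer is yes: a pod can have its own DNS record.
Let's deploy the nginx Deployment from the official documentation, defined as follows: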
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
After deployment, two pods are created:
$ kubectl apply -f nginx-deployment.yaml
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
nginx-deployment-67594d6bf6-brfcw 1/1 Running 0 18m 10.32.0.6 k8s-0 <none>
nginx-deployment-67594d6bf6-nnxg5 1/1 Running 0 18m 10.44.0.6 k8s-1 <none>
Then define the following headless service:
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  clusterIP: None
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80 # was 9330; ignored for headless services, so keep it equal to port
  selector:
    app: nginx
  type: ClusterIP
Create the service and try to resolve the service's DNS name:
$ kubectl apply -f nginx-service.yaml
service/nginx created
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 47m
nginx ClusterIP None <none> 80/TCP 21s
$ dig @10.96.0.10 nginx.default.svc.cluster.local
; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13949
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 095945e3dc8b19b8 (echoed)
;; QUESTION SECTION:
;nginx.default.svc.cluster.local. IN A
;; ANSWER SECTION:
nginx.default.svc.cluster.local. 5 IN A 10.32.0.6
nginx.default.svc.cluster.local. 5 IN A 10.44.0.6
;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Fri Apr 19 14:50:13 CST 2019
;; MSG SIZE rcvd: 166
As expected, kube-dns returns multiple A records for the service name, one per pod. The 10.96.0.10 used in the dig command above is the cluster IP of kube-dns, which you can look up in the kube-system namespace:
$ kubectl -n kube-system get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 52m
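The two manual steps (find the kube-dns cluster IP, then point dig at it) can be combined into one helper. This is a minimal sketch: the function name resolve_via_kube_dns is hypothetical, and it assumes the DNS Service is called kube-dns in the kube-system namespace, as in the output above, and that kubectl and dig are available on the machine running it.

```shell
#!/bin/sh
# Look up the kube-dns Service's cluster IP, then query it directly with dig.
# Usage: resolve_via_kube_dns <fully-qualified-name>
# (hypothetical helper; assumes the Service kube-system/kube-dns exists)
resolve_via_kube_dns() (
  fqdn=$1
  # jsonpath extracts just the cluster IP from the Service object.
  dns_ip=$(kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}') || exit 1
  # +short prints only the answer data, one address per line for A records.
  dig +short "@${dns_ip}" "$fqdn" A
)

# Example call against a live cluster:
# resolve_via_kube_dns nginx.default.svc.cluster.local
```

An empty result from this helper corresponds to a query that returned no A records, so it is a quick way to repeat the checks in this post without reading the full dig output.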
Now try prepending the pod name to the service name and asking kube-dns to resolve it:
$ dig @10.96.0.10 nginx-deployment-67594d6bf6-brfcw.nginx.default.svc.cluster.local
; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx-deployment-67594d6bf6-brfcw.nginx.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 10513
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 2ddad0cd6f72dfd7 (echoed)
;; QUESTION SECTION:
;nginx-deployment-67594d6bf6-brfcw.nginx.default.svc.cluster.local. IN A
;; AUTHORITY SECTION:
cluster.local. 30 IN SOA ns.dns.cluster.local. hostmaster.cluster.local. 1555656528 7200 1800 86400 30
;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Fri Apr 19 14:51:01 CST 2019
;; MSG SIZE rcvd: 199
It does not resolve. The section "Pod’s hostname and subdomain fields" in the official documentation says:
The Pod spec also has an optional subdomain field which can be used to specify its subdomain. For example, a Pod with hostname set to “foo”, and subdomain set to “bar”, in namespace “my-namespace”, will have the fully qualified domain name (FQDN) “foo.bar.my-namespace.svc.cluster.local”.
If there exists a headless service in the same namespace as the pod and with the same name as the subdomain, the cluster’s KubeDNS Server also returns an A record for the Pod’s fully qualified hostname.
Edit nginx-deployment.yaml to add subdomain:
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      subdomain: nginx
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Update the deployment and try to resolve the pod's DNS name again (note that both pods have been recreated, so their names have changed):
$ kubectl apply -f nginx-deployment.yaml
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
nginx-deployment-59d854d9c8-nbvdm 1/1 Running 0 11s 10.32.0.8 k8s-0 <none>
nginx-deployment-59d854d9c8-thvkd 1/1 Running 0 13s 10.44.0.7 k8s-1 <none>
$ dig @10.96.0.10 nginx-deployment-59d854d9c8-thvkd.nginx.default.svc.cluster.local
; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx-deployment-59d854d9c8-thvkd.nginx.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4952
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 0e2b8426f0e8fe33 (echoed)
;; QUESTION SECTION:
;nginx-deployment-59d854d9c8-thvkd.nginx.default.svc.cluster.local. IN A
;; AUTHORITY SECTION:
cluster.local. 30 IN SOA ns.dns.cluster.local. hostmaster.cluster.local. 1555658111 7200 1800 86400 30
;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Fri Apr 19 15:18:14 CST 2019
;; MSG SIZE rcvd: 199
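Still no luck!
So let's try the example from the official documentation (the same "Pod’s hostname and subdomain fields" section linked earlier) and create the pods directly instead of going through a Deployment. The manifest below sets both hostname and subdomain on each pod: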
apiVersion: v1
kind: Service
metadata:
  name: default-subdomain
spec:
  selector:
    name: busybox
  clusterIP: None
  ports:
  - name: foo # Actually, no port is needed.
    port: 1234
    targetPort: 1234
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox1
  labels:
    name: busybox
spec:
  hostname: busybox-1
  subdomain: default-subdomain
  containers:
  - image: busybox:1.28
    command:
    - sleep
    - "3600"
    name: busybox
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox2
  labels:
    name: busybox
spec:
  hostname: busybox-2
  subdomain: default-subdomain
  containers:
  - image: busybox:1.28
    command:
    - sleep
    - "3600"
    name: busybox
Deploy it and try to resolve the pod's DNS name (note that the hostname differs from the pod name: it has an extra hyphen in the middle):
$ kubectl apply -f individual-pods-example.yaml
$ dig @10.96.0.10 busybox-1.default-subdomain.default.svc.cluster.local
; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 busybox-1.default-subdomain.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12636
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 5499ded915cf1ff2 (echoed)
;; QUESTION SECTION:
;busybox-1.default-subdomain.default.svc.cluster.local. IN A
;; ANSWER SECTION:
busybox-1.default-subdomain.default.svc.cluster.local. 5 IN A 10.44.0.6
;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Fri Apr 19 15:27:38 CST 2019
;; MSG SIZE rcvd: 163
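It finally works. In my own testing, hostname and subdomain both had to be set explicitly; leaving either one out breaks resolution. I then added a hostname (nginx) to the earlier nginx deployment, and the name now resolves: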
$ dig @10.96.0.10 nginx.nginx.default.svc.cluster.local
; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx.nginx.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16903
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 1a664224b34c5f87 (echoed)
;; QUESTION SECTION:
;nginx.nginx.default.svc.cluster.local. IN A
;; ANSWER SECTION:
nginx.nginx.default.svc.cluster.local. 5 IN A 10.32.0.9
nginx.nginx.default.svc.cluster.local. 5 IN A 10.44.0.8
;; Query time: 2 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Fri Apr 19 15:41:25 CST 2019
;; MSG SIZE rcvd: 184
However, a deployment cannot give each pod a different hostname, so both pods share the same hostname and the name resolves to two IPs, which is not what we wanted.
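One way to see the collision is to list the hostname and subdomain that each pod actually carries. The sketch below wraps kubectl's custom-columns output in a helper; the function name show_pod_dns_fields is hypothetical, and the default namespace is an assumption.

```shell
#!/bin/sh
# List the pod spec fields that decide whether a pod gets its own DNS record.
# Usage: show_pod_dns_fields [namespace]
# (hypothetical helper; the namespace defaults to "default")
show_pod_dns_fields() (
  namespace=${1:-default}
  # custom-columns prints <none> for a field that is not set on a pod.
  kubectl -n "$namespace" get pods \
    -o custom-columns='NAME:.metadata.name,HOSTNAME:.spec.hostname,SUBDOMAIN:.spec.subdomain,IP:.status.podIP'
)

# Example call against a live cluster:
# show_pod_dns_fields default
```

For pods created from a single Deployment template, every row should show the same HOSTNAME and SUBDOMAIN values, which is why the one name maps to several IPs.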
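The per-pod hostname requirement also explains why the etcd operator manages pods directly rather than through a deployment. That makes sense: in this scenario there is no need to manage pods through a deployment, nor should one. You can see in the source code that the etcd operator sets hostname and subdomain on its pods, and the changelog records the corresponding change under Release 0.2.5.
People in the community have in fact asked for DNS resolution for pods created by a deployment, but the request did not gain enough support and was not adopted: Make pod hostname/subdomain feature compatible with Deployments.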