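It is easy to give the pods of a StatefulSet DNS records. If a StatefulSet is named memcached and its serviceName is memcached-cluster, then kube-dns resolves DNS A records like the following for each of its pods (the names include the namespace, assumed here to be default):
- memcached-0.memcached-cluster.default.svc.cluster.local
- memcached-1.memcached-cluster.default.svc.cluster.local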
- ...
This assumes the cluster domain is the default, cluster.local. The official documentation also briefly describes pod DNS for StatefulSets.
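To make the naming concrete, here is a minimal sketch of a StatefulSet and its governing headless service that would produce records of this shape. Only the two names (memcached and memcached-cluster) come from the text above; the labels, image tag, port, and replica count are assumptions for illustration.

```yaml
# Headless service that governs the StatefulSet's pod DNS names.
apiVersion: v1
kind: Service
metadata:
  name: memcached-cluster
spec:
  clusterIP: None                  # headless: DNS answers with pod IPs directly
  selector:
    app: memcached                 # assumed label
  ports:
  - name: memcache
    port: 11211                    # memcached's default port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: memcached
spec:
  serviceName: memcached-cluster   # must match the headless service name
  replicas: 2                      # assumed; yields memcached-0 and memcached-1
  selector:
    matchLabels:
      app: memcached
  template:
    metadata:
      labels:
        app: memcached
    spec:
      containers:
      - name: memcached
        image: memcached:1.5       # assumed image tag
        ports:
        - containerPort: 11211
```

The StatefulSet controller names the pods memcached-0, memcached-1, and so on, and each pod name becomes the first label of its DNS record under the service's domain.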
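So can pods that are not managed by a StatefulSet have DNS records too? In Chapter 27 of this column we saw that etcd operator, when generating the etcd startup command, uses the pod's DNS record rather than its IP. That tells us the answer is yes: a pod can have its own DNS record.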
Let's deploy the nginx Deployment from the official documentation, defined as follows:
    apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2 # tells deployment to run 2 pods matching the template
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
Deploying it creates two pods:
    $ kubectl apply -f nginx-deployment.yaml
    $ kubectl get pod -o wide
    NAME                                READY   STATUS    RESTARTS   AGE   IP          NODE    NOMINATED NODE
    nginx-deployment-67594d6bf6-brfcw   1/1     Running   0          18m   10.32.0.6   k8s-0   <none>
    nginx-deployment-67594d6bf6-nnxg5   1/1     Running   0          18m   10.44.0.6   k8s-1   <none>
Then define the following headless service:
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      clusterIP: None
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 9330
      selector:
        app: nginx
      type: ClusterIP
Create the service and try to resolve the service DNS name:
    $ kubectl apply -f nginx-service.yaml
    service/nginx created
    $ kubectl get svc
    NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   47m
    nginx        ClusterIP   None         <none>        80/TCP    21s
    $ dig @10.96.0.10 nginx.default.svc.cluster.local
    ; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx.default.svc.cluster.local
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; WARNING: .local is reserved for Multicast DNS
    ;; You are currently testing what happens when an mDNS query is leaked to DNS
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13949
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ; COOKIE: 095945e3dc8b19b8 (echoed)
    ;; QUESTION SECTION:
    ;nginx.default.svc.cluster.local. IN A
    ;; ANSWER SECTION:
    nginx.default.svc.cluster.local. 5 IN A 10.32.0.6
    nginx.default.svc.cluster.local. 5 IN A 10.44.0.6
    ;; Query time: 0 msec
    ;; SERVER: 10.96.0.10#53(10.96.0.10)
    ;; WHEN: Fri Apr 19 14:50:13 CST 2019
    ;; MSG SIZE rcvd: 166
As expected, kube-dns returns multiple A records for the service name, one per pod. The 10.96.0.10 used in the dig command above is the cluster IP of kube-dns, which you can look up in the kube-system namespace:
    $ kubectl -n kube-system get svc
    NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
    kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP   52m
Now try prefixing the service name with a pod name and asking kube-dns to resolve it:
    $ dig @10.96.0.10 nginx-deployment-67594d6bf6-brfcw.nginx.default.svc.cluster.local
    ; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx-deployment-67594d6bf6-brfcw.nginx.default.svc.cluster.local
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; WARNING: .local is reserved for Multicast DNS
    ;; You are currently testing what happens when an mDNS query is leaked to DNS
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 10513
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ; COOKIE: 2ddad0cd6f72dfd7 (echoed)
    ;; QUESTION SECTION:
    ;nginx-deployment-67594d6bf6-brfcw.nginx.default.svc.cluster.local. IN A
    ;; AUTHORITY SECTION:
    cluster.local. 30 IN SOA ns.dns.cluster.local. hostmaster.cluster.local. 1555656528 7200 1800 86400 30
    ;; Query time: 0 msec
    ;; SERVER: 10.96.0.10#53(10.96.0.10)
    ;; WHEN: Fri Apr 19 14:51:01 CST 2019
    ;; MSG SIZE rcvd: 199
It does not resolve. The official documentation has a section, Pod’s hostname and subdomain fields, which says:
The Pod spec also has an optional subdomain field which can be used to specify its subdomain. For example, a Pod with hostname set to “foo”, and subdomain set to “bar”, in namespace “my-namespace”, will have the fully qualified domain name (FQDN) “foo.bar.my-namespace.svc.cluster.local”.
If there exists a headless service in the same namespace as the pod and with the same name as the subdomain, the cluster’s KubeDNS Server also returns an A record for the Pod’s fully qualified hostname.
Edit nginx-deployment.yaml to add subdomain:
    apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2 # tells deployment to run 2 pods matching the template
      template:
        metadata:
          labels:
            app: nginx
        spec:
          subdomain: nginx
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
Update the deployment and try resolving the pod DNS name again (note that both pods have been recreated, so their names have changed):
    $ kubectl apply -f nginx-deployment.yaml
    $ kubectl get pod -o wide
    NAME                                READY   STATUS    RESTARTS   AGE   IP          NODE    NOMINATED NODE
    nginx-deployment-59d854d9c8-nbvdm   1/1     Running   0          11s   10.32.0.8   k8s-0   <none>
    nginx-deployment-59d854d9c8-thvkd   1/1     Running   0          13s   10.44.0.7   k8s-1   <none>
    $ dig @10.96.0.10 nginx-deployment-59d854d9c8-thvkd.nginx.default.svc.cluster.local
    ; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx-deployment-59d854d9c8-thvkd.nginx.default.svc.cluster.local
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; WARNING: .local is reserved for Multicast DNS
    ;; You are currently testing what happens when an mDNS query is leaked to DNS
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4952
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ; COOKIE: 0e2b8426f0e8fe33 (echoed)
    ;; QUESTION SECTION:
    ;nginx-deployment-59d854d9c8-thvkd.nginx.default.svc.cluster.local. IN A
    ;; AUTHORITY SECTION:
    cluster.local. 30 IN SOA ns.dns.cluster.local. hostmaster.cluster.local. 1555658111 7200 1800 86400 30
    ;; Query time: 0 msec
    ;; SERVER: 10.96.0.10#53(10.96.0.10)
    ;; WHEN: Fri Apr 19 15:18:14 CST 2019
    ;; MSG SIZE rcvd: 199
Still no luck!
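So let's try the example from the official documentation instead (from the same Pod’s hostname and subdomain fields section linked earlier), creating the pods directly without a Deployment. I first tried it with hostname and subdomain commented out; the manifest below is the full version with both fields set: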
    apiVersion: v1
    kind: Service
    metadata:
      name: default-subdomain
    spec:
      selector:
        name: busybox
      clusterIP: None
      ports:
      - name: foo # Actually, no port is needed.
        port: 1234
        targetPort: 1234
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox1
      labels:
        name: busybox
    spec:
      hostname: busybox-1
      subdomain: default-subdomain
      containers:
      - image: busybox:1.28
        command:
          - sleep
          - "3600"
        name: busybox
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox2
      labels:
        name: busybox
    spec:
      hostname: busybox-2
      subdomain: default-subdomain
      containers:
      - image: busybox:1.28
        command:
          - sleep
          - "3600"
        name: busybox
Deploy it and try to resolve the pod DNS name (note that the hostname differs from the pod name: it has an extra hyphen in the middle):
    $ kubectl apply -f individual-pods-example.yaml
    $ dig @10.96.0.10 busybox-1.default-subdomain.default.svc.cluster.local
    ; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 busybox-1.default-subdomain.default.svc.cluster.local
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; WARNING: .local is reserved for Multicast DNS
    ;; You are currently testing what happens when an mDNS query is leaked to DNS
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12636
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ; COOKIE: 5499ded915cf1ff2 (echoed)
    ;; QUESTION SECTION:
    ;busybox-1.default-subdomain.default.svc.cluster.local. IN A
    ;; ANSWER SECTION:
    busybox-1.default-subdomain.default.svc.cluster.local. 5 IN A 10.44.0.6
    ;; Query time: 0 msec
    ;; SERVER: 10.96.0.10#53(10.96.0.10)
    ;; WHEN: Fri Apr 19 15:27:38 CST 2019
    ;; MSG SIZE rcvd: 163
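It finally works. In my testing, both hostname and subdomain had to be set explicitly; neither can be omitted. I then modified the earlier nginx Deployment to add a hostname as well and queried the resulting name: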
    $ dig @10.96.0.10 nginx.nginx.default.svc.cluster.local
    ; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @10.96.0.10 nginx.nginx.default.svc.cluster.local
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; WARNING: .local is reserved for Multicast DNS
    ;; You are currently testing what happens when an mDNS query is leaked to DNS
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16903
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ; COOKIE: 1a664224b34c5f87 (echoed)
    ;; QUESTION SECTION:
    ;nginx.nginx.default.svc.cluster.local. IN A
    ;; ANSWER SECTION:
    nginx.nginx.default.svc.cluster.local. 5 IN A 10.32.0.9
    nginx.nginx.default.svc.cluster.local. 5 IN A 10.44.0.8
    ;; Query time: 2 msec
    ;; SERVER: 10.96.0.10#53(10.96.0.10)
    ;; WHEN: Fri Apr 19 15:41:25 CST 2019
    ;; MSG SIZE rcvd: 184
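Sure enough, it resolves. However, a Deployment cannot give each pod a different hostname, so both pods share the same hostname and the name resolves to two IPs, which defeats the original purpose.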
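The modified Deployment is not shown in this post. Judging by the name that was queried (nginx.nginx.default.svc.cluster.local), the pod template presumably looked something like the sketch below. The `hostname: nginx` value is inferred from that query, not copied from the actual file.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      hostname: nginx      # inferred value; every replica gets this same hostname
      subdomain: nginx     # must match the name of the headless service
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
```

Because the pod template is shared by all replicas, there is no place in it to express a per-pod hostname, which is why the single name above maps to both pod IPs.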
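This also shows that etcd operator manages pods directly rather than through a Deployment. That makes sense: in this scenario there is no need to manage the pods through a Deployment, and it would be the wrong tool. The source code shows etcd operator setting hostname and subdomain on the pods, and the changelog records the corresponding change under Release 0.2.5.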
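In fact, some people in the community have asked for pods created by a Deployment to support DNS resolution as well, but the request did not gain enough support and was not adopted: Make pod hostname/subdomain feature compatible with Deployments.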