On Kubernetes, this MongoDB deployment relies mainly on mongo-k8s-sidecar (its image is available on Docker Hub).
1. Architecture
There are three main ways to build a MongoDB cluster: master-slave replication, the Replica Set mode, and sharding. Each has its pros and cons and suits different scenarios: Replica Set is the most widely used, master-slave replication is rarely used nowadays, and sharding is the most complete but also the most complex to configure and maintain. mongo-k8s-sidecar uses the Replica Set mode. A MongoDB Replica Set serves two main purposes: one is data redundancy for failure recovery, so that when a node goes down due to hardware failure or other causes, a replica can be used to recover; the other is read/write splitting, routing read requests to the secondaries to relieve read pressure on the Primary.
A binary (non-Kubernetes) MongoDB cluster needs no extra services; running a command like the following on the primary node is enough to create the replica set:
cfg = {
  _id: "testdb",
  members: [
    { _id: 0, host: '192.168.255.141:27017', priority: 2 },
    { _id: 1, host: '192.168.255.142:27017', priority: 1 },
    { _id: 2, host: '192.168.255.142:27019', arbiterOnly: true }
  ]
};
rs.initiate(cfg)
2. Deployment
This article is a hands-on MongoDB deployment. Because the sidecar needs permission to list the pods in the namespace in order to manage the replica set, when deploying this yourself remember to apply the RBAC configuration from section 2.5 first, and only then create the StatefulSet from section 2.4.
2.1 Namespace
kubectl create ns mongo
2.2 StorageClass
An NFS server, or any other storage backend that can provide a StorageClass, must be deployed beforehand.
Reference: Kubernetes 使用 NFS 做持久化存储 (using NFS for persistent storage in Kubernetes)
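The StorageClass below assumes a dynamic NFS provisioner is already running in the cluster with PROVISIONER_NAME set to fuseim.pri/ifs. As a rough sketch only (the ServiceAccount, NFS server address 192.168.1.100 and export path /data/nfs are placeholder assumptions, and the provisioner also needs its own RBAC, omitted here), such a provisioner is typically deployed like this:
# nfs-client-provisioner sketch -- server/path/ServiceAccount are placeholders
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
      - name: nfs-client-provisioner
        image: quay.io/external_storage/nfs-client-provisioner:latest
        volumeMounts:
        - name: nfs-client-root
          mountPath: /persistentvolumes
        env:
        - name: PROVISIONER_NAME
          value: fuseim.pri/ifs        # must match the StorageClass provisioner below
        - name: NFS_SERVER
          value: 192.168.1.100         # placeholder NFS server address
        - name: NFS_PATH
          value: /data/nfs             # placeholder NFS export path
      volumes:
      - name: nfs-client-root
        nfs:
          server: 192.168.1.100
          path: /data/nfs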
# mongo-cluster-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mongodb-data
provisioner: fuseim.pri/ifs
# create
kubectl create -f mongo-cluster-sc.yaml
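A quick check that the StorageClass has been registered:
# the new StorageClass should be listed with provisioner fuseim.pri/ifs
kubectl get storageclass mongodb-data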
2.3 Headless Service
apiVersion: v1
kind: Service
metadata:
  name: mongo
  namespace: mongo
  labels:
    name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
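Assuming the manifest above is saved as mongo-headless-svc.yaml (an illustrative file name, not from the original setup), create it and confirm that the service has no cluster IP:
kubectl create -f mongo-headless-svc.yaml
# CLUSTER-IP should show "None" for a headless service
kubectl get svc mongo -n mongo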
2.4 StatefulSet
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
  namespace: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo
        environment: prod
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mongo
        image: harbor.s.com/redis/mongo:3.4.22
        command:
        - mongod
        - "--replSet"
        - rs0
        - "--bind_ip"
        - 0.0.0.0
        - "--smallfiles"
        - "--noprealloc"
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
      - name: mongo-sidecar
        image: harbor.s.com/redis/mongo-k8s-sidecar
        env:
        - name: MONGO_SIDECAR_POD_LABELS
          value: "role=mongo,environment=prod"
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: mongodb-data
      resources:
        requests:
          storage: 10Gi
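Assuming the manifest is saved as mongo-statefulset.yaml (again an illustrative name), create it and watch the pods come up; a StatefulSet starts them strictly in order, and each replica gets its own PVC from the volumeClaimTemplate:
kubectl create -f mongo-statefulset.yaml
# pods are created sequentially: mongo-0, then mongo-1, then mongo-2
kubectl get pods -n mongo -w
# one PVC per replica, bound through the mongodb-data StorageClass
kubectl get pvc -n mongo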
2.5 RBAC
At this point, checking the replica set status shows that it is not working yet:
kubectl exec -it mongo-0 -n mongo -- mongo
Defaulting container name to mongo.
Use 'kubectl describe pod/mongo-0 -n mongo' to see all of the containers in this pod.
MongoDB shell version v3.4.22
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.22
Server has startup warnings:
2019-08-24T09:23:57.039+0000 I CONTROL [initandlisten]
2019-08-24T09:23:57.039+0000 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2019-08-24T09:23:57.039+0000 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2019-08-24T09:23:57.039+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2019-08-24T09:23:57.039+0000 I CONTROL [initandlisten]
2019-08-24T09:23:57.040+0000 I CONTROL [initandlisten]
2019-08-24T09:23:57.040+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2019-08-24T09:23:57.040+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2019-08-24T09:23:57.040+0000 I CONTROL [initandlisten]
2019-08-24T09:23:57.040+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2019-08-24T09:23:57.040+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2019-08-24T09:23:57.040+0000 I CONTROL [initandlisten]
> rs.status()
{
"info" : "run rs.initiate(...) if not yet done for the set",
"ok" : 0,
"errmsg" : "no replset config has been received",
"code" : 94,
"codeName" : "NotYetInitialized"
}
>
This suggests that mongo-k8s-sidecar is not configured correctly; check its logs:
kubectl logs mongo-0 mongo-sidecar -n mongo
···
Error in workloop { [Error: [object Object]]
message:
{ kind: 'Status',
apiVersion: 'v1',
metadata: {},
status: 'Failure',
message:
'pods is forbidden: User "system:serviceaccount:mongo:default" cannot list resource "pods" in API group "" at the cluster scope',
reason: 'Forbidden',
details: { kind: 'pods' },
code: 403 },
statusCode: 403 }
The message shows that the default ServiceAccount assigned to the pod is not allowed to list pods. This issue was raised on GitHub long ago and the author provided a fix: grant the default ServiceAccount the extra permission to list pods. In practice, however, even after granting system:serviceaccount:mongo:default the list permission on pods with a namespaced Role, the error persists. The RBAC configuration that did not help is shown below:
Reference: mongo-k8s-sidecar/role.yaml at 2640ed1c2971b1279c2961efd257cde9fbe39574 · cvallance/mongo-k8s-sidecar · GitHub
# configuration that was applied but still did not help
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: mongo
  name: mongo-pod-read
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mongo-pod-read
  namespace: mongo
subjects:
- kind: ServiceAccount
  name: default
  namespace: mongo
roleRef:
  kind: Role
  name: mongo-pod-read
  apiGroup: rbac.authorization.k8s.io
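One way to see why this Role is not enough: the 403 above says the sidecar lists pods "at the cluster scope", and a namespaced Role can never grant that. kubectl auth can-i makes the difference visible:
# allowed by the Role above: listing pods inside the mongo namespace
kubectl auth can-i list pods -n mongo --as=system:serviceaccount:mongo:default
# what the sidecar actually attempts: listing pods across all namespaces -> "no"
kubectl auth can-i list pods --all-namespaces --as=system:serviceaccount:mongo:default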
So a different approach is needed: give this ServiceAccount broader permissions. Since the sidecar lists pods at the cluster scope, a namespaced Role is not enough; here the built-in ClusterRole view is used for the grant. A ClusterRole can be thought of as a reusable template: it can be bound cluster-wide with a ClusterRoleBinding or scoped to a single namespace with a RoleBinding.
Reference: GCE - K8s 1.8 - pods is forbidden - Cannot list pods - Unknown user "system:serviceaccount:default:default" · Issue #75 · cvallance/mongo-k8s-sidecar · GitHub
# working RBAC configuration
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: mongo-default-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
- kind: ServiceAccount
  name: default
  namespace: mongo
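Save the binding as, say, mongo-rbac.yaml (an illustrative name), apply it, and re-run the permission check from above:
kubectl create -f mongo-rbac.yaml
# should now answer "yes"
kubectl auth can-i list pods --all-namespaces --as=system:serviceaccount:mongo:default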
A running pod cannot switch to a different ServiceAccount after it has been created, so to be sure the sidecar starts with the new permissions in effect, recreate the pods:
# check the StatefulSet
kubectl get statefulset -n mongo
NAME READY AGE
mongo 3/3 23h
# scale down to 0
kubectl scale statefulset mongo -n mongo --replicas=0
statefulset.apps/mongo scaled
# after a moment, scale back up to 3 replicas
kubectl scale statefulset mongo -n mongo --replicas=3
statefulset.apps/mongo scaled
# verify that everything is back up
kubectl get all -n mongo
NAME READY STATUS RESTARTS AGE
pod/mongo-0 2/2 Running 0 21s
pod/mongo-1 2/2 Running 0 17s
pod/mongo-2 2/2 Running 0 12s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mongo ClusterIP None <none> 27017/TCP 23h
NAME READY AGE
statefulset.apps/mongo 3/3 23h
Check the replica set status again: it is now healthy and the cluster has been created successfully:
kubectl exec -it mongo-0 -n mongo -- mongo
rs0:PRIMARY> rs.status()
{
"set" : "rs0",
"date" : ISODate("2019-08-25T08:58:12.550Z"),
"myState" : 1,
"term" : NumberLong(2),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
},
"appliedOpTime" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
},
"durableOpTime" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
}
},
"members" : [
{
"_id" : 0,
"name" : "10.244.4.87:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 19,
"optime" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
},
"optimeDurable" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2019-08-25T08:58:05Z"),
"optimeDurableDate" : ISODate("2019-08-25T08:58:05Z"),
"lastHeartbeat" : ISODate("2019-08-25T08:58:11.877Z"),
"lastHeartbeatRecv" : ISODate("2019-08-25T08:58:11.192Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "10.244.3.65:27017",
"syncSourceHost" : "10.244.3.65:27017",
"syncSourceId" : 3,
"infoMessage" : "",
"configVersion" : 171757
},
{
"_id" : 1,
"name" : "10.244.5.9:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 19,
"optime" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
},
"optimeDurable" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2019-08-25T08:58:05Z"),
"optimeDurableDate" : ISODate("2019-08-25T08:58:05Z"),
"lastHeartbeat" : ISODate("2019-08-25T08:58:11.875Z"),
"lastHeartbeatRecv" : ISODate("2019-08-25T08:58:11.478Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "10.244.4.87:27017",
"syncSourceHost" : "10.244.4.87:27017",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 171757
},
{
"_id" : 3,
"name" : "10.244.3.65:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 80,
"optime" : {
"ts" : Timestamp(1566723485, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2019-08-25T08:58:05Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "could not find member to sync from",
"electionTime" : Timestamp(1566723473, 1),
"electionDate" : ISODate("2019-08-25T08:57:53Z"),
"configVersion" : 171757,
"self" : true,
"lastHeartbeatMessage" : ""
}
],
"ok" : 1
}
rs0:PRIMARY>
2.6 Scaling
To scale MongoDB out, simply adjust the StatefulSet's replica count:
kubectl scale statefulset mongo --replicas=4 -n mongo
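After scaling, a new pod mongo-3 and a matching PVC should be created, and the sidecar should add the new member to the replica set. A rough way to verify:
# mongo-3 should appear, with its own PVC
kubectl get pods -n mongo
kubectl get pvc -n mongo
# member count as seen by the replica set itself
kubectl exec -it mongo-0 -n mongo -- mongo --quiet --eval 'rs.status().members.length'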
3. Usage / Access
The default connection string for a MongoDB cluster is of the form:
mongodb://mongo1,mongo2,mongo3:27017/dbname_?
In Kubernetes, the most common way to address a service is through its FQDN:
# $(podName).$(headlessServiceName).$(namespace).svc.cluster.local
Because the pods are created by a StatefulSet, their names follow a fixed pattern, so for a 4-replica MongoDB cluster the connection string above becomes (assuming the client is outside the mongo namespace):
mongodb://mongo-0.mongo.mongo.svc.cluster.local:27017,mongo-1.mongo.mongo.svc.cluster.local:27017,mongo-2.mongo.mongo.svc.cluster.local:27017,mongo-3.mongo.mongo.svc.cluster.local:27017/?replicaSet=rs0
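For a quick connectivity test from inside the cluster, a throwaway client pod can be started with the same image used in the StatefulSet (the pod name mongo-client is just for illustration), here against the default three replicas:
kubectl run mongo-client -n mongo -it --rm --restart=Never \
  --image=harbor.s.com/redis/mongo:3.4.22 -- \
  mongo "mongodb://mongo-0.mongo.mongo.svc.cluster.local:27017,mongo-1.mongo.mongo.svc.cluster.local:27017,mongo-2.mongo.mongo.svc.cluster.local:27017/?replicaSet=rs0"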
4. Monitoring
Monitoring uses the prometheus-mongodb-exporter Helm chart.
4.1 Deploy the exporter
Note: if the uri points at a cluster (multiple hosts), it must be wrapped in double quotes, otherwise all sorts of alerts fire; I stumbled into this pit many times.
The uri appears to be static rather than auto-discovered, so if replicas are added to or removed from the cluster, the uri has to be changed through helm and the exporter pod recreated after the configuration update.
The image section only exists to make deployment easier on the intranet: the default image was pulled and pushed into Harbor without any other modification, so it can be ignored.
# vi values.yaml  -- edit the custom values
mongodb:
  uri: "mongodb://mongo-0.mongo.mongo.svc.cluster.local:27017,mongo-1.mongo.mongo.svc.cluster.local:27017,mongo-2.mongo.mongo.svc.cluster.local:27017,mongo-3.mongo.mongo.svc.cluster.local:27017/?replicaSet=rs0"
image:
  repository: harbor.s.com/mongo/mongodb-exporter
  tag: 0.7.0
# deploy
helm upgrade --install mongo-exporter stable/prometheus-mongodb-exporter -f values.yaml --namespace mongo --force
# check the result
kubectl port-forward service/mongo-exporter-prometheus-mongodb-exporter 9216
curl http://127.0.0.1:9216/metrics
4.2 Configure Prometheus Operator
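With Prometheus Operator, the usual approach for this chart is to let it create a ServiceMonitor so Prometheus scrapes the exporter automatically. This is a hedged sketch only: the exact keys vary between chart versions and the label must match your Prometheus serviceMonitorSelector, so verify against the chart's own values.yaml before use.
# values.yaml (sketch, assumed keys -- check the chart's values.yaml)
serviceMonitor:
  enabled: true
  # label assumed to match the Prometheus Operator's serviceMonitorSelector
  additionalLabels:
    release: prometheus-operator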
Thanks to:
1. Kubernetes RBAC 詳解 (qikqiak.com / 陽明的博客)
2. 使用 StatefulSet 搭建 MongoDB 集群 (Wiki.Shileizcc.com)
3. cvallance/mongo-k8s-sidecar: Kubernetes sidecar for Mongo (GitHub)
4. MongoDB集群搭建及使用 (SuperMap技術控, CSDN博客)
