Preface
A replica set provides data redundancy and increases read capacity, but it does not improve write capacity; the key problem is that, as data grows, the hardware of a single machine becomes the performance bottleneck. A sharded cluster solves this by scaling out horizontally. A sharded deployment relies on three components: mongos (router), config servers (configuration service), and shards.
shard: each shard stores a portion of the sharded data, and each shard can itself be deployed as a replica set.
mongos: acts as the query router, providing the access interface between clients and the sharded cluster.
config server: the config servers store the cluster's metadata and configuration.
Deployment plan
1. Service distribution
3 machines: mongodbA 192.168.128.128, mongodbB 192.168.128.129, mongodbC 192.168.128.130
3 shards; the shards and the config service are each deployed as replica sets, distributed as shown below.
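The layout below is reconstructed from the port plan and the replica set configurations used later in this document; which of the mongos ports (22000/23000) runs on mongodbA and mongodbB is an assumption taken from the port plan.

machine                      mongos   config   shard1            shard2            shard3
mongodbA (192.168.128.128)   23000    33000    41030 (arbiter)   42030             43030
mongodbB (192.168.128.129)   22000    32000    41020             42020 (arbiter)   43020
mongodbC (192.168.128.130)   21000    31000    41010             42010             43010 (arbiter)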
When a replica set has an even number of data-bearing members, an arbiter is usually added so that the set keeps an odd number of voting members; here an arbiter is configured in each shard replica set mainly to demonstrate how the nodes can be distributed.
In an actual production deployment, the layout can be adapted to the machine resources available.
2. Directory plan
mongos:
/data/mongo-cluster/mongos/log
config:
/data/mongo-cluster/config/data
/data/mongo-cluster/config/log
shard1:
/data/mongo-cluster/shard1/data
/data/mongo-cluster/shard1/log
shard2:
/data/mongo-cluster/shard2/data
/data/mongo-cluster/shard2/log
shard3:
/data/mongo-cluster/shard3/data
/data/mongo-cluster/shard3/log
3. Port plan
mongos: 21000, 22000, 23000
config: 31000, 32000, 33000
shard1: 41010, 41020, 41030
shard2: 42010, 42020, 42030
shard3: 43010, 43020, 43030
Preparation
1. Download the installation package
Refer to tutorials available online.
Version installed for this document:
db version v3.6.4
git version: d0181a711f7e7f39e60b5aeb1dc7097bf6ae5856
OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
allocator: tcmalloc
modules: enterprise
build environment:
    distmod: ubuntu1604
    distarch: x86_64
    target_arch: x86_64
2. Create the deployment directories
mkdir /data/mongo-cluster -p
mkdir /data/mongo-cluster/mongos/log -p
mkdir /data/mongo-cluster/shard1/data -p
mkdir /data/mongo-cluster/shard1/log -p
mkdir /data/mongo-cluster/shard2/data -p
mkdir /data/mongo-cluster/shard2/log -p
mkdir /data/mongo-cluster/shard3/data -p
mkdir /data/mongo-cluster/shard3/log -p
mkdir /data/mongo-cluster/config/log -p
mkdir /data/mongo-cluster/config/data -p
mkdir /data/mongo-cluster/keyfile -p
3. Create the configuration files
cd /data/mongo-cluster
A. mongod configuration file mongo_node.conf
mongo_node.conf is the configuration file shared by all mongod instances; its contents are as follows:
# mongod.conf
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
#  dbPath: /var/lib/mongodb
  journal:
    enabled: true
  engine: wiredTiger
#  mmapv1:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
#  path: /var/log/mongodb/mongod.log

# network interfaces
#net:
#  port: 27017
#  bindIp: 127.0.0.1,192.168.147.128
#  bindIpAll: true

# how the process runs
processManagement:
  timeZoneInfo: /usr/share/zoneinfo
  fork: true

security:
  authorization: "enabled"

#operationProfiling:

#replication:

#sharding:

## Enterprise-Only Options:

#auditLog:

#snmp:
B. mongos configuration file mongos.conf
systemLog:
  destination: file
  logAppend: true
processManagement:
  fork: true
4. Create the keyfile
cd /data/mongo-cluster/keyfile
openssl rand -base64 756 > mongo.key
chmod 400 mongo.key
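Every member of the cluster must use an identical keyfile for internal authentication, so copy mongo.key to the same path on the other two machines and keep the 400 permissions. One possible way to do this (assuming root SSH access between the machines) is:

scp /data/mongo-cluster/keyfile/mongo.key root@192.168.128.129:/data/mongo-cluster/keyfile/
scp /data/mongo-cluster/keyfile/mongo.key root@192.168.128.128:/data/mongo-cluster/keyfile/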
Cluster setup
Contents of the management script mongoCluster.sh; it reads its pid files from the current directory, so create it in and run it from /data/mongo-cluster:
#!/bin/sh

WORK_DIR=/data/mongo-cluster
KEYFILE=$WORK_DIR/keyfile/mongo.key
CONFFILE=$WORK_DIR/mongo_node.conf
MONGOS_CONFFILE=$WORK_DIR/mongos.conf
MONGOD=/usr/bin/mongod
MONGOS=/usr/bin/mongos

# Start the mongos router on this host.
mongos_start_cmd()
{
    $MONGOS --port 21000 --bind_ip 127.0.0.1,192.168.128.130 \
        --configdb configReplSet/192.168.128.130:31000,192.168.128.129:32000,192.168.128.128:33000 \
        --keyFile $KEYFILE --pidfilepath $WORK_DIR/mongos.pid \
        --logpath $WORK_DIR/mongos/log/mongo.log --config $MONGOS_CONFFILE
    return $?
}

# Start the config server replica set member on this host.
config_start_cmd()
{
    $MONGOD --port 31000 --bind_ip 127.0.0.1,192.168.128.130 --configsvr --replSet configReplSet \
        --keyFile $KEYFILE --dbpath $WORK_DIR/config/data --pidfilepath $WORK_DIR/mongo_config.pid \
        --logpath $WORK_DIR/config/log/mongo.log --config $CONFFILE
    return $?
}

# Start the shard1/shard2/shard3 members on this host.
shard_start_cmd()
{
    $MONGOD --port 41010 --bind_ip 127.0.0.1,192.168.128.130 --shardsvr --replSet shard1 \
        --keyFile $KEYFILE --dbpath $WORK_DIR/shard1/data --pidfilepath $WORK_DIR/mongo_shard1.pid \
        --logpath $WORK_DIR/shard1/log/mongo.log --config $CONFFILE && \
    $MONGOD --port 42010 --bind_ip 127.0.0.1,192.168.128.130 --shardsvr --replSet shard2 \
        --keyFile $KEYFILE --dbpath $WORK_DIR/shard2/data --pidfilepath $WORK_DIR/mongo_shard2.pid \
        --logpath $WORK_DIR/shard2/log/mongo.log --config $CONFFILE && \
    $MONGOD --port 43010 --bind_ip 127.0.0.1,192.168.128.130 --shardsvr --replSet shard3 \
        --keyFile $KEYFILE --dbpath $WORK_DIR/shard3/data --pidfilepath $WORK_DIR/mongo_shard3.pid \
        --logpath $WORK_DIR/shard3/log/mongo.log --config $CONFFILE
}

mongos_server()
{
    case $1 in
    start)
        mongos_start_cmd
        ;;
    stop)
        echo "stopping mongos server..."
        cat mongos.pid | xargs kill -9
        ;;
    restart)
        echo "stopping mongos server..."
        cat mongos.pid | xargs kill -9
        echo "starting mongos server..."
        mongos_start_cmd
        ;;
    status)
        pid=`cat mongos.pid`
        if ps -ef | grep -v "grep" | grep $pid > /dev/null 2>&1; then
            status="running"
        else
            status="stopped"
        fi
        echo "mongos server[$pid] status: ${status}"
        ;;
    *)
        echo "Usage: `basename $0` mongos {start|stop|restart|status}"
        ;;
    esac
}

config_server()
{
    case $1 in
    start)
        config_start_cmd
        ;;
    stop)
        echo "stopping config server..."
        cat mongo_config.pid | xargs kill -9
        ;;
    restart)
        echo "stopping config server..."
        cat mongo_config.pid | xargs kill -9
        echo "starting config server..."
        config_start_cmd
        ;;
    status)
        pid=`cat mongo_config.pid`
        if ps -ef | grep -v "grep" | grep $pid > /dev/null 2>&1; then
            status="running"
        else
            status="stopped"
        fi
        echo "config server[$pid] status: ${status}"
        ;;
    *)
        echo "Usage: `basename $0` config {start|stop|restart|status}"
        ;;
    esac
}

shard_server()
{
    case $1 in
    start)
        shard_start_cmd
        ;;
    stop)
        cat mongo_shard1.pid | xargs kill -9
        cat mongo_shard2.pid | xargs kill -9
        cat mongo_shard3.pid | xargs kill -9
        ;;
    restart)
        cat mongo_shard1.pid | xargs kill -9
        cat mongo_shard2.pid | xargs kill -9
        cat mongo_shard3.pid | xargs kill -9
        shard_start_cmd
        ;;
    status)
        # shard1
        pid=`cat mongo_shard1.pid`
        if ps -ef | grep -v "grep" | grep $pid > /dev/null 2>&1; then
            status="running"
        else
            status="stopped"
        fi
        echo "shard1 server[$pid] status: ${status}"
        # shard2
        pid=`cat mongo_shard2.pid`
        if ps -ef | grep -v "grep" | grep $pid > /dev/null; then
            status="running"
        else
            status="stopped"
        fi
        echo "shard2 server[$pid] status: ${status}"
        # shard3
        pid=`cat mongo_shard3.pid`
        if ps -ef | grep -v "grep" | grep $pid > /dev/null; then
            status="running"
        else
            status="stopped"
        fi
        echo "shard3 server[$pid] status: ${status}"
        ;;
    *)
        echo "Usage: `basename $0` shard {start|stop|restart|status}"
        ;;
    esac
}

all_server()
{
    case $1 in
    start)
        mongos_server start
        config_server start
        shard_server start
        ;;
    stop)
        mongos_server stop
        config_server stop
        shard_server stop
        cat mongo_config.pid | xargs kill -9
        cat mongo_shard1.pid | xargs kill -9
        cat mongo_shard2.pid | xargs kill -9
        cat mongo_shard3.pid | xargs kill -9
        ;;
    restart)
        mongos_server restart
        config_server restart
        shard_server restart
        ;;
    status)
        mongos_server status
        config_server status
        shard_server status
        exit 0
        ;;
    *)
        echo "Usage: `basename $0` {all|mongos|config|shard} {start|stop|restart|status}"
        exit 1
        ;;
    esac
}

#set -e
case $1 in
all)
    all_server $2
    ;;
mongos)
    mongos_server $2
    ;;
config)
    config_server $2
    ;;
shard)
    shard_server $2
    ;;
*)
    echo "Usage: `basename $0` {all|mongos|config|shard} {start|stop|restart|status}"
    ;;
esac
On each host, the IP address bound in the script must be changed accordingly, and the port numbers must follow the port plan above (for example, the config server listens on 31000 on mongodbC, 32000 on mongodbB, and 33000 on mongodbA).
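As an illustration of the per-host adjustment, on mongodbB (192.168.128.129) the config server start function would, following the port plan and the replica set configuration below, look roughly like this; the shard functions change in the same way (ports 41020, 42020, 43020 on that host):

config_start_cmd()
{
    # mongodbB binds its own address and uses port 32000 for the config server
    $MONGOD --port 32000 --bind_ip 127.0.0.1,192.168.128.129 --configsvr --replSet configReplSet \
        --keyFile $KEYFILE --dbpath $WORK_DIR/config/data --pidfilepath $WORK_DIR/mongo_config.pid \
        --logpath $WORK_DIR/config/log/mongo.log --config $CONFFILE
    return $?
}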
1. Deploy the config server replica set
On each of mongodbA, mongodbB, and mongodbC, run:
./mongoCluster.sh config start
If the startup succeeds, each process prints a message like "child process started successfully, parent exiting" and forks into the background.
Connect to one of the config servers and initialize the replica set:
mongo --port 31000 --host 127.0.0.1
>cfg={ _id:"configReplSet", configsvr: true, members:[ {_id:0, host:'192.168.128.128:33000'}, {_id:1, host:'192.168.128.129:32000'}, {_id:2, host:'192.168.128.130:31000'} ]}; >rs.initiate(cfg);
2. Deploy the shard replica sets
On each of mongodbA, mongodbB, and mongodbC, run:
./mongoCluster.sh shard start
If the startup succeeds, the same "child process started successfully, parent exiting" message is printed for each shard process.
Connect to one of the shard1 processes and initialize its replica set:
mongo --port 41010 --host 127.0.0.1
>cfg={ _id:"shard1", members:[ {_id:0, host:'192.168.128.128:41030',arbiterOnly:true}, {_id:1, host:'192.168.128.129:41020'}, {_id:2, host:'192.168.128.130:41010'} ]}; rs.initiate(cfg);
Repeat the same operation for shard2 and shard3:
shard2
cfg={ _id:"shard2", members:[ {_id:0, host:'192.168.128.128:42030'}, {_id:1, host:'192.168.128.129:42020',arbiterOnly:true}, {_id:2, host:'192.168.128.130:42010'} ]}; rs.initiate(cfg);
shard3
cfg={ _id:"shard3", members:[ {_id:0, host:'192.168.128.128:43030'}, {_id:1, host:'192.168.128.129:43020'}, {_id:2, host:'192.168.128.130:43010',arbiterOnly:true} ]}; rs.initiate(cfg);
3. Deploy mongos
On each of mongodbA, mongodbB, and mongodbC, run:
./mongoCluster.sh mongos start
If the startup succeeds, mongos likewise prints the startup message and forks into the background.
Connect to one of the mongos instances and add the shards:
./bin/mongo --port 21000 --host 127.0.0.1
>sh.addShard("shard1/192.168.128.130:41010") { "shardAdded" : "shard1", "ok" : 1 } >sh.addShard("shard2/192.168.128.129:42010") { "shardAdded" : "shard2", "ok" : 1 } >sh.addShard("shard2/192.168.128.128:43030") { "shardAdded" : "shard3", "ok" : 1 }
4. Initialize users
Initialize the mongos user
Connect to one of the mongos instances and add an administrator user:
>use admin
>db.createUser({
    user: 'admin',
    pwd: 'Admin@01',
    roles: [
        {role: 'clusterAdmin', db: 'admin'},
        {role: 'userAdminAnyDatabase', db: 'admin'},
        {role: 'dbAdminAnyDatabase', db: 'admin'},
        {role: 'readWriteAnyDatabase', db: 'admin'}
    ]
})
This admin user has cluster administration privileges and read/write access to all databases.
Note that once the first user has been created, the localhost exception no longer applies; all subsequent operations must authenticate first.
Check the cluster status with sh.status().
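Because the localhost exception is gone, authenticate before querying; a quick check from a mongos with the admin user created above might look like:

mongo --port 21000 --host 127.0.0.1
>use admin
>db.auth('admin', 'Admin@01')
>sh.status()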
Initialize shard replica set users
All access to the sharded cluster goes through the mongos entry point, while the authentication data is stored in the config replica set: the system.users collection on the config servers holds the cluster's users and their role assignments. mongos and the shard instances authenticate to each other through internal authentication (the keyfile mechanism), so local users can additionally be created on the shard instances to make direct administration easier. Within a replica set, users and privileges only need to be created on the Primary; the data is replicated to the Secondaries automatically.
>use admin
>db.createUser({
    user: 'admin',
    pwd: 'Admin123',
    roles: [
        {role: 'clusterAdmin', db: 'admin'},
        {role: 'userAdminAnyDatabase', db: 'admin'},
        {role: 'dbAdminAnyDatabase', db: 'admin'},
        {role: 'readWriteAnyDatabase', db: 'admin'}
    ]
})
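With this local user in place, a shard member can be logged into directly for maintenance; for example (assuming port 41010 on this host is currently the shard1 Primary):

mongo --port 41010 --host 127.0.0.1
>use admin
>db.auth('admin', 'Admin123')
>rs.status()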
Data migration
Sharding a collection normally only makes sense for collections holding large amounts of data; small collections do not need to be sharded, and a poorly chosen shard key can actually hurt query performance.
Initialize data sharding
First create the database user uhome and enable sharding on the uhome database:
use uhome
db.createUser({user: 'uhome', pwd: 'Uhome123', roles: [{role: 'dbOwner', db: 'uhome'}]})
sh.enableSharding("uhome")
Taking the customer collection as an example:
use uhome
db.createCollection("customer")
sh.shardCollection("uhome.customer", {community_id: "hashed"}, false)
Import the data:
mongoimport --port 21000 --db uhome --collection customer -u uhome -p Uhome123 --type csv --headerline --ignoreBlanks --file ./customer-mongo.csv
Check the data distribution:
db.customer.getShardDistribution()
As the output shows, the data is distributed fairly evenly.