Preface
A replica set provides data redundancy and scales read capacity, but it cannot scale write capacity; the fundamental problem is that as the data grows, the hardware of a single machine becomes the performance bottleneck. A sharded cluster solves this by scaling horizontally. A sharded deployment relies on three components: mongos (router), config (config servers), and shard (shards).
shard: each shard stores a portion of the sharded data, and each shard can itself be deployed as a replica set
mongos: acts as the query router and provides the access point between clients and the sharded cluster (see the connection example after this list)
config server: the config servers store the cluster's metadata and configuration information
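Clients never talk to the shards directly; every read and write goes through a mongos. As a minimal sketch, a client could open a session against one of the mongos instances like this (the address and port come from the plan below; authentication, added in a later step, is omitted):
mongo --host 192.168.128.130 --port 21000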

Deployment Plan
1. Service layout
3 machines: mongodbA 192.168.128.128, mongodbB 192.168.128.129, mongodbC 192.168.128.130
3 shards; each shard and the config service are deployed as three-member replica sets spread across the three machines.

When a replica set would otherwise have an even number of voting members, an arbiter is added to break election ties; the arbiters here are configured mainly to illustrate how the nodes can be distributed.
In an actual production deployment, adjust the layout according to the machine resources available.
2. Directory layout
mongos:
/data/mongo-cluster/mongos/log
config:
/data/mongo-cluster/config/data
/data/mongo-cluster/config/log
shard1:
/data/mongo-cluster/shard1/data
/data/mongo-cluster/shard1/log
shard2:
/data/mongo-cluster/shard2/data
/data/mongo-cluster/shard2/log
shard3:
/data/mongo-cluster/shard3/data
/data/mongo-cluster/shard3/log
3. Port plan
mongos: 21000, 22000, 23000
config: 31000, 32000, 33000
shard1: 41010, 41020, 41030
shard2: 42010, 42020, 42030
shard3: 43010, 43020, 43030
Preparation
1. Download the installation package
See the online installation guides for downloading and installing MongoDB.
The version used for this document (output of mongod --version):
db version v3.6.4
git version: d0181a711f7e7f39e60b5aeb1dc7097bf6ae5856
OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
allocator: tcmalloc
modules: enterprise
build environment:
distmod: ubuntu1604
distarch: x86_64
target_arch: x86_64
2. Create the deployment directories
Create the directories below on each of the three machines:
mkdir /data/mongo-cluster -p
mkdir /data/mongo-cluster/mongos/log -p
mkdir /data/mongo-cluster/shard1/data -p
mkdir /data/mongo-cluster/shard1/log -p
mkdir /data/mongo-cluster/shard2/data -p
mkdir /data/mongo-cluster/shard2/log -p
mkdir /data/mongo-cluster/shard3/data -p
mkdir /data/mongo-cluster/shard3/log -p
mkdir /data/mongo-cluster/config/log -p
mkdir /data/mongo-cluster/config/data -p
mkdir /data/mongo-cluster/keyfile -p
3. Create the configuration files
cd /data/mongo-cluster
A. mongod configuration file: mongo_node.conf
mongo_node.conf is the configuration file shared by all mongod instances; its contents are as follows:
# mongod.conf
# for documentation of all options, see:
# http://docs.mongodb.org/manual/reference/configuration-options/
# Where and how to store data.
storage:
# dbPath: /var/lib/mongodb
  journal:
    enabled: true
  engine: wiredTiger
# mmapv1:
# wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
# path: /var/log/mongodb/mongod.log

# network interfaces
#net:
# port: 27017
# bindIp: 127.0.0.1,192.168.147.128
# bindIpAll: true

# how the process runs
processManagement:
  timeZoneInfo: /usr/share/zoneinfo
  fork: true

security:
  authorization: "enabled"

#operationProfiling:

#replication:

#sharding:

## Enterprise-Only Options:

#auditLog:

#snmp:
B. mongos configuration file: mongos.conf
systemLog:
  destination: file
  logAppend: true
processManagement:
  fork: true
4. Create the keyfile
The keyfile is used for internal authentication between the cluster members. Generate it in the keyfile directory created in step 2, which is where the startup script expects it:
cd /data/mongo-cluster/keyfile
openssl rand -base64 756 > mongo.key
chmod 400 mongo.key
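Every mongod and mongos in the cluster must use an identical keyfile, so copy the key to the same path on the other two machines; a minimal sketch, assuming root SSH access between the hosts:
# copy the key generated on this host to the other two cluster members (root access is an assumption)
scp /data/mongo-cluster/keyfile/mongo.key root@192.168.128.129:/data/mongo-cluster/keyfile/
scp /data/mongo-cluster/keyfile/mongo.key root@192.168.128.128:/data/mongo-cluster/keyfile/
Make sure the key remains readable only by the user running mongod (mode 400) on every host.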
Cluster Setup
Contents of mongoCluster.sh:
#!/bin/sh
WORK_DIR=/data/mongo-cluster
KEYFILE=$WORK_DIR/keyfile/mongo.key
CONFFILE=$WORK_DIR/mongo_node.conf
MONGOS_CONFFILE=$WORK_DIR/mongos.conf
MONGOD=/usr/bin/mongod
MONGOS=/usr/bin/mongos
mongos_start_cmd()
{
$MONGOS --port 21000 --bind_ip 127.0.0.1,192.168.128.130 --configdb configReplSet/192.168.128.130:31000,192.168.128.129:32000,192.168.128.128:33000 --keyFile $KEYFILE --pidfilepath $WORK_DIR/mongos.pid --logpath $WORK_DIR/mongos/log/mongo.log --config $MONGOS_CONFFILE
return $?
}
config_start_cmd()
{
$MONGOD --port 31000 --bind_ip 127.0.0.1,192.168.128.130 --configsvr --replSet configReplSet --keyFile $KEYFILE --dbpath $WORK_DIR/config/data --pidfilepath $WORK_DIR/mongo_config.pid --logpath $WORK_DIR/config/log/mongo.log --config $CONFFILE
return $?
}
shard_start_cmd()
{
$MONGOD --port 41010 --bind_ip 127.0.0.1,192.168.128.130 --shardsvr --replSet shard1 --keyFile $KEYFILE --dbpath $WORK_DIR/shard1/data --pidfilepath $WORK_DIR/mongo_shard1.pid --logpath $WORK_DIR/shard1/log/mongo.log --config $CONFFILE &&\
$MONGOD --port 42010 --bind_ip 127.0.0.1,192.168.128.130 --shardsvr --replSet shard2 --keyFile $KEYFILE --dbpath $WORK_DIR/shard2/data --pidfilepath $WORK_DIR/mongo_shard2.pid --logpath $WORK_DIR/shard2/log/mongo.log --config $CONFFILE &&\
$MONGOD --port 43010 --bind_ip 127.0.0.1,192.168.128.130 --shardsvr --replSet shard3 --keyFile $KEYFILE --dbpath $WORK_DIR/shard3/data --pidfilepath $WORK_DIR/mongo_shard3.pid --logpath $WORK_DIR/shard3/log/mongo.log --config $CONFFILE
}
mongos_server()
{
case $1 in
start)
mongos_start_cmd
;;
stop)
echo "stoping mongos server..."
cat mongos.pid | xargs kill -9
;;
restart)
echo "stoping mongos server..."
cat mongos.pid | xargs kill -9
echo "setup mongos server..."
mongos_start_cmd
;;
status)
pid=`cat mongos.pid`
if ps -ef | grep -v "grep" | grep $pid > /dev/null 2>&1; then
status="running"
else
status="stoped"
fi
echo "mongos server[$pid] status: ${status}"
;;
*)
echo "Usage: /etc/init.d/samba {start|stop|reload|restart|force-reload|status}"
;;
esac
}
config_server()
{
case $1 in
start)
config_start_cmd
;;
stop)
echo "stoping config server..."
cat mongo_config.pid | xargs kill -9
;;
restart)
echo "stoping config server..."
cat mongo_config.pid | xargs kill -9;
echo "setup config server..."
config_start_cmd
;;
status)
pid=`cat mongo_config.pid`
if ps -ef | grep -v "grep" | grep $pid > /dev/null 2>&1; then
status="running"
else
status="stoped"
fi
echo "config server[$pid] status: ${status}"
;;
*)
echo "Usage: `basename $0` config {start|stop|reload|restart|force-reload|status}"
;;
esac
}
shard_server()
{
case $1 in
start)
shard_start_cmd
;;
stop)
cat mongo_shard1.pid | xargs kill -9;
cat mongo_shard2.pid | xargs kill -9;
cat mongo_shard3.pid | xargs kill -9;
;;
restart)
cat mongo_shard1.pid | xargs kill -9;
cat mongo_shard2.pid | xargs kill -9;
cat mongo_shard3.pid | xargs kill -9;
shard_start_cmd
;;
status)
pid=`cat mongo_shard1.pid`
if ps -ef | grep -v "grep" | grep $pid > /dev/null 2>&1; then
status="running"
else
status="stoped"
fi
echo "shard1 server[$pid] status: ${status}"
#shard2
pid=`cat mongo_shard2.pid`
if ps -ef | grep -v "grep" | grep $pid > /dev/null; then
status="running"
else
status="stoped"
fi
echo "shard2 server[$pid] status: ${status}"
#shard3
pid=`cat mongo_shard3.pid`
if ps -ef | grep -v "grep" | grep $pid > /dev/null; then
status="running"
else
status="stoped"
fi
echo "shard3 server[$pid] status: ${status}"
;;
*)
echo "Usage: /etc/init.d/samba {start|stop|reload|restart|force-reload|status}"
;;
esac
}
all_server ()
{
case $1 in
start)
mongos_server start
config_server start
shard_server start
;;
stop)
mongos_server stop
config_server stop
shard_server stop
;;
restart)
mongos_server restart
config_server restart
shard_server restart
;;
status)
mongos_server status
config_server status
shard_server status
exit 0
;;
*)
echo "Usage: $0 {all|mongos|config|shard} {start|stop|reload|restart|force-reload|status}"
exit 1
;;
esac
}
#set -e
case $1 in
all)
all_server $2
;;
mongos)
mongos_server $2
;;
config)
config_server $2
;;
shard)
shard_server $2
;;
*)
echo "Usage: `basename $0` {all|mongos|config|shard} {start|stop|reload|restart|force-reload|status}"
;;
esac
The script above is written for mongodbC (192.168.128.130). On the other two hosts, adjust both the --bind_ip addresses and the ports to match the port plan (for example, on mongodbA the config server listens on 33000 and shard1 on 41030).
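For illustration, on mongodbA (192.168.128.128) the config server start line in config_start_cmd would become something like this (derived from the port plan and the replica set configuration used below):
$MONGOD --port 33000 --bind_ip 127.0.0.1,192.168.128.128 --configsvr --replSet configReplSet --keyFile $KEYFILE --dbpath $WORK_DIR/config/data --pidfilepath $WORK_DIR/mongo_config.pid --logpath $WORK_DIR/config/log/mongo.log --config $CONFFILE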
1. Deploy the config server replica set
On each of mongodbA, mongodbB and mongodbC,
run ./mongoCluster.sh config start
If the startup succeeds, mongod forks and prints "child process started successfully, parent exiting".
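You can also confirm that the config server process is running with the script's own status subcommand (run it from /data/mongo-cluster, where the pid files are written):
./mongoCluster.sh config status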

Connect to one of the config servers and initialize the replica set:
mongo --port 31000 --host 127.0.0.1
>cfg={
_id:"configReplSet",
configsvr: true,
members:[
{_id:0, host:'192.168.128.128:33000'},
{_id:1, host:'192.168.128.129:32000'},
{_id:2, host:'192.168.128.130:31000'}
]};
>rs.initiate(cfg);
2. Deploy the shard replica sets
On each of mongodbA, mongodbB and mongodbC:
run ./mongoCluster.sh shard start
If the startup succeeds, the same fork messages are printed (you can also verify with ./mongoCluster.sh shard status).

Connect to one of the shard1 members and initialize the replica set:
mongo --port 41010 --host 127.0.0.1
>cfg={
_id:"shard1",
members:[
{_id:0, host:'192.168.128.128:41030',arbiterOnly:true},
{_id:1, host:'192.168.128.129:41020'},
{_id:2, host:'192.168.128.130:41010'}
]};
rs.initiate(cfg);
Do the same for shard2 and shard3, connecting to a member of each set:
shard2
cfg={
_id:"shard2",
members:[
{_id:0, host:'192.168.128.128:42030'},
{_id:1, host:'192.168.128.129:42020',arbiterOnly:true},
{_id:2, host:'192.168.128.130:42010'}
]};
rs.initiate(cfg);
shard3
cfg={
_id:"shard3",
members:[
{_id:0, host:'192.168.128.128:43030'},
{_id:1, host:'192.168.128.129:43020'},
{_id:2, host:'192.168.128.130:43010',arbiterOnly:true}
]};
rs.initiate(cfg);
3. Deploy mongos
On each of mongodbA, mongodbB and mongodbC:
run ./mongoCluster.sh mongos start
If the startup succeeds, mongos reports the same fork messages (verify with ./mongoCluster.sh mongos status).

Connect to one of the mongos instances and add the shards:
mongo --port 21000 --host 127.0.0.1
>sh.addShard("shard1/192.168.128.130:41010")
{ "shardAdded" : "shard1", "ok" : 1 }
>sh.addShard("shard2/192.168.128.129:42010")
{ "shardAdded" : "shard2", "ok" : 1 }
>sh.addShard("shard2/192.168.128.128:43030")
{ "shardAdded" : "shard3", "ok" : 1 }
4. Initialize users
Initialize the mongos user
Connect to one of the mongos instances and add an administrator user:
>use admin
>db.createUser({
user:'admin',pwd:'Admin@01',
roles:[
{role:'clusterAdmin',db:'admin'},
{role:'userAdminAnyDatabase',db:'admin'},
{role:'dbAdminAnyDatabase',db:'admin'},
{role:'readWriteAnyDatabase',db:'admin'}
]})
This admin user has cluster administration privileges and read/write access to all databases.
Note that once the first user has been created, the localhost exception no longer applies, and all subsequent operations must be authenticated first.
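For example, an authenticated session against the same mongos can be opened like this (using the admin user created above):
mongo --port 21000 --host 127.0.0.1 -u admin -p 'Admin@01' --authenticationDatabase admin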

Check the cluster status with sh.status(); the shards section of the output should list shard1, shard2 and shard3.

Initialize the shard replica set users
All access to the sharded cluster goes through the mongos entry point, and the authentication data is stored in the config server replica set: the cluster's users and role assignments are kept in admin.system.users on the config servers. Authentication between mongos and the shard instances is handled by internal authentication (the keyfile mechanism), so local users can additionally be added on the shard instances to make direct operational maintenance easier. Within a replica set, users and privileges only need to be created on the Primary node; the data is automatically replicated to the Secondary nodes.
>use admin
>db.createUser({
user:'admin',pwd:'Admin123',
roles:[
{role:'clusterAdmin',db:'admin'},
{role:'userAdminAnyDatabase',db:'admin'},
{role:'dbAdminAnyDatabase',db:'admin'},
{role:'readWriteAnyDatabase',db:'admin'}
]})
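With this local user in place, maintenance can be performed directly against a shard, bypassing mongos. A minimal sketch, assuming the shard1 member on mongodbC is currently the Primary (check with rs.status() and adjust host and port as needed):
mongo --port 41010 --host 192.168.128.130 -u admin -p 'Admin123' --authenticationDatabase admin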
Data Migration
Collection sharding generally targets collections with large volumes of data; small collections do not need to be sharded, and a poorly chosen shard key can actually hurt query performance.
Initialize data sharding
First create the database user uhome and enable sharding for the uhome database:
use uhome
db.createUser({user:'uhome',pwd:'Uhome123',roles:[{role:'dbOwner',db:'uhome'}]})
sh.enableSharding("uhome")
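Applications can then connect to any of the mongos instances as this user; for example (a sketch using the mongos on the local machine):
mongo --port 21000 --host 127.0.0.1 -u uhome -p 'Uhome123' --authenticationDatabase uhome uhome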
Taking the customer collection as an example:
use uhome
db.createCollection("customer")
sh.shardCollection("uhome.customer", {community_id:"hashed"}, false)
Import the data:
mongoimport --port 21000 --db uhome --collection customer -u uhome -p Uhome123 --type csv --headerline --ignoreBlanks --file ./customer-mongo.csv
Check the data distribution:
db.customer.getShardDistribution()

In this test, the data ended up distributed fairly evenly across the three shards.