MongoDB Cluster Performance Optimization

In the previous two articles we covered how to build a MongoDB cluster; in this one we look at how to tune its configuration for the best results.

Warnings

Without any tuning, once the cluster is up and you connect with the mongo shell, you will usually be greeted by the following warnings:

(If you used the latest version of my cluster setup script, you should see fewer of these, because the script already applies some of the optimizations.)

2017-08-16T18:33:42.985+0800 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2017-08-16T18:33:42.985+0800 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
2017-08-16T18:33:43.024+0800 I CONTROL [initandlisten]
2017-08-16T18:33:43.024+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-08-16T18:33:43.024+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2017-08-16T18:33:43.024+0800 I CONTROL [initandlisten]
2017-08-16T18:33:43.025+0800 I CONTROL [initandlisten]
2017-08-16T18:33:43.025+0800 I CONTROL [initandlisten] ** WARNING: You are running on a NUMA machine.
2017-08-16T18:33:43.025+0800 I CONTROL [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
2017-08-16T18:33:43.025+0800 I CONTROL [initandlisten] ** numactl --interleave=all mongod [other options]
2017-08-16T18:33:43.026+0800 I CONTROL [initandlisten]
2017-08-16T18:33:43.026+0800 I CONTROL [initandlisten] ** WARNING: /proc/sys/vm/zone_reclaim_mode is 1
2017-08-16T18:33:43.026+0800 I CONTROL [initandlisten] ** We suggest setting it to 0
2017-08-16T18:33:43.026+0800 I CONTROL [initandlisten] ** http://www.kernel.org/doc/Documentation/sysctl/vm.txt
2017-08-16T18:33:43.026+0800 I CONTROL [initandlisten]
2017-08-16T18:33:43.027+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-08-16T18:33:43.027+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2017-08-16T18:33:43.027+0800 I CONTROL [initandlisten]
2017-08-16T18:33:43.027+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2017-08-16T18:33:43.027+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2017-08-16T18:33:43.027+0800 I CONTROL [initandlisten]

These are really six recommendations; let's go through them one by one.

The first, "Using the XFS filesystem is strongly recommended with the WiredTiger storage engine", is a strong recommendation to put the data files on XFS, which gives WiredTiger a noticeable performance boost. Our servers run CentOS 6.9, so it is simply a matter of reformatting the data disk.

For the details of formatting a disk as XFS, see this article: Building an XFS filesystem on CentOS 6.
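A minimal sketch of the process, assuming the data disk is /dev/sdb1 and the dbPath lives under /data (both are placeholders; on CentOS 6 the XFS userland tools may need installing first). Note that mkfs destroys whatever is on the partition:

yum install -y xfsprogs                # XFS userland tools (base repo on CentOS 6)
mkfs.xfs -f /dev/sdb1                  # format the data partition as XFS
mkdir -p /data
mount -t xfs /dev/sdb1 /data           # mount where mongod's dbPath lives
echo '/dev/sdb1 /data xfs defaults,noatime 0 0' >> /etc/fstab   # persist across reboots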

The second, "Access control is not enabled for the database. Read and write access to data and configuration is unrestricted.", is a security reminder: configure an administrator account, password, and permissions as soon as possible.

How? See the official documentation: Re-start the MongoDB instance with access control.
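A minimal sketch of those steps, with placeholder credentials; note that a sharded cluster additionally needs a shared keyFile for internal authentication between members:

# 1. create an administrator while access control is still disabled
mongo --port 27017 admin --eval 'db.createUser({
  user: "admin",
  pwd: "changeMe",
  roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
})'
# 2. enable access control in mongod.conf and restart:
#      security:
#        authorization: enabled
# 3. reconnect with credentials
mongo --port 27017 -u admin -p changeMe --authenticationDatabase admin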

The third, "You are running on a NUMA machine. We suggest launching mongod like this to avoid performance problems: numactl --interleave=all mongod [other options]", tells us the server's CPUs use a NUMA architecture, so mongod should be started with memory interleaved across nodes to avoid performance problems. The command is shown below; for more background, see this article: performance problems caused by running MongoDB on NUMA hardware.

numactl --interleave=all mongod [other options]
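Before launching, you can check what the NUMA layout actually looks like; the config path below is only an example:

numactl --hardware                                         # show NUMA nodes and per-node memory
numactl --interleave=all mongod --config /etc/mongod.conf  # launch with interleaved memory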

The fourth, "/proc/sys/vm/zone_reclaim_mode is 1, We suggest setting it to 0", means this kernel parameter is currently 1 and should be set to 0:

echo 0 > /proc/sys/vm/zone_reclaim_mode
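That echo only lasts until the next reboot; to make the change permanent, add it to /etc/sysctl.conf as well:

echo 'vm.zone_reclaim_mode = 0' >> /etc/sysctl.conf
sysctl -p    # apply kernel parameters immediately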

The fifth, "/sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'", asks for the setting to be changed to 'never':

echo never > /sys/kernel/mm/transparent_hugepage/enabled

The sixth, "/sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'", is essentially the same as the previous one:

echo never > /sys/kernel/mm/transparent_hugepage/defrag
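Both transparent-hugepage echoes are also lost on reboot. The official docs suggest an init script; a quick alternative on CentOS 6 is appending them to /etc/rc.local, sketched here:

cat >> /etc/rc.local <<'EOF'
# disable transparent huge pages for MongoDB
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
EOF
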
Write Testing

Production uses five servers: five shards, each shard a replica set with two members. Each server has 128 GB of RAM, a 20-core CPU, and 3 TB of disk; after RAID 10, the data directory has roughly 2.1 TB of usable space.

As soon as the build was finished I couldn't wait to load-test it. With 10 threads each inserting 10,000 documents, and sharding not yet enabled, it took about 8 minutes. After enabling sharding I tested with 100 threads, each inserting 1,000,000 documents, and the average insert rate was about 8,000/s. Coming in the next day, I found that shard 5 had gone down, with the following errors:

Application error:

Caused by: com.mongodb.WriteConcernException: Write failed with error code 83 and error message 'write results unavailable from 192.168.0.35:27005 :: caused by :: Location11002: socket exception [CONNECT_ERROR] for 192.168.0.35:27005'

mongos error:

Failed to connect to 192.168.0.35:27005, in(checking socket for error after poll), reason: Connection refused
No primary detected for set shard5

Config server error:

No primary detected for set shard5
2017-08-21T11:08:22.709+0800 W NETWORK [Balancer] Failed to connect to 192.168.0.31:27005, in(checking socket for error after poll), reason: Connection refused
2017-08-21T11:08:22.710+0800 W NETWORK [Balancer] Failed to connect to 192.168.0.35:27005, in(checking socket for error after poll), reason: Connection refused
2017-08-21T11:08:22.710+0800 W NETWORK [Balancer] No primary detected for set shard5
2017-08-21T11:08:22.710+0800 I SHARDING [Balancer] caught exception while doing balance: could not find host matching read preference { mode: "primary" } for set shard5
2017-08-21T11:08:22.710+0800 I SHARDING [Balancer] about to log metadata event into actionlog: { _id: "mongodb34.hkrt.cn-2017-08-21T11:08:22.710+0800-599a4ea698ec442a0836e2d5", server: "mongodb34.hkrt.cn", clientAddr: "", time: new Date(1503284902710), what: "balancer.round", ns: "", details: { executionTimeMillis: 20051, errorOccured: true, errmsg: "could not find host matching read preference { mode: "primary" } for set shard5" } }

Error from the remaining shard member (192.168.0.32):

not master and slaveOk=false
could not find host matching read preference { mode: "primary" } for set shard5

At the same time, all five servers had used up their memory. After restarting shard 5 I checked the data: only about 30 million documents had made it in before the crash. Once everything was back up, another load test of 100,000 documents promptly took shard 3 down as well.

After reading the official MongoDB documentation, my guess was a memory configuration problem, and I found this passage:

storage.wiredTiger.engineConfig.cacheSizeGB
Type: float

The maximum size of the internal cache that WiredTiger will use for all data.

Changed in version 3.4: Values can range from 256MB to 10TB and can be a float. In addition, the default value has also changed.

Starting in 3.4, the WiredTiger internal cache, by default, will use the larger of either: 50% of RAM minus 1 GB, or 256 MB.

Avoid increasing the WiredTiger internal cache size above its default value.

With WiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache. Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes. Data in the filesystem cache is compressed.

NOTE: The storage.wiredTiger.engineConfig.cacheSizeGB limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache. To accommodate the additional consumers of RAM, you may have to decrease WiredTiger internal cache size.

The default WiredTiger internal cache size value assumes that there is a single mongod instance per machine. If a single machine contains multiple MongoDB instances, then you should decrease the setting to accommodate the other mongod instances.

If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container.

Source: the official documentation for storage.wiredTiger.engineConfig.cacheSizeGB.

The gist: by default the WiredTiger cache uses the larger of 50% of RAM minus 1 GB, or 256 MB, and when several instances share a machine you should reduce the cache size setting accordingly.

Left at the default on our 128 GB machines, each shard instance would use 128 / 2 - 1 = 63 GB of cache; with three shard instances on a server, that is 3 x 63 = 189 GB, far more than the 128 GB available, so memory was exhausted and the shards went down.

Reading this, I felt MongoDB is not as smart as it could be here: why not balance the cache dynamically according to how many instances are running, instead of letting them fill memory until something dies? In any case, we then capped the memory of each instance with the following setting.

Add this to each shard's startup configuration:

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20

That caps each instance's WiredTiger cache at 20 GB.
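The same cap can also be passed as a startup flag, and you can verify what a running instance has configured; the port below is just an example:

mongod --config /etc/mongod.conf --wiredTigerCacheSizeGB 20
# reports the configured maximum in bytes
mongo --port 27005 --eval 'db.serverStatus().wiredTiger.cache["maximum bytes configured"]'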

You can use the mongostat command to watch how much cache memory each shard is actually using.
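For example (host and port are placeholders), sampling every 5 seconds; the used and dirty columns report WiredTiger cache utilisation, and --discover also pulls in the other members of the cluster:

mongostat --host 192.168.0.31 --port 27005 --discover 5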

The next day we ran the stress test again: 100 threads, 1,000,000 inserts per thread, 100 million documents in total, and everything held. The insert rate was about 8,000/s, peaking at around 20,000/s.
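The actual load generator was a Java application (the WriteConcernException earlier comes from the Java driver). As a rough stand-in for the idea, here is a crude parallel insert harness from the shell, assuming a mongos at 192.168.0.30:27017 and a pre-sharded collection testdb.coll (all names are placeholders):

# launch 10 background clients, each bulk-inserting 100,000 documents
for i in $(seq 1 10); do
  mongo 192.168.0.30:27017/testdb --quiet --eval '
    var bulk = [];
    for (var j = 0; j < 100000; j++) {
      bulk.push({ tid: '"$i"', n: j, payload: "x" });
      if (bulk.length == 1000) { db.coll.insertMany(bulk); bulk = []; }
    }
    if (bulk.length > 0) { db.coll.insertMany(bulk); }' &
done
wait    # block until every client finishes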

Query Testing

Below are my test records from production. In these records, "list" is a paged query, "count" a document count, and "agg" an aggregation, as sketched right after this paragraph.
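For context, a minimal sketch of the three kinds of operation, against a hypothetical collection testdb.coll (names and filter fields are illustrative only):

# "list": one page of a paged query
mongo 192.168.0.30:27017/testdb --quiet --eval 'db.coll.find().skip(1000000).limit(20).toArray()'
# "count": count of matching documents
mongo 192.168.0.30:27017/testdb --quiet --eval 'db.coll.find({ status: 1 }).count()'
# "agg": a simple group-by aggregation
mongo 192.168.0.30:27017/testdb --quiet --eval 'db.coll.aggregate([{ $group: { _id: "$status", total: { $sum: 1 } } }])'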

1 million documents: paged query list and count sub-second; aggregation 3 s

10 million documents: paged query list and count sub-second; aggregation 14 s

60 million documents: paged query list sub-second, count 1 s; aggregation 4 min on the first run, 1 min 24 s on the second

60 million documents, paged query with filter conditions: list sub-second, count 40 s; aggregation 2 min 11 s

100 million documents: paged query list and count 10 s; aggregation 3 min

140 million documents: paged query list sub-second, count sub-second; aggregation 2 min 11 s

300 million documents: paged query list 1 s, count 1 s; aggregation 30 min

300 million documents, paged query with filter conditions: list 1 s, count 3 min; aggregation 18 min

Insert test: started at 13:35, 100 million documents at 2,000-3,000 inserts/s; estimated 12-13 hours, actual time 30 hours.

400 million documents: paged query list 1 s, count 1 s; aggregation 45 min

400 million documents, paged query with filter conditions: list 1 s, count 6 min; aggregation 48 min

Insert test: started 2017-08-30 at 13:15, inserting 10 billion documents.

 