druid.io is a fairly heavyweight database query system, split into five node types (broker, coordinator, historical, middleManager, and overlord).
This article will not introduce the database itself; if you have questions, see the white paper:
http://pan.baidu.com/s/1eSFlIJS
Single-machine cluster setup
First, the general cluster setup, based on version 0.9.1.1.
Download: http://pan.baidu.com/s/1hrJBjlq
Edit common.runtime.properties under conf/druid/_common, using the following configuration as a reference:
#
# Licensed to Metamarkets Group Inc. (Metamarkets) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. Metamarkets licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

#
# Extensions
#

# This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
# based on your particular setup.
#druid.extensions.loadList=["druid-kafka-eight", "druid-s3-extensions", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "mysql-metadata-storage"]
druid.extensions.loadList=["mysql-metadata-storage"]

# If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
# and uncomment the line below to point to your directory.
#druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies

#
# Logging
#

# Log all runtime properties on startup. Disable to avoid logging properties on startup:
druid.startup.logging.logProperties=true

#
# Zookeeper
#

druid.zk.service.host=10.202.4.22:2181
druid.zk.paths.base=/druid

#
# Metadata storage
#

# For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
#druid.metadata.storage.type=derby
#druid.metadata.storage.connector.connectURI=jdbc:derby://metadata.store.ip:1527/var/druid/metadata.db;create=true
#druid.metadata.storage.connector.host=metadata.store.ip
#druid.metadata.storage.connector.port=1527

# For MySQL:
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://10.202.4.22:3306/druid?characterEncoding=UTF-8
druid.metadata.storage.connector.user=szh
druid.metadata.storage.connector.password=123456

# For PostgreSQL (make sure to additionally include the Postgres extension):
#druid.metadata.storage.type=postgresql
#druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
#druid.metadata.storage.connector.user=...
#druid.metadata.storage.connector.password=...

#
# Deep storage
#

# For local disk (only viable in a cluster if this is a network mount):
druid.storage.type=local
druid.storage.storageDirectory=var/druid/segments

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are on the classpath):
#druid.storage.type=hdfs
#druid.storage.storageDirectory=/druid/segments

# For S3:
#druid.storage.type=s3
#druid.storage.bucket=your-bucket
#druid.storage.baseKey=druid/segments
#druid.s3.accessKey=...
#druid.s3.secretKey=...

#
# Indexing service logs
#

# For local disk (only viable in a cluster if this is a network mount):
druid.indexer.logs.type=file
druid.indexer.logs.directory=var/druid/indexing-logs

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are on the classpath):
#druid.indexer.logs.type=hdfs
#druid.indexer.logs.directory=/druid/indexing-logs

# For S3:
#druid.indexer.logs.type=s3
#druid.indexer.logs.s3Bucket=your-bucket
#druid.indexer.logs.s3Prefix=druid/indexing-logs

#
# Service discovery
#

druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator

#
# Monitoring
#

druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.emitter=logging
druid.emitter.logging.logLevel=info
Version 0.9.1.1 does not ship with the mysql extension by default; you have to download it yourself, unpack it, and place it under extensions/:
Reference article:
http://druid.io/docs/0.9.1.1/operations/including-extensions.html
Extension downloads:
http://druid.io/downloads.html
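A minimal sketch of those steps (the exact download URL is an assumption based on the usual naming convention; take the real link from the downloads page):

# Download the MySQL metadata-storage extension; grab the real link from
# http://druid.io/downloads.html -- the URL below is an assumption, not a verified link.
curl -O http://static.druid.io/artifacts/releases/mysql-metadata-storage-0.9.1.1.tar.gz

# Unpack into the extensions/ directory; the resulting
# extensions/mysql-metadata-storage/ folder must match the
# "mysql-metadata-storage" entry in druid.extensions.loadList.
tar -xzf mysql-metadata-storage-0.9.1.1.tar.gz -C extensions/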
Next, edit the startup configuration of each of the five nodes:
/conf/druid/${serviceName}/runtime.properties
broker node:
/conf/druid/broker/runtime.properties
druid.host=10.202.4.22:9102
druid.service=druid/broker
druid.port=9102

# HTTP server threads
druid.broker.http.numConnections=5
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=32768
druid.processing.numThreads=2

# Query cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=2000000000
coordinator node:
/conf/druid/coordinator/runtime.properties
druid.host=10.202.4.22:8082
druid.service=druid/coordinator
druid.port=8082
historical node:
/conf/druid/historical/runtime.properties

druid.service=druid/historical
druid.host=10.202.4.22:9002
druid.port=9002

# HTTP server threads
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=6870912
druid.processing.numThreads=7

druid.historical.cache.useCache=false
druid.historical.cache.populateCache=false

# Segment storage
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize"\:13000000}]
druid.server.maxSize=13000000
middleManager node:
/conf/druid/middleManager/runtime.properties

druid.host=10.202.4.22:8091
druid.service=druid/middleManager
druid.port=8091

# Number of tasks per middleManager
druid.worker.capacity=3

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.task.baseTaskDir=var/druid/task

# HTTP server threads
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=65536
druid.processing.numThreads=2

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0"]
overlord node:
/conf/druid/overlord/runtime.properties

druid.service=druid/overlord
druid.host=10.202.4.22:9100
druid.port=9100

#druid.indexer.queue.startDelay=PT30S

druid.indexer.runner.type=remote
druid.indexer.storage.type=metadata
Starting the cluster: you can use the script I wrote:
#java `cat conf/druid/broker/jvm.config | xargs` -cp conf/druid/_common:conf/druid/broker:lib/* io.druid.cli.Main server broker

function help(){
    echo "Argument list"
    echo "  arg1          arg2"
    echo "  serviceName   [-f]"
    echo "arg1: serviceName: name of the service to start"
    echo "serviceName options:"
    echo "1: broker"
    echo "2: coordinator"
    echo "3: historical"
    echo "4: middleManager"
    echo "5: overlord"
    echo "arg2: [-f]: whether to start in the foreground"
    echo "-f: start in the foreground; without it, the service starts in the background by default"
}

function startService(){
    echo $service
    if [[ $2 == "-f" ]]; then
        echo "starting in the foreground"
        java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath "conf/druid/_common:conf/druid/$service:lib/*" io.druid.cli.Main server $service
    else
        echo "starting in the background"
        nohup java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath "conf/druid/_common:conf/druid/$service:lib/*" io.druid.cli.Main server $service &
    fi;
}

function tips(){
    red=`tput setaf 1`
    reset=`tput sgr0`
    echo "${red}Not correct arguments${reset}"
    echo "please use --help or -h for help"
}

if [[ $1 == "--help" || $1 == "-h" ]]; then
    help
    exit
fi

service=$1

case $service in
    "broker")
        ;;
    "coordinator")
        ;;
    "historical")
        ;;
    "middleManager")
        ;;
    "overlord")
        ;;
    *)
        tips
        exit
esac

if [[ $2 == "-f" || $2 == "" ]]; then
    startService $1 $2;
else
    tips
    exit
fi
Just place the script above in Druid's root directory. For example:
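(assuming you saved the script as start_service.sh; the file name is up to you)

# Start the broker in the background (the default)
./start_service.sh broker

# Start the coordinator in the foreground
./start_service.sh coordinator -f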
Startup output, as shown in the figure:
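If you cannot see the screenshot, each Druid node also exposes a /status HTTP endpoint, so you can verify a node came up with a quick curl, e.g. against the broker:

# A JSON response here means the broker process is up and serving
curl http://10.202.4.22:9102/status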
Multi-machine cluster setup:
The above is a single-machine cluster; to extend it to multiple machines,
just edit runtime.properties under conf/druid/(broker | coordinator | historical | middleManager | overlord) and change
druid.host=10.202.4.22:9102 to the IP address of each respective machine.
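For example, a quick way to rewrite the host on another machine (a sketch; 10.202.4.23 is a hypothetical address, substitute your own, and leave common.runtime.properties pointing at the shared zookeeper/mysql):

# Replace the single-machine IP with this machine's IP in all five runtime.properties files
sed -i 's/10\.202\.4\.22/10.202.4.23/g' \
    conf/druid/{broker,coordinator,historical,middleManager,overlord}/runtime.properties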
=========================================================
Finally, here is the local cluster I built successfully (it includes the cluster startup script I wrote):
http://pan.baidu.com/s/1bJjFzg
It is based on version 0.9.1.1 and uses mysql for metadata storage (the extension package is already included).
After downloading and unpacking my druid.tgz local cluster:
1. Edit each configuration file:
conf/druid/broker/runtime.properties
conf/druid/coordinator/runtime.properties
conf/druid/historical/runtime.properties
conf/druid/middleManager/runtime.properties
conf/druid/overlord/runtime.properties
changing druid.host=10.202.4.22:9102 to your own IP address.
2. In conf/druid/_common/common.runtime.properties, replace the mysql host and port, user name, password, zookeeper address, and so on with your own.
Tip: mysql does not create the druid database for you; create it first (install mysql, then create the druid database and the druid user).
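Something like the following would do it (the credentials szh/123456 mirror common.runtime.properties above; replace them with your own):

# Create the metadata database and grant the Druid user access
mysql -u root -p -e "CREATE DATABASE druid DEFAULT CHARACTER SET utf8; GRANT ALL PRIVILEGES ON druid.* TO 'szh'@'%' IDENTIFIED BY '123456';"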
3. Start the five node services with cluster_start_service.sh in the root directory.
At this point, if you go into the mysql database, you will find that druid has automatically created several tables.
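You can check with something like this; the exact table set varies by version, so the names in the comment are only what I would expect for 0.9.x:

# List the tables Druid created in the metadata store
# (expect names like druid_segments, druid_rules, druid_config,
#  druid_tasks, druid_tasklocks, druid_tasklogs)
mysql -u szh -p123456 druid -e "SHOW TABLES;"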
4. Run the tests:
(1) data ingestion
(2) data querying
Change into the test directory, which contains two folders:
(1) Data ingestion:
Change into the test_load directory.
The scripts submit_csv_task.sh and submit_json_task.sh there submit csv data and json data, respectively.
Here you need to edit env.sh,
setting overlord_ip to the address and port of your own overlord node, as in the sketch below.
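(the variable name overlord_ip comes from the text above; the address and port match the overlord runtime.properties)

# env.sh: point the test scripts at the overlord node
overlord_ip=10.202.4.22:9100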
Try running the submit_json_task.sh script, then check the status of the ingestion task in the web console, as shown in the figure.
The address to enter is the overlord node's address + port:
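With the configuration above that would look like the following (console.html is the standard overlord console path in 0.9.x; the task id is whatever the submit script printed):

# Overlord console in the browser:
#   http://10.202.4.22:9100/console.html
# Or check a task's status from the shell:
task_id=...   # placeholder: use the id returned by submit_json_task.sh
curl "http://10.202.4.22:9100/druid/indexer/v1/task/${task_id}/status"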
Wait for the task to complete successfully.
(2) Data querying: switch to the query test directory and run the query script:
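If you want to query by hand instead, a minimal example against the broker looks like this (the dataSource name "test" and the interval are assumptions; adjust them to whatever the load task actually created):

# Send a native JSON query to the broker (port 9102 per the broker runtime.properties)
curl -X POST "http://10.202.4.22:9102/druid/v2/?pretty" \
     -H 'Content-Type: application/json' \
     -d '{
           "queryType": "timeseries",
           "dataSource": "test",
           "granularity": "day",
           "aggregations": [{"type": "count", "name": "rows"}],
           "intervals": ["2016-01-01/2017-01-01"]
         }'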