Background
The previous post gave a brief introduction to installing ClickHouse and using its client. In real production environments, ClickHouse is usually deployed as a cluster. Since many systems do not support the SSE 4.2 instruction set, we will use Docker here to build a ClickHouse cluster.
1. Environment
1.1 Machine list
| Hostname | IP | Specs | OS | Deployed services | Notes |
| --- | --- | --- | --- | --- | --- |
| server01 | 192.168.21.21 | 8c8g | CentOS 7.3 | clickhouse-server (cs01-01) and clickhouse-server (cs01-02) | cs01-01: shard 1, replica 1; cs01-02: shard 2, replica 2 |
| server02 | 192.168.21.69 | 8c8g | CentOS 7.3 | clickhouse-server (cs02-01) and clickhouse-server (cs02-02) | cs02-01: shard 2, replica 1; cs02-02: shard 3, replica 2 |
| server03 | 192.168.21.6 | 8c8g | CentOS 7.3 | clickhouse-server (cs03-01) and clickhouse-server (cs03-02) | cs03-01: shard 3, replica 1; cs03-02: shard 1, replica 2 |

On every server, instance 1 listens on tcp 9000 / http 8123 / interserver 9009, and instance 2 on tcp 9001 / http 8124 / interserver 9010.
1.2 Machine initialization
1.2.1 Configure hosts
Run vi /etc/hosts and add the following three lines:
192.168.21.21 server01
192.168.21.69 server02
192.168.21.6 server03
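A quick check that the names resolve from each machine:
for h in server01 server02 server03; do ping -c 1 $h; done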
1.2.2 Install Docker
Install the same version of Docker on every machine.
See: Docker environment setup
1.2.3 Install ZooKeeper
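The original post defers ZooKeeper installation to a separate article. As a minimal sketch, assuming a three-node ZooKeeper 3.4 ensemble runs in Docker on the hosts that metrika.xml will point at (192.168.21.66, 192.168.21.57, 192.168.21.17), the official zookeeper image can be started like this:

# On 192.168.21.66; repeat on .57 and .17 with ZOO_MY_ID=2 and ZOO_MY_ID=3
docker run -d --name zookeeper \
  --network host \
  -e ZOO_MY_ID=1 \
  -e ZOO_SERVERS="server.1=192.168.21.66:2888:3888 server.2=192.168.21.57:2888:3888 server.3=192.168.21.17:2888:3888" \
  zookeeper:3.4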
1.3 Directory initialization
1.3.1 Create the local data directories
On each of the three servers, create the data directories (one for each of the two instances, plus the one used by the temporary container in the next step):
mkdir -p /data/clickhouse /data/clickhouse01 /data/clickhouse02
1.3.2 Obtain the clickhouse-server configuration
1) On server01, do the following:
First start clickhouse-server with the Docker command from the official guide:
docker run -d --name clickhouse-server \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse/:/var/lib/clickhouse \
  yandex/clickhouse-server
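Before continuing, it is worth confirming that the temporary container started cleanly, for example:

docker ps --filter name=clickhouse-server
docker logs --tail 20 clickhouse-server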
2) Once startup completes, copy the configuration files from inside the container to the host:
# Copy the container's config to /etc
docker cp clickhouse-server:/etc/clickhouse-server/ /etc/
# On server01, duplicate the directory, one copy per instance
cp -rf /etc/clickhouse-server/ /etc/clickhouse-server01/
cp -rf /etc/clickhouse-server/ /etc/clickhouse-server02/
3) Then distribute /etc/clickhouse-server/ to each of the other machines (scp needs -r to copy a directory):
# Copy the config to server02
scp -r /etc/clickhouse-server/ server02:/etc/clickhouse-server01/
scp -r /etc/clickhouse-server/ server02:/etc/clickhouse-server02/
# Copy the config to server03
scp -r /etc/clickhouse-server/ server03:/etc/clickhouse-server01/
scp -r /etc/clickhouse-server/ server03:/etc/clickhouse-server02/
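A quick way to confirm the copies landed (assuming passwordless ssh between the servers):

for h in server02 server03; do
  ssh $h "ls /etc/clickhouse-server01/ /etc/clickhouse-server02/"
done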
2. Building the cluster
2.1 Cluster topology
Cluster layout:
clickhouse01-01: instance 1, ports: tcp 9000, http 8123, interserver 9009; shard 1, replica 1
clickhouse01-02: instance 2, ports: tcp 9001, http 8124, interserver 9010; shard 2, replica 2 (replica of shard 2)
clickhouse02-01: instance 1, ports: tcp 9000, http 8123, interserver 9009; shard 2, replica 1
clickhouse02-02: instance 2, ports: tcp 9001, http 8124, interserver 9010; shard 3, replica 2 (replica of shard 3)
clickhouse03-01: instance 1, ports: tcp 9000, http 8123, interserver 9009; shard 3, replica 1
clickhouse03-02: instance 2, ports: tcp 9001, http 8124, interserver 9010; shard 1, replica 2 (replica of shard 1)
2.2 Configure the cluster
2.2.1 Configuration files to modify
Two files need to be modified (users.xml can also be customized if needed):
- /etc/clickhouse-server/config.xml
- /etc/clickhouse-server/metrika.xml (a new file)
2.2.2 Configure the clickhouse-server instances on server01
2.2.2.1 Configuration for clickhouse01-01
1) /etc/clickhouse-server01/config.xml (this file is identical for every instance)
Point the include_from element at the actual substitutions file and set listen_host. Note that the path is the in-container path: each host directory (/etc/clickhouse-server01, /etc/clickhouse-server02) is mounted into its container as /etc/clickhouse-server.

<!-- If element has 'incl' attribute, then for it's value will be used corresponding substitution from another file. By default, path to file with substitutions is /etc/metrika.xml. It could be changed in config in 'include_from' element. Values for substitutions are specified in /yandex/name_of_substitution elements in that file. -->
<include_from>/etc/clickhouse-server/metrika.xml</include_from>
<listen_host>0.0.0.0</listen_host>
<listen_host>127.0.0.1</listen_host>
2) /etc/clickhouse-server01/metrika.xml (the cluster, ZooKeeper, networks, and compression sections are identical on every instance; only the <macros> section differs per instance)
<yandex>
    <!-- Cluster definition: identical on every instance, no per-instance changes -->
    <clickhouse_remote_servers>
        <cluster_3s_1r>
            <!-- shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>server01</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
                <replica>
                    <host>server03</host>
                    <port>9001</port>
                    <user>default</user>
                    <password></password>
                </replica>
            </shard>
            <!-- shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>server02</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
                <replica>
                    <host>server01</host>
                    <port>9001</port>
                    <user>default</user>
                    <password></password>
                </replica>
            </shard>
            <!-- shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>server03</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
                <replica>
                    <host>server02</host>
                    <port>9001</port>
                    <user>default</user>
                    <password></password>
                </replica>
            </shard>
        </cluster_3s_1r>
    </clickhouse_remote_servers>
    <!-- ZooKeeper: identical on every instance -->
    <zookeeper-servers>
        <node index="1">
            <host>192.168.21.66</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>192.168.21.57</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>192.168.21.17</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <!-- macros: unique per instance; this instance is shard 1, replica 1 -->
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <replica>cluster01-01-1</replica>
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- data compression -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
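Before starting the containers, a quick well-formedness check catches XML typos early; a sketch assuming xmllint (from libxml2) is installed on the host:

xmllint --noout /etc/clickhouse-server01/config.xml /etc/clickhouse-server01/metrika.xml
xmllint --noout /etc/clickhouse-server02/config.xml /etc/clickhouse-server02/metrika.xml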
2.2.2.2 Configuration for clickhouse01-02
1) /etc/clickhouse-server02/metrika.xml
Identical to the file above except for the <macros> section, which marks this instance as shard 2, replica 2:

<macros>
    <layer>01</layer>
    <shard>02</shard>
    <replica>cluster01-02-2</replica>
</macros>
2.2.3 Configure the clickhouse-server instances on server02
2.2.3.1 Configuration for clickhouse02-01
/etc/clickhouse-server01/metrika.xml, the same file again, with <macros> marking shard 2, replica 1:

<macros>
    <layer>01</layer>
    <shard>02</shard>
    <replica>cluster01-02-1</replica>
</macros>
2.2.3.2 Configuration for clickhouse02-02
/etc/clickhouse-server02/metrika.xml, with <macros> marking shard 3, replica 2:

<macros>
    <layer>01</layer>
    <shard>03</shard>
    <replica>cluster01-03-2</replica>
</macros>
2.2.4 Configure the clickhouse-server instances on server03
2.2.4.1 Configuration for clickhouse03-01
1) /etc/clickhouse-server01/metrika.xml, with <macros> marking shard 3, replica 1:

<macros>
    <layer>01</layer>
    <shard>03</shard>
    <replica>cluster01-03-1</replica>
</macros>
2.2.4.2 Configuration for clickhouse03-02
1) /etc/clickhouse-server02/metrika.xml, with <macros> marking shard 1, replica 2:

<macros>
    <layer>01</layer>
    <shard>01</shard>
    <replica>cluster01-01-2</replica>
</macros>
2.3 Run the ClickHouse cluster
2.3.1 Run the clickhouse01-01 instance
Log in to server01: ssh server01
# Remove the temporary clickhouse-server container from step 1.3.2 (no longer needed) and any existing cs01-01 container
docker rm -f clickhouse-server
docker rm -f cs01-01
# Run clickhouse-server
docker run -d \
  --name cs01-01 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse01/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server01/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --add-host i-r9es2e0q:192.168.21.21 \
  --add-host i-o91d619w:192.168.21.69 \
  --add-host i-ldipmbwa:192.168.21.6 \
  --hostname $(hostname) \
  -p 9000:9000 \
  -p 8123:8123 \
  -p 9009:9009 \
  yandex/clickhouse-server
Notes:
--add-host: the configuration files refer to the servers by hostname; this flag writes the mappings into the container's /etc/hosts so the names resolve inside the container.
--hostname: the system.clusters table reports cluster information, including an is_local flag. Without the hostname set, ClickHouse cannot tell which entry is the local instance, so is_local stays 0 everywhere, which breaks cluster operations. You can check it from clickhouse-client on any server: SELECT * FROM system.clusters;
-p: publishes the container's ports on the host.
2.3.2 Run the clickhouse01-02 instance
Log in to server01: ssh server01
docker run -d \
  --name cs01-02 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse02/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server02/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --add-host i-r9es2e0q:192.168.21.21 \
  --add-host i-o91d619w:192.168.21.69 \
  --add-host i-ldipmbwa:192.168.21.6 \
  --hostname $(hostname) \
  -p 9001:9000 \
  -p 8124:8123 \
  -p 9010:9009 \
  yandex/clickhouse-server
2.3.3 Run the clickhouse02-01 instance
Log in to server02: ssh server02
docker run -d \
  --name cs02-01 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse01/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server01/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --add-host i-r9es2e0q:192.168.21.21 \
  --add-host i-o91d619w:192.168.21.69 \
  --add-host i-ldipmbwa:192.168.21.6 \
  --hostname $(hostname) \
  -p 9000:9000 \
  -p 8123:8123 \
  -p 9009:9009 \
  yandex/clickhouse-server
2.3.4 Run the clickhouse02-02 instance
Log in to server02: ssh server02
docker run -d \
  --name cs02-02 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse02/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server02/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --add-host i-r9es2e0q:192.168.21.21 \
  --add-host i-o91d619w:192.168.21.69 \
  --add-host i-ldipmbwa:192.168.21.6 \
  --hostname $(hostname) \
  -p 9001:9000 \
  -p 8124:8123 \
  -p 9010:9009 \
  yandex/clickhouse-server
2.3.5 Run the clickhouse03-01 instance
Log in to server03: ssh server03
docker run -d \
  --name cs03-01 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse01/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server01/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --add-host i-r9es2e0q:192.168.21.21 \
  --add-host i-o91d619w:192.168.21.69 \
  --add-host i-ldipmbwa:192.168.21.6 \
  --hostname $(hostname) \
  -p 9000:9000 \
  -p 8123:8123 \
  -p 9009:9009 \
  yandex/clickhouse-server
2.3.6 Run the clickhouse03-02 instance
Log in to server03: ssh server03
docker run -d \
  --name cs03-02 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse02/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server02/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --add-host i-r9es2e0q:192.168.21.21 \
  --add-host i-o91d619w:192.168.21.69 \
  --add-host i-ldipmbwa:192.168.21.6 \
  --hostname $(hostname) \
  -p 9001:9000 \
  -p 8124:8123 \
  -p 9010:9009 \
  yandex/clickhouse-server
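With all six instances running, a quick sketch to confirm that each one answers over HTTP (ClickHouse responds with Ok. on its /ping endpoint):

for h in server01 server02 server03; do
  for p in 8123 8124; do
    echo -n "$h:$p -> "
    curl -s http://$h:$p/ping
  done
done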
2.4 Working with data on the cluster
2.4.1 Run a client and connect to a clickhouse-server
1) Run the following on any of the machines (to connect to a different ClickHouse instance, just change the host and port values):
docker run -it \
  --rm \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  yandex/clickhouse-client \
  --host server01 \
  --port 9000
2) Run the following command to check the configuration:
# Verify that the output matches the shard and replica layout in metrika.xml; if it does not, re-check each clickhouse-server instance's configuration
# is_local should not be 0 on every row, and the shard and replica information should be correct
SELECT * FROM system.clusters;
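For a more focused view of the same information, a small sketch selecting only the relevant columns (runnable from any machine):

docker run --rm \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  yandex/clickhouse-client --host server01 --port 9000 \
  --query "SELECT cluster, shard_num, replica_num, host_name, port, is_local FROM system.clusters WHERE cluster = 'cluster_3s_1r'"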
2.4.2 Create the database
After configuring all the instances as above, start each of them, then run the following statement on every instance to create the database.
For example, on instance 01-02:
# Start clickhouse-client and connect to instance 01-02
docker run -it --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
  yandex/clickhouse-client --host server01 --port 9001

ClickHouse client version 19.17.5.18 (official build).
Connecting to server01:9001 as user default.
Connected to ClickHouse server version 19.17.5 revision 54428.

i-r9es2e0q :) CREATE DATABASE test;

CREATE DATABASE test

Ok.
2.4.3 Create the replicated tables
Next, create the corresponding replicated table on every instance. For this test it is a simple table. The statement is identical on all six instances except for the two ReplicatedMergeTree parameters (the ZooKeeper path and the replica name); the full statement for the first instance is shown below, followed by the parameters for the rest.
# On clickhouse01-01 (server01:9000):
docker run -it --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
  yandex/clickhouse-client --host server01 --port 9000
# Then run:
CREATE TABLE test.device_thing_data
(
    time UInt64,
    user_id String,
    device_id String,
    source_id String,
    thing_id String,
    identifier String,
    value_int32 Int32,
    value_float Float32,
    value_double Float64,
    value_string String,
    value_enum Enum8('0'=0,'1'=1,'2'=2,'3'=3,'4'=4,'5'=5,'6'=6,'7'=7,'8'=8),
    value_string_ex String,
    value_array_string Array(String),
    value_array_int32 Array(Int32),
    value_array_float Array(Float32),
    value_array_double Array(Float64),
    action_date Date,
    action_time DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/device_thing_data', 'cluster01-01-1')
PARTITION BY toYYYYMM(action_date)
ORDER BY (user_id, device_id, thing_id, identifier, time, intHash64(time))
SAMPLE BY intHash64(time)
SETTINGS index_granularity = 8192

Repeat on the remaining five instances, changing only the two ReplicatedMergeTree arguments (note that the two replicas of a shard share the same ZooKeeper path):

| Instance | Connect to | ZooKeeper path | Replica name |
| --- | --- | --- | --- |
| clickhouse01-02 | server01:9001 | /clickhouse/tables/01-02/device_thing_data | cluster01-02-2 |
| clickhouse02-01 | server02:9000 | /clickhouse/tables/01-02/device_thing_data | cluster01-02-1 |
| clickhouse02-02 | server02:9001 | /clickhouse/tables/01-03/device_thing_data | cluster01-03-2 |
| clickhouse03-01 | server03:9000 | /clickhouse/tables/01-03/device_thing_data | cluster01-03-1 |
| clickhouse03-02 | server03:9001 | /clickhouse/tables/01-01/device_thing_data | cluster01-01-2 |
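Since every instance already defines layer, shard, and replica in its <macros> section, a minimal alternative sketch (not used in this walkthrough) lets ClickHouse substitute them, so one identical statement can be run on all six instances. The table name device_thing_data_m and its abbreviated column list are hypothetical, for illustration only:

# Run on each instance; {layer}, {shard} and {replica} expand from that instance's metrika.xml
docker run --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
  yandex/clickhouse-client --host server01 --port 9000 --query "
    CREATE TABLE test.device_thing_data_m
    (
        time UInt64,
        user_id String,
        action_date Date
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/device_thing_data_m', '{replica}')
    PARTITION BY toYYYYMM(action_date)
    ORDER BY (user_id, time)"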
2.4.4 Create the distributed table (used for queries)
With the replicated tables created, you can now create a distributed table. A distributed table is only a query engine and stores no data itself: a query against it is sent to every shard of the cluster, processed there, and the partial results are aggregated and returned to the client. Because of this, ClickHouse requires the aggregated result to fit in the memory of the node hosting the distributed table, which under normal conditions is not a problem. The distributed table can be created on all instances or only on some of them, matching wherever the application code runs its queries; creating it on several instances is recommended, so that if one node goes down the others can still be queried. The statement is as follows:
# Creating the distributed table on the whole cluster in one statement hung; the cause is unknown
# (note that it also references the database "default" rather than "test", which may be part of the problem)
CREATE TABLE device_thing_data_all ON CLUSTER cluster_3s_1r AS test.device_thing_data
ENGINE = Distributed(cluster_3s_1r, default, device_thing_data, rand())

# This variant works; run it once on every machine
CREATE TABLE device_thing_data_all AS test.device_thing_data
ENGINE = Distributed(cluster_3s_1r, test, device_thing_data, rand())
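A quick sketch (not in the original steps) to confirm the distributed table now exists on every instance; EXISTS TABLE prints 1 when it does:

for h in server01 server02 server03; do
  for p in 9000 9001; do
    docker run --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
      yandex/clickhouse-client --host $h --port $p --query "EXISTS TABLE device_thing_data_all"
  done
done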
2.4.5 Test availability
# Connect a client to one of the clickhouse-server instances (e.g. cs01-01)
# Query the distributed table; no data yet
select * from device_thing_data_all;
# On cs01-01, query the local replicated table; no data yet either
select * from test.device_thing_data;
# Insert a row into the replicated table
INSERT INTO test.device_thing_data (user_id) VALUES ('1');
# Since we just inserted a row on cs01-01, it should now come back
select * from test.device_thing_data;
# The distributed table now returns data too
select * from device_thing_data_all;
# On cs03-02, query the replicated table: cs03-02 is the replica of cs01-01, so the row has been synced over automatically
select * from test.device_thing_data;
# Insert a row on cs03-01; it should be replicated to cs02-02
INSERT INTO test.device_thing_data (user_id) VALUES ('2');
# Query the distributed table again; it should now return both rows, the one from cs03-01 and the one from cs01-01
select * from device_thing_data_all;
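As a final sketch (not in the original steps), the count from the distributed table should equal the sum of the counts from one replica of each shard:

# Local counts, one replica per shard (instance 1 on each server)
for h in server01 server02 server03; do
  docker run --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
    yandex/clickhouse-client --host $h --port 9000 --query "SELECT count() FROM test.device_thing_data"
done
# Global count through the distributed table
docker run --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
  yandex/clickhouse-client --host server01 --port 9000 --query "SELECT count() FROM device_thing_data_all"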