ClickHouse Distributed Cluster


1. Environment preparation:

Host    OS        Applications                IP
ckh-01  CentOS 8  jdk, zookeeper, clickhouse  192.168.205.190
ckh-02  CentOS 8  jdk, zookeeper, clickhouse  192.168.205.191
ckh-03  CentOS 8  jdk, zookeeper, clickhouse  192.168.205.192
ckh-04  CentOS 8  jdk, clickhouse             192.168.205.193
ckh-05  CentOS 8  jdk, clickhouse             192.168.205.194
ckh-06  CentOS 8  jdk, clickhouse             192.168.205.195

Installation of the Java environment and the ZooKeeper cluster is omitted here.

Install ClickHouse on each node:

yum -y install yum-utils
rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
dnf -y install clickhouse-server clickhouse-client

2. Cluster configuration: (here we use the stock config file /etc/clickhouse-server/config.xml)

In the config file /etc/clickhouse-server/config.xml, find the line

<!-- <listen_host>::</listen_host> -->

and uncomment it:

<listen_host>::</listen_host>

Use a 3-shard, 2-replica layout on the 6 servers (configure one node as an example, copy the finished config to the other nodes, then make the replica macros unique on each node).

Add the following inside the <remote_servers></remote_servers> tags:

<!-- <perftest_3shards_1replicas> can be replaced with a custom name; note that despite the name, this layout has 2 replicas per shard -->
<perftest_3shards_1replicas>
    <shard>
        <internal_replication>true</internal_replication>
        <replica>
            <host>ckh-01</host>
            <port>9000</port>
            <user>admin</user>
            <password>111111</password>
        </replica>
        <replica>
            <host>ckh-02</host>
            <port>9000</port>
            <user>admin</user>
            <password>111111</password>
        </replica>
    </shard>
    <shard>
        <internal_replication>true</internal_replication>
        <replica>
            <host>ckh-03</host>
            <port>9000</port>
            <user>admin</user>
            <password>111111</password>
        </replica>
        <replica>
            <host>ckh-04</host>
            <port>9000</port>
            <user>admin</user>
            <password>111111</password>
        </replica>
    </shard>
    <shard>
        <internal_replication>true</internal_replication>
        <replica>
            <host>ckh-05</host>
            <port>9000</port>
            <user>admin</user>
            <password>111111</password>
        </replica>
        <replica>
            <host>ckh-06</host>
            <port>9000</port>
            <user>admin</user>
            <password>111111</password>
        </replica>
    </shard>
</perftest_3shards_1replicas>
 

Add the following inside the <yandex></yandex> tags:

<!-- ZooKeeper related configuration -->

        <zookeeper>
            <node>
                <host>ckh-01</host>
                <port>2181</port>
            </node>
            <node>
                <host>ckh-02</host>
                <port>2181</port>
            </node>
            <node>
                <host>ckh-03</host>
                <port>2181</port>
            </node>
        </zookeeper>
        <macros>
            <shard>01</shard>
            <replica>ckh-01-01</replica>
        </macros>
        <networks>
            <ip>::/0</ip>
        </networks>
        <clickhouse_compression>
            <case>
                <min_part_size>10000000000</min_part_size>
                <min_part_size_ratio>0.01</min_part_size_ratio>
                <method>lz4</method>
            </case>
        </clickhouse_compression>

  

In the <macros></macros> tags, fill in each host's shard and replica values, making sure every replica is unique across the cluster:

Host    Shard  Replica (hostname + replica number)
ckh-01  01     ckh-01-01
ckh-02  01     ckh-02-02
ckh-03  02     ckh-03-01
ckh-04  02     ckh-04-02
ckh-05  03     ckh-05-01
ckh-06  03     ckh-06-02
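The per-host `<macros>` snippets can be generated mechanically from the shard/replica table above. A minimal Python sketch (the `macros_xml` helper is illustrative, not part of ClickHouse; hostnames and layout are taken from the table):

```python
# Generate the per-host <macros> config snippet from the shard/replica layout above.
layout = {
    "ckh-01": ("01", "ckh-01-01"),
    "ckh-02": ("01", "ckh-02-02"),
    "ckh-03": ("02", "ckh-03-01"),
    "ckh-04": ("02", "ckh-04-02"),
    "ckh-05": ("03", "ckh-05-01"),
    "ckh-06": ("03", "ckh-06-02"),
}

def macros_xml(host: str) -> str:
    shard, replica = layout[host]
    return (
        "<macros>\n"
        f"    <shard>{shard}</shard>\n"
        f"    <replica>{replica}</replica>\n"
        "</macros>"
    )

# Snippet to paste into config.xml on ckh-04:
print(macros_xml("ckh-04"))
```

Running this for each host gives the exact block to paste into that node's config.xml.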

Modify the following file so that a statement executed on one node is also executed on the other nodes:

vim /etc/clickhouse-server/config.xml

3. User and password configuration:

vim /etc/clickhouse-server/users.xml

To configure an encrypted (hashed) password, follow the instructions in the file's comments.

ClickHouse's default user is default, which can log in with no password; you can switch to a different user, or disable default.

Add the new user's configuration inside the <users></users> tags:

        <admin>
            <password>111111</password>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <!-- Settings profile for user. -->
            <profile>default</profile>
            <!-- Quota for user. -->
            <quota>default</quota>
        </admin>

  

In <networks incl="networks" replace="replace">, the incl="networks" replace="replace" attributes must be added here.

This has been removed in newer versions; when running Flink jobs, connections to ClickHouse may otherwise time out.
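For the hashed-password option mentioned above, the stock users.xml comments describe a `password_sha256_hex` element that replaces the plaintext `<password>` element. A small Python sketch of computing the required hex digest (using `111111`, the sample password from this article):

```python
import hashlib

# Compute the value to place in <password_sha256_hex>...</password_sha256_hex>
# instead of the plaintext <password> element in users.xml.
password = "111111"
digest = hashlib.sha256(password.encode("utf-8")).hexdigest()

print(digest)  # 64 lowercase hex characters
```

The resulting 64-character hex string goes into the user's config in place of the plaintext password.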

4. Entering clickhouse-client: (note: if external access is needed, listen_host in /etc/clickhouse-server/config.xml must be set as shown in section 2)

Start the service on each node:

systemctl start clickhouse-server

Then connect with the following command (-m enables multi-line queries):

clickhouse-client -h  192.168.205.190 --port 9000 -m -u admin --password 111111 

View database information:

show databases;

5. View cluster information (works from any node):

select * from system.clusters;

-------------------------------------------------------------------------------------------------------------------------------------

Create the local and distributed tables:

Create the database test on every node (executing on a single node is enough):

create database test ON CLUSTER perftest_3shards_1replicas;

The complete CREATE TABLE DDL statement for the ReplicatedMergeTree engine is given below.

Create the local table and its engine

Replicated Table & ReplicatedMergeTree Engines

CREATE TABLE IF NOT EXISTS test.events_local ON CLUSTER perftest_3shards_1replicas (
    ts_date Date,
    ts_date_time DateTime,
    user_id Int64,
    event_type String,
    site_id Int64,
    groupon_id Int64,
    category_id Int64,
    merchandise_id Int64,
    search_text String
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test/events_local', '{replica}')
PARTITION BY ts_date
ORDER BY (ts_date, toStartOfHour(ts_date_time), site_id, event_type)
SETTINGS index_granularity = 8192;

 

Here, the ON CLUSTER clause is distributed DDL: the statement is executed once and the same local table is created on every instance in the cluster. The cluster identifier {cluster}, shard identifier {shard}, and replica identifier {replica} come from the replicated-table macro configuration described earlier, i.e. the <macros> section of config.xml; used together with ON CLUSTER, they avoid having to edit these values by hand on every instance when creating the table.
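The macro substitution can be illustrated with a small sketch: on ckh-04 (shard 02, replica ckh-04-02 per the macros table), the ZooKeeper path and replica name in the DDL above expand as follows (the `expand` function is illustrative, not ClickHouse's actual implementation):

```python
# Illustrative expansion of the {shard}/{replica} macros in the
# ReplicatedMergeTree arguments, using the <macros> values for host ckh-04.
macros = {"shard": "02", "replica": "ckh-04-02"}

def expand(template: str, macros: dict) -> str:
    for name, value in macros.items():
        template = template.replace("{" + name + "}", value)
    return template

zk_path = expand("/clickhouse/tables/{shard}/test/events_local", macros)
replica_name = expand("{replica}", macros)

print(zk_path)       # /clickhouse/tables/02/test/events_local
print(replica_name)  # ckh-04-02
```

Replicas of the same shard share the ZooKeeper path but register under different replica names, which is how ClickHouse knows which instances should replicate each other.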

Distributed table and the Distributed engine

Distributed Table & Distributed Engine

A ClickHouse distributed table is not really a table at all, but a distributed view over a set of local physical tables (the shards); it stores no data itself.

Distributed tables are backed by the Distributed engine. An example CREATE TABLE DDL statement is shown below; the _all suffix is just a common naming convention for distributed tables.

CREATE TABLE IF NOT EXISTS test.events_all ON CLUSTER perftest_3shards_1replicas
AS test.events_local
ENGINE = Distributed(perftest_3shards_1replicas, test, events_local, rand());
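How the rand() sharding key routes writes can be sketched as follows: with equal shard weights, the key is reduced modulo the cluster's total weight, so rows land on the three shards roughly uniformly. This is a simplified model for intuition, not ClickHouse's actual internal code:

```python
import random

# Simplified model of Distributed-engine write routing with equal shard
# weights: shard index = sharding_key % total_weight (3 shards, weight 1 each).
num_shards = 3

def route(sharding_key: int) -> int:
    return sharding_key % num_shards

counts = [0] * num_shards
rng = random.Random(42)  # fixed seed so the sketch is reproducible
for _ in range(9000):
    counts[route(rng.getrandbits(32))] += 1

print(counts)  # roughly 3000 rows per shard
```

A random key gives an even spread but no data locality; using a column such as user_id instead would keep each user's rows on one shard.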

Insert data from any node:

insert into test.events_all values
('2021-03-03','2021-03-03 16:10:00',1,'ceshi1',1,1,1,1,'test1'),
('2021-03-03','2021-03-03 16:20:01',2,'ceshi2',2,2,2,2,'test2'),
('2021-03-03','2021-03-03 16:30:02',3,'ceshi2',3,3,3,3,'test3'),
('2021-03-03','2021-03-03 16:40:03',4,'ceshi4',4,4,4,4,'test4'),
('2021-03-03','2021-03-03 16:50:04',5,'ceshi5',5,5,5,5,'test5'),
('2021-03-03','2021-03-03 17:00:05',6,'ceshi6',6,6,6,6,'test6');
Query the data across all shards:

 select * from test.events_all;

Check a replica node: it holds an identical copy of the data.

-------------------------------------------------------------------------------------

ClickHouse basic operations:

Query cluster information:

select * from system.clusters;

Create a database (executed on one node, created on all nodes):

create database test ON CLUSTER perftest_3shards_1replicas;

Drop a database (executed on one node, dropped on all nodes):

drop database test ON CLUSTER perftest_3shards_1replicas;

Delete local-table data (data cannot be deleted through the distributed table):

alter table test.events_local ON CLUSTER perftest_3shards_1replicas delete where 1=1;

Here 1=1 deletes all rows; replace it with a column condition to delete only matching rows.

Browse the ZooKeeper tree:

select * from system.zookeeper WHERE path='/';

 
       

