ClickHouse High Availability (Three Nodes)


Note: this article assumes a three-node, single-replica ClickHouse cluster has already been installed.

1. Cluster Overview
    cdhserver2 110.0.0.237 9006
    cdhserver3 110.0.0.238 9006
    cdhserver4 110.0.0.239 9006

1.1  Multiple instances are installed so that the three servers can host a 3-shard, 2-replica test cluster. The final deployment is shown in the table below:

              Replica 1             Replica 2
Shard 1   110.0.0.237:9006    110.0.0.237:9007
Shard 2   110.0.0.238:9006    110.0.0.238:9007
Shard 3   110.0.0.239:9006    110.0.0.239:9007

 

2. Configuring Multiple Instances

2.1  Copy /etc/clickhouse-server/config.xml to a new file name

cp /etc/clickhouse-server/config.xml /etc/clickhouse-server/config9007.xml

 

2.2  Edit /etc/clickhouse-server/config9007.xml and change the following settings so the two services do not conflict

Original values in config9007.xml:
<log>/var/log/clickhouse-server/clickhouse-server.log</log>
<errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
<http_port>8123</http_port>
<tcp_port>9006</tcp_port>
<interserver_http_port>9009</interserver_http_port>
<path>/var/lib/clickhouse/</path>
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
<user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
<include_from>/etc/clickhouse-server/metrica.xml</include_from> <!-- cluster configuration file -->


Adjusted values in config9007.xml:
<log>/var/log/clickhouse-server/clickhouse-server-9007.log</log>
<errorlog>/var/log/clickhouse-server/clickhouse-server-9007.err.log</errorlog>
<http_port>8124</http_port>
<tcp_port>9007</tcp_port>
<interserver_http_port>9011</interserver_http_port>
<path>/var/lib/clickhouse9007/</path>
<tmp_path>/var/lib/clickhouse9007/tmp/</tmp_path>
<user_files_path>/var/lib/clickhouse9007/user_files/</user_files_path>
<include_from>/etc/clickhouse-server/metrica9007.xml</include_from>

2.3  Create the corresponding directories
[root@cdhserver4]# mkdir -p /data/clickhouse9007
[root@cdhserver4]# chown -R clickhouse:clickhouse /data/clickhouse9007
[root@cdhserver4]# mkdir -p /var/lib/clickhouse9007
[root@cdhserver4]# chown -R clickhouse:clickhouse /var/lib/clickhouse9007

2.4  Add a service startup script for the new instance

[root@cdhserver4 init.d]# cp /etc/init.d/clickhouse-server /etc/init.d/clickhouse-server9007
[root@cdhserver4 init.d]# vim /etc/init.d/clickhouse-server9007

Change the following lines.

Original values:
CLICKHOUSE_CONFIG=$CLICKHOUSE_CONFDIR/config.xml
CLICKHOUSE_PIDFILE="$CLICKHOUSE_PIDDIR/$PROGRAM.pid"

Adjusted values:
CLICKHOUSE_CONFIG=$CLICKHOUSE_CONFDIR/config9007.xml
CLICKHOUSE_PIDFILE="$CLICKHOUSE_PIDDIR/$PROGRAM-9007.pid"

3. Cluster Configuration (3 Shards, 2 Replicas)
3.1  Content common to all six metrica*.xml files:

<?xml version="1.0"?>
<yandex>
<!-- ClickHouse cluster nodes -->
<clickhouse_remote_servers>
    <idc_cluster_data>
        <!-- Shard 1 -->
        <shard>
            <internal_replication>true</internal_replication>
            <weight>1</weight>
            <replica>
                <host>cdhserver2</host>
                <port>9006</port>
                <user>ck</user>
                <password>123456</password>
                <compression>true</compression>
            </replica>
            <replica>
                <host>cdhserver3</host>
                <port>9007</port>
                <user>ck</user>
                <password>123456</password>
                <compression>true</compression>
            </replica>
        </shard>
        <!-- Shard 2 -->
        <shard>
            <internal_replication>true</internal_replication>
            <weight>1</weight>
            <replica>
                <host>cdhserver3</host>
                <port>9006</port>
                <user>ck</user>
                <password>123456</password>
                <compression>true</compression>
            </replica>
            <replica>
                <host>cdhserver4</host>
                <port>9007</port>
                <user>ck</user>
                <password>123456</password>
                <compression>true</compression>
            </replica>
        </shard>
        <!-- Shard 3 -->
        <shard>
            <internal_replication>true</internal_replication>
            <weight>1</weight>
            <replica>
                <host>cdhserver4</host>
                <port>9006</port>
                <user>ck</user>
                <password>123456</password>
                <compression>true</compression>
            </replica>
            <replica>
                <host>cdhserver2</host>
                <port>9007</port>
                <user>ck</user>
                <password>123456</password>
                <compression>true</compression>
            </replica>
        </shard>
    </idc_cluster_data>
</clickhouse_remote_servers>

<!-- ZooKeeper configuration -->
<zookeeper-servers>
    <node index="1">
        <host>cdhserver2</host>
        <port>2181</port>
    </node>
    <node index="2">
        <host>cdhserver3</host>
        <port>2181</port>
    </node>
    <node index="3">
        <host>cdhserver4</host>
        <port>2181</port>
    </node>
</zookeeper-servers>

<!-- macros configuration (per-node values; see 3.2) -->
<macros>
    <layer>01</layer>
    <shard>01</shard>
    <replica>cdhserver2</replica>
</macros>

<networks>
    <ip>::/0</ip>
</networks>

<!-- compression settings -->
<clickhouse_compression>
    <case>
        <min_part_size>10000000000</min_part_size>
        <min_part_size_ratio>0.01</min_part_size_ratio>
        <method>lz4</method> <!-- lz4 is faster than zstd but compresses less, so it uses more disk -->
    </case>
</clickhouse_compression>
</yandex>

3.2  Per-node differences in the metrica*.xml files:

Macros in metrica.xml for cdhserver2 instance 1 (port 9006):
<macros>
    <!-- <replica>cdhserver2</replica> -->
    <layer>01</layer>
    <shard>01</shard>
    <replica>cdhserver2</replica>
</macros>


Macros in metrica9007.xml for cdhserver2 instance 2 (port 9007):
<macros>
    <!-- <replica>cdhserver2</replica> -->
    <layer>01</layer>
    <shard>03</shard>
    <replica>cdhserver2-03-2</replica>
</macros>

Macros in metrica.xml for cdhserver3 instance 1 (port 9006):
<macros>
    <!-- <replica>cdhserver3</replica> -->
    <layer>01</layer>
    <shard>02</shard>
    <replica>cdhserver3</replica>
</macros>


Macros in metrica9007.xml for cdhserver3 instance 2 (port 9007):
<macros>
    <!-- <replica>cdhserver3</replica> -->
    <layer>01</layer>
    <shard>01</shard>
    <replica>cdhserver3-01-2</replica>
</macros>

Macros in metrica.xml for cdhserver4 instance 1 (port 9006):
<macros>
    <!-- <replica>cdhserver4</replica> -->
    <layer>01</layer>
    <shard>03</shard>
    <replica>cdhserver4</replica>
</macros>

Macros in metrica9007.xml for cdhserver4 instance 2 (port 9007):
<macros>
    <!-- <replica>cdhserver4</replica> -->
    <layer>01</layer>
    <shard>02</shard>
    <replica>cdhserver4-02-2</replica>
</macros>
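
Once the instances are restarted (section 3.3), the macros each one actually picked up can be verified from the system.macros table; a quick sanity check (the expected values below follow from the configuration above, not from captured output):

-- e.g. clickhouse-client --host 110.0.0.237 --port 9007
SELECT macro, substitution FROM system.macros;
-- expected for cdhserver2 instance 2: layer=01, shard=03, replica=cdhserver2-03-2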

 

3.3  Start the highly available ClickHouse cluster (run both services on each of the three nodes)


[root@cdhserver4 clickhouse-server]#  /etc/init.d/clickhouse-server start
[root@cdhserver4 clickhouse-server]#  /etc/init.d/clickhouse-server9007 start
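
A quick liveness check for each instance (a minimal sketch; hostName(), version(), and uptime() are built-in ClickHouse functions):

-- clickhouse-client --port 9006, then again with --port 9007
SELECT hostName(), version(), uptime();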


3.4  Log in to the database and inspect the cluster


clickhouse-client --host 110.0.0.237 --port 9006
clickhouse-client --host 110.0.0.237 --port 9007
clickhouse-client --host 110.0.0.238 --port 9006
clickhouse-client --host 110.0.0.238 --port 9007
clickhouse-client --host 110.0.0.239 --port 9006
clickhouse-client --host 110.0.0.239 --port 9007
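
From any of these sessions, the cluster layout loaded from metrica*.xml can be confirmed through the system.clusters table; a sketch of the check:

SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'idc_cluster_data';
-- expect 6 rows: 3 shards x 2 replicas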

4. Verifying Cluster High Availability

4.1  How high availability works


ZooKeeper + ReplicatedMergeTree (replicated tables) + Distributed (distributed tables)


4.2  First, create the ReplicatedMergeTree table, using a distributed DDL statement so the command does not have to be run on all six nodes. (The idc_ha database must exist first; see the sketch below.)
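
The original does not show creating the idc_ha database; assuming it does not exist yet, it can be created with distributed DDL as well (a hedged sketch):

CREATE DATABASE IF NOT EXISTS idc_ha ON CLUSTER idc_cluster_data;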

CREATE TABLE idc_ha.t_s2_r2 on cluster idc_cluster_data
(
dt Date,
path String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/t_s2_r2','{replica}',dt, dt, 8192);

Explanation:
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/t_s2_r2','{replica}',dt, dt, 8192);
The first argument is the table's path in ZooKeeper.

The second argument is the table's replica name in ZooKeeper.

Here {layer}, {shard}, and {replica} are substituted with the values of the corresponding macros tags in metrica*.xml.
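
To see the macros actually substituted into the ZooKeeper paths, the system.replicas table can be queried on any instance once the table exists (a sketch; exact columns can vary by version):

SELECT zookeeper_path, replica_name
FROM system.replicas
WHERE table = 't_s2_r2';
-- e.g. on cdhserver2:9006 the path should be /clickhouse/tables/01-01/t_s2_r2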

 

4.3  Create the distributed table


Create it with distributed DDL as well, so that the cluster stays usable no matter which server's node goes down.

CREATE TABLE idc_ha.t_s2_r2_all on cluster idc_cluster_data
AS idc_ha.t_s2_r2 ENGINE = Distributed(idc_cluster_data, idc_ha, t_s2_r2, rand());

4.4  Insert and query data
insert into t_s2_r2_all values('2019-07-21','path1');
insert into t_s2_r2_all values('2019-07-22','path1');
insert into t_s2_r2_all values('2019-07-23','path1');
insert into t_s2_r2_all values('2019-07-23','path1');


The query output is omitted here. There are 4 rows in total, distributed across the shards on the three 9006 instances; the corresponding 9007 instances hold the replica copies of the three shards. A quick way to confirm this is shown below.
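
Grouping by hostName(), which a Distributed table evaluates on the remote shards, shows how the rows are spread without logging in to every instance (a sketch; one replica per shard answers, so three hosts are expected):

SELECT hostName() AS host, count() AS rows
FROM idc_ha.t_s2_r2_all
GROUP BY host;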


4.5  Verify query consistency of existing data when a node goes down

a. Stop both instance services on cdhserver3 to simulate a full cdhserver3 node failure.

[root@cdhserver3 clickhouse-server]# service clickhouse-server stop
Stop clickhouse-server service: DONE
[root@cdhserver3 clickhouse-server]# service clickhouse-server9007 stop
Stop clickhouse-server service: DONE

Result: with a single node down, data consistency is preserved. Likewise, taking down only cdhserver4 gives the same result.

 

b. What happens if cdhserver3 and cdhserver4 go down at the same time?

[root@cdhserver4 clickhouse-server]# service clickhouse-server stop
Stop clickhouse-server service: DONE
[root@cdhserver4 clickhouse-server]# service clickhouse-server9007 stop
Stop clickhouse-server service: DONE

[root@cdhserver3 clickhouse-server]# service clickhouse-server stop
Stop clickhouse-server service: DONE
[root@cdhserver3 clickhouse-server]# service clickhouse-server9007 stop
Stop clickhouse-server service: DONE

Result: if both instances on both of these machines are stopped, the distributed table cannot be queried. When one node has both of its instances stopped, stopping the shard instance (9006) on the other node still leaves the distributed table usable, but stopping that node's replica instance (9007) instead makes the data unqueryable. The reason is that a distributed table is only queryable when no data is missing: whether from shards or replicas, the surviving instances must together provide the complete data set; if any of it is missing, the distributed table is unusable.
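
When instances are down, the cluster's view of each replica can be inspected from system.clusters, where errors_count grows for replicas that fail to respond (a diagnostic sketch, assuming a reasonably recent ClickHouse build):

SELECT host_name, port, errors_count
FROM system.clusters
WHERE cluster = 'idc_cluster_data';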

c. Stop cdhserver2:9007, cdhserver3:9007, and cdhserver4:9007 (the 9007 instances are all replicas)

[root@cdhserver2 clickhouse-server]# /etc/init.d/clickhouse-server9007 stop
[root@cdhserver3 clickhouse-server]# /etc/init.d/clickhouse-server9007 stop
[root@cdhserver4 ~]# /etc/init.d/clickhouse-server9007 stop

Querying the table: the data served on port 9006 of cdhserver2, cdhserver3, and cdhserver4 remains consistent.


d. Stop cdhserver2:9006, cdhserver3:9006, and cdhserver4:9006 (the 9006 instances are all shards)

[root@cdhserver2 clickhouse-server]# /etc/init.d/clickhouse-server stop
[root@cdhserver3 clickhouse-server]# /etc/init.d/clickhouse-server stop
[root@cdhserver4 clickhouse-server]# /etc/init.d/clickhouse-server stop

Check data consistency on port 9007:
cdhserver4 :) select * from t_s2_r2_all;

SELECT *
FROM t_s2_r2_all

┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-22 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-21 │ path1 │
└────────────┴───────┘

4 rows in set. Elapsed: 0.017 sec.

cdhserver3 :) select * from t_s2_r2_all;

SELECT *
FROM t_s2_r2_all

┌─────────dt─┬─path──┐
│ 2019-07-21 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-22 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

4 rows in set. Elapsed: 0.007 sec.

cdhserver2 :) select * from t_s2_r2_all;

SELECT *
FROM t_s2_r2_all

┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-22 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-21 │ path1 │
└────────────┴───────┘
┌─────────dt─┬─path──┐
│ 2019-07-23 │ path1 │
└────────────┴───────┘

4 rows in set. Elapsed: 0.015 sec.

Conclusion: with the three shard instances (9006) stopped and only the three 9007 replica instances left, data consistency is still preserved.

e. Stop one 9006 shard instance and two 9007 replica instances, or one 9007 replica instance and two 9006 shard instances

 

Conclusion: the data is evenly distributed across the shards on port 9006 of the three machines and their replicas on port 9007. As long as a shard and the replica holding that shard's data do not go down at the same time, i.e., the surviving instances (all shards, a mix of shards and replicas, or all replicas) still cover the complete data set, consistency is always preserved.

 

 

 

