1 按照日期(天)分片:
從開始日期算起,按照天數來分片 例如,從2017-11-01,每10天一個分片且可以指定結束日期
注意事項:需要提前將分片規划好,建好,否則有可能日期超出實際配置分片數
1.1 修改配置文件
#修改rule.xml 添加按日期分片的的分配規則
vi rule.xml
<function name="sharding-by-date" class="org.opencloudb.route.function.PartitionByDate">
<property name="dateFormat">yyyy-MM-dd</property> <!--日期格式-->
<property name="sBeginDate">2017-11-01</property> <!--開始日期-->
<property name="sPartionDay">10</property> <!--每分片天數-->
</function>
<tableRule name="sharding-by-date">
<rule> <columns>create_date</columns>
<algorithm>sharding-by-date</algorithm>
</rule>
</tableRule>
#修改schema.xml 添加邏輯表
vi schema.xml
<table name="mycatbydate" primaryKey="ID" dataNode="dn1,dn2,dn3" rule="sharding-by-date"/>
#重新加載配置文件
[root@localhost conf]# mysql -h 192.168.2.130 -P9066 -utest -ptest
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 20
Server version: 5.5.8-mycat-1.5.1-RELEASE-20161130213509 MyCat Server (monitor)
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
mysql>
mysql> reload @@config;
Query OK, 1 row affected (0.08 sec)
Reload config success
1.2 創建表后插入數據並分析日志
mysql> create table mycatbydate(id int not null auto_increment primary key,
create_date datetime,datanode varchar(10));
mysql> insert into mycatbydate(create_date,datanode) values('2017-06-01',database());
分析mycat.log日志
1.3 查詢語句1:分片字段是等值運算,分析mycat.log
mysql> select * from mycatbydate where create_date='2017-11-01';
+----+---------------------+----------+
| id | create_date | datanode |
+----+---------------------+----------+
| 1 | 2017-11-01 00:00:00 | db1 |
+----+---------------------+----------+
1 row in set (0.03 sec)
分析mycat.log日志
1.4 查詢語句2:分片字段范圍查詢,分析explain和mycat.log
這里分別用 where 跟 betwwen and 的方式來測試一下范圍查詢
where
mysql> explain select * from mycatbydate
where create_date >='2017-11-01' and create_date <'2017-11-10';
+-----------+-------------------------------------------------------------+
| DATA_NODE | SQL |
+-----------+-------------------------------------------------------------+
| dn1 | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |
| | AND create_date < '2017-11-10' LIMIT 100 |
| dn2 | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |
| | AND create_date < '2017-11-10' LIMIT 100 |
| | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' | |
| dn3 | AND create_date < '2017-11-10' LIMIT 100 |
+-----------+-------------------------------------------------------------+
3 rows in set (0.00 sec)
分析mycat.log日志
between and
mysql> explain select * from mycatbydate where create_date
BETWEEN '2017-11-01' AND '2017-11-10';
+-----------+---------------------------------------------------+
| DATA_NODE | SQL |
+-----------+---------------------------------------------------+
| dn1 | SELECT * FROM mycatbydate WHERE create_date |
| | BETWEEN '2017-11-01' AND '2017-11-10' LIMIT 100 |
+-----------+---------------------------------------------------+
1 row in set (0.00 sec)
分析mycat.log日志
總結:對於范圍查詢,where語句是沒法優化,從而路由到相應的節點上,但是between and 可以優化,這個有咨詢過 lead us,答復是目前代碼做不到where優化。。。
1.5 插入限制以及解決方案
插入數據,如果超過30天數據提示ERROR 1064 (HY000): Index: 6, Size: 3
由於配置3個dn,每10天一個分片,從7月1日數據就無法插入,這種配置就存在數據超過實際分配個數,底層dn和實際數據要規划好
mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-12-01',database());
ERROR 1064 (HY000): Index: 6, Size: 3
mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-11-10',database());
Query OK, 1 row affected (0.01 sec)
mysql> insert into mycatbydate(create_date,datanode) values('2017-11-20',database());
Query OK, 1 row affected (0.00 sec)
mysql> insert into mycatbydate(create_date,datanode) values('2017-11-30',database());
Query OK, 1 row affected (0.00 sec)
#查詢數據分布:
mysql> select * from mycatbydate order by datanode;
+----+---------------------+----------+
| id | create_date | datanode |
+----+---------------------+----------+
| 1 | 2017-11-01 00:00:00 | db1 |
| 2 | 2017-11-10 00:00:00 | db1 |
| 2 | 2017-11-20 00:00:00 | db2 |
| 2 | 2017-11-30 00:00:00 | db3 |
+----+---------------------+----------+
4 rows in set (0.08 sec)
解決方案:指定分區結束時間,這樣分區可以重復使用
vi rule.xml
<function name="sharding-by-date" class="org.opencloudb.route.function.PartitionByDate">
<property name="dateFormat">yyyy-MM-dd</property> <!--日期格式-->
<property name="sBeginDate">2017-06-01</property> <!--開始日期-->
<property name="sEndDate">2017-06-30</property> <!--結束日期-->
<property name="sPartionDay">10</property> <!--每分片天數-->
</function>
@@reload_config
#重新加載配置文件后可以正常插入數據:
mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-12-01',database());
ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'
mysql> insert into mycatbydate(id,create_date,datanode) values(5,'2017-12-01',database());
Query OK, 1 row affected (0.18 sec)
mysql> insert into mycatbydate(id,create_date,datanode) values(6,'2018-01-01',database());
Query OK, 1 row affected (0.03 sec)
mysql> insert into mycatbydate(id,create_date,datanode) values(7,'2018-02-01',database());
Query OK, 1 row affected (0.03 sec)
mysql> insert into mycatbydate(id,create_date,datanode) values(8,'2018-03-01',database());
Query OK, 1 row affected (0.04 sec)
mysql> insert into mycatbydate(id,create_date,datanode) values(9,'2018-04-01',database());
Query OK, 1 row affected (0.04 sec)
mysql>
mysql>
mysql> select * from mycatbydate order by datanode;
+----+---------------------+----------+
| id | create_date | datanode |
+----+---------------------+----------+
| 5 | 2017-12-01 00:00:00 | db1 |
| 2 | 2017-11-10 00:00:00 | db1 |
| 1 | 2017-11-01 00:00:00 | db1 |
| 6 | 2018-01-01 00:00:00 | db1 |
| 7 | 2018-02-01 00:00:00 | db1 |
| 8 | 2018-03-01 00:00:00 | db1 |
| 9 | 2018-04-01 00:00:00 | db1 |
| 2 | 2017-11-20 00:00:00 | db2 |
| 2 | 2017-11-30 00:00:00 | db3 |
+----+---------------------+----------+
9 rows in set (0.14 sec)
2 按照自然月分片
每個自然月一個分片,需要提前將分片數規划好,建好,否則有可能日期超出實際配置分片數
2.1修改配置文件
#新增分配規則
vi rule.xml
<function name="partbymonth" class="org.opencloudb.route.function.PartitionByMonth">
<property name="dateFormat">yyyy-MM-dd</property>
<property name="sBeginDate">2017-11-01</property>
</function>
<tableRule name="sharding-by-month">
<rule>
<columns>create_date</columns>
<algorithm>partbymonth</algorithm>
</rule>
</tableRule>
#schema.xml增加邏輯表
<table name="mycatbymonth" primaryKey="ID" dataNode="dn$1-12" rule="sharding-by-month" />
<dataNode name="dn1" dataHost="192.168.124.55" database="db1" />
<dataNode name="dn2" dataHost="192.168.124.55" database="db2" />
<dataNode name="dn3" dataHost="192.168.124.55" database="db3" />
<dataNode name="dn4" dataHost="192.168.124.55" database="db4" />
<dataNode name="dn5" dataHost="192.168.124.55" database="db5" />
<dataNode name="dn6" dataHost="192.168.124.55" database="db6" />
<dataNode name="dn7" dataHost="192.168.124.55" database="db7" />
<dataNode name="dn8" dataHost="192.168.124.55" database="db8" />
<dataNode name="dn9" dataHost="192.168.124.55" database="db9" />
<dataNode name="dn10" dataHost="192.168.124.55" database="db10" />
<dataNode name="dn11" dataHost="192.168.124.55" database="db11" />
<dataNode name="dn12" dataHost="192.168.124.55" database="db12" />
2.2 mycat重新加載配置文件
mysql> reload @@config;
Query OK, 1 row affected (0.10 sec)
Reload config success
#查看邏輯表
mysql> show tables;
+------------------+
| Tables in TESTDB |
+------------------+
| customer |
| customer_addr |
| mycatbydate |
| mycatbymonth |
| orders |
| order_items |
| travelrecord |
| t_vote |
+------------------+
8 rows in set (0.00 sec)
2.3 創建表后插入數據
mysql>create table mycatbymonth(id int not null auto_increment primary key, create_date datetime,datanode varchar(10));
Query OK, 0 rows affected (0.20 sec)
-> ;
mysql> explain select * from mycatbymonth;
+-----------+--------------------------------------+
| DATA_NODE | SQL |
+-----------+--------------------------------------+
| dn1 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn10 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn11 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn12 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn2 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn3 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn4 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn5 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn6 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn7 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn8 | SELECT * FROM mycatbymonth LIMIT 100 |
| dn9 | SELECT * FROM mycatbymonth LIMIT 100 |
+-----------+--------------------------------------+
12 rows in set (0.01 sec)
mysql> insert into mycatbymonth(id,create_date,datanode) values(1,'2017-11-24',database());
Query OK, 1 row affected (0.01 sec)
查看mycat日志
2.4 批量插入,驗證插入分區,不能超過一年的數據。
#批量插入數據:
insert into mycatbymonth(id,create_date,datanode) values(1,'2017-01-01',database());
insert into mycatbymonth(id,create_date,datanode) values(2,'2017-01-24',database());
insert into mycatbymonth(id,create_date,datanode) values(3,'2017-02-01',database());
insert into mycatbymonth(id,create_date,datanode) values(4,'2017-03-01',database());
insert into mycatbymonth(id,create_date,datanode) values(5,'2017-04-01',database());
insert into mycatbymonth(id,create_date,datanode) values(6,'2017-05-01',database());
insert into mycatbymonth(id,create_date,datanode) values(7,'2017-06-01',database());
insert into mycatbymonth(id,create_date,datanode) values(8,'2017-07-01',database());
insert into mycatbymonth(id,create_date,datanode) values(9,'2017-08-01',database());
insert into mycatbymonth(id,create_date,datanode) values(12,'2017-12-01',database());
#報錯的插入
mysql> insert into mycatbymonth(id,create_date,datanode) values(13,'2019-01-01','db1');
ERROR 1064 (HY000): Can't find a valid data node for specified
node index :MYCATBYMONTH -> CREATE_DATE -> 2019-01-01 -> Index : 24
2.5 查詢驗證,查詢分區字段等值 以及范圍查詢同 按照日期(天)分片的方式 不贅述!
連續分片總結:
1、針對時間范圍查詢,采用between可以直接路由到對應分片,從而避免查詢所有dn,只查詢特定分片(具體原因還沒有分
析),這個問題問過leaderUS,回復時,目前技術上還沒法做到普通范圍查詢的優化。
2、插入數據存在超過分片情況,針對月分區,插入數據不能超過一年,針對天的,也存在日期超過分區問題