MySQL Database Partitioning [RANGE]


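Our customer service platform has performance problems with online queries. To solve, or at least ease, the problem, we added the necessary indexes and also partitioned the table.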

 

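The goal here is to partition an existing table, so the approach is alter table xxx. Partitioning at creation time with create table xxx partition by range(abc) also works. Both approaches were verified and tested, and both work. This post covers the alter approach.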
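For comparison, the create table form mentioned above might look roughly like the following. This is only a minimal sketch: the table name chat_message_history_demo, the reduced column list, and the partition name pmax are illustrative assumptions, not the real production schema.

```sql
-- Hypothetical, cut-down table used only to illustrate the syntax.
CREATE TABLE chat_message_history_demo (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `visitor_id` varchar(128) DEFAULT NULL,
  `message_time` datetime NOT NULL,
  -- The primary key has to contain the partitioning column.
  PRIMARY KEY (`id`, `message_time`)
) DEFAULT CHARSET=utf8
PARTITION BY RANGE (TO_DAYS(`message_time`)) (
  -- VALUES LESS THAN is exclusive: each bound is the first day of the next month.
  PARTITION p201708 VALUES LESS THAN (TO_DAYS('2017-09-01')),
  PARTITION p201709 VALUES LESS THAN (TO_DAYS('2017-10-01')),
  -- Catch-all for every later date.
  PARTITION pmax VALUES LESS THAN (MAXVALUE)
);
```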

 

ALTER is the focus because the process ran into a few small problems, which are recorded here for future reference.

Running show create table on our chat_message_history table shows the following structure:

Table    Create Table
chat_message_history    CREATE TABLE `chat_message_history` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `visitor_id` varchar(128) DEFAULT NULL,
  `visitor_name` varchar(255) DEFAULT NULL,
  `contentBlob` blob,
  `sender` varchar(32) DEFAULT NULL,
  `message_time` datetime DEFAULT NULL COMMENT '消息發送時間',
  `jobId` varchar(11) DEFAULT NULL,
  `robot_response` varchar(2000) DEFAULT NULL COMMENT '機器人的回復消息',
  `skill_group_id` varchar(32) DEFAULT NULL,
  `type` varchar(255) DEFAULT NULL,
  `new_skill_group_id` varchar(32) DEFAULT NULL,
  `channel` varchar(16) DEFAULT 'WXAPP' COMMENT '渠道',
  `message_id` varchar(64) DEFAULT NULL,
  `sessionId` varchar(50) DEFAULT NULL,
  `message_status` varchar(11) DEFAULT NULL,
  `error_message` varchar(255) DEFAULT NULL,
  `businessType` varchar(11) DEFAULT NULL COMMENT '1-歡迎語;',
  `pFlag` varchar(11) DEFAULT NULL COMMENT '消息的產品屬性 1-微醫保 2-微重疾',
  PRIMARY KEY (`id`),
  KEY `IDX_jobId` (`jobId`),
  KEY `idx_his_vis_ctdesc_key` (`visitor_id`,`skill_group_id`,`message_time`)
) DEFAULT CHARSET=utf8

 

Next, add the partitions with alter table. The table is partitioned by message time, roughly one partition per month:

alter table chat_message_history partition by range(to_days(message_time)) (
    partition p201708 values less than (to_days('2017-08-31')),
    partition p201709 values less than (to_days('2017-09-30')),
    partition p201710 values less than (to_days('2017-10-31')),    
    partition p201711 values less than (to_days('2017-11-30')),
    partition p201712 values less than (to_days('2017-12-31')),    
    partition p201801 values less than (to_days('2018-01-31')),     
    partition p201802 values less than (to_days('2018-02-30')),
    partition p201803 values less than (to_days('2018-03-31')),
    partition p201804 values less than (to_days('2018-04-30')),  
    partition p201805 values less than (to_days('2018-05-31')),
    partition p201806 values less than (to_days('2018-06-30')),  
    partition p201807 values less than (to_days('2018-07-31')),  
    partition p201808 values less than (to_days('2018-08-31')),  
    partition p201809 values less than (to_days('2018-09-30')),  
    partition p201810 values less than (to_days('2018-10-31')),  
    partition p201811 values less than (to_days('2018-11-30')),    
    partition p201812 values less than (to_days('2018-12-31')),  
    partition p201901 values less than (to_days('2019-01-31')),   
    partition p201902 values less than (to_days('2019-02-30')),
    partition p201903 values less than (to_days('2019-03-31')),  
    partition p201904 values less than (to_days('2019-04-30')),
    partition p201905 values less than (to_days('2019-05-31')),  
    partition p201906 values less than (to_days('2019-06-30')),
    partition p201907 values less than (to_days('2019-07-31')),  
    partition p201908 values less than (to_days('2019-08-31')),  
    partition p201909 values less than (to_days('2019-09-30')),
    partition p201910 values less than (to_days('2019-10-31')),    
    partition p201911 values less than (to_days('2019-11-30')),  
    partition p201912 values less than (to_days('2019-12-31')),   
    partition p202001 values less than (to_days('2020-01-31')),
    partition p202002 values less than (to_days('2020-02-30')),
    partition p202003 values less than (to_days('2020-03-31')),
    partition p202004 values less than (to_days('2020-04-30')),
    partition p202005 values less than (to_days('2020-05-31')),
    partition p202006 values less than (to_days('2020-06-30')),
    partition p202007 values less than (to_days('2020-07-31')),
    partition p202008 values less than (to_days('2020-08-31')),
    partition p202009 values less than (to_days('2020-09-30')),
    partition p202010 values less than (to_days('2020-10-31')),
    partition p202011 values less than (to_days('2020-11-30')),
    partition p202012 values less than (to_days('2020-12-31')),
    PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));

The SQL above fails with this error:

ERROR 1566 (HY000): Not allowed to use NULL value in VALUES LESS THAN

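On close inspection, none of the LESS THAN clauses contains a NULL value; each one is an explicit year-month-day converted to a day number that serves as the boundary. So the next step was to look at the to_days(expr) function.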

From the official documentation:

TO_DAYS(date)

Given a date date, returns a day number (the number of days since year 0).

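I suspected that the February dates I had supplied for each year ('2018-02-30', '2019-02-30', '2020-02-30') were not valid dates. A quick check: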

mysql> select TO_DAYS('2017-02-30');
+-----------------------+
| TO_DAYS('2017-02-30') |
+-----------------------+
|                  NULL |
+-----------------------+
1 row in set, 1 warning (0.00 sec)

mysql> 
mysql> show warnings;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1292 | Incorrect datetime value: '2017-02-30' |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
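A small sketch of how boundary dates could be sanity-checked before they go into the partition list. It uses only the built-in TO_DAYS and LAST_DAY functions; the specific dates are just examples.

```sql
-- An impossible calendar date makes TO_DAYS() return NULL (with a warning),
-- and a NULL bound is what VALUES LESS THAN rejects with ERROR 1566.
SELECT TO_DAYS('2018-02-30') IS NULL AS bound_is_null;

-- LAST_DAY() returns the real last day of a month, so month ends
-- do not have to be typed by hand.
SELECT LAST_DAY('2018-02-01') AS feb_end_2018,
       LAST_DAY('2020-02-01') AS feb_end_2020;

-- The first day of the following month is always a valid date.
-- The difference below is the length of February 2018 in days.
SELECT TO_DAYS('2018-03-01') - TO_DAYS('2018-02-01') AS days_in_feb_2018;
```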

 

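Based on that warning, the partitioning SQL was adjusted so that each boundary is the first day of the following month. Because VALUES LESS THAN is an exclusive bound, this also keeps the last day of each month in that month's own partition: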

alter table chat_message_history partition by range(to_days(message_time)) (
    partition p201708 values less than (to_days('2017-09-01')),
    partition p201709 values less than (to_days('2017-10-01')),
    partition p201710 values less than (to_days('2017-11-01')),    
    partition p201711 values less than (to_days('2017-12-01')),
    partition p201712 values less than (to_days('2018-01-01')),    
    partition p201801 values less than (to_days('2018-02-01')),     
    partition p201802 values less than (to_days('2018-03-01')),
    partition p201803 values less than (to_days('2018-04-01')),
    partition p201804 values less than (to_days('2018-05-01')),  
    partition p201805 values less than (to_days('2018-06-01')),
    partition p201806 values less than (to_days('2018-07-01')),  
    partition p201807 values less than (to_days('2018-08-01')),  
    partition p201808 values less than (to_days('2018-09-01')),  
    partition p201809 values less than (to_days('2018-10-01')),  
    partition p201810 values less than (to_days('2018-11-01')),  
    partition p201811 values less than (to_days('2018-12-01')),    
    partition p201812 values less than (to_days('2019-01-01')),  
    partition p201901 values less than (to_days('2019-02-01')),   
    partition p201902 values less than (to_days('2019-03-01')),
    partition p201903 values less than (to_days('2019-04-01')),  
    partition p201904 values less than (to_days('2019-05-01')),
    partition p201905 values less than (to_days('2019-06-01')),  
    partition p201906 values less than (to_days('2019-07-01')),
    partition p201907 values less than (to_days('2019-08-01')),  
    partition p201908 values less than (to_days('2019-09-01')),  
    partition p201909 values less than (to_days('2019-10-01')),
    partition p201910 values less than (to_days('2019-11-01')),    
    partition p201911 values less than (to_days('2019-12-01')),  
    partition p201912 values less than (to_days('2020-01-01')),   
    partition p202001 values less than (to_days('2020-02-01')),
    partition p202002 values less than (to_days('2020-03-01')),
    partition p202003 values less than (to_days('2020-04-01')),
    partition p202004 values less than (to_days('2020-05-01')),
    partition p202005 values less than (to_days('2020-06-01')),
    partition p202006 values less than (to_days('2020-07-01')),
    partition p202007 values less than (to_days('2020-08-01')),
    partition p202008 values less than (to_days('2020-09-01')),
    partition p202009 values less than (to_days('2020-10-01')),
    partition p202010 values less than (to_days('2020-11-01')),
    partition p202011 values less than (to_days('2020-12-01')),
    partition p202012 values less than (to_days('2021-01-01')),
    PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));

Execution still fails:

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

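This error means that the primary key must include every column used in the table's partitioning function. The table is partitioned by message_time, so message_time has to be combined with the existing id primary key into a composite primary key. The SQL is as follows (drop the existing id primary key, then create the composite one):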

alter table chat_message_history drop primary key,add primary key (`id`,`message_time`); 
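Primary key columns must be NOT NULL, while message_time is declared DEFAULT NULL in the table definition above. The following is a cautious sketch of what could be checked before changing the key. How existing NULL values are handled depends on the MySQL version and SQL mode, so treat this as an assumption to verify rather than a required step.

```sql
-- Count rows whose partitioning column is NULL; such rows would conflict
-- with a primary key that includes message_time.
SELECT COUNT(*) AS null_message_time
FROM chat_message_history
WHERE message_time IS NULL;

-- If the count is 0, the NOT NULL constraint can be stated explicitly
-- in the same statement that rebuilds the primary key.
ALTER TABLE chat_message_history
  MODIFY `message_time` datetime NOT NULL COMMENT '消息發送時間',
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (`id`, `message_time`);
```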

 

Run the partitioning SQL again:

mysql> alter table chat_message_history partition by range(to_days(message_time)) (
    -> partition p201708 values less than (to_days('2017-09-01')),
    -> partition p201709 values less than (to_days('2017-10-01')),
    -> partition p201710 values less than (to_days('2017-11-01')),    
    -> partition p201711 values less than (to_days('2017-12-01')),
    -> partition p201712 values less than (to_days('2018-01-01')),    
    -> partition p201801 values less than (to_days('2018-02-01')),     
    -> partition p201802 values less than (to_days('2018-03-01')),
    -> partition p201803 values less than (to_days('2018-04-01')),
    -> partition p201804 values less than (to_days('2018-05-01')),  
    -> partition p201805 values less than (to_days('2018-06-01')),
    -> partition p201806 values less than (to_days('2018-07-01')),  
    -> partition p201807 values less than (to_days('2018-08-01')),  
    -> partition p201808 values less than (to_days('2018-09-01')),  
    -> partition p201809 values less than (to_days('2018-10-01')),  
    -> partition p201810 values less than (to_days('2018-11-01')),  
    -> partition p201811 values less than (to_days('2018-12-01')),    
    -> partition p201812 values less than (to_days('2019-01-01')),  
    -> partition p201901 values less than (to_days('2019-02-01')),   
    -> partition p201902 values less than (to_days('2019-03-01')),
    -> partition p201903 values less than (to_days('2019-04-01')),  
    -> partition p201904 values less than (to_days('2019-05-01')),
    -> partition p201905 values less than (to_days('2019-06-01')),  
    -> partition p201906 values less than (to_days('2019-07-01')),
    -> partition p201907 values less than (to_days('2019-08-01')),  
    -> partition p201908 values less than (to_days('2019-09-01')),  
    -> partition p201909 values less than (to_days('2019-10-01')),
    -> partition p201910 values less than (to_days('2019-11-01')),    
    -> partition p201911 values less than (to_days('2019-12-01')),  
    -> partition p201912 values less than (to_days('2020-01-01')),   
    -> partition p202001 values less than (to_days('2020-02-01')),
    -> partition p202002 values less than (to_days('2020-03-01')),
    -> partition p202003 values less than (to_days('2020-04-01')),
    -> partition p202004 values less than (to_days('2020-05-01')),
    -> partition p202005 values less than (to_days('2020-06-01')),
    -> partition p202006 values less than (to_days('2020-07-01')),
    -> partition p202007 values less than (to_days('2020-08-01')),
    -> partition p202008 values less than (to_days('2020-09-01')),
    -> partition p202009 values less than (to_days('2020-10-01')),
    -> partition p202010 values less than (to_days('2020-11-01')),
    -> partition p202011 values less than (to_days('2020-12-01')),
    -> partition p202012 values less than (to_days('2021-01-01')),
    -> PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));
Query OK, 0 rows affected (1.28 sec)
Records: 0  Duplicates: 0  Warnings: 0

This time it succeeded. What a hassle!
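Because the last partition is a MAXVALUE catch-all, every row dated 2021-01-01 or later lands in p202XYZ. One possible way to carve out a new month later is to reorganize that partition. This is only a sketch: the partition name p202101 and its boundary are hypothetical, and on a large table the statement has to move the rows already stored in p202XYZ.

```sql
-- Split the catch-all partition into a new monthly partition
-- plus a fresh catch-all. p202101 is a hypothetical name.
ALTER TABLE chat_message_history
  REORGANIZE PARTITION p202XYZ INTO (
    PARTITION p202101 VALUES LESS THAN (TO_DAYS('2021-02-01')),
    PARTITION p202XYZ VALUES LESS THAN (MAXVALUE)
  );
```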

 

Now to verify whether the partitioning actually has an effect, mainly by comparison. First, the query plan before the table was partitioned:

mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table                | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | chat_message_history | ALL  | NULL          | NULL | NULL    | NULL | 5103176 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

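The estimated number of rows scanned is 5,103,176. The table holds about 5.3 million rows in total, so this query scans roughly 5.1 million of them, which is essentially the whole table.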

 

And after partitioning? Here is the same query:

mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table                | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | chat_message_history | ALL  | NULL          | NULL | NULL    | NULL | 829848 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

This time the estimated number of rows scanned drops to a little over 800,000 (829,848), which is a big reduction.

 

Judging from this exercise, partitioning makes a clear difference in the number of rows a query has to scan.
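The plain EXPLAIN output above has no column showing which partitions are read, so the row estimate is the only evidence of pruning. A sketch of two more direct checks follows, assuming an older server where EXPLAIN PARTITIONS is available (it was deprecated in MySQL 5.7, where plain EXPLAIN already includes a partitions column, and removed in 8.0).

```sql
-- Adds a "partitions" column listing the partitions the optimizer will read.
EXPLAIN PARTITIONS
SELECT * FROM chat_message_history
WHERE message_time > '2017-12-01' AND message_time < '2018-01-01';

-- Approximate row counts per partition, taken from information_schema.
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'chat_message_history'
ORDER BY PARTITION_ORDINAL_POSITION;
```

The expectation is that the partitions column is limited to the partition covering December 2017, possibly together with the first partition, which MySQL may include for TO_DAYS-based ranges because rows whose expression evaluates to NULL are stored there.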

 

 

Summary:

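1. Once a MySQL table reaches several million rows, multi-table join queries become extremely unstable in performance. This is what actually happened on our production system: twice within a few days, queries exhausted the database connections. In the latest incident all 600 connections were occupied, which made the system unavailable.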

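2. When the data volume grows, partitioning or adding indexes can ease the immediate problem. But if the amount of data a query may touch is not limited, response times will eventually become very slow again as time goes on. It is therefore advisable to split the data or the tables according to the business scenario or requirements, which helps keep the system highly available.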

 

