MySQL Table Partitioning: Partitioning by Date
An incorrect example of partitioning by date
The most intuitive approach is to partition directly on a year-month-day date value:
mysql> create table rms (d date)
    -> partition by range (d)
    -> (partition p0 values less than ('1995-01-01'),
    -> partition p1 VALUES LESS THAN ('2010-01-01'));
The example above partitions a table directly on the "Y-m-d" format. Unfortunately, what seems obvious often does not work, and you get an error message:
ERROR 1064 (42000): VALUES value must be of same type as partition function near '),
partition p1 VALUES LESS THAN ('2010-01-01'))' at line 3
That approach fails, and it is clearly uneconomical anyway; a seasoned DBA would partition on integer values instead:
mysql> CREATE TABLE part_date1
    -> ( c1 int default NULL,
    -> c2 varchar(30) default NULL,
    -> c3 date default NULL) engine=myisam
    -> partition by range (cast(date_format(c3,'%Y%m%d') as signed))
    -> (PARTITION p0 VALUES LESS THAN (19950101),
    -> PARTITION p1 VALUES LESS THAN (19960101) ,
    -> PARTITION p2 VALUES LESS THAN (19970101) ,
    -> PARTITION p3 VALUES LESS THAN (19980101) ,
    -> PARTITION p4 VALUES LESS THAN (19990101) ,
    -> PARTITION p5 VALUES LESS THAN (20000101) ,
    -> PARTITION p6 VALUES LESS THAN (20010101) ,
    -> PARTITION p7 VALUES LESS THAN (20020101) ,
    -> PARTITION p8 VALUES LESS THAN (20030101) ,
    -> PARTITION p9 VALUES LESS THAN (20040101) ,
    -> PARTITION p10 VALUES LESS THAN (20100101),
    -> PARTITION p11 VALUES LESS THAN MAXVALUE );
Query OK, 0 rows affected (0.01 sec)
Done? Let's keep analyzing:
mysql> explain partitions
    -> select count(*) from part_date1 where
    -> c3> '1995-01-01' and c3 <'1995-12-31'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: part_date1
   partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 8100000
        Extra: Using where
1 row in set (0.00 sec)
Annoyingly, MySQL scans every partition of the table for the query above instead of reading only the partition that covers our date range. The original article explains that the MySQL optimizer does not recognize this form of date partitioning, and it spent a good many paragraphs luring me down the wrong path first. Not nice.
A correct example of partitioning by date
The MySQL optimizer can prune partitions when the partitioning expression uses one of the following two built-in date functions:
TO_DAYS()
YEAR()
Here is an example:
mysql> CREATE TABLE part_date3
    -> ( c1 int default NULL,
    -> c2 varchar(30) default NULL,
    -> c3 date default NULL) engine=myisam
    -> partition by range (to_days(c3))
    -> (PARTITION p0 VALUES LESS THAN (to_days('1995-01-01')),
    -> PARTITION p1 VALUES LESS THAN (to_days('1996-01-01')) ,
    -> PARTITION p2 VALUES LESS THAN (to_days('1997-01-01')) ,
    -> PARTITION p3 VALUES LESS THAN (to_days('1998-01-01')) ,
    -> PARTITION p4 VALUES LESS THAN (to_days('1999-01-01')) ,
    -> PARTITION p5 VALUES LESS THAN (to_days('2000-01-01')) ,
    -> PARTITION p6 VALUES LESS THAN (to_days('2001-01-01')) ,
    -> PARTITION p7 VALUES LESS THAN (to_days('2002-01-01')) ,
    -> PARTITION p8 VALUES LESS THAN (to_days('2003-01-01')) ,
    -> PARTITION p9 VALUES LESS THAN (to_days('2004-01-01')) ,
    -> PARTITION p10 VALUES LESS THAN (to_days('2010-01-01')),
    -> PARTITION p11 VALUES LESS THAN MAXVALUE );
Query OK, 0 rows affected (0.00 sec)
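Before looking at the query plan, it can help to check which boundaries MySQL actually stored for the table. The following is only a minimal sketch, assuming part_date3 lives in the current default database; it reads INFORMATION_SCHEMA.PARTITIONS, where each to_days('...') bound should show up as a plain integer day number.

```sql
-- TO_DAYS() turns a date into an integer day number, which is why
-- these bounds are accepted where the quoted date strings were not.
SELECT TO_DAYS('1995-01-01') AS p0_upper_bound;  -- 728659

-- List the partitions of part_date3 together with their stored bounds.
-- Assumption: part_date3 is in the current default database.
SELECT PARTITION_NAME,
       PARTITION_EXPRESSION,
       PARTITION_DESCRIPTION,  -- upper bound of the partition, or MAXVALUE
       TABLE_ROWS              -- row count; may be an estimate for some engines
FROM   INFORMATION_SCHEMA.PARTITIONS
WHERE  TABLE_SCHEMA = DATABASE()
  AND  TABLE_NAME   = 'part_date3'
ORDER  BY PARTITION_ORDINAL_POSITION;
```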
Partitioning with the to_days() function succeeded. Let's analyze it:
mysql> explain partitions
    -> select count(*) from part_date3 where
    -> c3> date '1995-01-01' and c3 <date '1995-12-31'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: part_date3
   partitions: p1
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 808431
        Extra: Using where
1 row in set (0.00 sec)
As you can see, this time the MySQL optimizer lives up to expectations and reads only partition p1. But does querying this way really improve query efficiency? Below is a comparison of the same query against part_date3, created just now, and part_date1, whose partitioning failed to prune:
mysql> select count(*) from part_date3 where
    -> c3> date '1995-01-01' and c3 <date '1995-12-31';
+----------+
| count(*) |
+----------+
|   805114 |
+----------+
1 row in set (4.11 sec)

mysql> select count(*) from part_date1 where
    -> c3> date '1995-01-01' and c3 <date '1995-12-31';
+----------+
| count(*) |
+----------+
|   805114 |
+----------+
1 row in set (40.33 sec)
As you can see, with correct partitioning the query takes about 4 seconds, while with the ineffective partitioning it takes about 40 seconds (the same as having no partitioning at all). That is roughly 90% less query time, or about ten times faster. So be sure to use partitioning correctly, and always verify with explain after partitioning; only then do you get a real performance gain.
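YEAR() is named above as the second function the optimizer understands, but only TO_DAYS() is demonstrated. As a sketch, the same yearly layout could be written with YEAR(); the table name part_date_year is hypothetical, and this variant is not benchmarked here, so it should be checked with explain in the same way.

```sql
-- Hypothetical table: same columns and yearly layout as part_date3,
-- but partitioned on YEAR(c3) instead of TO_DAYS(c3).
CREATE TABLE part_date_year (
  c1 INT DEFAULT NULL,
  c2 VARCHAR(30) DEFAULT NULL,
  c3 DATE DEFAULT NULL
) ENGINE=MyISAM
PARTITION BY RANGE (YEAR(c3)) (
  PARTITION p0  VALUES LESS THAN (1995),
  PARTITION p1  VALUES LESS THAN (1996),
  PARTITION p2  VALUES LESS THAN (1997),
  PARTITION p3  VALUES LESS THAN (1998),
  PARTITION p4  VALUES LESS THAN (1999),
  PARTITION p5  VALUES LESS THAN (2000),
  PARTITION p6  VALUES LESS THAN (2001),
  PARTITION p7  VALUES LESS THAN (2002),
  PARTITION p8  VALUES LESS THAN (2003),
  PARTITION p9  VALUES LESS THAN (2004),
  PARTITION p10 VALUES LESS THAN (2010),
  PARTITION p11 VALUES LESS THAN MAXVALUE
);

-- Verify pruning: the partitions column should not list every partition.
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM part_date_year
WHERE c3 > DATE '1995-01-01' AND c3 < DATE '1995-12-31';
```

If pruning works here as it does for part_date3, the partitions column should list only p1, the partition that holds the 1995 rows.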
Note:
In MySQL 5.1, the statement that creates a partitioned table may contain only the following functions:
- ABS()
- CEILING() and FLOOR() (these two can be used only when the partitioning column they are applied to is of type INT), for example:
mysql> CREATE TABLE t (c FLOAT) PARTITION BY LIST( FLOOR(c) )(
    -> PARTITION p0 VALUES IN (1,3,5),
    -> PARTITION p1 VALUES IN (2,4,6)
    -> );
ERROR 1491 (HY000): The PARTITION function returns the wrong type
mysql> CREATE TABLE t (c int) PARTITION BY LIST( FLOOR(c) )(
    -> PARTITION p0 VALUES IN (1,3,5),
    -> PARTITION p1 VALUES IN (2,4,6)
    -> );
Query OK, 0 rows affected (0.01 sec)
- DAY()
- DAYOFMONTH()
- DAYOFWEEK()
- DAYOFYEAR()
- DATEDIFF()
- EXTRACT()
- HOUR()
- MICROSECOND()
- MINUTE()
- MOD()
- MONTH()
- QUARTER()
- SECOND()
- TIME_TO_SEC()
- TO_DAYS()
- WEEKDAY()
- YEAR()
- YEARWEEK()
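The functions in this list are not limited to RANGE partitioning. The following is a minimal sketch, assuming a hypothetical table part_by_month, that spreads rows across 12 hash partitions by month. Going by the discussion above, range conditions on c3 are only expected to be pruned for TO_DAYS() and YEAR(), so a layout like this should be checked with explain partitions before relying on it.

```sql
-- Hypothetical table: rows are distributed by the month of c3.
-- HASH partitioning places a row in partition MOD(MONTH(c3), 12).
CREATE TABLE part_by_month (
  c1 INT DEFAULT NULL,
  c3 DATE DEFAULT NULL
) ENGINE=MyISAM
PARTITION BY HASH (MONTH(c3))
PARTITIONS 12;
```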

