一、簡介
在數據庫日漸龐大的今天,為了方便對數據庫數據的管理,比如按時間,按地區去統計一些數據時,基數過於龐大,多有不便。很多商業數據庫都提供分區的概念,按不同的維度去存放數據,便於后期的管理,PostgreSQL也不例外。
PostgresSQL分區的意思是把邏輯上的一個大表分割成物理上的幾塊兒。分區不僅能帶來訪問速度的提升,關鍵的是,它能帶來管理和維護上的方便。
分區的具體好處是:
- 某些類型的查詢性能可以得到極大提升。
- 更新的性能也可以得到提升,因為表的每塊的索引要比在整個數據集上的索引要小。如果索引不能全部放在內存里,那么在索引上的讀和寫都會產生更多的磁盤訪問。
- 批量刪除可以用簡單的刪除某個分區來實現。
- 可以將很少用的數據移動到便宜的、轉速慢的存儲介質上。
在PG里表分區是通過表繼承來實現的,一般都是建立一個主表,里面是空,然后每個分區都去繼承它。無論何時,都應保證主表里面是空的。
小表分區不實際,表在多大情況下才考慮分區呢?PostgresSQL官方給出的建議是:當表本身大小超過了機器物理內存的實際大小時(the size of the table should exceed the physical memory of the database server),可以考慮分區。
PG目前(9.2.2)僅支持范圍分區和列表分區,尚未支持散列分區。
二、環境
系統環境:CentOS release 6.3 (Final)
PostgreSQL版本:PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
三、實現分區
3.1 創建主表
david=# create table tbl_partition ( david(# id integer, david(# name varchar(20), david(# gender boolean, david(# join_date date, david(# dept char(4)); CREATE TABLE david=#
3.2 創建分區表
david=# create table tbl_partition_201211 ( check ( join_date >= DATE '2012-11-01' AND join_date < DATE '2012-12-01' ) ) INHERITS (tbl_partition); CREATE TABLE david=# create table tbl_partition_201212 ( check ( join_date >= DATE '2012-12-01' AND join_date < DATE '2013-01-01' ) ) INHERITS (tbl_partition); CREATE TABLE david=# create table tbl_partition_201301 ( check ( join_date >= DATE '2013-01-01' AND join_date < DATE '2013-02-01' ) ) INHERITS (tbl_partition); CREATE TABLE david=# create table tbl_partition_201302 ( check ( join_date >= DATE '2013-02-01' AND join_date < DATE '2013-03-01' ) ) INHERITS (tbl_partition); CREATE TABLE david=# create table tbl_partition_201303 ( check ( join_date >= DATE '2013-03-01' AND join_date < DATE '2013-04-01' ) ) INHERITS (tbl_partition); CREATE TABLE david=# create table tbl_partition_201304 ( check ( join_date >= DATE '2013-04-01' AND join_date < DATE '2013-05-01' ) ) INHERITS (tbl_partition); CREATE TABLE david=# create table tbl_partition_201305 ( check ( join_date >= DATE '2013-05-01' AND join_date < DATE '2013-06-01' ) ) INHERITS (tbl_partition); CREATE TABLE david=#
3.3 分區鍵上建索引
david=# create index tbl_partition_201211_joindate on tbl_partition_201211 (join_date); CREATE INDEX david=# create index tbl_partition_201212_joindate on tbl_partition_201212 (join_date); CREATE INDEX david=# create index tbl_partition_201301_joindate on tbl_partition_201301 (join_date); CREATE INDEX david=# create index tbl_partition_201302_joindate on tbl_partition_201302 (join_date); CREATE INDEX david=# create index tbl_partition_201303_joindate on tbl_partition_201303 (join_date); CREATE INDEX david=# create index tbl_partition_201304_joindate on tbl_partition_201304 (join_date); CREATE INDEX david=# create index tbl_partition_201305_joindate on tbl_partition_201305 (join_date); CREATE INDEX david=#
對於開發人員來說,希望數據庫是透明的,只管 insert into tbl_partition。對於數據插向哪個分區,則希望由DB決定。這點,ORACLE實現了,但是PG不行,需要前期人工處理下。
3.4 創建觸發器函數
david=# CREATE OR REPLACE FUNCTION tbl_partition_insert_trigger() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.join_date >= DATE '2012-11-01' AND NEW.join_date < DATE '2012-12-01' ) THEN INSERT INTO tbl_partition_201211 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2012-12-01' AND NEW.join_date < DATE '2013-01-01' ) THEN INSERT INTO tbl_partition_201212 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2013-01-01' AND NEW.join_date < DATE '2013-02-01' ) THEN INSERT INTO tbl_partition_201301 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2013-02-01' AND NEW.join_date < DATE '2013-03-01' ) THEN INSERT INTO tbl_partition_201302 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2013-03-01' AND NEW.join_date < DATE '2013-04-01' ) THEN INSERT INTO tbl_partition_201303 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2013-04-01' AND NEW.join_date < DATE '2013-05-01' ) THEN INSERT INTO tbl_partition_201304 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2013-05-01' AND NEW.join_date < DATE '2013-06-01' ) THEN INSERT INTO tbl_partition_201305 VALUES (NEW.*); ELSE RAISE EXCEPTION 'Date out of range. Fix the tbl_partition_insert_trigger() function!'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql; CREATE FUNCTION david=#
說明:如果不想丟失數據,上面的ELSE 條件可以改成 INSERT INTO tbl_partition_error_join_date VALUES (NEW.*); 同時需要創建一張結構和tbl_partition 一樣的表tbl_partition_error_join_date,這樣,錯誤的join_date 數據就可以插入到這張表中而不是報錯了。
3.5 創建觸發器
david=# CREATE TRIGGER insert_tbl_partition_trigger david-# BEFORE INSERT ON tbl_partition david-# FOR EACH ROW EXECUTE PROCEDURE tbl_partition_insert_trigger(); CREATE TRIGGER david=#
四、查看表
4.1 查看所有表
david=# \dt List of relations Schema | Name | Type | Owner --------+----------------------+-------+---------- public | tbl_partition | table | postgres public | tbl_partition_201211 | table | postgres public | tbl_partition_201212 | table | postgres public | tbl_partition_201301 | table | postgres public | tbl_partition_201302 | table | postgres public | tbl_partition_201303 | table | postgres public | tbl_partition_201304 | table | postgres public | tbl_partition_201305 | table | postgres (8 rows) david=#
4.2 查看主表
david=# \d tbl_partition Table "public.tbl_partition" Column | Type | Modifiers -----------+-----------------------+----------- id | integer | name | character varying(20) | gender | boolean | join_date | date | dept | character(4) | Triggers: insert_tbl_partition_trigger BEFORE INSERT ON tbl_partition FOR EACH ROW EXECUTE PROCEDURE tbl_partition_insert_trigger() Number of child tables: 7 (Use \d+ to list them.) david=#
4.3 查看分區表
david=# \d tbl_partition_201304 Table "public.tbl_partition_201304" Column | Type | Modifiers -----------+-----------------------+----------- id | integer | name | character varying(20) | gender | boolean | join_date | date | dept | character(4) | Indexes: "tbl_partition_201304_joindate" btree (join_date) Check constraints: "tbl_partition_201304_join_date_check" CHECK (join_date >= '2013-04-01'::date AND join_date < '2013-05-01'::date) Inherits: tbl_partition david=#
五、測試
5.1 插入數據
david=# insert into tbl_partition values (1, 'David', '1', '2013-01-10', 'TS'); INSERT 0 0 david=# insert into tbl_partition values (2, 'Sandy', '0', '2013-02-10', 'TS'); INSERT 0 0 david=# insert into tbl_partition values (3, 'Eagle', '1', '2012-11-01', 'TS'); INSERT 0 0 david=# insert into tbl_partition values (4, 'Miles', '1', '2012-12-15', 'SD'); INSERT 0 0 david=# insert into tbl_partition values (5, 'Simon', '1', '2012-12-10', 'SD'); INSERT 0 0 david=# insert into tbl_partition values (6, 'Rock', '1', '2012-11-10', 'SD'); INSERT 0 0 david=# insert into tbl_partition values (7, 'Peter', '1', '2013-01-11', 'SD'); INSERT 0 0 david=# insert into tbl_partition values (8, 'Sally', '0', '2013-03-10', 'BCSC'); INSERT 0 0 david=# insert into tbl_partition values (9, 'Carrie', '0', '2013-04-02', 'BCSC'); INSERT 0 0 david=# insert into tbl_partition values (10, 'Lee', '1', '2013-01-05', 'BMC'); INSERT 0 0 david=# insert into tbl_partition values (11, 'Nicole', '0', '2012-11-10', 'PROJ'); INSERT 0 0 david=# insert into tbl_partition values (12, 'Renee', '0', '2013-01-10', 'TS'); INSERT 0 0 david=#
5.2 查看主表數據
david=# select * from tbl_partition; id | name | gender | join_date | dept ----+--------+--------+------------+------ 3 | Eagle | t | 2012-11-01 | TS 6 | Rock | t | 2012-11-10 | SD 11 | Nicole | f | 2012-11-10 | PROJ 4 | Miles | t | 2012-12-15 | SD 5 | Simon | t | 2012-12-10 | SD 1 | David | t | 2013-01-10 | TS 7 | Peter | t | 2013-01-11 | SD 10 | Lee | t | 2013-01-05 | BMC 12 | Renee | f | 2013-01-10 | TS 2 | Sandy | f | 2013-02-10 | TS 8 | Sally | f | 2013-03-10 | BCSC 9 | Carrie | f | 2013-04-02 | BCSC (12 rows) david=#
5.3 查看分區表數據
david=# select * from tbl_partition_201301 ; id | name | gender | join_date | dept ----+-------+--------+------------+------ 1 | David | t | 2013-01-10 | TS 7 | Peter | t | 2013-01-11 | SD 10 | Lee | t | 2013-01-05 | BMC 12 | Renee | f | 2013-01-10 | TS (4 rows) david=#
六、管理分區
6.1 移除數據/分區
實現分區表之后,我們就可以很容易地移除不再使用的舊數據了,最簡單的方法就是:
david=# drop table tbl_partition_201304;
這樣可以快速移除大量數據,而不是逐條刪除數據。
另一個推薦做法是將分區從分區表中移除,但是保留訪問權限。
david=# alter table tbl_partition_201304 no inherit tbl_partition; ALTER TABLE david=#
和直接DROP 相比,該方式僅僅是使子表脫離了原有的主表,而存儲在子表中的數據仍然可以得到訪問,因為此時該表已經被還原成一個普通的數據表了。這樣對於數據庫的DBA來說,就可以在此時對該表進行必要的維護操作,如數據清理、歸檔等,在完成諸多例行性的操作之后,就可以考慮是直接刪除該表(DROP TABLE),還是先清空該表的數據(TRUNCATE TABLE),之后再讓該表重新繼承主表。
david=# alter table tbl_partition_201304 inherit tbl_partition; ALTER TABLE david=#
6.2 增加分區
我們可以像之前那樣增加一個分區
david=# create table tbl_partition_201306 ( check ( join_date >= DATE '2013-06-01' AND join_date < DATE '2013-07-01' ) ) INHERITS (tbl_partition);
david=# create index tbl_partition_201306_joindate on tbl_partition_201306 (join_date);
同時,需要修改觸發器函數,將插入條件改成相應的值。
說明:創建觸發器函數時,最好把插入條件寫更未來一點,比如多寫十年,這樣以后增加新分區時就不需要重新創建觸發器函數了,也可以避免一些不必要的錯誤。
另外,還可以如下增加新的分區:
david=# create table tbl_partition_201307 david-# (LIKE tbl_partition INCLUDING DEFAULTS INCLUDING CONSTRAINTS); CREATE TABLE david=# david=# alter table tbl_partition_201307 add constraint tbl_partition_201307_join_date_check david-# check ( join_date >= DATE '2013-07-01' AND join_date < DATE '2013-08-01' ); ALTER TABLE david=# david=# create index tbl_partition_201307_joindate on tbl_partition_201307 (join_date);
david=# copy tbl_partition_201307 from '/tmp/tbl_partition_201307.sql'; //從文件中拷貝數據,這些數據可以是事前准備的 david=# alter table tbl_partition_201307 inherit tbl_partition;
七、約束排除
約束排除(Constraint exclusion)是一種查詢優化技巧,它改進了用上面方法定義的表分區的性能。
確保postgresql.conf 里的配置參數constraint_exclusion 是打開的。沒有這個參數,查詢不會按照需要進行優化。這里我們需要做的是確保該選項在配置文件中沒有被注釋掉。
如果沒有約束排除,查詢會掃描tbl_partition 表中的每一個分區。打開了約束排除之后,規划器將檢查每個分區的約束然后再試圖證明該分區不需要被掃描,因為它不能包含任何符合WHERE子句條件的數據行。如果規划器可以證明這個,它就把該分區從查詢規划里排除出去。
可以使用EXPLAIN 命令顯示一個規划在constraint_exclusion 關閉和打開情況下的不同:
7.1 約束排除關閉
david=# set constraint_exclusion = off; SET david=# explain select count(*) from tbl_partition where join_date >= DATE '2013-04-01'; QUERY PLAN ------------------------------------------------------------------------------------------------- Aggregate (cost=172.80..172.81 rows=1 width=0) -> Append (cost=0.00..167.62 rows=2071 width=0) -> Seq Scan on tbl_partition (cost=0.00..0.00 rows=1 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201211 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201212 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201301 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201302 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201303 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201305 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201304 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201306 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201307 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) (22 rows) david=#
從上面的查詢計划中可以看出,PostgreSQL 掃描了所有分區。下面我們再看一下打開約束排除之后的查詢計划:
7.2 約束排除開啟
david=# set constraint_exclusion = on; SET david=# explain select count(*) from tbl_partition where join_date >= DATE '2013-04-01'; QUERY PLAN ------------------------------------------------------------------------------------------------- Aggregate (cost=76.80..76.81 rows=1 width=0) -> Append (cost=0.00..74.50 rows=921 width=0) -> Seq Scan on tbl_partition (cost=0.00..0.00 rows=1 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201305 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201304 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201306 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) -> Seq Scan on tbl_partition_201307 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2013-04-01'::date) (12 rows) david=#
可以看到,PostgreSQL 只掃描四月份以后的分區表。
八、可選的分區方式
還可以通過建立規則的方式進行分區。
CREATE RULE insert_tbl_partition_201211 AS ON INSERT TO tbl_partition WHERE ( join_date >= DATE '2012-11-01' AND join_date < DATE '2012-12-01' ) DO INSTEAD INSERT INTO tbl_partition_201211 VALUES (NEW.*); CREATE RULE insert_tbl_partition_201212 AS ON INSERT TO tbl_partition WHERE ( join_date >= DATE '2012-12-01' AND join_date < DATE '2013-01-01' ) DO INSTEAD INSERT INTO tbl_partition_201212 VALUES (NEW.*); ... CREATE RULE insert_tbl_partition_201306 AS ON INSERT TO tbl_partition WHERE ( join_date >= DATE '2013-06-01' AND join_date < DATE '2013-07-01' ) DO INSTEAD INSERT INTO tbl_partition_201306 VALUES (NEW.*); CREATE RULE insert_tbl_partition_201307 AS ON INSERT TO tbl_partition WHERE ( join_date >= DATE '2013-07-01' AND join_date < DATE '2013-08-01' ) DO INSTEAD INSERT INTO tbl_partition_201307 VALUES (NEW.*); CREATE RULE insert_tbl_partition_error_join_date AS ON INSERT TO tbl_partition WHERE ( join_date >= DATE '2013-08-01' OR join_date < DATE '2012-11-01' ) DO INSTEAD INSERT INTO tbl_partition_error_join_date VALUES (NEW.*);
九、注意事項
VACUUM 或 ANALYZE tbl_partition 只會對主表起作用,要想分析表,需要分別分析每個分區表。
十、參考資料
- PostgreSQL官方說明:http://www.postgresql.org/docs/9.2/static/ddl-partitioning.html
- ITEYE:http://diegoball.iteye.com/blog/713826
- kenyon(君羊):http://my.oschina.net/Kenyon/blog/59455