Exploring PostgreSQL Partitioned Tables (pg_pathman)


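Use cases

After a few years in production, many systems see their data volume balloon. Once a single table grows past 20 million rows, queries get slower and slower, while much of the historical data gradually becomes less important. At that point it is worth considering partitioned tables to store hot and cold data in separate partitions.

Typical use cases include log records kept for SQL analysis. Common partition keys are creation time, province, and business type; the right choice depends on the requirements.

The official PostgreSQL guidance is to consider partitioning once a single table's size exceeds the server's physical memory. (From a rough survey, given the capacity of modern servers, a reasonable rule of thumb is to keep a single table under about 32 GB or 20 million rows.)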

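Partitioning concept

Partitioning splits one logically large table into smaller physical pieces. Besides improving query performance, it also makes maintenance and administration easier.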

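Notes

PostgreSQL supported partitioning in 9.6 and earlier, but only through table inheritance combined with triggers, and performance was not very good. PostgreSQL 10 added built-in (declarative) partitioning, but according to some tests from the PostgreSQL community, its partitioning performance is still behind pg_pathman. This article mainly tests pg_pathman's RANGE partitioning.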
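For comparison with the pg_pathman calls used in the rest of this article, here is a minimal sketch of the built-in declarative syntax introduced in PostgreSQL 10. The table names demo_case, demo_case_2016 and demo_case_2017 are hypothetical and are not part of the test schema; only the column names are borrowed from it.

```sql
-- Hypothetical table for illustration; not part of the original test schema.
CREATE TABLE demo_case (
    c_bh   char(32) NOT NULL,
    d_slrq date     NOT NULL,
    c_xylx varchar(300)
) PARTITION BY RANGE (d_slrq);

-- Each partition covers a half-open range: FROM is inclusive, TO is exclusive.
CREATE TABLE demo_case_2016 PARTITION OF demo_case
    FOR VALUES FROM ('2016-01-01') TO ('2017-01-01');
CREATE TABLE demo_case_2017 PARTITION OF demo_case
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');

-- Rows are routed to the matching partition without any trigger.
INSERT INTO demo_case (c_bh, d_slrq, c_xylx)
VALUES ('7be7f21958e248a1b69a140f1151d4f4', '2016-05-19', '02');

-- tableoid shows which partition actually stores each row.
SELECT tableoid::regclass AS stored_in, d_slrq
FROM demo_case;
```

Unlike pg_pathman's create_range_partitions, this syntax needs one CREATE TABLE statement per partition.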

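Installation

Install the pg_pathman extension: link

Create the extension

-- Create the extension
create extension pg_pathman;
-- Check that the extension was installed (or use \dx in psql)
select * from pg_extension;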
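The pg_pathman documentation also asks for the library to be listed in shared_preload_libraries, which requires a server restart. The following is a small sketch for checking that setting and the installed version. The ALTER SYSTEM line is left commented out because it replaces the current value, so any libraries already listed would have to be included as well.

```sql
-- pg_pathman should appear in this list.
SHOW shared_preload_libraries;

-- One way to set it (superuser only, takes effect after a restart).
-- Assumption: no other libraries are currently preloaded.
-- ALTER SYSTEM SET shared_preload_libraries = 'pg_pathman';

-- Narrower check than scanning all of pg_extension.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_pathman';
```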

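RANGE partitioning

Note that the partition column must be NOT NULL. Columns that may be empty, such as a case's filing date or closing date, therefore cannot be used as the partition key.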
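Before adding the NOT NULL constraint, it can help to check whether a candidate column qualifies. A minimal sketch using the table and column from this article:

```sql
-- Rows that would violate NOT NULL on the intended partition column.
SELECT count(*) AS null_rows
FROM db_jcxx.t_jcxxzy_tjaj
WHERE d_slrq IS NULL;

-- Value range of the column, useful when choosing the start value
-- and the number of partitions.
SELECT min(d_slrq) AS earliest, max(d_slrq) AS latest
FROM db_jcxx.t_jcxxzy_tjaj;
```

If null_rows is greater than zero, the ALTER TABLE ... SET NOT NULL statement will fail until those rows are fixed or a different column is chosen.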

-- Check the row count
db_jcxxzypt=# select count(*) from db_jcxx.t_jcxxzy_tjaj;
  count
----------
 17507701

-- Add a NOT NULL constraint (the partition column must be NOT NULL)
db_jcxxzypt=# alter table t_jcxxzy_tjaj alter column d_slrq set not null;

-- Create the partitions. The 17M+ rows are partitioned by year, using the non-blocking migration method.
select create_range_partitions(
    't_jcxxzy_tjaj'::regclass,         -- parent table OID
    'd_slrq',                          -- partition column; must have a NOT NULL constraint
    '2000-01-01 00:00:00'::timestamp,  -- start value
    interval '1 year',                 -- partition interval: one year
    20,                                -- number of partitions
    false                              -- do not migrate data from the parent table right away
);

-- Migrate the data into the partitions
select partition_table_concurrently(
    't_jcxxzy_tjaj'::regclass,
    10000,  -- rows migrated per transaction (1 to 10000)
    1.0     -- sleep_time, in seconds
);

-- Check the background migration task
select * from pathman_concurrent_part_tasks;

-- Note: the outputs below appear to come from a run with 4-year partitions
-- (six partitions, five of them populated); the 1-year, 20-partition layout
-- created by the call above is compared later in the article.

-- Inspect the partitioned table
db_jcxxzypt=# \d+ db_jcxx.t_jcxxzy_tjaj
                              Table "db_jcxx.t_jcxxzy_tjaj"
 Column |          Type          | Modifiers | Storage  | Stats target | Description
--------+------------------------+-----------+----------+--------------+---------------------
 c_bh   | character(32)          | not null  | extended |              | ID
 c_xzdm | character varying(300) |           | extended |              | Administrative code
 (remaining columns omitted...)
Indexes:
    "t_jcxxzy_tjaj_new1_pkey" PRIMARY KEY, btree (c_bh)
    "idx_jcxxzy_tjaj_ajdsrs" btree (n_ajdsrs)
    "idx_ttjaj_cajly" btree (c_ajly)
    "idx_ttjaj_dslrq" btree (d_slrq)
    "idx_ttjaj_new1_ctwhbm" btree (c_twhbm)
    "idx_ttjaj_xylx" btree (c_xylx)
Child tables: db_jcxx.t_jcxxzy_tjaj_1,
              db_jcxx.t_jcxxzy_tjaj_2,
              db_jcxx.t_jcxxzy_tjaj_3,
              db_jcxx.t_jcxxzy_tjaj_4,
              db_jcxx.t_jcxxzy_tjaj_5,
              db_jcxx.t_jcxxzy_tjaj_6
Options: parallel_workers=2

-- Once partitioning is complete, disabling the parent table is recommended
select set_enable_parent('t_jcxxzy_tjaj'::regclass, false);

-- Row counts per partition
db_jcxxzypt=# select relname as tablename, reltuples::int as rowCounts from pg_class where relkind = 'r' and relname like 't_jcxxzy_tjaj%' order by rowCounts desc;
    tablename    | rowcounts
-----------------+-----------
 t_jcxxzy_tjaj_4 |   3662374
 t_jcxxzy_tjaj_2 |   3661425
 t_jcxxzy_tjaj_1 |   3660449
 t_jcxxzy_tjaj_3 |   3658622
 t_jcxxzy_tjaj_5 |   2864830
 t_jcxxzy_tjaj   |         0
 t_jcxxzy_tjaj_6 |         0
(7 rows)

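Migrating the 17 million rows took a little over an hour. If the table has indexes, you can drop them first and recreate them after the migration finishes, because when the partitions are created, every partition gets its own separate copy of each index. This is also why global uniqueness cannot be guaranteed.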
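Since indexes live on each partition separately, rebuilding them after the migration means one statement per partition. A possible sketch that generates those statements from pg_pathman's pathman_partition_list view follows. The column list (d_slrq, c_xylx) is only an example, and the generated statements still have to be run afterwards, for instance with psql's \gexec.

```sql
-- Partitions known to pg_pathman for this parent, with their ranges.
SELECT p.partition, p.range_min, p.range_max
FROM pathman_partition_list AS p
WHERE p.parent = 'db_jcxx.t_jcxxzy_tjaj'::regclass
ORDER BY p.range_min;

-- Generate one CREATE INDEX statement per partition.
-- The indexed columns are an example; adjust them to the queries you run.
SELECT format('CREATE INDEX ON %s (d_slrq, c_xylx);', p.partition) AS ddl
FROM pathman_partition_list AS p
WHERE p.parent = 'db_jcxx.t_jcxxzy_tjaj'::regclass;
```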

 

Counting rows with c_xylx='02': partitioned vs. unpartitioned

-- Unpartitioned
db_jcxxzypt=# explain analyze select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE c_xylx = '02';
                               QUERY PLAN
-------------------------------------------------------------------------
 Aggregate (cost=90147.38..90147.39 rows=1 width=8) (actual time=844.279..844.279 rows=1 loops=1)
   -> Index Only Scan using idx_ttjaj_xylx on t_jcxxzy_tjaj (cost=0.44..82870.01 rows=2910947 width=0) (actual time=0.041..569.953 rows=2916043 loops=1)
         Index Cond: (c_xylx = '02'::text)
         Heap Fetches: 0
 Planning time: 0.226 ms
 Execution time: 844.334 ms
(6 rows)

-- Unpartitioned: execution time
db_jcxxzypt=# \timing
Timing is off.
db_jcxxzypt=# select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE c_xylx = '02';
  count
---------
 2916043
(1 row)

Time: 543.206 ms

-- Partitioned: execution plan
db_jcxxzypt=# explain analyze select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE c_xylx = '02';
                               QUERY PLAN
-------------------------------------------------------------------------
 Aggregate (cost=89754.14..89754.15 rows=1 width=8) (actual time=1215.401..1215.401 rows=1 loops=1)
   -> Append (cost=0.43..82510.65 rows=2897393 width=0) (actual time=0.039..942.783 rows=2916043 loops=1)
         -> Index Only Scan using t_jcxxzy_tjaj_1_c_xylx_idx on t_jcxxzy_tjaj_1 (cost=0.43..17406.09 rows=611295 width=0) (actual time=0.039..127.923 rows=609209 loops=1)
               Index Cond: (c_xylx = '02'::text)
               Heap Fetches: 0
         -> Index Only Scan using t_jcxxzy_tjaj_2_c_xylx_idx on t_jcxxzy_tjaj_2 (cost=0.43..17105.00 rows=600718 width=0) (actual time=0.023..126.972 rows=609727 loops=1)
               Index Cond: (c_xylx = '02'::text)
               Heap Fetches: 0
         -> Index Only Scan using idx_ttjaj_c_xylx on t_jcxxzy_tjaj_3 (cost=0.43..16936.90 rows=594770 width=0) (actual time=0.032..124.370 rows=608945 loops=1)
               Index Cond: (c_xylx = '02'::text)
               Heap Fetches: 0
         -> Index Only Scan using t_jcxxzy_tjaj_4_c_xylx_idx on t_jcxxzy_tjaj_4 (cost=0.43..17313.76 rows=608076 width=0) (actual time=0.037..129.107 rows=611274 loops=1)
               Index Cond: (c_xylx = '02'::text)
               Heap Fetches: 0
         -> Index Only Scan using t_jcxxzy_tjaj_5_c_xylx_idx on t_jcxxzy_tjaj_5 (cost=0.43..13740.76 rows=482533 width=0) (actual time=0.037..99.022 rows=476888 loops=1)
               Index Cond: (c_xylx = '02'::text)
               Heap Fetches: 0
         -> Index Only Scan using i_t_jcxxzy_tjaj_h2_6 on t_jcxxzy_tjaj_6 (cost=0.12..8.14 rows=1 width=0) (actual time=0.006..0.006 rows=0 loops=1)
               Index Cond: (c_xylx = '02'::text)
               Heap Fetches: 0
 Planning time: 0.948 ms
 Execution time: 1215.495 ms
(22 rows)

Time: 1236.152 ms

-- Partitioned: execution time
db_jcxxzypt=# \timing
Timing is on.
db_jcxxzypt=# select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE c_xylx = '02';
  count
---------
 2916043
(1 row)

Time: 592.745 ms

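As the plan shows, rows with c_xylx='02' exist in every partition, so the query scans all partitions, and the time after partitioning is about the same as before.

Counting rows with c_xylx='02' within a date range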

-- Unpartitioned: execution plan
-- First create a composite index
create index i_t_jcxxzy_tjaj_h2 on t_jcxxzy_tjaj(d_slrq,c_xylx);

db_jcxxzypt=# explain analyze select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE d_slrq >='2016-01-01' and d_slrq <'2016-10-31' and c_xylx = '02';
                               QUERY PLAN
-------------------------------------------------------------------------
 Aggregate (cost=368120.65..368120.66 rows=1 width=8) (actual time=799.274..799.274 rows=1 loops=1)
   -> Bitmap Heap Scan on t_jcxxzy_tjaj (cost=18338.24..367801.05 rows=127840 width=0) (actual time=137.978..786.398 rows=126533 loops=1)
         Recheck Cond: ((d_slrq >= '2016-01-01'::date) AND (d_slrq < '2016-10-31'::date) AND ((c_xylx)::text = '02'::text))
         Rows Removed by Index Recheck: 1490760
         Heap Blocks: exact=35508 lossy=82085
         -> Bitmap Index Scan on i_t_jcxxzy_tjaj_h2 (cost=0.00..18306.28 rows=127840 width=0) (actual time=127.441..127.441 rows=126533 loops=1)
               Index Cond: ((d_slrq >= '2016-01-01'::date) AND (d_slrq < '2016-10-31'::date) AND ((c_xylx)::text = '02'::text))
 Planning time: 0.383 ms
 Execution time: 799.350 ms
(9 rows)

Time: 801.140 ms

-- Unpartitioned: execution time
db_jcxxzypt=# select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE d_slrq >='2016-01-01' and d_slrq <'2016-10-31' and c_xylx = '02';
 count
--------
 126533
(1 row)

Time: 772.393 ms

-- Create the same composite index on each partition
create index i_t_jcxxzy_tjaj_h2_1 on t_jcxxzy_tjaj_1(d_slrq,c_xylx);
create index i_t_jcxxzy_tjaj_h2_2 on t_jcxxzy_tjaj_2(d_slrq,c_xylx);
create index i_t_jcxxzy_tjaj_h2_3 on t_jcxxzy_tjaj_3(d_slrq,c_xylx);
create index i_t_jcxxzy_tjaj_h2_4 on t_jcxxzy_tjaj_4(d_slrq,c_xylx);
create index i_t_jcxxzy_tjaj_h2_5 on t_jcxxzy_tjaj_5(d_slrq,c_xylx);
create index i_t_jcxxzy_tjaj_h2_6 on t_jcxxzy_tjaj_6(d_slrq,c_xylx);

-- Partitioned: execution plan
db_jcxxzypt=# explain analyze select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE d_slrq >='2016-01-01' and d_slrq <'2016-10-31' and c_xylx = '02';
                               QUERY PLAN
-------------------------------------------------------------------------
 Aggregate (cost=17438.16..17438.17 rows=1 width=8) (actual time=106.158..106.158 rows=1 loops=1)
   -> Append (cost=0.43..17120.03 rows=127253 width=0) (actual time=0.319..94.105 rows=126533 loops=1)
         -> Index Only Scan using i_t_jcxxzy_tjaj_h2_5 on t_jcxxzy_tjaj_5 (cost=0.43..17120.03 rows=127253 width=0) (actual time=0.318..79.701 rows=126533 loops=1)
               Index Cond: ((d_slrq < '2016-10-31'::date) AND (c_xylx = '02'::text))
               Heap Fetches: 0
 Planning time: 0.488 ms
 Execution time: 106.216 ms
(7 rows)

Time: 107.383 ms

-- Here the plan only checks d_slrq < '2016-10-31' and skips d_slrq >= '2016-01-01',
-- because this partition's constraint already starts at d_slrq >= '2016-01-01'
db_jcxxzypt=# \d+ t_jcxxzy_tjaj_5
                             Table "db_jcxx.t_jcxxzy_tjaj_5"
 Column |          Type          | Modifiers | Storage  | Stats target | Description
--------+------------------------+-----------+----------+--------------+-------------
 c_bh   | character(32)          | not null  | extended |              |
 c_xzdm | character varying(300) |           | extended |              |
 ......
Check constraints:
    "pathman_t_jcxxzy_tjaj_5_check" CHECK (d_slrq >= '2016-01-01'::date AND d_slrq < '2020-01-01'::date)
Inherits: t_jcxxzy_tjaj

-- Partitioned: execution time
db_jcxxzypt=# select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE d_slrq >='2016-01-01' and d_slrq <'2016-10-31' and c_xylx = '02';
 count
--------
 126533
(1 row)

Time: 97.369 ms

The execution plan shows that after partitioning only t_jcxxzy_tjaj_5 is scanned, with an index only scan, and the query is much faster than on the unpartitioned table.

-- Date range spanning two partitions: data comes from both t_jcxxzy_tjaj_4 and t_jcxxzy_tjaj_5
db_jcxxzypt=# explain analyze select count(*) from db_jcxx.t_jcxxzy_tjaj WHERE d_slrq >='2015-12-01' and d_slrq <'2016-1-31' and c_xylx = '02';
                               QUERY PLAN
-------------------------------------------------------------------------
 Aggregate (cost=3458.09..3458.10 rows=1 width=8) (actual time=25.379..25.380 rows=1 loops=1)
   -> Append (cost=0.43..3395.52 rows=25029 width=0) (actual time=0.119..22.684 rows=25622 loops=1)
         -> Index Only Scan using i_t_jcxxzy_tjaj_h2_4 on t_jcxxzy_tjaj_4 (cost=0.43..1655.53 rows=12119 width=0) (actual time=0.117..11.829 rows=13032 loops=1)
               Index Cond: ((d_slrq >= '2015-12-01'::date) AND (c_xylx = '02'::text))
               Heap Fetches: 0
         -> Index Only Scan using i_t_jcxxzy_tjaj_h2_5 on t_jcxxzy_tjaj_5 (cost=0.43..1739.99 rows=12910 width=0) (actual time=0.184..7.693 rows=12590 loops=1)
               Index Cond: ((d_slrq < '2016-01-31'::date) AND (c_xylx = '02'::text))
               Heap Fetches: 0
 Planning time: 5.857 ms
 Execution time: 25.461 ms
(10 rows)

Time: 32.039 ms

The query scanned partitions t_jcxxzy_tjaj_4 and t_jcxxzy_tjaj_5 to count the rows with d_slrq >='2015-12-01' and d_slrq <'2016-1-31'.

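With a date range condition, the query can fetch its data by scanning only partition t_jcxxzy_tjaj_5. With partitioned tables, each partition's indexes are independent and cover only that one smaller table. For this kind of query, the partitioned table is far more efficient than the unpartitioned one.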
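A quick way to check whether a given query benefits from this pruning is a plain EXPLAIN, which plans the query without running it. The sketch below contrasts a filter on the partition key with a filter on c_xylx alone; the date bounds are only an example.

```sql
-- Filter on the partition key: the Append node should list only the
-- partitions whose ranges overlap the requested dates.
EXPLAIN
SELECT count(*)
FROM db_jcxx.t_jcxxzy_tjaj
WHERE d_slrq >= DATE '2016-01-01'
  AND d_slrq <  DATE '2016-11-01'   -- half-open range: January through October
  AND c_xylx = '02';

-- No filter on the partition key: every partition is expected in the plan.
EXPLAIN
SELECT count(*)
FROM db_jcxx.t_jcxxzy_tjaj
WHERE c_xylx = '02';
```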

sum(), avg(), and group by comparison

-- Unpartitioned
db_jcxxzypt=# select count(n_ajdsrs),n_ajdsrs from t_jcxxzy_tjaj group by n_ajdsrs;
  count  | n_ajdsrs
---------+----------
 4378357 |        0
 4377009 |        1
 4374162 |        2
 4378172 |        3
(4 rows)

Time: 4770.810 ms

db_jcxxzypt=# select sum(n_ajdsrs) from t_jcxxzy_tjaj;
   sum
----------
 26259849
(1 row)

Time: 4059.588 ms

db_jcxxzypt=# select avg(n_ajdsrs) from t_jcxxzy_tjaj;
        avg
--------------------
 1.4999028427491904
(1 row)

Time: 4098.815 ms

-- Partitioned
db_jcxxzypt=# select count(n_ajdsrs),n_ajdsrs from t_jcxxzy_tjaj group by n_ajdsrs;
  count  | n_ajdsrs
---------+----------
 4378357 |        0
 4377009 |        1
 4374162 |        2
 4378172 |        3
(4 rows)

Time: 4050.820 ms

db_jcxxzypt=# select sum(n_ajdsrs) from t_jcxxzy_tjaj;
   sum
----------
 26259849
(1 row)

Time: 2543.786 ms

db_jcxxzypt=# select avg(n_ajdsrs) from t_jcxxzy_tjaj;
        avg
--------------------
 1.4999028427491904
(1 row)

Time: 2727.279 ms

 

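RANGE partitioning performance comparison

Query performance on the 17.5 million rows of t_jcxxzy_tjaj after RANGE partitioning, compared by number of partitions:

 Query                        | Unpartitioned | 5 partitions (avg 3.6M rows) | 20 partitions (avg 0.9M rows)
------------------------------+---------------+------------------------------+-------------------------------
 c_xylx = '02'                | 543.206 ms    | 599.155 ms                   | 612.299 ms
 d_slrq range + c_xylx = '02' | 772.393 ms    | 97.369 ms                    | 77.807 ms
 group by n_ajdsrs            | 4976.328 ms   | 4770.810 ms                  | 4107.329 ms
 avg(n_ajdsrs)                | 4098.815 ms   | 2727.279 ms                  | 2643.653 ms
 sum(n_ajdsrs)                | 4059.588 ms   | 2543.786 ms                  | 2535.021 ms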

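There is little difference between 5 and 20 partitions. For c_xylx='02', which scans every partition, performance is about the same as without partitioning. Queries that filter on the partition key, however, are clearly faster, and some aggregate functions also run faster.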

Querying a partition directly

-- Query only partition t_jcxxzy_tjaj_5
db_jcxxzypt=# explain analyze select count(*) from db_jcxx.t_jcxxzy_tjaj_5 WHERE d_slrq >='2017-01-01' and d_slrq <'2017-1-31' and c_xylx = '02';
                               QUERY PLAN
-------------------------------------------------------------------------
 Aggregate (cost=4456.74..4456.75 rows=1 width=8) (actual time=29.910..29.911 rows=1 loops=1)
   -> Index Only Scan using i_t_jcxxzy_tjaj_h2_5 on t_jcxxzy_tjaj_5 (cost=0.43..4383.39 rows=29342 width=0) (actual time=0.157..26.854 rows=29497 loops=1)
         Index Cond: ((d_slrq >= '2017-01-01'::date) AND (d_slrq < '2017-01-31'::date) AND (c_xylx = '02'::text))
         Heap Fetches: 0
 Planning time: 0.272 ms
 Execution time: 29.969 ms
(6 rows)

Time: 30.910 ms

A partition can also be queried on its own.

Commonly used functions

-- After the data migration finishes, disabling the parent table is recommended so that it no longer
-- appears in execution plans. In testing, when the parent table was not disabled, most of the scan
-- time could be spent on the parent table.
select set_enable_parent('t_jcxxzy_tjaj'::regclass, false);

-- Add a partition at the end; the new partition continues from the existing ranges
db_jcxxzypt=# select append_range_partition('db_jcxx.t_jcxxzy_tjaj'::regclass);
 append_range_partition
------------------------
 t_jcxxzy_tjaj_9
(1 row)

-- Add a partition at the beginning
db_jcxxzypt=# select prepend_range_partition('t_jcxxzy_tjaj'::regclass);
 prepend_range_partition
-------------------------
 t_jcxxzy_tjaj_11
(1 row)

db_jcxxzypt=# \d+ t_jcxxzy_tjaj_11
                            Table "db_jcxx.t_jcxxzy_tjaj_11"
 Column |     Type      | Modifiers | Storage  | Stats target | Description
--------+---------------+-----------+----------+--------------+-------------
 c_bh   | character(32) | not null  | extended |              |
 -- some columns and indexes omitted...
Check constraints:
    "pathman_t_jcxxzy_tjaj_11_check" CHECK (d_slrq >= '1996-01-01'::date AND d_slrq < '2000-01-01'::date)
Inherits: t_jcxxzy_tjaj

-- Drop a single range partition; false means its rows are moved to the parent table
db_jcxxzypt=# select drop_range_partition('t_jcxxzy_tjaj_11',false);
NOTICE: 0 rows copied from t_jcxxzy_tjaj_11
 drop_range_partition
----------------------
 t_jcxxzy_tjaj_11
(1 row)

-- Drop all partitions and move their rows to the parent table (false means the data is kept).
-- The first argument is the parent table, not one of its partitions.
select drop_partitions('t_jcxxzy_tjaj'::regclass, false);

-- Merge partitions; they must be adjacent
select merge_range_partitions('t_jcxxzy_tjaj_10'::regclass, 't_jcxxzy_tjaj_11'::regclass);

-- Split a range partition into two; only supported for range-partitioned tables
select split_range_partition(
    't_jcxxzy_tjaj_6'::regclass,        -- partition OID
    '2022-01-01 00:00:00'::timestamp,   -- split value
    't_jcxxzy_tjaj_6_1'                 -- name of the new partition
);

-- Create partitions automatically
select set_auto('t_jcxxzy_tjaj'::regclass, true);

-- Insert a row whose d_slrq is 2100-05-19
db_jcxxzypt=# INSERT INTO "db_jcxx"."t_jcxxzy_tjaj" ("c_bh", "d_slrq") VALUES ('7be7f21958e248a1b69a140f1151d4f4', '2100-05-19');
INSERT 0 1

db_jcxxzypt=# \d+ t_jcxxzy_tjaj
                             Table "db_jcxx.t_jcxxzy_tjaj"
 Column |     Type      | Modifiers | Storage  | Stats target | Description
--------+---------------+-----------+----------+--------------+-------------
 c_bh   | character(32) | not null  | extended |              | ID
 -- columns omitted...
Child tables: t_jcxxzy_tjaj_1,
              t_jcxxzy_tjaj_12,
              t_jcxxzy_tjaj_13,
              t_jcxxzy_tjaj_14,
              t_jcxxzy_tjaj_15,
              t_jcxxzy_tjaj_16,
              t_jcxxzy_tjaj_17,
              t_jcxxzy_tjaj_18,
              t_jcxxzy_tjaj_19,
              t_jcxxzy_tjaj_2,
              t_jcxxzy_tjaj_20,
              t_jcxxzy_tjaj_21,
              t_jcxxzy_tjaj_22,
              t_jcxxzy_tjaj_23,
              t_jcxxzy_tjaj_24,
              t_jcxxzy_tjaj_25,
              t_jcxxzy_tjaj_26,
              t_jcxxzy_tjaj_27,
              t_jcxxzy_tjaj_28,
              t_jcxxzy_tjaj_29,
              t_jcxxzy_tjaj_3,
              t_jcxxzy_tjaj_4,
              t_jcxxzy_tjaj_5,
              t_jcxxzy_tjaj_6,
              t_jcxxzy_tjaj_6_1,
              t_jcxxzy_tjaj_9
Options: parallel_workers=2

After the insert, partitions t_jcxxzy_tjaj_12 through t_jcxxzy_tjaj_29 had been created automatically, continuing from the existing ranges: pg_pathman keeps creating partitions until one covers the date of the inserted row. With dirty data this produces a large number of partitions, so enabling this option is not recommended.

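Enabling automatic partition creation is not recommended: with dirty data it will keep creating many partitions. A scheduled job that creates partitions periodically is a better option.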
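One possible shape for such a scheduled job is sketched below. It assumes the job runs the statement periodically (from cron via psql, for example), that a one-year lead time is wanted, and that range_max in pathman_partition_list can be cast to date for this table. None of these choices come from the original test.

```sql
-- Turn off automatic partition creation for the parent table.
SELECT set_auto('db_jcxx.t_jcxxzy_tjaj'::regclass, false);

-- Statement for the periodic job: append one more partition only when
-- the last existing range ends less than a year from today.
SELECT append_range_partition('db_jcxx.t_jcxxzy_tjaj'::regclass)
WHERE (
    SELECT max(p.range_max::date)
    FROM pathman_partition_list AS p
    WHERE p.parent = 'db_jcxx.t_jcxxzy_tjaj'::regclass
) < current_date + interval '1 year';
```

With automatic creation off, a row whose date falls outside every existing range should be rejected rather than trigger a chain of new partitions, which is usually the safer failure mode for dirty data.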

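Detaching a partition from the parent table and dropping a partition

-- Detach the partition from the parent table
db_jcxxzypt=# ALTER TABLE t_jcxxzy_tjaj_30 NO INHERIT t_jcxxzy_tjaj;
ALTER TABLE
Time: 2.922 ms

After detaching, the table still exists and can be used on its own.

-- Drop the partition
DROP TABLE t_jcxxzy_tjaj_30;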

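If the data in a partition has expired and needs to be removed, simply drop the partition. This is faster than DELETE, because DELETE only marks rows as deleted and the space still has to be reclaimed by VACUUM.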
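pg_pathman also ships its own functions for retiring a partition, which keep its internal configuration in step with the table hierarchy. A minimal sketch, using t_jcxxzy_tjaj_1 as a stand-in for whichever partition has expired:

```sql
-- Option 1: detach first, so the data can be archived or inspected,
-- then drop the now-standalone table.
SELECT detach_range_partition('db_jcxx.t_jcxxzy_tjaj_1'::regclass);
DROP TABLE db_jcxx.t_jcxxzy_tjaj_1;

-- Option 2 (alternative to option 1): drop the partition together with
-- its rows in one call. true means the data is deleted, not copied to
-- the parent table.
-- SELECT drop_range_partition('db_jcxx.t_jcxxzy_tjaj_1', true);
```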

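Conclusion

1. When partitioning an existing table, it is best to build the indexes after the data migration has finished.

2. If the table already exists, create the partitions first and then use the non-blocking migration function.

3. To take full advantage of partitioning in queries, the partition column must be used as a filter condition.

4. Note that global uniqueness is lost after partitioning: different partitions can contain the same UUID.

5. Queries that filter on the partition key are very efficient.

6. The partition column must be NOT NULL, so columns that may be empty, such as a case's filing date or closing date, cannot be used as the partition key.

7. VACUUM or ANALYZE on t_jcxxzy_tjaj only acts on the parent table; to analyze the data, each partition must be analyzed separately.

8. Partitions can be backed up individually; to back up all partitions together, the straightforward way is to back up the whole schema.

9. After migrating data into the partitions, disabling the parent table is recommended. If the parent table has not been vacuumed, the plan will do a full scan of the parent table, which is very slow.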
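Points 4 and 7 can each be checked with a short query. The following is a sketch against the table used in this article. The first query can be slow on a table of this size, and the second only generates statements, which still have to be executed (for example with psql's \gexec).

```sql
-- Point 4: find c_bh values that occur more than once.
-- Each partition has its own primary key, so any duplicate found
-- through the parent table must span different partitions.
SELECT c_bh, count(*) AS copies
FROM db_jcxx.t_jcxxzy_tjaj
GROUP BY c_bh
HAVING count(*) > 1;

-- Point 7: generate one ANALYZE statement per partition.
SELECT format('ANALYZE %s;', p.partition) AS stmt
FROM pathman_partition_list AS p
WHERE p.parent = 'db_jcxx.t_jcxxzy_tjaj'::regclass;
```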

