Traditional hash partitioning in PostgreSQL: method and performance


Background

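Besides traditional trigger- and rule-based partitioning, PostgreSQL 10 ships built-in partitioning (at the time of writing it supports only list and range; built-in hash partitioning arrived later, in PostgreSQL 11). Hash partitioning is also available through pg_pathman.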

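In terms of performance, pg_pathman is currently the best option.

The traditional approach, however, remains the most flexible one. When nothing else works, it is worth considering.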

How to create traditional hash partitions

1. Create the parent table

create table tbl (id int, info text, crt_time timestamp);  

2. Create the child tables and add the CHECK constraints

do language plpgsql $$  
declare  
  parts int := 4;  
begin  
  for i in 0..parts-1 loop  
    execute format('create table tbl%s (like tbl including all) inherits (tbl)', i);  
    execute format('alter table tbl%s add constraint ck check(mod(id,%s)=%s)', i, parts, i);  
  end loop;  
end;  
$$;  

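3. Create the trigger function. It routes each row to a child table and then returns NULL, so the row is not also written to the parent table.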

create or replace function ins_tbl() returns trigger as $$  
declare  
begin  
  case abs(mod(NEW.id,4))  
    when 0 then  
      insert into tbl0 values (NEW.*);  
    when 1 then  
      insert into tbl1 values (NEW.*);  
    when 2 then  
      insert into tbl2 values (NEW.*);  
    when 3 then  
      insert into tbl3 values (NEW.*);  
    else  
      return NEW;  -- id is NULL: write the row to the parent table itself  
    end case;  
    return null;  
end;  
$$ language plpgsql strict;  

4. Create the BEFORE trigger

create trigger tg1 before insert on tbl for each row when (NEW.id is not null) execute procedure ins_tbl();  

5. Verify

postgres=# insert into tbl values (1);  
INSERT 0 0  
postgres=# insert into tbl values (null);  
INSERT 0 1  
postgres=# insert into tbl values (0);  
INSERT 0 0  
postgres=# insert into tbl values (1);  
INSERT 0 0  
postgres=# insert into tbl values (2);  
INSERT 0 0  
postgres=# insert into tbl values (3);  
INSERT 0 0  
postgres=# insert into tbl values (4);  
INSERT 0 0  
  
  
postgres=# select  tableoid::regclass, * from tbl;  
tableoid | id | info | crt_time  
----------+----+------+----------  
tbl      |    |      |  
tbl0    |  0 |      |  
tbl0    |  4 |      |  
tbl1    |  1 |      |  
tbl1    |  1 |      |  
tbl2    |  2 |      |  
tbl3    |  3 |      |  
(7 rows)  
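
To double-check the routing, it can help to count rows per physical table and to look at the parent on its own. The statements below are a small sketch against the tbl tables created above, not part of the original test. The last one is a probe resting on an assumption worth checking locally: PostgreSQL's mod() keeps the sign of the dividend, so the trigger's abs(mod(id,4)) sends a negative id to a child whose constraint mod(id,4)=N it cannot satisfy, and the insert is expected to be rejected by the CHECK constraint.

```sql
-- Rows per physical table (parent and children).
select tableoid::regclass as part, count(*) from tbl group by 1 order by 1;

-- Rows stored in the parent itself; ONLY skips the child tables.
select * from only tbl;

-- Probe with a negative key. With the constraints exactly as defined above,
-- this is expected to fail with a check constraint violation.
insert into tbl values (-1);
```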

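6. At query time, as long as the WHERE clause contains a condition that matches the CHECK constraint, the planner prunes down to the matching child table and does not scan the children that cannot satisfy it.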

postgres=# explain select * from tbl where abs(mod(id,4)) = abs(mod(1,4)) and id=1;  
                                QUERY PLAN                                  
--------------------------------------------------------------------------  
Append  (cost=0.00..979127.84 rows=3 width=45)  
  ->  Seq Scan on tbl  (cost=0.00..840377.67 rows=2 width=45)  
        Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))  
  ->  Seq Scan on tbl1  (cost=0.00..138750.17 rows=1 width=45)  
        Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))  
(5 rows)  

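This looks wrong. To let constraint_exclusion optimize a query, the WHERE condition should be as simple as possible and match the CHECK constraint as closely as it can: no type casts, and certainly no extra function expressions. In my own test, the query above scans every child table. The caveats from the official documentation on partitioned tables and the constraint_exclusion parameter are listed further down.
I now see what the original author (德哥, digoal) meant. With hash partitioning into 4 parts, the routing value should take only four values, all greater than or equal to 0, so applying abs is reasonable. But the abs has to appear in the CHECK constraint as well. Otherwise the WHERE condition does not match the constraint and constraint_exclusion cannot prune anything.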

Below are the plan I measured and the plan after changing the condition.
Database version: PostgreSQL 10.1, constraint_exclusion = partition

swrd=# explain select * from tbl where abs(mod(id,4)) = abs(mod(1,4)) and id=1;  
                         QUERY PLAN                         
------------------------------------------------------------
 Append  (cost=0.00..133.66 rows=5 width=44)
   ->  Seq Scan on tbl  (cost=0.00..3.26 rows=1 width=44)
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
   ->  Seq Scan on tbl0  (cost=0.00..32.60 rows=1 width=44)
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
   ->  Seq Scan on tbl1  (cost=0.00..32.60 rows=1 width=44)
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
   ->  Seq Scan on tbl2  (cost=0.00..32.60 rows=1 width=44)
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
   ->  Seq Scan on tbl3  (cost=0.00..32.60 rows=1 width=44)
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
(11 rows)

Plan after changing the WHERE condition:

swrd=# explain select * from tbl where mod(id,4) = mod(1,4) and id=1;  
                         QUERY PLAN                         
------------------------------------------------------------
 Append  (cost=0.00..32.75 rows=2 width=44)
   ->  Seq Scan on tbl  (cost=0.00..2.98 rows=1 width=44)
         Filter: ((id = 1) AND (mod(id, 4) = 1))
   ->  Seq Scan on tbl1  (cost=0.00..29.78 rows=1 width=44)
         Filter: ((id = 1) AND (mod(id, 4) = 1))
(5 rows)
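
The fix suggested in the note above, putting abs into the CHECK constraint so that it matches both the trigger and the WHERE condition, could be sketched as follows. It assumes the tbl0..tbl3 children and the constraint name ck from step 2, and that any rows already stored satisfy the new constraint. Whether the planner then prunes for the abs(mod(id,4)) form should be confirmed with EXPLAIN on your own version.

```sql
-- Recreate the CHECK constraints with abs(), mirroring the trigger's routing expression.
do language plpgsql $$
declare
  parts int := 4;
begin
  for i in 0..parts-1 loop
    execute format('alter table tbl%s drop constraint if exists ck', i);
    execute format('alter table tbl%s add constraint ck check(abs(mod(id,%s))=%s)', i, parts, i);
  end loop;
end;
$$;

-- Constraint exclusion must be enabled ("partition" is the default, "on" also works).
show constraint_exclusion;

-- The WHERE condition now has the same shape as the constraint.
explain select * from tbl where abs(mod(id,4)) = 1 and id = 1;
```

With this variant a negative id no longer conflicts with the constraint of the child it is routed to, since both sides use the same expression.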

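Traditional partitioning vs. a non-partitioned table

Performance of the trigger-partitioned table

Insert throughput is lower than without partitioning (about 177,000 vs. 260,000 TPS in the runs below, roughly 32% lower), and user CPU time is somewhat higher.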

1. Create the benchmark script

vi test.sql  
\set id random(1,100000)  
insert into tbl values (:id);  

2. Run the benchmark

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 56 -j 56 -T 120  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 56  
number of threads: 56  
duration: 120 s  
number of transactions actually processed: 21277635  
latency average = 0.316 ms  
latency stddev = 0.170 ms  
tps = 177290.033472 (including connections establishing)  
tps = 177306.915203 (excluding connections establishing)  
script statistics:  
- statement latencies in milliseconds:  
        0.002  \set id random(1,100000)  
        0.315  insert into tbl values (:id);  

3. Resource usage

last pid: 36817;  load avg:  32.9,  15.7,  7.27;      up 15+00:46:36                                                                                                                                                              17:59:17  
63 processes: 34 running, 29 sleeping  
CPU states: 42.3% user,  0.0% nice, 20.4% system, 37.1% idle,  0.2% iowait  
Memory: 192G used, 29G free, 116M buffers, 186G cached  
DB activity: 168654 tps,  0 rollbs/s, 928 buffer r/s, 99 hit%,    176 row r/s, 168649 row w/  
DB I/O:    0 reads/s,    0 KB/s,    0 writes/s,    0 KB/s    
DB disk: 1455.4 GB total, 425.2 GB free (70% used)  
Swap:  

Performance of the non-partitioned table

postgres=# drop trigger tg1 on tbl ;  

1. TPS

transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 56  
number of threads: 56  
duration: 120 s  
number of transactions actually processed: 31188395  
latency average = 0.215 ms  
latency stddev = 0.261 ms  
tps = 259884.798007 (including connections establishing)  
tps = 259896.495810 (excluding connections establishing)  
script statistics:  
- statement latencies in milliseconds:  
        0.002  \set id random(1,100000)  
        0.214  insert into tbl values (:id);  

2. Resource usage

last pid: 36964;  load avg:  31.7,  18.7,  8.89;      up 15+00:47:41                                                                                                                                                              18:00:22  
63 processes: 45 running, 18 sleeping  
CPU states: 33.3% user,  0.0% nice, 26.8% system, 39.8% idle,  0.1% iowait  
Memory: 194G used, 26G free, 118M buffers, 188G cached  
DB activity: 256543 tps,  0 rollbs/s, 1006 buffer r/s, 99 hit%,    176 row r/s, 256538 row w  
DB I/O:    0 reads/s,    0 KB/s,    0 writes/s,    0 KB/s    
DB disk: 1455.4 GB total, 424.8 GB free (70% used)  
Swap:  

Hash partitioning on non-integer columns

1. PostgreSQL ships internal hash functions that map values of many built-in types to an integer.

                                  List of functions  
  Schema  |      Name      | Result data type |    Argument data types    |  Type    
------------+----------------+------------------+-----------------------------+--------  
pg_catalog | hash_aclitem  | integer          | aclitem                    | normal  
pg_catalog | hash_array    | integer          | anyarray                    | normal  
pg_catalog | hash_numeric  | integer          | numeric                    | normal  
pg_catalog | hash_range    | integer          | anyrange                    | normal  
pg_catalog | hashbpchar    | integer          | character                  | normal  
pg_catalog | hashchar      | integer          | "char"                      | normal  
pg_catalog | hashenum      | integer          | anyenum                    | normal  
pg_catalog | hashfloat4    | integer          | real                        | normal  
pg_catalog | hashfloat8    | integer          | double precision            | normal  
pg_catalog | hashinet      | integer          | inet                        | normal  
pg_catalog | hashint2      | integer          | smallint                    | normal  
pg_catalog | hashint4      | integer          | integer                    | normal  
pg_catalog | hashint8      | integer          | bigint                      | normal  
pg_catalog | hashmacaddr    | integer          | macaddr                    | normal  
pg_catalog | hashmacaddr8  | integer          | macaddr8                    | normal  
pg_catalog | hashname      | integer          | name                        | normal  
pg_catalog | hashoid        | integer          | oid                        | normal  
pg_catalog | hashoidvector  | integer          | oidvector                  | normal  
pg_catalog | hashtext      | integer          | text                        | normal  
pg_catalog | hashvarlena    | integer          | internal                    | normal  
pg_catalog | interval_hash  | integer          | interval                    | normal  
pg_catalog | jsonb_hash    | integer          | jsonb                      | normal  
pg_catalog | pg_lsn_hash    | integer          | pg_lsn                      | normal  
pg_catalog | time_hash      | integer          | time without time zone      | normal  
pg_catalog | timestamp_hash | integer          | timestamp without time zone | normal  
pg_catalog | timetz_hash    | integer          | time with time zone        | normal  
pg_catalog | uuid_hash      | integer          | uuid                        | normal  
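
As a quick illustration of how these functions feed a modulo-based partition number, the sketch below hashes a few keys and checks how evenly 100,000 generated text keys spread over 4 parts. The key values are made up for the example. These are internal functions, so it is safer to assume their output may differ between major versions and to re-check after an upgrade.

```sql
-- Partition number for a single text key (the key is an arbitrary example value).
select abs(mod(hashtext('some-key'), 4)) as part_no;

-- The same idea for another type, here bigint via hashint8.
select abs(mod(hashint8(1234567890123::bigint), 4)) as part_no;

-- Distribution of 100,000 generated text keys across the 4 parts.
select abs(mod(hashtext(i::text), 4)) as part_no, count(*)
from generate_series(1, 100000) as i
group by 1
order by 1;
```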

2. Hash partitioning on other column types works as follows

For example, with hashtext:

drop table tbl;  
  
create table tbl (id text, info text, crt_time timestamp);  
  
do language plpgsql $$  
declare  
  parts int := 4;  
begin  
  for i in 0..parts-1 loop  
    execute format('create table tbl%s (like tbl including all) inherits (tbl)', i);  
    execute format('alter table tbl%s add constraint ck check(abs(mod(hashtext(id),%s))=%s)', i, parts, i);  
  end loop;  
end;  
$$;  
  
create or replace function ins_tbl() returns trigger as $$  
declare  
begin  
  case abs(mod(hashtext(NEW.id),4))  
    when 0 then  
      insert into tbl0 values (NEW.*);  
    when 1 then  
      insert into tbl1 values (NEW.*);  
    when 2 then  
      insert into tbl2 values (NEW.*);  
    when 3 then  
      insert into tbl3 values (NEW.*);  
    else  
      return NEW;  
    end case;  
    return null;  
end;  
$$ language plpgsql strict;  
  
create trigger tg1 before insert on tbl for each row when (NEW.id is not null) execute procedure ins_tbl();  

Performance is the same as with an integer key.

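Traditional partitioning vs. a non-partitioned table: results

1. Throughput

Mode                          | Inserts (rows/s)
------------------------------+-----------------
Trigger-based hash partitions | 177,000
Not partitioned               | 260,000

2. CPU usage

Mode                          | user  | system | idle
------------------------------+-------+--------+-------
Trigger-based hash partitions | 42.3% | 20.4%  | 37.1%
Not partitioned               | 33.3% | 26.8%  | 39.8%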

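Summary

Besides traditional trigger- and rule-based partitioning, PostgreSQL 10 ships built-in partitioning (at the time of writing it supports only list and range), and pg_pathman adds hash partitioning.

In terms of performance, pg_pathman is currently the best option.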

《PostgreSQL 10 內置分區 vs pg_pathman perf profiling》

《PostgreSQL 10.0 preview 功能增強 - 內置分區表》

《PostgreSQL 9.5+ 高效分區表實現 - pg_pathman》

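The traditional approach, however, remains the most flexible one. When nothing else works, it is worth considering.

The laziest variant of the traditional approach (which of course trades away performance) is shown in: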

《PostgreSQL general public partition table trigger》
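
For illustration only, and not taken from the article referenced above, a generic routing trigger might look like the sketch below. It builds the child table name at run time instead of hard-coding one branch per partition. The names ins_tbl_generic and tg_generic are hypothetical, the partition count is passed as a trigger argument, and the child tables are assumed to follow the tbl0..tblN naming used earlier. Dynamic SQL is planned on every call, which is where the performance cost comes from.

```sql
-- Generic router: child table name = parent name || abs(mod(id, parts)).
create or replace function ins_tbl_generic() returns trigger as $$
declare
  parts int;
begin
  parts := TG_ARGV[0]::int;  -- partition count supplied in CREATE TRIGGER
  execute format('insert into %I.%I select ($1).*',
                 TG_TABLE_SCHEMA,
                 TG_TABLE_NAME::text || abs(mod(NEW.id, parts))::text)
    using NEW;
  return null;  -- row was routed, do not store it in the parent
end;
$$ language plpgsql;

-- Replace the hand-written trigger from step 4 with the generic one.
drop trigger if exists tg1 on tbl;
create trigger tg_generic before insert on tbl
  for each row when (NEW.id is not null)
  execute procedure ins_tbl_generic(4);
```

This version expects an integer id column. For a text key, the routing expression would need hashtext(NEW.id) as in the earlier example.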

Below are the caveats on partitioned tables and the constraint_exclusion parameter from the official PostgreSQL 10 documentation:

The following caveats apply to constraint exclusion, which is used by both inheritance and partitioned tables:

  • Constraint exclusion only works when the query's WHERE clause contains constants (or externally supplied parameters). For example, a comparison against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized, since the planner cannot know which partition the function value might fall into at run time.

  • Keep the partitioning constraints simple, else the planner may not be able to prove that partitions don't need to be visited. Use simple equality conditions for list partitioning, or simple range tests for range partitioning, as illustrated in the preceding examples. A good rule of thumb is that partitioning constraints should contain only comparisons of the partitioning column(s) to constants using B-tree-indexable operators, which applies even to partitioned tables, because only B-tree-indexable column(s) are allowed in the partition key. (This is not a problem when using declarative partitioning, since the automatically generated constraints are simple enough to be understood by the planner.)

  • All constraints on all partitions of the master table are examined during constraint exclusion, so large numbers of partitions are likely to increase query planning time considerably. Partitioning using these techniques will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions.

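In short:

  • Constraint exclusion only works when the query's WHERE clause contains constants (or externally supplied parameters). A comparison against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized, because the planner cannot know which partition the value will fall into at run time.
  • Keep the partitioning constraints simple, otherwise the planner may be unable to prove that a partition can be skipped.
  • Every constraint on every partition is examined during constraint exclusion, so a large number of partitions noticeably increases planning time. Up to about a hundred partitions works well. Thousands do not.

Sources: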
https://github.com/digoal/blog/blob/master/201711/20171122_02.md
https://www.postgresql.org/docs/10/static/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION

