目錄
概念
pg_hint_plan是一款插件,類似於oracle的hint;用於選擇特定的執行計划,進行SQL調優。
編譯安裝
本文以pg_hint_plan-REL10_1_3_3.tar.gz+pg10.6為例;
#解壓 tar xf pg_hint_plan-REL10_1_3_3.tar.gz chown -R postgres:postgres pg_hint_plan-REL10_1_3_3 #編譯安裝 su – postgres export PATH=/opt/pgsql/bin:$PATH cd pg_hint_plan-REL10_1_3_3 gmake clean gmake gmake install #編譯后日志 /usr/bin/mkdir -p '/opt/pgsql/share/extension' /usr/bin/mkdir -p '/opt/pgsql/share/extension' /usr/bin/mkdir -p '/opt/pgsql/lib' /usr/bin/install -c -m 644 .//pg_hint_plan.control '/opt/pgsql/share/extension/' /usr/bin/install -c -m 644 .//pg_hint_plan--*.sql '/opt/pgsql/share/extension/' /usr/bin/install -c -m 755 pg_hint_plan.so '/opt/pgsql/lib/' #修改配置,創建插件 在postgresql.conf的shared_preload_libraries配置項加入pg_hint_plan shared_preload_libraries = 'pg_stat_statements,pg_jieba,rum,pg_pathman,pg_bigm,pg_hint_plan' #重啟PG /opt/pgsql/bin/pg_ctl -D /pgsql/data_10 restart #登陸PG,創建插件 postgres=# create extension pg_hint_plan ; CREATE EXTENSION |
驗證
#建表造數據 create table test11(id int); insert into test11 select * from generate_series(1,10); create index idx_test11 on test11 (id); #驗證 postgres=# explain select * from test11 where id=1; QUERY PLAN ------------------------------------------------------ Seq Scan on test11 (cost=0.00..1.12 rows=1 width=4) Filter: (id = 1) 使用hint: postgres=# /*+ IndexScan(test11) */ explain select * from test11 where id=1; QUERY PLAN ------------------------------------------------------------------------- Index Scan using idx_test11 on test11 (cost=0.14..8.15 rows=1 width=4) Index Cond: (id = 1) |
支持的hint參數
Format:(/*+ SeqScan(table) */) |
Description |
SeqScan(table) |
Forces sequential scan on the table |
TidScan(table) |
Forces TID scan on the table. |
IndexScan(table[ index...]) |
Forces index scan on the table. Restricts to specified indexes if any. |
IndexOnlyScan(table[ index...]) |
Forces index only scan on the table. Rstricts to specfied indexes if any. Index scan may be used if index only scan is not available. Available for PostgreSQL 9.2 and later. |
BitmapScan(table[ index...]) |
Forces bitmap scan on the table. Restoricts to specfied indexes if any. |
NoSeqScan(table) |
Forces not to do sequential scan on the table. |
NoTidScan(table) |
Forces not to do TID scan on the table. |
NoIndexScan(table) |
Forces not to do index scan and index only scan (For PostgreSQL 9.2 and later) on the table. |
NoIndexOnlyScan(table) |
Forces not to do index only scan on the table. Available for PostgreSQL 9.2 and later. |
NoBitmapScan(table) |
Forces not to do bitmap scan on the table. |
NestLoop(table table[ table...]) |
Forces nested loop for the joins consist of the specifiled tables. |
HashJoin(table table[ table...]) |
Forces hash join for the joins consist of the specifiled tables. |
MergeJoin(table table[ table...]) |
Forces merge join for the joins consist of the specifiled tables. |
NoNestLoop(table table[ table...]) |
Forces not to do nested loop for the joins consist of the specifiled tables. |
NoHashJoin(table table[ table...]) |
Forces not to do hash join for the joins consist of the specifiled tables. |
NoMergeJoin(table table[ table...]) |
Forces not to do merge join for the joins consist of the specifiled tables. |
Leading(table table[ table...]) |
Forces join order as specified. |
Leading() |
Forces join order and directions as specified. A join pair is a pair of tables and/or other join pairs enclosed by parentheses, which can make a nested structure. |
Rows(table table[ table...] correction) |
Corrects row number of a result of the joins consist of the specfied tables. The available correction methods are absolute (#), addition (+), subtract (-) and multiplication (*). should be a string that strtod() can read. |
Set(GUC-param value) |
Set the GUC parameter to the value while planner is running. |
實戰效果
原始SQL: explain (ANALYZE true,buffers true) WITH A AS ( SELECT RESULT.x1 FROM T1 TASK LEFT JOIN T2 RELATE ON RELATE.x2 = TASK. ID LEFT JOIN t3 RESULT ON RELATE.x3 = RESULT.x4 WHERE 1 = 1 AND TASK.ID = 'xxxxxxxxxx' AND RESULT.DATA_SOURCE IN ( 'P12', 'P14', 'P02', 'P17', 'P26' ) AND RESULT.CROWD_NAME IS NOT NULL ) SELECT COUNT( A.entity_id ) AS cnt FROM A; 對應執行計划: Aggregate (cost=113379.07..113379.08 rows=1 width=8) (actual time=2772.901..2772.901 rows=1 loops=1) Buffers: shared hit=719635, temp written=2607 CTE a -> Nested Loop (cost=0.57..113105.31 rows=12167 width=4) (actual time=0.050..2077.828 rows=1525996 loops=1) Buffers: shared hit=719635 -> Index Only Scan using idx_zdr_arrive_local_task_relate_code_task_id on zdr_arrive_local_task_relate relate (cost=0.14..3.33 rows=1 width=18) (actual time=0.010..0.012 rows=1 loops=1) Index Cond: (task_id = 'xxxxxx'::text) Heap Fetches: 1 Buffers: shared hit=3 -> Index Scan using zdr_arrive_local_cache_datasource_crowd_id on zdr_arrive_local_cache result (cost=0.43..112493.64 rows=60834 width=12) (actual time=0.037..1876.879 rows=1525996 loops=1) Index Cond: (((crowd_id)::text = (relate.relate_code)::text) AND ((data_source)::text = ANY ('{P12,P14,P02,P17,P26}'::text[]))) Filter: (crowd_name IS NOT NULL) Buffers: shared hit=719632 -> CTE Scan on a (cost=0.00..243.34 rows=12167 width=4) (actual time=0.051..2566.068 rows=1525996 loops=1) Buffers: shared hit=719635, temp written=2607 Planning time: 0.423 ms Execution time: 2778.586 ms
|
問題:
標紅處可以看到調用索引的情況下,返回了 1525996 條數據(該表總數據量兩千萬),明顯返回這么大數據集的情況下,走索引造成的隨機IO相當驚人,優化可以從耗時長的那一步入手,可以嘗試調用順序掃描。此索引有用於其他SQL不能被刪除。
優化后SQL:去掉不必要的表連接,改用順序掃描 /*+ noindexscan(result) */ explain (ANALYZE true,buffers true) WITH A AS ( Xxxxxx
對應執行計划:
Aggregate (cost=489540.59..489540.60 rows=1 width=8) (actual time=1134.751..1134.751 rows=1 loops=1) Buffers: shared hit=472204, temp written=2667 CTE a -> Gather (cost=3.35..489266.83 rows=12167 width=4) (actual time=0.458..512.064 rows=1560819 loops=1) Workers Planned: 6 Workers Launched: 6 Buffers: shared hit=472204 -> Hash Join (cost=3.35..489266.83 rows=2028 width=4) (actual time=0.340..667.536 rows=222974 loops=7) Hash Cond: ((result.crowd_id)::text = (relate.relate_code)::text) Buffers: shared hit=472204 -> Parallel Seq Scan on zdr_arrive_local_cache result (cost=0.00..488458.18 rows=209339 width=12) (actual time=0.153..599.215 rows=227961 loops=7) Filter: ((crowd_name IS NOT NULL) AND ((data_source)::text = ANY ('{P12,P14,P02,P17,P26}'::text[]))) Rows Removed by Filter: 734658 Buffers: shared hit=471811 -> Hash (cost=3.33..3.33 rows=1 width=18) (actual time=0.069..0.069 rows=1 loops=7) Buckets: 1024 Batches: 1 Memory Usage: 9kB Buffers: shared hit=111 -> Index Only Scan using idx_zdr_arrive_local_task_relate_code_task_id on zdr_arrive_local_task_relate relate (cost=0.14..3.33 rows=1 width=18) (actual time=0.066..0.067 rows=1 loops=7) Index Cond: (task_id = '76e7b381-1f86-47d7-a2f3-deac63c2e75f'::text) Heap Fetches: 1 Buffers: shared hit=111 -> CTE Scan on a (cost=0.00..243.34 rows=12167 width=4) (actual time=0.459..951.253 rows=1560819 loops=1) Buffers: shared hit=472204, temp written=2667 Planning time: 0.490 ms Execution time: 1179.083 ms |
性能提升2.35倍