PostgreSQL SQL query performance optimization


A five-table join query runs slowly in PostgreSQL. This note tracks down the cause.

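Environment
Five foreign tables are involved. The table read inside the WITH clause (the temporary table) is large, about 200 million rows in total, and uses a time-series model to speed up queries.
The other four tables, which sit on the right side of the joins, are very small: the largest has no more than 10,000 rows.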

Running explain on a PostgreSQL instance that has had no tuning at all shows an execution plan that contains many nested loop joins:

 Aggregate  (cost=99723528.30..99723528.31 rows=1 width=0)
   CTE f_acct_vchr_1_tmp
     ->  Foreign Scan on hdmp_pri5_fdm_f_acct_vchr vo_1  (cost=0.00..99722420.16 rows=1 width=1448)
           Filter: ((posting_dt >= '2015-12-01'::date) AND (posting_dt <= '2015-12-31'::date) AND (trans_no ~~ '301%'::text) AND (amt = 1000::double precision) AND ((posting_flg = 'Y'::text) OR (gl_acc_id = '99900'::text)))
           Foreign Namespace: hdmp_pri5_fdm.f_acct_vchr
   ->  Nested Loop Left Join  (cost=0.00..1108.15 rows=1 width=0)
         Join Filter: (vo.calc_trans_action = d3.trans_action_cd)
         ->  Nested Loop Left Join  (cost=0.00..902.53 rows=1 width=32)
               Join Filter: (vo.trans_action_cd = d2.trans_action_cd)
               ->  Nested Loop Left Join  (cost=0.00..696.92 rows=1 width=64)
                     Join Filter: (vo.fund_tnl_cd = f1.prod_cd)
                     ->  Nested Loop Left Join  (cost=0.00..360.10 rows=1 width=96)
                           Join Filter: (vo.calc_unit_id = u1.calc_unit_id)
                           ->  Nested Loop Left Join  (cost=0.00..352.15 rows=1 width=104)
                                 Join Filter: (vo.modl_id = d1.modl_id)
                                 ->  Nested Loop Left Join  (cost=0.00..336.84 rows=1 width=112)
                                       Join Filter: (vo.prod_cd = p.prod_cd)
                                       ->  CTE Scan on f_acct_vchr_1_tmp vo  (cost=0.00..0.02 rows=1 width=144)
                                       ->  Foreign Scan on d_prod p  (cost=0.00..336.22 rows=48 width=32)
                                             Filter: (eff_flg = 'Y'::text)
                                             Foreign Namespace: hdmp_pri5_fdm.d_prod
                                 ->  Foreign Scan on d_modl d1  (cost=0.00..13.36 rows=156 width=8)
                                       Foreign Namespace: hdmp_pri5_fdm.d_modl
                           ->  Foreign Scan on d_calc_unit u1  (cost=0.00..7.93 rows=1 width=8)
                                 Filter: (eff_flg = 'Y'::text)
                                 Foreign Namespace: hdmp_pri5_fdm.d_calc_unit
                     ->  Foreign Scan on d_prod f1  (cost=0.00..336.22 rows=48 width=32)
                           Filter: (eff_flg = 'Y'::text)
                           Foreign Namespace: hdmp_pri5_fdm.d_prod

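Analyzing this complex SQL further, the temporary table (the table inside WITH) returns 350 rows, which is not a lot but not trivial either.
If we drop one condition (amt = 1000.00) from the temporary table's WHERE clause so that its result set grows to 8,700 rows, and then run explain again, the plan changes to the following: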

 Aggregate  (cost=99723547.48..99723547.49 rows=1 width=0)
   CTE f_acct_vchr_1_tmp
     ->  Foreign Scan on hdmp_pri5_fdm_f_acct_vchr vo_1  (cost=0.00..99722428.03 rows=127 width=1448)
           Filter: ((posting_dt >= '2015-12-01'::date) AND (posting_dt <= '2015-12-31'::date) AND (trans_no ~~ '301%'::text) AND ((posting_flg = 'Y'::text) OR (gl_acc_id = '99900'::text)))
           Foreign Namespace: hdmp_pri5_fdm.f_acct_vchr
   ->  Hash Left Join  (cost=771.19..1119.14 rows=127 width=0)
         Hash Cond: (vo.fund_tnl_cd = f1.prod_cd)
         ->  Nested Loop Left Join  (cost=434.36..780.90 rows=127 width=32)
               Join Filter: (vo.calc_unit_id = u1.calc_unit_id)
               ->  Hash Right Join  (cost=434.36..771.07 rows=127 width=40)
                     Hash Cond: (p.prod_cd = vo.prod_cd)
                     ->  Foreign Scan on d_prod p  (cost=0.00..336.22 rows=48 width=32)
                           Filter: (eff_flg = 'Y'::text)
                           Foreign Namespace: hdmp_pri5_fdm.d_prod
                     ->  Hash  (cost=432.78..432.78 rows=127 width=72)
                           ->  Hash Left Join  (cost=226.27..432.78 rows=127 width=72)
                                 Hash Cond: (vo.calc_trans_action = d3.trans_action_cd)
                                 ->  Hash Right Join  (cost=20.65..226.20 rows=127 width=104)
                                       Hash Cond: (d2.trans_action_cd = vo.trans_action_cd)
                                       ->  Foreign Scan on d_trans_action d2  (cost=0.00..205.28 rows=27 width=32)
                                             Filter: (eff_flg = 'Y'::text)
                                             Foreign Namespace: hdmp_pri5_fdm.d_trans_action
                                       ->  Hash  (cost=19.06..19.06 rows=127 width=136)
                                             ->  Hash Right Join  (cost=4.13..19.06 rows=127 width=136)
                                                   Hash Cond: (d1.modl_id = vo.modl_id)
                                                   ->  Foreign Scan on d_modl d1  (cost=0.00..13.36 rows=156 width=8)
                                                         Foreign Namespace: hdmp_pri5_fdm.d_modl
                                                   ->  Hash  (cost=2.54..2.54 rows=127 width=144)
                                                         ->  CTE Scan on f_acct_vchr_1_tmp vo  (cost=0.00..2.54 rows=127 width=144)

There are fewer nested loop (NL) joins, and the query is correspondingly faster.

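Next, look at the right-side tables in the SQL.
A simple count on each of them shows that their result sets are all very small; the largest table has only 10,000 rows.
This explains why the query whose temporary table has only 350 rows is slower than the one whose temporary table has 8,700 rows.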

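In the first SQL, almost every join is an NL join, which has to access the right-side table again and again; with five tables joined at once, this is extremely slow.
In the second SQL, the temporary table's result set is larger at 8,700 rows (the planner's estimate for the CTE rises from rows=1 to rows=127), so the PostgreSQL query planner automatically switches most of the joins to hash joins and keeps only a few NL joins.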
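
The cost of repeatedly accessing the right-side table can be observed in a small local experiment. The following is only a sketch under stated assumptions: demo_vchr and demo_prod are hypothetical temp tables standing in for the temporary table and d_prod (they are not the article's foreign tables), and the enable_* switches are used purely to steer the planner toward one join method at a time. In the EXPLAIN ANALYZE output, compare the loops= value reported on the inner side of the nested loop with the one reported under the hash join.

```sql
BEGIN;

-- Hypothetical local stand-ins: 350 "voucher" rows and a 10,000-row dimension table.
CREATE TEMP TABLE demo_vchr AS
SELECT (g % 48)::text AS prod_cd
FROM generate_series(1, 350) AS g;

CREATE TEMP TABLE demo_prod AS
SELECT g::text AS prod_cd, 'Y'::text AS eff_flg
FROM generate_series(1, 10000) AS g;

ANALYZE demo_vchr;
ANALYZE demo_prod;

-- Steer the planner toward a nested loop join.
SET LOCAL enable_hashjoin = off;
SET LOCAL enable_mergejoin = off;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT count(1)
FROM demo_vchr vo
LEFT JOIN demo_prod p
       ON vo.prod_cd = p.prod_cd
      AND p.eff_flg = 'Y';

-- Steer the planner toward a hash join for the same query.
SET LOCAL enable_hashjoin = on;
SET LOCAL enable_nestloop = off;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
SELECT count(1)
FROM demo_vchr vo
LEFT JOIN demo_prod p
       ON vo.prod_cd = p.prod_cd
      AND p.eff_flg = 'Y';

-- Discard the temp tables and the SET LOCAL changes.
ROLLBACK;
```

With local tables the difference is small; with foreign tables each repeated access may be far more expensive, which is the situation described above.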

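From the way database joins work, a query like this should be fastest when every join is a hash join, because the temporary table's result set is not large and all of the right-side tables are small as well.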

The PostgreSQL command to turn off NL joins:

SET enable_nestloop = off;
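
A plain SET stays in effect for the rest of the session, so it also affects every later query on that connection. One possible way to limit the change to a single statement, shown here only as a sketch, is SET LOCAL inside a transaction. Note that according to the PostgreSQL documentation this setting discourages nested loop joins rather than forbidding them outright: the planner can still use one when no other join method is available.

```sql
-- Check the current value (the default is on).
SHOW enable_nestloop;

BEGIN;
-- Applies only until the end of this transaction.
SET LOCAL enable_nestloop = off;
SELECT current_setting('enable_nestloop');
-- Run the EXPLAIN or the slow query itself here.
COMMIT;

-- The session value is unchanged after the transaction ends.
SHOW enable_nestloop;

-- If the session-level SET was used instead, undo it with:
RESET enable_nestloop;
```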

With NL joins turned off, run explain again to see the plan:

 Aggregate  (cost=99723457.95..99723457.96 rows=1 width=0)
   CTE f_acct_vchr_1_tmp
     ->  Foreign Scan on hdmp_pri5_fdm_f_acct_vchr vo_1  (cost=0.00..99722420.16 rows=1 width=1448)
           Filter: ((posting_dt >= '2015-12-01'::date) AND (posting_dt <= '2015-12-31'::date) AND (trans_no ~~ '301%'::text) AND (amt = 1000::double precision) AND ((posting_flg = 'Y'::text) OR (gl_acc_id = '99900'::text)))
           Foreign Namespace: hdmp_pri5_fdm.f_acct_vchr
   ->  Hash Left Join  (cost=724.37..1037.79 rows=1 width=0)
         Hash Cond: (vo.calc_unit_id = (u1.calc_unit_id)::double precision)
         ->  Hash Right Join  (cost=716.42..1029.83 rows=1 width=8)
               Hash Cond: (f1.prod_cd = vo.fund_tnl_cd)
               ->  Foreign Scan on hdmp_pri5_fdm_d_prod f1  (cost=0.00..313.22 rows=48 width=32)
                     Filter: (eff_flg = 'Y'::text)
                     Foreign Namespace: hdmp_pri5_fdm.d_prod
               ->  Hash  (cost=716.41..716.41 rows=1 width=40)
                     ->  Hash Right Join  (cost=403.00..716.41 rows=1 width=40)
                           Hash Cond: (p.prod_cd = vo.prod_cd)
                           ->  Foreign Scan on hdmp_pri5_fdm_d_prod p  (cost=0.00..313.22 rows=48 width=32)
                                 Filter: (eff_flg = 'Y'::text)
                                 Foreign Namespace: hdmp_pri5_fdm.d_prod
                           ->  Hash  (cost=402.98..402.98 rows=1 width=72)
                                 ->  Hash Right Join  (cost=208.60..402.98 rows=1 width=72)
                                       Hash Cond: (d3.trans_action_cd = vo.calc_trans_action)
                                       ->  Foreign Scan on hdmp_pri5_fdm_d_trans_action d3  (cost=0.00..194.28 rows=27 width=32)
                                             Filter: (eff_flg = 'Y'::text)
                                             Foreign Namespace: hdmp_pri5_fdm.d_trans_action
                                       ->  Hash  (cost=208.58..208.58 rows=1 width=104)
                                             ->  Hash Right Join  (cost=14.20..208.58 rows=1 width=104)
                                                   Hash Cond: (d2.trans_action_cd = vo.trans_action_cd)
                                                   ->  Foreign Scan on hdmp_pri5_fdm_d_trans_action d2  (cost=0.00..194.28 rows=27 width=32)

Every join is now a hash join, and the query time drops from the original 120 s to 800 ms.
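
All of the plans above come from plain explain, which prints the planner's estimates only. The first plan estimates rows=1 for the temporary table, while the real result set is 350 rows. One way to see both numbers side by side is EXPLAIN ANALYZE, sketched below on the temporary table's own filter. This is an illustration, not part of the original investigation, and it assumes the foreign data wrapper reports actual row counts the way built-in scans do. Be aware that ANALYZE really executes the statement, so on a 200-million-row table it takes as long as the query itself.

```sql
-- Compare "rows=" (estimate) with "actual ... rows=" (measured) on the Foreign Scan node.
EXPLAIN (ANALYZE, VERBOSE)
SELECT *
FROM hdmp_pri5_fdm_f_acct_vchr vo
WHERE posting_dt >= '2015-12-01'
  AND posting_dt <= '2015-12-31'
  AND trans_no LIKE '301%'
  AND amt = 1000.00
  AND (posting_flg = 'Y' OR vo.gl_acc_id = '99900');
```

A large gap between the two values suggests that the join method was chosen from a misleading estimate rather than from the real data volume.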

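Summary
When a SQL query performs poorly, make good use of the explain command to locate the problem. In this case it shows that too many NL joins caused the performance problem.
SQL optimization is systematic work and sometimes takes careful observation. For example, the WITH clause also works well with PostgreSQL foreign tables and is worth studying when you have time.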
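
One point about WITH that may matter when repeating this on a newer server: the plans in this article show a separate CTE Scan node, meaning the WITH query is computed once and then scanned. Since PostgreSQL 12, a side-effect-free, non-recursive WITH query that is referenced only once is folded into the outer query by default, and the MATERIALIZED keyword requests the older behavior. The sketch below is an assumption-laden illustration (it requires PostgreSQL 12 or later and reuses the article's table names with a shortened column list and filter), not a statement about the version used here.

```sql
-- PostgreSQL 12+ only: force the WITH query to be computed separately,
-- so the plan keeps a CTE Scan node as in the plans shown in this article.
EXPLAIN
WITH f_acct_vchr_1_tmp AS MATERIALIZED (
    SELECT prod_cd
    FROM hdmp_pri5_fdm_f_acct_vchr
    WHERE posting_dt >= '2015-12-01'
      AND posting_dt <= '2015-12-31'
      AND trans_no LIKE '301%'
)
SELECT count(1)
FROM f_acct_vchr_1_tmp vo
LEFT JOIN d_prod p
       ON vo.prod_cd = p.prod_cd
      AND p.eff_flg = 'Y';
```

Writing NOT MATERIALIZED instead asks the planner to inline the WITH query, which can be compared against the plan above.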

************************************

First SQL statement: the temporary table's result set is 350 rows

explain WITH
    f_acct_vchr_1_tmp AS
    (
        SELECT
            *
        FROM
            hdmp_pri5_fdm_f_acct_vchr vo
        WHERE
            1=1
        AND posting_dt >= '2015-12-01'
        AND posting_dt <= '2015-12-31'
        AND trans_no LIKE '301%'
        and amt = 1000.00
        AND (
                posting_flg = 'Y'
            OR  vo.gl_acc_id = '99900')
            
                   
    )
SELECT
   count(1)
FROM
    F_ACCT_VCHR_1_tmp vo
LEFT JOIN
    d_prod p
ON
    vo.prod_cd=p.prod_cd
AND p.eff_flg = 'Y'
LEFT JOIN
    d_modl d1
ON
    vo.modl_id=d1.modl_id
LEFT JOIN
    d_calc_unit u1
ON
    vo.calc_unit_id=u1.calc_unit_id
AND u1.eff_flg = 'Y'
LEFT JOIN
    d_prod f1
ON
    vo.fund_tnl_cd=f1.prod_cd
AND f1.eff_flg = 'Y'
LEFT JOIN
    d_trans_action d2
ON
    vo.trans_action_cd=d2.trans_action_cd
AND d2.eff_flg = 'Y'
LEFT JOIN
    d_trans_action d3
ON
    vo.calc_trans_action=d3.trans_action_cd
AND d3.eff_flg = 'Y';


####################
Second SQL statement: the temporary table's result set is 8,700 rows

explain WITH
    f_acct_vchr_1_tmp AS
    (
        SELECT
            *
        FROM
            hdmp_pri5_fdm_f_acct_vchr vo
        WHERE
            1=1
        AND posting_dt >= '2015-12-01'
        AND posting_dt <= '2015-12-31'
        AND trans_no LIKE '301%'

        AND (
                posting_flg = 'Y'
            OR  vo.gl_acc_id = '99900')
            
                   
    )
SELECT
   count(1)
FROM
    F_ACCT_VCHR_1_tmp vo
LEFT JOIN
    d_prod p
ON
    vo.prod_cd=p.prod_cd
AND p.eff_flg = 'Y'
LEFT JOIN
    d_modl d1
ON
    vo.modl_id=d1.modl_id
LEFT JOIN
    d_calc_unit u1
ON
    vo.calc_unit_id=u1.calc_unit_id
AND u1.eff_flg = 'Y'
LEFT JOIN
    d_prod f1
ON
    vo.fund_tnl_cd=f1.prod_cd
AND f1.eff_flg = 'Y'
LEFT JOIN
    d_trans_action d2
ON
    vo.trans_action_cd=d2.trans_action_cd
AND d2.eff_flg = 'Y'
LEFT JOIN
    d_trans_action d3
ON
    vo.calc_trans_action=d3.trans_action_cd
AND d3.eff_flg = 'Y';

 

