effective_cache_size in PostgreSQL

Reposted article. Published 2021-03-26 08:29. Tags: pg, postgres

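effective_cache_size sets the planner's assumption about the effective size of the disk cache available to a single query. It is factored into the cost estimates for using an index: the higher the value, the more likely an index scan is chosen; the lower the value, the more likely a sequential scan is chosen.
When setting this parameter, consider both PostgreSQL's shared buffers and the part of the kernel's disk cache that will hold PostgreSQL data files, even though some data may be present in both. Also take into account the expected number of concurrent queries on different tables, since they have to share the available space.
The parameter has no effect on the shared memory allocated by PostgreSQL, and it does not reserve kernel disk cache. It is used for estimation purposes only. The system also does not assume that data stays in the disk cache between queries. The default is 4GB.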

If the value is given without a unit, it is interpreted as a number of blocks (typically 8kB each).

postgres=# select name, setting, unit from pg_settings where name like 'effective_cache_size';
         name         | setting | unit 
----------------------+---------+------
 effective_cache_size | 524288  | 8kB
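
As a quick check of the unit handling, the block count can be converted into a readable size. This is a minimal sketch that assumes the default 8kB block size reported in the unit column above; pg_size_pretty and current_setting are built-in functions.

```sql
-- 524288 blocks * 8192 bytes per block
SELECT pg_size_pretty(524288::bigint * 8192) AS from_literal;

-- Same conversion, reading the live settings instead of hard-coded numbers
SELECT pg_size_pretty(setting::bigint * current_setting('block_size')::bigint) AS from_settings
FROM pg_settings
WHERE name = 'effective_cache_size';
```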

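Cost estimation has to take many factors into account: the amount of I/O, the number of operator calls, the number of tuples processed, selectivity, and so on. But what does I/O cost? Clearly it makes a difference whether the data is already in cache or has to be read from disk.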

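The effective_cache_size parameter tells the optimizer how much cache the system can provide. "Cache" here does not mean only the memory PostgreSQL manages itself (shared_buffers); the operating system's file system cache counts as well. effective_cache_size should reflect the total of these caches. The following test setup shows its effect: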

postgres=# create table t_random as select id,random() as r from generate_series(1,1000000) as id order by random();
SELECT 1000000
postgres=# create table t_ordered as select id,random() as r from generate_series(1,1000000) AS id;
SELECT 1000000
postgres=# create index idx_random on t_random(id);
CREATE INDEX
postgres=# create index idx_ordered on t_ordered(id);
CREATE INDEX
postgres=# vacuum analyze t_random;
VACUUM
postgres=# vacuum analyze t_ordered;
VACUUM
postgres=#

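Both tables contain the same id values. In t_ordered the rows are stored on disk in id order; in t_random they are stored in random order.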

Now set effective_cache_size to a small value, so that the optimizer assumes very little memory is available for caching:

postgres=# set effective_cache_size to '1 MB';
SET
postgres=# show effective_cache_size;
 effective_cache_size 
----------------------
 1MB
(1 row)

postgres=# set enable_bitmapscan to on;
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Bitmap Heap Scan on t_random  (cost=19.71..2453.44 rows=940 width=12)
   Recheck Cond: (id < 1000)
   ->  Bitmap Index Scan on idx_random  (cost=0.00..19.48 rows=940 width=0)
         Index Cond: (id < 1000)
(4 rows)

postgres=# set enable_bitmapscan to off;
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_random on t_random  (cost=0.42..3732.86 rows=940 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

PostgreSQL would normally choose a bitmap scan here, but we want to see what happens to a plain index scan, so bitmap scans have been turned off.

postgres=# SET effective_cache_size TO '1000 GB';
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_random on t_random  (cost=0.42..3488.86 rows=940 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

As you can see, the estimated cost of the index scan has dropped, from 3732.86 to 3488.86.
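
The direction of this change can be reproduced by hand. PostgreSQL's planner estimates the number of heap pages an index scan fetches with the Mackert and Lohman approximation (see index_pages_fetched in costsize.c), where b is the number of cache pages assumed to be available. The sketch below evaluates that formula in plain SQL. The table size t = 5406 pages and the two values of b are illustrative assumptions, not figures taken from the session above.

```sql
-- t  = pages in the table (assumed)
-- ns = tuples fetched (row estimate from the plans above)
-- b  = cache pages assumed available (assumed: one small, one huge)
WITH params(t, ns, b) AS (
    VALUES (5406.0, 940.0, 85.0),
           (5406.0, 940.0, 1e9)
)
SELECT b,
       CASE
           WHEN t <= b THEN least(2*t*ns / (2*t + ns), t)    -- table fits in cache
           WHEN ns <= 2*t*b / (2*t - b) THEN 2*t*ns / (2*t + ns)
           ELSE b + (ns - 2*t*b / (2*t - b)) * (t - b) / t   -- cache overflows
       END AS pages_fetched
FROM params;
```

A smaller b yields more estimated page fetches, and those fetches are charged as random I/O, which is roughly why the plan is more expensive with 1MB than with 1000GB.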

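Costs must be seen as relative. The absolute numbers do not matter; what matters is how expensive one plan is compared with the alternatives.
If the cost of a sequential scan stays the same while the cost of an index scan drops relative to it, PostgreSQL will favor the index more often. This is exactly what effective_cache_size is about: the more RAM is available for caching, the more likely index scans become.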
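
To compare against the sequential scan directly, one option is to disable plain index scans for the session and look at the estimate that remains. This is a sketch using the standard enable_* planner settings; it assumes enable_bitmapscan is still off, as in the session above, and the resulting costs depend on the installation, so none are shown here.

```sql
-- Steer the planner away from the index scan for this session
SET enable_indexscan TO off;

-- The sequential scan estimate is expected to be identical in both cases
SET effective_cache_size TO '1 MB';
EXPLAIN SELECT * FROM t_random WHERE id < 1000;
SET effective_cache_size TO '1000 GB';
EXPLAIN SELECT * FROM t_random WHERE id < 1000;

-- Allow index scans again
RESET enable_indexscan;
```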

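When people discuss how to set effective_cache_size in postgresql.conf, they often overlook that the parameter does not work magic. In some cases it has no effect at all, as the following example shows: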

postgres=# set effective_cache_size to '1 MB';
SET
postgres=# explain SELECT * FROM t_ordered WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_ordered on t_ordered  (cost=0.42..38.85 rows=996 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=# SET effective_cache_size TO '1000 GB';
SET
postgres=# explain SELECT * FROM t_ordered WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_ordered on t_ordered  (cost=0.42..38.85 rows=996 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

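The table statistics used by the optimizer include information about physical "correlation". If the correlation is 1, that is, the rows are stored on disk in sorted order, effective_cache_size changes nothing.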
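
The correlation the optimizer sees can be read from the pg_stats view once the tables have been analyzed. A minimal sketch for the two test tables created earlier; the exact values depend on the random data, so no output is shown.

```sql
-- correlation close to 1 or -1: physical row order follows the column order
-- correlation close to 0: rows are scattered across the table
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename IN ('t_random', 't_ordered')
  AND attname = 'id';
```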

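The same holds if the table has only one column. The query can then be answered by an index-only scan, and the estimate again does not change: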

postgres=# ALTER TABLE t_random DROP COLUMN r;
ALTER TABLE
postgres=# SET effective_cache_size TO '1 MB';
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Index Only Scan using idx_random on t_random  (cost=0.42..28.88 rows=940 width=4)
   Index Cond: (id < 1000)
(2 rows)

postgres=# SET effective_cache_size TO '1000 GB';
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Index Only Scan using idx_random on t_random  (cost=0.42..28.88 rows=940 width=4)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

Tuning advice:

effective_cache_size = RAM * 0.7

On a server dedicated to PostgreSQL, RAM * 0.8 is also worth considering.
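
As an illustration of applying the rule of thumb, the sketch below assumes a hypothetical server with 16GB of RAM, so 16GB * 0.7 is roughly 11GB. ALTER SYSTEM writes the value to postgresql.auto.conf; a configuration reload is enough for this parameter, no restart is needed.

```sql
-- Assumed: 16 GB of RAM, so about 11 GB as the planner's cache estimate
ALTER SYSTEM SET effective_cache_size = '11GB';

-- Make running sessions pick up the new value
SELECT pg_reload_conf();

-- Verify in a session
SHOW effective_cache_size;
```

The same value can instead be set by editing postgresql.conf directly, or per session with SET when experimenting.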
