postgresql-fillfactor

fillfactor

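fillfactor is a storage parameter that can be specified when a table or index is created. It sets the percentage of each page that inserts may fill; the rest of the page is kept as free space. The default is 100 for tables and 90 for indexes.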

table (default 100)

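A table's fillfactor is a percentage between 10 and 100; 100 (complete packing) is the default. With a smaller fillfactor, INSERT operations fill table pages only to the indicated percentage, and the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated version of a row on the same page as the original, which is more efficient than placing it on a different page. For a table that is rarely updated, 100 is the best choice; for a heavily updated table, a smaller value is more appropriate. This parameter has no effect on TOAST tables.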
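The effect on table size can be approximated with a little arithmetic. The following is a rough sketch, not PostgreSQL's actual space accounting: the 8192-byte page, 24-byte page header and 4-byte line pointer are the usual defaults, while the 48-byte row size is an assumption chosen to resemble the test rows used later in this article.

```python
import math

PAGE_SIZE = 8192     # default PostgreSQL block size, in bytes
PAGE_HEADER = 24     # heap page header (assumed default layout)
LINE_POINTER = 4     # per-row item pointer stored on the page


def rows_per_page(row_bytes, fillfactor):
    """Rows that INSERT puts on one page before moving to the next page."""
    reserve = PAGE_SIZE * (100 - fillfactor) // 100  # space kept free for updates
    usable = PAGE_SIZE - PAGE_HEADER - reserve
    return usable // (row_bytes + LINE_POINTER)


def estimate_pages(n_rows, row_bytes, fillfactor):
    """Approximate number of heap pages needed for n_rows uniform rows."""
    return math.ceil(n_rows / rows_per_page(row_bytes, fillfactor))


# Assumed row size: 24-byte tuple header + 4-byte int + short string, aligned to 48.
ROW_BYTES = 48
pages_100 = estimate_pages(1_000_000, ROW_BYTES, 100)
pages_80 = estimate_pages(1_000_000, ROW_BYTES, 80)
```

Under these assumptions one million rows need about 6370 pages at fillfactor=100 and about 8000 pages at fillfactor=80, close to the relpages values of 6369 and 7999 measured in the verification section of this article.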

index (fillfactor default 90)

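An index's fillfactor is a percentage that determines how full the index method will try to pack index pages. For B-trees, leaf pages are filled to this percentage during the initial index build. The B-tree default is 90, and any integer from 10 to 100 can be set. If the table is static, a fillfactor of 100 is best because the index then takes the least space. For heavily updated tables, a smaller value helps minimize page splits.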
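A similar back-of-the-envelope estimate works for the leaf level of a B-tree. This is only a sketch under stated assumptions: a single int4 key, 16 bytes per index entry plus a 4-byte line pointer, a 24-byte page header and 16 bytes of B-tree special space. It ignores inner pages, the meta page and high keys, so it slightly underestimates the real size.

```python
import math

PAGE_SIZE = 8192      # default block size, in bytes
PAGE_HEADER = 24      # page header (assumed default layout)
BTREE_SPECIAL = 16    # B-tree bookkeeping at the end of each page (assumption)
ENTRY_BYTES = 16 + 4  # aligned int4 index entry + line pointer (assumption)


def leaf_entries_per_page(fillfactor):
    """Index entries packed into one leaf page at the given fillfactor."""
    reserve = PAGE_SIZE * (100 - fillfactor) // 100
    usable = PAGE_SIZE - PAGE_HEADER - BTREE_SPECIAL - reserve
    return usable // ENTRY_BYTES


def leaf_pages(n_entries, fillfactor):
    """Approximate number of leaf pages for n_entries keys."""
    return math.ceil(n_entries / leaf_entries_per_page(fillfactor))


def size_mib(pages):
    """Size of the given number of pages in MiB."""
    return pages * PAGE_SIZE / (1024 * 1024)


default_leaf_size = size_mib(leaf_pages(1_000_000, 90))
packed_leaf_size = size_mib(leaf_pages(1_000_000, 100))
```

For one million keys at the default fillfactor of 90 this gives roughly 21 MiB, in line with the 21 MB that pg_relation_size reports for both primary keys in the verification section of this article. At fillfactor=100 the same estimate drops to about 19 MiB.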

Verification

--Create table test_fill_1 with fillfactor=100
abase=# create table test_fill_1(n_id int,c_xm varchar(300)) with (fillfactor=100);
CREATE TABLE
--Create table test_fill_2 with fillfactor=80
abase=# create table test_fill_2(n_id int,c_xm varchar(300)) with (fillfactor=80);
CREATE TABLE
--Add primary keys
abase=#  alter table test_fill_1 add primary key(n_id);
ALTER TABLE

abase=# alter table test_fill_2 add primary key(n_id);
ALTER TABLE

--Load the initial data
abase=# insert into test_fill_1 select generate_series(1,1000000),'zhangsan'||generate_series(1,1000000);
INSERT 0 1000000
Time: 7067.047 ms
abase=# insert into test_fill_2 select generate_series(1,1000000),'zhangsan'||generate_series(1,1000000);
INSERT 0 1000000
Time: 6849.234 ms

--Vacuum and analyze the tables
postgres=# vacuum analyze test_fill_1;
VACUUM
postgres=# vacuum analyze test_fill_2;
--Check the number of pages in each table
abase=# select relpages,reltuples from pg_class where relname = 'test_fill_1';
 relpages | reltuples 
----------+-----------
     6369 |     1e+06
(1 row)

abase=#  select relpages,reltuples from pg_class where relname = 'test_fill_2';
 relpages | reltuples 
----------+-----------
     7999 |     1e+06
(1 row)


--Show the table definitions
abase=# \d+ test_fill_1;
                             Table "public.test_fill_1"
 Column |          Type          | Modifiers | Storage  | Stats target | Description 
--------+------------------------+-----------+----------+--------------+-------------
 n_id   | integer                |           | plain    |              | 
 c_xm   | character varying(300) |           | extended |              | 
Options: fillfactor=100

abase=# \d+ test_fill_2;
                             Table "public.test_fill_2"
 Column |          Type          | Modifiers | Storage  | Stats target | Description 
--------+------------------------+-----------+----------+--------------+-------------
 n_id   | integer                |           | plain    |              | 
 c_xm   | character varying(300) |           | extended |              | 
Options: fillfactor=80

--Check the table sizes: the larger the fillfactor, the less space the table occupies
abase=#  select pg_size_pretty(pg_relation_size('test_fill_1'));
 pg_size_pretty 
----------------
 50 MB
(1 row)

Time: 1.251 ms
abase=# select pg_size_pretty(pg_relation_size('test_fill_2'));
 pg_size_pretty 
----------------
 62 MB
(1 row)

Time: 0.995 ms


--Check the primary key index sizes
abase=# select pg_size_pretty(pg_relation_size('test_fill_1_pkey'));
 pg_size_pretty 
----------------
 21 MB
(1 row)

abase=# select pg_size_pretty(pg_relation_size('test_fill_2_pkey'));
 pg_size_pretty 
----------------
 21 MB
(1 row)

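The initial load times differ only slightly, with fillfactor=80 marginally faster in this run. The table with fillfactor=80 occupies more space (7999 pages and 62 MB, versus 6369 pages and 50 MB). The two primary key indexes are the same size: the table's fillfactor does not apply to its indexes, and both indexes were created with the default index fillfactor.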

Updating data

--1. Update test_fill_1
abase=# select ctid,* from test_fill_1 where n_id =1;
 ctid  | n_id |   c_xm    
-------+------+-----------
 (0,1) |    1 | zhangsan1
(1 row)

abase=# update test_fill_1 set c_xm='李四' where n_id = 1;
UPDATE 1

--test_fill_1 has fillfactor=100, so after the update the new row version was placed on the last page, at ctid (6368,74)
abase=# select ctid,* from test_fill_1 where n_id =1;
   ctid    | n_id | c_xm 
-----------+------+------
 (6368,74) |    1 | 李四
(1 row)

--2. Update test_fill_2
abase=# select ctid,* from test_fill_2 where n_id =1;
 ctid  | n_id |   c_xm    
-------+------+-----------
 (0,1) |    1 | zhangsan1
(1 row)

Time: 1.392 ms
abase=#  update test_fill_2 set c_xm='李四' where n_id = 1;
UPDATE 1

--test_fill_2 has fillfactor=80, so 20% of each page is still available; after the update the new row version stays on the first page, at ctid (0,149)
abase=# select ctid,* from test_fill_2 where n_id =1;
  ctid   | n_id | c_xm 
---------+------+------
 (0,149) |    1 | 李四
(1 row)

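With fillfactor=100, the update places the new row version on the last page of the table.

With fillfactor=80, the new row version is placed on the same page as the old one.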
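The placement behaviour observed above can be imitated with a toy model. This is a deliberately simplified sketch, not PostgreSQL's real free-space logic: the class `ToyHeap`, the fixed number of slots per page and the "same page first, otherwise the last page" rule are all assumptions made for illustration.

```python
class ToyHeap:
    """Toy heap: each page is a list of row versions with a fixed slot count."""

    def __init__(self, slots_per_page, fillfactor):
        self.slots_per_page = slots_per_page
        # INSERT stops filling a page once it holds this many rows.
        self.insert_limit = slots_per_page * fillfactor // 100
        self.pages = [[]]
        self.current = {}  # row id -> (page number, slot) of the live version

    def insert(self, row_id):
        if len(self.pages[-1]) >= self.insert_limit:
            self.pages.append([])
        self._place(row_id, len(self.pages) - 1)

    def update(self, row_id):
        page, _ = self.current[row_id]
        if len(self.pages[page]) >= self.slots_per_page:
            # No room next to the old version: fall back to the last page,
            # extending the table if that page is full as well.
            page = len(self.pages) - 1
            if len(self.pages[page]) >= self.slots_per_page:
                self.pages.append([])
                page += 1
        self._place(row_id, page)

    def _place(self, row_id, page):
        self.pages[page].append(row_id)
        self.current[row_id] = (page, len(self.pages[page]))  # slots are 1-based


packed = ToyHeap(slots_per_page=10, fillfactor=100)
roomy = ToyHeap(slots_per_page=10, fillfactor=80)
for row_id in range(1, 1001):
    packed.insert(row_id)
    roomy.insert(row_id)

packed.update(1)  # page 0 is full, so the new version goes to the end
roomy.update(1)   # page 0 still has two free slots
```

In this model the updated row of the fully packed table moves from page 0 to a new page at the end, while in the table loaded at 80% it stays on page 0 and only its slot number changes, mirroring the ctid values seen in the session above.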

Update performance

--To make the effect more visible, turn autovacuum off first
--The timings below suggest that updates are faster with the smaller fillfactor
--Update all rows of test_fill_1
abase=# update test_fill_1 set c_xm = c_xm||'x';
UPDATE 1000000
Time: 13035.901 ms
--Update all rows of test_fill_2
abase=#  update test_fill_2 set c_xm = c_xm||'x';
UPDATE 1000000
Time: 10162.411 ms

--Update all rows again
abase=#  update test_fill_1 set c_xm = c_xm||'y';
UPDATE 1000000
Time: 11741.058 ms
abase=# update test_fill_2 set c_xm = c_xm||'y';
UPDATE 1000000
Time: 10838.738 ms

abase=# update test_fill_1 set c_xm = c_xm||'z';
UPDATE 1000000
Time: 14763.844 ms
abase=# update test_fill_2 set c_xm = c_xm||'z';
UPDATE 1000000
Time: 9392.977 ms
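--Across the three full-table updates, the table with fillfactor=80 took about 10 s each time, versus roughly 12 to 15 s for the table with fillfactor=100.

--After several updates, the fillfactor=80 table still reuses the space of old row versions in its earlier pages.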
abase=# select ctid,*from test_fill_1 limit 100;
    ctid    |  n_id  |         c_xm          
------------+--------+-----------------------
 (6369,155) |      1 | zhangsan1xyzxxy
 (6369,156) |      2 | zhangsan2xyzxxy
 (6369,157) |      3 | zhangsan3xyzxxy
 (6369,158) |      4 | zhangsan4xyzxxy
 (6369,159) |      5 | zhangsan5xyzxxy
 (6369,160) |      6 | zhangsan6xyzxxy
 (6369,161) |      7 | zhangsan7xyzxxy
 (6369,162) |      8 | zhangsan8xyzxxy
 (6369,163) |      9 | zhangsan9xyzxxy
 (6369,164) |     10 | zhangsan10xyzxxy
 (6369,165) |     11 | zhangsan11xyzxxy
 (6369,166) |     12 | zhangsan12xyzxxy
 (6369,167) |     13 | zhangsan13xyzxxy
abase=# select ctid,*from test_fill_2 limit 100;
  ctid   | n_id |        c_xm         
---------+------+---------------------
 (0,158) |    1 | zhangsan1xyzxxy
 (0,159) |    2 | zhangsan2xyzxxy
 (0,160) |    3 | zhangsan3xyzxxy
 (0,161) |    4 | zhangsan4xyzxxy
 (0,162) |    5 | zhangsan5xyzxxy
 (0,163) |    6 | zhangsan6xyzxxy
 (0,164) |    7 | zhangsan7xyzxxy
 (0,165) |    8 | zhangsan8xyzxxy
 (0,166) |    9 | zhangsan9xyzxxy
 (0,167) |   10 | zhangsan10xyzxxy
 (0,168) |   11 | zhangsan11xyzxxy
 (0,169) |   12 | zhangsan12xyzxxy
 (0,170) |   13 | zhangsan13xyzxxy
 (0,171) |   14 | zhangsan14xyzxxy
 (0,172) |   15 | zhangsan15xyzxxy

--Updating a small range of rows: the advantage of test_fill_2 is clear
abase=# update test_fill_1 set c_xm = c_xm||'xx' where n_id>1000 and n_id <2000;
UPDATE 999
Time: 28.306 ms
abase=# update test_fill_2 set c_xm = c_xm||'xx' where n_id>1000 and n_id <2000;
UPDATE 999
Time: 13.577 ms

For full-table updates, fillfactor=80 is faster; when only a small number of rows is updated, the benefit of the lower fillfactor is even more pronounced.
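To put numbers on this comparison, the timings from the session above can be averaged and compared. The sketch below only does arithmetic on the figures already reported; they come from single runs on one machine, so the ratios are indicative rather than a benchmark.

```python
from statistics import mean

# Full-table UPDATE timings in milliseconds, copied from the session above.
full_update_ms = {
    "test_fill_1": [13035.901, 11741.058, 14763.844],  # fillfactor=100
    "test_fill_2": [10162.411, 10838.738, 9392.977],   # fillfactor=80
}
# UPDATE of the 999 rows with 1000 < n_id < 2000, in milliseconds.
range_update_ms = {"test_fill_1": 28.306, "test_fill_2": 13.577}

avg_100 = mean(full_update_ms["test_fill_1"])
avg_80 = mean(full_update_ms["test_fill_2"])

# How many times longer the fillfactor=100 table took.
full_ratio = avg_100 / avg_80
range_ratio = range_update_ms["test_fill_1"] / range_update_ms["test_fill_2"]

print(f"full-table update: {avg_100:.0f} ms vs {avg_80:.0f} ms ({full_ratio:.2f}x)")
print(f"range update:      {range_ratio:.2f}x")
```

On these figures the full-table update took about 1.3 times as long with fillfactor=100, and the small range update about 2.1 times as long.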

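After further updates, the fillfactor=80 table still places new row versions in its earlier pages, reusing the space of old row versions. (Even without autovacuum, an update can reuse the space held by rows that are marked as deleted.)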

Index size after the updates

--The index on test_fill_1 is larger
postgres=#  select pg_size_pretty(pg_relation_size('test_fill_1_pkey'));
 pg_size_pretty 
----------------
 66 MB
(1 row)

Time: 0.870 ms
postgres=#  select pg_size_pretty(pg_relation_size('test_fill_2_pkey'));
 pg_size_pretty 
----------------
 43 MB
(1 row)

Time: 0.782 ms

The index on the fillfactor=80 table is smaller than the index on the fillfactor=100 table, which bloats faster. Why is that? Read on.

Heap-Only Tuples (HOT)

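Because PostgreSQL uses multi-version concurrency control (MVCC), updating a row does not actually delete the old row; a new row version is inserted instead. If the table has an index, then even when the updated column is not an index key, the physical location of the new version has changed, so the index still has to be updated, which hurts performance. To address this, PostgreSQL 8.3 introduced a technique called Heap-Only Tuple, or HOT. With HOT, if no indexed column was changed and the new row version is in the same data block as the old one, the old version carries a pointer to the new version, so the index does not need to be updated. When the row is reached through the index, this pointer is followed to find the new version.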

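HOT is illustrated in the figure below. The table has one index, in which "index entry n" points to row 3 of the data block.

[Figure: 圖片1.png]

After row 3 is updated, thanks to HOT the index entry still points to the old version (row 3), and the old version of row 3 holds a pointer to the new version (row 6), as shown below:

[Figure: 圖片2.png]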

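Note that if the new row version does not fit in the original data block, HOT cannot be used: the row-to-row pointers used by HOT only work within a single data block and cannot cross blocks. To benefit from HOT, each data block should therefore keep enough free space, which is done by setting the table's fillfactor to a suitable value.
The fillfactor parameter means that, when data is inserted, no more rows are added to a block once its space usage reaches this percentage. The default is 100, meaning no space is reserved and each block is filled completely.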

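When a page has free space, an update can be performed as a HOT update: the data changes within a single page and the index is left untouched. This is why, after the updates above, the index on the fillfactor=80 table is smaller than the index on the fillfactor=100 table.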
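The pointer-following idea behind HOT can be sketched as a small data structure. This is a toy model under stated assumptions, not PostgreSQL's page format: `ToyPage`, `RowVersion`, `hot_update` and `fetch` are hypothetical names, and the index is reduced to a dictionary from key to slot number.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RowVersion:
    value: str
    next_slot: Optional[int] = None  # newer version on the same page, if any


@dataclass
class ToyPage:
    capacity: int
    slots: list = field(default_factory=list)

    def add(self, value):
        """Store a row version; return its 1-based slot, or None if the page is full."""
        if len(self.slots) >= self.capacity:
            return None
        self.slots.append(RowVersion(value))
        return len(self.slots)


def hot_update(page, slot, new_value):
    """Chain a new version behind the row at `slot`; False if the page has no room."""
    while page.slots[slot - 1].next_slot is not None:
        slot = page.slots[slot - 1].next_slot  # walk to the newest version
    new_slot = page.add(new_value)
    if new_slot is None:
        return False  # caller must use another page and add an index entry
    page.slots[slot - 1].next_slot = new_slot
    return True


def fetch(page, slot):
    """Start at the slot the index points to and follow the chain to the end."""
    version = page.slots[slot - 1]
    while version.next_slot is not None:
        version = page.slots[version.next_slot - 1]
    return version.value


page = ToyPage(capacity=6)
for i in range(1, 6):
    page.add(f"row {i}")
index = {"n": 3}  # "index entry n" points to row 3

first = hot_update(page, index["n"], "row 3 (new)")     # fits: stored in slot 6
second = hot_update(page, index["n"], "row 3 (newer)")  # page is now full
```

The first update succeeds without touching `index`, and a lookup through slot 3 returns the new value. The second update fails because the page is full, which is the case where a real update has to go to another page and add a new index entry.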

Summary

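1. Initial load times differ little. The table with fillfactor=80 occupies more space, and the indexes are the same size right after loading.

2. With fillfactor=100, an update places the new row version on the last page; with fillfactor=80, it is placed in the free space of the current page.

3. For frequently updated tables, a suitable fillfactor can improve update performance and, thanks to HOT, reduce index bloat.

4. For full-table updates, fillfactor=80 is faster; when only a small number of rows is updated, the benefit of the lower fillfactor is even more pronounced.

5. After repeated updates, the fillfactor=80 table still reuses the space of old row versions in its earlier pages. (If autovacuum has not cleaned up in time, updates can still reuse the space held by rows marked as deleted.)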

Reference: https://my.oschina.net/207miner/blog/2994857
