Why is an InnoDB ibd data file larger than the sum of data_length + index_length + data_free?


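Problem description:

While running mysqldump against jiradb, a colleague noticed that the dump file was only about 10MB, whereas the ibd files occupied about 100MB on disk.

Initially, we guessed that delete operations had left a lot of fragmented space on disk, and that secondary indexes were taking up a lot of space.

However, when we compared the sum data_length + index_length + data_free against the output of du, there was still a sizable gap.

Version: Server version: 5.6.48-log MySQL Community Server (GPL)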

 

Concepts:

data_length: the space occupied by the clustered index, in bytes

For MyISAM, DATA_LENGTH is the length of the data file, in bytes.
For InnoDB, DATA_LENGTH is the approximate amount of space allocated for the clustered index, in bytes. Specifically, it is the clustered index size, in pages, multiplied by the InnoDB page size.
Refer to the notes at the end of this section for information regarding other storage engines.

 

index_length: the space occupied by secondary indexes, in bytes

For MyISAM, INDEX_LENGTH is the length of the index file, in bytes.
For InnoDB, INDEX_LENGTH is the approximate amount of space allocated for non-clustered indexes, in bytes. Specifically, it is the sum of non-clustered index sizes, in pages, multiplied by the InnoDB page size.
Refer to the notes at the end of this section for information regarding other storage engines.

 

data_free: space that has been allocated but is not in use, in bytes

The number of allocated but unused bytes.
InnoDB tables report the free space of the tablespace to which the table belongs. For a table located in the shared tablespace, this is the free space of the shared tablespace. If you are using multiple tablespaces and the table has its own tablespace, the free space is for only that table. Free space means the number of bytes in completely free extents minus a safety margin. Even if free space displays as 0, it may be possible to insert rows as long as new extents need not be allocated.
For NDB Cluster, DATA_FREE shows the space allocated on disk for, but not used by, a Disk Data table or fragment on disk. (In-memory data resource usage is reported by the DATA_LENGTH column.)

Reference: https://dev.mysql.com/doc/refman/5.6/en/tables-table.html

 

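Analysis:

1. First, we spot-checked changeitem.ibd, the file occupying the most space. du reports that it uses 11268KB on disk.

2. After running analyze table changeitem, we queried information_schema.tables and got 2637824 + 589824 + 4194304 = 7,421,952 bytes = 7248KB, which is 4020KB less than what du reports.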

mysql> select data_length,index_length,data_free,table_name from information_schema.tables where table_name='changeitem';
+-------------+--------------+-----------+------------+
| data_length | index_length | data_free | table_name |
+-------------+--------------+-----------+------------+
|     2637824 |       589824 |   4194304 | changeitem |
+-------------+--------------+-----------+------------+
1 row in set (0.00 sec)
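The arithmetic in step 2 can be rechecked with a few lines of Python. This is only a sketch: the numbers are copied from the query output and the du figure above, and the variable names are mine.

```python
# Values copied from the information_schema.tables output above (bytes).
data_length = 2637824
index_length = 589824
data_free = 4194304

# Size that du reported for changeitem.ibd (KB).
du_kb = 11268

reported_bytes = data_length + index_length + data_free
reported_kb = reported_bytes // 1024
gap_kb = du_kb - reported_kb

print(f"reported by information_schema: {reported_bytes} bytes = {reported_kb} KB")
print(f"unexplained gap versus du: {gap_kb} KB")
```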

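3. Analyze the ibd file with the py_innodb_page_info tool. (The original author's version at https://code.google.com/archive/p/david-mysql-tools/ can no longer be found. I found another version on GitHub and made a small modification; it can be downloaded from https://github.com/johnliu2008/py_innodb_page_info.)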

[root@localhost ~]# python py_innodb_page_info/py_innodb_page_info.py changeitem.ibd
Total number of page: 704:
Freshly Allocated Page: 527
Insert Buffer Bitmap: 1
File Space Header: 1
B-tree Node: 174
File Segment inode: 1

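As shown, there are 174 B-tree Node pages; this count includes both clustered index pages and secondary index pages. The tool works by scanning the ibd file page by page and using each page's page_type value to decide what kind of page it is. The innodb_page_type dictionary it uses matches the definitions in storage/innobase/include/fil0fil.h in the source. (As an aside: 8.0 added the SDI page type, which 姜大's original tool does not support, so I added the SDI page type to the dictionary.)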
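The scanning idea described above can be sketched as follows. This is not the code of py_innodb_page_info. It is a minimal illustration that assumes 16KB pages and relies on the page type being stored as a 2-byte big-endian integer at offset 24 of each page header (FIL_PAGE_TYPE in fil0fil.h). The names count_page_types and make_page are hypothetical helpers, and the dictionary lists only the page types that appear in the output above.

```python
import struct
from collections import Counter

PAGE_SIZE = 16384          # assumed default innodb_page_size
FIL_PAGE_TYPE_OFFSET = 24  # offset of the 2-byte page type in the page header

# Subset of the page types defined in fil0fil.h.
PAGE_TYPES = {
    0: "Freshly Allocated Page",
    3: "File Segment inode",
    5: "Insert Buffer Bitmap",
    8: "File Space Header",
    17855: "B-tree Node",
}


def count_page_types(data, page_size=PAGE_SIZE):
    """Count pages by type in the raw bytes of a tablespace file."""
    counts = Counter()
    for offset in range(0, len(data) - page_size + 1, page_size):
        (page_type,) = struct.unpack_from(">H", data, offset + FIL_PAGE_TYPE_OFFSET)
        counts[PAGE_TYPES.get(page_type, f"Other ({page_type})")] += 1
    return counts


def make_page(page_type, page_size=PAGE_SIZE):
    """Build a synthetic, otherwise empty page for demonstration only."""
    page = bytearray(page_size)
    struct.pack_into(">H", page, FIL_PAGE_TYPE_OFFSET, page_type)
    return bytes(page)


# Synthetic input: 1 header page, 2 index pages, 3 unused pages.
demo = make_page(8) + make_page(17855) * 2 + make_page(0) * 3
for name, n in count_page_types(demo).items():
    print(f"{name}: {n}")
```

To try it on a real file, the bytes could be read with open("changeitem.ibd", "rb").read() and passed to count_page_types, ideally on a copy of the file taken while the server is not writing to it.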

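There are 527 Freshly Allocated Pages, about 8,634,368 bytes, which is much larger than the data_free value found in the previous step. From the official documentation's description of data_free (Free space means the number of bytes in completely free extents minus a safety margin.), we know that data_free covers only part of the free space. What made me curious: why is the data_free value of an InnoDB table always a multiple of 1MB? And what does the safety margin refer to? The source code analysis below answers both questions.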
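As a quick cross-check of the page counts, the following sketch multiplies them out, assuming the default 16KB page size. The counts are copied from the tool output above and data_free from the earlier query.

```python
PAGE_SIZE = 16384  # assumed 16KB pages

# Page counts copied from the py_innodb_page_info output above.
page_counts = {
    "Freshly Allocated Page": 527,
    "Insert Buffer Bitmap": 1,
    "File Space Header": 1,
    "B-tree Node": 174,
    "File Segment inode": 1,
}

total_pages = sum(page_counts.values())
file_kb = total_pages * PAGE_SIZE // 1024
free_page_bytes = page_counts["Freshly Allocated Page"] * PAGE_SIZE

data_free = 4194304  # bytes, from information_schema.tables

print(f"total pages: {total_pages} ({file_kb} KB)")
print(f"freshly allocated pages: {free_page_bytes} bytes")
print(f"free pages not covered by data_free: {free_page_bytes - data_free} bytes")
```

The 704 pages come to 11264KB, which is close to the 11268KB reported by du, so the page scan accounts for essentially the whole file.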

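4. Back to our initial guess: did delete operations leave a lot of fragmented space on disk?

a. From the official documentation's explanation of data_free, data_free is not the amount of fragmented space; it is space in completely free extents.

b. Experiments show that inserts alone also produce the phenomenon described in the title.

Therefore, the hypothesis that delete caused the fragmented space does not hold.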

How data_free is computed in the source:

In storage/innobase/handler/i_s.cc:i_s_files_table_fill(), avail_space = fsp_get_available_space_in_free_extents(space());

Next, look at the fsp_get_available_space_in_free_extents() function in storage/innobase/fsp/fsp0fsp.cc:

 

/** Calculate how many KiB of new data we will be able to insert to the
tablespace without running out of space.
@param[in]  space_id    tablespace ID
@return available space in KiB
@retval UINTMAX_MAX if unknown */
uintmax_t
fsp_get_available_space_in_free_extents(
    ulint   space_id)
{
    FilSpace    space(space_id);
    if (space() == NULL) {
        return(UINTMAX_MAX);
    }

    return(fsp_get_available_space_in_free_extents(space));
}

/** Calculate how many KiB of new data we will be able to insert to the
tablespace without running out of space. Start with a space object that has
been acquired by the caller who holds it for the calculation,
@param[in]  space       tablespace object from fil_space_acquire()
@return available space in KiB */
uintmax_t
fsp_get_available_space_in_free_extents(
// Author's note: the return value is in KiB, but the function name already
// suggests that the counting is done in whole extents (1MB each with 16KB
// pages). The return expression at the end confirms this.
    const fil_space_t*  space)
{
    ut_ad(space->n_pending_ops > 0);

    ulint   size_in_header = space->size_in_header;
    if (size_in_header < FSP_EXTENT_SIZE) {
        return(0);      /* TODO: count free frag pages and
                    return a value based on that */
    }

    /* Below we play safe when counting free extents above the free limit:
    some of them will contain extent descriptor pages, and therefore
    will not be free extents */
    ut_ad(size_in_header >= space->free_limit);
// Author's note: FSP_FREE_LIMIT is the smallest page number that has not been
// initialized yet. Pages from that one onward have not been added to the
// tablespace's FREE list. See http://mysql.taobao.org/monthly/2016/02/01/
    ulint   n_free_up =
        (size_in_header - space->free_limit) / FSP_EXTENT_SIZE;

    page_size_t page_size(space->flags);
    if (n_free_up > 0) {
        n_free_up--;
        n_free_up -= n_free_up / (page_size.physical()
                      / FSP_EXTENT_SIZE);
    }

    /* We reserve 1 extent + 0.5 % of the space size to undo logs
    and 1 extent + 0.5 % to cleaning operations; NOTE: this source
    code is duplicated in the function above! */
// Author's note: this reserve is the "safety margin" mentioned in the
// official documentation quoted above.

    ulint   reserve = 2 + ((size_in_header / FSP_EXTENT_SIZE) * 2) / 200;
    ulint   n_free = space->free_len + n_free_up;

    if (reserve > n_free) {
        return(0);
    }

    return(static_cast<uintmax_t>(n_free - reserve)
           * FSP_EXTENT_SIZE * (page_size.physical() / 1024));
// Author's note: n_free and reserve are ulint (unsigned integers) that count
// extents. FSP_EXTENT_SIZE is the number of pages per extent (64 with 16KB
// pages) and page_size.physical() / 1024 is the page size in KiB (16), so
// every extent contributes 64 * 16 = 1024 KiB. The result is therefore always
// a whole number of MB, which is why data_free is always a multiple of 1MB.
}
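To make the calculation easier to experiment with, here is a Python transcription of the second function above. It is a sketch rather than InnoDB code: it assumes an uncompressed tablespace with a page size of at most 16KB, so that an extent is 1MB. The function name available_kib is hypothetical. The inputs in the example (free_limit = 704 and free_len = 6) are also hypothetical values, chosen only because they reproduce the data_free of 4194304 seen earlier; the real header values of changeitem.ibd were not inspected.

```python
def available_kib(size_in_header, free_limit, free_len, page_size=16384):
    """Transcription of fsp_get_available_space_in_free_extents().

    size_in_header and free_limit are in pages, free_len is in extents.
    """
    extent_pages = (1024 * 1024) // page_size  # FSP_EXTENT_SIZE, in pages
    if size_in_header < extent_pages:
        return 0

    # Extents above the free limit, counted conservatively.
    n_free_up = (size_in_header - free_limit) // extent_pages
    if n_free_up > 0:
        n_free_up -= 1
        n_free_up -= n_free_up // (page_size // extent_pages)

    # The "safety margin": 2 extents plus 1% of the tablespace size.
    reserve = 2 + ((size_in_header // extent_pages) * 2) // 200
    n_free = free_len + n_free_up
    if reserve > n_free:
        return 0

    return (n_free - reserve) * extent_pages * (page_size // 1024)


# Hypothetical header values for a 704-page tablespace.
kib = available_kib(size_in_header=704, free_limit=704, free_len=6)
print(f"available: {kib} KiB = {kib * 1024} bytes")
```

Whatever inputs are used, the result is a whole number of extents times 1024 KiB, which matches the observation that data_free is always a multiple of 1MB.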

 

5. Searching turned up a Percona blog post on how to calculate the disk space used by an InnoDB table. It targets 5.7 and later and does not apply to 5.6, but the link is recorded here for reference: https://www.percona.com/blog/2016/01/26/finding_mysql_table_size_on_disk/

6. The official documentation at https://dev.mysql.com/doc/refman/5.6/en/innodb-file-per-table-tablespaces.html says that ibd files grow in increments of 4MB. In actual testing, however, when there is enough free space the ibd file grows in steps smaller than 4MB. Because InnoDB allocates disk space in units of extents, the step is always a multiple of 1MB:

The innodb_autoextend_increment variable, which defines the increment size for extending the size of an auto-extending system tablespace file when it becomes full, does not apply to file-per-table tablespace files, which are auto-extending regardless of the innodb_autoextend_increment setting. Initial file-per-table tablespace extensions are by small amounts, after which extensions occur in increments of 4MB.

So why is there so much free space when an ibd file auto-extends? Look at the fsp_reserve_free_extents() function in the source:

@param[in]  n_ext       number of extents to reserve
fsp_reserve_free_extents()
............. (several lines omitted) .............
switch (alloc_type) {
case FSP_NORMAL:
        /* We reserve 1 extent + 0.5 % of the space size to undo logs
        and 1 extent + 0.5 % to cleaning operations; NOTE: this source
        code is duplicated in the function below! */
        reserve = 2 + ((size / FSP_EXTENT_SIZE) * 2) / 200;
        if (n_free <= reserve + n_ext) {
            goto try_to_extend;
        }
        break;
// Author's note: fseg_create_general() and fseg_alloc_free_page_general() both
// call this function with n_ext = 2. Since
// reserve = 2 + ((size / FSP_EXTENT_SIZE) * 2) / 200, reserve is at least 2,
// so whenever n_free <= 2 + 2 the code jumps to try_to_extend.
............. (several lines omitted) .............
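The FSP_NORMAL branch above can be tried out with a small Python sketch. It assumes 16KB pages (64 pages per extent) and covers only the condition shown in the excerpt, not the omitted parts of fsp_reserve_free_extents(). The names reserve_extents and should_try_extend are hypothetical.

```python
EXTENT_PAGES = 64  # FSP_EXTENT_SIZE with 16KB pages (assumed)


def reserve_extents(size_pages):
    """Extents held back: 2 plus 1% of the tablespace size in extents."""
    return 2 + ((size_pages // EXTENT_PAGES) * 2) // 200


def should_try_extend(size_pages, n_free, n_ext=2):
    """True when the FSP_NORMAL branch would jump to try_to_extend."""
    return n_free <= reserve_extents(size_pages) + n_ext


# A 704-page (11MB) tablespace, like changeitem.ibd.
for n_free in (3, 4, 5):
    print(f"n_free={n_free}: try_to_extend={should_try_extend(704, n_free)}")

# The reserve only grows beyond 2 once the file reaches 100 extents.
print(reserve_extents(99 * EXTENT_PAGES), reserve_extents(100 * EXTENT_PAGES))
```

Because of the integer division, the 1% term stays at zero until the tablespace reaches 100 extents, so for small files the threshold is exactly 4 free extents.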

 

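Conclusion:

From the analysis above, we know:

1. data_free is not the amount of fragmented space; it is space in completely free extents, and its size is a multiple of 1MB.

2. When the free space in an ibd file is <= 4 extents, that is 4MB, InnoDB tries to extend the file. (This threshold is n_ext = 2 plus the minimum reserve of 2, which applies while the tablespace is smaller than 100 extents.)

3. I did not find, in either the official documentation or the source, the reason why an ibd file needs to be extended ahead of time like this. Based on work experience, my guess is that extending a tablespace file is relatively expensive, so space is allocated in advance to reduce the cost of a transaction running out of space during a write and having to extend the file on the spot. In earlier projects that used Oracle, the tablespaces of production databases were always pre-sized generously. This wastes space but is good for performance; put simply, it trades space for time.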

 

