MySQL 8.0 索引特性1-函數索引

本文轉載自查看原文 2019-05-16 13:31 2363 2.mysql/ mysql

函數索引顧名思義就是加給字段加了函數的索引，這里的函數也可以是表達式。所以也叫表達式索引。
MySQL 5.7 推出了虛擬列的功能，MySQL8.0的函數索引內部其實也是依據虛擬列來實現的。

我們考慮以下幾種場景：

1.對比日期部分的過濾條件

SELECT ...
FROM tb1
WHERE date(time_field1) = current_date;

2.兩字段做計算

SELECT ...
FROM tb1
WHERE field2 + field3 = 5;

3.求某個字段中間某子串

SELECT ...
FROM tb1
WHERE substr(field4, 5, 9) = 'actionsky';

4.求某個字段末尾某子串

SELECT ...
FROM tb1
WHERE RIGHT(field4, 9) = 'actionsky';

5.求JSON格式的VALUE

SELECT ...
FROM tb1
WHERE CAST(field4 ->> '$.name' AS CHAR(30)) = 'actionsky';

以上五個場景如果不用函數索引，改寫起來難易不同。不過都要做相關修改，不是過濾條件修正就是表結構變更添加冗余字段加額外索引。

比如第1個場景改寫為，

SELECT ...
FROM tb1
WHERE time_field1 >= concat(current_date, ' 00:00:00')
    AND time_field1 <= concat(current_date, '23:59:59');

再比如第4個場景的改寫，
由於是求最末尾的子串，只能添加一個新的冗余字段，並且做相關的計划任務來一定頻率的異步更新或者添加觸發器來實時更新此字段值

SELECT ...
FROM tb1
WHERE field4_suffix = 'actionsky';

那我們看到，改寫也可以實現，不過這樣的SQL就沒有標准化而言，后期不能平滑的遷移了。

MySQL 8.0 推出來了函數索引讓這些變得相對容易許多。
不過函數索引也有自己的缺陷，就是寫法很固定，必須要嚴格按照定義的函數來寫，不然優化器不知所措。

我們來把上面那些場景實例化。
示例表結構，

總記錄數

mysql> SELECT COUNT(*)
FROM t_func;
+----------+
| count(*) |
+----------+
|    16384 |
+----------+
1 row in set (0.01 sec)

我們把上面幾個場景的索引全加上

mysql > ALTER TABLE t_func ADD INDEX idx_log_time ( ( date( log_time ) ) ),
ADD INDEX idx_u1 ( ( rank1 + rank2 ) ),
ADD INDEX idx_suffix_str3 ( ( RIGHT ( str3, 9 ) ) ),
ADD INDEX idx_substr_str1 ( ( substr( str1, 5, 9 ) ) ),
ADD INDEX idx_str2 ( ( CAST( str2 ->> '$.name' AS CHAR ( 9 ) ) ) );
QUERY OK,
0 rows affected ( 1.13 sec ) Records : 0 Duplicates : 0 WARNINGS : 0

我們再看下表結構，發現好幾個已經被轉換為系統自己的寫法了

MySQL 8.0 還有一個特性，就是可以把系統隱藏的列顯示出來。
我們用show extened 列出函數索引創建的虛擬列，

上面5個隨機字符串列名為函數索引隱式創建的虛擬COLUMNS。

我們先來看看場景2，兩個整形字段的相加，

mysql> SELECT COUNT(*)
FROM t_func
WHERE rank1 + rank2 = 121;
+----------+
| count(*) |
+----------+
|      878 |
+----------+
1 row in set (0.00 sec)

看下執行計划，用到了idx_u1函數索引，

mysql> explain SELECT COUNT(*)
FROM t_func
WHERE rank1 + rank2 = 121\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t_func
   partitions: NULL
         type: ref
possible_keys: idx_u1
          key: idx_u1
      key_len: 9
          ref: const
         rows: 878
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

那如果我們稍微改下這個SQL的執行計划，發現此時不能用到函數索引，變為全表掃描了，所以要嚴格按照函數索引的定義來寫SQL

mysql> explain SELECT COUNT(*)
FROM t_func
WHERE rank1 = 121 - rank2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t_func
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 16089
     filtered: 10.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

再來看看場景1的的改寫和不改寫的性能簡單對比

mysql> SELECT *
FROM t_func
WHERE date(log_time) = '2019-04-18'
LIMIT 1\G
*************************** 1. row ***************************
      id: 2
   rank1: 1
    str1: test-actionsky-test
    str2: {"age": 30, "name": "dell"}
   rank2: 120
    str3: test-actionsky
log_time: 2019-04-18 10:04:53
1 row in set (0.01 sec)

我們把普通的索引加上

mysql > ALTER TABLE t_func ADD INDEX idx_log_time_normal ( log_time );
QUERY OK,
0 rows affected ( 0.36 sec ) Records : 0 Duplicates : 0 WARNINGS : 0

然后改寫下SQL看下

mysql> SELECT *
FROM t_func
WHERE date(log_time) >= '2019-04-18 00:00:00'
    AND log_time < '2019-04-19 00:00:00'
*************************** 1. row ***************************
      id: 2
   rank1: 1
    str1: test-actionsky-test
    str2: {"age": 30, "name": "dell"}
   rank2: 120
    str3: test-actionsky
log_time: 2019-04-18 10:04:53
1 row in set (0.01 sec)

兩個看起來沒啥差別，我們仔細看下兩個的執行計划
– 普通索引

mysql> explain format=json SELECT *
FROM t_func
WHERE log_time >= '2019-04-18 00:00:00'
    AND log_time < '2019-04-19 00:00:00'
LIMIT 1\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "630.71"
    },
    "table": {
      "table_name": "t_func",
      "access_type": "range",
      "possible_keys": [
        "idx_log_time_normal"
      ],
      "key": "idx_log_time_normal",
      "used_key_parts": [
        "log_time"
      ],
      "key_length": "6",
      "rows_examined_per_scan": 1401,
      "rows_produced_per_join": 1401,
      "filtered": "100.00",
      "index_condition": "((`ytt`.`t_func`.`log_time` >= '2019-04-18 00:00:00') and (`ytt`.`t_func`.`log_time` < '2019-04-19 00:00:00'))",
      "cost_info": {
        "read_cost": "490.61",
        "eval_cost": "140.10",
        "prefix_cost": "630.71",
        "data_read_per_join": "437K"
      },
      "used_columns": [
        "id",
        "rank1",
        "str1",
        "str2",
        "rank2",
        "str3",
        "log_time",
        "cast(`log_time` as date)",
        "(`rank1` + `rank2`)",
        "right(`str3`,9)",
        "substr(`str1`,5,9)",
        "cast(json_unquote(json_extract(`str2`,_utf8mb4'$.name')) as char(9) charset utf8mb4)"
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

-函數索引

mysql> explain format=json SELECT COUNT(*)
FROM t_func
WHERE date(log_time) = '2019-04-18'
LIMIT 1\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "308.85"
    },
    "table": {
      "table_name": "t_func",
      "access_type": "ref",
      "possible_keys": [
        "idx_log_time"
      ],
      "key": "idx_log_time",
      "used_key_parts": [
        "cast(`log_time` as date)"
      ],
      "key_length": "4",
      "ref": [
        "const"
      ],
      "rows_examined_per_scan": 1401,
      "rows_produced_per_join": 1401,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "168.75",
        "eval_cost": "140.10",
        "prefix_cost": "308.85",
        "data_read_per_join": "437K"
      },
      "used_columns": [
        "log_time",
        "cast(`log_time` as date)"
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)

從上面的執行計划看起來區別不是很大，唯一不同的是，普通索引在CPU的計算上消耗稍微大點，見紅色字體。

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 MySQL 8.0 索引特性3 -倒序索引 MySQL 8.0新特性 -- 降序索引 MySQL 8.0新特性 -- 不可見索引 MySQL 8.0索引合並 MySQL 5.7新特性之Generated Column（函數索引） MySQL 8 新特性之降序索引 MySQL函數索引及優化 MYSQL的索引和常見函數牛逼！MySQL 8.0 中的索引可以隱藏了… 索引(一)Mysql創建索引