How MySQL Optimizes GROUP BY: Loose Index Scan vs. Tight Index Scan


  

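  The most general way to execute a GROUP BY clause is to scan the whole table and create a new temporary table in which all rows of each group are consecutive, then use that temporary table to discover the groups and apply the aggregate functions, if any. In some cases MySQL can avoid creating the temporary table by using index access. When the loose index scan described below is chosen, the EXPLAIN output shows Using index for group-by in the Extra column.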
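To see the difference between the two strategies, one can compare the EXPLAIN output for grouping on an unindexed column with grouping on an indexed one. The following is a minimal sketch: the table group_by_demo and its columns are hypothetical names chosen for illustration, and the exact Extra values depend on the MySQL version, the data, and the optimizer statistics.

```sql
-- Hypothetical table for illustration only (not from the original article).
CREATE TABLE group_by_demo (
  id      INT NOT NULL PRIMARY KEY,
  grp     INT NOT NULL,
  payload INT NOT NULL,
  KEY idx_grp (grp)
) ENGINE=InnoDB;

-- Grouping on an unindexed column: the general strategy applies,
-- typically reported as "Using temporary" in the Extra column.
EXPLAIN SELECT payload, COUNT(*) FROM group_by_demo GROUP BY payload;

-- Grouping on the indexed column: the index can supply the groups, and
-- Extra may show "Using index for group-by" if a loose index scan is chosen.
EXPLAIN SELECT grp FROM group_by_demo GROUP BY grp;
```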

 

1. Loose Index Scan

The most efficient way to process GROUP BY is when an index is used to directly retrieve the grouping columns.

With this access method, MySQL uses the property of some index types that the keys are ordered (for example, BTREE).

This property enables use of lookup groups in an index without having to consider all keys in the index that satisfy all WHERE conditions.

This access method considers only a fraction of the keys in an index, so it is called a loose index scan.

When there is no WHERE clause, a loose index scan reads as many keys as the number of groups, which may be a much smaller number than that of all keys.
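The idea of reading one key per group can be imitated by hand. The sketch below is only a conceptual illustration, not MySQL's internal implementation: it assumes a hypothetical table skip_demo and steps from group to group with one MIN() lookup per step, each of which can typically be answered from the ordered index without visiting the other keys of the group.

```sql
-- Hypothetical table; names and data are illustrative only.
CREATE TABLE skip_demo (
  c1 INT NOT NULL,
  c2 INT NOT NULL,
  KEY idx_c1_c2 (c1, c2)
) ENGINE=InnoDB;

INSERT INTO skip_demo (c1, c2) VALUES
  (1, 10), (1, 20), (1, 30),
  (2, 5),  (2, 15),
  (3, 7);

-- Step 1: seek to the first key in the index; this is the first group.
SELECT MIN(c1) INTO @g FROM skip_demo;                 -- @g = 1

-- Step 2: seek to the first key past the current group; repeat until NULL.
SELECT MIN(c1) INTO @g FROM skip_demo WHERE c1 > @g;   -- @g = 2
SELECT MIN(c1) INTO @g FROM skip_demo WHERE c1 > @g;   -- @g = 3
SELECT MIN(c1) INTO @g FROM skip_demo WHERE c1 > @g;   -- @g = NULL, no more groups
```

Three groups are found with four lookups, although the index holds six keys. The gap widens as groups get larger.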

If the WHERE clause contains range predicates, a loose index scan looks up the first key of each group that satisfies the range conditions, and again reads the least possible number of keys. This is possible under the following conditions:

  • The query is over a single table.

  • The GROUP BY names only columns that form a leftmost prefix of the index and no other columns.

(If, instead of GROUP BY, the query has a DISTINCT clause, all distinct attributes refer to columns that form a leftmost prefix of the index.)

For example, if a table t1 has an index on (c1,c2,c3), loose index scan is applicable if the query has GROUP BY c1, c2.

It is not applicable if the query has GROUP BY c2, c3 (the columns are not a leftmost prefix) or GROUP BY c1, c2, c4 (c4 is not in the index).

  • The only aggregate functions used in the select list (if any) are MIN() and MAX(), and all of them refer to the same column. The column must be in the index and must immediately follow the columns in the GROUP BY.

  • Any other parts of the index than those from the GROUP BY referenced in the query must be constants (that is, they must be referenced in equalities with constants), except for the argument of MIN() or MAX() functions.

  • For columns in the index, full column values must be indexed, not just a prefix. For example, with c1 VARCHAR(20), INDEX (c1(10)), the index cannot be used for loose index scan.
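The conditions above can be checked against concrete queries. The sketch below assumes the table t1 from the example above with columns (c1,c2,c3,c4) and an index on (c1,c2,c3). The index name idx, the INT column types, and the literal constants are assumptions made for illustration. Eligibility only means the optimizer may consider a loose index scan; whether it actually chooses one depends on its cost estimates.

```sql
-- Assumed schema: t1 as described above, with a hypothetical index name.
CREATE TABLE t1 (
  c1 INT, c2 INT, c3 INT, c4 INT,
  KEY idx (c1, c2, c3)
) ENGINE=InnoDB;

-- Eligible for loose index scan:
SELECT c1, c2 FROM t1 GROUP BY c1, c2;                 -- leftmost prefix (c1, c2)
SELECT DISTINCT c1, c2 FROM t1;                        -- DISTINCT over a leftmost prefix
SELECT c1, MIN(c2) FROM t1 GROUP BY c1;                -- MIN() on the column right after c1
SELECT c1, c2 FROM t1 WHERE c1 < 10 GROUP BY c1, c2;   -- range predicate on a group column
SELECT MAX(c3), MIN(c3), c1, c2 FROM t1
  WHERE c2 > 10 GROUP BY c1, c2;                       -- MIN()/MAX() on the same column c3
SELECT c1, c2 FROM t1 WHERE c3 = 10 GROUP BY c1, c2;   -- remaining index part equals a constant

-- Not eligible:
SELECT c1, SUM(c2) FROM t1 GROUP BY c1;                -- aggregate other than MIN()/MAX()
SELECT c2, c3 FROM t1 GROUP BY c2, c3;                 -- (c2, c3) is not a leftmost prefix
SELECT c1, c4 FROM t1 GROUP BY c1, c4;                 -- c4 is not in the index
SELECT c1, c2 FROM t1 WHERE c3 > 10 GROUP BY c1, c2;   -- c3 is referenced without an equality to a constant
```

Prefixing any of these statements with EXPLAIN shows which access method the optimizer picks for the data at hand.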

 

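Example 1, on MySQL 5.7. The third and fourth queries show that COUNT(DISTINCT ...) can also be resolved with a loose index scan, in addition to MIN() and MAX():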

CREATE TABLE `sm_wechat_binding` (
  `id` bigint(20) NOT NULL,
  `company_id` bigint(20) DEFAULT NULL,
  `date_created` datetime NOT NULL,
  `is_big_account` bit(1) NOT NULL,
  `last_updated` datetime NOT NULL,
  `open_id` varchar(64) NOT NULL,
  `phone` varchar(14) DEFAULT NULL,
  `deleted` datetime DEFAULT NULL,
  `imported` datetime DEFAULT NULL,
  `client_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `company_id_idx` (`company_id`),
  KEY `openid_phone_index` (`open_id`,`phone`),
  CONSTRAINT `FK_f95swnll9d3myf1pl7o5cxtws` FOREIGN KEY (`company_id`) REFERENCES `sm_company` (`company_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

mysql> EXPLAIN SELECT distinct company_id FROM sm_wechat_binding;
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
| id | select_type | table             | type  | possible_keys  | key            | key_len | ref  | rows | Extra                    |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | sm_wechat_binding | range | company_id_idx | company_id_idx | 9       | NULL |  699 | Using index for group-by |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
1 row in set (0.02 sec)

mysql> EXPLAIN SELECT COUNT( company_id) FROM sm_wechat_binding GROUP BY company_id;
+----+-------------+-------------------+-------+----------------+----------------+---------+------+-------+-------------+
| id | select_type | table             | type  | possible_keys  | key            | key_len | ref  | rows  | Extra       |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+-------+-------------+
|  1 | SIMPLE      | sm_wechat_binding | index | company_id_idx | company_id_idx | 9       | NULL | 39130 | Using index |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+-------+-------------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT COUNT(distinct company_id)  FROM sm_wechat_binding GROUP BY company_id;
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
| id | select_type | table             | type  | possible_keys  | key            | key_len | ref  | rows | Extra                    |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | sm_wechat_binding | range | company_id_idx | company_id_idx | 9       | NULL |  699 | Using index for group-by |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT COUNT(distinct company_id) as num, company_id FROM sm_wechat_binding GROUP BY company_id;
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
| id | select_type | table             | type  | possible_keys  | key            | key_len | ref  | rows | Extra                    |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | sm_wechat_binding | range | company_id_idx | company_id_idx | 9       | NULL |  699 | Using index for group-by |
+----+-------------+-------------------+-------+----------------+----------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT max(company_id), min(company_id) FROM sm_wechat_binding force index(company_id_idx);
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.01 sec)

 

Example 2

mysql> CREATE TABLE `loose_index_scan` (
    ->   `c1` int(11) DEFAULT NULL,
    ->   `c2` int(11) DEFAULT NULL,
    ->   `c3` int(11) DEFAULT NULL,
    ->   `c4` int(11) DEFAULT NULL,
    ->   KEY `idx_g` (`c1`,`c2`,`c3`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.90 sec)

mysql> explain select c1,c2  from loose_index_scan group by c1,c2;
+----+-------------+------------------+-------+---------------+-------+---------+------+------+-------------+
| id | select_type | table            | type  | possible_keys | key   | key_len | ref  | rows | Extra       |
+----+-------------+------------------+-------+---------------+-------+---------+------+------+-------------+
|  1 | SIMPLE      | loose_index_scan | index | idx_g         | idx_g | 15      | NULL |    1 | Using index |
+----+-------------+------------------+-------+---------------+-------+---------+------+------+-------------+
1 row in set (0.06 sec)

mysql> EXPLAIN SELECT COUNT(DISTINCT c1) FROM loose_index_scan GROUP BY c1;
+----+-------------+------------------+-------+---------------+-------+---------+------+------+-------------------------------------+
| id | select_type | table            | type  | possible_keys | key   | key_len | ref  | rows | Extra                               |
+----+-------------+------------------+-------+---------------+-------+---------+------+------+-------------------------------------+
|  1 | SIMPLE      | loose_index_scan | range | idx_g         | idx_g | 5       | NULL |    2 | Using index for group-by (scanning) |
+----+-------------+------------------+-------+---------------+-------+---------+------+------+-------------------------------------+
1 row in set (0.02 sec)
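The rows estimates of 1 and 2 above suggest the table was empty or nearly empty when EXPLAIN ran, which may be why the first query reports a plain Using index. A rough way to experiment further is to load many rows into a few groups and look at a MIN() query. The statements below are a sketch under that assumption: the seed values are arbitrary, and the plan the optimizer reports will depend on the row count and statistics.

```sql
-- Same columns and index as in Example 2; IF NOT EXISTS keeps the sketch self-contained.
CREATE TABLE IF NOT EXISTS loose_index_scan (
  c1 INT DEFAULT NULL,
  c2 INT DEFAULT NULL,
  c3 INT DEFAULT NULL,
  c4 INT DEFAULT NULL,
  KEY idx_g (c1, c2, c3)
) ENGINE=InnoDB;

-- Seed a few groups (arbitrary illustration data).
INSERT INTO loose_index_scan (c1, c2, c3, c4) VALUES
  (1, 1, 1, 1), (1, 2, 2, 2), (2, 1, 1, 1), (2, 2, 2, 2);

-- Each run doubles the row count while the number of groups stays fixed;
-- repeat this statement until the table is reasonably large.
INSERT INTO loose_index_scan (c1, c2, c3, c4)
SELECT c1, c2, c3, c4 FROM loose_index_scan;

-- Refresh index statistics so the optimizer sees the new distribution.
ANALYZE TABLE loose_index_scan;

-- MIN() on the column right after the GROUP BY prefix: eligible for loose index scan.
EXPLAIN SELECT c1, MIN(c2) FROM loose_index_scan GROUP BY c1;
```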

 

 

 

Reference:

MySQL official documentation: group-by-optimization

