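I recently ran into a SQL statement that took more than 5 s to execute in production. That is clearly unacceptable, so it had to be optimized.
First, a look at the table structure and the SQL statement.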
CREATE TABLE `xxxxx` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `owner` bigint(20) NOT NULL,
  `publicStatus` int(11) NOT NULL DEFAULT '0',
  `title` varchar(512) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `type` int(11) NOT NULL,
  `deviceType` int(11) NOT NULL,
  `deviceName` varchar(128) COLLATE utf8_unicode_ci DEFAULT NULL,
  `createTime` bigint(20) NOT NULL,
  `startTime` bigint(20) NOT NULL,
  `finishTime` bigint(20) NOT NULL DEFAULT '0',
  `height` int(11) DEFAULT '0',
  `width` int(11) DEFAULT '0',
  `length` bigint(20) DEFAULT '0',
  `status` int(11) NOT NULL DEFAULT '0',
  `uploadServer` int(11) NOT NULL DEFAULT '0',
  `orgfileName` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `img` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `delStatus` int(11) NOT NULL DEFAULT '0',
  `location` varchar(128) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `locationText` varchar(256) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `lastModifyTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `extUrl` varchar(1024) COLLATE utf8_unicode_ci DEFAULT NULL,
  `oem` varchar(20) CHARACTER SET utf8mb4 DEFAULT NULL,
  `lat` float(10,6) NOT NULL DEFAULT '-1000.000000',
  `lng` float(10,6) NOT NULL DEFAULT '-1000.000000',
  PRIMARY KEY (`id`),
  KEY `index_owner` (`owner`),
  KEY `Index_public` (`publicStatus`),
  KEY `Index_status` (`status`),
  KEY `index_finishTime` (`finishTime`),
  KEY `idx_channel_oem` (`oem`),
  KEY `idx_dev_type` (`deviceType`),
  KEY `idx_delStatus` (`delStatus`),
  KEY `idx_loc_locText` (`location`,`locationText`(255)),
  KEY `idx_lat_lng` (`lat`,`lng`)
) ENGINE=InnoDB AUTO_INCREMENT=583029 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
At a glance this table structure does not look well optimized, but let's set that aside for now and look at the SQL.
select * from `AAA` c left join `BBB` o on c.id = o.channelid where c.publicStatus = 2 and c.status= 30 and c.delStatus = 0 order by c.finishTime desc limit 100;
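There is a left join, but a closer look at the WHERE clause shows it is not much of a problem: it is just a simple join, because every filter condition belongs to table AAA.
The next step is to look at the EXPLAIN plan and the profiling output for this SQL. To keep things simple, we drop the left join.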
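As a rough sketch, output of this kind can be collected in a mysql client session as shown here. The table name `channel` is taken from the profiling output in this post; the session flow itself is an assumption about how the numbers were gathered, not a record of the exact commands used.

```sql
-- Show the execution plan for the simplified query (left join removed).
EXPLAIN
SELECT *
FROM `channel` c
WHERE c.publicStatus = 2
  AND c.status = 30
  AND c.delStatus = 0
ORDER BY c.finishTime DESC
LIMIT 100;

-- Enable per-session profiling, run the query, then inspect the timings.
SET profiling = 1;

SELECT *
FROM `channel` c
WHERE c.publicStatus = 2
  AND c.status = 30
  AND c.delStatus = 0
ORDER BY c.finishTime DESC
LIMIT 100;

SHOW PROFILES;             -- one row per profiled statement, with its Query_ID
SHOW PROFILE FOR QUERY 1;  -- per-stage durations; use the Query_ID listed above

SET profiling = 0;
```

In the mysql command-line client, ending a statement with \G instead of a semicolon prints each row vertically, which is the layout of the EXPLAIN output in this post.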
The EXPLAIN output is as follows:
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
         type: index_merge
possible_keys: Index_public,Index_status,idx_delStatus
          key: Index_public,Index_status,idx_delStatus
      key_len: 4,4,4
          ref: NULL
         rows: 72362
        Extra: Using intersect(Index_public,Index_status,idx_delStatus); Using where; Using filesort
1 row in set (0.00 sec)
The profiling output is as follows:
+----------+------------+------------------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                                                        |
+----------+------------+------------------------------------------------------------------------------------------------------------------------------+
|        1 | 4.10154300 | select * from `channel` c where c.publicStatus = 2 and c.status= 30 and c.delStatus = 0 order by c.finishTime desc limit 100 |
+----------+------------+------------------------------------------------------------------------------------------------------------------------------+
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000026 |
| Waiting for query cache lock   | 0.000003 |
| checking query cache for query | 0.000048 |
| checking permissions           | 0.000005 |
| Opening tables                 | 0.000021 |
| System lock                    | 0.000009 |
| Waiting for query cache lock   | 0.000022 |
| init                           | 0.000038 |
| optimizing                     | 0.000003 |
| statistics                     | 0.000167 |
| preparing                      | 0.000072 |
| executing                      | 0.000004 |
| Sorting result                 | 4.096042 |
| Sending data                   | 0.000715 |
| Waiting for query cache lock   | 0.000000 |
| Sending data                   | 0.004289 |
| end                            | 0.000007 |
| query end                      | 0.000005 |
| closing tables                 | 0.000008 |
| freeing items                  | 0.000009 |
| Waiting for query cache lock   | 0.000002 |
| freeing items                  | 0.000009 |
| Waiting for query cache lock   | 0.000002 |
| freeing items                  | 0.000002 |
| storing result in query cache  | 0.000003 |
| logging slow query             | 0.000002 |
| logging slow query             | 0.000026 |
| cleaning up                    | 0.000004 |
+--------------------------------+----------+
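The output above makes it obvious that sorting takes by far the most time ("Sorting result" accounts for 4.096 s of the 4.101 s total), so the key to this SQL is solving the sort problem.
Solving the sort problem means solving the ORDER BY problem. Looking at this SQL, the first instinct is to add a four-column composite index, idx(publicStatus, status, delStatus, finishTime). In testing, the result was acceptable, but the number of rows scanned was still high, over 10,000.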
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
         type: ref
possible_keys: idx_test
          key: idx_test
      key_len: 12
          ref: const,const,const
         rows: 13038
        Extra: Using where
1 row in set (0.00 sec)
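For reference, a composite index like the one in this plan could be created roughly as follows. The index name idx_test comes from the EXPLAIN output; the table name `channel` and the exact column order are assumptions based on the description in this post.

```sql
-- Equality columns first, sort column last, so that rows matching the three
-- constants are already stored in finishTime order within the index.
ALTER TABLE `channel`
  ADD INDEX idx_test (publicStatus, status, delStatus, finishTime);

-- Re-check the plan after adding the index.
EXPLAIN
SELECT *
FROM `channel` c
WHERE c.publicStatus = 2
  AND c.status = 30
  AND c.delStatus = 0
ORDER BY c.finishTime DESC
LIMIT 100;

-- To remove the experimental index again:
-- ALTER TABLE `channel` DROP INDEX idx_test;
```

The key_len of 12 in the plan matches the three 4-byte int columns used for the lookup, and the absence of "Using filesort" in Extra suggests the index order is being used for the ORDER BY.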
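Is there any other way to optimize this? Look again at the first EXPLAIN result: the most noticeable parts are index_merge and Using intersect. What do they mean?
According to the official MySQL documentation, this is the query optimizer applying the intersection algorithm of the index merge optimization. The index merge intersection algorithm scans all of the indexes it uses simultaneously and produces the intersection of the rows that satisfy the conditions. That intersection is usually fairly large, and the index on the column that is actually sorted on is not used at all, so the rows have to be sorted in a separate step, which is what Using filesort indicates. Once the result set grows too large for the sort buffer, the sort spills into temporary files on disk, which makes it even slower.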
For details, see: http://dev.mysql.com/doc/refman/5.5/en/index-merge-optimization.html
As further reading: if you do not want index merge to be used in this situation, you can configure the server as follows
set optimizer_switch='index_merge_intersection=off';
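and the index merge intersection algorithm will be turned off.
BTW: Index Condition Pushdown in MySQL 5.6 handles this kind of case somewhat better; interested readers can look into it on their own.
Back to the main topic: what other ways are there to optimize this ORDER BY? Since sorting is the largest cost, what happens if we force the query to use the index on the sort column?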
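A minimal sketch of trying this out for the current connection only, so other sessions are not affected. The table name `channel` is assumed from the profiling output in this post.

```sql
-- Inspect the current optimizer flags.
SELECT @@optimizer_switch;

-- Disable index merge intersection for this session only.
SET SESSION optimizer_switch = 'index_merge_intersection=off';

-- Check which plan the optimizer picks without the intersection algorithm.
EXPLAIN
SELECT *
FROM `channel` c
WHERE c.publicStatus = 2
  AND c.status = 30
  AND c.delStatus = 0
ORDER BY c.finishTime DESC
LIMIT 100;

-- Restore the default value of this flag.
SET SESSION optimizer_switch = 'index_merge_intersection=default';
```

Using SET GLOBAL instead would change the default for new connections, which is a much broader change and worth testing separately.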
explain select * from `channel` c FORCE INDEX(index_finishtime) where c.publicStatus = 2 and c.status= 30 and c.delStatus = 0 order by c.finishTime desc limit 100\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
         type: index
possible_keys: NULL
          key: index_finishTime
      key_len: 8
          ref: NULL
         rows: 100
        Extra: Using where
+----------+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                                                                                      |
+----------+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
|        1 | 0.00427200 | select * from `channel` c FORCE INDEX(index_finishtime) where c.publicStatus = 2 and c.status= 30 and c.delStatus = 0 order by c.finishTime desc limit 100 |
+----------+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000021 |
| Waiting for query cache lock   | 0.000005 |
| checking query cache for query | 0.000063 |
| checking permissions           | 0.000007 |
| Opening tables                 | 0.000018 |
| System lock                    | 0.000010 |
| Waiting for query cache lock   | 0.000026 |
| init                           | 0.000043 |
| optimizing                     | 0.000015 |
| statistics                     | 0.000013 |
| preparing                      | 0.000020 |
| executing                      | 0.000003 |
| Sorting result                 | 0.000005 |
| Sending data                   | 0.001091 |
| Waiting for query cache lock   | 0.000004 |
| Sending data                   | 0.000805 |
| end                            | 0.000007 |
| query end                      | 0.000006 |
| closing tables                 | 0.000009 |
| freeing items                  | 0.000012 |
| Waiting for query cache lock   | 0.000002 |
| freeing items                  | 0.002067 |
| Waiting for query cache lock   | 0.000006 |
| freeing items                  | 0.000003 |
| storing result in query cache  | 0.000005 |
| logging slow query             | 0.000002 |
| cleaning up                    | 0.000004 |
+--------------------------------+----------+
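The "Sorting result" stage is still there, but its cost has dropped to almost nothing. The rows figure in EXPLAIN is now 100, and the total execution time is 0.004 s, about 0.1% of the original 4.101 s, which makes the query nearly 1000 times faster.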
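The rows value in EXPLAIN is only an estimate, and with LIMIT 100 it does not say how many index entries are actually visited before 100 matching rows are found. One possible way to check the real work done is to read the session handler counters around the query. This is a sketch, assuming the table name `channel` from the profiling output and an account that is allowed to run FLUSH STATUS.

```sql
-- Reset the session status counters (requires the RELOAD privilege).
FLUSH STATUS;

-- SQL_NO_CACHE keeps the query cache from answering the statement,
-- so the counters reflect a real execution.
SELECT SQL_NO_CACHE *
FROM `channel` c FORCE INDEX (index_finishTime)
WHERE c.publicStatus = 2
  AND c.status = 30
  AND c.delStatus = 0
ORDER BY c.finishTime DESC
LIMIT 100;

-- Handler_read_* counters show how many index entries and rows were read.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

If the counters stay in the hundreds or low thousands, the forced index is finding matches quickly. The approach depends on matching rows being common near the newest finishTime values; with a very selective WHERE clause the same hint could end up walking a large part of the index.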
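Conclusion:
This adjustment gives us one way of thinking about ORDER BY optimization: do not blindly trust MySQL's query optimizer. We can build an index on the sort column alone, without worrying about the WHERE conditions that come before it, and sometimes the result is surprisingly good.
You can also read @reples's blog post on the same topic: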