The problem statement
SELECT *
FROM a
WHERE `type` = 'appointment'
  AND `event` = 14
  AND EXISTS (
      SELECT *
      FROM b
      WHERE a.`sheet_id` = b.`id`
        AND `company_id` = 8
        AND b.`deleted_at` IS NULL
  )
ORDER BY a.id DESC
LIMIT 6;
Reading the execution plan
In the execution plan of the EXISTS subquery, the select_type column shows PRIMARY and DEPENDENT SUBQUERY.

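DEPENDENT SUBQUERY means a subquery that depends on the outer query. Because it depends on the main body of the SQL, it may be executed as many times as the main body produces rows. (Here LIMIT 6 makes the scanned-row count of the main table look like 6; without the LIMIT the value is 15 million.)
Put more plainly: the outer query runs first, and each row it produces is handed to the subquery for matching, so the subquery is evaluated once per outer row.
In the execution plan of the JOIN, select_type is SIMPLE for every row.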

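The first row of the JOIN plan is the outer (driving) table.
Comparing the two plans, both the EXISTS subquery and the JOIN can essentially be read by the rule that the first row is the driving table. (Not every subquery follows this rule; this post only covers EXISTS subqueries that appear as DEPENDENT SUBQUERY.)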
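To reproduce the comparison, each form can be prefixed with EXPLAIN and the select_type column read off row by row. This is only a sketch using the two statements from this post. The actual plan depends on the MySQL version, the indexes and the table statistics, and newer MySQL versions may convert the EXISTS into a semijoin, in which case DEPENDENT SUBQUERY does not appear.

```sql
-- EXISTS form: on the author's setup this shows PRIMARY (a) and DEPENDENT SUBQUERY (b)
EXPLAIN
SELECT *
FROM a
WHERE `type` = 'appointment'
  AND `event` = 14
  AND EXISTS (
      SELECT *
      FROM b
      WHERE a.`sheet_id` = b.`id`
        AND `company_id` = 8
        AND b.`deleted_at` IS NULL
  )
ORDER BY a.id DESC
LIMIT 6;

-- JOIN form: every row is SIMPLE; the first row is the driving table
EXPLAIN
SELECT a.*
FROM a
JOIN b ON a.`sheet_id` = b.`id`
WHERE a.`type` = 'appointment'
  AND a.`event` = 14
  AND b.`company_id` = 8
  AND b.`deleted_at` IS NULL
ORDER BY a.`id` DESC
LIMIT 6;
```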
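Analysis
The PRIMARY table in Figure 1 is table a in Figure 2, and the DEPENDENT SUBQUERY table is table b in Figure 2. Table a has 15 million rows and table b has 20,000 rows.
So the SQL in Figure 1 is slow because the large table is driving the small one.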
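Optimization
Rewrite the EXISTS as a JOIN. (The two forms return the same rows only if b.`id` is unique, for example a primary key; that is assumed here.)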
SELECT a.*
FROM a
JOIN b ON a.`sheet_id` = b.`id`
WHERE a.`type` = 'appointment'
  AND a.`event` = 14
  AND b.`company_id` = 8
  AND b.`deleted_at` IS NULL
ORDER BY a.`id` DESC
LIMIT 6;
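Table a is now the inner (driven) table, so create a composite index on a.`sheet_id`, a.`type`, a.`event`. The statement also has local predicates on table b, so create a composite index on b.`company_id`, b.`deleted_at` as well.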
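The two composite indexes could be created roughly as follows. The index names are made up for illustration, and the column order simply follows the order given in the text; it is not a tuned recommendation.

```sql
-- Hypothetical index names; columns and their order are taken from the text above.
-- Inner (driven) table a: join column first, then the filter columns.
CREATE INDEX idx_a_sheet_type_event ON a (`sheet_id`, `type`, `event`);

-- Driving table b: covers the local predicates on company_id and deleted_at.
CREATE INDEX idx_b_company_deleted ON b (`company_id`, `deleted_at`);
```

After creating them, running EXPLAIN on the JOIN again should show whether the optimizer actually picks these indexes.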
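Result: execution time 117s → 0.36s, a speedup of roughly 300 times.
The statement also has a more extreme case: when b.`company_id` = 2, the small table returns no rows that satisfy the condition. In that case the original statement takes more than 350s, while the new one needs only 0.03s, a speedup of more than ten thousand times.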
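Optimization cases
Most of the statements optimized today had EXISTS subquery problems; evidently the developer is very fond of writing EXISTS. The EXISTS statement above is the general pattern, and the ones that follow add a few twists.
For example: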
SELECT SUM(`xxxx`) AS ag
FROM a
WHERE EXISTS (
    SELECT * FROM b
    WHERE a.`delivery_sheet_id` = b.`id`
      AND (`status` = 4
        OR `is_rejected` = '1')
      AND `company_id` = 8
      AND b.`deleted_at` IS NULL
)
  AND `status` IN (0, 4)
  AND `collection_type` IN (2, 3)
  AND a.`deleted_at` IS NULL;
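An OR is usually optimized by rewriting it as a UNION, but that does not work directly here because the query computes a SUM. Instead, split it into two statements and add the results. The two branches must not overlap, otherwise a row of b with both `status` = 4 and `is_rejected` = '1' would be counted twice, so the second branch excludes `status` = 4. The columns involved also need suitable indexes.
SELECT COALESCE(c.ag, 0) + COALESCE(d.ag, 0) AS ag  -- COALESCE: an empty branch yields NULL, not 0
FROM (
    -- Branch 1: b.status = 4
    SELECT SUM(a.`xxxx`) AS ag
    FROM a
    JOIN b ON a.`delivery_sheet_id` = b.`id`
    WHERE b.`status` = 4
      AND b.`company_id` = 8
      AND b.`deleted_at` IS NULL
      AND a.`status` IN (0, 4)
      AND a.`collection_type` IN (2, 3)
      AND a.`deleted_at` IS NULL
) c,
(
    -- Branch 2: rejected rows that branch 1 has not already counted
    SELECT SUM(a.`xxxx`) AS ag
    FROM a
    JOIN b ON a.`delivery_sheet_id` = b.`id`
    WHERE b.`is_rejected` = '1'
      AND (b.`status` <> 4 OR b.`status` IS NULL)
      AND b.`company_id` = 8
      AND b.`deleted_at` IS NULL
      AND a.`status` IN (0, 4)
      AND a.`collection_type` IN (2, 3)
      AND a.`deleted_at` IS NULL
) d;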
Result: execution time 18s → 0.2s.
Rewriting IN as a JOIN follows much the same idea as EXISTS.
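For the simplest case, the idea can be sketched on the same tables a and b used earlier. This is an illustrative pair, not a statement from the batch being optimized, and it assumes b.`id` is unique so that the JOIN cannot duplicate rows of a.

```sql
-- IN form: keep rows of a whose sheet_id appears in the filtered set of b.id values
SELECT a.*
FROM a
WHERE a.`sheet_id` IN (
    SELECT b.`id`
    FROM b
    WHERE b.`company_id` = 8
      AND b.`deleted_at` IS NULL
);

-- JOIN form: same rows, assuming b.id is unique
SELECT a.*
FROM a
JOIN b ON a.`sheet_id` = b.`id`
WHERE b.`company_id` = 8
  AND b.`deleted_at` IS NULL;
```

If the subquery column is not unique, the subquery would first have to be deduplicated (for example with DISTINCT in a derived table) before joining.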
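I have no real-world example of this at hand, so here is an analysis pasted from the WeChat public account of 鄭松華.
Original statement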
SELECT
    COUNT( * ) AS totalNum,
    sum( CASE WHEN F.ALARM_LEVEL = 1 THEN 1 ELSE 0 END ) AS LEVELS1,
    sum( CASE WHEN F.ALARM_LEVEL = 2 THEN 1 ELSE 0 END ) AS LEVELS2,
    sum( CASE WHEN F.ALARM_LEVEL = 3 THEN 1 ELSE 0 END ) AS LEVELS3,
    sum( CASE WHEN F.DEAL_STATE = 0 THEN 1 ELSE 0 END ) AS DESTS
FROM
    F
    LEFT JOIN DC ON DC.ID = F.CONST_ID
    LEFT JOIN V ON V.ID = F.VEHICLE_ID
    LEFT JOIN AREA ON AREA.ID = V.SYS_DIVISION_ID
WHERE
    DC.ID IS NOT NULL
    AND V.ID IS NOT NULL
    AND F.DEAL_STATE = 0
    AND ALARM_LEVEL IN ( 1, 2, 3 )
    AND F.VEHICLE_ID IN (
        SELECT VEHICLE_ID
        FROM GVLK
        WHERE GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
        UNION
        SELECT VEHICLE_ID
        FROM UVLK
        WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a'
    )
    AND date( F.ALARM_TIME ) BETWEEN '2000-01-01' AND '2018-08-14'
    AND AREA.PATH LIKE CONCAT( ( SELECT ARE.PATH FROM ARE WHERE ARE.ID = '0' ), '%' )
The execution plan is as follows

The rewrite is as follows (IN changed to JOIN)
explain extended
SELECT
    COUNT( * ) AS totalNum,
    sum( CASE WHEN F.ALARM_LEVEL = 1 THEN 1 ELSE 0 END ) AS LEVELS1,
    sum( CASE WHEN F.ALARM_LEVEL = 2 THEN 1 ELSE 0 END ) AS LEVELS2,
    sum( CASE WHEN F.ALARM_LEVEL = 3 THEN 1 ELSE 0 END ) AS LEVELS3,
    sum( CASE WHEN F.DEAL_STATE = 0 THEN 1 ELSE 0 END ) AS DESTS
FROM
    F
    straight_join (
        SELECT VEHICLE_ID
        FROM GVLK
        WHERE GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
        UNION
        SELECT VEHICLE_ID
        FROM UVLK
        WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a'
    ) s on F.VEHICLE_ID = s.VEHICLE_ID
    straight_join DC ON DC.ID = F.CONST_ID
    straight_join V ON V.ID = F.VEHICLE_ID
    straight_join AREA ON AREA.ID = V.SYS_DIVISION_ID
WHERE
    DC.ID IS NOT NULL
    AND V.ID IS NOT NULL
    AND F.DEAL_STATE = 0
    AND ALARM_LEVEL IN ( 1, 2, 3 )
    AND date( F.ALARM_TIME ) BETWEEN '2000-01-01' AND '2018-08-14'
    AND AREA.PATH LIKE CONCAT( ( SELECT ARE.PATH FROM ARE WHERE ARE.ID = '0' ), '%' )