MySQL 5.7: A Deep Dive into Optimizing a SQL Query for a 100x Speedup (GO)


System environment: Microsoft Azure Linux DS12 series, CentOS 6.5, MySQL 5.7.10, production. Steps 1 and 2 present the case; the in-depth analysis is in steps 3 and 4.

 

1. The slow SQL statement takes about 13 seconds

The original SQL statement takes about 13 seconds:

SELECT
  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_MERCHANT t2
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t2.ID
    AND t2.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A') AS 'installed',

  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_HEARTBEAT t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.`ID` = t2.`DEVICE_ID`
    AND t2.ENABLED = 1) AS 'online',

  (SELECT
    COUNT(DISTINCT (t1.`SN`))
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID`) AS 'connected',

  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_MERCHANT t2
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t2.ID
    AND t2.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND EXISTS (SELECT 1)
    AND t1.ID IN
    (SELECT
      t2.`DEVICE_ID`
    FROM
      TB_BIS_POS_ORDER t2
    WHERE t2.`CREATE_DATE` >= DATE_FORMAT(NOW(), '%Y-%m-%d'))
    AND t1.ID NOT IN
    (SELECT
      t2.`DEVICE_ID`
    FROM
      TB_BIS_POS_ORDER t2
    WHERE t2.`CREATE_DATE` <= DATE_FORMAT(NOW(), '%Y-%m-%d'))
  ) AS 'connected_today',

  (SELECT
    COUNT(DISTINCT (t1.`SN`))
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID`
    AND UNIX_TIMESTAMP(t2.CREATE_DATE) >= UNIX_TIMESTAMP(NOW()) - 60 * 60 * 2) AS 'trading_normally',

  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID`) AS 'total_transactions',

  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID`
    AND t2.`CREATE_DATE` >= DATE_FORMAT(NOW(), '%Y-%m-%d')) AS 'transactions_today'
FROM
  DUAL;

 

 

 

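2. After optimization: 100x faster, only 0.09 seconds

After going through the business logic with the developers, the query was rewritten as shown below. Execution time dropped from 13 seconds to 0.09 seconds, more than a 100x improvement.

Three strategies produced the speedup:

/* (1) Inner join + DISTINCT is slow; replace it with EXISTS */

/* (2) The IN subquery was not using the index; rewrite it as EXISTS */

/* (3) Do not wrap an indexed column in a function, or the index is not used; compare the bare column instead */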

 

SELECT SQL_NO_CACHE
  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_MERCHANT t2
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t2.ID
    AND t2.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
  ) AS 'installed',

  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_HEARTBEAT t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.`ID` = t2.`DEVICE_ID`
    AND t2.ENABLED = 1
  ) AS 'online',

  (
  /* Original version:
  SELECT
    COUNT(DISTINCT (t1.`SN`))
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID` */

  /* (1) Inner join + DISTINCT is slow; replaced with EXISTS.
     COUNT(t1.`SN`) matches COUNT(DISTINCT t1.`SN`) only if SN is unique per device. */
  SELECT
    COUNT(t1.`SN`)
  FROM
    TB_BIS_POS_DEVICE t1
  WHERE t1.`PROJECT_ID` = '1024'
    AND EXISTS (SELECT 1 FROM TB_BIS_POS_ORDER t2 WHERE t1.ID = t2.`DEVICE_ID`)
    AND EXISTS (SELECT 1 FROM TB_BIS_MERCHANT t3 WHERE t1.MERCHANT_ID = t3.ID AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A')
  ) AS 'connected',

  (
  /* Original version:
  SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_MERCHANT t2
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t2.ID
    AND t2.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND EXISTS (SELECT 1)
    AND t1.ID IN
    (SELECT
      t2.`DEVICE_ID`
    FROM
      TB_BIS_POS_ORDER t2
    WHERE t2.`CREATE_DATE` >= DATE_FORMAT(NOW(), '%Y-%m-%d'))
    AND t1.ID NOT IN
    (SELECT
      t2.`DEVICE_ID`
    FROM
      TB_BIS_POS_ORDER t2
    WHERE t2.`CREATE_DATE` <= DATE_FORMAT(NOW(), '%Y-%m-%d'))
  */

  /* (2) The IN subquery was not using the index; rewritten as EXISTS.
     Note: the NOT IN condition of the original is not carried over in this rewrite. */
  SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_MERCHANT t2
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t2.ID
    AND t2.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND EXISTS (SELECT 1 FROM TB_BIS_POS_ORDER t3 WHERE t3.`CREATE_DATE` >= DATE_FORMAT(NOW(), '%Y-%m-%d') AND t3.`DEVICE_ID` = t1.`ID`)
  ) AS 'connected_today',

  (SELECT
    COUNT(DISTINCT (t1.`SN`))
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID`
    /* Original: AND UNIX_TIMESTAMP(t2.CREATE_DATE) >= UNIX_TIMESTAMP(NOW()) - 60 * 60 * 2 */
    /* (3) Do not wrap the column in a function, or the index is not used.
       "Within the last 2 hours" means subtracting the interval from NOW(). */
    AND t2.CREATE_DATE >= DATE_SUB(NOW(), INTERVAL 2 HOUR)
  ) AS 'trading_normally',

  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID`
  ) AS 'total_transactions',

  (SELECT
    COUNT(*)
  FROM
    TB_BIS_POS_DEVICE t1,
    TB_BIS_POS_ORDER t2,
    TB_BIS_MERCHANT t3
  WHERE t1.`PROJECT_ID` = '1024'
    AND t1.MERCHANT_ID = t3.ID
    AND t3.SPACE_ID = 'DE907E67FB9B487FA762E6E9B795072A'
    AND t1.ID = t2.`DEVICE_ID`
    AND t2.`CREATE_DATE` >= DATE_FORMAT(NOW(), '%Y-%m-%d')
  ) AS 'transactions_today'
FROM
  DUAL;
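Strategy (3) can be checked on a toy table. The sketch below uses Python's built-in sqlite3 module purely as a stand-in: SQLite's planner is not MySQL's, and the table `pos_order`, the index `idx_create_date`, and the fixed "now" are all hypothetical. It compares a predicate that wraps the indexed column in a function with one that compares the bare column against a bound computed once.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pos_order (id INTEGER PRIMARY KEY, device_id INTEGER, create_date TEXT)"
)
conn.execute("CREATE INDEX idx_create_date ON pos_order (create_date)")

# Fixed "now" (treated as UTC) so the run is repeatable; one order every 10 minutes.
now = datetime(2016, 10, 13, 12, 0, 0, tzinfo=timezone.utc)
fmt = "%Y-%m-%d %H:%M:%S"
conn.executemany(
    "INSERT INTO pos_order VALUES (?, ?, ?)",
    [(i, i % 50, (now - timedelta(minutes=10 * i)).strftime(fmt)) for i in range(1000)],
)

cutoff = now - timedelta(hours=2)  # "within the last 2 hours"

# Function applied to the column: every row has to be converted before comparing.
wrapped_sql = (
    "SELECT COUNT(*) FROM pos_order "
    "WHERE CAST(strftime('%s', create_date) AS INTEGER) >= ?"
)
wrapped_args = (int(cutoff.timestamp()),)

# Bare column compared against a precomputed bound: an index range lookup is possible.
bare_sql = "SELECT COUNT(*) FROM pos_order WHERE create_date >= ?"
bare_args = (cutoff.strftime(fmt),)


def plan(sql, args):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[3] for row in rows)


wrapped_count = conn.execute(wrapped_sql, wrapped_args).fetchone()[0]
bare_count = conn.execute(bare_sql, bare_args).fetchone()[0]
wrapped_plan = plan(wrapped_sql, wrapped_args)
bare_plan = plan(bare_sql, bare_args)

print("wrapped:", wrapped_count, wrapped_plan)
print("bare:   ", bare_count, bare_plan)
```

Both predicates select the same rows; only the access path differs. On MySQL the equivalent check would be to run EXPLAIN on each form of the real query.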

 

 

 

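3. SQL optimization rule: let the small result set drive the large result set

 

You can borrow from this case when you run into something similar, though other situations also need attention. Next comes a rule that optimization had to follow in the era of mechanical disks: use the small result set to drive the large result set.

 

Always drive the large result set with the small result set

Many developers who have read a database development guide or taken some expert's online course like to let the small table drive the large table when optimizing SQL. This often works, but not 100% of the time; it depends on the actual scenario. The main reason is that the result set a large table returns after WHERE filtering is not necessarily larger than the one the small table returns, and may well be smaller. In that case, still letting the small table drive the large table produces the opposite of the intended performance effect.

 

bty: "Small table drives large table" is only a phrasing that is easy for developers to remember and understand. Developers should not cling to it; they need to understand the underlying reason.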

 

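In MySQL 5.7, Nested Loop is the only join method (hash joins arrived only in MySQL 8.0.18), so every join is executed as nested loops. The larger the driving result set, the more iterations are needed and the more often the driven table is accessed. Even if each access needs very little logical IO, many iterations add up to a large total, and every iteration costs CPU, so CPU work grows as well. If table size alone decides the driving table, and the small table's filtered result set is much larger than the large table's, the nested loop runs many more iterations, so letting the small table drive the large table is the inefficient choice. (In the mechanical-disk era IO is the biggest bottleneck: reducing IO speeds up the SQL, while more IO also means more CPU consumption and lower efficiency.) In the opposite case fewer iterations are needed, and total IO and CPU work are lower.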
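The argument above can be made concrete with a small simulation. This is a minimal sketch, not MySQL's executor: the tables `merchants` and `devices`, their sizes, and the dictionary standing in for an index are all made up for illustration. It counts how many times the driven side is probed under each choice of driving set.

```python
from collections import defaultdict


def nested_loop_join(driving_rows, driven_rows, key):
    """Index nested-loop join: one probe into the driven side per driving row."""
    index = defaultdict(list)  # stands in for an index on the driven table's join key
    for row in driven_rows:
        index[row[key]].append(row)

    probes = 0
    joined = []
    for outer in driving_rows:
        probes += 1  # each driving row costs one lookup on the driven table
        for inner in index.get(outer[key], []):
            joined.append((outer, inner))
    return joined, probes


# Hypothetical data: a "small" table of 1,000 merchants with no filter...
merchants = [{"merchant_id": m} for m in range(1000)]

# ...and a "large" table of 100,000 devices whose WHERE filter keeps only 50 rows.
devices = [
    {"device_id": d, "merchant_id": d % 1000, "project_id": 1024 if d % 2000 == 0 else 1}
    for d in range(100_000)
]
filtered_devices = [row for row in devices if row["project_id"] == 1024]

# Small TABLE drives: 1,000 probes.
rows_a, probes_small_table = nested_loop_join(merchants, filtered_devices, "merchant_id")

# Small RESULT SET drives (the filtered large table): 50 probes.
rows_b, probes_small_result = nested_loop_join(filtered_devices, merchants, "merchant_id")

print(len(rows_a), probes_small_table)
print(len(rows_b), probes_small_result)
```

Both orders produce the same joined rows, but driving from the filtered large table needs a twentieth of the probes, which is the point of judging by result set rather than by table size.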

 

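Join algorithms other than Nested Loop, such as Hash Join in Oracle, are likewise governed by result-set size rather than table size, so driving the large result set with the small one is still the best choice.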

 

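So when optimizing a join query, whether in MySQL or Oracle, the most basic principle is "use the small result set to drive the large result set". This reduces the number of iterations in the nested loop, and with it the total IO and CPU work, as in the following SQL template:

SELECT t1.c1, t2.c2 FROM small_result_set AS t1 LEFT JOIN large_result_set AS t2 ON t1.id = t2.cid WHERE t1.created_time > '2016-10-13' AND t1.is_del = '0' AND t2.project_id = 'XJ160603' AND ...;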

 

 

 

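4. A deeper look: IN vs. EXISTS

Following the rule that the small result set drives the large one, let's look more closely at IN and EXISTS. In the traditional description, IN joins the outer table and the inner table with a hash join, so the IN subquery drives the outer table. EXISTS loops over the outer table and queries the inner table on every iteration, so the outer table drives the EXISTS subquery. In this respect IN and EXISTS are opposites.

So the long-standing DBA advice that EXISTS is more efficient than IN assumes that ordinary developers do not understand this in depth; it is stated this bluntly to make it easy to follow. And because the subquery's result set is usually larger than the outer table, the advice holds in roughly 80% of cases.

A database engineer, however, should know the fundamental difference between IN and EXISTS. If the two tables being queried are about the same size, IN and EXISTS differ little. If one of the outer set and the subquery set is small and the other is large, use EXISTS when the subquery set is large and IN when the subquery set is small, always following the rule that the small result set drives the large one.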
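The rule of thumb above can be sketched as a rough cost model. The two functions below are hypothetical illustrations of the description in this section, not how MySQL executes either form; a real optimizer may pick a different strategy than the syntax suggests. Each returns the matching outer keys plus the number of probes made by the driving side.

```python
def exists_style(outer_keys, subquery_keys):
    """Outer set drives: one probe into the subquery's index per outer row."""
    sub_index = set(subquery_keys)  # stands in for an index on the subquery column
    probes, hits = 0, []
    for key in outer_keys:
        probes += 1
        if key in sub_index:
            hits.append(key)
    return sorted(hits), probes


def in_style(outer_keys, subquery_keys):
    """Subquery drives: one probe into the outer index per distinct subquery value."""
    outer_index = {}
    for key in outer_keys:
        outer_index.setdefault(key, []).append(key)
    probes, hits = 0, []
    for key in set(subquery_keys):
        probes += 1
        hits.extend(outer_index.get(key, []))
    return sorted(hits), probes


# Case 1: small outer set (10 rows), large subquery set (5,000 distinct values).
outer_small = list(range(10))
sub_large = [i % 5000 for i in range(100_000)]
hits_e1, probes_e1 = exists_style(outer_small, sub_large)
hits_i1, probes_i1 = in_style(outer_small, sub_large)

# Case 2: large outer set (100,000 rows), small subquery set (2 distinct values).
outer_large = list(range(100_000))
sub_small = [7, 7, 42]
hits_e2, probes_e2 = exists_style(outer_large, sub_small)
hits_i2, probes_i2 = in_style(outer_large, sub_small)

print("case 1: exists probes", probes_e1, "in probes", probes_i1)
print("case 2: exists probes", probes_e2, "in probes", probes_i2)
```

In both cases the two styles return the same rows. Under this model, EXISTS is cheaper when the subquery set is the large one, and IN is cheaper when the subquery set is the small one.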

