MySQL JOIN (Part 4): JOIN Optimization in Practice: Fast Matching


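This post covers how to speed up the search for matching rows. From "MySQL JOIN (Part 2): How JOIN Works" we know that joining two tables means repeatedly taking a row from the driving table, finding the rows in the driven table that match it, and joining them. In essence this is a lookup, and the most common way to speed up a lookup is to build an index. So how should the index be built? Let's work through it, starting with some test data.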

    CREATE TABLE t1 (
        id INT PRIMARY KEY AUTO_INCREMENT,
        type INT
    );
    SELECT COUNT(*) FROM t1;
    +----------+
    | COUNT(*) |
    +----------+
    |   110000 |
    +----------+
    CREATE TABLE t2 (
        id INT PRIMARY KEY AUTO_INCREMENT,
        type INT
    );
    SELECT COUNT(*) FROM t2;
    +----------+
    | COUNT(*) |
    +----------+
    |      100 |
    +----------+
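
The statements that loaded the rows are not shown above. As a minimal sketch, the tables could be filled with a stored procedure like the one below. The procedure name fill_test_data and the distribution of type values are assumptions, not the original data: they are chosen so that each type value appears about five times in t1 and once in t2, which is roughly in line with the rows estimates that appear in the plans in this post. DELIMITER is a directive of the mysql command-line client, so other clients may need the procedure body submitted differently.

    -- Hypothetical helper, not part of the original post.
    DELIMITER //
    CREATE PROCEDURE fill_test_data()
    BEGIN
        DECLARE i INT DEFAULT 0;
        START TRANSACTION;  -- a single transaction keeps the loop fast
        -- Assumed distribution: about 22000 distinct type values in t1
        WHILE i < 110000 DO
            INSERT INTO t1 (type) VALUES (FLOOR(RAND() * 22000));
            SET i = i + 1;
        END WHILE;
        -- Assumed distribution: 100 distinct type values in t2
        SET i = 0;
        WHILE i < 100 DO
            INSERT INTO t2 (type) VALUES (i);
            SET i = i + 1;
        END WHILE;
        COMMIT;
    END //
    DELIMITER ;

    CALL fill_test_data();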

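Left join

In a left join, the left table is the driving table and the right table is the driven table. To find the matching rows in the driven table quickly, we index the join column of the right table, which improves join performance.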

    -- First, with no index on either table
    EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.type=t2.type;
    +----+-------+------+------+--------+----------------------------------------------------+
    | id | table | type | key  | rows   | Extra                                              |
    +----+-------+------+------+--------+----------------------------------------------------+
    |  1 | t1    | ALL  | NULL | 110428 | NULL                                               |
    |  1 | t2    | ALL  | NULL |    100 | Using where; Using join buffer (Block Nested Loop) |
    +----+-------+------+------+--------+----------------------------------------------------+

    -- Try indexing the left table: little improvement
    CREATE INDEX idx_type ON t1(type);
    EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.type=t2.type;
    +----+-------+-------+----------+--------+----------------------------------------------------+
    | id | table | type  | key      | rows   | Extra                                              |
    +----+-------+-------+----------+--------+----------------------------------------------------+
    |  1 | t1    | index | idx_type | 110428 | Using index                                        |
    |  1 | t2    | ALL   | NULL     |    100 | Using where; Using join buffer (Block Nested Loop) |
    +----+-------+-------+----------+--------+----------------------------------------------------+

    -- Try indexing the right table instead: a big improvement.
    -- t2 is now read by ref lookup on idx_type (Using index), 1 row per lookup.
    DROP INDEX idx_type ON t1;
    CREATE INDEX idx_type ON t2(type);
    EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.type=t2.type;
    +----+-------+------+---------------+----------+--------+-------------+
    | id | table | type | possible_keys | key      | rows   | Extra       |
    +----+-------+------+---------------+----------+--------+-------------+
    |  1 | t1    | ALL  | NULL          | NULL     | 110428 | NULL        |
    |  1 | t2    | ref  | idx_type      | idx_type |      1 | Using index |
    +----+-------+------+---------------+----------+--------+-------------+

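Right join

In a right join, the right table is the driving table and the left table is the driven table. To find the matching rows in the driven table quickly, we index the join column of the left table, which improves join performance.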

    DROP INDEX idx_type ON t2;
    -- No index on either table
    EXPLAIN SELECT * FROM t1 RIGHT JOIN t2 ON t1.type=t2.type;
    +----+-------+------+------+--------+----------------------------------------------------+
    | id | table | type | key  | rows   | Extra                                              |
    +----+-------+------+------+--------+----------------------------------------------------+
    |  1 | t2    | ALL  | NULL |    100 | NULL                                               |
    |  1 | t1    | ALL  | NULL | 110428 | Using where; Using join buffer (Block Nested Loop) |
    +----+-------+------+------+--------+----------------------------------------------------+

    -- Index the right table: little improvement
    CREATE INDEX idx_type ON t2(type);
    EXPLAIN SELECT * FROM t1 RIGHT JOIN t2 ON t1.type=t2.type;
    +----+-------+-------+---------------+----------+--------+----------------------------------------------------+
    | id | table | type  | possible_keys | key      | rows   | Extra                                              |
    +----+-------+-------+---------------+----------+--------+----------------------------------------------------+
    |  1 | t2    | index | NULL          | idx_type |    100 | Using index                                        |
    |  1 | t1    | ALL   | NULL          | NULL     | 110428 | Using where; Using join buffer (Block Nested Loop) |
    +----+-------+-------+---------------+----------+--------+----------------------------------------------------+

    -- Try indexing the left table instead: a big improvement
    DROP INDEX idx_type ON t2;
    CREATE INDEX idx_type ON t1(type);
    EXPLAIN SELECT * FROM t1 RIGHT JOIN t2 ON t1.type=t2.type;
    +----+-------+------+---------------+--------------+------+-------------+
    | id | table | type | possible_keys | ref          | rows | Extra       |
    +----+-------+------+---------------+--------------+------+-------------+
    |  1 | t2    | ALL  | NULL          | NULL         |  100 | NULL        |
    |  1 | t1    | ref  | idx_type      | test.t2.type |    5 | Using index |
    +----+-------+------+---------------+--------------+------+-------------+

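Inner join

As we know, the MySQL optimizer optimizes inner joins: regardless of which table is written first, it uses the small table to drive the large table. So to optimize an inner join, build the index on the large table.

One caveat: if the index is built on the small table, the optimizer decides that driving from the large table is cheaper, because every lookup into the small table becomes an index lookup, and it switches to having the large table drive the small one.

If the "small table drives large table" strategy for inner joins is unfamiliar, see "MySQL JOIN (Part 3): JOIN Optimization in Practice: Number of Inner-Loop Iterations".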

    DROP INDEX idx_type ON t1;
    -- No index on either table: t2 drives t1
    EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.type=t2.type;
    +----+-------+------+------+--------+----------------------------------------------------+
    | id | table | type | key  | rows   | Extra                                              |
    +----+-------+------+------+--------+----------------------------------------------------+
    |  1 | t2    | ALL  | NULL |    100 | NULL                                               |
    |  1 | t1    | ALL  | NULL | 110428 | Using where; Using join buffer (Block Nested Loop) |
    +----+-------+------+------+--------+----------------------------------------------------+

    -- Index t2: the optimizer notices and lets the large table drive the small one
    CREATE INDEX idx_type ON t2(type);
    EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.type=t2.type;
    +----+-------+------+----------+--------+-------------+
    | id | table | type | key      | rows   | Extra       |
    +----+-------+------+----------+--------+-------------+
    |  1 | t1    | ALL  | NULL     | 110428 | Using where |
    |  1 | t2    | ref  | idx_type |      1 | Using index |
    +----+-------+------+----------+--------+-------------+

    -- Index t1: t1 is the large table, so this follows the
    -- "small table drives large table" rule and performs better than the statement above
    DROP INDEX idx_type ON t2;
    CREATE INDEX idx_type ON t1(type);
    EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.type=t2.type;
    +----+-------+------+---------------+----------+------+-------------+
    | id | table | type | possible_keys | key      | rows | Extra       |
    +----+-------+------+---------------+----------+------+-------------+
    |  1 | t2    | ALL  | NULL          | NULL     |  100 | Using where |
    |  1 | t1    | ref  | idx_type      | idx_type |    5 | Using index |
    +----+-------+------+---------------+----------+------+-------------+
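
The original post does not do this, but one way to look at the join order the optimizer rejected is MySQL's STRAIGHT_JOIN, which reads the left table before the right table. The sketch below assumes the index is currently on t1 only, as in the last example. No output is shown because the plan depends on the data and the server version.

    -- Force t1 (the large table) to drive t2, then compare this plan
    -- with the one the optimizer chose on its own.
    EXPLAIN SELECT * FROM t1 STRAIGHT_JOIN t2 ON t1.type=t2.type;

Comparing the rows and Extra columns of the two plans shows why the optimizer prefers to drive from the small table when the large table carries the index.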

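Three-table join

The examples above all join two tables, but three-table joins work the same way: identify the driving table and the driven tables, then build indexes on the driven tables to improve join performance.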
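
As a rough sketch of that idea, the example below chains two left joins. The table t3 is hypothetical and exists only for illustration, and the sketch assumes t2 has no idx_type index yet, as is the case after the last inner join example. Index names are scoped to a table in MySQL, so reusing the name idx_type on t3 is fine.

    -- t3 is a hypothetical third table, not part of the original post
    CREATE TABLE t3 (
        id INT PRIMARY KEY AUTO_INCREMENT,
        type INT
    );

    -- In this chain of left joins t1 drives, while t2 and t3 are driven,
    -- so the join columns of t2 and t3 are the ones to index.
    CREATE INDEX idx_type ON t2(type);
    CREATE INDEX idx_type ON t3(type);

    EXPLAIN SELECT *
    FROM t1
    LEFT JOIN t2 ON t1.type=t2.type
    LEFT JOIN t3 ON t2.type=t3.type;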

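Summary

To optimize a JOIN from the fast-matching angle, first work out which table is the driving table and which is the driven table, then build an index on the join column of the driven table.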

