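Many claims like the following can be found online:
If the two tables in the query are about the same size, IN and EXISTS perform about the same.
If one table is small and the other is large, use EXISTS when the subquery table is the large one and IN when the subquery table is the small one.
For example, take table A (small) and table B (large).
1:
select * from A where cc in (select cc from B); -- slow: uses the index on A.cc
select * from A where exists (select cc from B where cc = A.cc); -- fast: uses the index on B.cc
Conversely,
2:
select * from B where cc in (select cc from A); -- fast: uses the index on B.cc
select * from B where exists (select cc from A where cc = B.cc); -- slow: uses the index on A.cc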
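Whether these rules of thumb hold depends on the MySQL version and on what the optimizer decides, so it is safer to look at the plan than to trust the rule. A minimal sketch, assuming A and B each have an integer cc column with a secondary index (the column types and the index names idx_a_cc / idx_b_cc are assumptions made for illustration):

```sql
-- Hypothetical schema; only the table names and the cc column come from the text.
CREATE TABLE A (id INT PRIMARY KEY, cc INT, INDEX idx_a_cc (cc));
CREATE TABLE B (id INT PRIMARY KEY, cc INT, INDEX idx_b_cc (cc));

-- Compare the plans of the two forms.
-- In the output, the "key" column shows which index each table access uses.
EXPLAIN SELECT * FROM A WHERE cc IN (SELECT cc FROM B);
EXPLAIN SELECT * FROM A WHERE EXISTS (SELECT 1 FROM B WHERE B.cc = A.cc);
```

On empty or tiny tables the optimizer may pick a different plan than on production data, so run the EXPLAIN statements against realistic row counts.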
Optimize the following statements:
select count(uid) from user where uid in (SELECT did FROM demo);
select count(uid) from user where exists (SELECT 1 FROM demo where demo.did = user.uid);
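1. The slowness comes from the inner query scanning its table once for every outer row it is compared against. One alternative is to nest one more level of subquery so that the repeated scans are avoided. Note that the rewrite counts rows of demo rather than rows of user, so it returns the same number as the statements above only when uid and did are each unique within their tables.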
SELECT count(did) FROM demo where exists (SELECT uid FROM (SELECT uid from user) as b where b.uid = demo.did);
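Another form worth comparing is a join against the de-duplicated ids, which keeps the meaning of the original count. The sketch below uses a tiny made-up data set (the rows, the index name idx_did and the aliases u and d are assumptions) so the three forms can be run side by side:

```sql
-- Tiny hypothetical data set; table and column names follow the queries above.
CREATE TABLE `user` (uid INT PRIMARY KEY);
CREATE TABLE demo (did INT, INDEX idx_did (did));
INSERT INTO `user` (uid) VALUES (1), (2), (3), (4);
INSERT INTO demo (did) VALUES (1), (1), (2), (5);

-- Original form: counts users that appear in demo (uid 1 and 2).
SELECT COUNT(uid) FROM `user` WHERE uid IN (SELECT did FROM demo);

-- Join against the de-duplicated ids: same result as the IN form.
SELECT COUNT(u.uid)
FROM `user` AS u
JOIN (SELECT DISTINCT did FROM demo) AS d ON d.did = u.uid;

-- Nested-EXISTS form from the text: counts matching demo rows (did 1, 1 and 2).
SELECT COUNT(did) FROM demo
WHERE EXISTS (SELECT uid FROM (SELECT uid FROM `user`) AS b WHERE b.uid = demo.did);
```

On this data the first two statements return 2 and the third returns 3, because did = 1 occurs twice in demo.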
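2. The second optimization is to run the subquery on its own first and use GROUP_CONCAT to join the resulting values into one comma-separated string.
If the string gets truncated because the length limit is too small, raise the limit with: SET SESSION group_concat_max_len = 102400;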
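A minimal sketch of this step on a throwaway table (the table t and its rows are made up for illustration). It reads the current limit, which defaults to 1024 in MySQL, and uses DISTINCT so that repeated ids do not inflate the string:

```sql
-- Hypothetical table standing in for the subquery's source.
CREATE TABLE t (user_id INT);
INSERT INTO t (user_id) VALUES (24986), (24986), (3), (7);

-- Current limit for this session; longer results are truncated.
SELECT @@SESSION.group_concat_max_len;

-- Raise the limit, then build a de-duplicated, comma-separated id list
-- and report its length so it can be compared with the limit.
SET SESSION group_concat_max_len = 102400;
SELECT GROUP_CONCAT(DISTINCT user_id) AS id_list,
       CHAR_LENGTH(GROUP_CONCAT(DISTINCT user_id)) AS list_length
FROM t;
```

If list_length comes out equal to the limit, the list was probably cut off and the limit should be raised further.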
Original SQL:
SELECT
c.id
FROM
c -- this table has 712,995 rows
LEFT JOIN u ON c.user_id = u.id
LEFT JOIN doc ON c.doctor_id = doc.id
LEFT JOIN s ON c.meal_id = s.id
WHERE
s.renew = 1
AND c.orderstatus = 1
AND c.endtime < UNIX_TIMESTAMP()
AND c.org_type = 'c'
AND u.is_doctor = 0
AND u.active = 1
AND doc.is_doctor IN (4, 5)
AND doc.is_family_doctor = 1
AND doc.active = 1
AND c.user_id NOT IN (
SELECT
user_id
FROM
d -- this table has 934,455 rows
WHERE
d.log LIKE '%結束'
);
-- Execution time: 2.265 s
After optimization:
SET SESSION group_concat_max_len = 102400;
SELECT GROUP_CONCAT(user_id) FROM d WHERE d.log LIKE '%結束'; -- took 0.521 s
SELECT
c.id
FROM
c
LEFT JOIN u ON c.user_id = u.id
LEFT JOIN doc ON c.doctor_id = doc.id
LEFT JOIN s ON c.meal_id = s.id
WHERE
s.renew = 1
AND c.orderstatus = 1
AND c.endtime < UNIX_TIMESTAMP()
AND c.org_type = 'c'
AND u.is_doctor = 0
AND u.active = 1
AND doc.is_doctor IN (4, 5)
AND doc.is_family_doctor = 1
AND doc.active = 1
AND c.user_id NOT IN (24986,24986,24986,24986,24986,24986, ... /* roughly 5,000 ids */);
-- Execution time: 1.579 s
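The main query ran 0.686 s faster, but GROUP_CONCAT(user_id) took another 0.521 s, so overall (1.579 s + 0.521 s = 2.100 s versus 2.265 s) there is little difference at the current data volume,
and the second approach also has to take the size of the id string into account.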
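One more variant that may be worth measuring, since it needs no id string at all, is to write the exclusion as NOT EXISTS. The sketch below runs on simplified stand-ins for tables c and d (only the columns used here; the rows, the index name and the charset clause are assumptions, not the production schema). It also shows that the two forms differ when d.user_id can be NULL:

```sql
-- Simplified, hypothetical stand-ins for tables c and d.
CREATE TABLE c (id INT PRIMARY KEY, user_id INT);
CREATE TABLE d (
    user_id INT,
    log VARCHAR(255),
    INDEX idx_user_id (user_id)
) DEFAULT CHARSET = utf8mb4;
INSERT INTO c (id, user_id) VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO d (user_id, log) VALUES (20, '服務結束'), (30, '服務開始'), (NULL, '結束');

-- NOT IN: the NULL user_id in the subquery result means the predicate
-- is never true, so this returns no rows.
SELECT c.id FROM c
WHERE c.user_id NOT IN (SELECT user_id FROM d WHERE d.log LIKE '%結束');

-- NOT EXISTS: a NULL user_id never matches, so c.id 1 and 3 are returned.
SELECT c.id FROM c
WHERE NOT EXISTS (
    SELECT 1 FROM d
    WHERE d.user_id = c.user_id
      AND d.log LIKE '%結束'
);
```

Whether this is faster than the id-list version on the real tables is not something the sketch can show; it would have to be timed the same way as the two queries above.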
That is all I know for now; I will look into it more closely when I have time.