SQL優化--inner、left join替換in、not in、except


新系統上線,用戶基數16萬,各種查詢timeout。打開砂鍋問到底,直接看sql語句吧,都是淚呀,一大堆in\not in\except。這里總結一下,怎么替換掉in\not in\except。

1. in/except->left join

查詢目的

根據

  • 客戶表(Customer,按照站點、冊本划分,16萬數據)
  • 水表表(Meter,16萬數據)
  • 水表抄表數據表(Meter_Data,遠傳表每天更新,27萬數據)

關聯查詢,查詢某天某個冊本下水表未上傳抄表數據的用戶。

原查詢結構

select * 
from Customer cs
where 
cs.Group_No = '冊本編號' and
cs.Customer_No in 
(
    select Customer_No 
    from  Customer cs
    left join Meter me on cs.Customer_No = me.Customer_No
    where cs.Group_No = '冊本編號'
    except
    select Customer_No
    from Customer cs
    left join Meter me on cs.Customer_No = me.Customer_No
    inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
    where cs.Group_NO='冊本編號'
)

原查詢思路

  1. 查詢出目標冊本已上傳數據的用戶編號
select Customer_No
from Customer cs
left join Meter me on cs.Customer_No = me.Customer_No
inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
where cs.Group_NO='冊本編號'
  1. 查詢出目標冊本全部用戶編號
select Customer_No 
from  Customer cs
left join Meter me on cs.Customer_No = me.Customer_No
where cs.Group_No = '冊本編號'
  1. 全部用戶編號中排除已上傳數據的用戶編號,即為未上傳數據的用戶編號
全部用戶編號 except 已抄表的用戶編號
  1. 查詢出在未抄表用戶編號集合中的用戶信息。
select * 
from Customer cs
where 
cs.Group_No = '冊本編號' and
cs.Customer_No in 
(全部用戶編號 except 已抄表的用戶編號)

思路倒是沒有問題,但是in+except查詢效率不要太慢了,本來想測試個時間,結果執行了幾分鍾愣是沒出結果,直接終止掉了

優化查詢結構

其實in\not in\except這些語法在查詢中使用,效率不高是公認的事實,但是可能是由於語義比較明顯吧,很多人還是喜歡這樣用。我們這里使用left join來替代in+except。這里就來改掉上面的查詢:

select cs.*
from Customer cs
left join Meter me on cs.Customer_No = me.Customer_No
left join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
where cs.Group_NO='冊本編號' and md.meter_no is null;

優化查詢思路

  1. 用left join代替in+except,通過left join獲取目標冊本下全部用戶的信息,並與當天上傳的抄表數據進行連接;
  2. 連接中,右表為空即抄表數據為空的,即為當前未上傳數據的客戶信息;

left join on expression where expression 執行時,首先確保左表數據全部返回,然后應用on后指定的條件。因此,on的條件如果是對左表數據的過濾,是無效的;對右表數據的過濾是有效的。對左表數據的過濾條件,需要放到where條件中。

2. not in->left join

上面in+except的寫法,可以使用not in簡化一下,但是一樣效率不高。這里想要說明的是not in也可以很方便的使用left join替換。

not in結構

select * 
from Customer cs
where 
cs.Group_No = '冊本編號' and
cs.Customer_No not in 
(
    select Customer_No
    from Customer cs
    left join Meter me on cs.Customer_No = me.Customer_No
    inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
    where cs.Group_NO='冊本編號'
)

left join結構

select cs.*
from Customer cs
left join Meter me on cs.Customer_No = me.Customer_No
left join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
where cs.Group_NO='冊本編號' and md.meter_no is null;

3. in->inner join

查詢目的

還是上面的查詢背景,這里查詢某天某個冊本已經上傳抄表數據的用戶信息。

in結構

select * 
from Customer cs
where 
cs.Group_No = '冊本編號' and
cs.Customer_No in 
(
    select Customer_No
    from Customer cs
    left join Meter me on cs.Customer_No = me.Customer_No
    inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
    where cs.Group_NO='冊本編號'
)

這里使用in不夠高效,但是我們使用left join是否可以呢?

left join結構

select cs.*
from Customer cs
left join Meter me on cs.Customer_No = me.Customer_No
left join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
where cs.Group_NO='冊本編號' and md.meter_no is not null;

left join結構的話,這里需要使用is not null作為篩選條件。但是is not null同樣非常低效。因此我們使用inner join

inner join結構

select cs.*
from Customer cs
left join Meter me on cs.Customer_No = me.Customer_No
inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
where cs.Group_NO='冊本編號';

inner join通過連接操作,直接獲取到已上傳抄表數據的用戶信息。

4. not in -> in -> inner join

前面的查詢場景中,我們默認的條件是未上傳抄表數據的用戶,當天在meter_data表是沒有記錄的。現在假設我們每天凌晨初始化meter_data表,設置抄表數值默認為零,抄表數據上傳默認為state=0未上傳。上傳后,更新抄表數值和抄表狀態state=1。

這時,我們來優化上面的not in查詢結構還有另外一種思路。

not in結構

select * 
from Customer cs
where 
cs.Group_No = '冊本編號' and
cs.Customer_No not in 
(
    select Customer_No
    from Customer cs
    left join Meter me on cs.Customer_No = me.Customer_No
    inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
    where cs.Group_NO='冊本編號' and meter.state=1
)

in結構

通過篩選條件取反,變換not in->in

select * 
from Customer cs
where 
cs.Group_No = '冊本編號' and
cs.Customer_No in 
(
    select Customer_No
    from Customer cs
    left join Meter me on cs.Customer_No = me.Customer_No
    inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
    where cs.Group_NO='冊本編號' and meter.state=0
)

inner join結構

select cs.*
from Customer cs
left join Meter me on cs.Customer_No = me.Customer_No
inner join Meter_data md on me.meter_no = md.meter_no and md.date = '2019-04-09'
where cs.Group_NO='冊本編號' and meter.state=0;

5. 總結如下

上面的查詢結構拆分出來后,大家可能覺得這么簡單的sql怎么可能寫成這個沙雕。其實真實業務系統,還有關聯其他將近10張表。這里想說的是,在in\not in\except這種查詢結構時,如果涉及到的數據量較大,建議堅決用連接替換。

  • ... in (all except sub)... 查詢結構可以轉換為->left join
  • ... not in ... 查詢結構可以轉換為->left join
  • ... not in ... 查詢也可以轉換為 in -> inner join,這里需要確認轉換查詢條件時,是否有對應的數據
  • ... in 查詢結構可以轉換為->inner join


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM