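1. What is a window function?
A window function defines a window for each row (the window being the set of rows the calculation operates on). It computes over that set of values without needing a GROUP BY clause to group the data, so a single result row can return both the columns of the underlying row and the aggregated columns.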
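2. What are window functions good for?
At heart a window function is still an aggregation, but it gives you more than a plain aggregate does: GROUP BY collapses each group into one row, while a window function keeps every detail row and attaches the group-level value to it.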
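To make the difference concrete, here is a minimal sketch that reuses the sample rows and column names from later in this post. The alias row_cnt is only an illustrative name, and the exact quoting rules for a column called date may vary between engines.

```sql
-- GROUP BY: one row per name; the individual dates are no longer available.
select a.name, count(*) as row_cnt
from
(
  select '張三' as name, '2021-04-11' as date
  union all
  select '李四' as name, '2021-04-09' as date
  union all
  select '趙四' as name, '2021-04-16' as date
  union all
  select '張三' as name, '2021-03-10' as date
  union all
  select '李四' as name, '2020-01-01' as date
) a
group by a.name;

-- Window function: all five rows are kept, and each row also carries
-- the number of rows that share its name.
select a.name, a.date, count(*) over (partition by a.name) as row_cnt
from
(
  select '張三' as name, '2021-04-11' as date
  union all
  select '李四' as name, '2021-04-09' as date
  union all
  select '趙四' as name, '2021-04-16' as date
  union all
  select '張三' as name, '2021-03-10' as date
  union all
  select '李四' as name, '2020-01-01' as date
) a;
```

The first query returns three rows (one per name), the second returns five, with row_cnt equal to 2 for 張三 and 李四 and 1 for 趙四.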
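3. The first_value / last_value functions
first_value(column) over (partition by col1, col2 order by col1, col2) returns the first value in each partition, according to the given ordering.
last_value(column) over (partition by col1, col2 order by col1, col2) returns the last value in the window frame, according to the given ordering.
first_value usage: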
select distinct a.date, a.name
  -- ascending order: the first value is the earliest date for that name
  ,first_value(date) over (partition by name order by date asc) as `earliest_date_per_person`
  -- descending order: the first value is the latest date for that name
  ,first_value(date) over (partition by name order by date desc) as `latest_date_per_person`
from
(
  select '張三' as name, '2021-04-11' as date
  union all
  select '李四' as name, '2021-04-09' as date
  union all
  select '趙四' as name, '2021-04-16' as date
  union all
  select '張三' as name, '2021-03-10' as date
  union all
  select '李四' as name, '2020-01-01' as date
) a
last_value usage:
select distinct a.date, a.name
  -- intended to be the latest date for each name
  ,last_value(date) over (partition by name order by date asc) as `latest_date_per_person`
from
(
  select '張三' as name, '2021-04-11' as date
  union all
  select '李四' as name, '2021-04-09' as date
  union all
  select '趙四' as name, '2021-04-16' as date
  union all
  select '張三' as name, '2021-03-10' as date
  union all
  select '李四' as name, '2020-01-01' as date
) a
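As the result shows, using last_value this way does not give each person's last date: every row simply gets its own date back. So what went wrong? Looking up how the function works explains it:
When an order by is given without an explicit frame, the window only reaches from the start of the partition to the current row. Strictly speaking the default is "range between unbounded preceding and current row", which behaves the same as "rows between unbounded preceding and current row" whenever the order by values within a partition are distinct, as they are here. The last row of such a frame is always the current row. With X standing for the current row, the frames can be read like this:
rows between unbounded preceding and current row: x ∈ (-∞, X]
rows between unbounded preceding and unbounded following: x ∈ (-∞, +∞)
rows between current row and unbounded following: x ∈ [X, +∞)
order by sorts ascending by default. If the order is switched to descending and the frame covers the whole partition, last_value() returns the same value as first_value() with ascending order.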
select distinct a.date, a.name
  -- start of partition up to the current row: returns the current row's date
  ,last_value(date) over (partition by name order by date rows between unbounded preceding and current row) as `(-∞, X]`
  -- the whole partition: returns the latest date for that name
  ,last_value(date) over (partition by name order by date rows between unbounded preceding and unbounded following) as `(-∞, +∞)`
  -- current row to end of partition: also returns the latest date for that name
  ,last_value(date) over (partition by name order by date rows between current row and unbounded following) as `[X, +∞)`
from
(
  select '張三' as name, '2021-04-11' as date
  union all
  select '李四' as name, '2021-04-09' as date
  union all
  select '趙四' as name, '2021-04-16' as date
  union all
  select '張三' as name, '2021-03-10' as date
  union all
  select '李四' as name, '2020-01-01' as date
) a
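The remark about descending order can be checked with a small sketch on the same sample rows. The aliases first_asc and last_desc_full are illustrative names only; the point is that the two columns agree once the last_value frame spans the whole partition.

```sql
select distinct a.name
  -- first value in ascending order: the earliest date for that name
  ,first_value(a.date) over (partition by a.name order by a.date asc) as first_asc
  -- last value in descending order over the full partition: also the earliest date
  ,last_value(a.date) over (partition by a.name order by a.date desc
      rows between unbounded preceding and unbounded following) as last_desc_full
from
(
  select '張三' as name, '2021-04-11' as date
  union all
  select '李四' as name, '2021-04-09' as date
  union all
  select '趙四' as name, '2021-04-16' as date
  union all
  select '張三' as name, '2021-03-10' as date
  union all
  select '李四' as name, '2020-01-01' as date
) a
```

Both columns should come back as 2021-03-10 for 張三, 2020-01-01 for 李四 and 2021-04-16 for 趙四. Without the explicit frame, the last_value column would instead repeat each row's own date.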
rows can also be replaced with range; that will be covered in a later update.
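As a brief preview, rows and range only differ when several rows in a partition share the same order by value. The sketch below uses hypothetical data (two rows for 張三 with the same date, which is not part of the sample set above) and illustrative aliases cnt_rows and cnt_range.

```sql
select a.name, a.date
  -- rows: counts physical rows up to and including the current one
  ,count(*) over (partition by a.name order by a.date
      rows between unbounded preceding and current row) as cnt_rows
  -- range: also includes every row whose date equals the current row's date
  ,count(*) over (partition by a.name order by a.date
      range between unbounded preceding and current row) as cnt_range
from
(
  -- hypothetical rows with a duplicated date, for illustration only
  select '張三' as name, '2021-03-10' as date
  union all
  select '張三' as name, '2021-03-10' as date
  union all
  select '張三' as name, '2021-04-11' as date
) a
```

For the two rows dated 2021-03-10, cnt_rows is 1 and 2 while cnt_range is 2 for both, because range treats rows with equal order by values as peers. For the 2021-04-11 row both counts are 3.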