Introduction to the OVER(PARTITION BY) functions
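A windowing clause defines the window of rows that an analytic function operates on. The window can change from row to row. For example:
1: What can follow over:
   over(order by salary): accumulate in salary order. With only an order by, the default window is range between unbounded preceding and current row.
   over(partition by deptno): partition the rows by department.
2: The window range:
   over(order by salary range between 5 preceding and 5 following): the window covers the rows whose salary lies between the current row's salary minus 5 and the current row's salary plus 5.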
Example:
--sum(s) over(order by s range between 2 preceding and 2 following) sums s over the rows whose s is within 2 below to 2 above the current row's s
select name, class, s,
       sum(s) over(order by s range between 2 preceding and 2 following) mm
  from t2

adf   3  45   45   --45 minus 2 and plus 2 gives 43 to 47; only 45 falls in that range
asdf  3  55   55
cfe   2  74   74
3dd   3  78  158   --the range 76 to 80 contains 78 and 80, which sum to 158
fda   1  80  158
gds   2  92   92
ffd   1  95  190
dss   1  95  190
ddd   3  99  198
gf    3  99  198
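The range window above can be reproduced outside Oracle as a quick check. The following is a minimal sketch that assumes Python's standard sqlite3 module as a stand-in database (numeric range offsets need SQLite 3.28 or later); the rows are the t2 rows listed in this article, and the variable names are only illustrative.

```python
import sqlite3

# (name, class, s) rows copied from the t2 table used in this article.
rows = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t2 (name text, class integer, s integer)")
conn.executemany("insert into t2 values (?, ?, ?)", rows)

# A RANGE frame compares the values of s, not row positions:
# each row sums every s that lies within [s - 2, s + 2].
query = """
    select name, s,
           sum(s) over (order by s
                        range between 2 preceding and 2 following) as mm
    from t2
    order by s, name
"""
range_sums = {name: mm for name, s, mm in conn.execute(query)}

for name, mm in range_sums.items():
    print(name, mm)
```

Because the frame is defined on values, the two rows with s = 95 each see both 95s, and 78 and 80 see each other.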
3: Functions commonly combined with over
The class score table t2 is used below to illustrate them.
The contents of t2 are:
name  class  s
cfe   2      74
dss   1      95
ffd   1      95
fda   1      80
gds   2      92
gf    3      99
ddd   3      99
adf   3      45
asdf  3      55
3dd   3      78
select *
  from (select name, class, s,
               rank() over(partition by class order by s desc) mm
          from t2)
 where mm = 1;
The result is:
dss  1  95  1
ffd  1  95  1
gds  2  92  1
gf   3  99  1
ddd  3  99  1
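Notes:
1. To find the top score, do not use row_number(): if two students in the same class tie for first place, row_number() returns only one of them, and which one is arbitrary unless the order by breaks the tie.
select *
  from (select name, class, s,
               row_number() over(partition by class order by s desc) mm
          from t2)
 where mm = 1;
(class, s and mm are shown)
1  95  1   --two students scored 95, but only one is returned
2  92  1
3  99  1   --two students scored 99, but again only one is returned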
2. rank() and dense_rank() return all tied rows. As shown above, rank() finds every student tied for first place.
The difference between rank() and dense_rank():
--rank() leaves gaps: when two rows tie for second place, the next rank is fourth
select name, class, s,
       rank() over(partition by class order by s desc) mm
  from t2
dss   1  95  1
ffd   1  95  1
fda   1  80  3   --jumps straight to 3
gds   2  92  1
cfe   2  74  2
gf    3  99  1
ddd   3  99  1
3dd   3  78  3
asdf  3  55  4
adf   3  45  5
--dense_rank() is consecutive: when two rows tie for second place, the next rank is still third
select name, class, s,
       dense_rank() over(partition by class order by s desc) mm
  from t2
dss   1  95  1
ffd   1  95  1
fda   1  80  2   --consecutive ranking (2 follows the tied 1s)
gds   2  92  1
cfe   2  74  2
gf    3  99  1
ddd   3  99  1
3dd   3  78  2
asdf  3  55  3
adf   3  45  4
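To see the three ranking functions side by side, here is a minimal sketch that assumes Python's sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later). The t2 rows come from this article; the extra "name" tiebreaker in the row_number() ordering is an addition made here only so that the output is deterministic.

```python
import sqlite3

# (name, class, s) rows copied from the t2 table used in this article.
rows = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t2 (name text, class integer, s integer)")
conn.executemany("insert into t2 values (?, ?, ?)", rows)

query = """
    select name, class, s,
           rank()       over (partition by class order by s desc) as rk,
           dense_rank() over (partition by class order by s desc) as drk,
           -- "name" is added as a tiebreaker (an assumption of this sketch)
           row_number() over (partition by class order by s desc, name) as rn
    from t2
"""
ranks = {name: (rk, drk, rn) for name, cls, s, rk, drk, rn in conn.execute(query)}

# Filtering on rank 1 keeps every tied top scorer; row number 1 keeps one per class.
top_by_rank = sorted(name for name, (rk, drk, rn) in ranks.items() if rk == 1)
top_by_row_number = sorted(name for name, (rk, drk, rn) in ranks.items() if rn == 1)

print(top_by_rank)
print(top_by_row_number)
```

Filtering on rk = 1 keeps five rows, while rn = 1 keeps three, which is the behaviour described in note 1.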
--Using sum() over()
select name, class, s,
       sum(s) over(partition by class order by s desc) mm   --running total of scores within each class
  from t2
dss   1  95  190   --the two 95s tie for first place, so both rows show the sum of the two tied scores
ffd   1  95  190
fda   1  80  270   --the tied first-place scores plus the next score
gds   2  92   92
cfe   2  74  166
gf    3  99  198
ddd   3  99  198
3dd   3  78  276
asdf  3  55  331
adf   3  45  376
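The tied rows share one running total because the default window with order by is a range window, which includes all rows tied with the current one. The sketch below contrasts that default with an explicit rows window. It assumes Python's sqlite3 module as a stand-in for Oracle and uses only the class 1 rows from this article; the "name" tiebreaker in the second window is an addition for deterministic output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t2 (name text, class integer, s integer)")
# Class 1 rows from the t2 table used in this article.
conn.executemany(
    "insert into t2 values (?, ?, ?)",
    [("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80)],
)

query = """
    select name,
           -- default frame: range between unbounded preceding and current row
           sum(s) over (partition by class order by s desc) as range_sum,
           -- explicit row-based frame; "name" breaks the tie (sketch assumption)
           sum(s) over (partition by class order by s desc, name
                        rows between unbounded preceding and current row) as rows_sum
    from t2
"""
running = {name: (range_sum, rows_sum)
           for name, range_sum, rows_sum in conn.execute(query)}

for name, sums in running.items():
    print(name, sums)
```

With the rows window the first tied row shows 95 rather than 190, which is the usual choice when a strictly row-by-row running total is wanted.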
Using first_value() over() and last_value() over()
--Find the first record type and the last record type for each of these three circuits
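Note: using rows between unbounded preceding and unbounded following
--Result of last_value without rows between unbounded preceding and unbounded following
As the figure below shows, if rows between unbounded preceding and unbounded following is not used, the default window ends at the current row (and any rows tied with it), so last_value returns the current row's value instead of the last record in the partition.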
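The effect of the window on last_value can be sketched as follows. The circuit data from the original is not available, so the circuit_log table, its columns, and its rows are hypothetical, and Python's sqlite3 module is assumed as a stand-in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, invented for this sketch only.
conn.execute("create table circuit_log (circuit_id text, seq integer, rec_type text)")
conn.executemany("insert into circuit_log values (?, ?, ?)", [
    ("A", 1, "open"), ("A", 2, "modify"), ("A", 3, "close"),
    ("B", 1, "open"), ("B", 2, "close"),
])

query = """
    select circuit_id, seq,
           first_value(rec_type) over (partition by circuit_id order by seq)
               as first_type,
           -- default frame stops at the current row
           last_value(rec_type) over (partition by circuit_id order by seq)
               as last_default,
           -- full-partition frame
           last_value(rec_type) over (partition by circuit_id order by seq
               rows between unbounded preceding and unbounded following)
               as last_full
    from circuit_log
    order by circuit_id, seq
"""
result = {(cid, seq): (first_type, last_default, last_full)
          for cid, seq, first_type, last_default, last_full in conn.execute(query)}

for key, value in result.items():
    print(key, value)
```

In this sketch last_default simply echoes the current row's rec_type, while last_full gives the same final value on every row of the circuit.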
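The data is as follows:
Take the first record of the circuit. With ignore nulls added, if the field being tested is null in the first record, the next record is used instead. The result is shown below: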
--Using lead() over() (fetch the value from the row N rows after the current one)
lead(expression, <offset>, <default>)
with a as
 (select 1 id, 'a' name from dual
  union
  select 2 id, 'b' name from dual
  union
  select 3 id, 'c' name from dual
  union
  select 4 id, 'd' name from dual
  union
  select 5 id, 'e' name from dual)
select id, name, lead(id, 1, '') over(order by name) from a;
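A minimal sketch of what lead returns for this data, assuming Python's sqlite3 module as a stand-in for Oracle. SQLite has no dual table, so the five rows are supplied with a values list instead; the second column with offset 2 and default -1 is an addition made here to show the default argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

query = """
    with a(id, name) as (
        values (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')
    )
    select id, name,
           -- value of id one row ahead; NULL when there is no such row
           lead(id, 1) over (order by name) as next_id,
           -- value of id two rows ahead; -1 when there is no such row
           lead(id, 2, -1) over (order by name) as id_two_ahead
    from a
    order by name
"""
lead_rows = conn.execute(query).fetchall()

for row in lead_rows:
    print(row)
```

The last row has no following row, so next_id comes back as None in Python, and the last two rows fall back to the default -1 for the two-row offset.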
--Using ratio_to_report(a): the argument of ratio_to_report() is the numerator, and the over() clause determines the denominator (the sum of a over the window)
with a as
 (select 1 a from dual union all
  select 1 a from dual union all
  select 1 a from dual union all
  select 2 a from dual union all
  select 3 a from dual union all
  select 4 a from dual union all
  select 4 a from dual union all
  select 5 a from dual)
select a, ratio_to_report(a) over(partition by a) b
  from a
 order by a;

with a as
 (select 1 a from dual union all
  select 1 a from dual union all
  select 1 a from dual union all
  select 2 a from dual union all
  select 3 a from dual union all
  select 4 a from dual union all
  select 4 a from dual union all
  select 5 a from dual)
select a, ratio_to_report(a) over() b   --with an empty over(), the denominator is the sum over all rows
  from a
 order by a;

with a as
 (select 1 a from dual union all
  select 1 a from dual union all
  select 1 a from dual union all
  select 2 a from dual union all
  select 3 a from dual union all
  select 4 a from dual union all
  select 4 a from dual union all
  select 5 a from dual)
select a, ratio_to_report(a) over() b
  from a
 group by a
 order by a;   --share of each value after grouping
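The numerator and denominator reading can be checked by writing the ratio out by hand. SQLite has no ratio_to_report, so this sketch (assuming Python's sqlite3 module as a stand-in for Oracle) computes a / sum(a) over(...) directly, which is the same quantity for non-null values; the table name nums is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table nums (a integer)")  # hypothetical table name
conn.executemany("insert into nums values (?)",
                 [(v,) for v in (1, 1, 1, 2, 3, 4, 4, 5)])

def fetch(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Denominator = sum of a within each partition of equal values.
per_value = fetch(
    "select a, a * 1.0 / sum(a) over (partition by a) from nums order by a")

# Denominator = sum of a over all 8 rows (1+1+1+2+3+4+4+5 = 21).
whole = fetch(
    "select a, a * 1.0 / sum(a) over () from nums order by a")

# After group by, the window sees one row per value (1+2+3+4+5 = 15).
grouped = fetch(
    "select a, a * 1.0 / sum(a) over () from nums group by a order by a")

print(per_value)
print(whole)
print(grouped)
```

The third query shows why the group by changes the result: the window function runs after grouping, so the denominator is 15 instead of 21.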
SAMPLE: In the example below, no row in department 30 has a Cume_Dist value of exactly 0.7, so percentile_disc takes the SALARY at the next higher distribution value, 0.83333333.
SELECT ename,
       sal,
       deptno,
       percentile_disc(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Disc",
       cume_dist() over(PARTITION BY deptno ORDER BY sal) "Cume_Dist"
  FROM emp
 WHERE deptno IN (30, 60);
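The rule described above (take the salary at the first cumulative distribution value that reaches 0.7) can be sketched without percentile_disc, which SQLite does not provide. This assumes Python's sqlite3 module as a stand-in for Oracle; the six salaries and the names e1 to e6 are illustrative values chosen so that 5/6 = 0.8333 appears, not data taken from the original emp table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
# Illustrative rows for department 30 (assumed values for this sketch).
conn.executemany("insert into emp values (?, ?, ?)", [
    ("e1", 950, 30), ("e2", 1250, 30), ("e3", 1250, 30),
    ("e4", 1500, 30), ("e5", 1600, 30), ("e6", 2850, 30),
])

# Cumulative distribution of each salary within its department.
cume = {sal: cd for sal, cd in conn.execute("""
    select sal, cume_dist() over (partition by deptno order by sal)
    from emp
""")}

# Emulated percentile_disc(0.7): the smallest salary whose cume_dist is >= 0.7.
query = """
    with d as (
        select sal, deptno,
               cume_dist() over (partition by deptno order by sal) as cd
        from emp
    )
    select deptno, min(sal) from d where cd >= 0.7 group by deptno
"""
percentile_70 = dict(conn.execute(query))

print(cume)
print(percentile_70)
```

Here 1500 has a cumulative distribution of 4/6, which is below 0.7, so the next salary, 1600 at 5/6, is the one selected.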
For details, see http://www.cnblogs.com/lanzi/archive/2010/10/26/1861338.html
--max() per user gives only the latest date, not the rest of that row:
--select max(t.check_date),t.user_id from attendance t group by t.user_id;
--copy each user's most recent attendance row into attendance_day
insert into attendance_day
  (id, user_id, check_day, gps_x, gpx_y)
  select seq_attendance.nextval, t.user_id, t.check_date, t.gps_x, t.gpx_y
    from (select m.*,
                 row_number() over(partition by m.user_id order by m.check_date desc) rn
            from attendance m) t
   where rn = 1;
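The insert above keeps one row per user, the one with the latest check_date. A minimal sketch of the same pattern follows, assuming Python's sqlite3 module as a stand-in for Oracle. The sample rows are hypothetical, dates are stored as ISO text so that they sort correctly, and an integer primary key takes the place of seq_attendance.nextval.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table attendance
                (user_id integer, check_date text, gps_x real, gpx_y real)""")
# "id integer primary key" is auto-assigned; it stands in for the Oracle sequence.
conn.execute("""create table attendance_day
                (id integer primary key, user_id integer, check_day text,
                 gps_x real, gpx_y real)""")

# Hypothetical check-in rows for two users.
conn.executemany("insert into attendance values (?, ?, ?, ?)", [
    (1, "2020-01-01 08:00", 1.0, 2.0),
    (1, "2020-01-01 18:00", 1.5, 2.5),
    (2, "2020-01-01 09:00", 3.0, 4.0),
])

# row_number() = 1 within each user, ordered by date descending,
# selects that user's most recent row with all of its columns.
conn.execute("""
    insert into attendance_day (user_id, check_day, gps_x, gpx_y)
    select t.user_id, t.check_date, t.gps_x, t.gpx_y
    from (select m.*,
                 row_number() over (partition by m.user_id
                                    order by m.check_date desc) as rn
          from attendance m) t
    where t.rn = 1
""")

latest = conn.execute(
    "select user_id, check_day, gps_x, gpx_y from attendance_day order by user_id"
).fetchall()
print(latest)
```

Unlike the max() with group by approach, this carries the coordinates of the latest row along with its date.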