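Oracle SQL Advanced Programming: A Complete Guide to Analytic Functions (Window Functions)
Note: this article is sourced from "Oracle SQL Advanced Programming: A Complete Guide to Analytic Functions (Window Functions)".
Overview
An analytic function, also called a window function, computes a value over a subset of the result set that is related to the current row.
The general structure is: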
    Function(arg1, arg2, ...) over (partition-by-clause order-by-clause windowing-clause)

    Windowing clause: rows | range between start_expr and end_expr
    start_expr is:    unbounded preceding | current row | n preceding | n following
    end_expr is:      unbounded following | current row | n preceding | n following
Not all analytic functions support the windowing clause.
Creating the test table
    create table sales_fact as
    select country_name country,
           country_subregion region,
           prod_name product,
           calendar_year year,
           calendar_week_number week,
           sum(amount_sold) sale,
           sum(amount_sold *
               (case
                  when mod(rownum, 10) = 0 then 1.4
                  when mod(rownum, 5) = 0 then 0.6
                  when mod(rownum, 2) = 0 then 0.9
                  when mod(rownum, 2) = 1 then 1.2
                  else 1
                end)) receipts
    from sales, times, customers, countries, products
    where sales.time_id = times.time_id
      and sales.prod_id = products.prod_id
      and sales.cust_id = customers.cust_id
      and customers.country_id = countries.country_id
    group by country_name, country_subregion, prod_name, calendar_year, calendar_week_number;
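Using aggregate functions as analytic functions
An analytic function column is just a column of values, one per row; it has no effect on any other aspect of the query.
The queries below illustrate the following points:
1. Columns used in the partition clause of over do not have to appear in the select list, but they must exist in the data source.
2. Columns used in the order by clause of over do not have to appear in the select list, but they must exist in the data source.
3. The order by inside over sorts the rows within each partition and is unrelated to the ordering of the select statement itself. It does, however, affect the result of the analytic function.
4. The window defined in over is confined to the partition itself. rows between unbounded preceding and current row means from the start of the partition up to the current row.
5. Analytic functions operate on the result set, that is, on the rows left after the where conditions have been applied.
In the query below, the analytic column is the cumulative sales from the start of the year through the given week.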
    SH@ prod> select year, week, sale,
                     sum(sale) over (partition by region, year
                                     order by week
                                     rows between unbounded preceding and current row) running_sum_ytd
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by year, week;

    YEAR  WEEK     SALE  RUNNING_SUM_YTD
    ----  ----  -------  ---------------
    1998     1    58.15            58.15
    1998     2    29.39            87.54
    1998     3    29.49           117.03
    1998     4    29.49           146.52
    1998     5     29.8           176.32
    1998     6    58.78            235.1
    1998     9    58.78           293.88
    1999     1    53.52            53.52
    1999     3     94.6           148.12
    1999     4     40.5           188.62
    1999     5    80.01           268.63
    1999     6     40.5           309.13
    1999     8   103.11           412.24
    1999     9    53.34           465.58
    2000     1     46.7             46.7
    2000     3    93.41           140.11
    2000     4    46.54           186.65
    2000     5     46.7           233.35
    2000     7     70.8           304.15
    2000     8    46.54           350.69
    2001     1    92.26            92.26
    2001     2   118.38           210.64
    2001     3    47.24           257.88
    2001     4    256.7           514.58
    2001     5    93.44           608.02
    2001     6    22.44           630.46
    2001     7    69.96           700.42
    2001     8    46.06           746.48
    2001     9    92.67           839.15

    29 rows selected.

The next query returns the same values, but the outer order by is different, so the analytic column no longer appears to follow any pattern.
    SH@ prod> select year, week, sale,
                     sum(sale) over (partition by region, year
                                     order by week
                                     rows between unbounded preceding and current row) running_sum_ytd
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by year, sale;

    YEAR  WEEK     SALE  RUNNING_SUM_YTD
    ----  ----  -------  ---------------
    1998     2    29.39            87.54
    1998     4    29.49           146.52
    1998     3    29.49           117.03
    1998     5     29.8           176.32
    1998     1    58.15            58.15
    1998     6    58.78            235.1
    1998     9    58.78           293.88
    1999     4     40.5           188.62
    1999     6     40.5           309.13
    1999     9    53.34           465.58
    1999     1    53.52            53.52
    1999     5    80.01           268.63
    1999     3     94.6           148.12
    1999     8   103.11           412.24
    2000     4    46.54           186.65
    2000     8    46.54           350.69
    2000     1     46.7             46.7
    2000     5     46.7           233.35
    2000     7     70.8           304.15
    2000     3    93.41           140.11
    2001     6    22.44           630.46
    2001     8    46.06           746.48
    2001     3    47.24           257.88
    2001     7    69.96           700.42
    2001     1    92.26            92.26
    2001     9    92.67           839.15
    2001     5    93.44           608.02
    2001     2   118.38           210.64
    2001     4    256.7           514.58

    29 rows selected.
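If the ordering inside the partition is poorly chosen, the analytic column becomes meaningless. The choice of partition, window, and ordering directly determines the result of the analytic column.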
    SH@ prod> select year, week, sale,
                     sum(sale) over (partition by region, year
                                     order by sale
                                     rows between unbounded preceding and current row) running_sum_ytd
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by year, week;

    YEAR  WEEK     SALE  RUNNING_SUM_YTD
    ----  ----  -------  ---------------
    1998     1    58.15           176.32
    1998     2    29.39            29.39
    1998     3    29.49            88.37
    1998     4    29.49            58.88
    1998     5     29.8           118.17
    1998     6    58.78            235.1
    1998     9    58.78           293.88
    1999     1    53.52           187.86
    1999     3     94.6           362.47
    1999     4     40.5             40.5
    1999     5    80.01           267.87
    1999     6     40.5               81
    1999     8   103.11           465.58
    1999     9    53.34           134.34
    2000     1     46.7           186.48
    2000     3    93.41           350.69
    2000     4    46.54            46.54
    2000     5     46.7           139.78
    2000     7     70.8           257.28
    2000     8    46.54            93.08
    2001     1    92.26           277.96
    2001     2   118.38           582.45
    2001     3    47.24           115.74
    2001     4    256.7           839.15
    2001     5    93.44           464.07
    2001     6    22.44            22.44
    2001     7    69.96            185.7
    2001     8    46.06             68.5
    2001     9    92.67           370.63

    29 rows selected.
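The year-to-date running total can be reproduced on a small scale. The sketch below is only an illustration: it uses SQLite through Python's built-in sqlite3 module as a stand-in for Oracle (an assumption of this example, not part of the original setup), loads the seven 1998 rows from the output above into a cut-down sales_fact table, and runs the same windowed sum ordered by week.

```python
import sqlite3

# 1998 rows (week, sale) copied from the article's output.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (year int, week int, sale real)")
conn.executemany("insert into sales_fact values (1998, ?, ?)", rows_1998)

query = """
select week,
       round(sum(sale) over (partition by year
                             order by week
                             rows between unbounded preceding and current row), 2)
from sales_fact
order by week
"""
running_sum_ytd = conn.execute(query).fetchall()

for week, total in running_sum_ytd:
    print(week, total)
```

The totals match the 1998 block of the first query. Ordering the window by sale instead would sum the rows in a different sequence, which is why the third query's column looks arbitrary.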
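Execution plans for analytic functions
Even with an analytic function, only one full table scan is needed, but an extra sort is required.
WINDOW SORT is the characteristic operation of an analytic function.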
    SH@ prod> explain plan for
              select year, week, sale,
                     sum(sale) over (partition by region, year
                                     order by sale
                                     rows between unbounded preceding and current row) running_sum_ytd
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by year, week;

    Explained.

    SH@ prod> select * from table(dbms_xplan.display());

    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------------
    Plan hash value: 173857439

    ----------------------------------------------------------------------------------
    | Id  | Operation           | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT    |            |    18 |  1890 |   311   (1)| 00:00:04 |
    |   1 |  SORT ORDER BY      |            |    18 |  1890 |   311   (1)| 00:00:04 |
    |   2 |   WINDOW SORT       |            |    18 |  1890 |   311   (1)| 00:00:04 |
    |*  3 |    TABLE ACCESS FULL| SALES_FACT |    18 |  1890 |   309   (1)| 00:00:04 |
    ----------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND "WEEK"<10)

    Note
    -----
       - dynamic sampling used for this statement (level=2)

    20 rows selected.

The dynamic sampling note indicates that the table has no statistics yet.
Without the analytic column, the plan is the same except that the WINDOW SORT step is gone.
    SH@ prod> explain plan for
              select year, week, sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by year, week;

    Explained.

    SH@ prod> select * from table(dbms_xplan.display());

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------
    Plan hash value: 1978576542

    ---------------------------------------------------------------------------------
    | Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |            |    18 |  1584 |   310   (1)| 00:00:04 |
    |   1 |  SORT ORDER BY     |            |    18 |  1584 |   310   (1)| 00:00:04 |
    |*  2 |   TABLE ACCESS FULL| SALES_FACT |    18 |  1584 |   309   (1)| 00:00:04 |
    ---------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       2 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND "WEEK"<10)

    Note
    -----
       - dynamic sampling used for this statement (level=2)

    19 rows selected.
How to make the window span the entire partition
    SH@ prod> select year, week, sale,
                     max(sale) over (partition by product, country, region, year
                                     order by week
                                     rows between unbounded preceding and unbounded following) max_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  MAX_SALE
    ----  ----  -------  --------
    1998     1    58.15     58.78
    1998     2    29.39     58.78
    1998     3    29.49     58.78
    1998     4    29.49     58.78
    1998     5     29.8     58.78
    1998     6    58.78     58.78
    1998     9    58.78     58.78
    1999     1    53.52    103.11
    1999     3     94.6    103.11
    1999     4     40.5    103.11
    1999     5    80.01    103.11
    1999     6     40.5    103.11
    1999     8   103.11    103.11
    1999     9    53.34    103.11
    2000     1     46.7     93.41
    2000     3    93.41     93.41
    2000     4    46.54     93.41
    2000     5     46.7     93.41
    2000     7     70.8     93.41
    2000     8    46.54     93.41
    2001     1    92.26     256.7
    2001     2   118.38     256.7
    2001     3    47.24     256.7
    2001     4    256.7     256.7
    2001     5    93.44     256.7
    2001     6    22.44     256.7
    2001     7    69.96     256.7
    2001     8    46.06     256.7
    2001     9    92.67     256.7

    29 rows selected.
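A window with both boundaries sliding
The window in the statement below covers the two preceding rows, the two following rows, and the current row: five rows in total. Because it is defined with rows, it counts rows rather than calendar weeks, so it spans five weeks only when no week is missing. (Near a partition boundary the window shrinks automatically.)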
    SH@ prod> select year, week, sale,
                     max(sale) over (partition by product, country, region, year
                                     order by week
                                     rows between 2 preceding and 2 following) max_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  MAX_SALE
    ----  ----  -------  --------
    1998     1    58.15     58.15
    1998     2    29.39     58.15
    1998     3    29.49     58.15
    1998     4    29.49     58.78
    1998     5     29.8     58.78
    1998     6    58.78     58.78
    1998     9    58.78     58.78
    1999     1    53.52      94.6
    1999     3     94.6      94.6
    1999     4     40.5      94.6
    1999     5    80.01    103.11
    1999     6     40.5    103.11
    1999     8   103.11    103.11
    1999     9    53.34    103.11
    2000     1     46.7     93.41
    2000     3    93.41     93.41
    2000     4    46.54     93.41
    2000     5     46.7     93.41
    2000     7     70.8      70.8
    2000     8    46.54      70.8    <- 70.8 here because the window has shrunk at the partition boundary
    2001     1    92.26    118.38
    2001     2   118.38     256.7
    2001     3    47.24     256.7
    2001     4    256.7     256.7
    2001     5    93.44     256.7
    2001     6    22.44     256.7
    2001     7    69.96     93.44
    2001     8    46.06     92.67
    2001     9    92.67     92.67

    29 rows selected.
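To make the shrinking window visible, the sketch below counts how many rows fall inside the window for each row of the year 2000. It is an illustration only: SQLite (via Python's sqlite3 module) stands in for Oracle, which is an assumption of this example, and the table holds just the six 2000 rows from the output above.

```python
import sqlite3

# 2000 rows (week, sale) copied from the article's output.
rows_2000 = [(1, 46.7), (3, 93.41), (4, 46.54), (5, 46.7), (7, 70.8), (8, 46.54)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week int, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_2000)

query = """
select week, sale,
       max(sale) over (order by week rows between 2 preceding and 2 following),
       count(*)  over (order by week rows between 2 preceding and 2 following)
from sales_fact
order by week
"""
sliding = conn.execute(query).fetchall()

for week, sale, max_sale, rows_in_window in sliding:
    print(week, sale, max_sale, rows_in_window)
```

Only the two middle rows see a full five-row window; the first and last rows see three rows, which is why week 8 reports 70.8 rather than 93.41.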
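What is the default window?
The output below makes it clear: when an order by is present and no windowing clause is given, the window runs from the start of the partition to the current row. Oracle's default in this case is range between unbounded preceding and current row.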
    SH@ prod> select year, week, sale,
                     max(sale) over (partition by product, country, region, year
                                     order by week) max_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  MAX_SALE
    ----  ----  -------  --------
    1998     1    58.15     58.15
    1998     2    29.39     58.15
    1998     3    29.49     58.15
    1998     4    29.49     58.15
    1998     5     29.8     58.15
    1998     6    58.78     58.78
    1998     9    58.78     58.78
    1999     1    53.52     53.52
    1999     3     94.6      94.6
    1999     4     40.5      94.6
    1999     5    80.01      94.6
    1999     6     40.5      94.6
    1999     8   103.11    103.11
    1999     9    53.34    103.11
    2000     1     46.7      46.7
    2000     3    93.41     93.41
    2000     4    46.54     93.41
    2000     5     46.7     93.41
    2000     7     70.8     93.41
    2000     8    46.54     93.41
    2001     1    92.26     92.26
    2001     2   118.38    118.38
    2001     3    47.24    118.38
    2001     4    256.7     256.7
    2001     5    93.44     256.7
    2001     6    22.44     256.7
    2001     7    69.96     256.7
    2001     8    46.06     256.7
    2001     9    92.67     256.7

    29 rows selected.
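With a unique ordering column such as week, a range window and a rows window give the same answer, so the query above cannot show which one is the default. The sketch below orders by sale instead, where 1998 has ties, and compares the default window with an explicit range window. It runs on SQLite through Python's sqlite3 module as a stand-in for Oracle (an assumption of this example); both use the same default when order by is present.

```python
import sqlite3

# 1998 rows (week, sale) copied from the article's output; 29.49 and 58.78 occur twice.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week int, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

query = """
select week, sale,
       round(sum(sale) over (order by sale), 2) as default_window,
       round(sum(sale) over (order by sale
                             range between unbounded preceding and current row), 2)
           as explicit_range
from sales_fact
order by sale, week
"""
default_vs_range = conn.execute(query).fetchall()

for row in default_vs_range:
    print(row)
```

Both columns agree on every row, and the two 29.49 rows share the total 88.37 because a range window includes all peers of the current row. A rows window would give them different totals, as in the earlier query ordered by sale (58.88 and 88.37).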
Lead and Lag (functions that do not support windowing)
Adding a windowing clause to them produces an error like this:
      2    3    4  rows between 2 preceding and 2 following )
                   *
    ERROR at line 3:
    ORA-00907: missing right parenthesis
LEAD returns the value from the next row, not the previous one. At the lower boundary of the partition, LEAD returns NULL.
    SH@ prod> select year, week, sale,
                     lead(sale) over (partition by product, country, region, year
                                      order by week) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15        29.39
    1998     2    29.39        29.49
    1998     3    29.49        29.49
    1998     4    29.49         29.8
    1998     5     29.8        58.78
    1998     6    58.78        58.78
    1998     9    58.78
    1999     1    53.52         94.6
    1999     3     94.6         40.5
    1999     4     40.5        80.01
    1999     5    80.01         40.5
    1999     6     40.5       103.11
    1999     8   103.11        53.34
    1999     9    53.34
    2000     1     46.7        93.41
    2000     3    93.41        46.54
    2000     4    46.54         46.7
    2000     5     46.7         70.8
    2000     7     70.8        46.54
    2000     8    46.54
    2001     1    92.26       118.38
    2001     2   118.38        47.24
    2001     3    47.24        256.7
    2001     4    256.7        93.44
    2001     5    93.44        22.44
    2001     6    22.44        69.96
    2001     7    69.96        46.06
    2001     8    46.06        92.67
    2001     9    92.67

    29 rows selected.
LAG returns the value from the previous row. At the upper boundary of the partition it returns NULL.
    SH@ prod> select year, week, sale,
                     lag(sale) over (partition by product, country, region, year
                                     order by week) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15
    1998     2    29.39        58.15
    1998     3    29.49        29.39
    1998     4    29.49        29.49
    1998     5     29.8        29.49
    1998     6    58.78         29.8
    1998     9    58.78        58.78
    1999     1    53.52
    1999     3     94.6        53.52
    1999     4     40.5         94.6
    1999     5    80.01         40.5
    1999     6     40.5        80.01
    1999     8   103.11         40.5
    1999     9    53.34       103.11
    2000     1     46.7
    2000     3    93.41         46.7
    2000     4    46.54        93.41
    2000     5     46.7        46.54
    2000     7     70.8         46.7
    2000     8    46.54         70.8
    2001     1    92.26
    2001     2   118.38        92.26
    2001     3    47.24       118.38
    2001     4    256.7        47.24
    2001     5    93.44        256.7
    2001     6    22.44        93.44
    2001     7    69.96        22.44
    2001     8    46.06        69.96
    2001     9    92.67        46.06

    29 rows selected.
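More advanced uses of Lead and Lag
The first argument of lead and lag is the column to return, the second is the row offset (non-negative), and the third is the default value returned when no such row exists (it can be set to the current row's value).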
    SH@ prod> select year, week, sale,
                     lag(sale, 2, 0) over (partition by product, country, region, year
                                           order by week) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15            0
    1998     2    29.39            0
    1998     3    29.49        58.15
    1998     4    29.49        29.39
    1998     5     29.8        29.49
    1998     6    58.78        29.49
    1998     9    58.78         29.8
    1999     1    53.52            0
    1999     3     94.6            0
    1999     4     40.5        53.52
    1999     5    80.01         94.6
    1999     6     40.5         40.5
    1999     8   103.11        80.01
    1999     9    53.34         40.5
    2000     1     46.7            0
    2000     3    93.41            0
    2000     4    46.54         46.7
    2000     5     46.7        93.41
    2000     7     70.8        46.54
    2000     8    46.54         46.7
    2001     1    92.26            0
    2001     2   118.38            0
    2001     3    47.24        92.26
    2001     4    256.7       118.38
    2001     5    93.44        47.24
    2001     6    22.44        256.7
    2001     7    69.96        93.44
    2001     8    46.06        22.44
    2001     9    92.67        69.96

    29 rows selected.
Setting the default to the current row's value:
    SH@ prod> select year, week, sale,
                     lag(sale, 2, sale) over (partition by product, country, region, year
                                              order by week) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15        58.15
    1998     2    29.39        29.39
    1998     3    29.49        58.15
    1998     4    29.49        29.39
    1998     5     29.8        29.49
    1998     6    58.78        29.49
    1998     9    58.78         29.8
    1999     1    53.52        53.52
    1999     3     94.6         94.6
    1999     4     40.5        53.52
    1999     5    80.01         94.6
    1999     6     40.5         40.5
    1999     8   103.11        80.01
    1999     9    53.34         40.5
    2000     1     46.7         46.7
    2000     3    93.41        93.41
    2000     4    46.54         46.7
    2000     5     46.7        93.41
    2000     7     70.8        46.54
    2000     8    46.54         46.7
    2001     1    92.26        92.26
    2001     2   118.38       118.38
    2001     3    47.24        92.26
    2001     4    256.7       118.38
    2001     5    93.44        47.24
    2001     6    22.44        256.7
    2001     7    69.96        93.44
    2001     8    46.06        22.44
    2001     9    92.67        69.96

    29 rows selected.
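The two default-value variants can be compared side by side on the six rows of the year 2000. This is a sketch under one assumption: SQLite (via Python's sqlite3 module) stands in for Oracle; its lag accepts the same three arguments.

```python
import sqlite3

# 2000 rows (week, sale) copied from the article's output.
rows_2000 = [(1, 46.7), (3, 93.41), (4, 46.54), (5, 46.7), (7, 70.8), (8, 46.54)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week int, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_2000)

query = """
select week,
       lag(sale, 2, 0)    over (order by week) as default_zero,
       lag(sale, 2, sale) over (order by week) as default_current
from sales_fact
order by week
"""
lag_defaults = conn.execute(query).fetchall()

for row in lag_defaults:
    print(row)
```

Only the first two rows differ between the columns, because those are the only rows with no row two positions earlier.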
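LEAD, LAG, and gaps in the data
LAG(sale, 10) returns the value from the row 10 rows earlier, whereas what I actually want is the value from 10 weeks earlier. If the data has gaps, the two are different rows, which can cause serious errors.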
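The gap problem shows up already with an offset of 2 on the 1999 rows, where weeks 2 and 7 are missing. The sketch below puts the positional lag next to a lookup by week number done with a self-join. The self-join is one possible workaround chosen for this illustration, not something the article prescribes, and SQLite (via Python's sqlite3 module) again stands in for Oracle.

```python
import sqlite3

# 1999 rows (week, sale) copied from the article's output; weeks 2 and 7 are absent.
rows_1999 = [(1, 53.52), (3, 94.6), (4, 40.5), (5, 80.01),
             (6, 40.5), (8, 103.11), (9, 53.34)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week int, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1999)

query = """
select cur.week,
       lag(cur.sale, 2) over (order by cur.week) as two_rows_back,
       prev.sale                                 as two_weeks_back
from sales_fact cur
left join sales_fact prev on prev.week = cur.week - 2
order by cur.week
"""
gap_check = conn.execute(query).fetchall()

for week, two_rows_back, two_weeks_back in gap_check:
    print(week, two_rows_back, two_weeks_back)
```

For week 8 the positional lag returns week 5's sale (80.01), while the sale from two weeks earlier is week 6's (40.5). For weeks 4 and 9 the positional lag returns a value even though the week two weeks earlier has no row at all.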
FIRST_VALUE and LAST_VALUE
Combined with a suitable order by, these two functions can be used to obtain the maximum and minimum values.
First_value returns the first value in the window. Ignore nulls means NULLs are skipped: if the first value is NULL, the next non-NULL value is returned.

    SH@ prod> select year, week, sale,
                     first_value(sale ignore nulls) over (partition by product, country, region, year
                                                          order by week
                                                          rows between unbounded preceding and unbounded following) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15        58.15
    1998     2    29.39        58.15
    1998     3    29.49        58.15
    1998     4    29.49        58.15
    1998     5     29.8        58.15
    1998     6    58.78        58.15
    1998     9    58.78        58.15
    1999     1    53.52        53.52
    1999     3     94.6        53.52
    1999     4     40.5        53.52
    1999     5    80.01        53.52
    1999     6     40.5        53.52
    1999     8   103.11        53.52
    1999     9    53.34        53.52
    2000     1     46.7         46.7
    2000     3    93.41         46.7
    2000     4    46.54         46.7
    2000     5     46.7         46.7
    2000     7     70.8         46.7
    2000     8    46.54         46.7
    2001     1    92.26        92.26
    2001     2   118.38        92.26
    2001     3    47.24        92.26
    2001     4    256.7        92.26
    2001     5    93.44        92.26
    2001     6    22.44        92.26
    2001     7    69.96        92.26
    2001     8    46.06        92.26
    2001     9    92.67        92.26

    29 rows selected.
Last_value returns the last value in the window. Respect nulls means NULLs are honored: if the last value is NULL, NULL is returned.
    SH@ prod> select year, week, sale,
                     last_value(sale respect nulls) over (partition by product, country, region, year
                                                          order by week
                                                          rows between unbounded preceding and unbounded following) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15        58.78
    1998     2    29.39        58.78
    1998     3    29.49        58.78
    1998     4    29.49        58.78
    1998     5     29.8        58.78
    1998     6    58.78        58.78
    1998     9    58.78        58.78
    1999     1    53.52        53.34
    1999     3     94.6        53.34
    1999     4     40.5        53.34
    1999     5    80.01        53.34
    1999     6     40.5        53.34
    1999     8   103.11        53.34
    1999     9    53.34        53.34
    2000     1     46.7        46.54
    2000     3    93.41        46.54
    2000     4    46.54        46.54
    2000     5     46.7        46.54
    2000     7     70.8        46.54
    2000     8    46.54        46.54
    2001     1    92.26        92.67
    2001     2   118.38        92.67
    2001     3    47.24        92.67
    2001     4    256.7        92.67
    2001     5    93.44        92.67
    2001     6    22.44        92.67
    2001     7    69.96        92.67
    2001     8    46.06        92.67
    2001     9    92.67        92.67

    29 rows selected.
NTH_VALUE: accessing any specified row within the partition
FIRST_VALUE is equivalent to NTH_VALUE(sale, 1), or written out in full, NTH_VALUE(sale, 1) from first respect nulls.
Combined with ordering, it can return the n-th largest or n-th smallest value.

    SH@ prod> select year, week, sale,
                     nth_value(sale, 1) from last ignore nulls
                         over (partition by product, country, region, year
                               order by week
                               rows between unbounded preceding and unbounded following) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15        58.78
    1998     2    29.39        58.78
    1998     3    29.49        58.78
    1998     4    29.49        58.78
    1998     5     29.8        58.78
    1998     6    58.78        58.78
    1998     9    58.78        58.78
    1999     1    53.52        53.34
    1999     3     94.6        53.34
    1999     4     40.5        53.34
    1999     5    80.01        53.34
    1999     6     40.5        53.34
    1999     8   103.11        53.34
    1999     9    53.34        53.34
    2000     1     46.7        46.54
    2000     3    93.41        46.54
    2000     4    46.54        46.54
    2000     5     46.7        46.54
    2000     7     70.8        46.54
    2000     8    46.54        46.54
    2001     1    92.26        92.67
    2001     2   118.38        92.67
    2001     3    47.24        92.67
    2001     4    256.7        92.67
    2001     5    93.44        92.67
    2001     6    22.44        92.67
    2001     7    69.96        92.67
    2001     8    46.06        92.67
    2001     9    92.67        92.67

    29 rows selected.
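The query above orders by week, so it returns the last week's sale rather than a ranked value. To get the n-th largest or smallest, the window has to be ordered by sale. The sketch below does this for the 2001 rows using SQLite through Python's sqlite3 module as a stand-in for Oracle (an assumption of this example). SQLite does not accept from last or ignore nulls, so the sort direction is flipped instead.

```python
import sqlite3

# 2001 rows (week, sale) copied from the article's output.
rows_2001 = [(1, 92.26), (2, 118.38), (3, 47.24), (4, 256.7), (5, 93.44),
             (6, 22.44), (7, 69.96), (8, 46.06), (9, 92.67)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week int, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_2001)

query = """
select week,
       nth_value(sale, 2) over (order by sale desc
                                rows between unbounded preceding and unbounded following)
           as second_largest,
       nth_value(sale, 3) over (order by sale
                                rows between unbounded preceding and unbounded following)
           as third_smallest
from sales_fact
order by week
"""
nth_values = conn.execute(query).fetchall()

for row in nth_values:
    print(row)
```

Because the window covers the whole partition, every row reports the same pair of values.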
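RANK (no windowing; operates on the entire partition)
An order by is required; rank assigns positions according to the columns in the order by clause.
With RANK, ties leave gaps in the ranking.
For example: 1 2(2) 4 5 6 7 8 9. If two rows tie for second place, there is no third place.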
Watch out for NULLs: NULLS LAST can be used in the order by clause to place NULLs at the end.

    SH@ prod> select year, week, sale,
                     rank() over (partition by product, country, region, year
                                  order by sale) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15            5    <- there is no rank 3 in 1998
    1998     2    29.39            1
    1998     3    29.49            2
    1998     4    29.49            2
    1998     5     29.8            4
    1998     6    58.78            6
    1998     9    58.78            6
    1999     1    53.52            4
    1999     3     94.6            6
    1999     4     40.5            1
    1999     5    80.01            5
    1999     6     40.5            1
    1999     8   103.11            7
    1999     9    53.34            3
    2000     1     46.7            3
    2000     3    93.41            6
    2000     4    46.54            1
    2000     5     46.7            3
    2000     7     70.8            5
    2000     8    46.54            1
    2001     1    92.26            5
    2001     2   118.38            8
    2001     3    47.24            3
    2001     4    256.7            9
    2001     5    93.44            7
    2001     6    22.44            1
    2001     7    69.96            4
    2001     8    46.06            2
    2001     9    92.67            6

    29 rows selected.
DENSE_RANK (unlike RANK, the ranking is always consecutive)
    SH@ prod> select year, week, sale,
                     dense_rank() over (partition by product, country, region, year
                                        order by sale) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, week;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     1    58.15            4    <- rank 3 exists in 1998
    1998     2    29.39            1
    1998     3    29.49            2
    1998     4    29.49            2
    1998     5     29.8            3
    1998     6    58.78            5
    1998     9    58.78            5
    1999     1    53.52            3
    1999     3     94.6            5
    1999     4     40.5            1
    1999     5    80.01            4
    1999     6     40.5            1
    1999     8   103.11            6
    1999     9    53.34            2
    2000     1     46.7            2
    2000     3    93.41            4
    2000     4    46.54            1
    2000     5     46.7            2
    2000     7     70.8            3
    2000     8    46.54            1
    2001     1    92.26            5
    2001     2   118.38            8
    2001     3    47.24            3
    2001     4    256.7            9
    2001     5    93.44            7
    2001     6    22.44            1
    2001     7    69.96            4
    2001     8    46.06            2
    2001     9    92.67            6

    29 rows selected.
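ROW_NUMBER (no windowing; a nondeterministic function)
Assigns an increasing number to each row in the partition. If rows share the same value in the ordering columns, their relative order is arbitrary.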
    SH@ prod> select year, week, sale,
                     row_number() over (partition by product, country, region, year
                                        order by sale) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, sale;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     2    29.39            1
    1998     4    29.49            2
    1998     3    29.49            3
    1998     5     29.8            4
    1998     1    58.15            5
    1998     6    58.78            6
    1998     9    58.78            7
    1999     4     40.5            1
    1999     6     40.5            2
    1999     9    53.34            3
    1999     1    53.52            4
    1999     5    80.01            5
    1999     3     94.6            6
    1999     8   103.11            7
    2000     4    46.54            1
    2000     8    46.54            2
    2000     5     46.7            3
    2000     1     46.7            4
    2000     7     70.8            5
    2000     3    93.41            6
    2001     6    22.44            1
    2001     8    46.06            2
    2001     3    47.24            3
    2001     7    69.96            4
    2001     1    92.26            5
    2001     9    92.67            6
    2001     5    93.44            7
    2001     2   118.38            8
    2001     4    256.7            9

    29 rows selected.
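The three numbering functions can be compared in a single query over the 1998 rows. The sketch below runs on SQLite through Python's sqlite3 module as a stand-in for Oracle (an assumption of this example). It also adds week as a tie-breaker to the row_number ordering, which is one common way to make its output repeatable; the article's own query does not do this.

```python
import sqlite3

# 1998 rows (week, sale) copied from the article's output; 29.49 and 58.78 occur twice.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week int, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

query = """
select week, sale,
       rank()       over (order by sale)       as rnk,
       dense_rank() over (order by sale)       as dense_rnk,
       row_number() over (order by sale, week) as row_num  -- week breaks ties
from sales_fact
order by sale, week
"""
numbering = conn.execute(query).fetchall()

for row in numbering:
    print(row)
```

RANK skips 3 after the tie at 2, DENSE_RANK does not, and ROW_NUMBER never repeats a number. Without the tie-breaker, which of weeks 3 and 4 receives number 2 is up to the database, as the article's output (week 4 first) shows.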
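Ratio_to_report (ratio of the current row's value to the partition total)
This function supports neither ordering nor windowing.
The query below computes each week's sales as a percentage of that year's sales and as a percentage of the product's total sales.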
    SH@ prod> select year, week, sale,
                     trunc(100 * ratio_to_report(sale) over (partition by year), 2) sales_yr,
                     trunc(100 * ratio_to_report(sale) over (), 2) sales_prod
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by year, week;

    YEAR  WEEK     SALE  SALES_YR  SALES_PROD
    ----  ----  -------  --------  ----------
    1998     1    58.15     19.78        2.98
    1998     2    29.39        10         1.5
    1998     3    29.49     10.03        1.51
    1998     4    29.49     10.03        1.51
    1998     5     29.8     10.14        1.52
    1998     6    58.78        20        3.01
    1998     9    58.78        20        3.01
    1999     1    53.52     11.49        2.74
    1999     3     94.6     20.31        4.85
    1999     4     40.5      8.69        2.07
    1999     5    80.01     17.18         4.1
    1999     6     40.5      8.69        2.07
    1999     8   103.11     22.14        5.28
    1999     9    53.34     11.45        2.73
    2000     1     46.7     13.31        2.39
    2000     3    93.41     26.63        4.79
    2000     4    46.54     13.27        2.38
    2000     5     46.7     13.31        2.39
    2000     7     70.8     20.18        3.63
    2000     8    46.54     13.27        2.38
    2001     1    92.26     10.99        4.73
    2001     2   118.38      14.1        6.07
    2001     3    47.24      5.62        2.42
    2001     4    256.7     30.59       13.16
    2001     5    93.44     11.13        4.79
    2001     6    22.44      2.67        1.15
    2001     7    69.96      8.33        3.58
    2001     8    46.06      5.48        2.36
    2001     9    92.67     11.04        4.75

    29 rows selected.
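Since ratio_to_report is the current value divided by the partition total, the same figure can be written as sale / sum(sale) over (...). The sketch below checks this for the 1998 and 2000 rows. SQLite, reached through Python's sqlite3 module, has no ratio_to_report and stands in for Oracle here (an assumption of this example), so only the equivalent division is run and the truncation to two decimals is done in Python.

```python
import math
import sqlite3

# (year, week, sale) rows for 1998 and 2000 copied from the article's output.
rows = [(1998, 1, 58.15), (1998, 2, 29.39), (1998, 3, 29.49), (1998, 4, 29.49),
        (1998, 5, 29.8), (1998, 6, 58.78), (1998, 9, 58.78),
        (2000, 1, 46.7), (2000, 3, 93.41), (2000, 4, 46.54),
        (2000, 5, 46.7), (2000, 7, 70.8), (2000, 8, 46.54)]

# Assumption: an in-memory SQLite table stands in for Oracle's SALES_FACT.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (year int, week int, sale real)")
conn.executemany("insert into sales_fact values (?, ?, ?)", rows)

query = """
select year, week, sale / sum(sale) over (partition by year) as ratio
from sales_fact
order by year, week
"""
# Truncate 100 * ratio to two decimals, like trunc(100 * ratio, 2) in the article.
sales_yr = {(year, week): math.floor(ratio * 10000) / 100
            for year, week, ratio in conn.execute(query)}

for key, pct in sales_yr.items():
    print(key, pct)
```

The percentages agree with the SALES_YR column above for these two years, for example 19.78 for week 1 of 1998 and 26.63 for week 3 of 2000.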
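Percent_rank (which top percentage a row falls in)
Gives the relative percentile position of the current row's rank.
For example, telling someone you ranked 10th may not impress them, but being 10th out of 100,000 puts you in the top 1/10,000, which is very impressive.
It is derived from RANK by the following formula:
PERCENT_RANK = (RANK - 1) / (N - 1), where N is the total number of rows.
RANK - 1 is the number of rows ranked ahead of the current row.
N - 1 is the total number of rows excluding the current one.
In other words, it is the fraction of the other rows that rank ahead of the current row.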
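The formula can be checked by hand against the 1998 figures. The pure-Python sketch below is an illustration only; the helper name percent_rank is invented for this example. It derives RANK as one plus the number of smaller values and then applies (RANK - 1) / (N - 1).

```python
def percent_rank(values):
    """Return (RANK - 1) / (N - 1) for every value, ranking in ascending order."""
    n = len(values)
    result = []
    for value in values:
        # RANK = 1 + number of rows that sort strictly before this one.
        rank = 1 + sum(1 for other in values if other < value)
        result.append((rank - 1) / (n - 1) if n > 1 else 0.0)
    return result


# 1998 sales for weeks 1, 2, 3, 4, 5, 6, 9, copied from the article's output.
sales_1998 = [58.15, 29.39, 29.49, 29.49, 29.8, 58.78, 58.78]
pr_1998 = percent_rank(sales_1998)

for sale, pr in zip(sales_1998, pr_1998):
    print(sale, round(100 * pr, 7))
```

With N = 7 the denominator is 6, so the ranks 1, 2, 2, 4, 5, 6, 6 give 0, 1/6, 1/6, 3/6, 4/6, 5/6, 5/6, which are the 1998 percentages printed by the percent_rank query in this section.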
    SH@ prod> select year, week, sale,
                     rank() over (partition by product, country, region, year
                                  order by sale) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, sale;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     2    29.39            1
    1998     4    29.49            2
    1998     3    29.49            2
    1998     5     29.8            4
    1998     1    58.15            5
    1998     6    58.78            6
    1998     9    58.78            6
    1999     4     40.5            1
    1999     6     40.5            1
    1999     9    53.34            3
    1999     1    53.52            4
    1999     5    80.01            5
    1999     3     94.6            6
    1999     8   103.11            7
    2000     4    46.54            1
    2000     8    46.54            1
    2000     5     46.7            3
    2000     1     46.7            3
    2000     7     70.8            5
    2000     3    93.41            6
    2001     6    22.44            1
    2001     8    46.06            2
    2001     3    47.24            3
    2001     7    69.96            4
    2001     1    92.26            5
    2001     9    92.67            6
    2001     5    93.44            7
    2001     2   118.38            8
    2001     4    256.7            9

    29 rows selected.

    SH@ prod> select year, week, sale,
                     100 * percent_rank() over (partition by product, country, region, year
                                                order by sale) former_sale
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by product, country, year, sale;

    YEAR  WEEK     SALE  FORMER_SALE
    ----  ----  -------  -----------
    1998     2    29.39            0
    1998     4    29.49   16.6666667
    1998     3    29.49   16.6666667
    1998     5     29.8           50
    1998     1    58.15   66.6666667
    1998     6    58.78   83.3333333
    1998     9    58.78   83.3333333
    1999     4     40.5            0
    1999     6     40.5            0
    1999     9    53.34   33.3333333
    1999     1    53.52           50
    1999     5    80.01   66.6666667
    1999     3     94.6   83.3333333
    1999     8   103.11          100
    2000     4    46.54            0
    2000     8    46.54            0
    2000     5     46.7           40
    2000     1     46.7           40
    2000     7     70.8           80
    2000     3    93.41          100
    2001     6    22.44            0
    2001     8    46.06         12.5
    2001     3    47.24           25
    2001     7    69.96         37.5
    2001     1    92.26           50
    2001     9    92.67         62.5
    2001     5    93.44           75
    2001     2   118.38         87.5
    2001     4    256.7          100

    29 rows selected.
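Percentile_cont (roughly: the value needed to sit at a given percentile)
Put another way: suppose a value were inserted into the partition so that it ranked at N percent (that is, its percent_rank is N%); the function returns that value.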
If some row's percent_rank is exactly N, that row's value is returned. If no row matches exactly, the result is linearly interpolated between the two neighboring rows.

    SH@ prod> select year, week, sale,
                     percentile_cont(0.5) within group (order by sale desc) over (partition by year) pc,
                     percent_rank() over (partition by year order by sale desc) pr
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 11;

    YEAR  WEEK     SALE       PC          PR
    ----  ----  -------  -------  ----------
    1998    10   117.76   43.975           0
    1998     9    58.78   43.975  .142857143
    1998     6    58.78   43.975  .142857143
    1998     1    58.15   43.975  .428571429
    1998     5     29.8   43.975  .571428571
    1998     3    29.49   43.975  .714285714
    1998     4    29.49   43.975  .714285714
    1998     2    29.39   43.975           1
    1999     8   103.11    62.76           0
    1999     3     94.6    62.76  .142857143
    1999     5    80.01    62.76  .285714286
    1999    10       72    62.76  .428571429
    1999     1    53.52    62.76  .571428571
    1999     9    53.34    62.76  .714285714
    1999     6     40.5    62.76  .857142857
    1999     4     40.5    62.76  .857142857
    2000     3    93.41     46.7           0
    2000     7     70.8     46.7          .2
    2000     5     46.7     46.7          .4
    2000     1     46.7     46.7          .4
    2000     4    46.54     46.7          .8
    2000     8    46.54     46.7          .8
    2001     4    256.7    81.11           0
    2001     2   118.38    81.11  .111111111
    2001     5    93.44    81.11  .222222222
    2001     9    92.67    81.11  .333333333
    2001     1    92.26    81.11  .444444444
    2001     7    69.96    81.11  .555555556
    2001    10    69.05    81.11  .666666667
    2001     3    47.24    81.11  .777777778
    2001     8    46.06    81.11  .888888889
    2001     6    22.44    81.11           1

    32 rows selected.
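The interpolation can be spelled out in a few lines. The pure-Python sketch below follows the row-number form of the calculation: RN = 1 + P * (N - 1), and when RN is not a whole number the result is a weighted average of the rows at floor(RN) and ceil(RN). The function name and signature are invented for this illustration.

```python
import math


def percentile_cont(p, values, descending=False):
    """Interpolated percentile over the sorted values, for 0 <= p <= 1."""
    ordered = sorted(values, reverse=descending)
    n = len(ordered)
    rn = 1 + p * (n - 1)                # 1-based, possibly fractional row number
    frn, crn = math.floor(rn), math.ceil(rn)
    if frn == crn:                      # RN lands exactly on a row
        return ordered[frn - 1]
    # Weight each neighbor by its distance from RN.
    return (crn - rn) * ordered[frn - 1] + (rn - frn) * ordered[crn - 1]


# Sales for week < 11, copied from the article's output.
sales_1998 = [117.76, 58.78, 58.78, 58.15, 29.8, 29.49, 29.49, 29.39]
sales_2001 = [256.7, 118.38, 93.44, 92.67, 92.26, 69.96, 69.05, 47.24, 46.06, 22.44]

pc_1998 = percentile_cont(0.5, sales_1998, descending=True)
pc_2001 = percentile_cont(0.5, sales_2001, descending=True)
print(pc_1998, pc_2001)
```

For 1998, N = 8 gives RN = 4.5, so the result is the midpoint of the 4th and 5th rows (58.15 and 29.8), which is the 43.975 shown in the PC column. For 2001, RN = 5.5 averages 92.26 and 69.96 to 81.11.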
Percentile_disc (largely the same as Percentile_cont)
The difference is that the value it returns is always one that actually occurs in the partition's rows.
If no row matches exactly, Percentile_disc does not interpolate. It returns the first value, in sort order, whose cumulative distribution (CUME_DIST) reaches the requested percentile; in the output below that is the earlier of the two neighboring rows.

    SH@ prod> select year, week, sale,
                     percentile_disc(0.5) within group (order by sale desc) over (partition by year) pc,
                     percent_rank() over (partition by year order by sale desc) pr
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 11;

    YEAR  WEEK     SALE       PC          PR
    ----  ----  -------  -------  ----------
    1998    10   117.76    58.15           0
    1998     9    58.78    58.15  .142857143
    1998     6    58.78    58.15  .142857143
    1998     1    58.15    58.15  .428571429
    1998     5     29.8    58.15  .571428571
    1998     3    29.49    58.15  .714285714
    1998     4    29.49    58.15  .714285714
    1998     2    29.39    58.15           1
    1999     8   103.11       72           0
    1999     3     94.6       72  .142857143
    1999     5    80.01       72  .285714286
    1999    10       72       72  .428571429
    1999     1    53.52       72  .571428571
    1999     9    53.34       72  .714285714
    1999     6     40.5       72  .857142857
    1999     4     40.5       72  .857142857
    2000     3    93.41     46.7           0
    2000     7     70.8     46.7          .2
    2000     5     46.7     46.7          .4
    2000     1     46.7     46.7          .4
    2000     4    46.54     46.7          .8
    2000     8    46.54     46.7          .8
    2001     4    256.7    92.26           0
    2001     2   118.38    92.26  .111111111
    2001     5    93.44    92.26  .222222222
    2001     9    92.67    92.26  .333333333
    2001     1    92.26    92.26  .444444444
    2001     7    69.96    92.26  .555555556
    2001    10    69.05    92.26  .666666667
    2001     3    47.24    92.26  .777777778
    2001     8    46.06    92.26  .888888889
    2001     6    22.44    92.26           1

    32 rows selected.
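The discrete variant can be sketched in the same way. The pure-Python function below walks the sorted values and returns the first one whose cumulative distribution (rows up to and including its last tie, divided by N) is at least the requested percentile. The function name is invented for this illustration.

```python
def percentile_disc(p, values, descending=False):
    """First value in sort order whose cumulative distribution is >= p."""
    ordered = sorted(values, reverse=descending)
    n = len(ordered)
    for value in ordered:
        # Rows at or before this value in sort order, counting all of its ties.
        rows_up_to_value = n - ordered[::-1].index(value)
        if rows_up_to_value / n >= p:
            return value
    return ordered[-1]


# Sales for week < 11, copied from the article's output.
sales_1998 = [117.76, 58.78, 58.78, 58.15, 29.8, 29.49, 29.49, 29.39]
sales_1999 = [103.11, 94.6, 80.01, 72, 53.52, 53.34, 40.5, 40.5]
sales_2000 = [93.41, 70.8, 46.7, 46.7, 46.54, 46.54]

pd_1998 = percentile_disc(0.5, sales_1998, descending=True)
pd_1999 = percentile_disc(0.5, sales_1999, descending=True)
pd_2000 = percentile_disc(0.5, sales_2000, descending=True)
print(pd_1998, pd_1999, pd_2000)
```

Each result is an actual sale from the partition (58.15, 72 and 46.7), matching the PC column, whereas the interpolated values for 1998 and 1999 were 43.975 and 62.76.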
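NTILE (similar to building a histogram; no windowing)
Distributes the sorted rows evenly into the specified number of buckets and returns the bucket number. If the rows cannot be divided equally, the bucket sizes differ by at most one row.
In later processing, outliers can be removed by discarding the first or the last bucket.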
Note: rows are assigned to buckets by position, not by value, so equal values can land in different buckets.

    SH@ prod> select year, week, sale,
                     ntile(10) over (order by sale) group#
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and year = 1998
              order by year, sale;

    YEAR  WEEK     SALE  GROUP#
    ----  ----  -------  ------
    1998    50    28.76       1
    1998     2    29.39       1
    1998     4    29.49       1
    1998     3    29.49       1
    1998     5     29.8       2
    1998    43    57.52       2
    1998    35    57.52       2
    1998    40    57.52       2
    1998    46    57.52       3
    1998    27    57.52       3
    1998    45    57.52       3
    1998    44    57.52       3
    1998    47    57.72       4
    1998    29    57.72       4
    1998    28    57.72       4
    1998     1    58.15       4
    1998    41    58.32       5
    1998    51    58.32       5
    1998    14    58.78       5
    1998     9    58.78       5
    1998    15    58.78       6
    1998    17    58.78       6
    1998     6    58.78       6
    1998    19    58.98       6
    1998    21     59.6       7
    1998    12     59.6       7
    1998    52    86.38       7
    1998    34   115.44       8
    1998    39   115.84       8
    1998    42   115.84       8
    1998    38   115.84       9
    1998    23   117.56       9
    1998    18   117.56       9
    1998    26   117.56      10
    1998    10   117.76      10
    1998    48   172.56      10

    36 rows selected.
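The bucket sizes in the output follow from simple integer division. The pure-Python sketch below (the function name is invented for this illustration) gives every bucket the quotient of rows divided by buckets and hands the remainder out one row at a time starting from bucket 1.

```python
def ntile_numbers(row_count, buckets):
    """Bucket number for each of row_count already-sorted rows."""
    base, extra = divmod(row_count, buckets)
    numbers = []
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets each take one additional row.
        size = base + (1 if bucket <= extra else 0)
        numbers.extend([bucket] * size)
    return numbers


group_numbers = ntile_numbers(36, 10)
bucket_sizes = [group_numbers.count(b) for b in range(1, 11)]
print(bucket_sizes)
```

For 36 rows and 10 buckets the quotient is 3 and the remainder 6, so buckets 1 to 6 hold four rows and buckets 7 to 10 hold three, as in the GROUP# column. This is also why the run of 57.52 values is split between buckets 2 and 3.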
Stddev: standard deviation (the square root of the variance; supports windowing)
    SH@ prod> select year, week, sale,
                     stddev(sale) over (partition by product, country, region, year
                                        order by sale desc
                                        rows between 2 preceding and 2 following) stddv
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 10
              order by year, week;

    YEAR  WEEK     SALE       STDDV
    ----  ----  -------  ----------
    1998     1    58.15  15.8453416
    1998     2    29.39  .057735027
    1998     3    29.49  .178021534
    1998     4    29.49  12.7945918
    1998     5     29.8   15.815738
    1998     6    58.78   .36373067
    1998     9    58.78  14.3880654
    1999     1    53.52   22.178931
    1999     3     94.6  21.7319902
    1999     4     40.5  7.46550065
    1999     5    80.01  22.9761992
    1999     6     40.5  7.41317746
    1999     8   103.11  11.6825953
    1999     9    53.34  16.1305511
    2000     1     46.7  21.0022332
    2000     3    93.41  23.3589605
    2000     4    46.54  .092376043
    2000     5     46.7  10.8139207
    2000     7     70.8  22.4285538
    2000     8    46.54  .092376043
    2001     1    92.26  20.3811452
    2001     2   118.38  78.5152276
    2001     3    47.24  26.5077898
    2001     4    256.7   87.947194
    2001     5    93.44   71.309193
    2001     6    22.44  13.9900965
    2001     7    69.96  22.9124643
    2001     8    46.06   19.407678
    2001     9    92.67  17.1409691

    29 rows selected.
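Two of the 1998 figures can be recomputed to see what the window contains. The pure-Python sketch below sorts the 1998 sales in descending order, takes the rows from two before to two after each position, and applies the sample standard deviation from Python's statistics module. The helper name is invented for this illustration.

```python
import statistics

# 1998 sales (week < 10) copied from the article's output.
sales_1998 = [58.15, 29.39, 29.49, 29.49, 29.8, 58.78, 58.78]


def sliding_stddev(values, before=2, after=2):
    """Sample standard deviation over a row window around each sorted position."""
    ordered = sorted(values, reverse=True)          # order by sale desc
    result = []
    for i, value in enumerate(ordered):
        window = ordered[max(0, i - before): i + after + 1]
        # A single-row window has no sample deviation; report None in that case.
        stddev = statistics.stdev(window) if len(window) > 1 else None
        result.append((value, stddev))
    return result


stddev_1998 = sliding_stddev(sales_1998)
for sale, stddev in stddev_1998:
    print(sale, stddev)
```

The row for 58.15 sees the window 58.78, 58.78, 58.15, 29.8, 29.49 and yields about 15.8453, and the last row (29.39) sees only 29.49, 29.49, 29.39 and yields about 0.0577; both agree with the STDDV column. Rows with tied sales can differ from the article's output, because the order among ties is arbitrary.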
Listagg (concatenates a column's values within the partition in the given order; no windowing)
    SH@ prod> col stddv for a60
    SH@ prod> select year, week, sale,
                     listagg(sale, ' , ') within group (order by sale desc)
                         over (partition by product, country, region, year) stddv
              from sales_fact
              where country in ('Australia') and product = 'Xtend Memory' and week < 5
              order by year, week;

    YEAR  WEEK     SALE  STDDV
    ----  ----  -------  ------------------------------------------------------------
    1998     1    58.15  58.15 , 29.49 , 29.49 , 29.39
    1998     2    29.39  58.15 , 29.49 , 29.49 , 29.39
    1998     3    29.49  58.15 , 29.49 , 29.49 , 29.39
    1998     4    29.49  58.15 , 29.49 , 29.49 , 29.39
    1999     1    53.52  94.6 , 53.52 , 40.5
    1999     3     94.6  94.6 , 53.52 , 40.5
    1999     4     40.5  94.6 , 53.52 , 40.5
    2000     1     46.7  93.41 , 46.7 , 46.54
    2000     3    93.41  93.41 , 46.7 , 46.54
    2000     4    46.54  93.41 , 46.7 , 46.54
    2001     1    92.26  256.7 , 118.38 , 92.26 , 47.24
    2001     2   118.38  256.7 , 118.38 , 92.26 , 47.24
    2001     3    47.24  256.7 , 118.38 , 92.26 , 47.24
    2001     4    256.7  256.7 , 118.38 , 92.26 , 47.24

    14 rows selected.
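The effect of analytic functions on predicate pushing
A view that uses an analytic function restricts predicate pushing into the view. Because the analytic function's result is computed by referencing other rows, trimming the data source first could change the result.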
    SH@ prod> create or replace view max_5_weeks_vw as
              select country, product, region, year, week, sale,
                     max(sale) over (partition by product, country, region, year
                                     order by year, week
                                     rows between 2 preceding and 2 following) max_weeks_5
              from sales_fact;

    View created.

    SH@ prod> select year, week, sale, max_weeks_5
              from max_5_weeks_vw
              where country in ('Australia') and product = 'Xtend Memory'
                and region = 'Australia' and year = 2000 and week < 14
              order by year, week;

    YEAR  WEEK     SALE  MAX_WEEKS_5
    ----  ----  -------  -----------
    2000     1     46.7        93.41
    2000     3    93.41        93.41
    2000     4    46.54        93.41
    2000     5     46.7        93.41
    2000     7     70.8        93.74
    2000     8    46.54        93.74
    2000    11    93.74        117.5
    2000    12    46.54       117.67
    2000    13    117.5       117.67

    9 rows selected.

    SH@ prod> explain plan for
              select year, week, sale, max_weeks_5
              from max_5_weeks_vw
              where country in ('Australia') and product = 'Xtend Memory'
                and region = 'Australia' and year = 2000 and week < 14
              order by year, week;

    Explained.
SH@ prod> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ Plan hash value: 4167461139 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 90 | 5220 | 310 (1)| 00:00:04 | |* 1 | VIEW | MAX_5_WEEKS_VW | 90 | 5220 | 310 (1)| 00:00:04 | | 2 | WINDOW SORT | | 90 | 9450 | 310 (1)| 00:00:04 | |* 3 | TABLE ACCESS FULL| SALES_FACT | 90 | 9450 | 309 (1)| 00:00:04 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("WEEK"<14) 3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND "REGION"='Australia' AND "YEAR"=2000) Note ----- - dynamic sampling used for this statement (level=2) 21 rows selected.對比沒有分析函數的視圖。直接將謂詞推入到視圖里面。
SH@ prod> create or replace view max_5_weeks_vw1 as
          select country , product , region , year , week , sale
          from sales_fact ;

View created.

SH@ prod> explain plan for
          select year , week , sale from max_5_weeks_vw1
          where country in ('Australia' ) and product = 'Xtend Memory'
          and region = 'Australia' and year = 2000 and week < 14
          order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1978576542

--------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |   105 |   310   (1)| 00:00:04 |
|   1 |  SORT ORDER BY     |            |     1 |   105 |   310   (1)| 00:00:04 |
|*  2 |   TABLE ACCESS FULL| SALES_FACT |     1 |   105 |   309   (1)| 00:00:04 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND
              "REGION"='Australia' AND "YEAR"=2000 AND "WEEK"<14)

Note
-----
   - dynamic sampling used for this statement (level=2)

19 rows selected.
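Why the week filter cannot be pushed below the window can be checked with a small experiment. The sketch below is only an illustration: it uses Python's sqlite3 module as a stand-in for Oracle (assuming a SQLite build with window functions, 3.25 or later), and the five-row sales_fact table and its values are made up for the example.

```python
import sqlite3

# Hypothetical miniature of sales_fact: a single partition with five weeks.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany(
    "insert into sales_fact values (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 50.0), (4, 30.0), (5, 40.0)],
)

# Same window shape as max_weeks_5 in the view above.
WINDOW_MAX = (
    "max(sale) over (order by week "
    "rows between 2 preceding and 2 following)"
)

# Filter applied AFTER the window function: this is what querying the view means.
filter_after = conn.execute(
    f"select week, max_weeks_5 from "
    f"(select week, {WINDOW_MAX} as max_weeks_5 from sales_fact) "
    f"where week < 3 order by week"
).fetchall()

# Filter applied BEFORE the window function: what pushing the predicate would do.
filter_before = conn.execute(
    f"select week, {WINDOW_MAX} as max_weeks_5 from sales_fact "
    f"where week < 3 order by week"
).fetchall()

print(filter_after)   # the window still sees week 3 (sale 50.0)
print(filter_before)  # the window only sees weeks 1 and 2
```

The two result sets differ even though they list the same weeks, which is why the optimizer keeps such a filter above the window step.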
Using analytic functions in dynamic SQL
SH@ prod> create or replace procedure analytic_dynamic_prc (
            part_col_string varchar2 , v_country varchar2 , v_product varchar2 )
          is
            type numtab is table of number(18 , 2) index by binary_integer ;
            l_year numtab ;
            l_week numtab ;
            l_sale numtab ;
            l_rank numtab ;
            l_sql_string varchar2(512) ;
          begin
            -- The partition column list is concatenated into the statement text;
            -- chr(39) supplies the single quotes around the literal values.
            l_sql_string := 'select * from ( select year , week , sale , rank() over( partition by ' || part_col_string
              || ' order by sale desc ) sales_rank from sales_fact where country in ('
              || chr(39) || v_country || chr(39)
              || ' ) and product = ' || chr(39) || v_product || chr(39)
              || ' order by product , country , year , week ) where sales_rank <= 10 order by 1,4' ;
            execute immediate l_sql_string bulk collect into l_year , l_week , l_sale , l_rank ;
            for i in 1..l_year.count loop
              dbms_output.put_line( l_year(i) || ' | ' || l_week(i) || ' | ' || l_sale(i) || ' | ' || l_rank(i) ) ;
            end loop ;
          end ;
          /

Procedure created.

SH@ prod> exec analytic_dynamic_prc('product , country , region' , 'Australia' , 'Xtend Memory' ) ;
1998 | 48 | 172.56 | 9
2000 | 46 | 246.74 | 3
2000 | 21 | 187.48 | 5
2000 | 43 | 179.12 | 7
2000 | 34 | 178.52 | 8
2001 | 16 | 278.44 | 1
2001 | 4 | 256.7 | 2
2001 | 21 | 233.7 | 4
2001 | 48 | 182.96 | 6
2001 | 30 | 162.91 | 10
2001 | 14 | 162.91 | 10

PL/SQL procedure successfully completed.
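The procedure above builds its statement by concatenating both the partition column list and the literal values. As a hedged sketch of the same idea, the Python example below (sqlite3 standing in for Oracle, window functions assumed available) keeps the partition list dynamic but checks it against a whitelist, since column names cannot be bound, and passes the values as bind parameters. The whitelist, the function name build_rank_query and the German sample row are assumptions made for illustration; the three Australian sale values are taken from the outputs earlier in this article.

```python
import sqlite3

# Assumed whitelist of columns allowed in the PARTITION BY list.
ALLOWED_PARTITION_COLUMNS = {"product", "country", "region", "year"}


def build_rank_query(part_col_string):
    """Build the top-10 ranking query for a comma-separated partition list."""
    columns = [name.strip() for name in part_col_string.split(",")]
    unknown = [name for name in columns if name not in ALLOWED_PARTITION_COLUMNS]
    if unknown:
        raise ValueError(f"unexpected partition column(s): {unknown}")
    partition_by = ", ".join(columns)
    # Only validated identifiers are concatenated; values stay as ? placeholders.
    return (
        "select * from ("
        " select year, week, sale,"
        f" rank() over (partition by {partition_by} order by sale desc) as sales_rank"
        " from sales_fact where country = ? and product = ?"
        ") where sales_rank <= 10 order by year, sales_rank"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales_fact (product text, country text, region text,"
    " year integer, week integer, sale real)"
)
conn.executemany(
    "insert into sales_fact values (?, ?, ?, ?, ?, ?)",
    [
        ("Xtend Memory", "Australia", "Australia", 2000, 1, 46.7),
        ("Xtend Memory", "Australia", "Australia", 2000, 3, 93.41),
        ("Xtend Memory", "Australia", "Australia", 2001, 4, 256.7),
        ("Xtend Memory", "Germany", "Western Europe", 2001, 4, 300.0),  # made-up row
    ],
)

top_sales = conn.execute(
    build_rank_query("product , country , region"),
    ("Australia", "Xtend Memory"),
).fetchall()
print(top_sales)

# A partition list that is not a plain column name is rejected.
rejected = False
try:
    build_rank_query("week; drop table sales_fact")
except ValueError:
    rejected = True
print(rejected)
```

In PL/SQL the analogous choice would be to validate the column list and use bind variables with execute immediate ... using; that is a design suggestion, not something the original procedure does.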
"Nesting" analytic functions
Analytic functions cannot be nested directly, but the same effect can be achieved with a subquery.
select year , week , top_sale_year ,
  lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
from (
  select distinct
    -- MAX cannot replace first_value here: the column being returned
    -- differs from the column being ordered by.
    first_value(year) over (
      partition by product , country , region , year
      order by sale desc
      rows between unbounded preceding and unbounded following ) year ,
    first_value(week) over (
      partition by product , country , region , year
      order by sale desc
      rows between unbounded preceding and unbounded following ) week ,
    first_value(sale) over (
      partition by product , country , region , year
      order by sale desc
      rows between unbounded preceding and unbounded following ) top_sale_year
  from sales_fact
  where country in ('Australia') and product = 'Xtend Memory' )
order by year , week ;
Execution result:
SH@ prod> select year , week , top_sale_year ,
            lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
          from (
            select distinct
              first_value(year) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) year ,
              first_value(week) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) week ,
              first_value(sale) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) top_sale_year
            from sales_fact
            where country in ('Australia') and product = 'Xtend Memory' )
          order by year , week ;

      YEAR       WEEK TOP_SALE_YEAR PREV_TOP_SALE_YER
---------- ---------- ------------- -----------------
      1998         48        172.56            148.12
      1999         17        148.12            246.74
      2000         46        246.74            278.44
      2001         16        278.44
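The subquery pattern can be tried on a handful of rows. The following is a minimal sketch that uses Python's sqlite3 module in place of Oracle (assuming window-function support, SQLite 3.25 or later). The nine rows are a subset of the week < 5 figures shown earlier in this article, and the table is reduced to three columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (year integer, week integer, sale real)")
conn.executemany(
    "insert into sales_fact values (?, ?, ?)",
    [
        (1999, 1, 53.52), (1999, 3, 94.6), (1999, 4, 40.5),
        (2000, 1, 46.7), (2000, 3, 93.41), (2000, 4, 46.54),
        (2001, 1, 92.26), (2001, 2, 118.38), (2001, 4, 256.7),
    ],
)

# Inner level: first_value picks the week and sale of each year's best week.
# Outer level: lag compares each year's best sale with the following year's,
# because the outer window is ordered by year descending.
top_weeks = conn.execute(
    """
    select year, week, top_sale_year,
           lag(top_sale_year) over (order by year desc) as prev_top_sale_year
    from (
        select distinct
               year,
               first_value(week) over (
                   partition by year order by sale desc
                   rows between unbounded preceding and unbounded following) as week,
               first_value(sale) over (
                   partition by year order by sale desc
                   rows between unbounded preceding and unbounded following) as top_sale_year
        from sales_fact
    )
    order by year
    """
).fetchall()

for row in top_weeks:
    print(row)
```

The inner query collapses each year to one row before the outer lag runs, which is the two-step evaluation that direct nesting would not allow.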
Analytic functions and parallel execution
SH@ prod> explain plan for
          select year , week , top_sale_year ,
            lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
          from (
            select distinct
              first_value(year) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) year ,
              first_value(week) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) week ,
              first_value(sale) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) top_sale_year
            from sales_fact
            where country in ('Australia') and product = 'Xtend Memory' )
          order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 2124823565

------------------------------------------------------------------------------------
| Id  | Operation               | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |            |   197 |  7683 |   313   (2)| 00:00:04 |
|   1 |  SORT ORDER BY          |            |   197 |  7683 |   313   (2)| 00:00:04 |
|   2 |   WINDOW SORT           |            |   197 |  7683 |   313   (2)| 00:00:04 |
|   3 |    VIEW                 |            |   197 |  7683 |   311   (1)| 00:00:04 |
|   4 |     HASH UNIQUE         |            |   197 | 20685 |   311   (1)| 00:00:04 |
|   5 |      WINDOW SORT        |            |   197 | 20685 |   311   (1)| 00:00:04 |
|*  6 |       TABLE ACCESS FULL | SALES_FACT |   197 | 20685 |   309   (1)| 00:00:04 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')

Note
-----
   - dynamic sampling used for this statement (level=2)

22 rows selected.
(Note that the DISTINCT operation is carried out with HASH UNIQUE rather than a sort.)
Now add a parallel hint to the statement above.
SH@ prod> explain plan for
          select /*+ parallel(3)*/ year , week , top_sale_year ,
            lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
          from (
            select distinct
              first_value(year) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) year ,
              first_value(week) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) week ,
              first_value(sale) over (
                partition by product , country , region , year
                order by sale desc
                rows between unbounded preceding and unbounded following ) top_sale_year
            from sales_fact
            where country in ('Australia') and product = 'Xtend Memory' )
          order by year , week ;

Explained.

SH@ prod> set linesize 180
SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2880616722

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |            |   197 |  7683 |   119   (5)| 00:00:02 |        |      |            |
|   1 |  SORT ORDER BY                   |            |   197 |  7683 |   119   (5)| 00:00:02 |        |      |            |
|   2 |   WINDOW BUFFER                  |            |   197 |  7683 |   119   (5)| 00:00:02 |        |      |            |
|   3 |    PX COORDINATOR                |            |       |       |            |          |        |      |            |
|   4 |     PX SEND QC (ORDER)           | :TQ10003   |   197 |  7683 |   119   (5)| 00:00:02 |  Q1,03 | P->S | QC (ORDER) |
|   5 |      SORT ORDER BY               |            |   197 |  7683 |   119   (5)| 00:00:02 |  Q1,03 | PCWP |            |
|   6 |       PX RECEIVE                 |            |   197 |  7683 |   117   (3)| 00:00:02 |  Q1,03 | PCWP |            |
|   7 |        PX SEND RANGE             | :TQ10002   |   197 |  7683 |   117   (3)| 00:00:02 |  Q1,02 | P->P | RANGE      |
|   8 |         VIEW                     |            |   197 |  7683 |   117   (3)| 00:00:02 |  Q1,02 | PCWP |            |
|   9 |          HASH UNIQUE             |            |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,02 | PCWP |            |
|  10 |           PX RECEIVE             |            |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,02 | PCWP |            |
|  11 |            PX SEND HASH          | :TQ10001   |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,01 | P->P | HASH       |
|  12 |             WINDOW SORT          |            |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,01 | PCWP |            |
|  13 |              PX RECEIVE          |            |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,01 | PCWP |            |
|  14 |               PX SEND HASH       | :TQ10000   |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,00 | P->P | HASH       |
|  15 |                PX BLOCK ITERATOR |            |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,00 | PCWC |            |
|* 16 |                 TABLE ACCESS FULL| SALES_FACT |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

  16 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')

Note
-----
   - dynamic sampling used for this statement (level=2)
   - Degree of Parallelism is 3 because of hint

33 rows selected.
Oracle advanced ranking functions and advanced grouping functions
Note: this content comes from 《Oracle 高級排序函數 和 高級分組函數》.
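Advanced ranking functions:
[ ROW_NUMBER() | RANK() | DENSE_RANK() ] OVER (partition by xx order by xx)
1.row_number() consecutive, strictly increasing numbers: 1 2 3 4
row_number() over (partition by xx order by xx )
-- Partition the student table by class_id, sort each class by score in descending order, and number the rows within each group.
-- The query has no tie-breaker: rows with equal scores are numbered in an arbitrary order unless a further column (for example the student number) is added to the order by.
select row_number() over(partition by class_id order by score desc)rn,t.* from student2016 t
2.rank() ranking with gaps: equal values share a rank and the following rank is skipped ahead: 1 2 2 2 5
rank() over (partition by xx order by xx )
select rank() over(partition by class_id order by score desc)rn,t.* from student2016 t
3.dense_rank() ranking without gaps: equal values share a rank and the following rank continues from it: 1 2 2 2 3
dense_rank() over (partition by xx order by xx )
select dense_rank() over(partition by class_id order by score desc)rn,t.* from student2016 t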
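The three numbering schemes are easiest to compare side by side. The sketch below runs them over five made-up scores using Python's sqlite3 module as a stand-in for Oracle (window functions assumed available, SQLite 3.25 or later); the table name student and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (student_id integer, score integer)")
# Hypothetical scores: three students tie on 90.
conn.executemany(
    "insert into student values (?, ?)",
    [(1, 95), (2, 90), (3, 90), (4, 90), (5, 80)],
)

rows = conn.execute(
    """
    select student_id, score,
           row_number() over (order by score desc, student_id) as rn,
           rank()       over (order by score desc)             as rnk,
           dense_rank() over (order by score desc)             as dense_rnk
    from student
    order by score desc, student_id
    """
).fetchall()

row_numbers = [r[2] for r in rows]
ranks = [r[3] for r in rows]
dense_ranks = [r[4] for r in rows]

print(row_numbers)   # consecutive numbers; student_id breaks the tie
print(ranks)         # ties share a rank, then the sequence jumps
print(dense_ranks)   # ties share a rank, then the sequence continues
```

Note that row_number() is given student_id as a tie-breaker here so that its numbering is deterministic.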
----------------------------------------------------------------------------------------------------------------------------
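Advanced grouping functions
group by rollup(a,b,c)
select a,b,c,sum(d) from test group by rollup(a,b,c)
rollup groups by the listed columns, then drops one column at a time from right to left, until no column is left (that is, a single group over the whole table).
A rollup with n arguments therefore produces n+1 groupings: group by a,b,c union all group by a,b union all group by a union all the grand total over test.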
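The union all description can be written out literally. The sketch below does so with Python's sqlite3 module, which has no rollup, so the four groupings are spelled out by hand; the contents of the test table are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (a text, b text, c text, d integer)")
# Hypothetical rows for the a, b, c, d columns used in the text.
conn.executemany(
    "insert into test values (?, ?, ?, ?)",
    [("x", "p", "m", 1), ("x", "p", "n", 2), ("x", "q", "m", 3), ("y", "p", "m", 4)],
)

# Hand-written equivalent of: group by rollup(a, b, c)
rollup_rows = conn.execute(
    """
    select a, b, c, sum(d) from test group by a, b, c
    union all
    select a, b, null, sum(d) from test group by a, b
    union all
    select a, null, null, sum(d) from test group by a
    union all
    select null, null, null, sum(d) from test
    """
).fetchall()

for row in rollup_rows:
    print(row)
print(len(rollup_rows))  # 4 detail groups + 3 (a, b) groups + 2 a groups + 1 total
```

Columns that have been rolled away appear as null in the subtotal rows, which is also how Oracle reports them.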
----------------------------------------------------------------------------------
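group by cube(a,b,c)
With n arguments, cube produces 2^n groupings.
For three columns these are abc, ab, ac, bc, a, b, c and finally the grand total over all rows.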
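The n+1 and 2^n counts follow directly from which column subsets each clause groups by. As a small illustration in plain Python (the function names are made up for this sketch):

```python
from itertools import combinations


def rollup_grouping_sets(columns):
    """Prefixes of the column list, dropping one column from the right each time."""
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]


def cube_grouping_sets(columns):
    """Every subset of the column list, largest subsets first."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets


rollup_sets = rollup_grouping_sets(["a", "b", "c"])
cube_sets = cube_grouping_sets(["a", "b", "c"])

print(rollup_sets)  # n + 1 = 4 groupings
print(cube_sets)    # 2^n = 8 groupings
```

The empty tuple stands for the grand total, where no grouping column remains.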
----------------------------------------------------------------------------------
group by grouping sets(a,b)
This returns only the result of grouping by a together with the result of grouping by b.
-- Create the sales table
create table sales_tab(
  year_id number not null,
  month_id number not null,
  day_id number not null,
  sales_value number(10,2) not null
);

-- Insert data (the values are random, so the row counts noted below are those of the original run)
insert into sales_tab
select trunc(dbms_random.value(low=>2010,high=>2012)) as year_id,
       trunc(dbms_random.value(low=>1,high=>13)) as month_id,
       trunc(dbms_random.value(low=>1,high=>32)) as day_id,
       round(dbms_random.value(low=>1,high=>100)) as sales_value
from dual
connect by level <=1000;

-- Plain group by queries
select sum(t.sales_value) from SALES_TAB t; -- 1 row

select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales
from SALES_TAB t
group by t.year_id,t.month_id,t.day_id
order by t.year_id,t.month_id,t.day_id desc; -- 540 rows

select t.year_id,t.month_id,sum(t.sales_value) sales
from SALES_TAB t
group by t.year_id,t.month_id
order by t.year_id,t.month_id desc; -- 24 rows

select t.year_id,sum(t.sales_value) sales
from SALES_TAB t
group by t.year_id
order by t.year_id desc; -- 2 rows

-- Using the advanced grouping functions
-- group by rollup(a,b,c)
select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales
from SALES_TAB t
group by rollup(t.year_id,t.month_id,t.day_id)
order by t.year_id,t.month_id,t.day_id; -- 567 rows = 1 + 540 + 24 + 2 from the queries above

-- group by cube(a,b,c)
select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales
from SALES_TAB t
group by cube(t.year_id,t.month_id,t.day_id)
order by t.year_id,t.month_id,t.day_id;

-- group by cube(a,b) compared with group by grouping sets(a,b)
select t.year_id,t.month_id,sum(t.sales_value) sales
from SALES_TAB t
group by cube(t.year_id,t.month_id)
order by 1,2; -- 39 rows

select t.year_id,t.month_id,sum(t.sales_value) sales
from SALES_TAB t
group by grouping sets(t.year_id,t.month_id)
order by 1,2; -- 14 rows
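The 39 and 14 row counts noted in the script can be reconstructed from the plain group by counts. The sketch below assumes a deterministic stand-in for sales_tab in which every (year, month) pair occurs, loaded into SQLite through Python's sqlite3 module; the real table is random, so this only illustrates the arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales_tab (year_id integer, month_id integer, sales_value real)"
)
# Assumed data: 2 years x 12 months, three rows per pair.
conn.executemany(
    "insert into sales_tab values (?, ?, ?)",
    [(year, month, 1.0)
     for year in (2010, 2011)
     for month in range(1, 13)
     for _ in range(3)],
)


def count_groups(group_by_columns):
    """Number of rows returned by a plain group by on the given columns."""
    sql = f"select count(*) from sales_tab group by {group_by_columns}"
    return len(conn.execute(sql).fetchall())


by_year_month = count_groups("year_id, month_id")  # one row per (year, month)
by_year = count_groups("year_id")
by_month = count_groups("month_id")
grand_total = 1

# cube(year_id, month_id) returns all four groupings.
cube_rows = by_year_month + by_year + by_month + grand_total
# grouping sets(year_id, month_id) returns only the two single-column groupings.
grouping_sets_rows = by_year + by_month
# rollup(year_id, month_id) would omit the month-only grouping.
rollup_rows = by_year_month + by_year + grand_total

print(cube_rows, grouping_sets_rows, rollup_rows)
```

With 2 years and 12 months this gives 24 + 2 + 12 + 1 = 39 rows for cube and 2 + 12 = 14 rows for grouping sets, matching the comments in the script.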
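Oracle advanced queries: OVER (PARTITION BY ..)
Note: this content comes from 《Oracle高級查詢之OVER (PARTITION BY ..)》.
To make learning and testing easier, all examples are built under the Scott user that ships with Oracle.
Note: where a section heading below includes order by, the order by clause is mandatory when using that function.
1. rank()/dense_rank() over(partition by ... order by ...)
Suppose the customer asks for the details of the highest-paid employee in each department. Anyone with some Oracle experience can write the following SQL: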
select e.ename, e.job, e.sal, e.deptno
  from scott.emp e,
       (select e.deptno, max(e.sal) sal from scott.emp e group by e.deptno) me
 where e.deptno = me.deptno
   and e.sal = me.sal;
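Having met the requirement, it is a good habit to ask whether there is another way to do it. There is: the rank() over(partition by ...) or dense_rank() over(partition by ...) syntax from this section's heading. The two statements are as follows: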
select e.ename, e.job, e.sal, e.deptno
  from (select e.ename,
               e.job,
               e.sal,
               e.deptno,
               rank() over(partition by e.deptno order by e.sal desc) rank
          from scott.emp e) e
 where e.rank = 1;

select e.ename, e.job, e.sal, e.deptno
  from (select e.ename,
               e.job,
               e.sal,
               e.deptno,
               dense_rank() over(partition by e.deptno order by e.sal desc) rank
          from scott.emp e) e
 where e.rank = 1;
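Why do these return the same result as the first statement? Here is a short explanation of the rank()/dense_rank() over(partition by e.deptno order by e.sal desc) syntax.
over: the set of rows over which the function is evaluated.
partition by e.deptno: partition the rows by department number.
order by e.sal desc: sort by salary from high to low (order by is mandatory with rank()/dense_rank(); omitting it is an error).
rank()/dense_rank(): assign the rank.
Taken together, the statement means: within each department, rank the employees by salary from high to low, with the rank expressed as ascending numbers (the smallest is always 1).
So what is the difference between rank() and dense_rank()?
rank(): ranking with gaps; if two rows tie for rank 1, the next rank is 3.
dense_rank(): ranking without gaps; if two rows tie for rank 1, the next rank is still 2.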
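Both points, that the ranking query returns the same rows as the group by join and that the two functions differ only after a tie, can be checked on a few rows. The sketch below uses Python's sqlite3 module as a stand-in for Oracle (window functions assumed available); the emp table is a simplified, hypothetical version with rows modelled on scott.emp, and the alias rnk replaces rank to avoid any keyword clash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
# Sample rows modelled on scott.emp (an assumption); department 20 has a tie at the top.
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        ("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10),
        ("JONES", 2975, 20), ("SCOTT", 3000, 20), ("FORD", 3000, 20),
        ("ALLEN", 1600, 30), ("BLAKE", 2850, 30),
    ],
)

# Highest-paid employees via a join with the per-department maximum.
via_group_by = conn.execute(
    """
    select e.ename, e.sal, e.deptno
    from emp e,
         (select deptno, max(sal) as sal from emp group by deptno) me
    where e.deptno = me.deptno and e.sal = me.sal
    order by e.deptno, e.ename
    """
).fetchall()

# The same question answered with rank() = 1.
via_rank = conn.execute(
    """
    select ename, sal, deptno
    from (select ename, sal, deptno,
                 rank() over (partition by deptno order by sal desc) as rnk
          from emp)
    where rnk = 1
    order by deptno, ename
    """
).fetchall()

# rank() and dense_rank() side by side inside department 20.
dept_20 = conn.execute(
    """
    select ename,
           rank()       over (order by sal desc) as rnk,
           dense_rank() over (order by sal desc) as dense_rnk
    from emp
    where deptno = 20
    order by sal desc, ename
    """
).fetchall()

print(via_group_by)
print(via_rank)
print(dept_20)
```

Because both FORD and SCOTT earn the department maximum, each approach returns both of them; the two ranking functions only diverge on JONES, the row after the tie.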
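Exercise: query the details of the lowest-paid employee in each department.
2. min()/max() over(partition by ...)
Now that we can find the highest and lowest salary in each department, the customer has a new requirement: alongside each employee's details, show the difference between that employee's salary and the department's highest and lowest salary. This is fairly simple; modify the group by statement from section 1 as follows: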
select e.ename,
       e.job,
       e.sal,
       e.deptno,
       e.sal - me.min_sal diff_min_sal,
       me.max_sal - e.sal diff_max_sal
  from scott.emp e,
       (select e.deptno, min(e.sal) min_sal, max(e.sal) max_sal
          from scott.emp e
         group by e.deptno) me
 where e.deptno = me.deptno
 order by e.deptno, e.sal;
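The statement above uses min() and max(): the former returns the minimum and the latter the maximum. What happens when these two functions are combined with over(partition by ...)? Look at the following SQL: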
select e.ename,
       e.job,
       e.sal,
       e.deptno,
       nvl(e.sal - min(e.sal) over(partition by e.deptno), 0) diff_min_sal,
       nvl(max(e.sal) over(partition by e.deptno) - e.sal, 0) diff_max_sal
  from scott.emp e;
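The two statements return the same rows (the second has no final order by, so the display order may differ). min() and max() still compute the minimum and the maximum, only now within each partition defined by partition by.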
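The equivalence can be verified on a small table. The sketch below uses Python's sqlite3 module in place of Oracle (window functions assumed available, nvl omitted because no value is null) with a simplified, hypothetical emp table whose rows are modelled on scott.emp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
# Sample rows modelled on scott.emp (an assumption).
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        ("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10),
        ("ALLEN", 1600, 30), ("BLAKE", 2850, 30),
    ],
)

# Differences computed by joining to a per-department group by.
via_join = conn.execute(
    """
    select e.ename,
           e.sal - me.min_sal as diff_min_sal,
           me.max_sal - e.sal as diff_max_sal
    from emp e,
         (select deptno, min(sal) as min_sal, max(sal) as max_sal
          from emp group by deptno) me
    where e.deptno = me.deptno
    order by e.ename
    """
).fetchall()

# The same differences computed with min()/max() over(partition by ...).
via_window = conn.execute(
    """
    select ename,
           sal - min(sal) over (partition by deptno) as diff_min_sal,
           max(sal) over (partition by deptno) - sal as diff_max_sal
    from emp
    order by ename
    """
).fetchall()

print(via_join)
print(via_window)
```

The window version reads the table once and needs no join, at the cost of computing the partition minimum and maximum for every row.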
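Exercise: what result do you get if an order by is added inside over() in this example?
3. lead()/lag() over(partition by ... order by ...)
Comparing oneself with others is a famously popular pastime, and customers enjoy it more than anyone. Not satisfied after comparing against the highest and lowest salary, the customer now comes up with a rather perverse requirement: compute the difference between each person's salary and the salary one position above and one position below. This one really had me stuck; I had no idea how to do it with a group by statement. But now that we have over(partition by ...), everything looks simple: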
select e.ename,
       e.job,
       e.sal,
       e.deptno,
       lead(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lead_sal,
       lag(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lag_sal,
       nvl(lead(e.sal) over(partition by e.deptno order by e.sal) - e.sal,
           0) diff_lead_sal,
       nvl(e.sal - lag(e.sal) over(partition by e.deptno order by e.sal), 0) diff_lag_sal
  from scott.emp e;
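After seeing the statement above, do you also feel it was a false alarm (breaking into a cold sweat and then suddenly getting excited is a good way to catch a cold)? Let us go through the two new functions used above.
lead(column, n, m): the value of <column> in the n-th row after the current row, within the partition and in its sort order; if there is no such row, m is returned. Without n and m, it looks at the first row after the current row and returns null if there is none.
lag(column, n, m): the value of <column> in the n-th row before the current row, within the partition and in its sort order; if there is no such row, m is returned. Without n and m, it looks at the first row before the current row and returns null if there is none.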
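The default-value behaviour at partition boundaries is the part that is easiest to get wrong, so here is a minimal sketch of it. It uses Python's sqlite3 module as a stand-in for Oracle (window functions assumed available) and a simplified, hypothetical emp table with rows modelled on scott.emp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
# Sample rows modelled on scott.emp (an assumption).
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        ("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10),
        ("ALLEN", 1600, 30), ("BLAKE", 2850, 30),
    ],
)

neighbours = conn.execute(
    """
    select ename,
           lead(sal, 1, 0) over (partition by deptno order by sal) as lead_sal,
           lag(sal, 1, 0)  over (partition by deptno order by sal) as lag_sal,
           lead(sal)       over (partition by deptno order by sal) as lead_or_null
    from emp
    order by deptno, sal
    """
).fetchall()

for row in neighbours:
    print(row)
```

The last row of each department gets 0 from lead(sal, 1, 0) but None (null) from lead(sal), and the lookups never cross from one department into the next.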
Below are a few more commonly used functions applied with this syntax (note: a function shown with an order by clause requires order by when it is used):
select e.ename,
       e.job,
       e.sal,
       e.deptno,
       -- without an order by inside over(), which row counts as first or last
       -- within the partition is not deterministic
       first_value(e.sal) over(partition by e.deptno) first_sal,
       last_value(e.sal) over(partition by e.deptno) last_sal,
       sum(e.sal) over(partition by e.deptno) sum_sal,
       avg(e.sal) over(partition by e.deptno) avg_sal,
       count(e.sal) over(partition by e.deptno) count_num,
       row_number() over(partition by e.deptno order by e.sal) row_num
  from scott.emp e;
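Important: after reading this article you might come away with the impression that OVER (PARTITION BY ..) is better than GROUP BY. That is not the case. The former cannot replace the latter, and it is not as efficient to execute either; it simply offers more functionality. Choose between them according to the requirement at hand.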
==================================================================================================================================================