Oracle SQL Advanced Programming: A Complete Guide to Analytic Functions (Window Functions)






Note: this article is sourced from "Oracle SQL Advanced Programming: A Complete Guide to Analytic Functions (Window Functions)".



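Overview

An analytic function, also called a window function, computes its result over a subset of the result set that is defined relative to the current row.
The general form is: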

function(arg1, arg2, ...) over (partition-by-clause order-by-clause windowing-clause)

windowing-clause: rows | range between start_expr and end_expr
start_expr: unbounded preceding | current row | n preceding | n following
end_expr:   unbounded following | current row | n preceding | n following


Not every analytic function supports the windowing clause.


Creating the test table


create table sales_fact  as
    select country_name country , country_subregion region , prod_name product , calendar_year year , calendar_week_number week ,
    sum(amount_sold) sale , sum(amount_sold*
    (case when mod(rownum , 10 ) = 0 then 1.4
    when mod(rownum , 5)= 0 then 0.6
    when mod(rownum , 2)= 0 then 0.9
    when mod(rownum , 2)=1 then 1.2
    else 1 end ) ) receipts
    from sales , times , customers , countries , products
    where sales.time_id = times.time_id and
    sales.prod_id = products.prod_id and
    sales.cust_id = customers.cust_id and
    customers.country_id = countries.country_id
    group by country_name , country_subregion , prod_name , calendar_year , calendar_week_number ;


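Using aggregate functions as analytic functions

An analytic function column is simply one more column of values, one value per row. It has no effect on any other part of the query.

The queries below illustrate the following points:


1. Columns in the PARTITION BY clause of OVER need not appear in the select list, but they must exist in the data source.
2. Columns in the ORDER BY clause of OVER need not appear in the select list, but they must exist in the data source.
3. The ORDER BY inside OVER sorts the rows within each partition and is independent of the ORDER BY of the select statement. It does, however, affect the result of the analytic function.
4. The window defined in OVER does not extend beyond the partition. rows between unbounded preceding and current row means from the first row of the partition to the current row.
5. Analytic functions operate on the result set, that is, on the rows left after the WHERE conditions have been applied.

In the query below, the analytic column is the running total of sales from the start of the year through the current week.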

SH@ prod> select year , week , sale ,
    sum(sale) over( partition by region , year order by week rows between unbounded preceding and current row ) running_sum_ytd
    from sales_fact
    where country in ('Australia') and product='Xtend Memory' and week < 10
    order by year , week ;

      YEAR       WEEK       SALE RUNNING_SUM_YTD
---------- ---------- ---------- ---------------
      1998          1      58.15           58.15
      1998          2      29.39           87.54
      1998          3      29.49          117.03
      1998          4      29.49          146.52
      1998          5       29.8          176.32
      1998          6      58.78           235.1
      1998          9      58.78          293.88
      1999          1      53.52           53.52
      1999          3       94.6          148.12
      1999          4       40.5          188.62
      1999          5      80.01          268.63
      1999          6       40.5          309.13
      1999          8     103.11          412.24
      1999          9      53.34          465.58
      2000          1       46.7            46.7
      2000          3      93.41          140.11
      2000          4      46.54          186.65
      2000          5       46.7          233.35
      2000          7       70.8          304.15
      2000          8      46.54          350.69
      2001          1      92.26           92.26
      2001          2     118.38          210.64
      2001          3      47.24          257.88
      2001          4      256.7          514.58
      2001          5      93.44          608.02
      2001          6      22.44          630.46
      2001          7      69.96          700.42

      YEAR       WEEK       SALE RUNNING_SUM_YTD
---------- ---------- ---------- ---------------
      2001          8      46.06          746.48
      2001          9      92.67          839.15

29 rows selected.

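The next query returns the same rows with the same values; only the final sort order differs, so the analytic column no longer looks like a running total.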

SH@ prod> select year , week , sale , sum(sale) over( partition by region , year order by week  rows between unbounded preceding and current row ) running_sum_ytd
  from sales_fact   where country in ('Australia') and product='Xtend Memory' and week < 10  order by year , sale ;

      YEAR       WEEK       SALE RUNNING_SUM_YTD
---------- ---------- ---------- ---------------
      1998          2      29.39           87.54
      1998          4      29.49          146.52
      1998          3      29.49          117.03
      1998          5       29.8          176.32
      1998          1      58.15           58.15
      1998          6      58.78           235.1
      1998          9      58.78          293.88
      1999          4       40.5          188.62
      1999          6       40.5          309.13
      1999          9      53.34          465.58
      1999          1      53.52           53.52
      1999          5      80.01          268.63
      1999          3       94.6          148.12
      1999          8     103.11          412.24
      2000          4      46.54          186.65
      2000          8      46.54          350.69
      2000          1       46.7            46.7
      2000          5       46.7          233.35
      2000          7       70.8          304.15
      2000          3      93.41          140.11
      2001          6      22.44          630.46
      2001          8      46.06          746.48
      2001          3      47.24          257.88
      2001          7      69.96          700.42
      2001          1      92.26           92.26
      2001          9      92.67          839.15
      2001          5      93.44          608.02

      YEAR       WEEK       SALE RUNNING_SUM_YTD
---------- ---------- ---------- ---------------
      2001          2     118.38          210.64
      2001          4      256.7          514.58

29 rows selected.


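If the ORDER BY inside OVER is chosen badly, the analytic column becomes meaningless. The result depends directly on the partitioning, ordering, and windowing that are chosen.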

SH@ prod> select year , week , sale ,   sum(sale) over( partition by  region , year   order by sale rows between unbounded preceding and current row ) running_sum_ytd
  from sales_fact
  where country in ('Australia') and product='Xtend Memory' and week < 10  order by  year , week ;

      YEAR       WEEK       SALE RUNNING_SUM_YTD
---------- ---------- ---------- ---------------
      1998          1      58.15          176.32
      1998          2      29.39           29.39
      1998          3      29.49           88.37
      1998          4      29.49           58.88
      1998          5       29.8          118.17
      1998          6      58.78           235.1
      1998          9      58.78          293.88
      1999          1      53.52          187.86
      1999          3       94.6          362.47
      1999          4       40.5            40.5
      1999          5      80.01          267.87
      1999          6       40.5              81
      1999          8     103.11          465.58
      1999          9      53.34          134.34
      2000          1       46.7          186.48
      2000          3      93.41          350.69
      2000          4      46.54           46.54
      2000          5       46.7          139.78
      2000          7       70.8          257.28
      2000          8      46.54           93.08
      2001          1      92.26          277.96
      2001          2     118.38          582.45
      2001          3      47.24          115.74
      2001          4      256.7          839.15
      2001          5      93.44          464.07
      2001          6      22.44           22.44
      2001          7      69.96           185.7

      YEAR       WEEK       SALE RUNNING_SUM_YTD
---------- ---------- ---------- ---------------
      2001          8      46.06            68.5
      2001          9      92.67          370.63

29 rows selected.
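
The three queries above can be checked on a small scale. What follows is a minimal sketch rather than the article's own code: it assumes SQLite 3.25 or later (through Python's sqlite3 module) as a stand-in for Oracle, and a hypothetical single-partition table loaded with the 1998 rows from the output above. It shows that the running total is fixed by the ORDER BY inside OVER, while the statement's ORDER BY only changes the display order.

```python
import sqlite3

# 1998 rows (week, sale) for Australia / 'Xtend Memory', copied from the output above.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

# The window is always ordered by week; only the final ORDER BY varies.
query = """
    select week,
           round(sum(sale) over (order by week
                                 rows between unbounded preceding and current row), 2)
    from sales_fact
    order by {final_order}
"""
by_week = conn.execute(query.format(final_order="week")).fetchall()
by_sale = conn.execute(query.format(final_order="sale, week")).fetchall()

# Every week carries the same running total in both result sets.
print(dict(by_week) == dict(by_sale))
for week, running_sum in by_sale:
    print(week, running_sum)
```

Sorting the output by sale scatters the running totals, which is exactly the effect seen in the second query above.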





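Execution plans for analytic functions

Even with an analytic function, the query still needs only one full table scan, but it does need an extra sort.

A WINDOW SORT step is the typical signature of an analytic function in an execution plan.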

SH@ prod> explain plan for  select year , week , sale ,   sum(sale) over( partition by  region , year   order by sale  rows between unbounded preceding and current row ) running_sum_ytd  
 from sales_fact   where country in ('Australia') and product='Xtend Memory' and week < 10   order by  year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display()) ;

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 173857439

----------------------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |            |    18 |  1890 |   311   (1)| 00:00:04 |
|   1 |  SORT ORDER BY      |            |    18 |  1890 |   311   (1)| 00:00:04 |
|   2 |   WINDOW SORT       |            |    18 |  1890 |   311   (1)| 00:00:04 |
|*  3 |    TABLE ACCESS FULL| SALES_FACT |    18 |  1890 |   309   (1)| 00:00:04 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND
              "WEEK"<10)

Note
-----
   - dynamic sampling used for this statement (level=2)   <-- the table has no statistics yet

20 rows selected.

Without the analytic column, the plan is the same except that the WINDOW SORT step is gone.

SH@ prod> explain plan for
  2  select year , week , sale
  3  from sales_fact
  4  where country in ('Australia') and product='Xtend Memory' and week < 10
  5  order by  year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display()) ;

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1978576542

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |    18 |  1584 |   310   (1)| 00:00:04 |
|   1 |  SORT ORDER BY     |            |    18 |  1584 |   310   (1)| 00:00:04 |
|*  2 |   TABLE ACCESS FULL| SALES_FACT |    18 |  1584 |   309   (1)| 00:00:04 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND
              "WEEK"<10)

Note
-----
   - dynamic sampling used for this statement (level=2)

19 rows selected.









Making the window span the whole partition

SH@ prod> select year , week , sale , max(sale) over(partition by product , country , region , year
  2  order by week
  3  rows between unbounded preceding and unbounded following )
  4  max_sale
  5  from sales_fact
  6  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  7  order by product , country , year , week ;

      YEAR       WEEK       SALE   MAX_SALE
---------- ---------- ---------- ----------
      1998          1      58.15      58.78
      1998          2      29.39      58.78
      1998          3      29.49      58.78
      1998          4      29.49      58.78
      1998          5       29.8      58.78
      1998          6      58.78      58.78
      1998          9      58.78      58.78
      1999          1      53.52     103.11
      1999          3       94.6     103.11
      1999          4       40.5     103.11
      1999          5      80.01     103.11
      1999          6       40.5     103.11
      1999          8     103.11     103.11
      1999          9      53.34     103.11
      2000          1       46.7      93.41
      2000          3      93.41      93.41
      2000          4      46.54      93.41
      2000          5       46.7      93.41
      2000          7       70.8      93.41
      2000          8      46.54      93.41
      2001          1      92.26      256.7
      2001          2     118.38      256.7
      2001          3      47.24      256.7
      2001          4      256.7      256.7
      2001          5      93.44      256.7
      2001          6      22.44      256.7
      2001          7      69.96      256.7

      YEAR       WEEK       SALE   MAX_SALE
---------- ---------- ---------- ----------
      2001          8      46.06      256.7
      2001          9      92.67      256.7

29 rows selected.


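A window with two sliding boundaries


The window in the statement below covers the two preceding rows, the current row, and the two following rows, five rows in all. Because some weeks are missing from the data, these are rows, not necessarily calendar weeks. (Near a partition boundary the window shrinks automatically.)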

SH@ prod> select year , week , sale , max(sale) over(partition by product , country , region , year
  2  order by week
  3  rows between 2 preceding and 2 following )
  4  max_sale
  5  from sales_fact
  6  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  7  order by product , country , year , week ;

      YEAR       WEEK       SALE   MAX_SALE
---------- ---------- ---------- ----------
      1998          1      58.15      58.15
      1998          2      29.39      58.15
      1998          3      29.49      58.15
      1998          4      29.49      58.78
      1998          5       29.8      58.78
      1998          6      58.78      58.78
      1998          9      58.78      58.78
      1999          1      53.52       94.6
      1999          3       94.6       94.6
      1999          4       40.5       94.6
      1999          5      80.01     103.11
      1999          6       40.5     103.11
      1999          8     103.11     103.11
      1999          9      53.34     103.11
      2000          1       46.7      93.41
      2000          3      93.41      93.41
      2000          4      46.54      93.41
      2000          5       46.7      93.41
      2000          7       70.8       70.8
      2000          8      46.54       70.8  <-- 70.8 because the window shrinks at the partition boundary
      2001          1      92.26     118.38
      2001          2     118.38      256.7
      2001          3      47.24      256.7
      2001          4      256.7      256.7
      2001          5      93.44      256.7
      2001          6      22.44      256.7
      2001          7      69.96      93.44

      YEAR       WEEK       SALE   MAX_SALE
---------- ---------- ---------- ----------
      2001          8      46.06      92.67
      2001          9      92.67      92.67

29 rows selected.
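
The shrinking at the partition boundaries is easy to reproduce. This is a minimal sketch under stated assumptions: SQLite 3.25 or later (via Python's sqlite3) stands in for Oracle, and the hypothetical table holds only the 1998 rows from the output above, so the whole table is one partition.

```python
import sqlite3

# 1998 rows (week, sale) copied from the output above.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

# Up to five rows per window: two before, the current row, two after.
rows = conn.execute("""
    select week, sale,
           max(sale) over (order by week
                           rows between 2 preceding and 2 following) as max_sale
    from sales_fact
    order by week
""").fetchall()

max_sales = [max_sale for _, _, max_sale in rows]
for week, sale, max_sale in rows:
    print(week, sale, max_sale)
```

The first row can only look forward and the last row can only look back, so both see three rows instead of five. The MAX_SALE values match the 1998 rows of the Oracle output above.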





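What is the default window?

The output below makes it clear: with an ORDER BY but no windowing clause, the window runs from the start of the partition to the current row. In Oracle the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.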

SH@ prod> select year , week , sale , max(sale) over(partition by product , country , region , year
  2  order by week )
  3  max_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , week ;

      YEAR       WEEK       SALE   MAX_SALE
---------- ---------- ---------- ----------
      1998          1      58.15      58.15
      1998          2      29.39      58.15
      1998          3      29.49      58.15
      1998          4      29.49      58.15
      1998          5       29.8      58.15
      1998          6      58.78      58.78
      1998          9      58.78      58.78
      1999          1      53.52      53.52
      1999          3       94.6       94.6
      1999          4       40.5       94.6
      1999          5      80.01       94.6
      1999          6       40.5       94.6
      1999          8     103.11     103.11
      1999          9      53.34     103.11
      2000          1       46.7       46.7
      2000          3      93.41      93.41
      2000          4      46.54      93.41
      2000          5       46.7      93.41
      2000          7       70.8      93.41
      2000          8      46.54      93.41
      2001          1      92.26      92.26
      2001          2     118.38     118.38
      2001          3      47.24     118.38
      2001          4      256.7      256.7
      2001          5      93.44      256.7
      2001          6      22.44      256.7
      2001          7      69.96      256.7

      YEAR       WEEK       SALE   MAX_SALE
---------- ---------- ---------- ----------
      2001          8      46.06      256.7
      2001          9      92.67      256.7

29 rows selected.
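
One way to confirm the default is to compare the implicit window with an explicit one. The sketch below is an illustration under assumptions, not Oracle output: it uses SQLite 3.25 or later via Python's sqlite3, whose documented default frame is also RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and a hypothetical table holding only the 1998 rows shown above.

```python
import sqlite3

# 1998 rows (week, sale) copied from the output above.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

def max_by_week(window_spec):
    """Return max(sale) per row, in week order, for the given OVER specification."""
    sql = f"select max(sale) over ({window_spec}) from sales_fact order by week"
    return [value for (value,) in conn.execute(sql)]

# No windowing clause: the default window applies.
implicit = max_by_week("order by week")
# The same window written out explicitly.
explicit = max_by_week(
    "order by week range between unbounded preceding and current row")

print(implicit == explicit)
print(implicit)
```

Both lists are running maxima, which is the pattern visible in the MAX_SALE column above.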


LEAD and LAG (functions that do not support windowing)

Adding a windowing clause to LEAD or LAG raises an error like this:

rows between 2 preceding and 2 following )
*
ERROR at line 3:
ORA-00907: missing right parenthesis


LEAD returns the value from the next row, not the previous one. On the last row of a partition, LEAD returns NULL.


SH@ prod> select year , week , sale , lead(sale) over(partition by product , country , region , year
  2  order by week  )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15       29.39
      1998          2      29.39       29.49
      1998          3      29.49       29.49
      1998          4      29.49        29.8
      1998          5       29.8       58.78
      1998          6      58.78       58.78
      1998          9      58.78
      1999          1      53.52        94.6
      1999          3       94.6        40.5
      1999          4       40.5       80.01
      1999          5      80.01        40.5
      1999          6       40.5      103.11
      1999          8     103.11       53.34
      1999          9      53.34
      2000          1       46.7       93.41
      2000          3      93.41       46.54
      2000          4      46.54        46.7
      2000          5       46.7        70.8
      2000          7       70.8       46.54
      2000          8      46.54
      2001          1      92.26      118.38
      2001          2     118.38       47.24
      2001          3      47.24       256.7
      2001          4      256.7       93.44
      2001          5      93.44       22.44
      2001          6      22.44       69.96
      2001          7      69.96       46.06

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06       92.67
      2001          9      92.67

29 rows selected.


LAG returns the value from the previous row. On the first row of a partition it returns NULL.


SH@ prod> select year , week , sale , lag(sale) over(partition by product , country , region , year
  2  order by week  )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15
      1998          2      29.39       58.15
      1998          3      29.49       29.39
      1998          4      29.49       29.49
      1998          5       29.8       29.49
      1998          6      58.78        29.8
      1998          9      58.78       58.78
      1999          1      53.52
      1999          3       94.6       53.52
      1999          4       40.5        94.6
      1999          5      80.01        40.5
      1999          6       40.5       80.01
      1999          8     103.11        40.5
      1999          9      53.34      103.11
      2000          1       46.7
      2000          3      93.41        46.7
      2000          4      46.54       93.41
      2000          5       46.7       46.54
      2000          7       70.8        46.7
      2000          8      46.54        70.8
      2001          1      92.26
      2001          2     118.38       92.26
      2001          3      47.24      118.38
      2001          4      256.7       47.24
      2001          5      93.44       256.7
      2001          6      22.44       93.44
      2001          7      69.96       22.44

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06       69.96
      2001          9      92.67       46.06

29 rows selected.
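
The two functions can be put side by side to make the direction of each obvious. This is a minimal sketch, assuming SQLite 3.25 or later (via Python's sqlite3) as a stand-in for Oracle and a hypothetical table holding only the 1998 rows from the output above. The aliases prev_sale and next_sale are illustrative names, not the article's.

```python
import sqlite3

# 1998 rows (week, sale) copied from the output above.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

rows = conn.execute("""
    select week,
           lag(sale)  over (order by week) as prev_sale,   -- previous row
           lead(sale) over (order by week) as next_sale    -- next row
    from sales_fact
    order by week
""").fetchall()

prev_sales = [prev for _, prev, _ in rows]
next_sales = [nxt for _, _, nxt in rows]

# SQL NULL arrives in Python as None.
print(prev_sales)
print(next_sales)
```

LAG is empty only on the first row and LEAD only on the last, matching the blank cells in the two Oracle outputs above.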


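More on LEAD and LAG

The first argument of LEAD and LAG is the expression to return, the second is the row offset (non-negative, 1 if omitted), and the third is the default returned when the target row does not exist (it can be set to the current row's value).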

SH@ prod> select year , week , sale , lag(sale , 2 , 0 ) over(partition by product , country , region , year
  2  order by week  )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15           0
      1998          2      29.39           0
      1998          3      29.49       58.15
      1998          4      29.49       29.39
      1998          5       29.8       29.49
      1998          6      58.78       29.49
      1998          9      58.78        29.8
      1999          1      53.52           0
      1999          3       94.6           0
      1999          4       40.5       53.52
      1999          5      80.01        94.6
      1999          6       40.5        40.5
      1999          8     103.11       80.01
      1999          9      53.34        40.5
      2000          1       46.7           0
      2000          3      93.41           0
      2000          4      46.54        46.7
      2000          5       46.7       93.41
      2000          7       70.8       46.54
      2000          8      46.54        46.7
      2001          1      92.26           0
      2001          2     118.38           0
      2001          3      47.24       92.26
      2001          4      256.7      118.38
      2001          5      93.44       47.24
      2001          6      22.44       256.7
      2001          7      69.96       93.44

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06       22.44
      2001          9      92.67       69.96

29 rows selected.


Setting the default to the current row's value:

SH@ prod> select year , week , sale , lag(sale , 2 , sale ) over(partition by product , country , region , year
  2  order by week  )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15       58.15
      1998          2      29.39       29.39
      1998          3      29.49       58.15
      1998          4      29.49       29.39
      1998          5       29.8       29.49
      1998          6      58.78       29.49
      1998          9      58.78        29.8
      1999          1      53.52       53.52
      1999          3       94.6        94.6
      1999          4       40.5       53.52
      1999          5      80.01        94.6
      1999          6       40.5        40.5
      1999          8     103.11       80.01
      1999          9      53.34        40.5
      2000          1       46.7        46.7
      2000          3      93.41       93.41
      2000          4      46.54        46.7
      2000          5       46.7       93.41
      2000          7       70.8       46.54
      2000          8      46.54        46.7
      2001          1      92.26       92.26
      2001          2     118.38      118.38
      2001          3      47.24       92.26
      2001          4      256.7      118.38
      2001          5      93.44       47.24
      2001          6      22.44       256.7
      2001          7      69.96       93.44

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06       22.44
      2001          9      92.67       69.96

29 rows selected.


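LEAD, LAG, and gaps in the data

LAG(sale, 10) returns the value from 10 rows back, whereas the intent may be to read the value from 10 weeks earlier. If weeks are missing from the data, the two are not the same and the result is silently wrong.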
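
The gap problem shows up even with an offset of 2, because the 1998 data has no weeks 7 and 8. The sketch below is one possible workaround, not something the article prescribes: a RANGE window on the week value that contains only the row exactly two weeks back. It assumes SQLite 3.28 or later (via Python's sqlite3) as a stand-in for Oracle, a hypothetical table with the 1998 rows shown earlier, and at most one row per week.

```python
import sqlite3

# 1998 rows (week, sale) copied from the earlier output; weeks 7 and 8 are missing.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

rows = conn.execute("""
    select week,
           -- two ROWS back, whatever week that row belongs to
           lag(sale, 2) over (order by week) as two_rows_back,
           -- the row whose week value is exactly (current week - 2), if any
           max(sale) over (order by week
                           range between 2 preceding and 2 preceding) as two_weeks_back
    from sales_fact
    order by week
""").fetchall()

by_rows = {week: value for week, value, _ in rows}
by_weeks = {week: value for week, _, value in rows}

# Week 9: LAG silently returns week 5's sale; the RANGE window finds no week 7.
print(by_rows[9], by_weeks[9])
```

Where the data has no gap (weeks 3 to 6) the two columns agree. They diverge only at week 9, where the row-based offset reaches back four weeks.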


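FIRST_VALUE and LAST_VALUE


Combined with a suitable ORDER BY, these two functions can return the maximum and the minimum.
FIRST_VALUE returns the first value in the window. IGNORE NULLS skips nulls: if the first value is null, the first non-null value is returned instead.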

SH@ prod> select year , week , sale , first_value(sale ignore nulls) over(partition by product , country , region , year
  2  order by week
  3  rows between unbounded preceding and unbounded following )
  4  former_sale
  5  from sales_fact
  6  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  7  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15       58.15
      1998          2      29.39       58.15
      1998          3      29.49       58.15
      1998          4      29.49       58.15
      1998          5       29.8       58.15
      1998          6      58.78       58.15
      1998          9      58.78       58.15
      1999          1      53.52       53.52
      1999          3       94.6       53.52
      1999          4       40.5       53.52
      1999          5      80.01       53.52
      1999          6       40.5       53.52
      1999          8     103.11       53.52
      1999          9      53.34       53.52
      2000          1       46.7        46.7
      2000          3      93.41        46.7
      2000          4      46.54        46.7
      2000          5       46.7        46.7
      2000          7       70.8        46.7
      2000          8      46.54        46.7
      2001          1      92.26       92.26
      2001          2     118.38       92.26
      2001          3      47.24       92.26
      2001          4      256.7       92.26
      2001          5      93.44       92.26
      2001          6      22.44       92.26
      2001          7      69.96       92.26

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06       92.26
      2001          9      92.67       92.26

29 rows selected.


LAST_VALUE returns the last value in the window. RESPECT NULLS keeps nulls: if the last value is null, null is what gets returned.



SH@ prod> select year , week , sale , last_value(sale respect nulls) over(partition by product , country , region , year
  2  order by week
  3  rows between unbounded preceding and unbounded following )
  4  former_sale
  5  from sales_fact
  6  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  7  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15       58.78
      1998          2      29.39       58.78
      1998          3      29.49       58.78
      1998          4      29.49       58.78
      1998          5       29.8       58.78
      1998          6      58.78       58.78
      1998          9      58.78       58.78
      1999          1      53.52       53.34
      1999          3       94.6       53.34
      1999          4       40.5       53.34
      1999          5      80.01       53.34
      1999          6       40.5       53.34
      1999          8     103.11       53.34
      1999          9      53.34       53.34
      2000          1       46.7       46.54
      2000          3      93.41       46.54
      2000          4      46.54       46.54
      2000          5       46.7       46.54
      2000          7       70.8       46.54
      2000          8      46.54       46.54
      2001          1      92.26       92.67
      2001          2     118.38       92.67
      2001          3      47.24       92.67
      2001          4      256.7       92.67
      2001          5      93.44       92.67
      2001          6      22.44       92.67
      2001          7      69.96       92.67

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06       92.67
      2001          9      92.67       92.67

29 rows selected.
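
The remark that these functions can produce the minimum and maximum deserves a concrete case: order the window by sale instead of week, and let it span the whole partition. This is a minimal sketch under assumptions: SQLite 3.25 or later (via Python's sqlite3) stands in for Oracle, the hypothetical table holds only the 1998 rows shown above, and the IGNORE NULLS / RESPECT NULLS options are left out because SQLite does not accept them.

```python
import sqlite3

# 1998 rows (week, sale) copied from the output above.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

rows = conn.execute("""
    select week,
           first_value(sale) over (order by sale
               rows between unbounded preceding and unbounded following) as min_sale,
           last_value(sale) over (order by sale
               rows between unbounded preceding and unbounded following) as max_sale
    from sales_fact
    order by week
""").fetchall()

min_sales = {low for _, low, _ in rows}
max_sales = {high for _, _, high in rows}

# Every row sees the same first (smallest) and last (largest) sale.
print(min_sales, max_sales)
```

The full-partition window matters for LAST_VALUE in particular: with the default window, which ends at the current row, the last value would only reflect the current row and its peers.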


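NTH_VALUE: accessing any given row in the partition

FIRST_VALUE(sale) is equivalent to NTH_VALUE(sale, 1), or, written out in full, NTH_VALUE(sale, 1) FROM FIRST RESPECT NULLS.
Combined with an ORDER BY, it can return the n-th largest or n-th smallest value.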

SH@ prod> select year , week , sale , nth_value(sale , 1 ) from last ignore nulls over(partition by product , country , region , year
  2  order by week
  3  rows between unbounded preceding and unbounded following )
  4  former_sale
  5  from sales_fact
  6  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  7  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15       58.78
      1998          2      29.39       58.78
      1998          3      29.49       58.78
      1998          4      29.49       58.78
      1998          5       29.8       58.78
      1998          6      58.78       58.78
      1998          9      58.78       58.78
      1999          1      53.52       53.34
      1999          3       94.6       53.34
      1999          4       40.5       53.34
      1999          5      80.01       53.34
      1999          6       40.5       53.34
      1999          8     103.11       53.34
      1999          9      53.34       53.34
      2000          1       46.7       46.54
      2000          3      93.41       46.54
      2000          4      46.54       46.54
      2000          5       46.7       46.54
      2000          7       70.8       46.54
      2000          8      46.54       46.54
      2001          1      92.26       92.67
      2001          2     118.38       92.67
      2001          3      47.24       92.67
      2001          4      256.7       92.67
      2001          5      93.44       92.67
      2001          6      22.44       92.67
      2001          7      69.96       92.67

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06       92.67
      2001          9      92.67       92.67

29 rows selected.
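
To pick the n-th largest value, the window can be ordered by sale in descending order. The sketch below is illustrative only: it assumes SQLite 3.25 or later (via Python's sqlite3) as a stand-in for Oracle, with a hypothetical table holding the 1998 rows shown above. SQLite does not accept FROM LAST or IGNORE NULLS, so the sort direction is reversed instead.

```python
import sqlite3

# 1998 rows (week, sale) copied from the output above.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_fact (week integer, sale real)")
conn.executemany("insert into sales_fact values (?, ?)", rows_1998)

# Third row of the partition when sorted from largest to smallest sale.
rows = conn.execute("""
    select week,
           nth_value(sale, 3) over (order by sale desc
               rows between unbounded preceding and unbounded following) as third_largest
    from sales_fact
    order by week
""").fetchall()

third_largest = {value for _, value in rows}
print(third_largest)
```

NTH_VALUE counts rows, not distinct values. The two rows with 58.78 occupy positions 1 and 2, so position 3 holds 58.15.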


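RANK (no windowing clause; operates on the whole partition)


An ORDER BY is required: RANK assigns ranks according to the columns in the ORDER BY clause.
When rows tie, RANK leaves gaps in the ranking.
For example: 1 2 2 4 5 6 7 8 9. If two rows share second place, there is no third place.
Watch out for nulls: NULLS LAST can be used in the ORDER BY clause to put them at the end.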

SH@ prod> select year , week , sale , rank() over(partition by product , country , region , year
  2  order by sale )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15           5   <-- rank 3 is skipped in 1998
      1998          2      29.39           1
      1998          3      29.49           2
      1998          4      29.49           2
      1998          5       29.8           4
      1998          6      58.78           6
      1998          9      58.78           6
      1999          1      53.52           4
      1999          3       94.6           6
      1999          4       40.5           1
      1999          5      80.01           5
      1999          6       40.5           1
      1999          8     103.11           7
      1999          9      53.34           3
      2000          1       46.7           3
      2000          3      93.41           6
      2000          4      46.54           1
      2000          5       46.7           3
      2000          7       70.8           5
      2000          8      46.54           1
      2001          1      92.26           5
      2001          2     118.38           8
      2001          3      47.24           3
      2001          4      256.7           9
      2001          5      93.44           7
      2001          6      22.44           1
      2001          7      69.96           4

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      2001          8      46.06           2
      2001          9      92.67           6

29 rows selected.



DENSE_RANK (unlike RANK, the ranking has no gaps)


SH@ prod> select year , week , sale , dense_rank() over(partition by product , country , region , year
  2  order by sale )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , week ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          1      58.15           4  <-- rank 3 exists in 1998
      1998          2      29.39           1
      1998          3      29.49           2
      1998          4      29.49           2
      1998          5       29.8           3
      1998          6      58.78           5
      1998          9      58.78           5
      1999          1      53.52           3
      1999          3       94.6           5
      1999          4       40.5           1
      1999          5      80.01           4
      1999          6       40.5           1
      1999          8     103.11           6
      1999          9      53.34           2
      2000          1       46.7           2
      2000          3      93.41           4
      2000          4      46.54           1
      2000          5       46.7           2
      2000          7       70.8           3
      2000          8      46.54           1
      2001          1      92.26           5
      2001          2     118.38           8
      2001          3      47.24           3
      2001          4      256.7           9
      2001          5      93.44           7
      2001          6      22.44           1
      2001          7      69.96           4
      2001          8      46.06           2
      2001          9      92.67           6

29 rows selected.
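
As a minimal sketch of the difference (plain Python, not Oracle's implementation; the helper names `rank` and `dense_rank` are illustrative only), both rankings can be recomputed from the 1998 SALE values shown above:

```python
def rank(values):
    """RANK: 1 + number of strictly smaller values (ties share a rank, then a gap follows)."""
    return [1 + sum(1 for other in values if other < v) for v in values]


def dense_rank(values):
    """DENSE_RANK: position among the distinct values, so no rank is skipped."""
    distinct = sorted(set(values))
    return [distinct.index(v) + 1 for v in values]


# SALE for 1998, weeks 1-6 and 9, in week order (taken from the output above).
sales_1998 = [58.15, 29.39, 29.49, 29.49, 29.8, 58.78, 58.78]

print(rank(sales_1998))        # [5, 1, 2, 2, 4, 6, 6]
print(dense_rank(sales_1998))  # [4, 1, 2, 2, 3, 5, 5]
```

Ties share a rank in both cases; only RANK skips the position after a tie.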


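ROW_NUMBER (no windowing clause; non-deterministic)

Assigns an increasing number to each row in the partition. When rows have the same value in the sort column, their relative order is arbitrary.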

SH@ prod> select year , week , sale , row_number() over(partition by product , country , region , year
  2  order by sale )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , sale ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          2      29.39           1
      1998          4      29.49           2
      1998          3      29.49           3
      1998          5       29.8           4
      1998          1      58.15           5
      1998          6      58.78           6
      1998          9      58.78           7
      1999          4       40.5           1
      1999          6       40.5           2
      1999          9      53.34           3
      1999          1      53.52           4
      1999          5      80.01           5
      1999          3       94.6           6
      1999          8     103.11           7
      2000          4      46.54           1
      2000          8      46.54           2
      2000          5       46.7           3
      2000          1       46.7           4
      2000          7       70.8           5
      2000          3      93.41           6
      2001          6      22.44           1
      2001          8      46.06           2
      2001          3      47.24           3
      2001          7      69.96           4
      2001          1      92.26           5
      2001          9      92.67           6
      2001          5      93.44           7
      2001          2     118.38           8
      2001          4      256.7           9

29 rows selected.
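
One way to make the numbering repeatable is to extend the sort key with a tie-breaker. The sketch below is plain Python rather than Oracle code, and the `row_number` helper is hypothetical. In the query above, the corresponding change would presumably be `order by sale , week` inside `over(...)`.

```python
def row_number(rows, key):
    """Number the rows 1..n in `key` order and return {row: number}."""
    ordered = sorted(rows, key=key)
    return {row: i for i, row in enumerate(ordered, start=1)}


# (week, sale) pairs for 1998, taken from the output above.
rows_1998 = [(1, 58.15), (2, 29.39), (3, 29.49), (4, 29.49),
             (5, 29.8), (6, 58.78), (9, 58.78)]

# Sorting on (sale, week) breaks ties on week, so equal sales always
# receive their numbers in the same order.
numbered = row_number(rows_1998, key=lambda r: (r[1], r[0]))
for (week, sale), n in sorted(numbered.items(), key=lambda kv: kv[1]):
    print(week, sale, n)
```

With the tie-breaker, week 3 is always numbered before week 4, whereas the run above happened to place week 4 first.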


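Ratio_to_report (ratio of the current row's value to the partition total)

This function supports neither an ORDER BY clause nor a windowing clause.
The query below computes each week's share of its year's sales and its share of the product's total sales.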

SH@ prod> select year , week , sale ,
  2  trunc(100* ratio_to_report(sale) over(partition by year ) , 2) sales_yr ,
  3  trunc(100* ratio_to_report(sale) over() , 2 ) sales_prod
  4  from sales_fact
  5  where country in ('Australia') and product = 'Xtend Memory' and week < 10
  6  order by year , week ;

      YEAR       WEEK       SALE   SALES_YR SALES_PROD
---------- ---------- ---------- ---------- ----------
      1998          1      58.15      19.78       2.98
      1998          2      29.39         10        1.5
      1998          3      29.49      10.03       1.51
      1998          4      29.49      10.03       1.51
      1998          5       29.8      10.14       1.52
      1998          6      58.78         20       3.01
      1998          9      58.78         20       3.01
      1999          1      53.52      11.49       2.74
      1999          3       94.6      20.31       4.85
      1999          4       40.5       8.69       2.07
      1999          5      80.01      17.18        4.1
      1999          6       40.5       8.69       2.07
      1999          8     103.11      22.14       5.28
      1999          9      53.34      11.45       2.73
      2000          1       46.7      13.31       2.39
      2000          3      93.41      26.63       4.79
      2000          4      46.54      13.27       2.38
      2000          5       46.7      13.31       2.39
      2000          7       70.8      20.18       3.63
      2000          8      46.54      13.27       2.38
      2001          1      92.26      10.99       4.73
      2001          2     118.38       14.1       6.07
      2001          3      47.24       5.62       2.42
      2001          4      256.7      30.59      13.16
      2001          5      93.44      11.13       4.79
      2001          6      22.44       2.67       1.15
      2001          7      69.96       8.33       3.58
      2001          8      46.06       5.48       2.36
      2001          9      92.67      11.04       4.75

29 rows selected.
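
The SALES_YR column can be reproduced by hand. The following is a rough Python sketch of the arithmetic, not Oracle's implementation; `ratio_to_report` and `trunc` here are illustrative helpers that mimic the SQL functions of the same name.

```python
import math


def ratio_to_report(values):
    """Each value divided by the sum of all values in the partition."""
    total = sum(values)
    return [v / total for v in values]


def trunc(x, digits):
    """Truncate (not round) x to the given number of decimal places."""
    factor = 10 ** digits
    return math.trunc(x * factor) / factor


# SALE for 1998, weeks 1-6 and 9 (taken from the output above).
sales_1998 = [58.15, 29.39, 29.49, 29.49, 29.8, 58.78, 58.78]

sales_yr = [trunc(100 * r, 2) for r in ratio_to_report(sales_1998)]
print(sales_yr)  # [19.78, 10.0, 10.03, 10.03, 10.14, 20.0, 20.0]
```

Because the values are truncated, the percentages within a year can add up to slightly less than 100.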


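Percent_rank (relative rank as a percentage)


Returns the relative position of the current row's rank as a fraction.
For example, telling someone you placed 10th may not sound impressive, but 10th out of 100,000 puts you in the top 1/10,000, which is outstanding.
The function is derived from RANK as follows:
PERCENT_RANK = (RANK - 1) / (N - 1), where N is the number of rows in the partition.
RANK - 1 is the number of rows ranked ahead of the current row.
N - 1 is the number of rows other than the current row.
In short, it is the fraction of the other rows that rank ahead of the current row.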


SH@ prod> select year , week , sale , rank() over(partition by product , country , region , year
  2  order by sale )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , sale ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          2      29.39           1
      1998          4      29.49           2
      1998          3      29.49           2
      1998          5       29.8           4
      1998          1      58.15           5
      1998          6      58.78           6
      1998          9      58.78           6
      1999          4       40.5           1
      1999          6       40.5           1
      1999          9      53.34           3
      1999          1      53.52           4
      1999          5      80.01           5
      1999          3       94.6           6
      1999          8     103.11           7
      2000          4      46.54           1
      2000          8      46.54           1
      2000          5       46.7           3
      2000          1       46.7           3
      2000          7       70.8           5
      2000          3      93.41           6
      2001          6      22.44           1
      2001          8      46.06           2
      2001          3      47.24           3
      2001          7      69.96           4
      2001          1      92.26           5
      2001          9      92.67           6
      2001          5      93.44           7
      2001          2     118.38           8
      2001          4      256.7           9

29 rows selected.

SH@ prod> select year , week , sale , 100*percent_rank() over(partition by product , country , region , year
  2  order by sale )
  3  former_sale
  4  from sales_fact
  5  where country in ( 'Australia') and product = 'Xtend Memory' and week < 10
  6  order by product , country , year , sale ;

      YEAR       WEEK       SALE FORMER_SALE
---------- ---------- ---------- -----------
      1998          2      29.39           0
      1998          4      29.49  16.6666667
      1998          3      29.49  16.6666667
      1998          5       29.8          50
      1998          1      58.15  66.6666667
      1998          6      58.78  83.3333333
      1998          9      58.78  83.3333333
      1999          4       40.5           0
      1999          6       40.5           0
      1999          9      53.34  33.3333333
      1999          1      53.52          50
      1999          5      80.01  66.6666667
      1999          3       94.6  83.3333333
      1999          8     103.11         100
      2000          4      46.54           0
      2000          8      46.54           0
      2000          5       46.7          40
      2000          1       46.7          40
      2000          7       70.8          80
      2000          3      93.41         100
      2001          6      22.44           0
      2001          8      46.06        12.5
      2001          3      47.24          25
      2001          7      69.96        37.5
      2001          1      92.26          50
      2001          9      92.67        62.5
      2001          5      93.44          75
      2001          2     118.38        87.5
      2001          4      256.7         100

29 rows selected.
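
The formula PERCENT_RANK = (RANK - 1) / (N - 1) can be checked against the two result sets above. This is a small Python sketch of the arithmetic only; the `percent_rank` helper is illustrative and returns 0 for a single-row partition.

```python
def percent_rank(values):
    """(RANK - 1) / (N - 1) for each value, ranking in ascending order."""
    n = len(values)
    if n == 1:
        return [0.0]
    ranks = [1 + sum(1 for other in values if other < v) for v in values]
    return [(r - 1) / (n - 1) for r in ranks]


# SALE for 1998 in ascending order; RANK gives 1, 2, 2, 4, 5, 6, 6 and N = 7.
sales_1998 = [29.39, 29.49, 29.49, 29.8, 58.15, 58.78, 58.78]

print([round(100 * p, 7) for p in percent_rank(sales_1998)])
# [0.0, 16.6666667, 16.6666667, 50.0, 66.6666667, 83.3333333, 83.3333333]
```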


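Percentile_cont (roughly: the value needed to sit at a given percentile)


Put another way: imagine inserting a value into the partition so that its percent_rank is exactly N%; Percentile_cont returns that value.
If some row's percent_rank equals N% exactly, the result is that row's value. Otherwise the result is linearly interpolated between the two neighboring rows, so it need not occur in the data.
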
SH@ prod> select year , week , sale ,
  2  percentile_cont(0.5) within group(order by sale desc )over(partition by year) pc ,
  3  percent_rank() over( partition by year order by sale desc ) pr
  4  from sales_fact
  5  where country in ('Australia') and product = 'Xtend Memory' and week < 11 ;

      YEAR       WEEK       SALE         PC         PR
---------- ---------- ---------- ---------- ----------
      1998         10     117.76     43.975          0
      1998          9      58.78     43.975 .142857143
      1998          6      58.78     43.975 .142857143
      1998          1      58.15     43.975 .428571429
      1998          5       29.8     43.975 .571428571
      1998          3      29.49     43.975 .714285714
      1998          4      29.49     43.975 .714285714
      1998          2      29.39     43.975          1
      1999          8     103.11      62.76          0
      1999          3       94.6      62.76 .142857143
      1999          5      80.01      62.76 .285714286
      1999         10         72      62.76 .428571429
      1999          1      53.52      62.76 .571428571
      1999          9      53.34      62.76 .714285714
      1999          6       40.5      62.76 .857142857
      1999          4       40.5      62.76 .857142857
      2000          3      93.41       46.7          0
      2000          7       70.8       46.7         .2
      2000          5       46.7       46.7         .4
      2000          1       46.7       46.7         .4
      2000          4      46.54       46.7         .8
      2000          8      46.54       46.7         .8
      2001          4      256.7      81.11          0
      2001          2     118.38      81.11 .111111111
      2001          5      93.44      81.11 .222222222
      2001          9      92.67      81.11 .333333333
      2001          1      92.26      81.11 .444444444
      2001          7      69.96      81.11 .555555556
      2001         10      69.05      81.11 .666666667
      2001          3      47.24      81.11 .777777778
      2001          8      46.06      81.11 .888888889
      2001          6      22.44      81.11          1

32 rows selected.
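
The PC column can be reproduced with the interpolation rule described in Oracle's documentation for PERCENTILE_CONT: compute the fractional row number RN = 1 + P * (N - 1), then interpolate between the rows at floor(RN) and ceil(RN). The Python below is only a sketch of that rule under those assumptions, with an illustrative function name, not the database's implementation.

```python
import math


def percentile_cont(values, p, descending=False):
    """Linear interpolation at fractional row number RN = 1 + p * (N - 1)."""
    ordered = sorted(values, reverse=descending)
    rn = 1 + p * (len(ordered) - 1)
    lo, hi = math.floor(rn), math.ceil(rn)
    if lo == hi:
        # RN falls exactly on a row, so that row's value is returned.
        return ordered[lo - 1]
    return (hi - rn) * ordered[lo - 1] + (rn - lo) * ordered[hi - 1]


# SALE for 1998 with week < 11 (8 rows, taken from the output above).
sales_1998 = [58.15, 29.39, 29.49, 29.49, 29.8, 58.78, 58.78, 117.76]

# RN = 4.5, so the result is halfway between 58.15 and 29.8.
print(round(percentile_cont(sales_1998, 0.5, descending=True), 3))  # 43.975
```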



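Percentile_disc (largely the same as Percentile_cont)


The difference is that the value returned always occurs in one of the partition's rows.
If no row matches exactly, Percentile_disc does not interpolate. It returns the first value in the sort order whose cumulative distribution (CUME_DIST) reaches the requested percentile, which here is the earlier of the two neighboring rows.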

SH@ prod> select year , week , sale ,
  2  percentile_disc(0.5) within group(order by sale desc )over(partition by year) pc ,
  3  percent_rank() over( partition by year order by sale desc ) pr
  4  from sales_fact
  5  where country in ('Australia') and product = 'Xtend Memory' and week < 11 ;

      YEAR       WEEK       SALE         PC         PR
---------- ---------- ---------- ---------- ----------
      1998         10     117.76      58.15          0
      1998          9      58.78      58.15 .142857143
      1998          6      58.78      58.15 .142857143
      1998          1      58.15      58.15 .428571429
      1998          5       29.8      58.15 .571428571
      1998          3      29.49      58.15 .714285714
      1998          4      29.49      58.15 .714285714
      1998          2      29.39      58.15          1
      1999          8     103.11         72          0
      1999          3       94.6         72 .142857143
      1999          5      80.01         72 .285714286
      1999         10         72         72 .428571429
      1999          1      53.52         72 .571428571
      1999          9      53.34         72 .714285714
      1999          6       40.5         72 .857142857
      1999          4       40.5         72 .857142857
      2000          3      93.41       46.7          0
      2000          7       70.8       46.7         .2
      2000          5       46.7       46.7         .4
      2000          1       46.7       46.7         .4
      2000          4      46.54       46.7         .8
      2000          8      46.54       46.7         .8
      2001          4      256.7      92.26          0
      2001          2     118.38      92.26 .111111111
      2001          5      93.44      92.26 .222222222
      2001          9      92.67      92.26 .333333333
      2001          1      92.26      92.26 .444444444
      2001          7      69.96      92.26 .555555556
      2001         10      69.05      92.26 .666666667
      2001          3      47.24      92.26 .777777778
      2001          8      46.06      92.26 .888888889
      2001          6      22.44      92.26          1

32 rows selected.
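
A minimal Python sketch of this selection rule, assuming the documented definition (the first value in sort order whose cumulative distribution is at least P); the function name is illustrative and this is not Oracle's implementation:

```python
def percentile_disc(values, p, descending=False):
    """Return the first value in sort order whose position i satisfies i / N >= p."""
    ordered = sorted(values, reverse=descending)
    n = len(ordered)
    for i, v in enumerate(ordered, start=1):
        if i / n >= p:
            return v
    return ordered[-1]  # only reached if p > 1, which is outside the valid range


# SALE for 1998 and 1999 with week < 11 (taken from the output above).
sales_1998 = [58.15, 29.39, 29.49, 29.49, 29.8, 58.78, 58.78, 117.76]
sales_1999 = [53.52, 94.6, 40.5, 80.01, 40.5, 103.11, 53.34, 72]

print(percentile_disc(sales_1998, 0.5, descending=True))  # 58.15
print(percentile_disc(sales_1999, 0.5, descending=True))  # 72
```

With 8 rows and P = 0.5, the fourth row in descending order is the first to reach 4/8, so its value is returned unchanged.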


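NTILE (similar to building a histogram; no windowing clause)


Distributes the sorted rows evenly across the specified number of buckets and returns each row's bucket number. If the rows cannot be divided evenly, bucket sizes differ by at most one row.
In later processing, outliers can be removed by discarding the first or the last bucket.
Note: rows are assigned to buckets by position, not by value.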

SH@ prod> select year , week , sale ,
  2  ntile(10) over(order by sale ) group#
  3  from sales_fact
  4  where country in ('Australia') and product = 'Xtend Memory' and year = 1998 order by year , sale;

      YEAR       WEEK       SALE     GROUP#
---------- ---------- ---------- ----------
      1998         50      28.76          1
      1998          2      29.39          1
      1998          4      29.49          1
      1998          3      29.49          1
      1998          5       29.8          2
      1998         43      57.52          2
      1998         35      57.52          2
      1998         40      57.52          2
      1998         46      57.52          3
      1998         27      57.52          3
      1998         45      57.52          3
      1998         44      57.52          3
      1998         47      57.72          4
      1998         29      57.72          4
      1998         28      57.72          4
      1998          1      58.15          4
      1998         41      58.32          5
      1998         51      58.32          5
      1998         14      58.78          5
      1998          9      58.78          5
      1998         15      58.78          6
      1998         17      58.78          6
      1998          6      58.78          6
      1998         19      58.98          6
      1998         21       59.6          7
      1998         12       59.6          7
      1998         52      86.38          7
      1998         34     115.44          8
      1998         39     115.84          8
      1998         42     115.84          8
      1998         38     115.84          9
      1998         23     117.56          9
      1998         18     117.56          9
      1998         26     117.56         10
      1998         10     117.76         10
      1998         48     172.56         10

36 rows selected.
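
The bucket sizes in the output above (six buckets of 4 rows, then four buckets of 3) follow from dividing 36 rows by 10 and handing the remainder to the leading buckets. A small Python sketch of that assignment, with an illustrative `ntile` helper that works on row positions only:

```python
def ntile(n_rows, buckets):
    """Bucket number for each of n_rows sorted rows; earlier buckets get the extra rows."""
    base, extra = divmod(n_rows, buckets)
    result = []
    for b in range(1, buckets + 1):
        size = base + (1 if b <= extra else 0)
        result.extend([b] * size)
    return result


groups = ntile(36, 10)
print([groups.count(b) for b in range(1, 11)])  # [4, 4, 4, 4, 4, 4, 3, 3, 3, 3]
```

Since only positions matter, equal SALE values can land in different buckets, as the 57.52 rows do above.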


Stddev: computes the standard deviation (the square root of the variance; supports windowing)

SH@ prod> select year , week , sale ,
  2  stddev(sale) over(
  3  partition by product , country , region , year
  4  order by sale desc
  5  rows between 2 preceding and 2 following ) stddv
  6  from sales_fact
  7  where country in ('Australia') and product = 'Xtend Memory' and week < 10
  8  order by year , week ;

      YEAR       WEEK       SALE      STDDV
---------- ---------- ---------- ----------
      1998          1      58.15 15.8453416
      1998          2      29.39 .057735027
      1998          3      29.49 .178021534
      1998          4      29.49 12.7945918
      1998          5       29.8  15.815738
      1998          6      58.78  .36373067
      1998          9      58.78 14.3880654
      1999          1      53.52  22.178931
      1999          3       94.6 21.7319902
      1999          4       40.5 7.46550065
      1999          5      80.01 22.9761992
      1999          6       40.5 7.41317746
      1999          8     103.11 11.6825953
      1999          9      53.34 16.1305511
      2000          1       46.7 21.0022332
      2000          3      93.41 23.3589605
      2000          4      46.54 .092376043
      2000          5       46.7 10.8139207
      2000          7       70.8 22.4285538
      2000          8      46.54 .092376043
      2001          1      92.26 20.3811452
      2001          2     118.38 78.5152276
      2001          3      47.24 26.5077898
      2001          4      256.7  87.947194
      2001          5      93.44  71.309193
      2001          6      22.44 13.9900965
      2001          7      69.96 22.9124643
      2001          8      46.06  19.407678
      2001          9      92.67 17.1409691

29 rows selected.
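
To see what the window `rows between 2 preceding and 2 following` does here, the sketch below recomputes STDDV for 1998 in plain Python. It assumes the sample standard deviation (divisor n - 1), which is what the figures above are consistent with, and returns 0 for a one-row window. The helper name is illustrative.

```python
from statistics import stdev


def windowed_stddev(ordered, before=2, after=2):
    """Sample standard deviation over a sliding window of physical rows."""
    out = []
    for i in range(len(ordered)):
        window = ordered[max(0, i - before): i + after + 1]
        # A window with a single row has no spread; 0 is used in that case.
        out.append(stdev(window) if len(window) > 1 else 0.0)
    return out


# SALE for 1998 ordered by sale desc, as in the over() clause above.
sales_1998_desc = [58.78, 58.78, 58.15, 29.8, 29.49, 29.49, 29.39]

result = windowed_stddev(sales_1998_desc)
# Last row (29.39): the window is the 3 smallest values, clipped at the partition edge.
print(round(result[-1], 9))
# Third row (58.15): a full 5-row window.
print(round(result[2], 7))
```

The window is clipped at the partition boundaries, which is why the first and last rows are computed over only three values.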


Listagg (concatenates a column's values within the partition in order; no windowing clause)


SH@ prod> col stddv for a60
SH@ prod> select year , week , sale ,
  2  listagg(sale , ' , ')within group(order by sale desc) over(
  3  partition by product , country , region , year  ) stddv
  4  from sales_fact
  5  where country in ('Australia') and product = 'Xtend Memory' and week < 5
  6  order by year , week ;

      YEAR       WEEK       SALE STDDV
---------- ---------- ---------- ------------------------------------------------------------
      1998          1      58.15 58.15 , 29.49 , 29.49 , 29.39
      1998          2      29.39 58.15 , 29.49 , 29.49 , 29.39
      1998          3      29.49 58.15 , 29.49 , 29.49 , 29.39
      1998          4      29.49 58.15 , 29.49 , 29.49 , 29.39
      1999          1      53.52 94.6 , 53.52 , 40.5
      1999          3       94.6 94.6 , 53.52 , 40.5
      1999          4       40.5 94.6 , 53.52 , 40.5
      2000          1       46.7 93.41 , 46.7 , 46.54
      2000          3      93.41 93.41 , 46.7 , 46.54
      2000          4      46.54 93.41 , 46.7 , 46.54
      2001          1      92.26 256.7 , 118.38 , 92.26 , 47.24
      2001          2     118.38 256.7 , 118.38 , 92.26 , 47.24
      2001          3      47.24 256.7 , 118.38 , 92.26 , 47.24
      2001          4      256.7 256.7 , 118.38 , 92.26 , 47.24

14 rows selected.
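
Since there is no window, every row of a partition receives the same string. A rough Python equivalent of the 1998 partition, offered only as an illustration (the `listagg` helper and its number formatting are assumptions, not Oracle behavior):

```python
def listagg(values, sep):
    """Join the partition's values in descending order with the given separator."""
    ordered = sorted(values, reverse=True)
    return sep.join(format(v, "g") for v in ordered)


# SALE for 1998, weeks 1-4 (taken from the output above).
sales_1998 = [58.15, 29.39, 29.49, 29.49]

print(listagg(sales_1998, " , "))  # 58.15 , 29.49 , 29.49 , 29.39
```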






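Effect of analytic functions on predicate pushing

A view that uses an analytic function limits which predicates can be pushed into it. The analytic result is computed by referencing other rows, so trimming the data source first could change the result. In the plan below, the predicates on the partitioning columns are pushed down to the table scan, while WEEK < 14 is applied only at the view level, after the window has been computed.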

SH@ prod> create or replace view max_5_weeks_vw as
  2  select country , product , region , year , week , sale ,
  3  max(sale) over(
  4  partition by product , country , region , year order by year , week
  5  rows between 2 preceding and 2 following ) max_weeks_5
  6  from sales_fact ;

View created.

SH@ prod> select year , week , sale , max_weeks_5 from max_5_weeks_vw
  2  where country in ('Australia' ) and product = 'Xtend Memory'
  3  and region = 'Australia' and year = 2000 and week < 14
  4  order by year , week ;

      YEAR       WEEK       SALE MAX_WEEKS_5
---------- ---------- ---------- -----------
      2000          1       46.7       93.41
      2000          3      93.41       93.41
      2000          4      46.54       93.41
      2000          5       46.7       93.41
      2000          7       70.8       93.74
      2000          8      46.54       93.74
      2000         11      93.74       117.5
      2000         12      46.54      117.67
      2000         13      117.5      117.67

9 rows selected.

SH@ prod> explain plan for
  2  select year , week , sale , max_weeks_5 from max_5_weeks_vw
  3  where country in ('Australia' ) and product = 'Xtend Memory'
  4  and region = 'Australia' and year = 2000 and week < 14
  5  order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4167461139

--------------------------------------------------------------------------------------
| Id  | Operation           | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                |    90 |  5220 |   310   (1)| 00:00:04 |
|*  1 |  VIEW               | MAX_5_WEEKS_VW |    90 |  5220 |   310   (1)| 00:00:04 |
|   2 |   WINDOW SORT       |                |    90 |  9450 |   310   (1)| 00:00:04 |
|*  3 |    TABLE ACCESS FULL| SALES_FACT     |    90 |  9450 |   309   (1)| 00:00:04 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("WEEK"<14)
   3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND
              "REGION"='Australia' AND "YEAR"=2000)

Note
-----
   - dynamic sampling used for this statement (level=2)

21 rows selected.

Compare this with a view that has no analytic function: all the predicates are pushed straight into the view.

SH@ prod> create or replace view max_5_weeks_vw1 as
  2  select country , product , region , year , week , sale
  3  from sales_fact ;

View created.

SH@ prod> explain plan for
  2  select year , week , sale from max_5_weeks_vw1
  3  where country in ('Australia' ) and product = 'Xtend Memory'
  4  and region = 'Australia' and year = 2000 and week < 14
  5  order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1978576542

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |   105 |   310   (1)| 00:00:04 |
|   1 |  SORT ORDER BY     |            |     1 |   105 |   310   (1)| 00:00:04 |
|*  2 |   TABLE ACCESS FULL| SALES_FACT |     1 |   105 |   309   (1)| 00:00:04 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND
              "REGION"='Australia' AND "YEAR"=2000 AND "WEEK"<14)

Note
-----
   - dynamic sampling used for this statement (level=2)

19 rows selected.
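
Why the WEEK predicate cannot be pushed below the window can be shown with the year-2000 rows returned by the view earlier. The Python sketch below uses an illustrative `max_5_rows` helper and a cutoff of week < 10 chosen for the example; it compares filtering after the windowed MAX with filtering before it.

```python
def max_5_rows(sales):
    """MAX over 'rows between 2 preceding and 2 following' for each position."""
    return [max(sales[max(0, i - 2): i + 3]) for i in range(len(sales))]


# (week, sale) rows for year 2000, taken from the view output above.
rows = [(1, 46.7), (3, 93.41), (4, 46.54), (5, 46.7), (7, 70.8),
        (8, 46.54), (11, 93.74), (12, 46.54), (13, 117.5)]

# Filter applied after the window function (what the view has to do).
all_max = max_5_rows([s for _, s in rows])
after = [(w, m) for (w, _), m in zip(rows, all_max) if w < 10]

# Filter applied before the window function (an invalid push-down).
kept = [(w, s) for w, s in rows if w < 10]
before = list(zip([w for w, _ in kept], max_5_rows([s for _, s in kept])))

print(after[-2:])   # [(7, 93.74), (8, 93.74)]
print(before[-2:])  # [(7, 70.8), (8, 70.8)]
```

Weeks 7 and 8 lose sight of week 11 when the filter runs first, so their maximum drops from 93.74 to 70.8.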


Using analytic functions in dynamic SQL

SH@ prod> create or replace procedure analytic_dynamic_prc ( part_col_string varchar2 , v_country varchar2 , v_product varchar2 )
  2  is
  3  type numtab is table of number(18 , 2) index by binary_integer ;
  4  l_year numtab ;
  5  l_week numtab ;
  6  l_sale numtab ;
  7  l_rank numtab ;
  8  l_sql_string varchar2(512) ;
  9  begin
 10  l_sql_string := 'select * from ( select year , week , sale , rank() over( partition by ' || part_col_string
 11  || ' order by sale desc ) sales_rank from sales_fact where country in ('
 12  || chr(39) || v_country || chr(39)
 13  || ' ) and product = ' || chr(39) || v_product || chr(39)
 14  || 'order by product , country , year , week ) where sales_rank <= 10  order by 1,4' ;
 15  execute immediate l_sql_string bulk collect into l_year , l_week , l_sale , l_rank ;
 16  for i in 1..l_year.count loop
 17  dbms_output.put_line( l_year(i) || ' | ' || l_week(i) || ' | ' || l_sale(i) || ' | ' || l_rank(i) ) ;
 18  end loop ;
 19  end ;
 20  /

Procedure created.

SH@ prod> exec analytic_dynamic_prc('product , country , region' , 'Australia' , 'Xtend Memory' ) ;
1998 | 48 | 172.56 | 9
2000 | 46 | 246.74 | 3
2000 | 21 | 187.48 | 5
2000 | 43 | 179.12 | 7
2000 | 34 | 178.52 | 8
2001 | 16 | 278.44 | 1
2001 | 4 | 256.7 | 2
2001 | 21 | 233.7 | 4
2001 | 48 | 182.96 | 6
2001 | 30 | 162.91 | 10
2001 | 14 | 162.91 | 10

PL/SQL procedure successfully completed.


"Nesting" analytic functions

Analytic functions cannot be nested directly, but the same effect can be achieved with a subquery.

select year , week , top_sale_year ,
lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
from (
    select distinct
        first_value(year) over (   -- MAX cannot be substituted here: the column returned differs from the column being sorted on
        partition by product , country , region , year
        order by sale desc
        rows between unbounded preceding and unbounded following ) year ,
        first_value(week) over (
        partition by product , country , region , year
        order by sale desc
        rows between unbounded preceding and unbounded following ) week ,
        first_value(sale) over (
        partition by product , country , region , year
        order by sale desc
        rows between unbounded preceding and unbounded following ) top_sale_year
    from sales_fact
    where country in ('Australia') and product = 'Xtend Memory' )
order by year , week ;


Execution result:

SH@ prod> select year , week , top_sale_year ,
  2  lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
  3  from (
  4  select distinct
  5  first_value(year) over (
  6  partition by product , country , region , year
  7  order by sale desc
  8  rows between unbounded preceding and unbounded following ) year ,
  9  first_value(week) over (
 10  partition by product , country , region , year
 11  order by sale desc
 12  rows between unbounded preceding and unbounded following ) week ,
 13  first_value(sale) over (
 14  partition by product , country , region , year
 15  order by sale desc
 16  rows between unbounded preceding and unbounded following ) top_sale_year
 17  from sales_fact
 18  where country in ('Australia') and product = 'Xtend Memory' )
 19  order by year , week ;

      YEAR       WEEK TOP_SALE_YEAR PREV_TOP_SALE_YER
---------- ---------- ------------- -----------------
      1998         48        172.56            148.12
      1999         17        148.12            246.74
      2000         46        246.74            278.44
      2001         16        278.44
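
The two stages can be pictured as follows. This is a Python sketch of the logic only: it uses a handful of (year, week, sale) rows that appear elsewhere in this article rather than the full table, and the variable names are illustrative.

```python
# A few (year, week, sale) rows quoted from outputs in this article.
rows = [(1998, 48, 172.56), (1998, 10, 117.76),
        (1999, 17, 148.12), (1999, 8, 103.11),
        (2000, 46, 246.74), (2000, 21, 187.48),
        (2001, 16, 278.44), (2001, 4, 256.7)]

# Stage 1 (inner query): for each year, keep the row with the highest sale.
# This is what first_value(...) over (... order by sale desc) plus DISTINCT yields.
top = {}
for year, week, sale in rows:
    if year not in top or sale > top[year][2]:
        top[year] = (year, week, sale)

# Stage 2 (outer query): LAG over the stage-1 rows ordered by year descending.
lagged = {}
previous_sale = None
for year, week, sale in sorted(top.values(), key=lambda r: r[0], reverse=True):
    lagged[year] = previous_sale
    previous_sale = sale

for year in sorted(lagged):
    print(year, top[year][1], top[year][2], lagged[year])
```

Because the LAG is ordered by year descending, the "previous" row for 1998 is 1999, and 2001 has no predecessor.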



Parallelism with analytic functions

SH@ prod> explain plan for
  2  select year , week , top_sale_year ,
  3  lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
  4  from (
  5  select distinct
  6  first_value(year) over (
  7  partition by product , country , region , year
  8  order by sale desc
  9  rows between unbounded preceding and unbounded following ) year ,
 10  first_value(week) over (
 11  partition by product , country , region , year
 12  order by sale desc
 13  rows between unbounded preceding and unbounded following ) week ,
 14  first_value(sale) over (
 15  partition by product , country , region , year
 16  order by sale desc
 17  rows between unbounded preceding and unbounded following ) top_sale_year
 18  from sales_fact
 19  where country in ('Australia') and product = 'Xtend Memory' )
 20  order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2124823565

-------------------------------------------------------------------------------------
| Id  | Operation              | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |            |   197 |  7683 |   313   (2)| 00:00:04 |
|   1 |  SORT ORDER BY         |            |   197 |  7683 |   313   (2)| 00:00:04 |
|   2 |   WINDOW SORT          |            |   197 |  7683 |   313   (2)| 00:00:04 |
|   3 |    VIEW                |            |   197 |  7683 |   311   (1)| 00:00:04 |
|   4 |     HASH UNIQUE        |            |   197 | 20685 |   311   (1)| 00:00:04 |
|   5 |      WINDOW SORT       |            |   197 | 20685 |   311   (1)| 00:00:04 |
|*  6 |       TABLE ACCESS FULL| SALES_FACT |   197 | 20685 |   309   (1)| 00:00:04 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')

Note
-----
   - dynamic sampling used for this statement (level=2)

22 rows selected.
(Note that the DISTINCT operation is carried out with HASH UNIQUE rather than a sort.)



Add a parallel hint to the statement above.

SH@ prod> explain plan for
  2  select /*+ parallel(3)*/ year , week , top_sale_year ,
  3  lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer
  4  from (
  5  select distinct
  6  first_value(year) over (
  7  partition by product , country , region , year
  8  order by sale desc
  9  rows between unbounded preceding and unbounded following ) year ,
 10  first_value(week) over (
 11  partition by product , country , region , year
 12  order by sale desc
 13  rows between unbounded preceding and unbounded following ) week ,
 14  first_value(sale) over (
 15  partition by product , country , region , year
 16  order by sale desc
 17  rows between unbounded preceding and unbounded following ) top_sale_year
 18  from sales_fact
 19  where country in ('Australia') and product = 'Xtend Memory' )
 20  order by year , week ;

Explained.

SH@ prod> set linesize 180
SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2880616722

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |            |   197 |  7683 |   119   (5)| 00:00:02 |        |      |            |
|   1 |  SORT ORDER BY                   |            |   197 |  7683 |   119   (5)| 00:00:02 |        |      |            |
|   2 |   WINDOW BUFFER                  |            |   197 |  7683 |   119   (5)| 00:00:02 |        |      |            |
|   3 |    PX COORDINATOR                |            |       |       |            |          |        |      |            |
|   4 |     PX SEND QC (ORDER)           | :TQ10003   |   197 |  7683 |   119   (5)| 00:00:02 |  Q1,03 | P->S | QC (ORDER) |
|   5 |      SORT ORDER BY               |            |   197 |  7683 |   119   (5)| 00:00:02 |  Q1,03 | PCWP |            |
|   6 |       PX RECEIVE                 |            |   197 |  7683 |   117   (3)| 00:00:02 |  Q1,03 | PCWP |            |
|   7 |        PX SEND RANGE             | :TQ10002   |   197 |  7683 |   117   (3)| 00:00:02 |  Q1,02 | P->P | RANGE      |
|   8 |         VIEW                     |            |   197 |  7683 |   117   (3)| 00:00:02 |  Q1,02 | PCWP |            |
|   9 |          HASH UNIQUE             |            |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,02 | PCWP |            |
|  10 |           PX RECEIVE             |            |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,02 | PCWP |            |
|  11 |            PX SEND HASH          | :TQ10001   |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,01 | P->P | HASH       |
|  12 |             WINDOW SORT          |            |   197 | 20685 |   117   (3)| 00:00:02 |  Q1,01 | PCWP |            |
|  13 |              PX RECEIVE          |            |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,01 | PCWP |            |
|  14 |               PX SEND HASH       | :TQ10000   |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,00 | P->P | HASH       |
|  15 |                PX BLOCK ITERATOR |            |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,00 | PCWC |            |
|* 16 |                 TABLE ACCESS FULL| SALES_FACT |   197 | 20685 |   114   (0)| 00:00:02 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


  16 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')

Note
-----
   - dynamic sampling used for this statement (level=2)
   - Degree of Parallelism is 3 because of hint

33 rows selected.








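Oracle Advanced Ranking Functions and Advanced Grouping Functions

Note: this content comes from "Oracle Advanced Ranking Functions and Advanced Grouping Functions".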


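Advanced ranking functions:


[ ROW_NUMBER() | RANK() | DENSE_RANK() ] OVER (partition by xx order by xx)

1. row_number(): consecutive, strictly increasing numbers with no ties: 1 2 3 4
   row_number() over (partition by xx order by xx)
-- Partition the student table by class and number the rows within each class by score, highest first.
-- Rows with equal scores are numbered in an arbitrary order unless a tie-breaker column is added to the order by.
select row_number() over(partition by class_id order by score desc) rn, t.* from student2016 t;
2. rank(): ranking with gaps. Equal values share a rank, and the ranks that follow skip ahead: 1 2 2 2 5
   rank() over (partition by xx order by xx)
select rank() over(partition by class_id order by score desc) rn, t.* from student2016 t;
3. dense_rank(): ranking without gaps. Equal values share a rank, and the next rank follows immediately: 1 2 2 2 3
   dense_rank() over (partition by xx order by xx)
select dense_rank() over(partition by class_id order by score desc) rn, t.* from student2016 t;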
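
The difference between the three functions is easiest to see on tied scores. The following is a minimal sketch, not the author's test case: it runs the queries in SQLite through Python's sqlite3 module as a stand-in for Oracle (this needs SQLite 3.25 or later), and both the sample rows and the student_id column are assumptions made for illustration.

```python
import sqlite3

# Hypothetical data: one class, three students tied on a score of 90.
# student_id is an assumed column name used only as a tie-breaker.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table student2016 (student_id integer, class_id integer, score integer)"
)
conn.executemany(
    "insert into student2016 values (?, ?, ?)",
    [(1, 1, 95), (2, 1, 90), (3, 1, 90), (4, 1, 90), (5, 1, 80)],
)

ranking_rows = conn.execute(
    """
    select student_id,
           score,
           row_number() over (partition by class_id order by score desc, student_id) as rn,
           rank()       over (partition by class_id order by score desc) as rnk,
           dense_rank() over (partition by class_id order by score desc) as drnk
    from student2016
    order by rn
    """
).fetchall()

for row in ranking_rows:
    print(row)  # (student_id, score, row_number, rank, dense_rank)
```

The three tied rows get row numbers 2, 3, 4 but share rank 2 and dense rank 2; the last row then gets rank 5 and dense rank 3.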

----------------------------------------------------------------------------------------------------------------------------

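Advanced grouping functions

                  group by rollup(a,b,c)

select a,b,c,sum(d) from test group by rollup(a,b,c);

rollup groups by the full column list first, then drops one column at a time from right to left, ending with no grouping columns at all (a grand total over the whole table).
For a rollup with n arguments there are n+1 groupings.

That is: group by a,b,c union all group by a,b union all group by a union all a grand total over test.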
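
The union all description can be checked mechanically. The sketch below is an illustration under stated assumptions: SQLite (via Python's sqlite3 module) has no rollup, so the helper rollup_grouping_sets, a hypothetical name, generates the n+1 column lists and the loop runs one plain group by per list. The four sample rows in test are invented.

```python
import sqlite3


def rollup_grouping_sets(columns):
    """Column lists that rollup(columns) expands to, longest first."""
    return [tuple(columns[:k]) for k in range(len(columns), -1, -1)]


conn = sqlite3.connect(":memory:")
conn.execute("create table test (a text, b text, c text, d integer)")
conn.executemany(
    "insert into test values (?, ?, ?, ?)",
    [("x", "p", "m", 1), ("x", "p", "n", 2), ("x", "q", "m", 3), ("y", "p", "m", 4)],
)

columns = ["a", "b", "c"]
rollup_sets = rollup_grouping_sets(columns)

# Emulate rollup as the concatenation (union all) of one group by per column list.
# Columns that are rolled up are reported as null, as rollup itself does.
rollup_rows = []
for grouping_set in rollup_sets:
    select_list = ", ".join(col if col in grouping_set else "null" for col in columns)
    group_by = " group by " + ", ".join(grouping_set) if grouping_set else ""
    rollup_rows += conn.execute(
        f"select {select_list}, sum(d) from test{group_by}"
    ).fetchall()

print(rollup_sets)
print(len(rollup_rows))
```

With three columns the helper yields four column lists, and the grand-total row carries null in every grouping column.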

----------------------------------------------------------------------------------

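                  group by cube(a,b,c)

For n arguments there are 2^n groupings.

That is: group by (a,b,c), (a,b), (a,c), (b,c), (a), (b), (c), and finally a grand total over all rows.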
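
The 2^n count follows from cube grouping by every subset of its arguments. A small sketch that enumerates those subsets (cube_grouping_sets is a hypothetical helper name, not an Oracle function):

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Every subset of columns, largest first: the groupings cube(columns) produces."""
    grouping_sets = []
    for size in range(len(columns), -1, -1):
        grouping_sets.extend(combinations(columns, size))
    return grouping_sets


cube_sets = cube_grouping_sets(["a", "b", "c"])
for grouping_set in cube_sets:
    print(grouping_set)
print(len(cube_sets))  # 2 ** 3
```

The empty tuple at the end stands for the grand total, where no grouping column remains.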

----------------------------------------------------------------------------------

                  group by grouping sets(a,b)

This returns only the result of grouping by a together with the result of grouping by b.
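
Put differently, grouping sets(a,b) behaves like two separate group by queries glued together with union all. A minimal sketch of that equivalence, assuming SQLite through Python's sqlite3 module as a stand-in (SQLite has no grouping sets, so only the union all form is run) and three invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (a text, b text, d integer)")
conn.executemany(
    "insert into test values (?, ?, ?)",
    [("x", "p", 1), ("x", "q", 2), ("y", "p", 4)],  # hypothetical sample rows
)

# Equivalent of: select a, b, sum(d) from test group by grouping sets(a, b)
grouping_sets_rows = conn.execute(
    """
    select a, null as b, sum(d) as total from test group by a
    union all
    select null, b, sum(d) from test group by b
    """
).fetchall()

for row in grouping_sets_rows:
    print(row)
```

There is no (a,b) detail level and no grand total here, which is what distinguishes grouping sets(a,b) from rollup(a,b) and cube(a,b).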


-- Create the sales table
create table sales_tab(
year_id number not null,
month_id number not null,
day_id number not null,
sales_value number(10,2) not null
);

-- Insert 1000 rows of random data
insert into sales_tab
select trunc(dbms_random.value(low=>2010,high=>2012)) as year_id,
trunc(dbms_random.value(low=>1,high=>13)) as month_id,
trunc(dbms_random.value(low=>1,high=>32)) as day_id,
round(dbms_random.value(low=>1,high=>100)) as sales_value
from dual
connect by level <=1000;

-- Plain group by queries (the row counts are from one run; the data is random, so yours may differ)
select sum(t.sales_value) from SALES_TAB t; -- 1 row

select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales from SALES_TAB t group by t.year_id,t.month_id,t.day_id
order by t.year_id,t.month_id,t.day_id desc; -- 540 rows

select t.year_id,t.month_id,sum(t.sales_value) sales from SALES_TAB t group by t.year_id,t.month_id
order by t.year_id,t.month_id desc; -- 24 rows

select t.year_id,sum(t.sales_value) sales from SALES_TAB t group by t.year_id
order by t.year_id desc; -- 2 rows

-- Advanced grouping functions
-- group by rollup(a,b,c)
select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales from SALES_TAB t group by rollup(t.year_id,t.month_id,t.day_id)
order by t.year_id,t.month_id,t.day_id; -- 567 rows = 540 + 24 + 2 + 1, the sum of the four queries above

-- group by cube(a,b,c)
select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales from SALES_TAB t group by cube(t.year_id,t.month_id,t.day_id)
order by t.year_id,t.month_id,t.day_id;

-- group by cube(a,b)
select t.year_id,t.month_id,sum(t.sales_value) sales from SALES_TAB t group by cube(t.year_id,t.month_id)
order by 1,2; -- 39 rows = 24 (year, month) + 2 (year) + 12 (month) + 1 (grand total)

-- group by grouping sets(a,b)
select t.year_id,t.month_id,sum(t.sales_value) sales from SALES_TAB t group by grouping sets(t.year_id,t.month_id)
order by 1,2; -- 14 rows = 2 (year) + 12 (month)
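
The row counts in the comments above are sums of distinct-group counts. As a rough cross-check, the sketch below regenerates comparable random data in plain Python (not Oracle; the seed is arbitrary) and derives the expected number of result rows for each grouping. The count of distinct (year, month, day) combinations depends on the random draw, so it will not necessarily equal 540.

```python
import random

random.seed(2016)  # arbitrary seed so the run is repeatable

# Mimic the insert above: 1000 rows, year in 2010..2011, month in 1..12, day in 1..31.
rows = [
    (random.randrange(2010, 2012), random.randrange(1, 13), random.randrange(1, 32))
    for _ in range(1000)
]

n_year_month_day = len({(y, m, d) for y, m, d in rows})
n_year_month = len({(y, m) for y, m, _ in rows})
n_year = len({y for y, _, _ in rows})
n_month = len({m for _, m, _ in rows})

# rollup(year, month, day): one result row per distinct group at each level, plus the grand total.
rollup_count = n_year_month_day + n_year_month + n_year + 1
# cube(year, month): (year, month), (year), (month) and the grand total.
cube_year_month_count = n_year_month + n_year + n_month + 1
# grouping sets(year, month): only the two single-column groupings.
grouping_sets_count = n_year + n_month

print(rollup_count, cube_year_month_count, grouping_sets_count)
```

Because all three columns are declared not null, every null in the rollup or cube output marks a subtotal row, so the counts add up exactly.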



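Oracle Advanced Queries: OVER (PARTITION BY ..)

Note: this content comes from "Oracle Advanced Queries: OVER (PARTITION BY ..)".

To make it easy to follow along and test, all the examples are built in the Scott schema that ships with Oracle.

Note: an order by in a section heading means that the order by clause is mandatory when using that function.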

1. rank()/dense_rank() over(partition by ... order by ...)

Suppose a client asks for the details of the highest-paid employee in each department. Anyone with some Oracle experience can write the following SQL:

select e.ename, e.job, e.sal, e.deptno
  from scott.emp e,
       (select e.deptno, max(e.sal) sal from scott.emp e group by e.deptno) me
 where e.deptno = me.deptno
   and e.sal = me.sal;

Having met the requirement, it is a good habit to ask whether there is another way. There is: the rank() over(partition by ...) and dense_rank() over(partition by ...) syntax from this section's heading. The two statements are as follows:

select e.ename, e.job, e.sal, e.deptno
  from (select e.ename,
               e.job,
               e.sal,
               e.deptno,
               rank() over(partition by e.deptno order by e.sal desc) rnk
          from scott.emp e) e
 where e.rnk = 1;

select e.ename, e.job, e.sal, e.deptno
  from (select e.ename,
               e.job,
               e.sal,
               e.deptno,
               dense_rank() over(partition by e.deptno order by e.sal desc) rnk
          from scott.emp e) e
 where e.rnk = 1;

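Why do these return the same result as the first statement? Here is a breakdown of the rank()/dense_rank() over(partition by e.deptno order by e.sal desc) syntax.
over: the window over which the function is evaluated.
partition by e.deptno: partition the rows by department number.
order by e.sal desc: sort by salary from highest to lowest (order by is mandatory with rank()/dense_rank(); omitting it is an error).
rank()/dense_rank(): assign the rank.
Taken together: within each department, rank the employees by salary from highest to lowest, where the ranks are ascending integers starting at 1.

So how do rank() and dense_rank() differ?
rank(): ranking with gaps. If two rows tie for rank 1, the next row gets rank 3.
dense_rank(): ranking without gaps. If two rows tie for rank 1, the next row gets rank 2.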
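
To see both points at once (the rank-based query matching the max() join, and the two functions diverging after a tie), here is a small sketch. It assumes SQLite via Python's sqlite3 module in place of Oracle, a table simply named emp rather than scott.emp, and six illustrative rows in the shape of that table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [  # illustrative sample rows; two employees tie for the top salary in department 20
        ("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10),
        ("JONES", 2975, 20), ("SCOTT", 3000, 20), ("FORD", 3000, 20),
    ],
)

ranked = conn.execute(
    """
    select ename, deptno, sal,
           rank()       over (partition by deptno order by sal desc) as rnk,
           dense_rank() over (partition by deptno order by sal desc) as drnk
    from emp
    order by deptno, sal desc, ename
    """
).fetchall()

top_by_rank = conn.execute(
    """
    select ename, sal, deptno
    from (select ename, sal, deptno,
                 rank() over (partition by deptno order by sal desc) as rnk
          from emp) r
    where rnk = 1
    order by deptno, ename
    """
).fetchall()

top_by_join = conn.execute(
    """
    select e.ename, e.sal, e.deptno
    from emp e
    join (select deptno, max(sal) as sal from emp group by deptno) me
      on e.deptno = me.deptno and e.sal = me.sal
    order by e.deptno, e.ename
    """
).fetchall()

for row in ranked:
    print(row)
print(top_by_rank == top_by_join)
```

In department 20 the two tied employees both get rank 1, and the third employee gets rank 3 but dense rank 2. Filtering on rank 1 keeps both tied rows, just as the join on max(sal) does.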

Exercise: query the details of the lowest-paid employee in each department.

2. min()/max() over(partition by ...)

Now that we can find the highest and lowest salary in each department, the client has a new request: alongside each employee's details, show the difference between that employee's salary and the department's highest and lowest salaries. This is still fairly easy; modify the group by statement from section 1 as follows:

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       e.sal - me.min_sal diff_min_sal,
       me.max_sal - e.sal diff_max_sal
  from scott.emp e,
       (select e.deptno, min(e.sal) min_sal, max(e.sal) max_sal
          from scott.emp e
         group by e.deptno) me
 where e.deptno = me.deptno
 order by e.deptno, e.sal;

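Above we used min() and max(); the former returns the minimum and the latter the maximum. What happens when these two functions are combined with over(partition by ...)? Consider the following SQL:

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       nvl(e.sal - min(e.sal) over(partition by e.deptno), 0) diff_min_sal,
       nvl(max(e.sal) over(partition by e.deptno) - e.sal, 0) diff_max_sal
  from scott.emp e;

Both statements return the same values (the second has no order by at the end, so the row order may differ). min() and max() still compute the minimum and maximum, only now over each partition defined by partition by.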
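
The equivalence can be checked on a small data set. This is a sketch under assumptions: SQLite through Python's sqlite3 module stands in for Oracle, the table is named emp rather than scott.emp, the rows are illustrative, and nvl is left out because no value here is null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [  # illustrative sample rows
        ("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10),
        ("JONES", 2975, 20), ("SCOTT", 3000, 20), ("FORD", 3000, 20),
    ],
)

# Window version: min/max are evaluated over each department's partition.
diffs_window = conn.execute(
    """
    select ename, deptno, sal,
           sal - min(sal) over (partition by deptno) as diff_min_sal,
           max(sal) over (partition by deptno) - sal as diff_max_sal
    from emp
    order by deptno, sal, ename
    """
).fetchall()

# Join version: min/max are computed once per department, then joined back.
diffs_join = conn.execute(
    """
    select e.ename, e.deptno, e.sal,
           e.sal - me.min_sal as diff_min_sal,
           me.max_sal - e.sal as diff_max_sal
    from emp e
    join (select deptno, min(sal) as min_sal, max(sal) as max_sal
          from emp group by deptno) me
      on e.deptno = me.deptno
    order by e.deptno, e.sal, e.ename
    """
).fetchall()

print(diffs_window == diffs_join)
```

The same final order by is applied to both queries so that the two result lists can be compared directly.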

Exercise: what result do you get if you add order by to the over clause in this example?
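
One way to explore the exercise is the sketch below, which assumes SQLite via Python's sqlite3 module as a stand-in for Oracle and three illustrative rows for one department. Once order by is present and no windowing clause is given, the default window runs from the start of the partition to the current row, so min() and max() become running values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10)],  # sample rows
)

running_min_max = conn.execute(
    """
    select ename, sal,
           min(sal) over (partition by deptno order by sal) as min_so_far,
           max(sal) over (partition by deptno order by sal) as max_so_far
    from emp
    order by sal
    """
).fetchall()

for row in running_min_max:
    print(row)
```

With an ascending sort, the running maximum is always the current row's own salary, so a diff_max_sal column computed this way would be 0 on every row.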

3. lead()/lag() over(partition by ... order by ...)

We Chinese are famous the world over for comparing ourselves with others and caring about face, and clients all the more so. Not satisfied after comparing against the highest and lowest salaries, the client now makes a rather perverse request: compute the difference between each person's salary and the next higher and next lower salary. This one really had me stuck; I had no idea how to do it with a group by statement. But... now that we have over(partition by ...), it all looks simple:

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       lead(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lead_sal,
       lag(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lag_sal,
       nvl(lead(e.sal) over(partition by e.deptno order by e.sal) - e.sal,
           0) diff_lead_sal,
       nvl(e.sal - lag(e.sal) over(partition by e.deptno order by e.sal), 0) diff_lag_sal
  from scott.emp e;

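After reading that statement, do you also feel it was a false alarm (breaking into a cold sweat and then getting overexcited is a good way to catch a cold)? Let us go through the two new functions used above.
lead(column, n, m): the value of <column> in the n-th row after the current row, or m if there is no such row. Without n and m, it looks at the first row after the current row and returns null if there is none.
lag(column, n, m): the value of <column> in the n-th row before the current row, or m if there is no such row. Without n and m, it looks at the first row before the current row and returns null if there is none.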
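
A short sketch of the offset and default arguments, assuming SQLite via Python's sqlite3 module in place of Oracle and three illustrative rows for a single department:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10)],  # sample rows
)

lead_lag_rows = conn.execute(
    """
    select ename, sal,
           lead(sal, 1, 0) over (partition by deptno order by sal) as lead_sal,
           lag(sal, 1, 0)  over (partition by deptno order by sal) as lag_sal,
           lead(sal)       over (partition by deptno order by sal) as lead_no_default
    from emp
    order by sal
    """
).fetchall()

for row in lead_lag_rows:
    print(row)  # (ename, sal, next salary or 0, previous salary or 0, next salary or None)
```

The highest-paid row has no following row, so lead(sal, 1, 0) falls back to 0 while lead(sal) returns null (None in Python). That null is why the statement above wraps the differences in nvl.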

Below are some more commonly used functions applied with this syntax (note: a function shown with an order by clause requires order by when it is used):

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       first_value(e.sal) over(partition by e.deptno) first_sal,
       last_value(e.sal) over(partition by e.deptno) last_sal,
       sum(e.sal) over(partition by e.deptno) sum_sal,
       avg(e.sal) over(partition by e.deptno) avg_sal,
       count(e.sal) over(partition by e.deptno) count_num,
       row_number() over(partition by e.deptno order by e.sal) row_num
  from scott.emp e;
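
Without an order by, the rows in a partition have no defined order, so first_value() and last_value() pick from whatever order the database happens to use. Adding order by fixes that but introduces the default window (start of partition to current row), which changes what last_value() returns. The sketch below illustrates this, assuming SQLite via Python's sqlite3 module as a stand-in for Oracle and three illustrative rows; the explicit windowing clause is the same one used with first_value() in the plan examples earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [("CLARK", 2450, 10), ("KING", 5000, 10), ("MILLER", 1300, 10)],  # sample rows
)

first_last_rows = conn.execute(
    """
    select ename, sal,
           first_value(sal) over (partition by deptno order by sal) as first_sal,
           last_value(sal)  over (partition by deptno order by sal) as last_default_window,
           last_value(sal)  over (partition by deptno order by sal
                                  rows between unbounded preceding
                                           and unbounded following) as last_whole_partition
    from emp
    order by sal
    """
).fetchall()

for row in first_last_rows:
    print(row)
```

With the default window, last_value() simply echoes the current row's salary; only the explicit window over the whole partition returns the department's highest salary on every row.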


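Important: after reading this article you might come away thinking that OVER (PARTITION BY ..) is better than GROUP BY. That is not the case. The former cannot replace the latter, and it is typically not as efficient in execution either; it simply offers more functionality. Choose between them according to what the requirement calls for.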
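
The functional difference behind this advice is that GROUP BY collapses each group into one row, while OVER (PARTITION BY ..) keeps every detail row and attaches the aggregate to it. A minimal sketch, assuming SQLite via Python's sqlite3 module as a stand-in for Oracle and five illustrative rows (it says nothing about relative performance):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer, deptno integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [  # illustrative sample rows
        ("CLARK", 2450, 10), ("KING", 5000, 10),
        ("JONES", 2975, 20), ("SCOTT", 3000, 20),
        ("BLAKE", 2850, 30),
    ],
)

# One row per department.
grouped = conn.execute(
    "select deptno, sum(sal) from emp group by deptno order by deptno"
).fetchall()

# One row per employee, each carrying its department total.
windowed = conn.execute(
    """
    select ename, deptno, sal, sum(sal) over (partition by deptno) as dept_total
    from emp
    order by deptno, ename
    """
).fetchall()

print(len(grouped), len(windowed))
```

If only the per-department totals are needed, the GROUP BY form is the direct choice; the window form earns its keep when the detail columns must appear next to the aggregate.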






