Hive GROUP BY aggregation enhancements


1.GROUPING SETS

Any GROUPING SETS clause can be expressed logically as several GROUP BY queries connected by UNION.

SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b GROUPING SETS ((a, b), a, b, ())

is equivalent to

SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b
UNION
SELECT a, null, SUM(c) FROM tab1 GROUP BY a, null
UNION
SELECT null, b, SUM(c) FROM tab1 GROUP BY null, b
UNION
SELECT null, null, SUM(c) FROM tab1
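
To make the equivalence concrete, here is a minimal Python sketch of the semantics, not of how Hive executes the query. The helper name `grouping_sets_sum` and the sample rows standing in for `tab1` are assumptions made up for illustration. It aggregates once per grouping set and fills the columns outside the set with None, which plays the role of NULL.

```python
from collections import defaultdict

def grouping_sets_sum(rows, group_cols, grouping_sets, value_col):
    """Sum value_col once per grouping set; columns outside the set become None."""
    result = []
    for gset in grouping_sets:
        totals = defaultdict(int)
        for row in rows:
            # Keep one slot per GROUP BY column, using None (NULL) for the
            # columns that this grouping set aggregates away.
            key = tuple(row[col] if col in gset else None for col in group_cols)
            totals[key] += row[value_col]
        result.extend((*key, total) for key, total in totals.items())
    return result

# Hypothetical sample data standing in for tab1.
tab1 = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 1, "b": "y", "c": 20},
    {"a": 2, "b": "x", "c": 30},
]

# Mirrors GROUPING SETS ((a, b), a, b, ()).
rows = grouping_sets_sum(tab1, ["a", "b"], [("a", "b"), ("a",), ("b",), ()], "c")
for r in sorted(rows, key=repr):
    print(r)
```

Each pass of the outer loop corresponds to one branch of the UNION query above. The sketch assumes a non-empty input; it does not model how SQL treats the empty grouping set over an empty table.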

2.GROUPING__ID

Note the two consecutive underscores. GROUPING__ID indicates which of the grouping sets each aggregated row belongs to.

SELECT key, value, GROUPING__ID,count(*)
FROM T1
GROUP BY key, value
GROUPING SETS((key,value),key,value)
;

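In Hive 2.3.0 and later this is equivalent to the query below. GROUPING__ID is a bit vector over the GROUP BY columns (a bit is 1 when that column is aggregated away in the row), not the ordinal position of the grouping set; earlier Hive releases encode the bits differently.

SELECT key, value, 0, count(*) -- grouping set (key, value): no column is aggregated away
FROM T1
GROUP BY key, value
UNION
SELECT key, NULL, 1, count(*) -- grouping set key: value is aggregated away
FROM T1
GROUP BY key
UNION
SELECT NULL, value, 2, count(*) -- grouping set value: key is aggregated away
FROM T1
GROUP BY value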
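
The following Python sketch shows how a GROUPING__ID value can be derived for a grouping set. It assumes the Hive 2.3.0+ convention, where a bit is 1 when the column is aggregated away and the first GROUP BY column is the most significant bit; the function name `grouping_id` is hypothetical, and older Hive releases produce different numbers.

```python
def grouping_id(group_cols, grouping_set):
    """Bit vector for one grouping set, assuming the Hive 2.3.0+ convention."""
    gid = 0
    for col in group_cols:
        # Shift in one bit per GROUP BY column: 0 if the column is part of
        # the grouping set, 1 if it has been aggregated away.
        gid = (gid << 1) | (0 if col in grouping_set else 1)
    return gid

group_cols = ["key", "value"]
for gset in [("key", "value"), ("key",), ("value",), ()]:
    print(gset, grouping_id(group_cols, gset))
```

Under this assumption the value separates a NULL that means "aggregated away" from a NULL that is genuinely stored in the column.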


3.WITH CUBE

CUBE produces every possible combination of the GROUP BY columns.

GROUP BY a,b,c WITH CUBE

is equivalent to

GROUP BY a,b,c GROUPING SETS((a,b,c),(a,b),(b,c), (a,c),(a),(b),(c),())
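
As a quick check of the expansion above, this small Python sketch enumerates the grouping sets that CUBE implies for a column list. The function name `cube` is an assumption for illustration; it only lists the sets and does not run any aggregation.

```python
from itertools import combinations

def cube(cols):
    """Every subset of cols, largest first: 2**len(cols) grouping sets."""
    return [combo
            for size in range(len(cols), -1, -1)
            for combo in combinations(cols, size)]

cube_sets = cube(["a", "b", "c"])
for gset in cube_sets:
    print(gset)
```

With three columns this yields eight grouping sets, so the number of aggregation passes doubles with every column added to a CUBE.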

4.WITH ROLLUP

The ROLLUP clause is used with GROUP BY to compute aggregates at each level of a dimension hierarchy.

GROUP BY a,b,c WITH ROLLUP

is equivalent to

GROUP BY a,b,c GROUPING SETS((a,b,c),(a,b),(a),())
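
The ROLLUP expansion can be sketched the same way in Python. The function name `rollup` is hypothetical; it lists the prefixes of the column list from the most detailed level down to the grand total.

```python
def rollup(cols):
    """Prefixes of cols, longest first: len(cols) + 1 grouping sets."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

rollup_sets = rollup(["a", "b", "c"])
for gset in rollup_sets:
    print(gset)
```

Because only prefixes are kept, column order matters for ROLLUP: `rollup(["a", "b", "c"])` and `rollup(["c", "b", "a"])` describe different hierarchies.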

See the official Hive documentation.

