Table structure
CREATE TABLE test (
  f1 string,
  f2 string,
  f3 string,
  cnt int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/data/logs/suiyingli/tmp/test.data' OVERWRITE INTO TABLE test;
Raw data (columns f1, f2, f3, cnt)
A A B 1
B B A 1
A A A 2
WITH CUBE query
SELECT f1,
       f2,
       f3,
       sum(cnt),
       GROUPING__ID,
       rpad(reverse(bin(cast(GROUPING__ID AS bigint))), 3, '0')
FROM test
GROUP BY f1,
         f2,
         f3 WITH CUBE;
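The last column of the query is easier to follow with concrete numbers. As a small sketch, the same expression can be evaluated on literal values; the inputs 1, 3 and 4 are hypothetical and chosen only for illustration, they are not taken from this table's output:

```sql
-- bin() returns the binary digits as a string, reverse() puts the lowest bit first,
-- and rpad() pads on the right with '0' up to one character per grouping column.
SELECT rpad(reverse(bin(cast(1 AS bigint))), 3, '0'),  -- '1'   -> '1'   -> '100'
       rpad(reverse(bin(cast(3 AS bigint))), 3, '0'),  -- '11'  -> '11'  -> '110'
       rpad(reverse(bin(cast(4 AS bigint))), 3, '0');  -- '100' -> '001' -> '001'
```

Each character position then corresponds to one of the three grouping columns. Which bit belongs to which column, and whether a 1 marks a column that is grouped on or one that is aggregated away, may differ between Hive releases, so it is worth checking the output against the Hive version in use.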
WITH CUBE sample result
ROLLUP query
SELECT f1,
       f2,
       f3,
       sum(cnt),
       GROUPING__ID,
       rpad(reverse(bin(cast(GROUPING__ID AS bigint))), 3, '0')
FROM test
GROUP BY f1,
         f2,
         f3 WITH ROLLUP;
ROLLUP sample result

GROUPING SETS query
SELECT f1,
       f2,
       f3,
       sum(cnt),
       GROUPING__ID,
       rpad(reverse(bin(cast(GROUPING__ID AS bigint))), 3, '0')
FROM test
GROUP BY f1,
         f2,
         f3
GROUPING SETS ((f1), (f1, f2));

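Summary
CUBE produces the most complete set of groupings: every combination of the dimensions, where each dimension is either kept or rolled up to NULL (in effect the Cartesian product of the dimension values with NULL included).
ROLLUP restricts the combinations to a hierarchy: once a dimension is NULL, every dimension after it must also be NULL; while a dimension is non-NULL, the next one may be either.
GROUPING SETS lets you define the dimension combinations yourself and group only by the ones you need.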
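The relationship between the three forms can be made concrete by writing CUBE and ROLLUP out as explicit GROUPING SETS. The sketch below uses the `test` table from this post and leaves out the GROUPING__ID columns for brevity:

```sql
-- GROUP BY f1, f2, f3 WITH CUBE is equivalent to listing all 2^3 = 8 grouping sets:
SELECT f1, f2, f3, sum(cnt)
FROM test
GROUP BY f1, f2, f3
GROUPING SETS ((f1, f2, f3), (f1, f2), (f1, f3), (f2, f3), (f1), (f2), (f3), ());

-- GROUP BY f1, f2, f3 WITH ROLLUP is equivalent to the 3 + 1 = 4 prefix grouping sets:
SELECT f1, f2, f3, sum(cnt)
FROM test
GROUP BY f1, f2, f3
GROUPING SETS ((f1, f2, f3), (f1, f2), (f1), ());
```

Seen this way, ROLLUP is the subset of CUBE in which columns are dropped only from the right, and a custom GROUPING SETS list is any other subset you choose.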
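PS: GROUPING SETS can simplify the SQL, and it performs better than running a separate GROUP BY for each dimension combination and merging the results with UNION.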
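For comparison, here is a sketch of what the GROUPING SETS ((f1), (f1, f2)) query would look like written by hand. It assumes UNION ALL is the intended way to combine the branches, and the alias `total` and subquery alias `t` are names introduced here for illustration:

```sql
-- One GROUP BY per grouping set; columns not grouped on are filled with typed NULLs
-- so that both branches have the same column list.
SELECT t.f1, t.f2, t.f3, t.total
FROM (
  SELECT f1, cast(NULL AS string) AS f2, cast(NULL AS string) AS f3, sum(cnt) AS total
  FROM test
  GROUP BY f1
  UNION ALL
  SELECT f1, f2, cast(NULL AS string) AS f3, sum(cnt) AS total
  FROM test
  GROUP BY f1, f2
) t;
```

Every additional grouping set adds another branch that reads `test` again, whereas the GROUPING SETS form states all of them in a single query. That is presumably where both the shorter SQL and the performance difference come from.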