Hive Study Notes: Column-to-Row Conversion with collect_list/collect_set/concat_ws

Reposted article · 2022-01-11 22:49 · Hive

1. Introduction

In Hive, to group by one field and merge the values of another field within each group, use collect_list or collect_set.

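Both are aggregate functions that return the values of a column within each group as an array. They differ in how they treat duplicates: collect_list keeps every value, duplicates included, while collect_set removes duplicates and returns only the distinct values.

This is loosely analogous to list and set in Python.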

create table table_tmp(
    id string,
    classes string
) partitioned by (month string)
row format delimited fields terminated by ',';

1,a
1,b
2,a
2,b
2,a
2,c
3,a
3,c

load data local inpath '/root/data/id.data' into table table_tmp partition (month='202201');

select id,
       collect_list(classes) as col_001
from table_tmp
group by id;
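
To make the difference between the two functions concrete, here is a minimal sketch that runs both side by side. It assumes table_tmp has been created and loaded with the sample rows shown earlier; the column aliases are illustrative only.

```sql
-- Assumes table_tmp holds the eight sample rows loaded above.
select id,
       collect_list(classes)       as classes_list,  -- every value, duplicates kept
       collect_set(classes)        as classes_set,   -- distinct values only
       size(collect_list(classes)) as n_values,
       size(collect_set(classes))  as n_distinct
from table_tmp
group by id;
```

With the sample data, id 2 should show n_values = 4 and n_distinct = 3, because 'a' appears twice in that group. The order of elements inside either array is not guaranteed.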

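-- Join every value in the group with '-', duplicates included.
select id,
       concat_ws('-', collect_list(cast(classes as string))) as col_concat
from table_tmp
group by id;

-- Same, but with duplicates removed first.
select id,
       concat_ws('-', collect_set(cast(classes as string))) as col_concat
from table_tmp
group by id;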
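
Since neither collect function guarantees element order, the concatenated string can vary between runs. One possible way to get a stable result, sketched here as a suggestion rather than as part of the original notes, is to wrap the array in Hive's built-in sort_array before joining:

```sql
-- Assumes table_tmp holds the sample rows loaded above.
-- sort_array sorts the collected values in ascending order,
-- so the joined string no longer depends on row arrival order.
select id,
       concat_ws('-', sort_array(collect_set(classes))) as col_concat
from table_tmp
group by id;
```

For the sample data this should give a-b for id 1, a-b-c for id 2, and a-c for id 3.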

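The collect functions can also be used to work around a restriction of group by: in a grouped query, every column in the select list must either be a grouping column or be wrapped in an aggregate function.

Sometimes, though, we want to group by one column and then pick a single arbitrary value from another column. That can be done as follows: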

-- Take the first element of the collected array for each group.
select id,
       collect_list(classes)[0] as col_001
from table_tmp
group by id;

This feels a bit like indexing into a list in Python.
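
Note that the element at index 0 is arbitrary rather than random, because Hive does not guarantee the order in which values are collected. If a reproducible pick is preferred, the following sketch shows two possible alternatives (the aliases are hypothetical, and this goes slightly beyond the original notes):

```sql
-- Assumes table_tmp holds the sample rows loaded above.
select id,
       min(classes)                        as smallest_class,           -- plain aggregate
       sort_array(collect_set(classes))[0] as smallest_class_via_array  -- sort, then index
from table_tmp
group by id;
```

Both columns return the smallest value in each group, which is 'a' for every id in the sample data.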

concat_ws(separator, str1, str2, ...)
concat_ws(separator, [str1, str2, ...])
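
The two call forms above can be tried directly on literal values. This is a small sketch; array(...) is Hive's built-in array constructor, and the aliases are illustrative.

```sql
-- Both expressions should evaluate to 'a-b-c'.
select concat_ws('-', 'a', 'b', 'c')        as from_strings,
       concat_ws('-', array('a', 'b', 'c')) as from_array;
```

The array form is the one used together with collect_list and collect_set. It expects an array of strings, which is why a non-string column is cast to string before being collected.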

