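In Hive, routine JSON handling comes down to this: a plain JSON object can be parsed with get_json_object(col, '$.key') alone, while composite JSON, such as an array of objects, has to be preprocessed before it can be parsed.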
explode: turn one field into multiple rows
select explode(split(field,',')) as abc from explode_lateral_view;
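As a minimal sketch of what explode does, the query below runs on a string literal instead of a table, so it assumes nothing about the contents of explode_lateral_view. It also assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
-- split() turns 'a,b,c' into array('a','b','c');
-- explode() then emits one output row per array element.
SELECT explode(split('a,b,c', ',')) AS abc;
-- Result: three rows, containing a, b and c.
```

Used this way, explode can be the only expression in the SELECT list; combining it with other columns is what LATERAL VIEW is for.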
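LATERAL VIEW: split a single row into multiple rows
A lateral view works together with explode (or any other UDTF) so that a single statement produces the result set in which each input row has been split into multiple rows.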
select get_json_object(concat('{',sale_info_r,'}'),'$.monthSales') as monthSales
from explode_lateral_view
LATERAL VIEW explode(split(regexp_replace(regexp_replace(sale_info,'\\[\\{',''),'}]',''),'},\\{')) s as sale_info_r;
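To make the intermediate steps visible, here is a sketch of the same technique on one hypothetical row. The CTE name sample, the alias pieces, and the JSON values are assumptions made up for illustration; only the sale_info column name and the regular expressions follow the statement above.

```sql
-- Hypothetical sample data: one row whose sale_info column holds a JSON array.
WITH sample AS (
  SELECT '[{"source":"7fresh","monthSales":4900},{"source":"jd","monthSales":2090}]' AS sale_info
)
SELECT
  sale_info_r,  -- one fragment per array element, without the outer braces
  get_json_object(concat('{', sale_info_r, '}'), '$.monthSales') AS monthSales
FROM sample
LATERAL VIEW explode(
  split(
    -- strip the leading '[{' and the trailing '}]', then cut at each '},{'
    regexp_replace(regexp_replace(sale_info, '\\[\\{', ''), '\\}\\]', ''),
    '\\},\\{'
  )
) pieces AS sale_info_r;
-- Result: two rows, with monthSales 4900 and 2090.
```

Because the braces are removed by the split, each fragment has to be wrapped in '{' and '}' again before get_json_object can read it.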
Unified version
The statement below turns one row of JSON data into a fully two-dimensional table.
select t1.id,
       get_json_object(t1.col,'$.key1') as value1,
       get_json_object(t1.col,'$.key2') as value2
from (
  select id, s.col as col
  from table_a
  lateral view explode(split(regexp_replace(regexp_extract(json,'^\\[(.+)\\]$',1),'\\}\\,[ ]{0,1}\\{','\\}\\|\\|\\{'),'\\|\\|')) s as col
) t1;
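The unified approach can be sketched on a hypothetical row as follows. The CTE stands in for table_a, and the keys name and qty are invented for illustration. The separator pattern here matches only '},{' with at most one space after the comma, which is an assumption about the input format.

```sql
-- Hypothetical stand-in for table_a: an id plus a JSON array of flat objects.
WITH table_a AS (
  SELECT 1 AS id, '[{"name":"a","qty":1},{"name":"b","qty":2}]' AS json
)
SELECT
  t1.id,
  get_json_object(t1.col, '$.name') AS name,
  get_json_object(t1.col, '$.qty')  AS qty
FROM (
  SELECT id, s.col AS col
  FROM table_a
  LATERAL VIEW explode(
    split(
      regexp_replace(
        regexp_extract(json, '^\\[(.+)\\]$', 1),  -- drop the outer [ ]
        '\\}\\,[ ]{0,1}\\{',                      -- boundary between two objects
        '\\}\\|\\|\\{'                            -- becomes }||{ so the braces survive
      ),
      '\\|\\|'                                    -- cut at the || marker
    )
  ) s AS col
) t1;
-- Result: two rows, (1, a, 1) and (1, b, 2).
```

Unlike the previous variant, every piece keeps its own braces, so no concat is needed. The trick relies on '},{' appearing only between top-level objects; nested objects or a literal '||' inside a value would be split in the wrong place.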
Or another version
select get_json_object(concat('{',sale_info_1,'}'),'$.source') as source,
       get_json_object(concat('{',sale_info_1,'}'),'$.monthSales') as monthSales,
       get_json_object(concat('{',sale_info_1,'}'),'$.userCount') as userCount,
       get_json_object(concat('{',sale_info_1,'}'),'$.score') as score
from explode_lateral_view
LATERAL VIEW explode(split(regexp_replace(regexp_replace(sale_info,'\\[\\{',''),'}]',''),'},\\{')) s as sale_info_1;
Converting Hive data into a JSON string
concat('{\"name\":\"', name,
       '\",\"cus_nam\":\"', NVL(t2.cus_nam, ''),
       '\",\"orderNo\":\"', NVL(orderNo, ''),
       '\",\"ord_no\":\"', NVL(t1.ord_no, ''),
       '\",\"trigger\":\"', NVL(trigger, ''),
       '\",\"assignmentOfClaims\":\"', NVL(assignmentOfClaims, ''),
       '\"}') as value
Parsing the resulting string with get_json_object was tested and worked correctly.
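A small round trip illustrates the check described above. This is only a sketch: the CTE names sample and built and the values are hypothetical, and only two of the fields are kept to stay short.

```sql
-- Hypothetical sample row; orderNo is NULL to show why NVL is needed.
WITH sample AS (
  SELECT 'Alice' AS name, CAST(NULL AS STRING) AS orderNo
),
built AS (
  -- concat() returns NULL if any argument is NULL, so each column is wrapped in NVL.
  SELECT concat('{"name":"', NVL(name, ''), '","orderNo":"', NVL(orderNo, ''), '"}') AS value
  FROM sample
)
SELECT
  value,                                    -- {"name":"Alice","orderNo":""}
  get_json_object(value, '$.name') AS name  -- Alice
FROM built;
```

Note that concat does not escape anything, so a column value containing a double quote or a backslash would produce a string that is no longer valid JSON.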
Hive regular-expression matching
regexp_extract(field, regex, index)
Matching examples
select regexp_extract('honey123moon', 'honey([0-9]+)(moon)', 0);
select regexp_extract('x=a3&x=18abc&x=2&y=3&x=4','x=([0-9]+)([a-z]+)',1);
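The role of the index argument can be seen by running one pattern with different indexes. The sketch below reuses the second sample string; the column aliases are chosen here for illustration.

```sql
-- Index 0 returns the whole match; 1 and 2 return the first and second capture group.
-- The first 'x=' is followed by a letter, so the first match starts at 'x=18abc'.
SELECT
  regexp_extract('x=a3&x=18abc&x=2&y=3&x=4', 'x=([0-9]+)([a-z]+)', 0) AS whole_match, -- x=18abc
  regexp_extract('x=a3&x=18abc&x=2&y=3&x=4', 'x=([0-9]+)([a-z]+)', 1) AS group_1,     -- 18
  regexp_extract('x=a3&x=18abc&x=2&y=3&x=4', 'x=([0-9]+)([a-z]+)', 2) AS group_2;     -- abc
```

Only the first match in the string is used; later occurrences such as 'x=2' are not considered.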
Other: