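In Hive, JSON data can be parsed at query time with get_json_object or json_tuple.
Alternatively, you can declare the table itself as JSON-backed, so that its fields can be queried directly. The walkthrough below uses hive-2.3.0: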
1. Prepare the data source
Save the following content as test.txt (one JSON object per line):
{"student":{"name":"king","age":11,"sex":"M"},"class":{"book":"語文","level":2,"score":80},"teacher":{"name":"t1","class":"語文"}}
{"student":{"name":"wang","age":12,"sex":"M"},"class":{"book":"語文","level":2,"score":80},"teacher":{"name":"t1","class":"語文"}}
{"student":{"name":"test","age":13,"sex":"M"},"class":{"book":"語文","level":2,"score":80},"teacher":{"name":"t1","class":"語文"}}
{"student":{"name":"test2","age":14,"sex":"M"},"class":{"book":"語文","level":2,"score":80},"teacher":{"name":"t1","class":"語文"}}
{"student":{"name":"test3","age":15,"sex":"M"},"class":{"book":"語文","level":2,"score":80},"teacher":{"name":"t1","class":"語文"}}
{"student":{"name":"test4","age":16,"sex":"M"},"class":{"book":"語文","level":2,"score":80},"teacher":{"name":"t1","class":"語文"}}
2. Create the Hive table
Note that the SerDe class name is case-sensitive and must be written exactly: org.apache.hive.hcatalog.data.JsonSerDe
create external table if not exists dw_stg.student(
  student map<string,string> comment "student info",
  class map<string,string> comment "course info",
  teacher map<string,string> comment "teacher info"
)
comment "student course info"
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile;
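In some installations the JSON SerDe class is not on Hive's classpath by default, and the create statement (or the first query) fails with a class-not-found error. A minimal sketch of the usual workaround follows; the jar path is a placeholder, not a real location, and must be adjusted to your environment:

```sql
-- Hypothetical path: point this at the hive-hcatalog-core jar of your installation.
add jar /path/to/hive-hcatalog-core-2.3.0.jar;

-- Confirm that the table exists and uses the JSON SerDe.
describe formatted dw_stg.student;
```

In the describe output, the SerDe Library entry should show org.apache.hive.hcatalog.data.JsonSerDe.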
3. Upload the data
Upload test.txt to the directory of the student table created above:
hdfs dfs -put test.txt /user/hive/warehouse/dw_stg.db/student/
4. Query with HQL
Query all records:
Query the student field:
Query the class field:
Query all records where the student name is test4:
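The statements for the four queries listed above are not shown in the original. A minimal sketch of what they might look like, assuming the dw_stg.student table created in step 2:

```sql
-- All records
select * from dw_stg.student;

-- Only the student field (returned as a map)
select student from dw_stg.student;

-- Only the class field (returned as a map)
select class from dw_stg.student;

-- All records where the student name is test4
select * from dw_stg.student where student['name'] = 'test4';
```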
To extract a single value from a JSON field, index the map by key, for example student['name']:
select student['name'] as stuName, class['book'] as cls_book, class['score'] as cls_score, teacher['name'] as tech_name from dw_stg.student where student['name'] = 'test4';
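Because the columns are declared as map<string,string>, every extracted value is a string, including age and score. A sketch of how numeric comparisons could be handled with an explicit cast (the age threshold is only an illustration):

```sql
-- Cast string values from the map to int before comparing numerically.
select student['name'] as stu_name,
       cast(student['age'] as int) as stu_age,
       cast(class['score'] as int) as cls_score
from dw_stg.student
where cast(student['age'] as int) >= 14;
```

Without the cast, the comparison would rely on Hive's implicit type conversion, which is easier to get wrong.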
Overall, this is much more convenient than parsing with get_json_object or json_tuple.
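For comparison, here is a sketch of the same test4 query written with get_json_object. It assumes a hypothetical table dw_stg.student_raw with a single string column named line that holds each JSON record as raw text; neither name appears in the walkthrough above.

```sql
-- Hypothetical table: dw_stg.student_raw(line string), one JSON document per row.
select get_json_object(line, '$.student.name') as stuName,
       get_json_object(line, '$.class.book')   as cls_book,
       get_json_object(line, '$.class.score')  as cls_score,
       get_json_object(line, '$.teacher.name') as tech_name
from dw_stg.student_raw
where get_json_object(line, '$.student.name') = 'test4';
```

Every field needs its own JSON path expression here, whereas the JsonSerDe table lets the query use plain map lookups.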