1 Create the table (run in the Hive CLI)
CREATE TABLE IF NOT EXISTS emp(
  name STRING,
  salary FLOAT,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING,FLOAT>,
  address STRUCT<street:STRING,city:STRING,province:STRING,zip:INT>
)
PARTITIONED BY (province STRING, city STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':';
2 Prepare test data (one row per line, fields separated by tabs)
shanxi.txt
zj1	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,1
zj2	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,2
zj3	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,3
zj4	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,4
zj5	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,5
zj6	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,6
hunan.txt
zbq1	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,1
zbq2	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,2
zbq3	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,3
zbq4	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,4
zbq5	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,5
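Typing these rows by hand makes it easy to lose the tab separators, so the two files can also be generated. The following is a minimal sketch, not part of the original walkthrough: the helper name make_rows and the OUT_DIR variable are hypothetical, and /tmp/logs is assumed as the output directory because that is where the load step reads the files from.

```shell
#!/bin/sh
# Hypothetical helper: writes N tab-separated rows in the layout expected by emp.
# Arguments: name prefix, city, province, row count, output file.
make_rows() {
    prefix=$1; city=$2; province=$3; count=$4; out=$5
    : > "$out"                       # create or truncate the output file
    i=1
    while [ "$i" -le "$count" ]; do
        # name <TAB> salary <TAB> subordinates <TAB> deductions <TAB> address
        printf '%s%d\t10000\tjames,datacloase\tjim:1.2,james:2.1,lilly:3.8\thuaxing,%s,%s,%d\n' \
            "$prefix" "$i" "$city" "$province" "$i" >> "$out"
        i=$((i + 1))
    done
}

# Assumption: the files live in /tmp/logs, overridable through OUT_DIR.
OUT_DIR=${OUT_DIR:-/tmp/logs}
mkdir -p "$OUT_DIR"
make_rows zj  xian        shanxi 6 "$OUT_DIR/shanxi.txt"
make_rows zbq zhangjiajie hunan  5 "$OUT_DIR/hunan.txt"
```

Each generated row has five tab-separated fields; the last one carries the four comma-separated members of the address struct.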
3 Load the data (run in the Hive CLI)
LOAD DATA LOCAL INPATH '/tmp/logs/shanxi.txt' OVERWRITE INTO TABLE emp PARTITION(province='shanxi',city='xian');
LOAD DATA LOCAL INPATH '/tmp/logs/hunan.txt' OVERWRITE INTO TABLE emp PARTITION(province='hunan',city='zhangjiajie');
4 Query the Hive data
Show the table structure
hive> describe extended emp;
Query the data of one partition
hive> select * from emp where province='shanxi' and city = 'xian';
5 Inspect the Hive directory on HDFS
[root@hdp1 /tmp/logs]# hdfs dfs -ls /user/hive/warehouse/emp
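Each partition is stored as nested key=value directories under the table directory, one level per partition column. As a small sketch of that layout (the helper name partition_path is hypothetical, and the warehouse root is assumed to be the /user/hive/warehouse path listed above):

```shell
#!/bin/sh
# Hypothetical helper: prints the HDFS directory Hive uses for one partition.
# Assumes the warehouse root /user/hive/warehouse seen in this walkthrough.
partition_path() {
    table=$1; province=$2; city=$3
    printf '/user/hive/warehouse/%s/province=%s/city=%s\n' "$table" "$province" "$city"
}

partition_path emp shanxi xian
partition_path emp hunan zhangjiajie
```

The printed paths can be passed to hdfs dfs -ls to list the data files of a single partition.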
6 Delete the hunan partition from Hive
A Log in to the MySQL instance that Hive uses as its metastore
B Switch to the hive database
mysql> use hive;
C Query the relevant metastore tables
mysql> SELECT * FROM TBLS WHERE TBL_NAME='emp';
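This returns one record. Take the SD_ID field from that TBLS row and use it to look up the LOCATION field in the SDS table; LOCATION shows which database the emp table belongs to. The TBLS row with TBL_ID 6 is the table we are looking for, and that ID is used in the queries that follow.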
mysql> select * from SDS where SD_ID=6;
Next, query the PARTITIONS table using the TBL_ID from TBLS together with a fuzzy match (LIKE) on the Hive partition value, in order to obtain the PART_ID (2 in this example).
mysql> select * from PARTITIONS t where t.tbl_id=6 and PART_NAME like '%hunan%';
D Start deleting
Finally, with the TBL_ID from TBLS (6) and the PART_ID from PARTITIONS (2), the Hive partition can be deleted from the metastore.
mysql> delete from PARTITION_KEY_VALS where part_id=2;
Query OK, 2 rows affected (0.00 sec)
mysql> delete from PARTITION_PARAMS where part_id=2;
Query OK, 6 rows affected (0.01 sec)
mysql> delete from PARTITIONS where tbl_id=6 and part_id=2;
Query OK, 1 row affected (0.00 sec)
Delete the corresponding partition directory on HDFS
[root@hdp1 /root]# hdfs dfs -rm -r "/user/hive/warehouse/emp/province=hunan"
Deleted /user/hive/warehouse/emp/province=hunan
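The steps above edit the metastore tables by hand. Where the Hive CLI is still usable, the same partition can usually be removed with Hive's own ALTER TABLE ... DROP PARTITION statement instead, which is a different technique from the manual deletes shown here. The sketch below only builds and prints that statement; the helper name drop_partition_sql and the RUN_HIVE switch are hypothetical, and it assumes the hive client is on PATH when the switch is enabled.

```shell
#!/bin/sh
# Hypothetical helper: builds the HiveQL that drops one partition of an
# emp-style table partitioned by (province, city).
drop_partition_sql() {
    table=$1; province=$2; city=$3
    printf "ALTER TABLE %s DROP IF EXISTS PARTITION (province='%s', city='%s');\n" \
        "$table" "$province" "$city"
}

sql=$(drop_partition_sql emp hunan zhangjiajie)
echo "$sql"

# Execute only when explicitly requested (assumption: `hive -e` works on this host).
if [ "${RUN_HIVE:-0}" = "1" ]; then
    hive -e "$sql"
fi
```

For a managed table such as emp, dropping the partition this way normally removes the partition's data directory too, so the separate hdfs dfs -rm step should not be needed in that case.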
7 Verify that the partition has been deleted
The hunan partition data can no longer be queried:
hive> select * from emp where province='hunan';
OK
Time taken: 0.073 seconds