Deleting Hive Partitions Through the MySQL Metastore Tables


1 Create the table (run in the Hive CLI)

  

CREATE TABLE IF NOT EXISTS emp(
name STRING,
salary FLOAT,
subordinates ARRAY<STRING>,
deductions MAP<STRING,FLOAT>,
address STRUCT<street:STRING,city:STRING,province:STRING,zip:INT>
)
PARTITIONED BY (province STRING,city STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':';
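
To make the three delimiters in the table definition concrete, here is a minimal Python sketch of how one text row maps onto the emp columns: tabs separate fields, commas separate the items of the ARRAY, MAP and STRUCT columns, and colons separate map keys from values. The function name parse_emp_row is hypothetical and this is only an illustration of the layout, not how Hive itself parses rows.

```python
def parse_emp_row(line):
    """Split one text row according to the delimiters declared for emp."""
    # FIELDS TERMINATED BY '\t': five top-level columns.
    name, salary, subs, deds, addr = line.rstrip("\n").split("\t")
    # COLLECTION ITEMS TERMINATED BY ',': struct members are comma-separated.
    street, city, province, zip_code = addr.split(",")
    return {
        "name": name,
        "salary": float(salary),
        # ARRAY<STRING>: comma-separated items.
        "subordinates": subs.split(","),
        # MAP<STRING,FLOAT>: comma-separated entries, key and value split by ':'.
        "deductions": {
            key: float(value)
            for key, value in (entry.split(":") for entry in deds.split(","))
        },
        "address": {
            "street": street,
            "city": city,
            "province": province,
            "zip": int(zip_code),
        },
    }


# Sample row in the same shape as the data files used in this walkthrough.
row = parse_emp_row(
    "zj1\t10000\tjames,datacloase\tjim:1.2,james:2.1,lilly:3.8\thuaxing,xian,shanxi,1"
)
print(row["subordinates"], row["address"]["city"])
```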

  

2 Prepare the data

  shanxi.txt

  

zj1	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,1
zj2	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,2
zj3	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,3
zj4	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,4
zj5	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,5
zj6	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,xian,shanxi,6

  hunan.txt

zbq1	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,1
zbq2	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,2
zbq3	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,3
zbq4	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,4
zbq5	10000	james,datacloase	jim:1.2,james:2.1,lilly:3.8	huaxing,zhangjiajie,hunan,5
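
If you would rather generate the two files than type them by hand, the rows above follow a simple pattern. The sketch below rebuilds them in Python; the helper names make_rows and write_file are hypothetical, and the /tmp/logs paths in the usage note are taken from the LOAD DATA step.

```python
def make_rows(prefix, count, city, province):
    """Build tab-separated rows matching the sample data layout."""
    rows = []
    for i in range(1, count + 1):
        fields = [
            f"{prefix}{i}",                    # name
            "10000",                           # salary
            "james,datacloase",                # subordinates (array)
            "jim:1.2,james:2.1,lilly:3.8",     # deductions (map)
            f"huaxing,{city},{province},{i}",  # address (struct)
        ]
        rows.append("\t".join(fields))
    return rows


def write_file(path, rows):
    """Write the rows to path, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(rows) + "\n")


shanxi_rows = make_rows("zj", 6, "xian", "shanxi")
hunan_rows = make_rows("zbq", 5, "zhangjiajie", "hunan")
print(shanxi_rows[0])
```

Calling write_file("/tmp/logs/shanxi.txt", shanxi_rows) and write_file("/tmp/logs/hunan.txt", hunan_rows) would then produce the two input files, assuming the /tmp/logs directory already exists.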

  

3 Load the data (run in the Hive CLI)

  

LOAD DATA LOCAL INPATH '/tmp/logs/shanxi.txt' OVERWRITE INTO TABLE emp
PARTITION(province='shanxi',city='xian');

LOAD DATA LOCAL INPATH '/tmp/logs/hunan.txt' OVERWRITE INTO TABLE emp
PARTITION(province='hunan',city='zhangjiajie');

 

4 Query the Hive data

Hive table structure

 hive> describe extended emp;

Query the data in a Hive partition

hive> select * from emp where province='shanxi' and city = 'xian';

 

 

5 Inspect the Hive directory on HDFS

[root@hdp1 /tmp/logs]#hdfs dfs -ls /user/hive/warehouse/emp
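
Under the table directory, Hive stores each partition in nested key=value directories, one level per partition column, in the order given in PARTITIONED BY. The following sketch builds that path for the emp table; partition_dir is a hypothetical helper, and it deliberately ignores the escaping Hive applies to special characters in partition values.

```python
import posixpath


def partition_dir(table_location, partition_spec):
    """Join a table location with key=value directories for each partition column.

    partition_spec is an ordered list of (column, value) pairs.
    Assumption: values contain no characters that Hive would escape.
    """
    parts = [f"{key}={value}" for key, value in partition_spec]
    return posixpath.join(table_location, *parts)


path = partition_dir(
    "/user/hive/warehouse/emp",
    [("province", "hunan"), ("city", "zhangjiajie")],
)
print(path)
```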

 

6 Delete the hunan partition from Hive

  

A Log in to the MySQL instance used by Hive

B Switch to the hive database
mysql> use hive;
C Query the relevant tables
mysql> SELECT * FROM TBLS WHERE TBL_NAME='emp';

 



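The query returns one record. Take the SD_ID value from that TBLS row and look up the LOCATION field in the SDS table; LOCATION shows where the emp table's data is stored, and hence which database the table belongs to. The TBL_ID of this row, 6, is the ID of the table we are working with.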


mysql> select * from SDS where SD_ID=6;

 


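Next, query the PARTITIONS table using the TBL_ID from TBLS together with a fuzzy (LIKE) match on the partition column value. The goal is the PART_ID of the partition, which is 2 here.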

mysql> select * from PARTITIONS t where t.tbl_id=6 and PART_NAME like '%hunan%';
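
The lookup chain (TBLS to SDS for the location, then TBLS to PARTITIONS for the PART_ID) can be rehearsed safely before touching a real metastore. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, with only the columns used in this walkthrough. The inserted rows mirror the IDs seen here (TBL_ID 6, SD_ID 6, PART_ID 2); PART_ID 1 for the shanxi partition and the bare LOCATION path are assumptions for illustration, since a real LOCATION normally carries the full hdfs:// URI.

```python
import sqlite3

# Toy stand-in for the metastore; the real tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, SD_ID INTEGER, TBL_NAME TEXT);
    CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT);
    CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, TBL_ID INTEGER, PART_NAME TEXT);
    INSERT INTO TBLS VALUES (6, 6, 'emp');
    INSERT INTO SDS VALUES (6, '/user/hive/warehouse/emp');
    INSERT INTO PARTITIONS VALUES (1, 6, 'province=shanxi/city=xian');
    INSERT INTO PARTITIONS VALUES (2, 6, 'province=hunan/city=zhangjiajie');
    """
)


def find_partition_ids(conn, table_name, name_fragment):
    """Return (TBL_ID, LOCATION, [PART_ID, ...]) for partitions matching the fragment."""
    tbl_id, sd_id = conn.execute(
        "SELECT TBL_ID, SD_ID FROM TBLS WHERE TBL_NAME = ?", (table_name,)
    ).fetchone()
    location = conn.execute(
        "SELECT LOCATION FROM SDS WHERE SD_ID = ?", (sd_id,)
    ).fetchone()[0]
    part_ids = [
        row[0]
        for row in conn.execute(
            "SELECT PART_ID FROM PARTITIONS "
            "WHERE TBL_ID = ? AND PART_NAME LIKE ? ORDER BY PART_ID",
            (tbl_id, f"%{name_fragment}%"),
        )
    ]
    return tbl_id, location, part_ids


result = find_partition_ids(conn, "emp", "hunan")
print(result)
```

A fuzzy match can return more than one partition, so it is worth checking the returned PART_NAME values before deleting anything.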

 

D Start deleting

  Finally, with the TBL_ID from TBLS (6) and the PART_ID from PARTITIONS (2), the Hive partition can be deleted.

mysql> delete from PARTITION_KEY_VALS where part_id=2;
Query OK, 2 rows affected (0.00 sec)

mysql> delete from PARTITION_PARAMS where part_id=2;
Query OK, 6 rows affected (0.01 sec)

mysql> delete from PARTITIONS where tbl_id=6 and part_id=2;
Query OK, 1 row affected (0.00 sec)
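
The order of the three deletes matters: the rows in PARTITION_KEY_VALS and PARTITION_PARAMS refer to the partition by PART_ID, so they go first and the PARTITIONS row goes last. As a small sketch, the hypothetical helper below generates the same three statements for a given TBL_ID and PART_ID, forcing both to integers so that nothing unexpected ends up in the SQL text. It only builds strings; running them against the metastore is left to you.

```python
def build_partition_deletes(tbl_id, part_id):
    """Return the metastore DELETE statements for one partition, children first."""
    # int() raises ValueError for anything that is not a plain integer.
    tbl_id = int(tbl_id)
    part_id = int(part_id)
    return [
        f"DELETE FROM PARTITION_KEY_VALS WHERE PART_ID = {part_id};",
        f"DELETE FROM PARTITION_PARAMS WHERE PART_ID = {part_id};",
        f"DELETE FROM PARTITIONS WHERE TBL_ID = {tbl_id} AND PART_ID = {part_id};",
    ]


statements = build_partition_deletes(6, 2)
for statement in statements:
    print(statement)
```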

Delete the corresponding partition directory on HDFS

[root@hdp1 /root]#hdfs dfs -rm -r "/user/hive/warehouse/emp/province=hunan"
Deleted /user/hive/warehouse/emp/province=hunan

7 Verify that the partition was deleted
Queries no longer return any data for the hunan partition.

hive> select * from emp where province='hunan';
OK
Time taken: 0.073 seconds


 

