Creating Partitions in Hive Tables

Reposted article. 2018-11-16 17:05, tag: hive

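Purpose:

A Hive SELECT query normally scans the entire table, which wastes a lot of time on unnecessary work. Often only part of the data is of interest. When the table is partitioned and the query filters on the partition column, Hive only has to look inside the matching partition, which shortens the query.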
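The idea can be illustrated without Hive at all. The sketch below is only a local-filesystem analogy: it assumes a scratch directory from mktemp and a few made-up sample rows, and it mimics the one-directory-per-partition layout to show that a filter on dt touches fewer files than a full scan.

```shell
#!/usr/bin/env bash
# Local-filesystem analogy of a partitioned table (this is not Hive itself).
set -euo pipefail

table_dir="$(mktemp -d)/rating_table_p"   # hypothetical scratch location

# One subdirectory per partition value, named key=value as Hive does.
mkdir -p "$table_dir/dt=2009-12" "$table_dir/dt=2003-10"

# Made-up sample rows: userId, movieId, rating (tab-separated).
printf '1\t31\t2.5\n1\t1029\t3.0\n' > "$table_dir/dt=2009-12/part-0"
printf '2\t10\t4.0\n'               > "$table_dir/dt=2003-10/part-0"

# Full scan: every file under the table directory is read.
full_scan_rows=$(cat "$table_dir"/*/* | wc -l | tr -d ' ')

# Pruned scan: a filter on dt means only one directory is read.
pruned_rows=$(cat "$table_dir/dt=2009-12"/* | wc -l | tr -d ' ')

echo "rows read by full scan:   $full_scan_rows"
echo "rows read by pruned scan: $pruned_rows"
```

On a real table the saving grows with the number of partitions, since each skipped directory is data that never has to be read.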

1. Create the table

]# cat create_rating_table_p.sql
create external table rating_table_p
(userId STRING,
movieId STRING,
rating STRING
)
partitioned by (dt STRING)
row format delimited fields terminated by '\t'
lines terminated by '\n';

2. Load the data

LOAD DATA LOCAL INPATH '/usr/local/hive/test/hive_test_3/ml-latest-small/2009-12.data' OVERWRITE INTO TABLE rating_table_p partition(dt='2009-12');
LOAD DATA LOCAL INPATH '/usr/local/hive/test/hive_test_3/ml-latest-small/2003-09.data' OVERWRITE INTO TABLE rating_table_p partition(dt='2003-09');
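With many monthly files, typing one LOAD DATA statement per file gets tedious. As a rough sketch, the statements could be generated from the file names, assuming every file is named <dt>.data as in the two commands above. gen_load_statements and the scratch directory are hypothetical names introduced only for this illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print one LOAD DATA statement per <dt>.data file found in a directory.
# gen_load_statements is a hypothetical helper, not part of Hive.
gen_load_statements() {
  local data_dir="$1" table="$2" file dt
  for file in "$data_dir"/*.data; do
    [ -e "$file" ] || continue          # no matching files: print nothing
    dt="$(basename "$file" .data)"      # 2009-12.data -> 2009-12
    printf "LOAD DATA LOCAL INPATH '%s' OVERWRITE INTO TABLE %s PARTITION (dt='%s');\n" \
      "$file" "$table" "$dt"
  done
}

# Demo on a scratch directory holding two empty placeholder files.
demo_dir="$(mktemp -d)"
touch "$demo_dir/2003-09.data" "$demo_dir/2009-12.data"
load_sql="$(gen_load_statements "$demo_dir" rating_table_p)"
echo "$load_sql"
```

The printed statements can be redirected into a file and run with hive -f, which keeps the generation step separate from the actual load.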

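3. Check on HDFS: the directory named after the table contains two subdirectories named by date, and the data for each date is stored in the corresponding subdirectory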

]$ hdfs dfs -ls /user/hive/warehouse/rating_table_p
Found 2 items
drwxrwxrwx   - hadoop supergroup          0 2018-06-25 15:27 /user/hive/warehouse/rating_table_p/dt=2003-10
drwxrwxrwx   - hadoop supergroup          0 2018-06-25 15:26 /user/hive/warehouse/rating_table_p/dt=2009-12
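Each partition directory is named dt=<value>, so the partition value can be read straight off the path. A minimal sketch using bash parameter expansion follows; partition_value is a hypothetical helper, and it assumes a single partition column named dt and a path without a trailing slash.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Extract the partition value from a path whose last component is dt=<value>.
# partition_value is a hypothetical helper for illustration only.
partition_value() {
  local leaf="${1##*/}"    # drop everything up to the last slash
  echo "${leaf#dt=}"       # drop the "dt=" prefix
}

# The two paths shown in the listing above.
for path in \
  /user/hive/warehouse/rating_table_p/dt=2003-10 \
  /user/hive/warehouse/rating_table_p/dt=2009-12
do
  echo "$path holds partition value $(partition_value "$path")"
done
```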

4. Query the Hive table

hive> select userid, dt from rating_table_p where dt='2009-12' limit 10;
OK
1    2009-12
1    2009-12
1    2009-12
1    2009-12
1    2009-12
1    2009-12
1    2009-12
1    2009-12
1    2009-12
1    2009-12

5. Drop a partition

alter table rating_table_p drop if exists partition(dt='2003-10');

6. Add a partition

alter table rating_table_p add if not exists partition(dt='2003-10');
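When several partitions have to be registered at once, the same statement can be generated in a loop. This is only a sketch: gen_add_partitions is a hypothetical helper, and the month values passed to it are example arguments rather than partitions known to exist in the original data.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print one ALTER TABLE ... ADD PARTITION statement per partition value.
# gen_add_partitions is a hypothetical helper, not part of Hive.
gen_add_partitions() {
  local table="$1" dt
  shift                                  # remaining arguments are dt values
  for dt in "$@"; do
    printf "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (dt='%s');\n" "$table" "$dt"
  done
}

# Example values only.
add_sql="$(gen_add_partitions rating_table_p 2003-10 2003-11 2003-12)"
echo "$add_sql"
```

A partition added this way holds no rows until data is loaded into it or files are placed in its directory.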
