Merging small files in Hive with the Tez engine

Tags (space-separated): Hive



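-- Allow dynamic partitioning so the insert below can write every pt partition.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions=3000;
set hive.exec.max.dynamic.partitions.pernode=500;
-- Tez container memory in MB; the JVM heap (-Xmx) is kept below the container size.
set hive.tez.container.size=6656;
set hive.tez.java.opts=-Xmx5120m;
-- Run on Tez and merge small output files at the end of the job.
set hive.execution.engine=tez;
set hive.merge.tezfiles=true;
-- Both thresholds are in bytes: 1280000000 is about 1.28 GB, ten times the 128 MB used in the settings further below.
set hive.merge.smallfiles.avgsize=1280000000;
set hive.merge.size.per.task=1280000000;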


insert overwrite table zhaobo_test.lazy_st_rpt_priv_occupation_new partition (pt) select * from zhaobo_test.lazy_st_rpt_priv_occupation_new;


=============Tez merge========



Try using the Tez execution engine together with hive.merge.tezfiles. You may also want to set the size thresholds explicitly.

set hive.execution.engine=tez; -- use the Tez execution engine
set hive.merge.tezfiles=true; -- request a merge step at the end of Tez jobs
set hive.merge.smallfiles.avgsize=128000000; -- 128 MB: merge when the average output file is smaller than this
set hive.merge.size.per.task=128000000; -- 128 MB: target size of the merged files













================Merge============

If you want to use the MR engine instead, add the following settings (I haven't tried this personally):
set hive.merge.mapredfiles=true; -- request a merge step at the end of map-reduce jobs
set hive.merge.smallfiles.avgsize=128000000; -- 128 MB
set hive.merge.size.per.task=128000000; -- 128 MB
These settings add one more step that merges the output files; each resulting part file should be approximately 128 MB.
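
The role of hive.merge.smallfiles.avgsize can be illustrated with a small sketch. This is a simplified model of the documented behaviour (a merge step is started when the average output file size falls below the threshold), not Hive's actual implementation; the function name needs_merge and the example numbers are hypothetical.

```shell
#!/bin/bash
# Simplified model (an assumption, not Hive's code): would a merge step run?
# Arguments: total output bytes, number of output files, avgsize threshold in bytes.
needs_merge() {
    local total_bytes=$1
    local file_count=$2
    local avgsize=$3
    # No output files means there is nothing to merge.
    if [ "$file_count" -eq 0 ]; then
        echo "no"
        return
    fi
    # Integer average size of one output file.
    local avg=$(( total_bytes / file_count ))
    if [ "$avg" -lt "$avgsize" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# Hypothetical job output: 500 files totalling 1 GB (2 MB each on average).
needs_merge 1000000000 500 128000000
# Hypothetical job output: 4 files totalling 1 GB (250 MB each on average).
needs_merge 1000000000 4 128000000
```

With a 128 MB threshold, the first case falls under the average and would be merged, while the second would be left alone. Raising the threshold, as in the 1280000000 setting at the top, makes the merge step trigger for larger files as well.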

Get the partitions.

beeline -u jdbc:hive2://10.111.55.163:10000 -n deploy --showHeader=false --outputformat=tsv2 --silent=true -e "show partitions ods.t_city" > found_partitions.txt

Run the concatenation

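#!/bin/bash
# Concatenate the files of every partition listed in found_partitions.txt.
# Note: ALTER TABLE ... CONCATENATE is documented for RCFile and ORC tables.
# Replace database.table with the table whose partitions were listed (ods.t_city above).

while IFS= read -r line
do
    [ -z "$line" ] && continue   # skip blank lines
    echo "the next partition is $line"
    # Turn a=1/b=2 into a='1',b='2'
    partition=$(echo "$line" | sed -e 's/\//,/g' -e "s/=/='/g" -e "s/,/',/g")\'
    # </dev/null keeps beeline from consuming the rest of found_partitions.txt
    beeline -u jdbc:hive2://10.111.55.163:10000 -n deploy -e "alter table database.table partition($partition) concatenate" < /dev/null
done < found_partitions.txt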
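
To make the sed step easier to check, here is the same conversion as a standalone function. It is a minimal sketch: the function name to_partition_spec and the sample partition pt=20200101/city=bj are made up for illustration, and it assumes partition values contain no "/", "=" or "," characters.

```shell
#!/bin/bash
# Convert one line of "show partitions" output (e.g. pt=20200101/city=bj)
# into a spec usable in ALTER TABLE ... PARTITION (...).
to_partition_spec() {
    # 1) "/" becomes ","      pt=20200101,city=bj
    # 2) "=" becomes "='"     pt='20200101,city='bj
    # 3) "," becomes "',"     pt='20200101',city='bj
    # 4) append the closing quote at the end of the line
    printf '%s\n' "$1" | sed -e 's/\//,/g' -e "s/=/='/g" -e "s/,/',/g" -e "s/\$/'/"
}

# Hypothetical two-level partition.
to_partition_spec "pt=20200101/city=bj"
```

Running the function on a few lines of found_partitions.txt before starting the loop is a cheap way to confirm that the generated specs look right.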
