Implementation approach:
1. Every day in the early morning, export the previous day's incremental data from the business system to a text file and FTP it to a master node of the Hadoop cluster. The default upload path is /mnt/data/crawler/.
2. On the master node, a shell script invokes the hive command to load the local incremental file into a Hive staging table.
3. In the same shell script, Hive SQL merges the staging table's increments into the Hive main table: rows with matching ids are replaced, and new rows are inserted.
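To make the nightly hand-off in step 1 automatic, the two scripts described below can be scheduled with cron on the master node. This is only a sketch: the install path /opt/etl/, the log paths, and the run times are assumptions, not part of the original post; adjust the times to when the business system's FTP upload actually finishes.

```shell
# Hypothetical crontab entries on the Hadoop master node.
# /opt/etl/ and /var/log/etl/ are assumed paths for illustration.
# Load yesterday's file into the staging table at 00:30 ...
30 0 * * * /opt/etl/test_temp.sh >> /var/log/etl/test_temp.log 2>&1
# ... then merge the staging table into the main table at 00:45.
45 0 * * * /opt/etl/test.sh >> /var/log/etl/test.log 2>&1
```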
Implementation steps:
1. DDL: create the two tables, test_temp and test.
create table crawler.test_temp (
    id          string,
    name        string,
    email       string,
    create_time string
)
row format delimited fields terminated by ','
stored as textfile;

create table crawler.test (
    id          string,
    name        string,
    email       string,
    create_time string
)
partitioned by (dt string)
row format delimited fields terminated by '\t'
stored as orc;
2. Write the shell script test_temp.sh, which loads the local incremental data into the Hive staging table:
#!/bin/bash
##################################
# Usage:                         #
#   script_name [yyyymmdd]       #
# The date argument is optional; #
# it defaults to yesterday.      #
##################################
dt=''
table=test_temp

# current system date
sysdate=$(date +%Y%m%d)
# yesterday's date, format YYYYMMDD
yesterday=$(date -d yesterday +%Y%m%d)
# directory holding the uploaded data files
file_path=/mnt/data/crawler/

if [ $# -eq 1 ]; then
    dt=$1
elif [ $# -eq 0 ]; then
    dt=$yesterday
else
    echo "Invalid arguments!"
    # exit code: 0 = success, non-zero = failure
    exit 1
fi

filename=${file_path}${table}_${dt}.txt
if [ ! -e "$filename" ]; then
    echo "$filename: data file does not exist!"
    exit 1
fi

hive <<EOF
load data local inpath '$filename' overwrite into table crawler.$table;
EOF

if [ $? -eq 0 ]; then
    echo ""
    echo "$dt $table loaded successfully!"
else
    echo ""
    echo "$dt $table load failed!"
fi
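The date-defaulting and filename-construction logic of test_temp.sh can be checked in isolation, with no Hive installation. The fragment below repeats just that part of the script; the variable names match the script above.

```shell
#!/bin/sh
# Stand-alone check of test_temp.sh's date/filename logic (no Hive call).
table=test_temp
file_path=/mnt/data/crawler/

# GNU date; on BSD/macOS the equivalent is `date -v-1d +%Y%m%d`.
yesterday=$(date -d yesterday +%Y%m%d)

# Use the first argument if given, otherwise default to yesterday,
# mirroring the if/elif branch in the full script.
dt=${1:-$yesterday}

filename=${file_path}${table}_${dt}.txt
echo "$filename"
```

Run without arguments it prints yesterday's filename, e.g. /mnt/data/crawler/test_temp_YYYYMMDD.txt; run with an explicit date it prints the filename for that date.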
3. Write the shell script test.sh, which incrementally merges the staging data into the main table:
#!/bin/bash
##################################
table=test
# current system date
sysdate=$(date +%Y%m%d)

# Incremental merge: staging rows replace main-table rows with the
# same id; all other main-table rows are carried over unchanged.
hive <<EOF
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table crawler.test partition (dt)
select
    a.id,
    a.name,
    a.email,
    a.create_time,
    a.create_time as dt
from (
    select id, name, email, create_time
    from crawler.test_temp
    union all
    select t.id, t.name, t.email, t.create_time
    from crawler.test t
    left outer join crawler.test_temp t1
        on t.id = t1.id
    where t1.id is null
) a;
quit;
EOF

if [ $? -eq 0 ]; then
    echo "$sysdate $0 incremental extract completed!"
else
    echo "$sysdate $0 incremental extract failed!"
fi
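The union all plus left outer join in test.sh is effectively an upsert: every staging row wins, and a main-table row survives only when its id is absent from the staging table. The same semantics can be demonstrated on two small comma-delimited files with awk; the file names and sample rows below are invented for the demo.

```shell
#!/bin/sh
# Toy data: 'main' plays the existing table, 'temp' the daily increment.
printf '1,alice,a@x.com,20200101\n2,bob,b@x.com,20200101\n' > main.txt
printf '2,bob,bob@new.com,20200102\n3,carol,c@x.com,20200102\n' > temp.txt

# Pass 1 (NR==FNR, i.e. temp.txt): remember each id and emit the row
# -- the "select ... from crawler.test_temp" branch of the union.
# Pass 2 (main.txt): emit a row only if its id was NOT seen in temp
# -- the "left outer join ... where t1.id is null" branch.
awk -F',' 'NR==FNR {seen[$1]=1; print; next} !($1 in seen)' temp.txt main.txt \
    | sort -t',' -k1,1 > merged.txt

cat merged.txt
# 1,alice,a@x.com,20200101
# 2,bob,bob@new.com,20200102
# 3,carol,c@x.com,20200102
```

Note that bob's row now carries the new email from the increment, while alice's untouched row is preserved, which is exactly what insert overwrite on the main table achieves in Hive.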
https://www.aboutyun.com/forum.php?mod=viewthread&tid=20025&ordertype=1