Importing Hive Data into HBase

Reposted article, 2019-05-17 17:05. Tag: hbase

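Option 1: Map a Hive table to an HBase table

When to use: moderate data volumes, under about 4 TB (the data is written through the HBase API).

I. The HBase table does not exist yet

Create the Hive table hive_hbase_table mapped to the HBase table hbase_table. The HBase table hbase_table is created automatically and is dropped when the Hive table is dropped. You need to specify how the Hive schema maps to the HBase schema:

1. Create the table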

CREATE TABLE hive_hbase_table(key int, name String,age String) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:age") 
TBLPROPERTIES ("hbase.table.name" = "hbase_table", 
"hbase.mapred.output.outputtable" = "hbase_table");

2. Create an ordinary Hive source table and load some sample data

create table hive_data (key int,name String,age string);
insert into hive_data values(1,"za","13");
insert into hive_data values(2,"ff","44");

3. Load the data from the source Hive table hive_data into the HBase table hbase_table by inserting through the Hive table hive_hbase_table

insert into table hive_hbase_table select * from hive_data;

4. Check that the HBase table hbase_table now contains the data
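
One way to do this check is a limited scan from the command line. The sketch below assumes the `hbase` launcher is on the PATH and that your HBase version supports the non-interactive `-n` shell flag; the `LIMIT` of 10 is an arbitrary choice, not something from the original article. With the two sample rows above, the scan would be expected to show row keys 1 and 2 with the columns cf1:name and cf1:age.

```shell
# Table name taken from the CREATE TABLE statement in step 1.
TABLE="hbase_table"
# LIMIT keeps the scan cheap on large tables (10 is an arbitrary value).
SCAN_CMD="scan '${TABLE}', {LIMIT => 10}"

if command -v hbase >/dev/null 2>&1; then
  # -n runs the HBase shell non-interactively, reading commands from stdin.
  echo "${SCAN_CMD}" | hbase shell -n
else
  # Fallback: print the command so it can be pasted into an HBase shell.
  echo "hbase is not on PATH; run this in the HBase shell instead: ${SCAN_CMD}"
fi
```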

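II. The HBase table already exists

Create a Hive external table mapped to the existing HBase table, taking care that the Hive schema maps correctly to the HBase schema. Dropping the external table does not drop the underlying HBase table.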

CREATE EXTERNAL TABLE hive_hbase_external_table(key String, name string,sex String,age String,department String) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:sex,info:age,info:department") 
TBLPROPERTIES ("hbase.table.name" = "filtertest", 
"hbase.mapred.output.outputtable" = "filtertest");

The remaining steps are the same as above.
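
Because the external table sits on top of an HBase table that already holds data, a quick sanity check of the column mapping is to read a few rows back through Hive. This is only a sketch: it assumes the `hive` CLI is on the PATH and that the HBase table filtertest already contains rows; the `LIMIT` value is arbitrary.

```shell
# Query the existing HBase data through the Hive external table defined above.
QUERY="SELECT * FROM hive_hbase_external_table LIMIT 10;"

if command -v hive >/dev/null 2>&1; then
  # hive -e executes the quoted statement and exits.
  hive -e "${QUERY}"
else
  # Fallback: print the statement so it can be run in a Hive session.
  echo "hive is not on PATH; run this in a Hive session instead: ${QUERY}"
fi
```

If columns come back as NULL, the family and qualifier names in hbase.columns.mapping are the first thing to compare against the HBase table.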

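Option 2: Generate HFiles from the Hive table and bulk load them into HBase

1. When to use: large data volumes (4 TB and above)

2. Convert the Hive data to HFiles

3. Start Hive and add the required HBase JARs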

add jar /mnt/hive/lib/hive-hbase-handler-2.1.1.jar;
add jar /mnt/hive/lib/hbase-common-1.1.1.jar;
add jar /mnt/hive/lib/hbase-client-1.1.1.jar;
add jar /mnt/hive/lib/hbase-protocol-1.1.1.jar;
add jar /mnt/hive/lib/hbase-server-1.1.1.jar;

4. Create a Hive table whose output format is HiveHFileOutputFormat

Here /tmp/hbase_table_hfile/cf_0 is the HDFS path where the HFiles are written, and cf_0 is the name of the HBase column family.

create table hbase_hfile_table(key int, name string,age String) 
stored as
INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat'
TBLPROPERTIES ('hfile.family.path' = '/tmp/hbase_table_hfile/cf_0');

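5. Write the source table's data out as HFiles through the hbase_hfile_table table. HFiles must be written in ascending row-key order, so for data that is not already ordered by key the SELECT typically needs to sort on the key column.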

insert into table hbase_hfile_table select * from hive_data;

6. Check that HFiles were generated under the corresponding HDFS path
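
A directory listing is enough for this check. The sketch below assumes the `hdfs` client is on the PATH and configured for the cluster; the path is the hfile.family.path value from step 4.

```shell
# Directory configured as hfile.family.path in step 4.
HFILE_DIR="/tmp/hbase_table_hfile/cf_0"

if command -v hdfs >/dev/null 2>&1; then
  # List the generated HFiles with human-readable sizes.
  hdfs dfs -ls -h "${HFILE_DIR}"
else
  # Fallback: print the command for use on a node that has the HDFS client.
  echo "hdfs is not on PATH; run on a cluster node: hdfs dfs -ls -h ${HFILE_DIR}"
fi
```

A successful bulk load normally moves the HFiles into HBase's own storage, so this listing is meaningful before the load in step 7 rather than after it.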

7. Bulk load the data into the HBase table

Create the table: use the HBase client to create an HBase table with the same column family as above

create 'hbase_hfile_load_table','cf_0'

Download the HBase client, configure hbase-site.xml, and copy hdfs-site.xml and core-site.xml into the hbase/conf directory.

Load:

 hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
 hdfs://master:9000/tmp/hbase_table_hfile/  hbase_hfile_load_table

8. Verify the result
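
A row count plus a limited scan of the target table is one way to verify the load. As before, this is a sketch that assumes the `hbase` launcher is on the PATH and supports the non-interactive `-n` flag; the `LIMIT` of 10 is an arbitrary choice.

```shell
# Target table created in step 7.
TABLE="hbase_hfile_load_table"

if command -v hbase >/dev/null 2>&1; then
  # Feed both commands to a non-interactive HBase shell session.
  hbase shell -n <<EOF
count '${TABLE}'
scan '${TABLE}', {LIMIT => 10}
EOF
else
  # Fallback: print the commands so they can be pasted into an HBase shell.
  echo "hbase is not on PATH; run in the HBase shell: count '${TABLE}' and scan '${TABLE}', {LIMIT => 10}"
fi
```

The count should match the number of rows written in step 5, and the scanned cells should appear under the cf_0 column family.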
