Creating Hive Tables and Importing/Exporting Data


Creating tables:

create EXTERNAL table tabtext(IMSI string,
MDN string,
MEID string,
NAI string,
DestinationIP string,
DestinationPort string,
SourceIP string,
SourcePort string,
PDSNIP string,
PCFIP string,
HAIP string,
UserZoneID string,
BSID string,
Subnet string,
ServiceOption string,
ProtocolID string,
ServiceType string,
StartTime string,
EndTime string,
Duration string,
InputOctets string,
OutputOctets string,
InputPacket string,
OutputPacket string,
SessionID string,
RecordCloseCause string,
UserAgent string,
DestinationURL string,
DomainName string,
Host string,
ContentLen string,
ContentType string,
IfLink string,
Refer string,
HttpAction string,
HttpStatus string,
RespDelay string,
BehaviorTarget string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|';

load data inpath '/user/vendorultrapower/ck/car.txt' into table tabtext;
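
With the default delimited text format, Hive does not reject rows whose field count differs from the table definition: missing trailing fields are read as NULL and extra fields are ignored. Before loading, it can therefore help to check that every record has the 38 pipe-separated fields that tabtext declares. A minimal sketch, assuming a local copy of the data; sample.txt and its contents are made up for illustration:

```shell
# Build a small made-up sample: one record with 38 fields and one with only 3.
good=$(printf 'f%s|' $(seq 1 37))f38
printf '%s\n' "$good" > sample.txt
printf 'a|b|c\n' >> sample.txt

# Report every line whose field count is not 38 (line number and actual count).
bad_lines=$(awk -F'|' 'NF != 38 { print NR ": " NF " fields" }' sample.txt)
echo "$bad_lines"
```

Any line reported here would load without error but leave the remaining columns NULL.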


set mapreduce.job.queuename=root.vendor.ven3;

create EXTERNAL table unmatch(url string);

load data local inpath '/home/vendorultrapower/ck/notnatch.txt' into table unmatch;

 

 

Importing and exporting data:

 

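Common ways to import data into Hive
Four are covered here:
(1) Load data from the local file system into a Hive table;
(2) Load data from HDFS into a Hive table;
(3) Query data from another table and insert it into a Hive table;
(4) Create a table and populate it in the same statement from a query on another table.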

1. Load data from the local file system into a Hive table
1.1
[hadoop@h91 hive-0.9.0-bin]$ bin/hive
Create the table ha:
hive> create table ha(id int,name string)
> row format delimited
> fields terminated by '\t'
> stored as textfile;

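The [ROW FORMAT DELIMITED] clause declares that rows are stored as delimited text; the fields terminated by '\t' line that follows it sets the column delimiter Hive expects when it reads the loaded data.
The [STORED AS file_format] clause sets the file format in which the table's data is stored (not a data type); the default is TEXTFILE. If the data is plain text, use [STORED AS TEXTFILE]: the file can then be copied from the local file system to HDFS unchanged and Hive can read it directly.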

1.2
A text file on the local operating system (the two columns are separated by a tab):
[hadoop@h91 ~]$ cat haha.txt
101 zs
102 ls
103 ww
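
Because the table ha declares fields terminated by '\t', the columns in haha.txt need to be separated by real tab characters; a line that uses spaces instead would not be split into two columns and would read back as NULLs. One way to create and check such a file, as a sketch (the file is written to the current directory here rather than /home/hadoop):

```shell
# Write three rows with a literal tab between id and name.
printf '101\tzs\n102\tls\n103\tww\n' > haha.txt

# Confirm that every line has exactly two tab-separated fields.
awk -F'\t' 'NF != 2 { bad = 1 } END { exit bad + 0 }' haha.txt && echo "haha.txt OK"
```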

1.3 Import the data
hive> load data local inpath '/home/hadoop/haha.txt' into table ha;
hive> select * from ha;

*****
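Unlike the relational databases most of us are used to, Hive releases before 0.14 (including the 0.9.0 used here) cannot take a list of literal records in an insert statement; in other words, they do not support statements of the form INSERT INTO ... VALUES.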
*****
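
From Hive 0.14 onward, INSERT INTO ... VALUES is accepted. A minimal sketch, assuming the table ha from step 1.1 and a hive client on the PATH; the script name insert_values.hql and the two inserted rows are made up for illustration:

```shell
# Save the statements to a script file (insert_values.hql is a made-up name).
cat > insert_values.hql <<'EOF'
-- Requires Hive 0.14 or later; assumes the table ha(id int, name string) from step 1.1.
INSERT INTO TABLE ha VALUES (104, 'zl'), (105, 'sq');
SELECT * FROM ha;
EOF

# Run the script only when a hive client is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f insert_values.hql
else
  echo "hive not found on PATH; script saved to insert_values.hql"
fi
```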

--------------------------------------------------
2.
Load data from HDFS into a Hive table

2.1
[hadoop@h91 hadoop-0.20.2-cdh3u5]$ bin/hadoop fs -mkdir abc

[hadoop@h91 ~]$ cat hehe.txt
1001 aa
1002 bb
1003 cc

[hadoop@h91 hadoop-0.20.2-cdh3u5]$ bin/hadoop fs -put /home/hadoop/hehe.txt abc/.
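(Upload the file to HDFS; the relative path abc resolves to /user/hadoop/abc, under the HDFS home directory of the hadoop user.)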

2.2
hive> create table he(id int,name string)
> row format delimited
> fields terminated by '\t'
> stored as textfile;

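Import the data (note that load data inpath moves the HDFS file into the table's directory rather than copying it):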
hive> load data inpath '/user/hadoop/abc/hehe.txt' into table he;

---------------------------------------------------------
3. Query data from another table and insert it into a Hive table

3.1
hive> select * from he;
OK
1001 aa
1002 bb
1003 cc

hive> create table heihei(id int,name string)
> row format delimited
> fields terminated by '\t'
> stored as textfile;

3.2
hive> insert into table heihei select * from he;


hive> insert overwrite table heihei select * from ha;
(insert overwrite replaces the existing contents of the table, whereas insert into appends to them)

--------------------------------------------------
4. Create a table and populate it in the same statement from a query on another table
hive> create table gaga as select * from he;

 

================================================================
Exporting data
(1) Export to the local file system;
(2) Export to HDFS;
(3) Export to another Hive table.

1. Export to the local file system
hive> insert overwrite local directory '/home/hadoop/he1' select * from he;

[hadoop@h91 ~]$ cd he1
(he1 is a directory; it contains the file 000000_0)
[hadoop@h91 he1]$ cat 000000_0
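(The columns appear to run together: Hive writes them with its default field delimiter, Ctrl-A (\001), which is a non-printing character.)

A visible separator can be added as follows; the \001 delimiter is still written between the two output columns: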
hive> insert overwrite local directory '/home/hadoop/he1' select id,concat('\t',name) from he;
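
An alternative to changing the query is to convert the Ctrl-A (\001) delimiter after the export. The sketch below simulates an exported file rather than reading a real one, so the contents of 000000_0 and the output name he.tsv are illustrative assumptions:

```shell
# Simulate a file as Hive would write it: fields separated by \001 (Ctrl-A).
printf '1001\001aa\n1002\001bb\n1003\001cc\n' > 000000_0

# Replace each Ctrl-A with a tab to get a readable, tab-separated file.
tr '\001' '\t' < 000000_0 > he.tsv
cat he.tsv
```

On Hive 0.11 and later, another option is to place row format delimited fields terminated by '\t' between the directory name and the select in insert overwrite local directory, so that the file is written tab-separated in the first place.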

******
Unlike importing data into Hive, exporting to a directory cannot be done with insert into; only insert overwrite is accepted.
******

---------------------------------------------------------
2. Export to HDFS
hive> insert overwrite directory '/user/hadoop/abc' select * from he;
(/user/hadoop/abc is a directory on HDFS; insert overwrite directory replaces whatever the directory already contains)

[hadoop@h91 hadoop-0.20.2-cdh3u5]$ bin/hadoop fs -ls abc
[hadoop@h91 hadoop-0.20.2-cdh3u5]$ bin/hadoop fs -cat abc/000000_0

-------------------------------------------------------------
3. Export to another Hive table (the target table, he12 here, must already exist)
hive> insert into table he12 select * from he;

 

