Hive is best suited to batch jobs over large volumes of immutable data.
1. Creating tables
CREATE TABLE [IF NOT EXISTS] table_name
(col_name data_type)
CREATE TABLE creates a table. If a table with the same name already exists, an exception is thrown; use IF NOT EXISTS to suppress that exception.
For example: CREATE TABLE BCUSTOMER(
CST_ID INT,
CST_NAME STRING
);
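As a small sketch of the IF NOT EXISTS form, the statement below can be run repeatedly. When BCUSTOMER already exists it is skipped instead of raising an exception, and the existing table definition is left unchanged:
-- Safe to re-run: does nothing if BCUSTOMER is already defined
CREATE TABLE IF NOT EXISTS BCUSTOMER(
CST_ID INT,
CST_NAME STRING
);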
Creating an external table:
CREATE EXTERNAL TABLE page_view(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User',
country STRING COMMENT 'country of origination')
COMMENT 'This is the staging page view table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
STORED AS TEXTFILE
LOCATION '<hdfs_location>';
Creating a partitioned table:
CREATE TABLE par_table(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(`date` STRING, pos STRING) -- DATE is a reserved word in newer Hive releases, so it is quoted
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS SEQUENCEFILE;
Creating a bucketed table:
CREATE TABLE par_table(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(`date` STRING, pos STRING) -- DATE is a reserved word in newer Hive releases, so it is quoted
CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS SEQUENCEFILE;
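One reason to bucket a table is cheap sampling. As a hedged sketch, assuming the bucketed par_table above has been populated, the query below reads roughly one of its 32 buckets (the alias s is arbitrary):
-- Sample bucket 1 of 32, using the same column the table is clustered by
SELECT * FROM par_table TABLESAMPLE(BUCKET 1 OUT OF 32 ON userid) s;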
Create a table with ds as its partition column:
hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
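A short sketch of how the ds column of invites might be used afterwards. The partition value '2008-08-15' is only an illustrative assumption:
-- Register a partition, list the partitions, then read just that partition
ALTER TABLE invites ADD IF NOT EXISTS PARTITION (ds='2008-08-15');
SHOW PARTITIONS invites;
SELECT foo, bar FROM invites WHERE ds='2008-08-15';
Filtering on ds lets Hive read only the matching partition directory instead of scanning the whole table.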
Copy a table as an empty table (copies the structure of an existing table, but not its data):
CREATE TABLE empty_key_value_store
LIKE key_value_store;
Show all tables: SHOW TABLES;
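Add columns to a table, or replace its column list (REPLACE COLUMNS drops all existing columns and defines the new set):
ALTER TABLE BCUSTOMER ADD COLUMNS (new_col INT);
ALTER TABLE BCUSTOMER REPLACE COLUMNS (CST_ID INT, CST_NAME STRING, new_col INT);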
Add a column with a column comment:
ALTER TABLE BCUSTOMER ADD COLUMNS (new_col INT COMMENT 'a comment');
Rename a table:
ALTER TABLE BCUSTOMER RENAME TO NEWNAME;
Drop a table:
DROP TABLE IF EXISTS IKEA.TEMP_IF_REPORT_CHECK_CHERXU;
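Drop a column: Hive has no DROP COLUMN statement. Use REPLACE COLUMNS instead and list only the columns to keep:
ALTER TABLE table_name REPLACE COLUMNS (col_to_keep data_type);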
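Hive SQL has no DELETE or UPDATE for ordinary (non-transactional) tables; INSERT OVERWRITE can be used instead.
Suppose you want to delete the customer with CST_ID=1:
INSERT OVERWRITE TABLE BCUSTOMER
SELECT * FROM BCUSTOMER WHERE CST_ID != 1;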
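An update can be emulated with OVERWRITE as well. The following is a minimal sketch, not the only approach; the new value 'NEW_NAME' is a made-up placeholder:
-- Rewrite the whole table, changing CST_NAME only for CST_ID = 1
INSERT OVERWRITE TABLE BCUSTOMER
SELECT CST_ID,
CASE WHEN CST_ID = 1 THEN 'NEW_NAME' ELSE CST_NAME END AS CST_NAME
FROM BCUSTOMER;
Every row is rewritten, so this suits occasional batch corrections rather than frequent single-row changes.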
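Older Hive versions do not support the implicit (comma) join notation; it was added in Hive 0.13.
select * from a,b where a.key=b.key (not accepted)
Write it as an explicit join instead:
select * from a join b on a.key=b.key;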
