1. Log in to Hive with the hive command, then run show databases; — you can see that Hive ships with one built-in database named default.

[root@hadoop hive]# hive
Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
OK
default        # Hive ships with a single built-in database, default
Time taken: 21.043 seconds, Fetched: 1 row(s)
hive>
Then log in to MySQL. show databases; lists the databases; note that there is a hive database. use hive; switches into it, show tables; lists its tables, and select * from DBS; shows the metadata for Hive's built-in default database.

[root@hadoop ~]# mysql -uroot -proot
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 24
Server version: 5.6.40-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.32 sec)

mysql> use hive
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive            |
+---------------------------+
| AUX_TABLE                 |
| BUCKETING_COLS            |
| CDS                       |
| COLUMNS_V2                |
| COMPACTION_QUEUE          |
| COMPLETED_COMPACTIONS     |
| COMPLETED_TXN_COMPONENTS  |
| DATABASE_PARAMS           |
| DBS                       |
| DB_PRIVS                  |
| DELEGATION_TOKENS         |
| FUNCS                     |
| FUNC_RU                   |
| GLOBAL_PRIVS              |
| HIVE_LOCKS                |
| IDXS                      |
| INDEX_PARAMS              |
| KEY_CONSTRAINTS           |
| MASTER_KEYS               |
| NEXT_COMPACTION_QUEUE_ID  |
| NEXT_LOCK_ID              |
| NEXT_TXN_ID               |
| NOTIFICATION_LOG          |
| NOTIFICATION_SEQUENCE     |
| NUCLEUS_TABLES            |
| PARTITIONS                |
| PARTITION_EVENTS          |
| PARTITION_KEYS            |
| PARTITION_KEY_VALS        |
| PARTITION_PARAMS          |
| PART_COL_PRIVS            |
| PART_COL_STATS            |
| PART_PRIVS                |
| ROLES                     |
| ROLE_MAP                  |
| SDS                       |
| SD_PARAMS                 |
| SEQUENCE_TABLE            |
| SERDES                    |
| SERDE_PARAMS              |
| SKEWED_COL_NAMES          |
| SKEWED_COL_VALUE_LOC_MAP  |
| SKEWED_STRING_LIST        |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES             |
| SORT_COLS                 |
| TABLE_PARAMS              |
| TAB_COL_STATS             |
| TBLS                      |
| TBL_COL_PRIVS             |
| TBL_PRIVS                 |
| TXNS                      |
| TXN_COMPONENTS            |
| TYPES                     |
| TYPE_FIELDS               |
| VERSION                   |
| WRITE_SET                 |
+---------------------------+
57 rows in set (0.00 sec)

mysql> select * from DBS;    # metadata for Hive's built-in default database
+-------+-----------------------+----------------------------------------+---------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                        | NAME    | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
|     1 | Default Hive database | hdfs://hadoop:9000/user/hive/warehouse | default | public     | ROLE       |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
1 row in set (0.00 sec)

mysql>
2. Create a test database in Hive
hive> create database testhive;     # create a database
OK
Time taken: 3.45 seconds
hive> show databases;               # list databases
OK
default
testhive
Time taken: 1.123 seconds, Fetched: 2 row(s)
Checking in MySQL, the test database's metadata now appears (including testhive's DB_ID, its storage location on HDFS, and so on).
mysql> select * from DBS;
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                    | NAME     | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
|     1 | Default Hive database | hdfs://hadoop:9000/user/hive/warehouse             | default  | public     | ROLE       |
|     6 | NULL                  | hdfs://hadoop:9000/user/hive/warehouse/testhive.db | testhive | root       | USER       |
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
2 rows in set (0.00 sec)
Check in HDFS to see what testhive.db actually is. It is simply a directory, so creating a database really amounts to creating a directory.
Note: I had created the HDFS directory /usr/hive/warehouse/, yet the database was saved under /user/hive/warehouse/. This is not an error: unless it is overridden in hive-site.xml, Hive's hive.metastore.warehouse.dir property defaults to /user/hive/warehouse, so Hive used its default path rather than the directory I created.
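You can confirm which warehouse directory Hive is actually using from the Hive CLI; set with a property name prints its current value:

```sql
-- Print the effective warehouse directory; unless overridden in
-- hive-site.xml this shows the built-in default, /user/hive/warehouse
set hive.metastore.warehouse.dir;
```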
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x   - root supergroup          0 2018-07-27 15:17 /user/hive/warehouse/testhive.db
3. Create a table
hive> use testhive;                 # switch to the database
OK
Time taken: 0.131 seconds
hive> create table test(id int);    # create a table
OK
Time taken: 3.509 seconds
In MySQL, the table's metadata shows that test belongs to the database with DB_ID 6, i.e. testhive (compare with select * from DBS;).
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
|      1 |  1532677542 |     6 |                0 | root  |         0 |     1 | test     | MANAGED_TABLE | NULL               | NULL               |                    |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
1 row in set (0.01 sec)
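Rather than matching DB_ID by eye, the two metastore tables can be joined directly in MySQL (a sketch; the column names are the ones visible in the TBLS and DBS output above):

```sql
-- Run inside the MySQL `hive` database: list each Hive table together
-- with the database it belongs to and that database's HDFS location
SELECT t.TBL_ID, t.TBL_NAME, d.NAME AS db_name, d.DB_LOCATION_URI
FROM TBLS t
JOIN DBS d ON t.DB_ID = d.DB_ID;
```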
Checking HDFS shows that a new directory was created for the table:
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db
Found 1 items
drwxr-xr-x   - root supergroup          0 2018-07-27 16:03 /user/hive/warehouse/testhive.db/test
4. Insert data.
4.1 Insert a row with insert into test values (1); — you can see Hive turning the insert into a MapReduce job.

hive> insert into test values (1);
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20180727155527_5971c7d8-9b5c-4ef3-98f7-63febe38c79a
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1532671010251_0001, Tracking URL = http://hadoop:8088/proxy/application_1532671010251_0001/
Kill Command = /usr/local/hadoop/bin/hadoop job  -kill job_1532671010251_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2018-07-27 16:02:25,979 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.32 sec
MapReduce Total cumulative CPU time: 3 seconds 320 msec
Ended Job = job_1532671010251_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://hadoop:9000/user/hive/warehouse/testhive.db/test/.hive-staging_hive_2018-07-27_15-55-27_353_3121708441542170724-1/-ext-10000
Loading data to table testhive.test
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 3.32 sec   HDFS Read: 3951 HDFS Write: 71 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 320 msec
OK
Time taken: 453.982 seconds
In HDFS, the inserted row has been written out as a file, 000000_0:
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:01 /user/hive/warehouse/testhive.db/test/000000_0
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0
1
4.2 Insert another row with insert into test values (2); — again Hive runs a MapReduce job.
hive> insert into test values (2);
In HDFS, this insert produced a second file, 000000_0_copy_1:
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found 2 items
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:01 /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:22 /user/hive/warehouse/testhive.db/test/000000_0_copy_1
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0_copy_1
2
4.3 Insert one more row with insert into test values (3); — once more via MapReduce.
In HDFS, this produced yet another file, 000000_0_copy_2:
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found 3 items
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:01 /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:22 /user/hive/warehouse/testhive.db/test/000000_0_copy_1
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:37 /user/hive/warehouse/testhive.db/test/000000_0_copy_2
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0_copy_2
3
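Since every single-row INSERT launches its own MapReduce job and leaves another small file under the table directory, batching the values into one statement is usually preferable (a sketch; Hive accepts multiple row constructors in a single INSERT since 0.14):

```sql
-- One MapReduce job and one output file instead of three
insert into test values (1), (2), (3);
```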
4.4 View the table in Hive
hive> select * from test;
OK
1
2
3
Time taken: 5.483 seconds, Fetched: 3 row(s)
5. Load data from a local file
First create the file:
[root@hadoop ~]# vi hive.txt        # create the file with the following contents
4
5
6
7
8
9
0
# save and quit
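The interactive vi session above can also be done non-interactively (a sketch using printf; writing to /tmp here to keep the example side-effect free, whereas the load command below expects /root/hive.txt):

```shell
# Build the same data file without an editor: one value per line
printf '%s\n' 4 5 6 7 8 9 0 > /tmp/hive.txt
cat /tmp/hive.txt
```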
Then load it:
hive> load data local inpath '/root/hive.txt' into table testhive.test;    # load the data
Loading data to table testhive.test
OK
Time taken: 6.282 seconds
Back in Hive, the file's contents are now mapped onto the table's column:
hive> select * from test;
OK
1
2
3
4
5
6
7
8
9
0
Time taken: 0.534 seconds, Fetched: 10 row(s)
In HDFS, hive.txt was copied into the test table's directory:
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found 4 items
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:01 /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:22 /user/hive/warehouse/testhive.db/test/000000_0_copy_1
-rwxr-xr-x   1 root supergroup          2 2018-07-27 16:37 /user/hive/warehouse/testhive.db/test/000000_0_copy_2
-rwxr-xr-x   1 root supergroup         14 2018-07-27 16:48 /user/hive/warehouse/testhive.db/test/hive.txt
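LOAD DATA ... INTO appends, which is why the three earlier insert files still sit alongside hive.txt. To replace a table's contents instead, add the OVERWRITE keyword (a sketch; this first deletes the existing files under the table directory):

```sql
-- Replace everything currently in testhive.test with the file's rows
load data local inpath '/root/hive.txt' overwrite into table testhive.test;
```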
6. Hive also supports sorting: select * from test order by id desc; — this too runs a MapReduce job, this time with a reduce phase.

hive> select * from test order by id desc;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20180730093619_c798eb69-b94f-4678-94cc-5ec56865ed5c
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1532913019648_0001, Tracking URL = http://hadoop:8088/proxy/application_1532913019648_0001/
Kill Command = /usr/local/hadoop/bin/hadoop job  -kill job_1532913019648_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-07-30 09:38:13,904 Stage-1 map = 0%,  reduce = 0%
2018-07-30 09:39:09,656 Stage-1 map = 13%,  reduce = 0%, Cumulative CPU 1.66 sec
2018-07-30 09:39:14,311 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.72 sec
2018-07-30 09:39:49,708 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 5.41 sec
MapReduce Total cumulative CPU time: 5 seconds 930 msec
Ended Job = job_1532913019648_0001
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 5.93 sec   HDFS Read: 6799 HDFS Write: 227 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 930 msec
OK
9
8
7
6
5
4
3
2
1
0
Time taken: 224.27 seconds, Fetched: 10 row(s)
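The log above shows order by funnels everything through a single reducer to get a total ordering, which can be slow on large tables. Hive's sort by only orders rows within each reducer (a sketch of the common alternative):

```sql
-- Per-reducer ordering only: faster in parallel, but not a global sort
select * from test sort by id desc;
```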
7. Hive also supports desc test; to show a table's schema
hive> desc test;
OK
id                      int
Time taken: 6.194 seconds, Fetched: 1 row(s)
Working with Hive feels much like working with MySQL. Its drawback is that row-level update and delete are unavailable on ordinary tables (they require ACID transactional tables); its advantage is that you never have to write MapReduce jobs yourself: simple SQL-like statements express complex processing.
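On ordinary (non-transactional) tables, the usual workaround is to rewrite the data rather than modify it in place (a sketch; INSERT OVERWRITE recomputes the table from a query):

```sql
-- "Delete" the row with id = 2 by rewriting the table without it
insert overwrite table test select id from test where id <> 2;
```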
There is a lot more to Hive; I'll write it up as I use it.