Hive, Part 1: Installing and Deploying Hive 2.1.1

This article is a repost. 2018-06-16 22:52, tagged hive

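I. Hive Run Modes

Like Hadoop, Hive has three run modes:

1. Embedded mode

Metadata is stored in a local, embedded Derby database. This is the simplest way to use Hive, but it has an obvious drawback: an embedded Derby database can be opened by only one process at a time, so this mode does not support multiple concurrent sessions.

2. Local mode

Metadata is stored in a standalone database on the local machine (usually MySQL), which makes multi-session and multi-user connections possible.

3. Remote mode

This mode suits deployments with many Hive clients. MySQL is moved onto its own server and the metadata is stored in that remote, standalone MySQL service, so there is no need to install a redundant MySQL service on every client.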
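In practice the three modes differ mainly in where the metastore's JDBC connection points. As a rough sketch, the hypothetical helper below (metastore_mode is not part of Hive) classifies a javax.jdo.option.ConnectionURL value along the lines described above. It assumes that a Derby URL means embedded mode and that a MySQL host of localhost or 127.0.0.1 means local mode.

```shell
# Hypothetical helper, not a Hive command: guess the run mode from a
# metastore JDBC URL, following the three modes described in this article.
metastore_mode() {
  case "$1" in
    jdbc:derby:*)
      echo "embedded" ;;
    jdbc:mysql://localhost[:/]*|jdbc:mysql://127.0.0.1[:/]*)
      echo "local" ;;
    jdbc:mysql://*)
      echo "remote" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, metastore_mode 'jdbc:mysql://192.168.169.134:3306/hive?createDatabaseIfNotExist=true' prints remote.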

II. Downloading and Installing Hive

http://hive.apache.org/downloads.html

tar -xzvf apache-hive-2.1.1-bin.tar.gz   # extract the archive
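If the extraction is scripted, it can help to fail early when the archive is missing and to choose the target directory explicitly. The function below is a minimal sketch; extract_hive and its two arguments are illustrative names, not anything shipped with Hive.

```shell
# Hypothetical helper: extract a Hive binary tarball into a chosen directory.
# $1 = path to the .tar.gz file, $2 = destination directory.
extract_hive() {
  tarball="$1"
  dest="$2"
  if [ ! -f "$tarball" ]; then
    echo "archive not found: $tarball" >&2
    return 1
  fi
  # Create the destination if needed, then unpack into it.
  mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}
```

A call such as extract_hive apache-hive-2.1.1-bin.tar.gz /home/duanxz/hive would produce the directory layout used in the rest of this article.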

III. Configuring System Environment Variables

Edit /etc/profile as root (vi /etc/profile) and add:

export JAVA_HOME="/usr/local/jdk1.8.0_172"
export HIVE_HOME=/home/duanxz/hive/apache-hive-2.1.1-bin
export PATH="$PATH:$JAVA_HOME/bin:$HIVE_HOME/bin:$HIVE_HOME/conf"

Make the environment variables take effect:

source /etc/profile
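A quick way to confirm that the variables took effect is a small sanity check. The sketch below uses a hypothetical function name, check_hive_env, and only tests that both variables are set and that the hive launcher is executable.

```shell
# Hypothetical helper: verify the environment set up in /etc/profile.
check_hive_env() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ -z "${HIVE_HOME:-}" ]; then
    echo "HIVE_HOME is not set" >&2
    return 1
  fi
  # The launcher script lives in $HIVE_HOME/bin in the binary distribution.
  if [ ! -x "$HIVE_HOME/bin/hive" ]; then
    echo "no executable found at $HIVE_HOME/bin/hive" >&2
    return 1
  fi
  echo "ok"
}
```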

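IV. Embedded Mode

(1) Edit the Hive configuration files

$HIVE_HOME/conf is the Hive configuration directory, much like the one in HBase covered earlier. The hive-site.xml file in that directory is the main Hive configuration file. It does not exist by default, so create it by copying the template:

cp hive-default.xml.template hive-site.xml

The main settings in hive-site.xml are: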

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>

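hive.metastore.warehouse.dir
This parameter sets the directory where Hive stores its data. The default is /user/hive/warehouse on HDFS; the configuration above sets it to /hive/warehouse.

hive.exec.scratchdir
This parameter sets the directory for Hive's temporary files. The default is /tmp/hive on HDFS.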
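Because hive-site.xml copied from the template is very long, it can be handy to print the current value of a single property. The sketch below is a hypothetical helper (hive_prop is not a Hive tool) and is not a real XML parser: it assumes the name and value elements sit on separate lines, as they do in the snippets above.

```shell
# Hypothetical helper: print the <value> that follows a given <name>.
# $1 = path to hive-site.xml, $2 = property name.
hive_prop() {
  awk -v name="$2" '
    # Remember that the requested property name has been seen.
    index($0, "<name>" name "</name>") { found = 1; next }
    # The first <value> line after it holds the setting.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}
```

With the configuration shown above, hive_prop hive-site.xml hive.metastore.warehouse.dir prints /hive/warehouse.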

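Also edit conf/hive-env.sh under the Hive directory (adjust the paths to your own installation). This file does not exist by default either, so copy its template and then edit it: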

cp hive-env.sh.template hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/usr/local/hadoop-2.7.6

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/duanxz/hive/apache-hive-2.1.1-bin/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/duanxz/hive/apache-hive-2.1.1-bin/bin

(2) Create the required directories

The hive-site.xml file above names two important paths. Switch to the hadoop user and check whether they already exist on HDFS:

duanxz@three:~$ sudo chmod a+w hive

hadoop fs -ls /

The paths mentioned above are not there, so create them and grant write (w) permission on them:

$HADOOP_HOME/bin/hadoop fs -mkdir -p /hive/warehouse  
$HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp/hive/  
hadoop fs -chmod 777 /hive/warehouse  
hadoop fs -chmod 777 /tmp/hive

Check that they were created with hadoop fs -ls / and hadoop fs -ls /tmp/hive/.
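The check, create, and chmod steps can be folded into one repeatable function. This is a sketch that assumes the hadoop command is on the PATH; ensure_hdfs_dir is an illustrative name, and the 777 mode simply mirrors the commands used in this article.

```shell
# Hypothetical helper: make sure an HDFS directory exists and is writable.
# $1 = HDFS path, for example /hive/warehouse.
ensure_hdfs_dir() {
  dir="$1"
  # "hadoop fs -test -d" exits non-zero when the directory is absent.
  if ! hadoop fs -test -d "$dir"; then
    hadoop fs -mkdir -p "$dir"
  fi
  hadoop fs -chmod 777 "$dir"
}
```

Calling ensure_hdfs_dir /hive/warehouse and ensure_hdfs_dir /tmp/hive covers both paths from hive-site.xml.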

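(3) Change the io.tmpdir path

Next, change every value in hive-site.xml that contains ${system:java.io.tmpdir} (in vim, / starts a search and is followed by the keyword; to search for hello, type /hello and press Enter). You can create a new directory to use in its place, for example: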

mkdir /home/duanxz/hive/apache-hive-2.1.1-bin/iotmp  
chmod 777 /home/duanxz/hive/apache-hive-2.1.1-bin/iotmp  
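Then replace every occurrence of ${system:java.io.tmpdir} in hive-site.xml with /home/duanxz/hive/apache-hive-2.1.1-bin/iotmp.

To replace them all at once in vim, press Esc, type : (Shift+;), paste the following substitution command, and press Enter:

%s#${system:java.io.tmpdir}#/home/duanxz/hive/apache-hive-2.1.1-bin/iotmp#g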
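The same global replacement can be done without opening an editor. The sketch below uses sed and writes through a temporary file so that it does not depend on the non-portable -i flag; replace_tmpdir is a hypothetical name, and the new directory is assumed to contain no # or & characters.

```shell
# Hypothetical helper: replace ${system:java.io.tmpdir} throughout a file.
# $1 = path to hive-site.xml, $2 = replacement directory.
replace_tmpdir() {
  file="$1"
  newdir="$2"
  # "\$" and "\." keep the dollar sign and dots literal in the pattern.
  sed 's#\${system:java\.io\.tmpdir}#'"$newdir"'#g' "$file" > "$file.tmp" \
    && mv "$file.tmp" "$file"
}
```

For the layout used here, the call would be replace_tmpdir "$HIVE_HOME/conf/hive-site.xml" /home/duanxz/hive/apache-hive-2.1.1-bin/iotmp.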

(4) Run Hive

./bin/hive

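This reports an error.

Fix: ./schematool -initSchema -dbType derby

This reports another error.

Fix: in /home/hadoop/cloud/apache-hive-2.1.1-bin, delete the old metastore directory with rm -rf metastore_db/, reinitialize with ./bin/schematool -initSchema -dbType derby, and run Hive again.

On my machine there was no metastore_db/ directory, so I only changed how the initialization command is invoked: go up one level from bin to the Hive home directory and run: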

./bin/schematool -initSchema -dbType derby

The initialization succeeds (screenshot omitted).
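The two variants of the fix above (with and without an existing metastore_db directory) can be combined into one function. This is a sketch under the assumption that Hive is always started from $HIVE_HOME, so that the embedded Derby database directory metastore_db is created there; reinit_derby_metastore is an illustrative name. Note that it deletes any existing embedded metastore.

```shell
# Hypothetical helper: wipe a stale embedded metastore and reinitialize it.
# The body runs in a subshell so the caller's working directory is unchanged.
reinit_derby_metastore() (
  cd "$HIVE_HOME" || exit 1
  # Remove the old Derby database only if it is actually there.
  if [ -d metastore_db ]; then
    rm -rf metastore_db
  fi
  ./bin/schematool -initSchema -dbType derby
)
```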

Then start Hive again:

./bin/hive

This reports an error: there is no write permission on /tmp/hive. Grant it:

duanxz@three:~/hive/apache-hive-2.1.1-bin$ hadoop fs -chmod a+w /tmp/hive

Hive now starts successfully (screenshot omitted).

To shut it down, just kill the corresponding process:

duanxz@three:~$ jps
6193 RunJar
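Picking the process ID out of the jps listing can be scripted. The sketch below assumes the Hive process shows up under the name RunJar, as in the listing above; since RunJar is the generic Hadoop class for launching JARs, other Hadoop jobs started the same way would match too, so review the output before killing anything. hive_pids is a hypothetical name.

```shell
# Hypothetical helper: print the PIDs of all JVMs that jps lists as RunJar.
hive_pids() {
  jps | awk '$2 == "RunJar" { print $1 }'
}
```

After checking the output, kill $(hive_pids) stops the listed processes.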

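Hive ships with its own embedded database, but it has a drawback: it lets only one user connect at a time.

MySQL installation: http://blog.csdn.net/u014695188/article/details/51532410

Connecting Hive to MySQL

Edit the configuration file

### Create the hive-site.xml file
Create hive-site.xml in the hive/conf/ directory: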


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.169.134:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manully migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
</description>
</property>
</configuration>
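When the same file has to be produced for several hosts, the connection settings can be generated from parameters. The function below is a sketch only: write_mysql_hive_site is a hypothetical name, it emits just the four connection properties from the file above, and it assumes the MySQL port is 3306 and the database is called hive.

```shell
# Hypothetical helper: write a minimal MySQL-backed hive-site.xml.
# $1 = MySQL host, $2 = user, $3 = password, $4 = output file.
write_mysql_hive_site() {
  host="$1"
  user="$2"
  password="$3"
  out="$4"
  cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://${host}:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>${user}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>${password}</value>
  </property>
</configuration>
EOF
}
```

The values must not contain characters that need XML escaping, such as & or <, because the sketch writes them verbatim.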

Error: Caused by: MetaException(message:Version information not found in metastore. )

Fix: add the following to hive-site.xml:


<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manully migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
</description>
</property>

Error: the MySQL JDBC driver JAR is missing

Fix: copy it (for example mysql-connector-java-5.1.15-bin.jar) into $HIVE_HOME/lib.
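The copy step can be wrapped with two basic checks so that a wrong path fails loudly instead of silently. This is a minimal sketch; install_jdbc_driver is an illustrative name, and the JAR path is whatever connector file you downloaded.

```shell
# Hypothetical helper: copy a JDBC driver JAR into Hive's lib directory.
# $1 = path to the connector JAR.
install_jdbc_driver() {
  jar="$1"
  if [ ! -f "$jar" ]; then
    echo "driver JAR not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$HIVE_HOME/lib" ]; then
    echo "no lib directory under HIVE_HOME: $HIVE_HOME" >&2
    return 1
  fi
  cp "$jar" "$HIVE_HOME/lib/"
}
```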

Error:


Exception in thread "main" java.lang.RuntimeException: Hive metastore database is not initialized.
Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed,
don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql)

Fix:
