Building a Big Data Cluster with Docker (Part 6): Hive Setup
Preface
The earlier posts in this series used 1.x versions; this time we are installing Hive 3.1.2, and a few details differ.
Hive can now use Spark as its execution engine. I use Spark as the execution engine, with HDFS still providing storage.
The cluster runs inside Docker, so every operation below is performed inside the containers.
I. Installation Package Preparation
II. Version Compatibility
Software versions I used:
- Hadoop ~ 2.7.7
- Spark ~ 2.4.4
- JDK ~ 1.8.0_221
- Scala ~ 2.12.9
III. Environment Preparation
(1) Extract the Hive archive
tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/hive/
(2) Create a directory for logs and temporary files
mkdir -p /opt/hive/iotmp
Reason
When Hive starts, it fails to resolve the variables ${system:java.io.tmpdir} and ${system:user.name} into valid absolute paths, so their default values must be replaced with real paths by hand.
[Screenshot of the startup error]
Fix
Replace every occurrence of these variables in hive-site.xml.
vi replace commands:
:%s/${system:java.io.tmpdir}/\/opt\/hive\/iotmp/g
:%s/${system:user.name}/huan/g
(3) Create a database in MySQL to hold the metastore
create database hive;
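Because the metastore connects to MySQL remotely, the connecting user also needs privileges from other hosts. A minimal sketch, assuming MySQL 5.x and the root/root credentials configured later in hive-site.xml:
-- hypothetical grant; adjust user, password, and host pattern to your setup (MySQL 5.x syntax)
GRANT ALL PRIVILEGES ON hive.* TO 'root'@'%' IDENTIFIED BY 'root';
FLUSH PRIVILEGES;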
(4) Configure the following environment variables (a sketch follows this list)
- HIVE_HOME
- HADOOP_HOME
- SPARK_HOME
- JAVA_HOME
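A minimal sketch for /etc/profile; the Hadoop and Hive paths match those used elsewhere in this post, while the JDK and Spark paths are assumptions to adjust:
# JDK and Spark locations below are hypothetical; Hadoop/Hive paths follow this post
export JAVA_HOME=/opt/jdk/jdk1.8.0_221
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.7
export SPARK_HOME=/opt/spark/spark-2.4.4-bin-hadoop2.7
export HIVE_HOME=/opt/hive/apache-hive-3.1.2-bin
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$HIVE_HOME/bin
Then run source /etc/profile to apply the changes.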
IV. Jar Packages
1. MySQL driver
2. Move Hive's jline jar into Hadoop's yarn directory
mv /opt/hive/apache-hive-3.1.2-bin/lib/jline-2.12.jar /opt/hadoop/hadoop-2.7.7/share/hadoop/yarn/
3. Put the MySQL driver into Hive's lib directory
4. Sync the jars to the client nodes (see the sketch below)
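A sketch of steps 3 and 4, assuming the driver jar is named mysql-connector-java-5.1.47.jar (hypothetical version; use the one you downloaded) and a client reachable as cluster-slave1 (hypothetical hostname):
# step 3: copy the MySQL driver into Hive's lib directory
cp mysql-connector-java-5.1.47.jar /opt/hive/apache-hive-3.1.2-bin/lib/
# step 4: sync the driver to each client node's Hive lib directory
scp /opt/hive/apache-hive-3.1.2-bin/lib/mysql-connector-java-5.1.47.jar cluster-slave1:/opt/hive/apache-hive-3.1.2-bin/lib/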
V. Configuration
I use a remote distributed architecture: one master provides the metastore service and 3 clients connect to it remotely.
Step 1: In $HIVE_HOME/conf, copy or create a hive-site.xml configuration file
cp hive-default.xml.template hive-site.xml
第二步:修改master節點配置文件
1. 使用mysql替換默認的derby存放元數據
<!--元數據庫修改為MySQL-->
<property>
<name>hive.metastore.db.type</name>
<value>mysql</value>
<description>
Expects one of [derby, oracle, mysql, mssql, postgres].
Type of database used by the metastore. Information schema & JDBCStorageHandler depend on it.
</description>
</property>
<!--MySQL driver-->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<!--MySQL URL-->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.11.46:13306/hive?createDatabaseIfNotExist=true</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<!--MySQL username-->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Username to use against metastore database</description>
</property>
<!--MySQL password-->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
<description>password to use against metastore database</description>
</property>
2. Set the execution engine to Spark
<property>
<name>hive.execution.engine</name>
<value>spark</value>
<description>
Expects one of [mr, tez, spark].
Chooses execution engine. Options are: mr (Map reduce, default), tez, spark. While MR
remains the default engine for historical reasons, it is itself a historical engine
and is deprecated in Hive 2 line. It may be removed without further warning.
</description>
</property>
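Hive on Spark also needs to know where Spark runs. A minimal sketch of the additional properties; spark.master and spark.home are standard Hive-on-Spark settings, but the values here are assumptions for this particular cluster:
<!-- values below are assumptions; point them at your own Spark deployment -->
<property>
<name>spark.master</name>
<value>yarn</value>
</property>
<property>
<name>spark.home</name>
<value>/opt/spark/spark-2.4.4-bin-hadoop2.7</value>
</property>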
3. Auto-initialize the metastore schema
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once.To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.
</description>
</property>
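As the description above notes, auto-creation is not recommended for production; there you would initialize the schema once by hand with the schematool shipped in $HIVE_HOME/bin:
# one-time manual schema initialization against the MySQL metastore
schematool -dbType mysql -initSchema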
4. Disable schema verification
<!-- Reportedly caused by using JDK 1.8... -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
</description>
</property>
<property>
<name>hive.conf.validation</name>
<value>false</value>
<description>Enables type checking for registered Hive configurations</description>
</property>
5. Remove the illegal control character (&#8;) from the description of hive.txn.xlock.iow (it sits between "for" and "transactional"); otherwise parsing hive-site.xml fails on startup. The corrected property:
<property>
<name>hive.txn.xlock.iow</name>
<value>true</value>
<description>
Ensures commands with OVERWRITE (such as INSERT OVERWRITE) acquire Exclusive locks for transactional tables. This ensures that inserts (w/o overwrite) running concurrently
are not hidden by the INSERT OVERWRITE.
</description>
</property>
Step 3: Send hive-site.xml to the client nodes
scp hive-site.xml <destination IP or hostname>:<destination directory>
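For example, assuming a client reachable as cluster-slave1 (hypothetical hostname) and the same install layout as the master:
scp hive-site.xml cluster-slave1:/opt/hive/apache-hive-3.1.2-bin/conf/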
Step 4: Edit hive-site.xml on the client nodes, pointing each client at the master's metastore:
<property>
<name>hive.metastore.uris</name>
<value>thrift://cluster-master:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
Then replace the relative-path variables on each client, using the same vi commands as before:
:%s/${system:java.io.tmpdir}/\/opt\/hive\/iotmp/g
:%s/${system:user.name}/huan/g
VI. Startup
Master node
Startup automatically initializes the metastore schema; you can check the MySQL database to see whether tables have been created.
./hive --service metastore &
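To verify that the metastore is up, you can check the Thrift port and the MySQL schema. A sketch; the MySQL host and port follow the JDBC URL above, and the netstat flags may vary by image:
# the metastore Thrift service should be listening on port 9083
netstat -ntlp | grep 9083
# metastore schema tables such as DBS and TBLS should now exist
mysql -h192.168.11.46 -P13306 -uroot -proot -e "show tables in hive;"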
Client node
hive
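A quick smoke test from the client shell; the table name is hypothetical, and with hive.execution.engine=spark the insert should launch a Spark job:
create table test(id int);
insert into test values (1);
select * from test;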
