Building a Big Data Cluster with Docker (Part 6): Hive Setup


Preface

My earlier guides all used Hive 1.x; this time it's Hive 3.1.2, and a few details are different.

Hive can now use Spark as its execution engine; I use Spark for execution, with HDFS still providing storage.

My cluster runs inside Docker, so everything here is done in Docker containers.

1. Preparing the Installation Package

Download from the official Hive site

Weiyun download | under the tar directory

2. Version Compatibility

The software versions I used:

  1. Hadoop ~ 2.7.7
  2. Spark ~ 2.4.4
  3. JDK ~ 1.8.0_221
  4. Scala ~ 2.12.9

3. Environment Preparation

(1) Extract the Hive archive

tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/hive/

(2) Create a log directory

mkdir -p /opt/hive/iotmp

Reason

At startup, Hive fails to resolve the two variables ${system:java.io.tmpdir} and ${system:user.name} to valid absolute paths, so you have to replace their default values with real paths by hand.

(Error screenshot omitted.)

Fix

Replace every occurrence of these two variables throughout hive-site.xml.

Vi substitution commands (huan below is my username; substitute your own):

:%s/${system:java.io.tmpdir}/\/opt\/hive\/iotmp/g
:%s/${system:user.name}/huan/g

(3) Create a MySQL database to hold the metastore data

create database hive;
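If you prefer to run it from the shell, the MySQL client one-liner below uses the host, port, and user from the JDBC URL configured later in this guide:

mysql -h 192.168.11.46 -P 13306 -u root -p -e 'create database hive;'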

(4) Configure environment variables (a sketch follows the list)

  • HIVE_HOME
  • HADOOP_HOME
  • SPARK_HOME
  • JAVA_HOME
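A minimal sketch of those exports, e.g. appended to /etc/profile; the Hadoop and Hive paths match the ones used elsewhere in this guide, while the JDK and Spark paths are assumptions to adjust:

# Assumed install locations; change to match your own layout
export JAVA_HOME=/opt/jdk/jdk1.8.0_221
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.7
export SPARK_HOME=/opt/spark/spark-2.4.4
export HIVE_HOME=/opt/hive/apache-hive-3.1.2-bin
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$HIVE_HOME/bin

# Reload so the variables take effect in the current shell
source /etc/profile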

4. JAR Packages

1. MySQL driver

Weiyun download | under the jar directory

2. Replace the jline JAR under Hadoop's YARN directory with Hive's

mv /opt/hive/apache-hive-3.1.2-bin/lib/jline-2.12.jar /opt/hadoop/hadoop-2.7.7/share/hadoop/yarn/

3. Put the MySQL driver into Hive's lib directory

4. Sync the JARs to the client nodes (steps 3 and 4 are sketched below)
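A sketch of steps 3 and 4; the driver JAR name and the client hostnames are placeholders, not from the original:

# Step 3: copy the MySQL driver into Hive's lib directory
cp mysql-connector-java-5.1.48.jar /opt/hive/apache-hive-3.1.2-bin/lib/

# Step 4: sync Hive's lib directory to each client node
for node in cluster-slave1 cluster-slave2 cluster-slave3; do
    scp -r /opt/hive/apache-hive-3.1.2-bin/lib ${node}:/opt/hive/apache-hive-3.1.2-bin/
done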

5. Configuration

I use a remote, distributed deployment: one master serves the metastore, and 3 clients connect to it remotely.

Step 1: Copy or create a hive-site.xml config file

cp hive-default.xml.template hive-site.xml

Step 2: Modify the config file on the master node

1. Use MySQL instead of the default Derby for the metastore

<!-- Switch the metastore database to MySQL -->
<property>
    <name>hive.metastore.db.type</name>
    <value>mysql</value>
    <description>
      Expects one of [derby, oracle, mysql, mssql, postgres].
      Type of database used by the metastore. Information schema &amp; JDBCStorageHandler depend on it.
    </description>
</property>
<!-- MySQL driver -->
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<!-- MySQL URL -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.11.46:13306/hive?createDatabaseIfNotExist=true</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
</property>
<!-- MySQL username -->
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
</property>
<!-- MySQL password -->
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
</property>

2. Set the execution engine to Spark

<property>
    <name>hive.execution.engine</name>
    <value>spark</value>
    <description>
      Expects one of [mr, tez, spark].
      Chooses execution engine. Options are: mr (Map reduce, default), tez, spark. While MR
      remains the default engine for historical reasons, it is itself a historical engine
      and is deprecated in Hive 2 line. It may be removed without further warning.
    </description>
</property>
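One caveat the original doesn't spell out: with Spark 2.x there is no spark-assembly JAR, and the Hive on Spark documentation suggests making a few Spark JARs visible on Hive's classpath, for example by symlinking them (a sketch; the wildcards pick up whatever versions ship with your Spark build):

# Expose the Spark runtime to Hive (Hive on Spark with Spark 2.x)
ln -s $SPARK_HOME/jars/scala-library*.jar $HIVE_HOME/lib/
ln -s $SPARK_HOME/jars/spark-core_*.jar $HIVE_HOME/lib/
ln -s $SPARK_HOME/jars/spark-network-common_*.jar $HIVE_HOME/lib/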

3. Auto-initialize the metastore schema

<property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
    <description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once.To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.
    </description>
</property>

4. Disable schema verification

<!-- Reportedly related to using JDK 1.8 -->
<property>
   <name>hive.metastore.schema.verification</name>
   <value>false</value>
   <description>
     Enforce metastore schema version consistency.
     True: Verify that version information stored in is compatible with one from Hive jars.  Also disable automatic
           schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
           proper metastore schema migration. (Default)
     False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
   </description>
</property>
<property>
   <name>hive.conf.validation</name>
   <value>false</value>
   <description>Enables type checking for registered Hive configurations</description>
 </property>

5. Delete the &#8; inside the description below; this character entity makes the XML parser fail (a removal command follows the snippet)

<property>
   <name>hive.txn.xlock.iow</name>
   <value>true</value>
   <description>
     Ensures commands with OVERWRITE (such as INSERT OVERWRITE) acquire Exclusive locks for&#8;transactional tables.  This ensures that inserts (w/o overwrite) running concurrently
     are not hidden by the INSERT OVERWRITE.
   </description>
</property>
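It can be stripped with the same vi substitution approach used earlier; this one-liner deletes every occurrence of the entity:

:%s/&#8;//g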

Step 3: Send hive-site.xml to the client nodes

scp hive-site.xml <destination IP or hostname>:<destination directory>
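For example (the client hostname and conf path are placeholders, not from the original):

scp hive-site.xml cluster-slave1:/opt/hive/apache-hive-3.1.2-bin/conf/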

Step 4: Modify hive-site.xml on the client nodes

<property>
   <name>hive.metastore.uris</name>
   <value>thrift://cluster-master:9083</value>
   <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>

Then replace the path variables, the same substitution as on the master:

:%s/${system:java.io.tmpdir}/\/opt\/hive\/iotmp/g
:%s/${system:user.name}/huan/g

6. Startup

Master node

Startup auto-initializes the metastore schema; check the database to confirm the tables were created (a verification query is sketched below).

./hive --service metastore &
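To verify, you can query the metastore database directly; host, port, and user below are taken from the JDBC URL configured earlier:

# List the metastore tables Hive generated on startup
mysql -h 192.168.11.46 -P 13306 -u root -p -e 'use hive; show tables;'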

Client node

hive
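A minimal smoke test from a client node, checking that the CLI reaches the remote metastore and that inserts run on the Spark engine (the table name is an arbitrary example):

# Each command can also be typed interactively inside the hive CLI
hive -e "show databases;"
hive -e "create table test_hive (id int, name string);"
hive -e "insert into test_hive values (1, 'hello');"   # should launch a Spark job
hive -e "select * from test_hive;"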

