Ultra-Detailed! CDH Edition: Hadoop / Hive / Impala Single-Node Environment, Fully Offline Installation



Environment and Package Download Links

CentOS 6.5

Version selected: cdh5.14.0-src.tar.gz (CDH5 – 5.14.0)

Hadoop package download link

http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.14.0.tar.gz

Hive package download link

http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.0.tar.gz

Impala package download links

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-catalog-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-debuginfo-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-server-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-state-store-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm  

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-udf-devel-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

 

Reference documentation

https://docs.cloudera.com/documentation/enterprise/5-12-x/topics/cdh_qs_cdh5_pseudo.html#topic_3

https://docs.cloudera.com/documentation/index.html

1. Install the JDK

Upload the RPM package, then install it:

rpm -ivh jdk-8u221-linux-x64.rpm

 

It installs under /usr/java by default;

the path is /usr/java/jdk1.8.0_221-amd64.

 

Configure environment variables

vim /etc/profile

# Append the following

JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

JRE_HOME=/usr/java/jdk1.8.0_221-amd64/jre

PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

CLASSPATH=:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

export JAVA_HOME JRE_HOME PATH CLASSPATH
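
Reload the profile so the variables take effect in the current shell before verifying:

source /etc/profile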

 

[root@sunny conf]# java -version

java version "1.8.0_221"

Java(TM) SE Runtime Environment (build 1.8.0_221-b11)

Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)

2. Install Hadoop

2.1 Designate /data/CDH as the Hadoop install directory and extract the tarball there

cd /data/CDH

tar -zvxf /data/install_pkg_CDH/hadoop-2.6.0-cdh5.14.0.tar.gz

 

The Hadoop core configuration files live at:

/data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/

 

2.2 Configure Hadoop environment variables

vi /etc/profile

 

# Append at the end:

JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

JRE_HOME=/usr/java/jdk1.8.0_221-amd64/jre

PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

CLASSPATH=:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

export JAVA_HOME JRE_HOME PATH CLASSPATH

export HADOOP_HOME=/data/CDH/hadoop-2.6.0-cdh5.14.0

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

 

source /etc/profile  # make the configuration take effect
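
To confirm the Hadoop variables took effect, you can check the version; the first line of output should report the CDH build:

hadoop version  # expect: Hadoop 2.6.0-cdh5.14.0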

 

2.3 First create a data directory under the Hadoop install directory (it is referenced in core-site.xml below)

cd /data/CDH/hadoop-2.6.0-cdh5.14.0

mkdir data

 

2.4 Configure core-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/core-site.xml

 

# Add the following properties inside the <configuration> element:

<property>

    <name>fs.defaultFS</name>

    <!-- use your own server IP here; the port shown is the default -->

    <value>hdfs://172.20.71.66:9000</value>

</property>

<property>

    <name>hadoop.tmp.dir</name>

    <!-- your custom Hadoop working directory -->

    <value>/data/CDH/hadoop-2.6.0-cdh5.14.0/data</value>

</property>

 

<property>

    <name>hadoop.native.lib</name>

    <value>false</value>

    <description>Should native hadoop libraries, if present, be used.

    </description>

</property>

 

2.5 Configure hdfs-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hdfs-site.xml

 

# Add the following properties inside <configuration>:

<property>

    <name>dfs.replication</name>

    <value>1</value>

</property>

<property>

    <name>dfs.namenode.secondary.http-address</name>

    <value>172.20.71.66:50090</value>

</property>

<property>

    <name>dfs.permissions.enabled</name>

    <value>false</value>

</property>

 

2.6 Configure mapred-site.xml

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop

cp mapred-site.xml.template ./mapred-site.xml  # copy the template as mapred-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/mapred-site.xml

 

# Add the following property inside <configuration>:

<property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value>

</property>

 

2.7 Configure yarn-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/yarn-site.xml

 

# Add the following properties inside <configuration>:

<property>

    <name>yarn.resourcemanager.hostname</name>

    <value>172.20.71.66</value>

</property>

<property>

    <name>yarn.nodemanager.aux-services</name>

    <value>mapreduce_shuffle</value>

</property>

 

2.8 Configure hadoop-env.sh

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hadoop-env.sh

Change

export JAVA_HOME=${JAVA_HOME}

to:

export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

 

2.9 Start Hadoop

With everything configured, switch to the sbin directory:

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/sbin/

hadoop namenode -format  # format the HDFS filesystem

 

Possible error: the hostname does not match the entries in /etc/hosts.

Run the hostname command to check the machine's hostname; suppose it is testuser.

Run vi /etc/hosts

and change

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1      localhost localhost.localdomain localhost6 localhost6.localdomain6

to

127.0.0.1 localhost localhost.localdomain testuser  localhost4.localdomain4

::1      localhost localhost.localdomain localhost6 localhost6.localdomain6

 

Format again: hadoop namenode -format

Once that succeeds, start all the daemons:

./start-all.sh

You will be prompted for the current server's password several times along the way; just enter it at each prompt.
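
To avoid the repeated password prompts, you can optionally set up passwordless SSH to the local machine first; a minimal sketch:

mkdir -p ~/.ssh && chmod 700 ~/.ssh
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa   # generate a key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys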

Then run jps to check which services came up.

If all five Hadoop daemons — NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager — are listed, the startup succeeded!
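
A successful jps listing looks roughly like this (the PIDs are illustrative):

12001 NameNode
12102 DataNode
12288 SecondaryNameNode
12450 ResourceManager
12551 NodeManager
12700 Jps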

 

2.10 Access the Hadoop web UI

http://172.20.71.66:50070   # replace 172.20.71.66 with your server's IP
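
As an optional sanity check that HDFS is really readable and writable, you can push a file through it (the paths here are arbitrary examples):

echo hello > /tmp/hello.txt
hdfs dfs -mkdir -p /smoke_test
hdfs dfs -put /tmp/hello.txt /smoke_test/
hdfs dfs -cat /smoke_test/hello.txt   # should print: hello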

 

3. Install Hive

3.1 Designate /data/CDH as the Hive install directory and extract the tarball there

cd /data/CDH

tar -zxvf /data/install_pkg_CDH/hive-1.1.0-cdh5.14.0.tar.gz

 

3.2 Configure environment variables

vim /etc/profile

 

# Append the following

export HIVE_HOME=/data/CDH/hive-1.1.0-cdh5.14.0/

export PATH=$HIVE_HOME/bin:$PATH

export HIVE_CONF_DIR=$HIVE_HOME/conf

export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib

 

source /etc/profile

 

3.3 Configure hive-env.sh

cd /data/CDH/hive-1.1.0-cdh5.14.0/conf

cp hive-env.sh.template hive-env.sh

vi hive-env.sh

 

# Append the following at the end of the file

export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

export HADOOP_HOME=/data/CDH/hadoop-2.6.0-cdh5.14.0

export HIVE_HOME=/data/CDH/hive-1.1.0-cdh5.14.0

export HIVE_CONF_DIR=/data/CDH/hive-1.1.0-cdh5.14.0/conf

export HIVE_AUX_JARS_PATH=/data/CDH/hive-1.1.0-cdh5.14.0/lib

 

3.4 Configure hive-site.xml

cd /data/CDH/hive-1.1.0-cdh5.14.0/conf

vi hive-site.xml

# Add the following

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

  <property>

    <name>javax.jdo.option.ConnectionURL</name>

    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>

  </property>

  <property>

    <name>javax.jdo.option.ConnectionDriverName</name>

    <value>org.apache.derby.jdbc.EmbeddedDriver</value>

  </property>

  <property>

    <name>javax.jdo.option.ConnectionUserName</name>

    <value>APP</value>

  </property>

  <property>

    <name>javax.jdo.option.ConnectionPassword</name>

    <value>mine</value>

  </property>

</configuration>

3.5 Create HDFS directories (this step appears to be optional and can be skipped)

cd /data/CDH/hive-1.1.0-cdh5.14.0

mkdir hive_hdfs

cd hive_hdfs

hdfs dfs -mkdir -p warehouse  # this requires Hadoop to already be installed and running

hdfs dfs -mkdir -p tmp

hdfs dfs -mkdir -p log

hdfs dfs -chmod -R 777 warehouse

hdfs dfs -chmod -R 777 tmp

hdfs dfs -chmod -R 777 log
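
Note that relative HDFS paths like warehouse resolve under the current user's HDFS home (e.g. /user/root); the local mkdir/cd above has no effect on where HDFS places them. If you would rather create the directories Hive actually uses by default (hive.metastore.warehouse.dir defaults to /user/hive/warehouse), a sketch with absolute paths:

hdfs dfs -mkdir -p /user/hive/warehouse /tmp/hive
hdfs dfs -chmod -R 777 /user/hive/warehouse /tmp/hive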

 

3.6 Initialize Hive (best run from the install's bin directory)

cd /data/CDH/hive-1.1.0-cdh5.14.0/bin

schematool -dbType derby -initSchema

# to rerun schematool -dbType derby -initSchema, delete or rename the metastore_db directory first.

 

 

 

3.7 Start Hive

hive --service metastore &    # can be run from any directory

hive --service hiveserver2 &    # can be run from any directory

3.8 Use Hive

hive  # can be run from any directory

3.9 Troubleshooting the Hive installation

3.9.1 java.lang.ClassNotFoundException: org.htrace.Trace

Download the htrace-core jar from the Maven repository and copy it into $HIVE_HOME/lib:

<!-- https://mvnrepository.com/artifact/org.htrace/htrace-core -->

<dependency>

    <groupId>org.htrace</groupId>

    <artifactId>htrace-core</artifactId>

    <version>3.0</version>

</dependency>

 

cp /data/install_pkg_apache/htrace-core-3.0.jar /data/CDH/hive-1.1.0-cdh5.14.0/lib

cd /data/CDH/hive-1.1.0-cdh5.14.0/bin

 

3.9.2 [ERROR] Terminal initialization failed; falling back to unsupported java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected

Fix: download jline-2.12.jar from the Maven repository (and upload it to the server).

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/yarn/lib

mv jline-2.11.jar jline-2.11.jar_bak

 

cd /data/CDH/hive-1.1.0-cdh5.14.0/lib

cp jline-2.12.jar /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/yarn/lib

 

Restart Hadoop:

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/sbin

sh stop-all.sh

sh start-all.sh

 

Run again: schematool -dbType derby -initSchema

The error is still there!!!

 

Fix, take two:

vi /etc/profile

export HADOOP_USER_CLASSPATH_FIRST=true

source /etc/profile  # reload the profile so the variable takes effect

Run again: schematool -dbType derby -initSchema  (OK this time)

 

Then start Hive:

./hive --service metastore &  

./hive --service hiveserver2 &

 

3.9.3 java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Reference blog post: https://www.58jb.com/html/89.html

In versions before Hive 2.0.0, two services must be started on the Hive master node — metastore and hiveserver2 — before you can log in to the Hive CLI. (Configure hive-site.xml, then start them.)

 

hive --service metastore &

hive --service hiveserver2 &

 

3.10 Basic Hive operation test

hive  # enter the Hive shell

# run some HQL statements

show databases;

create database test;

use test;

create table user_test(name string) row format delimited fields terminated by '\001';

 

insert into table user_test values('zs');

select * from user_test;
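
Since hiveserver2 is already running, you can also connect through Beeline, the JDBC client bundled with Hive; a sketch assuming the default HiveServer2 port 10000 and the root user:

beeline -u jdbc:hive2://172.20.71.66:10000 -n root  # then run the same HQL as above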

 

4. Install Impala

4.1 Expand the disk (skip this if your server has enough space)

A small detour first: this is my own CentOS setup, and the VM did not have enough space, so it needed more.

Reference blog post: http://blog.itpub.net/117319/viewspace-2125079/

On Windows, open a cmd prompt. The disk image is at:

C:\Users\kingdee\VirtualBox VMs\centos6\centos6.vdi

VBoxManage.exe modifymedium C:\Users\kingdee\VirtualBox VMs\centos6\centos6.vdi resize --20480  # this fails: the path contains spaces, and the flag is malformed (it should be --resize 20480)

 

C:

cd Users\kingdee\VirtualBox VMs\centos6

F:\VirtualBox\VBoxManage.exe modifymedium centos6.vdi --resize 20480  # run it this way and it works

 

On CentOS:

fdisk -l

fdisk /dev/sda

n

p

3

Enter (accept the default)

+10G (amount of space to allocate)

w (write and save)

 

 

 

View the partition table and refresh it:

cat /proc/partitions

reboot  # restart

mkfs -t ext3 /dev/sda3

After formatting, mount the partition at the target directory:

mount /dev/sda3 /home  # to unmount: umount /home
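
To keep the mount across reboots, you can optionally add an fstab entry (a sketch; adjust the device and filesystem type to match yours):

echo "/dev/sda3 /home ext3 defaults 0 0" >> /etc/fstab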

Impala will be installed under the /home directory:

cd /home

mkdir -p data/CDH/impala_rpm

cd /home/data/CDH/impala_rpm

 

4.2 Install Impala (note: all the install packages are placed under /home/data/CDH/impala_rpm_install_pkg)

cd    /home/data/CDH/impala_rpm

rpm -ivh ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

 

This fails with the following error:

warning: ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID e8f86acd: NOKEY

error: Failed dependencies:

        bigtop-utils >= 0.7 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-hdfs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-yarn is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-mapreduce is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hbase is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hive >= 0.12.0+cdh5.1.0 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        zookeeper is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-libhdfs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        avro-libs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        parquet is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        sentry >= 1.3.0+cdh5.1.0 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        sentry is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        /lib/lsb/init-functions is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        libhdfs.so.0.0.0()(64bit) is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

 

Hmm... bigtop-utils and sentry need to be installed first.

4.3 Install bigtop-utils and sentry

bigtop-utils and sentry package download page:

http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/

Or download directly via these two links:

http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/bigtop-utils-0.7.0+cdh5.14.0+0-1.cdh5.14.0.p0.47.el7.noarch.rpm

http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/sentry-1.5.1+cdh5.14.0+432-1.cdh5.14.0.p0.47.el7.noarch.rpm

 

Run the following commands:

sudo rpm -ivh  ../impala_rpm_install_pkg/bigtop-utils-0.7.0+cdh5.14.0+0-1.cdh5.14.0.p0.47.el7.noarch.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/sentry-1.5.1+cdh5.14.0+432-1.cdh5.14.0.p0.47.el7.noarch.rpm --nodeps  # --nodeps skips dependency checks

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm --nodeps

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-server-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-state-store-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-catalog-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

 

When you get to impala-shell, python-setuptools must be installed first.

 

4.4 Install python-setuptools

python-setuptools download link:

http://mirror.centos.org/altarch/7/os/aarch64/Packages/python-setuptools-0.9.8-7.el7.noarch.rpm

Run:

sudo rpm -ivh  ../impala_rpm_install_pkg/python-setuptools-0.9.8-7.el7.noarch.rpm --nodeps

 

Then resume the impala-shell installation:

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

Done.

4.5 Configure bigtop-utils

vi /etc/default/bigtop-utils

export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

4.6 Configure /etc/impala/conf

cd /etc/impala/conf  # copy the Hadoop and Hive config files into Impala's conf directory

cp /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/core-site.xml ./

cp /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hdfs-site.xml ./

cp /data/CDH/hive-1.1.0-cdh5.14.0/conf/hive-site.xml ./

 

4.7 Configure hdfs-site.xml

vi hdfs-site.xml

# Add the following:

<property>

    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>

    <value>true</value>

</property>

<property>

    <name>dfs.client.file-block-storage-locations.timeout.millis</name>

    <value>10000</value>

</property>

 

4.8 Sort out the jar files

4.8.1 First, a quick primer on Linux symbolic links

Create a symlink

ln -s [target file or directory] [link name]

 

For example, create a link named test in the current directory that points to the /var/www/test folder:

ln -s /var/www/test test

 

Delete a symlink

This works the same as deleting an ordinary file, using rm:

rm -rf [link name]  # do NOT append a trailing "/": with the slash, rm follows the link and deletes the contents of the target directory

For example, delete test:

rm -rf test

 

Modify a symlink

ln -snf [new target file or directory] [link name]

This repoints the existing link at the new target.

For example:

Create a symlink:

ln -s /var/www/test /var/test

Repoint it to a new path:

ln -snf /var/www/test1 /var/test
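
You can confirm where a link currently points with ls -l, for example:

ls -l /var/test   # lrwxrwxrwx ... /var/test -> /var/www/test1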

4.8.2 Set up the symlinks for Impala's jars

(The commands below can be copied as-is, but remember to replace the version numbers — my packages are 2.6.0-cdh5.14.0; substitute the versions you actually installed.)

Hadoop-related jars

sudo rm -rf /usr/lib/impala/lib/hadoop-*.jar

sudo ln -s $HADOOP_HOME/share/hadoop/common/lib/hadoop-annotations-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-annotations.jar

sudo ln -s $HADOOP_HOME/share/hadoop/common/lib/hadoop-auth-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-auth.jar

sudo ln -s $HADOOP_HOME/share/hadoop/tools/lib/hadoop-aws-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-aws.jar

sudo ln -s $HADOOP_HOME/share/hadoop/common/hadoop-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-hdfs.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-core.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-jobclient.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-shuffle.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-api-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-api.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-client-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-client.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-applicationhistoryservice.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-nodemanager.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-resourcemanager.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-web-proxy.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-core.jar

sudo ln -s $HADOOP_HOME/share/hadoop/tools/lib/hadoop-azure-datalake-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-azure-datalake.jar

 

# Re-create these symlinks (the yarn-common and yarn-server-common links above pointed at the wrong jar names):

sudo ln -snf $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-common.jar

sudo ln -snf $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-common.jar

sudo ln -snf $HADOOP_HOME/share/hadoop/common/lib/avro-1.7.6-cdh5.14.0.jar /usr/lib/impala/lib/avro.jar

sudo ln -snf $HADOOP_HOME/share/hadoop/common/lib/zookeeper-3.4.5-cdh5.14.0.jar /usr/lib/impala/lib/zookeeper.jar

Hive-related jars

sudo rm -rf /usr/lib/impala/lib/hive-*.jar

sudo ln -s $HIVE_HOME/lib/hive-ant-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-ant.jar

sudo ln -s $HIVE_HOME/lib/hive-beeline-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-beeline.jar

sudo ln -s $HIVE_HOME/lib/hive-common-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-common.jar

sudo ln -s $HIVE_HOME/lib/hive-exec-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-exec.jar

sudo ln -s $HIVE_HOME/lib/hive-hbase-handler-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-hbase-handler.jar

sudo ln -s $HIVE_HOME/lib/hive-metastore-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-metastore.jar

sudo ln -s $HIVE_HOME/lib/hive-serde-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-serde.jar

sudo ln -s $HIVE_HOME/lib/hive-service-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-service.jar

sudo ln -s $HIVE_HOME/lib/hive-shims-common-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims-common.jar

sudo ln -s $HIVE_HOME/lib/hive-shims-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims.jar

sudo ln -s $HIVE_HOME/lib/hive-shims-scheduler-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims-scheduler.jar

sudo ln -snf $HIVE_HOME/lib/parquet-hadoop-bundle-1.5.0-cdh5.14.0.jar /usr/lib/impala/lib/parquet-hadoop-bundle.jar

Other jars

sudo ln -snf $HIVE_HOME/lib/hbase-annotations-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-annotations.jar

sudo ln -snf $HIVE_HOME/lib/hbase-common-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-common.jar

sudo ln -snf $HIVE_HOME/lib/hbase-protocol-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-protocol.jar

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hbase-1.2.0-cdh5.14.0/lib/hbase-client-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-client.jar

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhadoop.so /usr/lib/impala/lib/libhadoop.so

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhadoop.so.1.0.0 /usr/lib/impala/lib/libhadoop.so.1.0.0

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhdfs.so /usr/lib/impala/lib/libhdfs.so

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhdfs.so.0.0.0 /usr/lib/impala/lib/libhdfs.so.0.0.0

 

Additional notes

If anything was missed and some jar symlinks are still dangling, search the whole server for the jar in question; if it does not exist, download it from Maven, drop it in a directory, and symlink to it.

e.g. the hadoop-core.jar symlink turned out to be broken:

find / -name "hadoop-core*"  # locate the jar with this command, then ln -s <source> <link>

4.9 Start Impala

service impala-state-store start

service impala-catalog start

service impala-server start

 

Running service impala-state-store start fails:

[root@sunny conf]# service impala-state-store start

/etc/init.d/impala-state-store: line 34: /lib/lsb/init-functions: No such file or directory

/etc/init.d/impala-state-store: line 104: log_success_msg: command not found

 

redhat-lsb needs to be installed.

Download page: http://rpmfind.net/linux/rpm2html/search.php?query=redhat-lsb-core(x86-64)

Choose redhat-lsb-core-4.1-27.el7.centos.1.x86_64.rpm and download it.

 

 

 

Once downloaded, run:

rpm -ivh redhat-lsb-core-4.1-27.el7.centos.1.x86_64.rpm

 

Continue starting Impala:

service impala-state-store start

service impala-catalog start

service impala-server start  # start impalad on every DataNode

 

Once all the services start successfully, you can begin using Impala.

4.10 Use Impala

Enter the Impala shell:

impala-shell  # can be run from any directory; basic usage mirrors Hive
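
A quick smoke test inside the shell, reusing the test.user_test table created in section 3.10:

invalidate metadata;       -- sync Impala's catalog with the Hive metastore
show databases;
use test;
select * from user_test;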

impalad web UI:

http://172.20.71.66:25000/

 

statestored web UI:

http://172.20.71.66:25010/

 

catalogd web UI:

http://172.20.71.66:25020

Impala log files: /var/log/impala

Impala config files: /etc/default/impala

/etc/impala/conf  (core-site.xml, hdfs-site.xml, hive-site.xml)

4.11 Troubleshooting the Impala installation

4.11.1 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Fix:

Reference: http://blog.csdn.net/l1028386804/article/details/51566262

Go into the installed Hive's conf directory and find hive-site.xml (if it was never modified, the template is hive-default.xml.template).

<property>

  <name>hive.metastore.schema.verification</name>

  <value>true</value>

   <description>

   Enforce metastore schema version consistency.

   True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic

         schema migration attempt. Users are required to manully migrate schema after Hive upgrade which ensures

         proper metastore schema migration. (Default)

   False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.

   </description>

</property>

to:

<property>

  <name>hive.metastore.schema.verification</name>

  <value>false</value>

   <description>

   Enforce metastore schema version consistency.

   True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic

         schema migration attempt. Users are required to manully migrate schema after Hive upgrade which ensures

         proper metastore schema migration. (Default)

   False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.

   </description>

</property>

 

Then restart Hive:

hive --service metastore &  

hive --service hiveserver2 &

 

4.11.2 Invalid short-circuit reads configuration:- Impala cannot read or execute the parent directory of dfs.domain.socket.path

Fix: create the directory that dfs.domain.socket.path in /etc/impala/conf/hdfs-site.xml refers to:

 

 

 

cd /var/run

mkdir hdfs-sockets

cd hdfs-sockets

mkdir dn
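
For reference, the hdfs-site.xml properties that point at this socket path typically look like the following (these are the standard HDFS short-circuit-read settings; adjust the path if yours differs):

<property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
</property>
<property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hdfs-sockets/dn</value>
</property>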

4.11.3 Impala INSERT fails: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

[localhost.localdomain:21000] > insert into user_test values('zs');

Query: insert into user_test values('zs')

Query submitted at: 2020-12-17 02:39:24 (Coordinator: http://sunny:25000)

ERROR: AnalysisException: Failed to load metadata for table: test.user_test. Running 'invalidate metadata test.user_test' may resolve this problem.

CAUSED BY: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

CAUSED BY: TableLoadingException: Failed to load metadata for table: test.user_test. Running 'invalidate metadata test.user_test' may resolve this problem.

CAUSED BY: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

 

Fix:

Impala has no link to hadoop-azure-datalake.jar; add one:

sudo ln -s /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/tools/lib/hadoop-azure-datalake-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-azure-datalake.jar

 

Hadoop core-site.xml settings reference: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml

 

That's it — Hadoop, Hive, and Impala are all installed. The end. Confetti!~

 

--===================== Update 2020-12-28 ================

ERROR: AnalysisException: Unable to INSERT into target table (Test.user_test) because Impala does not have WRITE access to at least one HDFS path: hdfs://172.20.71.66:9000/user/hive/warehouse/test.db/user_test
Fix:
vi /etc/selinux/config
SELINUX=disabled  # takes effect after a reboot; to disable immediately, also run: setenforce 0

vi /data/CDH/hive-1.1.0-cdh5.14.0/conf/hive-site.xml
Add:
<property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- set this value -->
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
</property>

hadoop fs -chmod 777 /user/hive/warehouse/test.db/user_test

hive --service metastore &
hive --service hiveserver2 &

service impala-state-store restart
service impala-catalog restart
service impala-server restart

impala-shell
invalidate metadata;


impala-shell --auth_creds_ok_in_clear -l -i 172.20.71.66 -u APP  # LDAP login (-l) to host 172.20.71.66 as user APP; --auth_creds_ok_in_clear permits the password over an unencrypted connection

-- ============================================================================

