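This tutorial only sets up Spark far enough to use it as the execution engine for Hive; it does not cover Spark beyond that.
1. Download Spark;
2. Upload the Spark archive to the virtual machine with WinSCP;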
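3. Extract the archive: tar -zxvf spark-2.3.3-bin-without-hadoop.tgz -C /opt/programs/
The later steps refer to /opt/programs/spark-2.3.3, so rename the extracted spark-2.3.3-bin-without-hadoop directory to spark-2.3.3 or adjust the paths below.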
4. Configure Spark and Hive:
1) Configure the Spark environment variables;
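The step above does not say which variables to set. A minimal sketch, assuming Spark ended up in /opt/programs/spark-2.3.3 (the path used by the later steps) and that the lines are added to a shell profile such as /etc/profile or ~/.bashrc (the choice of file is an assumption):

```shell
# Assumed install location; change it to wherever Spark was extracted.
export SPARK_HOME=/opt/programs/spark-2.3.3
# Put spark-submit, spark-shell, etc. on the PATH.
export PATH="$PATH:$SPARK_HOME/bin"
```

After editing the profile, reload it (for example with source /etc/profile) so the current shell picks up the new values.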
2) Configure spark-env.sh. In Spark's conf directory, run cp spark-env.sh.template spark-env.sh,
then open the file with vi spark-env.sh
and append the following at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_25
export SPARK_DIST_CLASSPATH=$(/opt/programs/hadoop-2.6.0/bin/hadoop classpath)
Note: change /opt/programs/hadoop-2.6.0/bin/hadoop to the path of your own hadoop binary.
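If you prefer not to edit the file by hand, the same two lines can be appended from the shell. This is only a sketch: configure_spark_env is a hypothetical helper name, and the JDK and Hadoop paths are the ones from this tutorial, so adjust them to your machine.

```shell
# Hypothetical helper: append the two settings to the given spark-env.sh,
# skipping any line that is already present so it is safe to run twice.
configure_spark_env() {
  env_file=$1
  # Single quotes keep $(...) literal; it is evaluated later, when Spark sources the file.
  for line in \
    'export JAVA_HOME=/usr/java/jdk1.8.0_25' \
    'export SPARK_DIST_CLASSPATH=$(/opt/programs/hadoop-2.6.0/bin/hadoop classpath)'
  do
    grep -qxF -- "$line" "$env_file" 2>/dev/null || printf '%s\n' "$line" >> "$env_file"
  done
}

# Usage (the path is an assumption):
# configure_spark_env /opt/programs/spark-2.3.3/conf/spark-env.sh
```

SPARK_DIST_CLASSPATH is needed here because the "without-hadoop" build does not bundle the Hadoop jars and has to be pointed at an existing Hadoop installation.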
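3) Configure slaves. In the same conf directory, run cp slaves.template slaves, then list the worker hostnames, one per line: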
hadoop1
hadoop2
hadoop3
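The slaves file can also be written from the shell. A minimal sketch, where write_slaves is a hypothetical helper and the hostnames are the three used in this tutorial:

```shell
# Hypothetical helper: write one worker hostname per line to the given file,
# replacing whatever the file contained before.
write_slaves() {
  slaves_file=$1
  shift
  printf '%s\n' "$@" > "$slaves_file"
}

# Usage (the path is an assumption):
# write_slaves /opt/programs/spark-2.3.3/conf/slaves hadoop1 hadoop2 hadoop3
```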
4) Link the required Spark jars into $HIVE_HOME/lib.
Three jars are needed:
scala-library-2.11.8.jar
spark-core_2.11-2.3.3.jar
spark-network-common_2.11-2.3.3.jar
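Create a symbolic link for each of the three jars. For spark-core, for example:
ln -snf /opt/programs/spark-2.3.3/jars/spark-core_2.11-2.3.3.jar /opt/programs/hive-2.3.5/lib/spark-core_2.11-2.3.3.jar
When all three links are in place, the result looks like this: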
[root@hadoop1 conf]# ll /opt/programs/hive-2.3.5/lib/ | grep spark
lrwxrwxrwx. 1 root root 55 Sep 12 22:26 scala-library-2.11.8.jar -> /opt/programs/spark-2.3.3/jars/scala-library-2.11.8.jar
lrwxrwxrwx. 1 root root 56 Sep 12 22:27 spark-core_2.11-2.3.3.jar -> /opt/programs/spark-2.3.3/jars/spark-core_2.11-2.3.3.jar
lrwxrwxrwx. 1 root root 66 Sep 12 22:27 spark-network-common_2.11-2.3.3.jar -> /opt/programs/spark-2.3.3/jars/spark-network-common_2.11-2.3.3.jar
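Only one ln command is shown above, but the listing contains three links. A small loop can create all of them; this is a sketch in which link_spark_jars is a hypothetical helper, and the jar names and versions are the ones listed in this tutorial:

```shell
# Hypothetical helper: symlink the three jars from Spark's jars directory
# into Hive's lib directory. Stops with an error if a jar is missing.
link_spark_jars() {
  spark_jars=$1   # e.g. /opt/programs/spark-2.3.3/jars
  hive_lib=$2     # e.g. /opt/programs/hive-2.3.5/lib
  for jar in \
    scala-library-2.11.8.jar \
    spark-core_2.11-2.3.3.jar \
    spark-network-common_2.11-2.3.3.jar
  do
    if [ -f "$spark_jars/$jar" ]; then
      # -s symbolic, -n do not follow an existing link, -f replace it
      ln -snf "$spark_jars/$jar" "$hive_lib/$jar"
    else
      echo "missing: $spark_jars/$jar" >&2
      return 1
    fi
  done
}

# Usage (paths as used in this tutorial):
# link_spark_jars /opt/programs/spark-2.3.3/jars /opt/programs/hive-2.3.5/lib
```

Checking that each jar exists first avoids leaving dangling links in Hive's lib directory when a version number does not match.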
5) Configure the Hive execution engine
- In the configuration file (hive-site.xml), which applies to every session:
<property>
<name>hive.execution.engine</name>
<value>spark</value>
</property>
- In beeline, which only takes effect for the current session. Run this inside beeline:
set hive.execution.engine=spark;
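To double-check what the configuration file currently sets, the value can be read back from the shell. This is a rough sketch, not a real XML parser: hive_engine_from_config is a hypothetical helper, and it assumes the <value> line directly follows the <name> line, as in the snippet above.

```shell
# Hypothetical helper: print the value configured for hive.execution.engine
# in the given hive-site.xml. Assumes <name> and <value> sit on consecutive lines.
hive_engine_from_config() {
  sed -n '/<name>hive\.execution\.engine<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' "$1"
}

# Usage (the path is an assumption):
# hive_engine_from_config /opt/programs/hive-2.3.5/conf/hive-site.xml
```

If nothing is printed, the property is not set in that file (or is laid out differently), and Hive falls back to its default engine unless the session overrides it.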