Building and installing Spark 2.1.0 against hadoop2.6.0-cdh5.7.0


1. Prerequisites:

CentOS 6.5

JDK 1.7

Java SE installer download: http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase7-521261.html

Maven 3.3.9

Maven 3.3.9 download: https://mirrors.tuna.tsinghua.edu.cn/apache//maven/maven-3/3.3.9/binaries/

Spark 2.1.0 source download:
http://spark.apache.org/downloads.html

File name after download: (screenshot missing from the source)

*************************************************** Divider: build starts *********************************************************************

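Upload the files to the Linux machine.

Install Maven: unpack it and configure the environment variables.

The details are omitted here...

 mvn -v

If this prints the Maven version information, Maven is set up correctly.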
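The check above can also be scripted. The following is a minimal sketch, assuming GNU `sort` (for the `-V` version sort) and assuming 3.3.9 is the minimum Maven version for this Spark release; the `version_ge` helper and the `REQUIRED_MAVEN` variable are hypothetical names introduced for illustration.

```shell
# Minimum Maven version: an assumption based on the version used in this post.
REQUIRED_MAVEN="3.3.9"

# version_ge A B: succeeds when version A >= version B.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query Maven if it is actually on the PATH.
if command -v mvn >/dev/null 2>&1; then
  # The first line of `mvn -v` looks like "Apache Maven 3.3.9 (...)".
  installed="$(mvn -v 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
  if version_ge "$installed" "$REQUIRED_MAVEN"; then
    echo "Maven $installed is new enough"
  else
    echo "Maven $installed is older than $REQUIRED_MAVEN" >&2
  fi
else
  echo "mvn not found on PATH; check the environment variables" >&2
fi
```

If the script reports that `mvn` is not found, the environment variables from the previous step are probably not loaded in the current shell.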

************************************************************* Divider ***********************************************************************************

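My Hadoop version is hadoop2.6.0-cdh5.7.0.

Unpack the Spark source archive.

This gives the source directory.

(Ignore the already-built Spark package that is also present on my machine.)

Set Maven's memory options first, otherwise the build runs into problems. A temporary setting for the current shell session is enough: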

export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

[root@master109 opt]# echo $MAVEN_OPTS
-Xmx2g -XX:ReservedCodeCacheSize=512m
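Because the export above only lasts for the current session, it is easy to forget on the next build. A minimal sketch, assuming the build is started from a wrapper script of your own, that applies the same values only when `MAVEN_OPTS` is not already set:

```shell
# Use the values from this post only if MAVEN_OPTS is unset or empty,
# so a value chosen by the caller is never overwritten.
: "${MAVEN_OPTS:=-Xmx2g -XX:ReservedCodeCacheSize=512m}"
export MAVEN_OPTS

echo "MAVEN_OPTS=$MAVEN_OPTS"
```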

Change into the root directory of the Spark source tree and run:

./dev/make-distribution.sh --name 2.6.0-cdh5.7.0   --tgz   -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.7.0 -Phive -Phive-thriftserver  -Pyarn

Result:

[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.810 s (Wall Clock)
[INFO] Finished at: 2017-10-13T15:52:09+08:00
[INFO] Final Memory: 67M/707M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project spark-launcher_2.11: Could not resolve dependencies for project org.apache.spark:spark-launcher_2.11:jar:2.1.0: Failure to find org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.7.0 in https://repo1.maven.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :spark-launcher_2.11

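The build fails because some artifacts cannot be found. The cause is the repository: by default only the Apache (Maven Central) repository is used, and it does not host the CDH artifacts, so the CDH (Cloudera) repository has to be added.

Edit pom.xml: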

[root@master109 spark-2.1.0]# ls
appveyor.yml  bin    common  CONTRIBUTING.md  data  docs      external  launcher  licenses  mllib        NOTICE   project  R          repl  scalastyle-config.xml  streaming  tools
assembly      build  conf    core             dev   examples  graphx    LICENSE   mesos     mllib-local  pom.xml  python   README.md  sbin  sql                    target     yarn
[root@master109 spark-2.1.0]# vim pom.xml

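Insert the new repository at the position shown below.

#---------------------------------------------
The block between two separator lines like these (the cloudera repository further down) is what to add; it changes the repository. Remember to delete the separator lines themselves.
#---------------------------------------------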
 <repositories>
    <repository>
      <id>central</id>
      <!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution -->
      <name>Maven Repository</name>
      <url>https://repo1.maven.org/maven2</url>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </repository>

#---------------------------------------------
   <repository>
      <id>cloudera</id>
      <name>cloudera Repository</name>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
   </repository>
#---------------------------------------------
 </repositories>
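The error message above notes that the failed lookup "was cached in the local repository". Maven records such failures in `*.lastUpdated` marker files. If a rebuild still reports the cached failure after pom.xml has been edited, one option is to delete those markers; the sketch below does that, assuming the default local repository under `~/.m2/repository` (the `clear_failed_lookups` helper is a hypothetical name). Another option, as the error message itself hints, is to force updates with Maven's `-U` flag.

```shell
# clear_failed_lookups DIR: delete Maven's "*.lastUpdated" marker files
# under DIR and print the path of each file removed.
clear_failed_lookups() {
  # Nothing to do if the directory does not exist.
  [ -d "$1" ] || return 0
  find "$1" -type f -name '*.lastUpdated' -print -delete
}

# Assumed default location of the local repository; adjust the path
# if your settings.xml points somewhere else.
clear_failed_lookups "${HOME}/.m2/repository/org/apache/hadoop"
```

Only the marker files are removed; jars that were downloaded successfully stay in place.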

Run the build again:

[root@master109 spark-2.1.0]# ./dev/make-distribution.sh --name 2.6.0-cdh5.7.0   --tgz   -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.7.0 -Phive -Phive-thriftserver  -Pyarn

Wait for the build to finish (about 13 minutes in this run):

[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  3.997 s]
[INFO] Spark Project Tags ................................. SUCCESS [  3.394 s]
[INFO] Spark Project Sketch ............................... SUCCESS [ 14.061 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 37.680 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 12.750 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 33.158 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 50.148 s]
[INFO] Spark Project Core ................................. SUCCESS [04:16 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 45.832 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 26.712 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 58.080 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [02:22 min]
[INFO] Spark Project SQL .................................. SUCCESS [03:02 min]
[INFO] Spark Project ML Library ........................... SUCCESS [02:16 min]
[INFO] Spark Project Tools ................................ SUCCESS [  2.588 s]
[INFO] Spark Project Hive ................................. SUCCESS [01:19 min]
[INFO] Spark Project REPL ................................. SUCCESS [  6.337 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 13.252 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 57.556 s]
[INFO] Spark Project Hive Thrift Server ................... SUCCESS [ 45.074 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  7.410 s]
[INFO] Spark Project External Flume Sink .................. SUCCESS [ 30.214 s]
[INFO] Spark Project External Flume ....................... SUCCESS [ 19.359 s]
[INFO] Spark Project External Flume Assembly .............. SUCCESS [  6.082 s]
[INFO] Spark Integration for Kafka 0.8 .................... SUCCESS [ 30.266 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 28.668 s]
[INFO] Spark Project External Kafka Assembly .............. SUCCESS [  6.919 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 30.811 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  6.551 s]
[INFO] Kafka 0.10 Source for Structured Streaming ......... SUCCESS [ 17.707 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13:25 min (Wall Clock)
[INFO] Finished at: 2017-10-13T16:35:47+08:00
[INFO] Final Memory: 90M/979M
[INFO] ------------------------------------------------------------------------
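With `--tgz`, make-distribution.sh packs the result into an archive in the source root, named from the Spark version and the `--name` value. A minimal sketch for locating it, assuming the `spark-<version>-bin-<name>.tgz` naming pattern and that the script is run from the source root (the `dist_tarball_name` helper is a hypothetical name):

```shell
# dist_tarball_name VERSION NAME: print the archive name expected for a
# given Spark version and --name value (naming pattern is an assumption).
dist_tarball_name() {
  printf 'spark-%s-bin-%s.tgz\n' "$1" "$2"
}

tarball="$(dist_tarball_name 2.1.0 2.6.0-cdh5.7.0)"

if [ -f "$tarball" ]; then
  # List the first few entries without unpacking the archive.
  tar -tzf "$tarball" | head -n 5
else
  echo "expected $tarball in the current directory, but it was not found" >&2
fi
```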

Done!

2017-10-13   16:55:55

Author: 山高似水深 (http://www.cnblogs.com/tnsay/). Please credit the source when reposting.

