This is fesh's personal hands-on write-up; feedback and experience-sharing are welcome! Blog address for this post: http://www.cnblogs.com/fesh/p/3775343.html
The build method in this post targets Hadoop 2.2.0 with YARN 2.2.0, Java version 1.8.0_11, on Ubuntu 14.04.
cd spark-1.0.1
./make-distribution.sh --hadoop 2.2.0 --with-yarn --tgz
--tgz: Additionally creates spark-$VERSION-bin.tar.gz
--hadoop VERSION: Builds against specified version of Hadoop.
--with-yarn: Enables support for Hadoop YARN.
--with-hive: Enable support for reading Hive tables.
--name: A moniker for the release target. Defaults to the Hadoop version.
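For illustration, the flags above can be composed into a single invocation; a minimal sketch, where the flag names come from Spark 1.0.1's make-distribution.sh but the values (Hadoop 2.2.0, release name "2.2.0") are just examples:

```shell
# Sketch: compose a make-distribution.sh command line from the flags above.
# HADOOP_VERSION and RELEASE_NAME are example values, not requirements.
HADOOP_VERSION="2.2.0"
RELEASE_NAME="2.2.0"
CMD="./make-distribution.sh --hadoop ${HADOOP_VERSION} --with-yarn --with-hive --tgz --name ${RELEASE_NAME}"
echo "$CMD"
```

Run the printed command from the top of the Spark source tree.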
If all goes well, the target file is generated under the $SPARK_HOME/assembly/target/scala-2.10 directory.
(There may be a compatibility issue with Java 1.8 here? The build assumes a 1.6 environment by default, yet my build succeeded anyway.)
(Note: previously, with --with-tachyon added, the build itself always succeeded, but generating the tgz distribution package failed, and I don't know why. Today, under JDK 1.7.0_51 (the JDK version should be irrelevant), I dropped --with-tachyon; the build succeeded and the spark-1.0.1-bin-2.2.0.tgz distribution package was generated.)
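Once the build finishes, you can sanity-check that the assembly jar actually exists; a minimal sketch, assuming the default output location mentioned above (SPARK_HOME here is an assumption: point it at your spark-1.0.1 checkout):

```shell
# Sketch: look for the assembly jar under the source tree.
# SPARK_HOME defaults to ./spark-1.0.1 only as an example.
SPARK_HOME="${SPARK_HOME:-$PWD/spark-1.0.1}"
JAR=$(ls "$SPARK_HOME"/assembly/target/scala-2.10/spark-assembly-*.jar 2>/dev/null | head -n 1)
if [ -n "$JAR" ]; then
  echo "assembly jar: $JAR"
else
  echo "assembly jar not found under $SPARK_HOME"
fi
```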
Build result:
tar -zxvf spark-1.0.1.tar.gz
cd spark-1.0.1
SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true ./sbt/sbt assembly
1) Before building with Maven, give it more memory, or the build will fail:
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
Without this you may see errors such as:
[INFO] Compiling 203 Scala sources and 9 Java sources to /Users/me/Development/spark/core/target/scala-2.10/classes...
[ERROR] PermGen space -> [Help 1]
[INFO] Compiling 203 Scala sources and 9 Java sources to /Users/me/Development/spark/core/target/scala-2.10/classes...
[ERROR] Java heap space -> [Help 1]
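To avoid hitting those errors halfway through a long build, you can check that the memory settings are in place before invoking mvn; a small sketch:

```shell
# Sketch: set the memory options from above, then verify -Xmx is present
# before kicking off the (long) Maven build.
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
case "$MAVEN_OPTS" in
  *-Xmx*) echo "MAVEN_OPTS ok: $MAVEN_OPTS" ;;
  *)      echo "MAVEN_OPTS missing -Xmx; the build may run out of heap" ;;
esac
```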
2) Specify the Hadoop version and build

# Apache Hadoop 2.2.X
mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
For other versions of YARN and HDFS, build as follows:
# Different versions of HDFS and YARN.
mvn -Pyarn-alpha -Phadoop-2.3 -Dhadoop.version=2.3.0 -Dyarn.version=0.23.7 -DskipTests clean package
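The two mvn invocations above follow one pattern that can be parameterized by profile and version; a sketch, where the profile names (-Pyarn-alpha, -Phadoop-2.3) follow the Spark 1.0.x build documentation and the version values are examples only:

```shell
# Sketch: build the mvn command line from Hadoop/YARN version variables.
HADOOP_PROFILE="hadoop-2.3"
HADOOP_VERSION="2.3.0"
YARN_VERSION="0.23.7"
CMD="mvn -Pyarn-alpha -P${HADOOP_PROFILE} -Dhadoop.version=${HADOOP_VERSION} -Dyarn.version=${YARN_VERSION} -DskipTests clean package"
echo "$CMD"
```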
The build result is:

In addition, this article also explains the build in detail and is worth consulting: http://mmicky.blog.163.com/blog/static/1502901542014312101657612/
as is this one: http://www.cnblogs.com/hseagle/p/3732492.html
I have shared the Spark source, the compiled source, and the distribution package at: http://pan.baidu.com/s/1c0y7JKs extraction password: ccvy