1. Prerequisites
JDK 1.8
Scala 2.11.8
Maven 3.3+
IntelliJ IDEA with the Scala plugin
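Verifying the toolchain up front saves a failed build later. Below is a small sketch; the `check_jdk` helper is hypothetical (not part of Spark's build) and simply pattern-matches the banner that `java -version` prints:

```shell
# check_jdk: succeeds when a `java -version` banner reports a 1.8 build.
# (Hypothetical helper for illustration; Spark's build does its own checks.)
check_jdk() {
  case "$1" in
    *'"1.8'*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Against a sample banner of the form `java -version` prints:
check_jdk 'java version "1.8.0_201"' && echo "JDK 1.8 detected"

# On a real machine, feed it the actual banner:
# check_jdk "$(java -version 2>&1 | head -n 1)" || echo "need JDK 1.8"
```

The same idea applies to `mvn -version` and `scala -version` if you want to script the full check.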
2. Download the Spark source
Download it from https://archive.apache.org/dist/spark/spark-2.0.0/spark-2.0.0.tgz
Extract the downloaded source archive to c:\workspace.
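From a shell such as Git Bash, the download and unpack steps can be scripted as below. The snippet only assembles the archive URL; the actual transfer and extraction lines are left commented since they need network access, and `/c/workspace` is Git Bash's spelling of `c:\workspace`:

```shell
# Build the archive URL for the Spark release used in this walkthrough.
SPARK_VERSION=2.0.0
SPARK_URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}.tgz"
echo "$SPARK_URL"

# Download and extract into the workspace:
# curl -LO "$SPARK_URL"
# tar -xzf "spark-${SPARK_VERSION}.tgz" -C /c/workspace
```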
3. Import the spark-2.0.0 source project into IDEA

Keep clicking Next through the import wizard, then click Finish.

Finally, build the project.

4. Problems you may run into
4.1 not found: type SparkFlumeProtocol
spark\external\flume-sink\src\main\scala\org\apache\spark\streaming\flume\sink\SparkAvroCallbackHandler.scala
Error:(45, 66) not found: type SparkFlumeProtocol
Solution:

Select the Spark Project External Flume Sink module, right-click it, and choose Generate Sources and Update Folders. After a rebuild the error should disappear.
4.2 Error:(34, 45) object SqlBaseParser is not a member of package org.apache.spark.sql.catalyst.parser
\spark\sql\catalyst\src\main\scala\org\apache\spark\sql\catalyst\parser\AstBuilder.scala
Error:(34, 45) object SqlBaseParser is not a member of package org.apache.spark.sql.catalyst.parser
import org.apache.spark.sql.catalyst.parser.SqlBaseParser._
Solution:

Select the Spark Project Catalyst module (SqlBaseParser is generated code in sql/catalyst), right-click it, and choose Generate Sources and Update Folders. After a rebuild the error should disappear.
4.3 Error:(52, 75) not found: value TCLIService
spark\sql\hive-thriftserver\src\main\java\org\apache\hive\service\cli\thrift\ThriftCLIService.java
Error:(52, 75) not found: value TCLIService
public abstract class ThriftCLIService extends AbstractService implements TCLIService.Iface, Runnable { …

Solution:

TCLIService is Thrift-generated code, so the same fix should apply: select the Spark Project Hive Thrift Server module, right-click it, and choose Generate Sources and Update Folders, then rebuild.

In general, once these problems are resolved, the build succeeds.
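All three errors above involve code that Maven generates at build time (Avro for the Flume sink, ANTLR for SqlBaseParser, Thrift for TCLIService), which is why IDEA's Generate Sources and Update Folders fixes them. The sketch below shows a hypothetical command-line equivalent; the artifactIds and profiles are my assumption for a Spark 2.0.0 / Scala 2.11 tree, so check each module's pom.xml before running:

```shell
# Modules whose sources are generated during the build (assumed artifactIds).
MODULES=":spark-streaming-flume-sink_2.11,:spark-catalyst_2.11,:spark-hive-thriftserver_2.11"

# Run only the generate-sources phase for those modules and their dependencies.
# The hive profiles are assumed to be needed to activate the thriftserver module.
CMD="./build/mvn -Phive -Phive-thriftserver -pl $MODULES -am -DskipTests generate-sources"
echo "$CMD"

# From the Spark source root:
# $CMD
```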

5. Building in Git Bash
Why Git Bash? Building inside IDEA can fail with all sorts of errors, and Git Bash provides parts of the build environment that IDEA lacks. If errors like the ones above appear, running the build in Git Bash resolves them.
In Git Bash, change to the project root and run the following three commands (note: on JDK 8 the -XX:MaxPermSize option is obsolete and only produces a harmless warning):
cd /c/Workspace/spark-2.0.0
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package

A successful build ends with Maven's BUILD SUCCESS summary.
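As a quick sanity check after the build, you can look for the assembled jars and start the shell. The path below assumes Spark 2.0.0's standard layout; adjust it if your tree differs:

```shell
# Where a successful `clean package` leaves the Spark jars in a 2.0.0 tree.
JARS_DIR="assembly/target/scala-2.11/jars"
echo "Spark jars should be under: $JARS_DIR"

# From the source root, after a successful build:
# ls "$JARS_DIR" | head
# ./bin/spark-shell --master 'local[2]'   # the REPL banner should report 2.0.0
```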

References:
https://blog.csdn.net/make__It/article/details/84258916
http://dengfengli.com/blog/how-to-run-and-debug-spark-source-code-locally/
