I. After creating the Maven project (remember to add the Scala framework to the project), edit pom.xml and add the following:
<properties>
    <spark.version>2.1.1</spark.version>
    <scala.version>2.11</scala.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_${scala.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scala.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_${scala.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_${scala.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.scala-tools</groupId>
            <artifactId>maven-scala-plugin</artifactId>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                </execution>
            </executions>
            <configuration>
                <scalaVersion>${scala.version}</scalaVersion>
                <args>
                    <arg>-target:jvm-1.5</arg>
                </args>
            </configuration>
        </plugin>
        <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.6.0</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <version>2.19</version>
            <configuration>
                <skip>true</skip>
            </configuration>
        </plugin>
    </plugins>
</build>
After saving the file, click "Import Changes" at the bottom of the window; this is what actually downloads the dependency JARs.
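The `_${scala.version}` suffix on each Spark artifact ID is the Scala binary version the artifact was built against, so the Scala SDK attached to the project should be a 2.11.x release. A minimal sketch of a sanity check follows; the object name `ScalaVersionCheck` and the helper `binaryVersion` are hypothetical and only serve this illustration.

```scala
object ScalaVersionCheck {
  // Reduces a full version string such as "2.11.8" to its binary version "2.11".
  def binaryVersion(full: String): String =
    full.split("\\.").take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Assumed to match the <scala.version> property in pom.xml.
    val expected = "2.11"
    // Version of the Scala library this program is actually running on.
    val actual = binaryVersion(scala.util.Properties.versionNumberString)
    if (actual == expected) println(s"OK: Scala binary version is $actual")
    else println(s"Mismatch: running Scala $actual, but the pom expects $expected")
  }
}
```

Running this once inside the project shows whether the SDK selected in the IDE agrees with the artifact suffix used in the dependencies.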
II. Write a Scala program that counts the occurrences of each word
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object WordCount {
  def main(args: Array[String]): Unit = {
    // Both the input path and the output path are required.
    if (args.length < 2) {
      System.err.println("Usage: spark.example.WordCount <input> <output>")
      System.exit(1)
    }
    val input_path = args(0)
    val output_path = args(1)

    val conf = new SparkConf().setAppName("WordCount")
    conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    val sc = new SparkContext(conf)

    val inputFile = sc.textFile(input_path)
    inputFile.flatMap(line => line.split(" "))  // split each line into words
      .map(word => (word, 1))                   // pair each word with a count of 1
      .reduceByKey(_ + _)                       // sum the counts for each word
      .map(x => x._1 + "\t" + x._2)             // format as "word<TAB>count"
      .saveAsTextFile(output_path)

    sc.stop()
  }
}
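The transformation chain in the program can be mirrored with ordinary Scala collections, which may help when checking the logic without a cluster. This is only a sketch under stated assumptions: `LocalWordCount` and `countWords` are hypothetical names, the input is a small in-memory `Seq`, and `groupBy` followed by a sum stands in for Spark's `reduceByKey`.

```scala
object LocalWordCount {
  // Mirrors flatMap -> map -> reduceByKey on an in-memory Seq instead of an RDD.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))   // split each line on single spaces
      .map(word => (word, 1))             // pair each word with a count of 1
      .groupBy(_._1)                      // collect the pairs that share a word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Made-up sample input, not data from the cluster.
    val sample = Seq("a b a", "b c")
    countWords(sample).toSeq.sortBy(_._1).foreach {
      case (word, n) => println(word + "\t" + n) // same "word<TAB>count" format
    }
  }
}
```

Like the Spark version, this splits on a single space only, so consecutive spaces produce empty "words" that are counted as well.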
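III. Packaging
File -> Project Structure -> Artifacts -> green plus sign -> JAR -> From modules...
Then enter the name of the class you defined and select the "copy to.." option (so that only this one class is packaged).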

After clicking OK, choose Build -> Build Artifacts -> Build and wait for the build to finish. The newly packaged JAR can then be found in the project's output directory.

IV. Running on the Spark cluster
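1. Copy the JAR to a machine that can reach the Spark cluster
2. Run it. The spark-submit options must come before the JAR path, because everything after the JAR is passed to the program as its own arguments:
/usr/local/spark/bin/spark-submit \
  --class WordCount \
  --master spark://master:7077 \
  --executor-memory 1G \
  --executor-cores 1 \
  --num-executors 10 \
  /data/wangzai/package/WordCount.jar \
  hdfs://master:9000/spark/test.data \
  hdfs://master:9000/spark_output/spark_wordcount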
3. Result
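Each line the job writes has the form "word<TAB>count", since the program joins the two fields with "\t" before saving. A minimal sketch of reading such lines back is shown here; `ReadWordCounts` and `parseLine` are hypothetical names, and the sample lines are made up rather than actual output from the cluster.

```scala
object ReadWordCounts {
  // Parses one "word<TAB>count" line; returns None for lines that do not match.
  def parseLine(line: String): Option[(String, Int)] =
    line.split("\t") match {
      case Array(word, count) if count.nonEmpty && count.forall(_.isDigit) =>
        Some((word, count.toInt))
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample lines in the output format, not real results.
    val sampleLines = Seq("spark\t3", "hello\t5", "world\t2")
    val byCount = sampleLines
      .flatMap(line => parseLine(line))   // keep only well-formed lines
      .sortBy { case (_, n) => -n }       // most frequent words first
    byCount.foreach { case (word, n) => println(s"$word: $n") }
  }
}
```

The same parsing could be applied to the lines of the part files under the output directory after copying them out of HDFS.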


