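Most tutorials recommend one of the following two ways to run the WordCount program from the hadoop examples:
1. Copy the generated jar to a node in the hadoop cluster, then run
$HADOOP_HOME/bin/hadoop jar xxx.jar xxx.WordCount /input/xxx.txt /output
2. Or debug it directly in an IDE (see "Remote debugging hadoop 2.6.0 with eclipse/intellij idea")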
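In production, however, the more common situation is that there is no IDE, and each application's final jar is deployed on an application server (which is not one of the hadoop cluster nodes). The jar therefore has to run on its own and still be able to connect to the hadoop environment. The key points are:
1. In pom.xml, declare every jar that WordCount depends on as a dependency (so that at run time these jars no longer have to come from the IDE or from a hadoop installation)
2. Following the article "maven: packaging a runnable jar (java application) and handling its dependencies", export the dependency jars and let a maven plugin write the Main-Class entry in MANIFEST.MF automatically
3. Copy core-site.xml into the maven project's resources directory (after packaging, the xml ends up on the classpath, and at run time WordCount uses this file to find out which hadoop to connect to)
4. When deploying, upload the final WordCount jar together with the lib directory of dependency jars to the application server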
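Point 3 depends on Hadoop's `Configuration` class loading `core-site.xml` from the classpath as a default resource. The original post does not show that file, so the following is only a minimal sketch of what it might contain. The host name and port are placeholders (assumptions), and should be replaced by the values from the cluster's own core-site.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- src/main/resources/core-site.xml (sketch; host and port are hypothetical) -->
<configuration>
    <property>
        <!-- Default file system the client connects to -->
        <name>fs.defaultFS</name>
        <value>hdfs://namenode.example.com:9000</value>
    </property>
</configuration>
```

With only this file on the classpath, the job should be able to read and write HDFS paths such as /jimmy/input, but it would most likely execute in the local job runner inside the client JVM, because `mapreduce.framework.name` defaults to `local`. Submitting to YARN would need further settings that this post does not cover.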
After that, the program can be run directly with a command such as:
java -jar hadoop-helloworld.jar /jimmy/input/README.txt /jimmy/output
Finally, here are the contents of a few key files:
a. pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>cn.cnblogs.yjmyzz</groupId>
    <artifactId>hadoop-helloworld</artifactId>
    <version>1.0</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>commons-cli</groupId>
            <artifactId>commons-cli</artifactId>
            <version>1.2</version>
        </dependency>
    </dependencies>

    <build>
        <finalName>${project.artifactId}</finalName>

        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>cn.cnblogs.yjmyzz.WordCount</mainClass>
                            <addClasspath>true</addClasspath>
                            <classpathPrefix>lib/</classpathPrefix>
                        </manifest>
                    </archive>
                    <classesDirectory>
                    </classesDirectory>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <!--mvn dependency:copy-dependencies -DoutputDirectory=target/lib-->

</project>
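The comment at the end of pom.xml records the command used to export the dependency jars by hand. As an alternative that is not part of the original setup, the same copy could probably be bound to the package phase, so that a plain `mvn package` also fills target/lib. A sketch of such a plugin entry, to be placed inside `<build><plugins>`, assuming the standard maven-dependency-plugin and its `copy-dependencies` goal:

```xml
<!-- Sketch only: not in the original pom.xml. -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-dependency-plugin</artifactId>
    <executions>
        <execution>
            <id>copy-dependencies</id>
            <!-- Run together with jar creation -->
            <phase>package</phase>
            <goals>
                <goal>copy-dependencies</goal>
            </goals>
            <configuration>
                <!-- Resolves to target/lib, matching classpathPrefix lib/ -->
                <outputDirectory>${project.build.directory}/lib</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>
```

The output directory name has to stay in step with `classpathPrefix` in the maven-jar-plugin configuration, since the Class-Path entries in the manifest are resolved relative to the location of the jar.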
b. Contents of META-INF/MANIFEST.MF

Manifest-Version: 1.0
Built-By: jimmy
Build-Jdk: 1.7.0_09
Class-Path: lib/hadoop-common-2.6.0.jar
  lib/hadoop-annotations-2.6.0.jar lib/guava-11.0.2.jar
  lib/commons-math3-3.1.1.jar lib/xmlenc-0.52.jar
  lib/commons-httpclient-3.1.jar lib/commons-codec-1.4.jar
  lib/commons-io-2.4.jar lib/commons-net-3.1.jar
  lib/commons-collections-3.2.1.jar lib/servlet-api-2.5.jar
  lib/jetty-6.1.26.jar lib/jetty-util-6.1.26.jar
  lib/jersey-core-1.9.jar lib/jersey-json-1.9.jar
  lib/jettison-1.1.jar lib/jaxb-impl-2.2.3-1.jar
  lib/jaxb-api-2.2.2.jar lib/stax-api-1.0-2.jar
  lib/activation-1.1.jar lib/jackson-jaxrs-1.8.3.jar
  lib/jackson-xc-1.8.3.jar lib/jersey-server-1.9.jar
  lib/asm-3.1.jar lib/jasper-compiler-5.5.23.jar
  lib/jasper-runtime-5.5.23.jar lib/jsp-api-2.1.jar
  lib/commons-el-1.0.jar lib/commons-logging-1.1.3.jar
  lib/log4j-1.2.17.jar lib/jets3t-0.9.0.jar
  lib/httpclient-4.1.2.jar lib/httpcore-4.1.2.jar
  lib/java-xmlbuilder-0.4.jar lib/commons-lang-2.6.jar
  lib/commons-configuration-1.6.jar lib/commons-digester-1.8.jar
  lib/commons-beanutils-1.7.0.jar
  lib/commons-beanutils-core-1.8.0.jar lib/slf4j-api-1.7.5.jar
  lib/slf4j-log4j12-1.7.5.jar lib/jackson-core-asl-1.9.13.jar
  lib/jackson-mapper-asl-1.9.13.jar lib/avro-1.7.4.jar
  lib/paranamer-2.3.jar lib/snappy-java-1.0.4.1.jar
  lib/protobuf-java-2.5.0.jar lib/gson-2.2.4.jar
  lib/hadoop-auth-2.6.0.jar
  lib/apacheds-kerberos-codec-2.0.0-M15.jar
  lib/apacheds-i18n-2.0.0-M15.jar lib/api-asn1-api-1.0.0-M20.jar
  lib/api-util-1.0.0-M20.jar lib/curator-framework-2.6.0.jar
  lib/jsch-0.1.42.jar lib/curator-client-2.6.0.jar
  lib/curator-recipes-2.6.0.jar lib/jsr305-1.3.9.jar
  lib/htrace-core-3.0.4.jar lib/zookeeper-3.4.6.jar
  lib/commons-compress-1.4.1.jar lib/xz-1.0.jar
  lib/hadoop-hdfs-2.6.0.jar lib/commons-daemon-1.0.13.jar
  lib/netty-3.6.2.Final.jar lib/xercesImpl-2.9.1.jar
  lib/xml-apis-1.3.04.jar
  lib/hadoop-mapreduce-client-jobclient-2.6.0.jar
  lib/hadoop-mapreduce-client-common-2.6.0.jar
  lib/hadoop-yarn-common-2.6.0.jar lib/hadoop-yarn-api-2.6.0.jar
  lib/jersey-client-1.9.jar lib/jersey-guice-1.9.jar
  lib/hadoop-yarn-client-2.6.0.jar
  lib/hadoop-mapreduce-client-core-2.6.0.jar
  lib/hadoop-yarn-server-common-2.6.0.jar
  lib/hadoop-mapreduce-client-shuffle-2.6.0.jar
  lib/hadoop-yarn-server-nodemanager-2.6.0.jar
  lib/leveldbjni-all-1.8.jar lib/guice-servlet-3.0.jar
  lib/guice-3.0.jar lib/javax.inject-1.jar
  lib/aopalliance-1.0.jar lib/commons-cli-1.2.jar
Created-By: Apache Maven 3.2.3
Main-Class: cn.cnblogs.yjmyzz.WordCount
Archiver-Version: Plexus Archiver
Screenshot of the run: