1. Introduction
To debug Hadoop 2 code in Eclipse on Windows, I configured the hadoop-eclipse-plugin-2.6.0.jar plugin in Eclipse and ran the Hadoop WordCount example. A whole series of problems came up, and it took several days to get the code running. This article walks through each problem and its fix, as a reference for anyone who runs into the same issues.
The Hadoop 2 WordCount.java code used here is as follows:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapred.job.tracker", "hadoopmaster:9001"); // must be set when running from Windows
    conf.set("fs.default.name", "hdfs://hadoopmaster:9000");

    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
Problem 1. An internal error occurred during: "Map/Reduce location status updater". java.lang.NullPointerException
Copy hadoop-eclipse-plugin-2.6.0.jar into Eclipse's plugins directory (in my case F:\tool\eclipse-jee-juno-SR2\eclipse-jee-juno-SR2\plugins) and restart Eclipse. Then open Window --> Preferences; the Hadoop Map/Reduce option appears, but clicking it raises An internal error occurred during: "Map/Reduce location status updater". java.lang.NullPointerException, as shown below:
Solution:
The freshly deployed Hadoop 2 cluster does not yet have the input and output directories, so first create them on HDFS:
#bin/hdfs dfs -mkdir -p /user/root/input
#bin/hdfs dfs -mkdir -p /user/root/output
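If you prefer to create these directories from Java rather than the shell, a minimal sketch using the HDFS FileSystem API looks like the following (the NameNode address is taken from the WordCount code above; adjust it to your cluster):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateJobDirs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Same NameNode address as in the WordCount driver above.
    conf.set("fs.default.name", "hdfs://hadoopmaster:9000");
    FileSystem fs = FileSystem.get(conf);
    // Equivalent to: bin/hdfs dfs -mkdir -p /user/root/input (and output)
    fs.mkdirs(new Path("/user/root/input"));
    fs.mkdirs(new Path("/user/root/output"));
    fs.close();
  }
}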
After that, both directories show up under the DFS Locations node in Eclipse, as shown below:
Problem 2. Exception in thread "main" java.lang.NullPointerException at java.lang.ProcessBuilder.start(Unknown Source)
Running the Hadoop 2 WordCount.java code produced this error:
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.NullPointerException
    at java.lang.ProcessBuilder.start(Unknown Source)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:808)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
    at
Analysis: the bin directory of the Hadoop 2.x download does not contain winutils.exe.
Solution:
1. Download hadoop-common-2.2.0-bin-master.zip from https://codeload.github.com/srccodes/hadoop-common-2.2.0-bin/zip/master, unpack it, and copy everything under its bin directory into the bin directory of the Hadoop 2 installation we downloaded (Hadoop2/bin). As shown below:
2. In Eclipse, under Window --> Preferences --> Hadoop Map/Reduce, point the plugin at the Hadoop directory on the local disk, as shown below:
3. Set the HADOOP_HOME environment variable for Hadoop 2 and add its bin directory to Path, as shown below (an in-code alternative is sketched just after this list):
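If Eclipse was started before HADOOP_HOME was set, it may not pick up the new environment variable. A minimal sketch of an in-code workaround, assuming the unpacked Hadoop 2 directory is F:\tool\hadoop-2.6.0 (adjust the path): Hadoop's Shell utility also honours the hadoop.home.dir system property when resolving winutils.exe, so the property can be set at the very top of the driver's main():

import org.apache.hadoop.conf.Configuration;

public class WordCountWindowsDriver {
  public static void main(String[] args) throws Exception {
    // Shell.java resolves winutils.exe from HADOOP_HOME or from the
    // hadoop.home.dir system property; set the property before the first
    // Configuration object is created. The path below is an assumption.
    System.setProperty("hadoop.home.dir", "F:/tool/hadoop-2.6.0");
    Configuration conf = new Configuration();
    // ... continue with the job setup shown in the WordCount code above
  }
}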
Problem 3. Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
With Problem 2 out of the way, running WordCount.java produced a new error:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.
Analysis: C:\Windows\System32 is missing hadoop.dll; copying that file into C:\Windows\System32 should fix it.
Solution: copy hadoop.dll from the bin directory of hadoop-common-2.2.0-bin-master into C:\Windows\System32 and reboot. It may not be that simple, though; the same error can still appear.
Continuing the analysis: the failing frame is at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557), so let us look at line 557 of the NativeIO class, as shown below:
This Windows-only method checks whether the current process has the requested access rights on the given path. To get past the check during debugging, we make it always grant access by editing the source: download the matching Hadoop source package hadoop-2.6.0-src.tar.gz, unpack it, copy hadoop-2.6.0-src\hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\nativeio\NativeIO.java into the Eclipse project (so the local copy shadows the class in the jar), and change line 557 to return true, as shown below:
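For reference, a sketch of what the edit amounts to (based on the Hadoop 2.6.0 NativeIO.java; only the body of the method changes):

    // Inside the copied NativeIO.java, class NativeIO.Windows, around line 557.
    // Original body: return access0(path, desired.accessRight());
    public static boolean access(String path, AccessRight desired)
        throws IOException {
      // Always grant access so the local job can run from Eclipse on Windows.
      // This is a debugging shortcut only -- never ship this change.
      return true;
    }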
Problem 4. org.apache.hadoop.security.AccessControlException: Permission denied: user=zhengcy, access=WRITE, inode="/user/root/output":root:supergroup:drwxr-xr-x
Running the WordCount.java code then produced this error:
2014-12-18 16:03:24,092 WARN (org.apache.hadoop.mapred.LocalJobRunner:560) - job_local374172562_0001
org.apache.hadoop.security.AccessControlException: Permission denied: user=zhengcy, access=WRITE, inode="/user/root/output":root:supergroup:drwxr-xr-x
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6512)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6494)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6446)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4248)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4218)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4191)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:8
Analysis: the submitting user has no write permission on the output directory.
Solution: hdfs-site.xml under etc/hadoop (the same file that configures where HDFS stores its data) controls permission checking; add the following property to it:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
This turns HDFS permission checking off; do not configure a production server this way.
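An alternative that avoids relaxing permissions cluster-wide is to submit the job as the user that owns the HDFS directories (root in the error above). A sketch, assuming Hadoop 2.x's UserGroupInformation also reads the HADOOP_USER_NAME system property when the environment variable of the same name is not set:

public class SubmitAsRoot {
  public static void main(String[] args) throws Exception {
    // Submit the job as "root" instead of the local Windows user, so
    // /user/root/output is writable without disabling dfs.permissions.
    System.setProperty("HADOOP_USER_NAME", "root");
    // ... then run the WordCount driver as shown above
  }
}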
Problem 5. File /usr/root/input/file01._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation
As shown below:
Analysis: after the first #hadoop namenode -format, running #sbin/start-all.sh followed by #jps shows a DataNode process; but after running #hadoop namenode -format a second time, #jps no longer shows a DataNode, as shown below:
Then, when uploading the test files to the input directory with bin/hdfs dfs -put /usr/local/hadoop/hadoop-2.6.0/test/* /user/root/input (which copies everything under /test/ into /user/root/input on HDFS), the error above appears.
Solution: this comes from running hadoop namenode -format too many times. Each format gives the NameNode a fresh identity, while the DataNode still holds the one recorded at the previous format, so it refuses to start. Delete the DataNode and NameNode storage directories configured in hdfs-site.xml (then format and restart the cluster), and the problem is resolved.