1. Introduction
While learning Hadoop's local run mode recently, I ran into a few problems; I'm recording them here for future reference, using Hadoop's WordCount example.
Whether a Hadoop program runs on the cluster or locally depends on the two settings below: the first tells the MR job to execute on the YARN cluster, and the second gives the hostname of the YARN cluster's master node (the ResourceManager).
By default, Hadoop runs locally on Windows.
conf.set("mapreduce.framework.name","yarn");
conf.set("yarn.resourcemanager.hostname","hadoop-server-03");
Problem 1: Exception in thread "main" java.lang.NullPointerException at java.lang.ProcessBuilder.start(Unknown Source)
Running Hadoop locally fails with:
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.NullPointerException
at java.lang.ProcessBuilder.start(Unknown Source)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:808)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
at ...
Analysis:
Hadoop 2.x downloads do not ship winutils.exe in the Hadoop bin directory.
Solution:
1. Download hadoop-common-2.2.0-bin-master.zip from https://codeload.github.com/srccodes/hadoop-common-2.2.0-bin/zip/master, unpack it, and copy everything under hadoop-common-2.2.0-bin-master/bin into the bin directory of your Hadoop 2 installation (Hadoop2/bin).
2. In Eclipse, under Window -> Preferences -> Hadoop Map/Reduce, point the plugin at the Hadoop directory you just set up on disk.
3. Set the HADOOP_HOME environment variable to that Hadoop directory and add its bin folder to PATH (a code-level alternative is sketched below).
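If changing system environment variables is not an option, a commonly used alternative is to set the hadoop.home.dir system property at the start of the driver, before any Hadoop class runs; Hadoop's Shell utility falls back to this property when HADOOP_HOME is not set. The path below is a placeholder for wherever you unpacked Hadoop:

// Assumed install location; point this at your own Hadoop directory.
System.setProperty("hadoop.home.dir", "D:/hadoop-2.6.0");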
Problem 2: Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
After solving Problem 1, running WordCount.java produces this error:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
Analysis:
hadoop.dll is missing from C:\Windows\System32; copying the file there fixes it.
If Windows is a 64-bit OS, copy hadoop.dll to C:\Windows\SysWOW64 instead.
Solution:
Put hadoop.dll from hadoop-common-2.2.0-bin-master/bin into C:\Windows\System32, then restart the machine. It may not be that simple, though; the same error can still come up.
Continuing the analysis (I did not hit this part myself):
The stack trace points at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557), so look at line 557 of the NativeIO class.
Windows.access is the method used to check whether the current process has the requested access rights for a given path, so to get past the check we can patch the source to grant access unconditionally. Download the matching Hadoop source archive (hadoop-2.6.0-src.tar.gz), unpack it, copy NativeIO.java from hadoop-2.6.0-src\hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\nativeio into the Eclipse project (keeping its package), and change line 557 to return true.
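A minimal sketch of that patch, assuming the hadoop-2.6.0 source layout (the method sits in the static inner class NativeIO.Windows; the exact line number varies between releases):

public static boolean access(String path, AccessRight desiredAccess)
    throws IOException {
    // Patched for local Windows debugging only: skip the native access0
    // check and report every path as accessible.
    return true;
    // Original body: return access0(path, desiredAccess.accessRight());
}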
Problem 3: org.apache.hadoop.security.AccessControlException: Permission denied: user=zhengcy, access=WRITE, inode="/user/root/output":root:supergroup:drwxr-xr-x
Running WordCount.java produces this error:
2014-12-18 16:03:24,092 WARN (org.apache.hadoop.mapred.LocalJobRunner:560) - job_local374172562_0001
org.apache.hadoop.security.AccessControlException: Permission denied: user=zhengcy, access=WRITE, inode="/user/root/output":root:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6512)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6494)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6446)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4248)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4218)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4191)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
Analysis:
The local user (zhengcy) has no write permission on the HDFS output directory, which is owned by root:supergroup with mode drwxr-xr-x.
Solution:
Grant the permissions: hadoop fs -chmod -R 777 /wordcount/output
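If you would rather fix the permissions from code, the HDFS Java API can do the same thing. A minimal sketch, assuming the NameNode runs on hadoop-server-03 (port 9000 is a placeholder; adjust to your cluster) and using the path from the command above; note that unlike -chmod -R, this touches only the one path:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class ChmodOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address; adjust to your cluster.
        conf.set("fs.defaultFS", "hdfs://hadoop-server-03:9000");
        FileSystem fs = FileSystem.get(conf);
        // Equivalent of: hadoop fs -chmod 777 /wordcount/output
        fs.setPermission(new Path("/wordcount/output"),
                new FsPermission((short) 0777));
        fs.close();
    }
}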
Run the job again, and the Hadoop program finally completes locally.