Exception When Copying Local Files to HDFS in Local-Mode Testing


The project needed to copy local files to HDFS. Being lazy, I fell back on what I know best and did it in Java via Hadoop's FileSystem#copyFromLocalFile. Running in local mode on my machine (Windows 7), however, I hit the following exception:

An exception or error caused a run to abort: org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Ljava/lang/String;JJJI)Ljava/io/FileDescriptor; 
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Ljava/lang/String;JJJI)Ljava/io/FileDescriptor;
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileOutputStreamWithMode(NativeIO.java:559)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:219)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
    at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:295)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:388)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:451)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:430)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:920)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:901)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:798)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:368)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:341)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
    at org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:82)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1882)
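For context, the failing call was roughly of the following shape (the class name and file paths here are illustrative, not the project's actual code):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal sketch of the copy that triggers the exception above.
public class CopyToHdfs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // With a default Configuration in local mode, fs.defaultFS resolves
        // to the local file system, which is why RawLocalFileSystem and
        // ChecksumFileSystem appear in the stack trace.
        FileSystem fs = FileSystem.get(conf);
        fs.copyFromLocalFile(new Path("C:/data/input.txt"),
                             new Path("/user/test/input.txt"));
        fs.close();
    }
}
```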

From the stack trace we can see that the exception was thrown by org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0, which is declared as follows:
    /** Wrapper around CreateFile() with security descriptor on Windows */
    private static native FileDescriptor createFileWithMode0(String path,
        long desiredAccess, long shareMode, long creationDisposition, int mode)
        throws NativeIOException;

As the declaration shows, createFileWithMode0 is a native method: it has no Java implementation and can only work if the native hadoop library has been loaded. So why was it called at all? Tracing further up the stack, the call into NativeIO$Windows is made in the constructor org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>:

 1  private LocalFSFileOutputStream(Path f, boolean append,
 2         FsPermission permission) throws IOException {
 3       File file = pathToFile(f);
 4       if (permission == null) {
 5         this.fos = new FileOutputStream(file, append);
 6       } else {
 7         if (Shell.WINDOWS && NativeIO.isAvailable()) {
 8           this.fos = NativeIO.Windows.createFileOutputStreamWithMode(file,
 9               append, permission.toShort());
10         } else {
11           this.fos = new FileOutputStream(file, append);
12           boolean success = false;
13           try {
14             setPermission(f, permission);
15             success = true;
16           } finally {
17             if (!success) {
18               IOUtils.cleanup(LOG, this.fos);
19             }
20           }
21         }
22       }
23     }

The call stack shows that it is line 8 above that invokes NativeIO.Windows, so the if condition must have evaluated to true. NativeIO.isAvailable is implemented as follows:

  /**
   * Return true if the JNI-based native IO extensions are available.
   */
  public static boolean isAvailable() {
    return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
  }

isAvailable essentially delegates to NativeCodeLoader.isNativeCodeLoaded:

  static {
    // Try to load native hadoop library and set fallback flag appropriately
    if(LOG.isDebugEnabled()) {
      LOG.debug("Trying to load the custom-built native-hadoop library...");
    }
    try {
      System.loadLibrary("hadoop");
      LOG.debug("Loaded the native-hadoop library");
      nativeCodeLoaded = true;
    } catch (Throwable t) {
      // Ignore failure to load
      if(LOG.isDebugEnabled()) {
        LOG.debug("Failed to load native-hadoop with error: " + t);
        LOG.debug("java.library.path=" +
            System.getProperty("java.library.path"));
      }
    }

    if (!nativeCodeLoaded) {
      LOG.warn("Unable to load native-hadoop library for your platform... " +
               "using builtin-java classes where applicable");
    }
  }

  /**
   * Check if native-hadoop code is loaded for this platform.
   * 
   * @return <code>true</code> if native-hadoop is loaded, 
   *         else <code>false</code>
   */
  public static boolean isNativeCodeLoaded() {
    return nativeCodeLoaded;
  }

As we can see, isNativeCodeLoaded simply returns a flag set elsewhere. So where does the problem actually come from?

The flag is set in the static initializer of NativeCodeLoader, which calls System.loadLibrary("hadoop"). Could that be the culprit? Debugging on a colleague's machine, System.loadLibrary("hadoop") throws, so the catch block runs and nativeCodeLoaded stays false; on my machine the call succeeds and execution continues with nativeCodeLoaded set to true. What does System.loadLibrary actually do? It loads a native library by searching the directories on java.library.path, which on Windows is built from the system directories and the PATH environment variable. On my machine the load succeeded either because there was a hadoop.dll in C:\Windows\System32 or because %HADOOP_HOME%\bin was on PATH.
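The machine-to-machine difference can be reproduced with a stripped-down version of the NativeCodeLoader pattern, using only the JDK (the class name LoadProbe is my own, not Hadoop's). On a machine with no hadoop library on java.library.path the catch branch runs; a stray hadoop.dll on PATH makes the load succeed.

```java
// Minimal reproduction of the NativeCodeLoader static-initializer pattern.
public class LoadProbe {
    private static boolean loaded = false;

    static {
        try {
            // Succeeds only if hadoop.dll / libhadoop.so is on java.library.path.
            System.loadLibrary("hadoop");
            loaded = true;
        } catch (UnsatisfiedLinkError e) {
            // The branch most machines take when no native hadoop is installed.
            System.out.println("native hadoop not found; java.library.path=" +
                    System.getProperty("java.library.path"));
        }
    }

    public static boolean isLoaded() { return loaded; }

    public static void main(String[] args) {
        System.out.println("native hadoop loaded: " + isLoaded());
    }
}
```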

In short: because some directory on PATH contained hadoop.dll, System.loadLibrary("hadoop") succeeded, NativeIO reported itself available, and Hadoop took the native Windows code path; the native call then failed with UnsatisfiedLinkError because that DLL does not export createFileWithMode0 (typically a version mismatch between the stray DLL and the Hadoop jars in use). The fix is equally simple: go through every directory on the PATH environment variable and make sure none of them contains hadoop.dll.
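To audit the directories quickly, a small JDK-only helper can list every PATH (and java.library.path) entry that contains a hadoop.dll. PathScanner and dirsContaining are names of my own for this sketch, not part of any library:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Scans a PATH-style string for directories containing a given file name.
public class PathScanner {

    // Returns every directory in pathValue that contains fileName.
    public static List<String> dirsContaining(String pathValue, String fileName) {
        List<String> hits = new ArrayList<>();
        if (pathValue == null) return hits;
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) continue;
            if (new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Check both the OS PATH and the JVM's java.library.path, since
        // System.loadLibrary consults the latter (seeded from the former
        // and the system directories on Windows).
        String[] sources = { System.getenv("PATH"),
                             System.getProperty("java.library.path") };
        for (String source : sources) {
            for (String dir : dirsContaining(source, "hadoop.dll")) {
                System.out.println("hadoop.dll found in: " + dir);
            }
        }
    }
}
```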

Note that after removing a directory from PATH you must restart IntelliJ IDEA: the ClassLoader fields usr_paths and sys_paths are initialized once at JVM startup, so the change only takes effect in a fresh JVM.

 

