Setting Up a Hadoop Development Environment with Eclipse


    This document comes from my Evernote notes. When I was still at Baidu, I had a complete Hadoop development/debug environment, but at that time I had grown tired of writing blog posts. It cost me two days of spare time to recover to where I had stopped; I hope the blog will keep going from here. I still cherish the time spent there, because when you do the same thing at a different time and a different place (company), the things are still there, but the people are no longer the same. Enough talk; let's get on with it.

  In the earlier article on Hadoop cluster setup, a Hadoop cluster for development and testing was already built. This article describes how to use Eclipse as the development environment to write and test programs against that cluster.

  1.) Download the hadoop-eclipse-plugin-1.0.3.jar Eclipse plugin from http://download.csdn.net/detail/uestczhangchao/8409179. This article uses Eclipse Java EE IDE for Web Developers, Version: Luna Release (4.4.0) as the IDE. Put the downloaded hadoop-eclipse-plugin-1.0.3.jar into Eclipse's plugins directory (for MyEclipse, put it in D:\program_files\MyEclipse\MyEclipse 10\dropins\svn\plugins instead).


2.) In Eclipse's Window -> Preferences, select Hadoop Map/Reduce and set the Hadoop installation directory. Here I simply copied /home/hadoop/hadoop-1.0.3 over from the Linux machine. Click the OK button.


3.) Create a new Map/Reduce Project.


4.) After the Map/Reduce Project is created, two things appear: a DFS Locations node and a Java project named suse. The dependencies on the Hadoop jars are added to the Java project automatically.


5.) Projects created with this plugin have a dedicated Map/Reduce perspective and views associated with them.


6.) In Map/Reduce Locations, choose Edit Hadoop Location… and configure the Map/Reduce Master and DFS Master settings; these must match the cluster's own configuration, as sketched below.

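For reference, the Map/Reduce Master host and port set here must match the mapred.job.tracker property in the cluster's mapred-site.xml, and the DFS Master host and port must match fs.default.name in core-site.xml (these are the Hadoop 1.x property names). A sketch with illustrative values, assuming the cluster host is named suse and the commonly used 9000/9001 ports; your cluster's values may differ:

    <!-- core-site.xml on the cluster: DFS Master -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://suse:9000</value>
    </property>

    <!-- mapred-site.xml on the cluster: Map/Reduce Master -->
    <property>
      <name>mapred.job.tracker</name>
      <value>suse:9001</value>
    </property>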

7.) In Advanced parameters, set the Hadoop configuration options: set dfs.data.dir to the same value as in the Linux environment, and in general set every path-related parameter in Advanced parameters to the corresponding Linux path (see the sketch below).

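As an illustration, the value entered for dfs.data.dir should mirror whatever the cluster's hdfs-site.xml actually contains; the path below is hypothetical:

    <!-- hdfs-site.xml on the cluster; copy this value into Advanced parameters -->
    <property>
      <name>dfs.data.dir</name>
      <value>/home/hadoop/hadoop-1.0.3/data</value>
    </property>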

8.) Once the cluster-related settings are in place, the files on the Hadoop cluster show up under DFS Locations, where you can add and delete them; the equivalent command-line operations are sketched below.

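The same file operations can also be done from the command line on the cluster. For example, to prepare an input file for the WordCount test in the following steps (the local and HDFS paths are illustrative):

    # upload a local text file to HDFS as WordCount input
    hadoop fs -mkdir /user/hadoop/input
    hadoop fs -put /home/hadoop/test.txt /user/hadoop/input
    hadoop fs -ls /user/hadoop/input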

9.) Add a Map/Reduce program to the generated Java project; here I added a WordCount program as a test (a sketch of the class follows).

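The class below is a minimal sketch following the classic WordCount example shipped with Hadoop (org.apache.hadoop.examples.WordCount, the same class the stack trace in step 15 points at), using the mapreduce API available in Hadoop 1.0.3:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emit (word, 1) for every token in each input line.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reducer (also used as combiner): sum the counts for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count"); // Job.getInstance(conf) in newer APIs
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }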

10.) In the Java project's Run Configurations, set the Arguments for WordCount: the first argument is the input file's path on HDFS, the second is the HDFS output path (an illustrative example follows).

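With the illustrative paths used above, the Program arguments field might contain (the hostname and paths are hypothetical and must match your cluster):

    hdfs://suse:9000/user/hadoop/input hdfs://suse:9000/user/hadoop/result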

11.) After WordCount's Run Configuration is set, choose Run As -> Run on Hadoop.


12.) The Console shows WordCount's run log output.


13.) In DFS Locations you can see the result files WordCount generated under the result directory (they can also be inspected from the command line, as shown below).

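The result can also be dumped from the command line; with the mapreduce API, reducer output files are named part-r-00000, part-r-00001, and so on (the path is illustrative):

    hadoop fs -cat /user/hadoop/result/part-r-00000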

14.) To debug the WordCount program, set breakpoints in WordCount.java and click the Debug button; you can then step through the program.


At this point, the Hadoop + Eclipse development environment setup is complete.

15.) Troubleshooting. The thorniest problem encountered while setting up the environment is the one below: Hadoop complains that the Windows user has no permission to set up the staging directory. Handling this exception is covered in my article on modifying Hadoop's FileUtil.java to bypass the permission check; it requires modifying the Hadoop source code and recompiling (a sketch of the patch follows the stack trace):

15/01/30 10:08:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/30 10:08:17 ERROR security.UserGroupInformation: PriviledgedActionException as:zhangchao3 cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-zhangchao3\mapred\staging\zhangchao3502228304\.staging to 0700
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-zhangchao3\mapred\staging\zhangchao3502228304\.staging to 0700
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:68)
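For reference, the workaround commonly used (and described in that article) is to relax the check in org.apache.hadoop.fs.FileUtil.checkReturnValue, the method at FileUtil.java:689 in the stack trace, so that it logs a warning instead of throwing, then rebuild hadoop-core. Below is a sketch of the patched method based on the Hadoop 1.0.x source; treat it as a local development workaround only, not something to run on a real cluster:

    // org/apache/hadoop/fs/FileUtil.java (Hadoop 1.0.x)
    private static void checkReturnValue(boolean rv, File p,
                                         FsPermission permission
                                         ) throws IOException {
      if (!rv) {
        // The original code throws an IOException here, which aborts job
        // submission on Windows because the permission bits cannot be set.
        // Log a warning and continue instead.
        LOG.warn("Failed to set permissions of path: " + p +
                 " to " + String.format("%04o", permission.toShort()));
      }
    }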
