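Operating system: Ubuntu 12.04
1. $ sudo apt-get install ssh (note: you have to answer yes when prompted; this installs the OpenSSH server and one other package whose name I have forgotten)
2. The official documentation also asks for: $ sudo apt-get install rsync (it was already installed on this system)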
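Before moving on, it can help to confirm that both prerequisites are really there. The following is a minimal sketch: missing_cmds is a helper name made up for this post, and it only checks that the commands are on PATH, not that the SSH server is actually running.

```shell
# missing_cmds is a hypothetical helper: print each named command
# that cannot be found on PATH, one per line.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# ssh and rsync are the two tools that steps 1 and 2 install.
missing=$(missing_cmds ssh rsync)
if [ -n "$missing" ]; then
    echo "still missing: $missing"
else
    echo "ssh and rsync are both available"
fi
```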
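3 Install Java (this is how I did it)
$ chmod +x jdk-6u30-linux-i586.bin
$ ./jdk-6u30-linux-i586.bin
Find the extracted directory jdk1.6.0_30 and move it under /usr/java:
$ sudo mv jdk1.6.0_30 /usr/java (if /usr/java does not exist yet, create it first)
(At this point java -version will not print any version information. That does not affect Hadoop; if you want to set it up anyway, see my other post: http://www.cnblogs.com/xioyaozi/archive/2012/05/21/2511562.html)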
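If you do want java -version to work in the current shell, something like the following should be enough. It is only a sketch: it assumes the JDK ended up in /usr/java/jdk1.6.0_30 as in the mv command above, and the settings last only for the current session unless you add them to a startup file such as ~/.bashrc.

```shell
# Assumed install location, taken from the mv command above.
JAVA_HOME=/usr/java/jdk1.6.0_30
export JAVA_HOME

# Put the JDK's bin directory first so that `java` resolves to this JDK.
PATH="$JAVA_HOME/bin:$PATH"
export PATH

# Only call java if the binary is really where we assumed it to be.
if [ -x "$JAVA_HOME/bin/java" ]; then
    "$JAVA_HOME/bin/java" -version
else
    echo "java not found under $JAVA_HOME" >&2
fi
```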
4 Unpack the Hadoop package
$ tar -zxvf hadoop-1.0.3-bin.tar.gz
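5 After unpacking, go into the hadoop-1.0.3 directory and edit conf/hadoop-env.sh so that it contains
export JAVA_HOME=/usr/java/jdk1.6.0_30 (in shell scripts a leading # marks a comment, so this line must not start with #)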
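The same edit can be scripted instead of done by hand. This is a minimal sketch, not something Hadoop ships: set_java_home is a made-up helper that drops any existing JAVA_HOME export (commented out or not) and appends an active one at the end of the file.

```shell
# set_java_home is a hypothetical helper, not part of Hadoop.
# Usage: set_java_home <path to hadoop-env.sh> <JDK directory>
set_java_home() {
    env_file=$1
    java_home=$2
    # Keep every line except an existing JAVA_HOME export,
    # whether or not it is commented out with a leading #.
    grep -v '^[[:space:]]*#\{0,1\}[[:space:]]*export JAVA_HOME=' "$env_file" > "$env_file.tmp"
    # Append the active setting.
    echo "export JAVA_HOME=$java_home" >> "$env_file.tmp"
    # Write back in place so the original file keeps its permissions.
    cat "$env_file.tmp" > "$env_file"
    rm -f "$env_file.tmp"
}

# Intended to be run from inside the hadoop-1.0.3 directory.
if [ -f conf/hadoop-env.sh ]; then
    set_java_home conf/hadoop-env.sh /usr/java/jdk1.6.0_30
fi
```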
6 Inside the hadoop-1.0.3 directory, run $ bin/hadoop. If it runs successfully (it prints a usage message), you are all set.
Now you are ready to start your Hadoop cluster in one of the three supported modes:
- Local (Standalone) Mode
- Pseudo-Distributed Mode
- Fully-Distributed Mode
7 Standalone mode (the English text below is from the official documentation; it makes for a quick test run)
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
$ cat output/*
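To get a feel for what the example job computes, roughly the same result can be produced with ordinary command-line tools. This is only a local approximation and not how Hadoop implements the job; local_grep_count is a name invented here, and the -o and -h options assume GNU or BSD grep. As far as I can tell, the Hadoop example likewise lists each matched string with its count, most frequent first.

```shell
# local_grep_count is a hypothetical helper.
# Usage: local_grep_count <directory> <extended regex>
# Prints "count match" lines, most frequent match first.
local_grep_count() {
    # -o prints only the matched text, -h omits file names, -E enables +
    grep -ohE "$2" "$1"/* | sort | uniq -c | sort -rn
}

# Same input directory and pattern as the Hadoop command above.
if [ -d input ]; then
    local_grep_count input 'dfs[a-z.]+'
fi
```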
8 Pseudo-Distributed Mode
Hadoop can also be run on a single-node in a pseudo-distributed mode where each Hadoop daemon runs in a separate Java process.
Configuration
Use the following:
conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
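If you would rather not edit three files by hand, they can be generated. The sketch below is an assumption-laden convenience, not part of Hadoop: write_site_file and write_pseudo_conf are made-up names, and the function overwrites whatever is already in the three files, so back them up first if they contain anything you care about.

```shell
# write_site_file is a hypothetical helper.
# $1 = file to write, $2 = property name, $3 = property value
write_site_file() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# write_pseudo_conf is a hypothetical helper.
# Usage: write_pseudo_conf <conf directory>
# Writes the three single-property files listed above.
write_pseudo_conf() {
    write_site_file "$1/core-site.xml"   fs.default.name    hdfs://localhost:9000
    write_site_file "$1/hdfs-site.xml"   dfs.replication    1
    write_site_file "$1/mapred-site.xml" mapred.job.tracker localhost:9001
}

# Example, from inside the hadoop-1.0.3 directory:
#   write_pseudo_conf conf
```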
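9 Set up passwordless SSH login
$ ssh localhost
This asks for a password, so some configuration is needed first:
$ ssh-keygen -t dsa
Just press Enter at each prompt. A .ssh directory is created automatically in your home directory. It is hidden (the name starts with a dot), so a plain ls does not show it, but that does not matter.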
$ cd .ssh
****/.ssh$ cp id_dsa.pub authorized_keys
After that, $ ssh localhost should log you in without asking for a password.
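One caveat: copying id_dsa.pub over authorized_keys replaces any keys that were already authorized. A slightly more careful variant appends the key instead and tightens the permissions, which sshd is usually strict about. This is a sketch that assumes the default key path ~/.ssh/id_dsa.pub; authorize_key is a name made up for this post.

```shell
# authorize_key is a hypothetical helper.
# Usage: authorize_key <public key file> <ssh directory>
authorize_key() {
    pub=$1
    ssh_dir=$2
    auth="$ssh_dir/authorized_keys"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth"
    # Append the key only if that exact line is not already present.
    grep -qxFf "$pub" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}

# Assumes the key was generated at the default location.
if [ -f "$HOME/.ssh/id_dsa.pub" ]; then
    authorize_key "$HOME/.ssh/id_dsa.pub" "$HOME/.ssh"
fi
```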
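10 I have not configured fully distributed mode yet. OK, that's it for now; I am a beginner myself.
Once the pseudo-distributed setup is in place, you can try the wordcount experiment; see the next post: http://www.cnblogs.com/xioyaozi/archive/2012/05/28/2521161.html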
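For that experiment the daemons have to be running. According to the official single-node guide, that means formatting HDFS once with bin/hadoop namenode -format and then starting everything with bin/start-all.sh. A minimal sketch, assuming it is pointed at the hadoop-1.0.3 directory; start_pseudo_cluster is a made-up wrapper name, and formatting is left optional because it should only be done on a fresh install.

```shell
# start_pseudo_cluster is a hypothetical wrapper.
# Usage: start_pseudo_cluster <hadoop directory> [format]
start_pseudo_cluster() {
    hadoop_dir=$1
    if [ ! -x "$hadoop_dir/bin/hadoop" ]; then
        echo "no bin/hadoop under $hadoop_dir" >&2
        return 1
    fi
    # Formatting initialises the HDFS metadata; do it once, on a fresh install.
    if [ "${2:-}" = "format" ]; then
        "$hadoop_dir/bin/hadoop" namenode -format || return 1
    fi
    # start-all.sh launches the HDFS and MapReduce daemons (it uses ssh).
    "$hadoop_dir/bin/start-all.sh"
}

# Example, first run from inside the hadoop-1.0.3 directory:
#   start_pseudo_cluster . format
```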