Setting up a Hadoop 1.2 cluster

Reposted article, 2013-06-21 14:36. Category: Cloud / Cloud computing

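Preparing the environment

I am using VMware Workstation. First install Ubuntu 12.04; once that is done, use VMware's clone feature to create two more virtual machines. The three machines are assigned these IP addresses: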

192.168.74.130 master
192.168.74.132 node1
192.168.74.133 node2

Then edit /etc/hosts on every host.

Use vi or gedit to add the three entries above to the file.
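The /etc/hosts edit can also be scripted instead of done by hand. The following is only a minimal sketch and not part of the original write-up: the helper name add_host_entry is made up for illustration, and it is demonstrated on a scratch file rather than on the real /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): append "IP NAME"
# to a hosts-style file unless a line with exactly that content exists.
add_host_entry() {
    file="$1"; ip="$2"; name="$3"
    grep -qxF "$ip $name" "$file" 2>/dev/null || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demonstrate on a scratch file; point HOSTS_FILE at /etc/hosts (as root)
# only after checking the result.
HOSTS_FILE="$(mktemp)"
add_host_entry "$HOSTS_FILE" 192.168.74.130 master
add_host_entry "$HOSTS_FILE" 192.168.74.132 node1
add_host_entry "$HOSTS_FILE" 192.168.74.133 node2
add_host_entry "$HOSTS_FILE" 192.168.74.130 master   # repeated call adds nothing
cat "$HOSTS_FILE"
```

Because the helper skips entries that are already present, it can be run again on each of the three machines without creating duplicates.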

Creating the user

First create the hadoop group:

sudo addgroup hadoop

Then create the hadoop user:

sudo adduser --ingroup hadoop hadoop

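Note: on CentOS and Red Hat it is enough to create the user directly; the matching group and related files are generated automatically. On Ubuntu, a user created directly (for example with useradd and no -m option) gets no home directory.

To give the hadoop user sudo privileges, open /etc/sudoers: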

sudo gedit /etc/sudoers

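Pressing Enter opens /etc/sudoers. Give the hadoop user the same privileges as root. (Editing with sudo visudo is the safer choice, because it checks the syntax before saving.)

Below the line root ALL=(ALL:ALL) ALL, add the following line: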

hadoop  ALL=(ALL:ALL) ALL

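Installing the JDK on the local machine (master) and the child nodes (node..)

There are plenty of guides for this online, for example http://blog.csdn.net/klov001/article/details/8075237, so it is not covered in detail here.

Changing the hostname on the local machine (master) and the child nodes (node..)

Open /etc/hostname: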

sudo gedit /etc/hostname

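Set it to master, node1 and node2 on the respective machines.

Installing the ssh service on the local machine (master) and the child nodes (node..)

This is mainly needed on Ubuntu; CentOS and Red Hat ship with it.

On Ubuntu: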

sudo apt-get install ssh openssh-server

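Setting up passwordless ssh login

Before this step, I recommend switching to the hadoop user on every machine so that permission problems do not get in the way.

ssh keys can be generated as either rsa or dsa; rsa is the default.

Create the ssh key, here using rsa: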

ssh-keygen -t rsa -P ""

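(Note: after pressing Enter, two files are generated under ~/.ssh/: id_rsa and id_rsa.pub. They always come as a pair.)

Go to ~/.ssh/ and append id_rsa.pub to the authorized_keys file. The authorized_keys file does not exist at first: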

cd ~/.ssh
cat id_rsa.pub >> authorized_keys

Run ssh followed by a hostname to test whether it works.
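The commands above only authorize the key on the machine where it was generated. For master to log in to node1 and node2 without a password, master's public key also has to end up in ~/.ssh/authorized_keys on each node. The original does not show that step, so what follows is a sketch under that assumption. The helper name authorize_key is made up, the key line is a placeholder, and everything runs in a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: add a public key to an authorized_keys file once and
# tighten permissions (700 on the directory, 600 on the file), which is the
# usual recommendation when sshd runs with StrictModes.
authorize_key() {
    pub_key_file="$1"; auth_file="$2"
    auth_dir="$(dirname "$auth_file")"
    mkdir -p "$auth_dir" && chmod 700 "$auth_dir"
    touch "$auth_file"
    grep -qxF "$(cat "$pub_key_file")" "$auth_file" || cat "$pub_key_file" >> "$auth_file"
    chmod 600 "$auth_file"
}

# Scratch demonstration with a placeholder key line (not a real key).
DEMO_DIR="$(mktemp -d)"
echo "ssh-rsa AAAAexamplekey hadoop@master" > "$DEMO_DIR/id_rsa.pub"
authorize_key "$DEMO_DIR/id_rsa.pub" "$DEMO_DIR/.ssh/authorized_keys"
authorize_key "$DEMO_DIR/id_rsa.pub" "$DEMO_DIR/.ssh/authorized_keys"  # no duplicate
cat "$DEMO_DIR/.ssh/authorized_keys"

# On the real cluster, run as the hadoop user on master (asks for the
# password once per node):
#   ssh-copy-id hadoop@node1
#   ssh-copy-id hadoop@node2
```

After the key has been copied, ssh node1 and ssh node2 from master should no longer ask for a password.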

Installing Hadoop on master

As the hadoop user, create a hadoop folder and upload hadoop-1.2.0.tar.gz into it.

tar -zxvf hadoop-1.2.0.tar.gz

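This extracts the archive. Then find hadoop-env.sh in the conf directory under the Hadoop directory.

Set JAVA_HOME there to the same JAVA_HOME you configured when installing the JDK.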
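Setting JAVA_HOME in hadoop-env.sh can be done in an editor or with a small script. The sketch below is an illustration only: set_java_home is a made-up helper, the JDK path is an assumed example rather than the one used in the original setup, and the demonstration runs on a stand-in file instead of the real conf/hadoop-env.sh.

```shell
#!/bin/sh
# Hypothetical helper: set JAVA_HOME in a hadoop-env.sh style file, reusing
# an existing (possibly commented-out) export line if there is one.
# Relies on GNU sed's in-place option (-i), as found on Ubuntu.
set_java_home() {
    env_file="$1"; java_home="$2"
    if grep -q '^[# ]*export JAVA_HOME=' "$env_file"; then
        # Rewrite the existing line, whether commented out or not.
        sed -i "s|^[# ]*export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file"
    else
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Stand-in for conf/hadoop-env.sh with a commented-out JAVA_HOME line.
ENV_FILE="$(mktemp)"
printf '%s\n' '# The java implementation to use.' \
              '# export JAVA_HOME=/usr/lib/j2sdk1.5-sun' > "$ENV_FILE"
set_java_home "$ENV_FILE" /usr/lib/jvm/java-7-openjdk-amd64   # assumed JDK path
cat "$ENV_FILE"
```

To apply it for real, pass the path to conf/hadoop-env.sh and the JDK directory that is actually installed on the machine.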

Find core-site.xml and configure it as follows:

<configuration>
   <property>
     <name>hadoop.tmp.dir</name>
     <value>/home/hadoop/tmp/hadoop-${user.name}</value>
     <description>A base for other temporary directories.</description>
   </property>

   <property>
     <name>fs.default.name</name>
     <value>hdfs://master:9000</value>
     <description>The name of the default file system.  A URI whose
     scheme and authority determine the FileSystem implementation.  The
     uri's scheme determines the config property (fs.SCHEME.impl) naming
     the FileSystem implementation class.  The uri's authority is used to
     determine the host, port, etc. for a filesystem.
     </description>
   </property>
</configuration>

Edit hdfs-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
   <property>
     <name>dfs.replication</name>
     <value>2</value>
     <description>Default block replication.
     The actual number of replications can be specified when the file is created.
     The default is used if replication is not specified in create time.
     </description>
   </property>
</configuration>

Edit mapred-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
   <property>
     <name>mapred.job.tracker</name>
     <value>master:9001</value>
     <description>The host and port that the MapReduce job tracker runs
     at.  If "local", then jobs are run in-process as a single map
     and reduce task.
     </description>
   </property>
</configuration>

Edit masters:

master

Edit slaves:

node1
node2
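The steps so far configure Hadoop on master only. node1 and node2 need the same Hadoop directory and configuration; the original does not say how it got there, so the following is just one possible approach, sketched under assumptions. The helper sync_to_slaves is made up, the install path /home/hadoop/hadoop/hadoop-1.2.0 is inferred from the steps above rather than stated in the original, and by default the script only prints the rsync commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: for every host listed in a slaves file, print (or,
# with DRY_RUN=0, run) an rsync that mirrors the Hadoop directory to it.
# Assumes paths without spaces and passwordless ssh for the hadoop user.
sync_to_slaves() {
    slaves_file="$1"; hadoop_dir="$2"
    while read -r host; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        cmd="rsync -az $hadoop_dir/ hadoop@$host:$hadoop_dir/"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # Keep rsync/ssh from swallowing the rest of the slaves file.
            $cmd < /dev/null
        fi
    done < "$slaves_file"
}

# Dry run against a scratch copy of the slaves file shown above.
SLAVES_FILE="$(mktemp)"
printf 'node1\nnode2\n' > "$SLAVES_FILE"
PLAN="$(sync_to_slaves "$SLAVES_FILE" /home/hadoop/hadoop/hadoop-1.2.0)"
echo "$PLAN"
```

Printing the commands first makes it easy to check the target paths before anything is copied; setting DRY_RUN=0 runs them.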

Starting Hadoop

On the master host, go to the bin directory under the Hadoop installation directory and format the NameNode:

./hadoop namenode -format

Under normal conditions the command reports that formatting succeeded.

Start all nodes:

./start-all.sh

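The daemons start in sequence. Once startup has finished, run jps on master and on both nodes to check the running processes.

With this configuration, master should list NameNode, SecondaryNameNode and JobTracker.

node1 and node2 should each list DataNode and TaskTracker.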
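Checking jps output by eye on three machines gets tedious, so here is a small sketch that does the comparison. It assumes the usual Hadoop 1.x daemon layout for the configuration above (NameNode, SecondaryNameNode and JobTracker on master; DataNode and TaskTracker on the nodes). The helper missing_daemons is made up, and the sample input with its process IDs is invented purely for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: given captured jps output and a list of daemon names,
# print every name that does not appear as a process name in that output.
missing_daemons() {
    jps_output="$1"; shift
    for daemon in "$@"; do
        # jps prints "<pid> <name>" per line; compare the name field exactly.
        echo "$jps_output" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$daemon"
    done
}

# Invented sample in which JobTracker is absent.
SAMPLE="$(printf '%s\n' '2101 NameNode' '2385 SecondaryNameNode' '2702 Jps')"
MISSING="$(missing_daemons "$SAMPLE" NameNode SecondaryNameNode JobTracker)"
echo "missing on master: ${MISSING:-none}"

# On a live machine, pass the real output instead:
#   missing_daemons "$(jps)" DataNode TaskTracker
```

Comparing the name field exactly keeps SecondaryNameNode from being mistaken for NameNode.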

Along the way I ran into a DataNode that would not start. The Hadoop log on node1 showed this error:

org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.

It turned out to be a permissions problem, so I ran:

sudo chmod 755 <your configured data directory>

That solved the problem.
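As far as I know, the Hadoop 1.x DataNode compares the permissions of each data directory with dfs.datanode.data.dir.perm (755 by default) and rejects directories that do not match, which would explain the error. With the core-site.xml above and no dfs.data.dir override, the data directory should default to ${hadoop.tmp.dir}/dfs/data. The sketch below checks and corrects the mode; ensure_mode_755 is a made-up helper and the demonstration uses a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: read a directory's permission bits and set them to
# 755 if they differ. Uses GNU stat (-c %a), as found on Ubuntu.
ensure_mode_755() {
    dir="$1"
    mode="$(stat -c %a "$dir")"
    if [ "$mode" != "755" ]; then
        echo "$dir has mode $mode, changing to 755"
        chmod 755 "$dir"
    fi
}

# Scratch directory deliberately created group-writable (775).
DATA_DIR="$(mktemp -d)"
chmod 775 "$DATA_DIR"
ensure_mode_755 "$DATA_DIR"
stat -c %a "$DATA_DIR"
```

On a real node, pass the configured data directory and prefix the call with sudo only if the directory is not owned by the hadoop user.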

Running an example

Create a file named a in the home directory (~) and put some arbitrary text into it.

Then create a directory on HDFS:

./hadoop dfs -mkdir test1

Upload the file a you just created to test1:

./hadoop dfs -put ~/a test1

Then view the contents of the file:

./hadoop dfs -cat test1/a

The command prints the text that was written to a.
