Setting up a Hadoop Environment on CentOS in VMware


Hadoop Environment Setup

Downloading Hadoop and the JDK: if you download them to Windows first, the files need to be moved into the virtual machine. Usually a simple drag-and-drop is enough. If drag-and-drop does not work, use a remote-connection tool to upload the files; MobaXterm is recommended — see this installation and usage guide: https://www.cnblogs.com/cainiao-chuanqi/p/11366726.html

After the upload, directories such as simple/soft need to be created by yourself (the command is mkdir xxx). There is no hard rule about the layout — place them wherever you like, as long as you can find the directories you created again later.
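The layout used in the rest of this article can be created up front. The exact paths are just this tutorial's convention, not anything Hadoop requires:

```shell
# Create the directory layout assumed by the rest of this tutorial
# (any location works; just keep your paths consistent everywhere).
mkdir -p /home/cai/simple/soft
ls -ld /home/cai/simple/soft
```

Whatever layout you choose, the same absolute paths must be used again later in /etc/profile and in the Hadoop XML configuration files.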

When setting the environment variables, you must point to the exact absolute paths of the directories where the JDK and Hadoop are stored.

The IP address used throughout this article is this virtual machine's IP address. Use your own VM's IP when following along (check it with ifconfig).

Installing and Configuring the JDK

First, the choice of JDK: JDK 1.8 is recommended, to avoid compatibility problems. The Hadoop installation pulls in many jar files (Hadoop itself is written in Java).

Download the JDK

Download it into a directory; /home/cai/simple/soft is used here. Copy the downloaded JDK into this directory (download the Linux build of the JDK). Tip: use cp or mv to move files.

 cd /home/cai/simple/soft

Extract the JDK

If what you downloaded is a compressed archive, it needs to be extracted first:

tar -zxvf /home/cai/simple/soft/jdk-8u181-linux-x64.tar.gz

Note: I moved the extracted files to the /home/cai/simple/ directory; keep this in mind for the environment-variable configuration later. tar can extract straight into that directory with the -C option:

tar -zxvf /home/cai/simple/soft/jdk-8u181-linux-x64.tar.gz -C /home/cai/simple/

Enter the JDK directory

cd /home/cai/simple/jdk1.8.0_181

(Since the extracted JDK was moved from /home/cai/simple/soft/ to /home/cai/simple/ above, that is where the jdk1.8.0_181 directory now lives, and it is the path used in the environment variables below. Adjust the path to wherever your own copy is, and confirm the archive extracted without errors.)

Configure the JDK environment variables

vim /etc/profile

#java environment
  export JAVA_HOME=/home/cai/simple/jdk1.8.0_181
  export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
  export PATH=$PATH:${JAVA_HOME}/bin

Reload the configuration file

After finishing the edit, run source /etc/profile to reload the configuration; the settings in the file only take effect after this.

source /etc/profile

Test whether the configuration succeeded

Run javac in any directory. If the shell reports "command not found", the configuration did not take effect; if the compiler's usage text is printed, it worked. java -version is another quick check.

javac

Installing and Configuring Hadoop

Download Hadoop

Download the file into /home/cai/simple/soft/ (choose any location you like). Tip: use cp or mv to move files into place.

cd /home/cai/simple/soft/ 

Extract Hadoop

tar -zxvf /home/cai/simple/soft/hadoop-2.7.1.tar.gz

Inspect Hadoop's etc directory

First check whether the extraction succeeded; if it did, enter the hadoop-2.7.1 directory.

List the files under the /home/cai/simple/soft/hadoop-2.7.1/etc/hadoop directory:

cd /home/cai/simple/soft/hadoop-2.7.1/etc/hadoop
ls

Edit the configuration files

Configure the hadoop-env.sh file under $HADOOP_HOME/etc/hadoop (JAVA_HOME is set explicitly here because the Hadoop daemons may not inherit it from your login shell):

vim hadoop-env.sh 

 

# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/home/cai/simple/jdk1.8.0_181

Configure the core-site.xml file under $HADOOP_HOME/etc/hadoop

vim core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<!-- HDFS NameNode address (fs.default.name is the deprecated
     alias of fs.defaultFS; only the current name is needed) -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://172.16.12.37:9000</value>
 </property>

 <property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
 </property>

 <property>
  <name>hadoop.tmp.dir</name>
  <value>/home/cai/simple/soft/hadoop-2.7.1/tmp</value>
  <description>A base for other temporary directories.</description>
 </property>


</configuration>

             

Configure the hdfs-site.xml file under $HADOOP_HOME/etc/hadoop

vim hdfs-site.xml

Note: if you cannot find the hdfs/ directory later, change the paths to look under /tmp/dfs for the name and data directories instead.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
   <name>dfs.namenode.name.dir</name>
   <value>/home/cai/simple/soft/hadoop-2.7.1/hdfs/name</value>
 </property>

 <property>
  <name>dfs.datanode.data.dir</name>
    <value>/home/cai/simple/soft/hadoop-2.7.1/hdfs/data</value>
  </property>


 <property>
  <name>dfs.replication</name>
  <value>1</value>
 </property>

 <property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
 </property>


</configuration>
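The dfs.namenode.name.dir and dfs.datanode.data.dir properties above point at an hdfs/ directory that does not exist yet. It can be created before formatting (paths as configured above; adjust them if you chose a different layout):

```shell
# Pre-create the NameNode and DataNode storage directories
# referenced in hdfs-site.xml.
mkdir -p /home/cai/simple/soft/hadoop-2.7.1/hdfs/name
mkdir -p /home/cai/simple/soft/hadoop-2.7.1/hdfs/data
```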

Configure the mapred-site.xml file under $HADOOP_HOME/etc/hadoop (in Hadoop 2.7.1 this file does not exist by default; copy it from the shipped template first):

cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
 </property>

 <property>
  <name>mapreduce.jobhistory.address</name>
  <value>172.16.12.37:10020</value>
 </property>

 <property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>172.16.12.37:19888</value>
 </property>

</configuration>

Configure the yarn-site.xml file under $HADOOP_HOME/etc/hadoop

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
  </property>

  <property>
   <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
   <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

 <property>
   <name>yarn.resourcemanager.address</name>
   <value>172.16.12.37:8032</value>
  </property>

  <property>
   <name>yarn.resourcemanager.scheduler.address</name>
   <value>172.16.12.37:8030</value>
  </property>

  <property>
   <name>yarn.resourcemanager.resource-tracker.address</name>
   <value>172.16.12.37:8035</value>
  </property>

 <property>
   <name>yarn.resourcemanager.admin.address</name>
   <value>172.16.12.37:8033</value>
  </property>

  <property>
   <name>yarn.resourcemanager.webapp.address</name>
   <value>172.16.12.37:8088</value>
  </property>

</configuration>

Configure the /etc/profile file

The complete file is shown below for context; only the Java and Hadoop export lines appended at the end are new — the rest is the stock CentOS profile.

vim /etc/profile

# /etc/profile

# System wide environment and startup programs, for login setup
# Functions and aliases go in /etc/bashrc

# It's NOT a good idea to change this file unless you know what you
# are doing. It's much better to create a custom.sh shell script in
# /etc/profile.d/ to make custom changes to your environment, as this
# will prevent the need for merging in future updates.

pathmunge () {
    case ":${PATH}:" in
        *:"$1":*)
            ;;
        *)
            if [ "$2" = "after" ] ; then
                PATH=$PATH:$1
            else
                PATH=$1:$PATH
            fi
    esac
}

if [ -x /usr/bin/id ]; then
    if [ -z "$EUID" ]; then
        # ksh workaround
        EUID=`id -u`
        UID=`id -ru`
    fi
    USER="`id -un`"
    LOGNAME=$USER
    MAIL="/var/spool/mail/$USER"
fi

# Path manipulation
if [ "$EUID" = "0" ]; then
    pathmunge /usr/sbin
    pathmunge /usr/local/sbin
else
    pathmunge /usr/local/sbin after
    pathmunge /usr/sbin after
fi
HOSTNAME=`/usr/bin/hostname 2>/dev/null`
HISTSIZE=1000
if [ "$HISTCONTROL" = "ignorespace" ] ; then
    export HISTCONTROL=ignoreboth
else
    export HISTCONTROL=ignoredups
fi

export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE HISTCONTROL

# By default, we want umask to get set. This sets it for login shell
# Current threshold for system reserved uid/gids is 200
# You could check uidgid reservation validity in
# /usr/share/doc/setup-*/uidgid file
if [ $UID -gt 199 ] && [ "`id -gn`" = "`id -un`" ]; then
    umask 002
else
    umask 022
fi

for i in /etc/profile.d/*.sh ; do
    if [ -r "$i" ]; then
        if [ "${-#*i}" != "$-" ]; then
            . "$i"
        else
            . "$i" >/dev/null
        fi
    fi
done

unset i
unset -f pathmunge

#java environment
  export JAVA_HOME=/home/cai/simple/jdk1.8.0_181
  export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
  export PATH=$PATH:${JAVA_HOME}/bin


  export HADOOP_HOME=/home/cai/simple/soft/hadoop-2.7.1
  export PATH=$PATH:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin

Reload the configuration file

To make the configuration take effect, run source /etc/profile:

source /etc/profile
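As a quick sanity check that the appended variables are what you expect, the exports can be replayed in a shell and the resulting PATH inspected (the paths are this tutorial's layout; adjust to your own):

```shell
# Replay the exports appended to /etc/profile and confirm that both
# Hadoop directories (bin and sbin) ended up on PATH.
export JAVA_HOME=/home/cai/simple/jdk1.8.0_181
export HADOOP_HOME=/home/cai/simple/soft/hadoop-2.7.1
export PATH=$PATH:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
echo "$PATH" | tr ':' '\n' | grep -c 'hadoop-2.7.1'   # prints 2 (bin and sbin)
```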

Format the NameNode

Format the NameNode by running hdfs namenode -format from any directory (hadoop namenode -format is the older form, deprecated in Hadoop 2.x). Format only once: re-formatting later generates a new cluster ID, after which existing DataNodes refuse to start until their data directories are cleared.

hdfs namenode -format
or
hadoop namenode -format
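One step this article does not spell out: start-dfs.sh logs in to each node (including localhost in a single-node setup) over SSH, so passwordless SSH is usually configured first. A minimal sketch, assuming an RSA key is acceptable and none exists yet:

```shell
# Generate a passwordless key only if one is not already present,
# then authorize it for logins back to this machine.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

After this, ssh localhost should log in without prompting for a password; if it still prompts, the startup scripts will ask for the password for every daemon they launch.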

Start the Hadoop cluster

Start the Hadoop daemons: first run start-dfs.sh to bring up HDFS.

start-dfs.sh

Start the YARN cluster

start-yarn.sh

Check the running daemons with jps

jps

On a successful single-node start, jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager (plus Jps itself). If one of them is missing, check the logs under $HADOOP_HOME/logs.

Test the web UIs

There are two ways to open a browser to test HDFS and YARN (Firefox is recommended): launch it from the command line, or double-click its icon.

firefox

Ports: 8088 and 50070

First, open http://172.16.12.37:50070/ in the browser (the HDFS web UI). The IP here is this VM's address; everyone's IP differs, so use your own VM's IP — the port is fixed.

Then open http://172.16.12.37:8088/ (the YARN ResourceManager UI, where MapReduce applications appear). Again, substitute your own VM's IP; the port is fixed.

