Hadoop 3.2.2 Cluster: Basic Installation and Configuration


Big Data: Installing Hadoop

_(Configuring Hadoop 3.2.2 on CentOS 7)_

1. Install CentOS 7

(Jump to the Advanced Configuration section below if you need remotely submitted JARs to run.)

2. Download and Extract the JDK and Hadoop

(Do this before cloning the VM.)

  1. Create the installation directory

    cd /
    mkdir software

  2. Download

    wget https://downloads.apache.org/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
    
    wget https://repo.huaweicloud.com/java/jdk/8u202-b08/jdk-8u202-linux-x64.tar.gz
    
  3. Extract

    tar -zxvf hadoop-3.2.2.tar.gz -C /software/
    tar -zxvf jdk-8u202-linux-x64.tar.gz -C /software/
    

3. Configure Environment Variables

  1. System-wide environment variables

    vi /etc/profile
    
    export JAVA_HOME=/software/jdk1.8.0_202
    export PATH=$JAVA_HOME/bin:$PATH
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export JAVA_HOME PATH CLASSPATH
    
    export HADOOP_HOME=/software/hadoop-3.2.2
    export PATH=$PATH:$HADOOP_HOME/bin
    export PATH=$PATH:$HADOOP_HOME/sbin
    
  2. Apply the configuration

    source /etc/profile
    
  3. Alternatively, use per-user environment variables

    vi ~/.bash_profile
    
    export JAVA_HOME=/software/jdk1.8.0_202
    export JAVA_BIN=$JAVA_HOME/bin
    export JAVA_LIB=$JAVA_HOME/lib
    export CLASSPATH=.:$JAVA_LIB/tools.jar:$JAVA_LIB/dt.jar
        
    export HADOOP_HOME=/software/hadoop-3.2.2
    
    PATH=$PATH:$JAVA_BIN:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    export PATH
    
  4. Apply the user environment variables

    source ~/.bash_profile
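
After sourcing either profile, both tools should be reachable from any directory. A quick sanity check (a sketch that only assumes the paths configured above):

```shell
# Prints the Java and Hadoop versions if the PATH entries took effect;
# otherwise reports which tool is missing (re-run `source` or log in again).
command -v java >/dev/null && java -version 2>&1 | head -n 1 || echo "java not on PATH"
command -v hadoop >/dev/null && hadoop version | head -n 1 || echo "hadoop not on PATH"
```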
    

4. Set a Static IP

(Recommended on every machine; the procedure is the same on each.)

  1. Check the network interfaces with ifconfig

    ifconfig
    
  2. Edit the interface configuration file

    vi /etc/sysconfig/network-scripts/ifcfg-ens33
    

    Change:

    BOOTPROTO="static"    (change the existing value to static)


    Add:

    IPADDR=192.168.158.137    (choose your own address)
    (the gateway address depends on your virtual network adapter settings)
    GATEWAY=192.168.158.2
    DNS1=192.168.158.2
    
  3. Restart the network service

    service network restart
    
  4. Disable the firewall

     systemctl disable firewalld
    
  5. After a reboot, check the firewall status

    systemctl status firewalld
    

5. Set the Hostname and Hosts File

(Do this on every machine.)

  1. Change the hostname

    vi /etc/hostname
    

Delete the existing contents and enter your chosen name; mine is node0 (use a different name on each host).

  2. Add the other nodes

    vi /etc/hosts
    
     192.168.158.137 node0
     192.168.158.138 node1
     192.168.158.139 node2
     192.168.158.140 node3
    

6. Passwordless SSH Login

(Do this on every host.)

  1. Generate and distribute the keys with a script

    Create a file named 1.sh in your home directory to hold the commands below:

    cd ~
    
    vi 1.sh
    
  2. Paste in the following

    ssh-keygen -t rsa
    ssh-copy-id -i ~/.ssh/id_rsa.pub node0
    ssh-copy-id -i ~/.ssh/id_rsa.pub node1
    ssh-copy-id -i ~/.ssh/id_rsa.pub node2
    ssh-copy-id -i ~/.ssh/id_rsa.pub node3
    
  3. Run 1.sh

    cd ~
    
    bash 1.sh
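
After the script finishes, passwordless login can be sanity-checked with a short loop (assuming the node0–node3 entries added to /etc/hosts earlier):

```shell
# Each ssh call should print the remote hostname without prompting for a
# password; BatchMode makes a misconfigured key fail fast instead of prompting.
for h in node0 node1 node2 node3; do
  ssh -o BatchMode=yes "$h" hostname || echo "WARN: passwordless login to $h failed"
done
```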
    

7. Create the Required Directories

  1. Create a logs directory

    (Optional: Hadoop creates it automatically at runtime. If you create it yourself, do so on every machine.)

    Put the logs directory wherever you prefer.

  2. Create the directories referenced by the configuration files

    (Again, every host needs the same directories.)

    cd /
    mkdir data
    cd data
    mkdir hadoop
    cd hadoop
    mkdir hdfs tmp
    cd hdfs
    mkdir name data
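
The eight commands above collapse into one idempotent call with `mkdir -p`, which creates any missing parent directories (BASE is just a convenience variable, not part of the original steps):

```shell
# Create the whole /data/hadoop tree in one go; safe to re-run.
BASE=${BASE:-/data/hadoop}
mkdir -p "$BASE/hdfs/name" "$BASE/hdfs/data" "$BASE/tmp"
```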
    

8. Edit the Hadoop Configuration Files

All files below are in the /software/hadoop-3.2.2/etc/hadoop directory.

  1. hadoop-env.sh

    export JAVA_HOME=/software/jdk1.8.0_202
    export HADOOP_HOME=/software/hadoop-3.2.2
    export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    
    export HDFS_NAMENODE_USER=root
    export HDFS_DATANODE_USER=root
    export HDFS_SECONDARYNAMENODE_USER=root
    export YARN_RESOURCEMANAGER_USER=root
    export YARN_NODEMANAGER_USER=root
    
  2. core-site.xml

    <configuration>
            <property>
                    <!-- Required: the default filesystem (decouples the storage layer from the compute layer) -->
                    <!-- The value is a URI; this uses the built-in HDFS, and the port is conventionally 9000 -->
                    <name>fs.defaultFS</name>
                    <value>hdfs://node0:9000</value>
            </property>
            <property>
                    <!-- Required: Hadoop's local working directory for process-temporary data; choose any path you like -->
                    <name>hadoop.tmp.dir</name>
                    <value>/data/hadoop/tmp</value>
            </property>
    </configuration>
    
  3. hdfs-site.xml

    
    (需要自己建立文件夾,前面已經建立了)
    <configuration>
            <!-- Number of HDFS block replicas (guards against a node going down); optional, the default is 3 -->
            <property>
                    <name>dfs.replication</name>
                    <value>2</value>
            </property>
    
            <!-- Address of the NameNode web UI; the default port is already 9870, so this can be omitted if unchanged -->
            <property>
                    <name>dfs.namenode.http-address</name>
                    <value>node0:9870</value>
            </property>
    
            <!-- Where the DataNode stores its data; the default location needs an environment variable, so a path you created yourself is easier to manage -->
            <property>
                    <name>dfs.datanode.data.dir</name>
                    <value>/data/hadoop/hdfs/data</value>
            </property>
    
            <!-- Where the NameNode stores its metadata; the default location needs an environment variable, so a path you created yourself is easier to manage -->
            <property>
                    <name>dfs.namenode.name.dir</name>
                    <value>/data/hadoop/hdfs/name</value>
            </property>
    </configuration>
    
  4. mapred-site.xml

    <configuration>
            <!-- Required: the resource-scheduling platform for MapReduce jobs; the default is local, which runs jobs on a single machine instead of on the cluster -->
            <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn</value>
            </property>
            <!-- Needed from 3.2 onward; without it MapReduce jobs may fail. Remember to substitute your own paths -->
            <property>
                    <name>mapreduce.application.classpath</name>
                    <value>
                            /software/hadoop-3.2.2/etc/hadoop,
                            /software/hadoop-3.2.2/share/hadoop/common/*,
                            /software/hadoop-3.2.2/share/hadoop/common/lib/*,
                            /software/hadoop-3.2.2/share/hadoop/hdfs/*,
                            /software/hadoop-3.2.2/share/hadoop/hdfs/lib/*,
                            /software/hadoop-3.2.2/share/hadoop/mapreduce/*,
                            /software/hadoop-3.2.2/share/hadoop/mapreduce/lib/*,
                            /software/hadoop-3.2.2/share/hadoop/yarn/*,
                            /software/hadoop-3.2.2/share/hadoop/yarn/lib/*
                    </value>
            </property>
    </configuration>
    
  5. yarn-site.xml

    <configuration>
            <!-- Site specific YARN configuration properties -->
            <!-- Required: the host that runs the ResourceManager (the YARN master) -->
            <property>
                    <name>yarn.resourcemanager.hostname</name>
                    <value>node0</value>
            </property>
    
            <!-- Required: the auxiliary service through which MapReduce programs obtain data; empty by default -->
            <property>
                    <name>yarn.nodemanager.aux-services</name>
                    <value>mapreduce_shuffle</value>
            </property>
    </configuration>
    
  6. workers

    node0
    node1
    node2
    node3
    

9. Distribute the Configuration Files

  1. Go to the configuration directory and copy it to the other nodes

    cd /software/hadoop-3.2.2/etc
    
    scp -r hadoop root@node1:/software/hadoop-3.2.2/etc/
    scp -r hadoop root@node2:/software/hadoop-3.2.2/etc/
    scp -r hadoop root@node3:/software/hadoop-3.2.2/etc/
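
The three scp commands can also be generated with a loop; this dry-run variant prints each command for review (pipe the output to `sh` to execute):

```shell
# Print one scp command per worker node; assumes the node1..node3
# host entries and passwordless root SSH set up earlier.
for h in node1 node2 node3; do
  echo scp -r /software/hadoop-3.2.2/etc/hadoop "root@$h:/software/hadoop-3.2.2/etc/"
done
```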
    

10. Format the NameNode

  1. Initialize HDFS

    hdfs namenode -format
    

11. Start and Verify

  1. Start all daemons

    start-all.sh

  2. Check the Java processes on each node

    jps
    

12. Access the Web UIs from Windows

  1. YARN ResourceManager / applications UI (master node IP, port 8088)

    http://192.168.158.137:8088

  2. HDFS NameNode UI (port 9870)

    http://192.168.158.137:9870

  3. SecondaryNameNode UI (port 9868)

    http://192.168.158.137:9868

    If this page cannot be reached, it is an issue with the prebuilt binary release; building Hadoop from source avoids it.

Advanced Configuration

The classpath value needed for yarn.application.classpath can be obtained with:

    hadoop classpath
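
`hadoop classpath` prints a colon-separated list, while the XML property values below use commas; the output can be converted with `tr` (a convenience sketch — review the result before pasting it into the XML):

```shell
# Turn the colon-separated classpath into the comma-separated form
# used by the <value> elements below.
hadoop classpath | tr ':' ','
```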

  1. Add the following to yarn-site.xml

     <property>
         <name>yarn.application.classpath</name>
         <value>
             /software/hadoop-3.2.2/etc/hadoop,
             /software/hadoop-3.2.2/share/hadoop/common/*,
             /software/hadoop-3.2.2/share/hadoop/common/lib/*,
             /software/hadoop-3.2.2/share/hadoop/hdfs/*,
             /software/hadoop-3.2.2/share/hadoop/hdfs/lib/*,
             /software/hadoop-3.2.2/share/hadoop/mapreduce/*,
             /software/hadoop-3.2.2/share/hadoop/mapreduce/lib/*,
             /software/hadoop-3.2.2/share/hadoop/yarn/*,
             /software/hadoop-3.2.2/share/hadoop/yarn/lib/*
         </value>
     </property>
    
     <property>
         <name>yarn.resourcemanager.webapp.address.rm1</name>
         <value>node0</value>
     </property>
     <property>
         <name>yarn.resourcemanager.scheduler.address.rm2</name>
         <value>node0</value>
     </property>
     <property>
         <name>yarn.resourcemanager.webapp.address.rm2</name>
         <value>node0</value>
     </property>
    
  2. Add the following to mapred-site.xml

     <property>
         <name>yarn.application.classpath</name>
         <value>
             /software/hadoop-3.2.2/etc/hadoop,
             /software/hadoop-3.2.2/share/hadoop/common/*,
             /software/hadoop-3.2.2/share/hadoop/common/lib/*,
             /software/hadoop-3.2.2/share/hadoop/hdfs/*,
             /software/hadoop-3.2.2/share/hadoop/hdfs/lib/*,
             /software/hadoop-3.2.2/share/hadoop/mapreduce/*,
             /software/hadoop-3.2.2/share/hadoop/mapreduce/lib/*,
             /software/hadoop-3.2.2/share/hadoop/yarn/*,
             /software/hadoop-3.2.2/share/hadoop/yarn/lib/*
         </value>
     </property>
     <property>
         <name>yarn.app.mapreduce.am.env</name>
         <value>HADOOP_MAPRED_HOME=/software/hadoop-3.2.2</value>
     </property>
     <property>
         <name>mapreduce.map.env</name>
         <value>HADOOP_MAPRED_HOME=/software/hadoop-3.2.2</value>
     </property>
     <property>
         <name>mapreduce.reduce.env</name>
         <value>HADOOP_MAPRED_HOME=/software/hadoop-3.2.2</value>
     </property>
    

