Configuring a Virtual Host on a Cluster and Problems Encountered When Deploying a Hadoop Cluster


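Cluster configuration

Configuring an Apache virtual host on Ubuntu:

Set up a virtual host on the Master node so that its directories can be browsed from a web browser such as Chrome.
Before configuring the virtual host, install Apache2: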

sudo apt-get install apache2

Then install php5 (the package name on the Ubuntu releases of that time; newer releases ship it as php):

sudo apt-get install php5

Next, go to the /etc/apache2/sites-available directory and create a "*.conf" file.
Write the following into that file:

<VirtualHost *:80>
    ServerName author.xxx.com
    ServerAdmin author.xxx.com
    DocumentRoot "/home/author"
    <Directory "/home/author">
        # Indexes turns on the directory listing
        Options Indexes
        IndexOptions Charset=UTF-8
        AllowOverride all
        # Apache 2.4 access control; it replaces the 2.2-style "Order allow,deny" / "Allow from all" pair
        Require all granted
    </Directory>
    <IfModule dir_module>
        DirectoryIndex index.html
    </IfModule>
    ErrorLog ${APACHE_LOG_DIR}/authors_errors.log
    CustomLog ${APACHE_LOG_DIR}/authors_access.log combined
</VirtualHost>

With this configuration, opening author.xxx.com in the browser shows a directory listing of the folder. To turn the listing off (for security), change

Options Indexes
IndexOptions Charset=UTF-8

to

Options FollowSymLinks
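The switch between the two variants can be captured in a small generator. The following is only a sketch: the function name render_vhost and its parameters are made up for illustration, and it assumes Apache 2.4 (hence only "Require all granted" for access control).

```python
def render_vhost(server_name, doc_root, log_prefix, listing=True):
    """Return the text of a virtual-host file (hypothetical helper).

    listing=True  -> directory listing enabled (Options Indexes)
    listing=False -> listing disabled (Options FollowSymLinks)
    """
    if listing:
        options = ["Options Indexes", "IndexOptions Charset=UTF-8"]
    else:
        options = ["Options FollowSymLinks"]
    lines = [
        "<VirtualHost *:80>",
        f"    ServerName {server_name}",
        f'    DocumentRoot "{doc_root}"',
        f'    <Directory "{doc_root}">',
        *[f"        {opt}" for opt in options],
        "        AllowOverride all",
        "        Require all granted",
        "    </Directory>",
        # ${{...}} in an f-string yields a literal ${APACHE_LOG_DIR}
        f"    ErrorLog ${{APACHE_LOG_DIR}}/{log_prefix}_errors.log",
        f"    CustomLog ${{APACHE_LOG_DIR}}/{log_prefix}_access.log combined",
        "</VirtualHost>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the variant with the directory listing switched off.
    print(render_vhost("author.xxx.com", "/home/author", "authors", listing=False))
```

The output would still have to be saved under /etc/apache2/sites-available by hand (with root privileges) before enabling the site.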

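The log files

authors_errors.log

authors_access.log

are both located in /var/log/apache2 (the default ${APACHE_LOG_DIR} on Ubuntu); reading the Apache logs there requires root privileges.

Once the configuration file is complete, enable it and restart Apache with the following commands: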

sudo a2ensite xxx.conf
sudo /etc/init.d/apache2 restart

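Configuring an Apache virtual host on macOS:

The steps are largely the same as above; only the commands for checking the configuration and restarting Apache differ:

sudo apachectl -v          # show the Apache version
sudo apachectl -t          # check the virtual-host configuration for syntax errors
sudo apachectl -k restart  # restart Apache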

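Problems encountered when deploying a Hadoop cluster (version 2.7 and later)

For the detailed parameters of this cluster, see the author's previous article, "Network Configuration for a 4-Server Cluster in the Machine Room".

Running the following status command on the Master produced the erroneous output below: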

hdfs dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1):

Every value is 0, and nothing is reported back from slave1, slave2, or slave3.
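This symptom can also be checked from a script. The sketch below assumes the report text has already been captured (for example by redirecting the command's output to a file); the function names are made up for illustration. It reads only the cluster summary at the top of the report and treats a configured capacity of 0 bytes as "no DataNode has registered".

```python
import re


def summary_bytes(report):
    """Collect the 'Key: <bytes> (<human readable>)' lines of the cluster summary."""
    values = {}
    for line in report.splitlines():
        if line.startswith("Live datanodes") or line.startswith("Name:"):
            break  # the per-node sections start here; the summary is over
        match = re.match(r"^([^:]+):\s+(\d+) \(", line)
        if match:
            values[match.group(1).strip()] = int(match.group(2))
    return values


def no_datanodes_registered(report):
    """True when the summary reports a configured capacity of 0 bytes."""
    return summary_bytes(report).get("Configured Capacity") == 0


if __name__ == "__main__":
    failing = """Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
"""
    print(no_datanodes_registered(failing))
```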

Solution:

mkdir /home/hadoop/usr/hadoop/conf

This creates a new configuration directory.

Place the following configuration files in that directory:

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <!-- file system properties; fs.default.name is the deprecated older name of fs.defaultFS and is still accepted in Hadoop 2.x -->
    <property>
        <name>fs.default.name</name>
        <value>hdfs://192.168.223.1:9000</value>
    </property>
</configuration>
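Since every *-site.xml file has the same name/value layout, the files can be generated from a plain dictionary instead of being edited by hand on each node. This is a minimal sketch using only the Python standard library; the function name site_xml is made up, and the property values are the ones from core-site.xml above.

```python
import xml.etree.ElementTree as ET


def site_xml(properties):
    """Render a dict of settings as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing helper, available from Python 3.9
        ET.indent(root, space="    ")
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


if __name__ == "__main__":
    print(site_xml({
        "hadoop.tmp.dir": "/usr/hadoop/tmp",
        "fs.default.name": "hdfs://192.168.223.1:9000",
    }))
```

Generating the file this way keeps names and values free of stray whitespace and guarantees well-formed XML.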

hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

mapred-site.xml (job and task configuration for older versions)

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>http://192.168.223.1:9001</value>
    </property>
</configuration>

mapred-site.xml (configuration for Hadoop 2.2 and later)

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

yarn-site.xml (configuration file on the Master)

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>resourcemanager.company.com</value>
    </property>
    <property>
        <description>Classpath for typical applications.</description>
        <name>yarn.application.classpath</name>
        <value>
            $HADOOP_CONF_DIR,
            $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
            $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
            $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
            $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
        </value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>file:///data/1/yarn/local,file:///data/2/yarn/local,file:///data/3/yarn/local</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>file:///data/1/yarn/logs,file:///data/2/yarn/logs,file:///data/3/yarn/logs</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <description>Where to aggregate logs</description>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>hdfs://namenode-host.company.com:8020/var/log/hadoop-yarn/apps</value>
    </property>
    <!-- Site specific YARN configuration properties -->
</configuration>

To match the settings in yarn-site.xml, the following directories need to be set up:

  1. Create the yarn.nodemanager.local-dirs local directories:
    $ sudo mkdir -p /data/1/yarn/local /data/2/yarn/local /data/3/yarn/local /data/4/yarn/local
  2. Create the yarn.nodemanager.log-dirs local directories:
    $ sudo mkdir -p /data/1/yarn/logs /data/2/yarn/logs /data/3/yarn/logs /data/4/yarn/logs
  3. Set the owner of the yarn.nodemanager.local-dirs directories to the hadoop user:
    $ sudo chown -R hadoop:hadoop /data/1/yarn/local /data/2/yarn/local /data/3/yarn/local /data/4/yarn/local
  4. Set the owner of the yarn.nodemanager.log-dirs directories to the hadoop user:
    $ sudo chown -R hadoop:hadoop /data/1/yarn/logs /data/2/yarn/logs /data/3/yarn/logs /data/4/yarn/logs
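Note that the yarn-site.xml shown for the Master lists three directories per setting (/data/1 to /data/3), while the commands above create four. One way to keep the two in step is to derive the commands from the configured value itself. The sketch below only builds the command strings and does not run them; the function names are made up for illustration.

```python
from urllib.parse import urlparse


def local_dirs(value):
    """Turn a comma-separated list of file:// URIs (or plain paths) into local paths."""
    paths = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        parsed = urlparse(item)
        # file:///data/1/yarn/local -> /data/1/yarn/local; plain paths pass through
        paths.append(parsed.path if parsed.scheme == "file" else item)
    return paths


def mkdir_and_chown_commands(value, owner="hadoop:hadoop"):
    """Build the mkdir/chown command lines for one *-dirs setting."""
    dirs = " ".join(local_dirs(value))
    return [f"sudo mkdir -p {dirs}", f"sudo chown -R {owner} {dirs}"]


if __name__ == "__main__":
    value = "file:///data/1/yarn/local,file:///data/2/yarn/local,file:///data/3/yarn/local"
    for command in mkdir_and_chown_commands(value):
        print(command)
```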

yarn-site.xml on the slaves. It is used to communicate with the master node, so the IP address and port numbers are all those of the master:

<?xml version="1.0"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>192.168.223.1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>192.168.223.1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>192.168.223.1:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>192.168.223.1</value>
    </property>
    <!-- Site specific YARN configuration properties -->
</configuration>
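Site files copied between nodes are easy to get subtly wrong: a name or value padded with line breaks and spaces, or a required property left out. Whether such padding is tolerated depends on how each setting is read, so it can help to look at the trimmed name/value pairs before starting the daemons. The following is a sketch with made-up function names; the list of required keys is an assumption chosen for this cluster, not an official checklist.

```python
import xml.etree.ElementTree as ET


def load_site(xml_text):
    """Return {name: value} for every <property>, with surrounding whitespace removed."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def missing_keys(props, required):
    """List the required property names that are absent from props."""
    return [key for key in required if key not in props]


if __name__ == "__main__":
    sample = """<configuration>
    <property>
        <name>
           yarn.resourcemanager.address
        </name>
        <value>
           192.168.223.1:8032
        </value>
    </property>
</configuration>"""
    props = load_site(sample)
    print(props)
    print(missing_keys(props, ["yarn.resourcemanager.address",
                               "yarn.resourcemanager.scheduler.address"]))
```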

master

192.168.223.1

slaves (the file on the master node; list the IP addresses of the slave nodes, here the following)

192.168.223.2
192.168.223.3
192.168.223.4

slaves (the file on each slave node)

localhost

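Start the cluster as follows:

hadoop@master:/usr/hadoop$ hdfs namenode -format
hadoop@master:/usr/hadoop$ sbin/start-all.sh   # if the cluster is already running, run sbin/stop-all.sh first

To check the result, run the following command:

hadoop@master:/usr/hadoop$ hdfs dfsadmin -report

Output like the following means the installation is correct: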

Configured Capacity: 4958160830464 (4.51 TB)
Present Capacity: 4699621490688 (4.27 TB)
DFS Remaining: 4699621404672 (4.27 TB)
DFS Used: 86016 (84 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.223.3:50010 (slave3)
Hostname: slave3
Decommission Status : Normal
Configured Capacity: 1697554399232 (1.54 TB)
DFS Used: 28672 (28 KB)
Non DFS Used: 88462258176 (82.39 GB)
DFS Remaining: 1609092112384 (1.46 TB)
DFS Used%: 0.00%
DFS Remaining%: 94.79%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Nov 14 21:40:02 CST 2015

Name: 192.168.223.2:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 1697938153472 (1.54 TB)
DFS Used: 28672 (28 KB)
Non DFS Used: 88474435584 (82.40 GB)
DFS Remaining: 1609463689216 (1.46 TB)
DFS Used%: 0.00%
DFS Remaining%: 94.79%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Nov 14 21:40:02 CST 2015

Name: 192.168.223.4:50010 (slave4)
Hostname: slave4
Decommission Status : Normal
Configured Capacity: 1562668277760 (1.42 TB)
DFS Used: 28672 (28 KB)
Non DFS Used: 81602646016 (76.00 GB)
DFS Remaining: 1481065603072 (1.35 TB)
DFS Used%: 0.00%
DFS Remaining%: 94.78%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Nov 14 21:40:02 CST 2015
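For a cluster with more nodes, reading this report by eye gets tedious. The sketch below splits the per-node sections of a captured report into dictionaries; the function names are made up, and it assumes the "Name: ..." / "Key: value" layout shown above.

```python
def live_datanodes(report):
    """Split the per-node part of a dfsadmin report into one dict per DataNode."""
    nodes = []
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("Name:"):
            # A "Name:" line opens a new node section.
            current = {"Name": line.split(":", 1)[1].strip()}
            nodes.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return nodes


def capacity_bytes(node):
    """Configured capacity of one node in bytes (the first token of that field)."""
    return int(node["Configured Capacity"].split()[0])


if __name__ == "__main__":
    sample = """Live datanodes (1):

Name: 192.168.223.3:50010 (slave3)
Hostname: slave3
Decommission Status : Normal
Configured Capacity: 1697554399232 (1.54 TB)
"""
    for node in live_datanodes(sample):
        print(node["Hostname"], capacity_bytes(node))
```

As a sanity check, the three per-node capacities in the report above add up to the cluster-wide Configured Capacity of 4958160830464 bytes.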

Command for creating the user's home directory in HDFS:

hadoop fs -mkdir -p /user/[current login user]
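The placeholder [current login user] can be filled in automatically. A small sketch, assuming the HDFS user name is the same as the local login name; the function name is made up, and the command is only built here, not executed.

```python
import getpass


def home_dir_command(user=None):
    """Build the 'hadoop fs -mkdir -p /user/<name>' argument list."""
    user = user or getpass.getuser()  # fall back to the current login name
    return ["hadoop", "fs", "-mkdir", "-p", f"/user/{user}"]


if __name__ == "__main__":
    print(" ".join(home_dir_command()))
```

On a node where the hadoop command is on the PATH, the returned list could be passed to subprocess.run(..., check=True).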

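Once the user directory has been created, you can access the HDFS file system. For the individual HDFS commands, see the following page:

Commands for creating and deleting files in HDFS

Viewing Hadoop's current status in a web browser:

http://10.1.8.200:50070/ (this IP is the externally reachable address of the master node; substitute your own)

As shown in the figure below:

For the detailed installation steps, see the following pages:

Hadoop Cluster (Part 5): Hadoop Installation and Configuration

Migrating from MapReduce 1 (MRv1) to MapReduce 2 (MRv2, YARN)

Deploying MapReduce v2 (YARN) on a Cluster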

 

