The yarn_clusters section of Hue's hue.ini configuration file explained in detail (for HA and non-HA clusters)

Reposted article, originally published 2017-05-06 16:38. Part of the series "Cloudera Hue (a visual analytics tool): concepts and deployment".

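Without further ado, straight to the point!

My cluster consists of bigdatamaster (192.168.80.10), bigdataslave1 (192.168.80.11), and bigdataslave2 (192.168.80.12).

Everything is installed under /home/hadoop/app.

The official recommendation is to install Hue on the master machine, and I do the same here: Hue is installed on bigdatamaster.

Hue version: hue-3.9.0-cdh5.5.4

It has to be compiled before use (this requires an internet connection).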


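A word to readers: if your computer is powerful enough, by all means install Cloudera Manager. After all, Hue and Cloudera Manager come from the same family.
From my own experience, some component versions cause problems during installation that can take most of a day to troubleshoot, so be mentally prepared. I am currently a graduate student and my laptop has at most 8 GB of RAM, so I can only practice with a manual installation.
This post is written purely for fellow students who have no high-end machine and limited resources! That said, I have already set up both Cloudera Manager and Ambari on our lab cluster.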

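The two most mainstream cluster management tools in big data: Ambari and Cloudera Manager

Installing and deploying a big data cluster with Cloudera (five illustrated steps) (strongly recommended by the author)

Installing and deploying a big data cluster with Ambari (five illustrated steps) (strongly recommended by the author)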

First, here are the steps from the official documentation, for reference:

http://archive.cloudera.com/cdh5/cdh/5/hue-3.9.0-cdh5.5.0/manual.html

1. The default configuration

  # Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]

    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      ## resourcemanager_host=localhost

      # The port where the ResourceManager IPC listens on
      ## resourcemanager_port=8032

      # Whether to submit jobs to this cluster
      submit_to=True

      # Resource Manager logical name (required for HA)
      ## logical_name=

      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false

      # URL of the ResourceManager API
      ## resourcemanager_api_url=http://localhost:8088

      # URL of the ProxyServer API
      ## proxy_api_url=http://localhost:8088

      # URL of the HistoryServer API
      ## history_server_api_url=http://localhost:19888

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

    # HA support by specifying multiple clusters
    # e.g.

    # [[[ha]]]
      # Resource Manager logical name (required for HA)
      ## logical_name=my-rm-name

2. The configuration matching my cluster (how to configure Hue's yarn_clusters section on a non-HA cluster)

My final non-HA configuration is as follows:

  # Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]

    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=bigdatamaster

      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8032

      # Whether to submit jobs to this cluster
      submit_to=True

      # Resource Manager logical name (required for HA)
      ## logical_name=

      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false

      # URL of the ResourceManager API
      resourcemanager_api_url=http://bigdatamaster:8088

      # URL of the ProxyServer API
      proxy_api_url=http://bigdatamaster:8088

      # URL of the HistoryServer API
      history_server_api_url=http://bigdatamaster:19888

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

    # HA support by specifying multiple clusters
    # e.g.

    # [[[ha]]]
      # Resource Manager logical name (required for HA)
      ## logical_name=my-rm-name
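
To double-check which settings are actually active in a block like the one above (lines starting with `##` are commented-out defaults), the nesting can be read with a few lines of Python. This is only a minimal sketch: `parse_hue_ini` is a hypothetical helper written for this post, not part of Hue, and the sample text assumes the block sits under the `[hadoop]` section of hue.ini.

```python
import re

def parse_hue_ini(text):
    """Return {section_path: {key: value}} for uncommented settings only."""
    result = {}
    path = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments, including '## key=value' defaults
        match = re.fullmatch(r"(\[+)([^\[\]]+)(\]+)", line)
        if match and len(match.group(1)) == len(match.group(3)):
            depth = len(match.group(1))  # number of brackets = nesting level
            path = path[:depth - 1] + [match.group(2).strip()]
            result.setdefault(tuple(path), {})
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            result.setdefault(tuple(path), {})[key.strip()] = value.strip()
    return result

# Shortened copy of the non-HA block shown above.
SAMPLE = """
[hadoop]
  [[yarn_clusters]]
    [[[default]]]
      resourcemanager_host=bigdatamaster
      resourcemanager_port=8032
      submit_to=True
      ## logical_name=
      resourcemanager_api_url=http://bigdatamaster:8088
      proxy_api_url=http://bigdatamaster:8088
      history_server_api_url=http://bigdatamaster:19888
"""

settings = parse_hue_ini(SAMPLE)[("hadoop", "yarn_clusters", "default")]
for key, value in sorted(settings.items()):
    print(key, "=", value)
```

The commented-out `logical_name` does not show up in the result, which is what you want on a non-HA cluster.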

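3. The configuration matching my cluster (how to configure Hue's yarn_clusters section on an HA cluster)

Related post: hadoop-2.6.0.tar.gz cluster setup (5 nodes)

Note that [[[default]]] and [[[ha]]] each configure one ResourceManager (RM).

Each logical_name is one of the RM IDs configured in your cluster's yarn-site.xml: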

<property> 
        <name>yarn.resourcemanager.ha.rm-ids</name>  
        <value>rm1,rm2</value> 
</property>

URL of the ResourceManager API: set this to the ResourceManager's web address and port, which correspond to the following entries in yarn-site.xml:

<property>
         <name>yarn.resourcemanager.webapp.address.rm1</name>
         <value>djt11:8088</value>
</property>


<property>
         <name>yarn.resourcemanager.webapp.address.rm2</name>
         <value>djt12:8088</value>
</property>

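So it is configured as follows, with one URL per section:

# URL of the ResourceManager API, in [[[default]]] (rm1)
resourcemanager_api_url=http://djt11:8088

# URL of the ResourceManager API, in [[[ha]]] (rm2)
resourcemanager_api_url=http://djt12:8088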

URL of the HistoryServer API: set this to the JobHistory Server's web address and port, which corresponds to the following entry in mapred-site.xml:

<property>
              <name>mapreduce.jobhistory.webapp.address</name>
              <value>djt13:19888</value>
</property>
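
The mapping from yarn-site.xml to the Hue settings can also be sketched in Python. The snippet below is just an illustration under the assumption that yarn-site.xml contains the `yarn.resourcemanager.ha.rm-ids` and `yarn.resourcemanager.webapp.address.<rm-id>` properties shown above; `load_properties` and `rm_api_urls` are hypothetical helper names, not part of Hue or Hadoop.

```python
import xml.etree.ElementTree as ET

# Trimmed yarn-site.xml with only the properties quoted above.
YARN_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>djt11:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>djt12:8088</value>
  </property>
</configuration>"""

def load_properties(xml_text):
    """Turn the <property> entries of a Hadoop *-site.xml into a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name").strip(): prop.findtext("value").strip()
        for prop in root.iter("property")
    }

def rm_api_urls(props):
    """Map each RM id (Hue's logical_name) to its resourcemanager_api_url."""
    rm_ids = [rm_id.strip() for rm_id in props["yarn.resourcemanager.ha.rm-ids"].split(",")]
    return {
        rm_id: "http://" + props["yarn.resourcemanager.webapp.address." + rm_id]
        for rm_id in rm_ids
    }

urls = rm_api_urls(load_properties(YARN_SITE))
for rm_id, url in urls.items():
    print("logical_name=%s  resourcemanager_api_url=%s" % (rm_id, url))
```

Each printed pair belongs in its own section of hue.ini: the first in [[[default]]], the second in [[[ha]]].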

So my final HA configuration is as follows:

  # Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]

    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=cluster1

      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8032

      # Whether to submit jobs to this cluster
      submit_to=True

      # Resource Manager logical name (required for HA)
      logical_name=rm1

      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false

      # URL of the ResourceManager API
      resourcemanager_api_url=http://djt11:8088

      # URL of the ProxyServer API
      proxy_api_url=http://djt13:8088

      # URL of the HistoryServer API
      history_server_api_url=http://djt13:19888

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

    # HA support by specifying multiple clusters
    [[[ha]]]
      # Resource Manager logical name (required for HA)
      logical_name=rm2
      resourcemanager_api_url=http://djt12:8088
      history_server_api_url=http://djt13:19888
      submit_to=True

Success!

Alternatively, an HA cluster can also be configured as follows.

First, let's look at yarn-site.xml:

<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>rs</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>bigdata-pro01.kfk.com</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>bigdata-pro02.kfk.com</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk.state-store.address</name>
        <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

The above is my yarn-site.xml.
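
Note that this yarn-site.xml only sets `yarn.resourcemanager.hostname.rm1` / `rm2` and no explicit web addresses. In Hadoop's defaults the ResourceManager web address is the RM hostname plus port 8088, so the per-RM URLs can be derived from the hostnames. The sketch below shows one way the two RMs of this cluster could be laid out in the two-section style of the previous HA example. `webapp_address` and `render_sections` are hypothetical helpers, and the history server URL is simply the one used in the hue.ini of this post.

```python
DEFAULT_WEBAPP_PORT = 8088  # Hadoop's default ResourceManager web port

def webapp_address(props, rm_id):
    """Explicit webapp address if configured, else hostname plus default port."""
    explicit = props.get("yarn.resourcemanager.webapp.address." + rm_id)
    if explicit:
        return explicit
    host = props["yarn.resourcemanager.hostname." + rm_id]
    return "%s:%d" % (host, DEFAULT_WEBAPP_PORT)

def render_sections(props, history_url):
    """Render one hue.ini section per RM: [[[default]]] first, then [[[ha]]]."""
    rm_ids = [rm_id.strip() for rm_id in props["yarn.resourcemanager.ha.rm-ids"].split(",")]
    lines = ["[[yarn_clusters]]"]
    for section, rm_id in zip(["default", "ha"], rm_ids):
        lines += [
            "  [[[%s]]]" % section,
            "    logical_name=%s" % rm_id,
            "    submit_to=True",
            "    resourcemanager_api_url=http://%s" % webapp_address(props, rm_id),
            "    history_server_api_url=%s" % history_url,
        ]
    return "\n".join(lines)

# Values copied from the yarn-site.xml above.
PROPS = {
    "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
    "yarn.resourcemanager.hostname.rm1": "bigdata-pro01.kfk.com",
    "yarn.resourcemanager.hostname.rm2": "bigdata-pro02.kfk.com",
}

rendered = render_sections(PROPS, "http://bigdata-pro01.kfk.com:19888")
print(rendered)
```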

Next, edit hue.ini:

  # Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]

    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=rs

      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8032

      # Whether to submit jobs to this cluster
      submit_to=True

      # Resource Manager logical name (required for HA)
      ## logical_name=

      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false

      # URL of the ResourceManager API
      resourcemanager_api_url=http://bigdata-pro01.kfk.com:8088

      # URL of the ProxyServer API
      proxy_api_url=http://bigdata-pro01.kfk.com:8088

      # URL of the HistoryServer API
      history_server_api_url=http://bigdata-pro01.kfk.com:19888

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

    # HA support by specifying multiple clusters
    # e.g.

    # [[[ha]]]
      # Resource Manager logical name (required for HA)
      ## logical_name=my-rm-name

Once the configuration is done, stop YARN, start YARN again, and then restart Hue.
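
After the restart, a quick way to see which ResourceManager is currently active is the YARN REST endpoint `/ws/v1/cluster/info`, whose `clusterInfo` object carries an `haState` field. The following is a minimal sketch, not something Hue itself requires: `ha_state` and `fetch_ha_state` are hypothetical helper names, and the sample payload is a trimmed, made-up response used only so the parsing can run without a cluster.

```python
import json
from urllib.request import urlopen

def ha_state(cluster_info_json):
    """Extract haState (e.g. ACTIVE or STANDBY) from a cluster info response."""
    info = json.loads(cluster_info_json)["clusterInfo"]
    return info.get("haState", "UNKNOWN")

def fetch_ha_state(rm_url, timeout=5):
    """Query one ResourceManager, e.g. fetch_ha_state("http://djt11:8088")."""
    endpoint = rm_url.rstrip("/") + "/ws/v1/cluster/info"
    with urlopen(endpoint, timeout=timeout) as response:
        return ha_state(response.read().decode("utf-8"))

# Trimmed, illustrative payload (assumption: real responses contain more fields).
SAMPLE = '{"clusterInfo": {"id": 1, "state": "STARTED", "haState": "ACTIVE"}}'
sample_state = ha_state(SAMPLE)
print(sample_state)
```

Calling `fetch_ha_state` once for each `resourcemanager_api_url` in hue.ini should report one RM as active and the other as standby on a healthy HA cluster.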

References

http://gethue.com/hadoop-tutorial-yarn-resource-manager-high-availability-ha-in-mr2/

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_hue_config.html

http://cloudera.github.io/hue/docs-3.8.0/manual.html#_hadoop_configuration

You can also follow the author's personal blogs:

http://www.cnblogs.com/zlslch/ and http://www.cnblogs.com/lchzls/ http://www.cnblogs.com/sunnyDream/