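No preamble; straight to the useful part.
My cluster has three machines: bigdatamaster (192.168.80.10), bigdataslave1 (192.168.80.11), and bigdataslave2 (192.168.80.12).
Everything is installed under /home/hadoop/app.
The official recommendation is to install Hue on the master machine, and I do the same here: Hue is installed on bigdatamaster.
Hue version: hue-3.9.0-cdh5.5.4
It has to be compiled before use (the build needs an internet connection).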
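A word of advice: if your machine is powerful enough, install Cloudera Manager. Hue and CDH come from the same vendor, after all.
From personal experience, some component versions cause installation problems that can take the better part of a day to track down, so be prepared. I am currently a graduate student and my laptop tops out at 8 GB of RAM, so I can only practice with a manual install.
This post is written purely for students who lack high-end hardware. That said, I have already set up both Cloudera Manager and Ambari on our lab cluster.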
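Related posts on this blog:
Ambari and Cloudera Manager, the two most widely used cluster management tools in big data
Installing and deploying a big data cluster with Cloudera (illustrated, in five steps) (strongly recommended)
Installing and deploying a big data cluster with Ambari (illustrated, in five steps) (strongly recommended)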
First, here are the reference steps from the official documentation:
http://archive.cloudera.com/cdh5/cdh/5/hue-3.9.0-cdh5.5.0/manual.html
1. The default configuration file

# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]

  [[[default]]]
    # Enter the host on which you are running the ResourceManager
    ## resourcemanager_host=localhost

    # The port where the ResourceManager IPC listens on
    ## resourcemanager_port=8032

    # Whether to submit jobs to this cluster
    submit_to=True

    # Resource Manager logical name (required for HA)
    ## logical_name=

    # Change this if your YARN cluster is Kerberos-secured
    ## security_enabled=false

    # URL of the ResourceManager API
    ## resourcemanager_api_url=http://localhost:8088

    # URL of the ProxyServer API
    ## proxy_api_url=http://localhost:8088

    # URL of the HistoryServer API
    ## history_server_api_url=http://localhost:19888

    # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
    # have to be verified against certificate authority
    ## ssl_cert_ca_verify=True

  # HA support by specifying multiple clusters
  # e.g.

  # [[[ha]]]
    # Resource Manager logical name (required for HA)
    ## logical_name=my-rm-name
2. The configuration matching my cluster (how to configure Hue's yarn_clusters section on a non-HA cluster)
My final non-HA configuration is:

# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]

  [[[default]]]
    # Enter the host on which you are running the ResourceManager
    resourcemanager_host=bigdatamaster

    # The port where the ResourceManager IPC listens on
    resourcemanager_port=8032

    # Whether to submit jobs to this cluster
    submit_to=True

    # Resource Manager logical name (required for HA)
    ## logical_name=

    # Change this if your YARN cluster is Kerberos-secured
    ## security_enabled=false

    # URL of the ResourceManager API
    resourcemanager_api_url=http://bigdatamaster:8088

    # URL of the ProxyServer API
    proxy_api_url=http://bigdatamaster:8088

    # URL of the HistoryServer API
    history_server_api_url=http://bigdatamaster:19888

    # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
    # have to be verified against certificate authority
    ## ssl_cert_ca_verify=True

  # HA support by specifying multiple clusters
  # e.g.

  # [[[ha]]]
    # Resource Manager logical name (required for HA)
    ## logical_name=my-rm-name
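The non-HA settings above all derive from one host name plus three ports. As a minimal sketch, the helper below (the name `render_yarn_default` is made up for illustration and is not part of Hue) renders the uncommented lines of the section from those inputs, which may help avoid typos when the same block is needed for another cluster.

```python
def render_yarn_default(host, rm_port=8032, web_port=8088, history_port=19888):
    """Return a non-HA [[yarn_clusters]] block as text.

    Hypothetical helper for illustration only; the default ports are the
    ones used in the configuration shown in this post.
    """
    lines = [
        "[[yarn_clusters]]",
        "  [[[default]]]",
        f"    resourcemanager_host={host}",
        f"    resourcemanager_port={rm_port}",
        "    submit_to=True",
        f"    resourcemanager_api_url=http://{host}:{web_port}",
        f"    proxy_api_url=http://{host}:{web_port}",
        f"    history_server_api_url=http://{host}:{history_port}",
    ]
    return "\n".join(lines)


# Render the block for the master node used in this post.
print(render_yarn_default("bigdatamaster"))
```

The output is meant to be pasted into the [hadoop] section of hue.ini; the sketch assumes the ResourceManager, proxy, and JobHistory server all run on the same host, as they do on this cluster.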
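3. The configuration matching my cluster (how to configure Hue's yarn_clusters section on an HA cluster)
For the cluster itself, see the post "hadoop-2.6.0.tar.gz cluster setup (5 nodes)".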
One thing to note: [[[default]]] and [[[ha]]] each configure one ResourceManager.
logical_name takes one of the IDs listed under yarn.resourcemanager.ha.rm-ids in your cluster's yarn-site.xml:

<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>

"URL of the ResourceManager API" is where you set the ResourceManager's address and port. It corresponds to these entries in yarn-site.xml:

<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>djt11:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>djt12:8088</value>
</property>
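So the ResourceManager API URL is set once per section, one ResourceManager each:

# URL of the ResourceManager API, in [[[default]]]
resourcemanager_api_url=http://djt11:8088
# URL of the ResourceManager API, in [[[ha]]]
resourcemanager_api_url=http://djt12:8088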
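The mapping from yarn-site.xml to Hue's per-section values can also be read off programmatically. The sketch below is one way to do it with the standard library; the function names and the trimmed-down `YARN_SITE` string are illustrative, with values copied from the properties quoted above.

```python
import xml.etree.ElementTree as ET

# Trimmed-down yarn-site.xml using the values quoted in this post.
YARN_SITE = """
<configuration>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.webapp.address.rm1</name><value>djt11:8088</value></property>
  <property><name>yarn.resourcemanager.webapp.address.rm2</name><value>djt12:8088</value></property>
</configuration>
"""


def load_properties(xml_text):
    """Turn Hadoop-style <property> elements into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def rm_api_urls(props):
    """Map each RM id (the logical_name in hue.ini) to its web API URL."""
    rm_ids = [i.strip() for i in props["yarn.resourcemanager.ha.rm-ids"].split(",")]
    return {
        rm_id: "http://" + props[f"yarn.resourcemanager.webapp.address.{rm_id}"]
        for rm_id in rm_ids
    }


for rm_id, url in rm_api_urls(load_properties(YARN_SITE)).items():
    print(f"logical_name={rm_id}  resourcemanager_api_url={url}")
```

Each printed pair belongs in its own section of hue.ini. The sketch assumes yarn.resourcemanager.webapp.address.<id> is set explicitly for every RM id; a cluster that only sets yarn.resourcemanager.hostname.<id> would need the port added by hand.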
"URL of the HistoryServer API" is where you set the JobHistory server's address and port. It corresponds to this entry in mapred-site.xml:
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>djt13:19888</value>
</property>
So my final HA configuration is:

# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]

  [[[default]]]
    # Enter the host on which you are running the ResourceManager
    resourcemanager_host=cluster1

    # The port where the ResourceManager IPC listens on
    resourcemanager_port=8032

    # Whether to submit jobs to this cluster
    submit_to=True

    # Resource Manager logical name (required for HA)
    logical_name=rm1

    # Change this if your YARN cluster is Kerberos-secured
    ## security_enabled=false

    # URL of the ResourceManager API
    resourcemanager_api_url=http://djt11:8088

    # URL of the ProxyServer API
    proxy_api_url=http://djt13:8088

    # URL of the HistoryServer API
    history_server_api_url=http://djt13:19888

    # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
    # have to be verified against certificate authority
    ## ssl_cert_ca_verify=True

  # HA support by specifying multiple clusters
  [[[ha]]]
    # Resource Manager logical name (required for HA)
    logical_name=rm2
    resourcemanager_api_url=http://djt12:8088
    history_server_api_url=http://djt13:19888
    submit_to=True
Success!
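An easy mistake in the HA setup is leaving the [[[ha]]] header commented out, in which case its keys silently land in [[[default]]]. The sketch below is a small, assumption-laden checker: it parses the bracket-nested section syntax with a hand-written parser (Hue does not read the file this way, and the function names are invented for illustration) and reports yarn_clusters sections that lack logical_name or resourcemanager_api_url.

```python
import re

# Matches section headers such as [hadoop], [[yarn_clusters]], [[[ha]]].
SECTION = re.compile(r"^(\[+)\s*([^\[\]]+?)\s*(\]+)$")


def parse_nested_ini(text):
    """Map each section path, e.g. ('hadoop', 'yarn_clusters', 'ha'), to its keys."""
    sections, path = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments, including commented-out headers
        match = SECTION.match(line)
        if match and len(match.group(1)) == len(match.group(3)):
            depth = len(match.group(1))
            path = path[:depth - 1] + [match.group(2)]
            sections.setdefault(tuple(path), {})
        elif "=" in line:
            key, value = line.split("=", 1)
            sections.setdefault(tuple(path), {})[key.strip()] = value.strip()
    return sections


def yarn_ha_problems(sections):
    """Return human-readable problems found in the yarn_clusters sections."""
    clusters = {p[-1]: kv for p, kv in sections.items()
                if len(p) >= 2 and p[-2] == "yarn_clusters"}
    problems = []
    if len(clusters) < 2:
        problems.append("expected at least two cluster sections for HA")
    for name, kv in clusters.items():
        for key in ("logical_name", "resourcemanager_api_url"):
            if key not in kv:
                problems.append(f"[[[{name}]]] is missing {key}")
    return problems


GOOD = """
[hadoop]
  [[yarn_clusters]]
    [[[default]]]
      logical_name=rm1
      resourcemanager_api_url=http://djt11:8088
    [[[ha]]]
      logical_name=rm2
      resourcemanager_api_url=http://djt12:8088
"""
# Same file with the [[[ha]]] header left commented out.
BAD = GOOD.replace("[[[ha]]]", "# [[[ha]]]")

print(yarn_ha_problems(parse_nested_ini(GOOD)))
print(yarn_ha_problems(parse_nested_ini(BAD)))
```

To check a real file, pass its contents to `parse_nested_ini` instead of the inline samples. The parser ignores inline comments and quoting, so treat it as a quick sanity check rather than a validator.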
Alternatively, an HA cluster can be configured as follows.
First, look at yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rs</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>bigdata-pro01.kfk.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>bigdata-pro02.kfk.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
That is my yarn-site.xml.
Next, edit hue.ini:

# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]

  [[[default]]]
    # Enter the host on which you are running the ResourceManager
    resourcemanager_host=rs

    # The port where the ResourceManager IPC listens on
    resourcemanager_port=8032

    # Whether to submit jobs to this cluster
    submit_to=True

    # Resource Manager logical name (required for HA)
    ## logical_name=

    # Change this if your YARN cluster is Kerberos-secured
    ## security_enabled=false

    # URL of the ResourceManager API
    resourcemanager_api_url=http://bigdata-pro01.kfk.com:8088

    # URL of the ProxyServer API
    proxy_api_url=http://bigdata-pro01.kfk.com:8088

    # URL of the HistoryServer API
    history_server_api_url=http://bigdata-pro01.kfk.com:19888

    # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
    # have to be verified against certificate authority
    ## ssl_cert_ca_verify=True

  # HA support by specifying multiple clusters
  # e.g.

  # [[[ha]]]
    # Resource Manager logical name (required for HA)
    ## logical_name=my-rm-name
Once the configuration is done, stop YARN, start it again, and then restart Hue.
References
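After the restart, it can be useful to confirm which ResourceManager is currently active before testing Hue. The sketch below queries the ResourceManager REST endpoint /ws/v1/cluster/info, whose clusterInfo object carries a haState field. The host names in the usage comment are the ones from the first HA example in this post, and the function names are illustrative.

```python
import json
from urllib.request import urlopen


def ha_state_from_payload(payload):
    """Extract haState (e.g. 'ACTIVE' or 'STANDBY') from a cluster-info response."""
    return payload.get("clusterInfo", {}).get("haState")


def fetch_ha_state(base_url, timeout=5):
    """Ask one ResourceManager for its HA state over the REST API."""
    with urlopen(f"{base_url}/ws/v1/cluster/info", timeout=timeout) as resp:
        return ha_state_from_payload(json.load(resp))


# Usage on the cluster network (not executed here):
# for url in ("http://djt11:8088", "http://djt12:8088"):
#     print(url, fetch_ha_state(url))
```

The sketch assumes an unsecured (non-Kerberos, plain HTTP) cluster; a secured one would need authentication that is not shown here.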
http://gethue.com/hadoop-tutorial-yarn-resource-manager-high-availability-ha-in-mr2/
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_hue_config.html
http://cloudera.github.io/hue/docs-3.8.0/manual.html#_hadoop_configuration
For details, see: http://www.cnblogs.com/zlslch/p/7473861.html