Hue's hue.ini configuration file: the hive and beeswax modules explained in detail (illustrated) (for both HA and non-HA clusters)

Without further ado, straight to the point!
My cluster consists of bigdatamaster (192.168.80.10), bigdataslave1 (192.168.80.11), and bigdataslave2 (192.168.80.12).

Everything is installed under /home/hadoop/app.

The official documentation recommends installing Hue on the master machine, and I follow that here: Hue goes on bigdatamaster.

Hue version: hue-3.9.0-cdh5.5.4. It must be compiled before use (which requires network access).

A word of advice: if your machine has the specs for it, install Cloudera Manager instead; its components are built to work together. From experience, version mismatches between individually installed components can easily cost you half a day to track down, so be prepared. I am currently a graduate student whose laptop tops out at 8 GB of RAM, so I do everything by hand for practice; this guide is aimed at students with similarly limited hardware. (On my lab's machines I have set up both Cloudera Manager and Ambari.)

Related posts:

Ambari and Cloudera Manager: the two mainstream big-data cluster management tools

Installing and deploying a big-data cluster with Cloudera (illustrated in five major steps) (highly recommended)

Installing and deploying a big-data cluster with Ambari (illustrated in five major steps) (highly recommended)

First, some background.

There are three ways to set up Hive:

1. Local Derby.

2. Local MySQL, also called Hive single-user mode. For example, with a master/slave1/slave2 cluster I usually install Hive on master. You could equally add a dedicated client node for Hive, Sqoop, and Azkaban, or run a larger master/slave1/slave2/slave3/slave4 cluster; either way, Hive typically lives on master.

3. Remote MySQL, also called Hive multi-user mode, configured across nodes. For example, with a master/slave1/slave2 cluster I usually install Hive on both master and slave1; the same goes for a larger master/slave1/slave2/slave3/slave4 cluster.

For details, see my earlier post: Hadoop Hive concepts, part 5 — the differences between the three Hive setup modes, plus HiveServer2, HWI, and beeline environment setup.

For Hue-specific configuration, see the Cloudera documentation:
https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_hue_config.html#concept_ezg_b2s_hl

First, here are the reference steps from the official manual:

http://archive.cloudera.com/cdh5/cdh/5/hue-3.9.0-cdh5.5.0/manual.html

1. The default configuration file

 

###########################################################################
# Settings to configure Beeswax with Hive
###########################################################################

[beeswax]

  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  ## hive_server_host=localhost

  # Port where HiveServer2 Thrift server runs on.
  ## hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  ## hive_conf_dir=/etc/hive/conf

  # Timeout in seconds for thrift calls to Hive service
  ## server_conn_timeout=120

  # Choose whether to use the old GetLog() thrift call from before Hive 0.14 to retrieve the logs.
  # If false, use the FetchResults() thrift call from Hive 1.0 or more instead.
  ## use_get_log_api=false

  # Set a LIMIT clause when browsing a partitioned table.
  # A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
  ## browse_partitioned_table_limit=250

  # A limit to the number of rows that can be downloaded from a query.
  # A value of -1 means there will be no limit.
  # A maximum of 65,000 is applied to XLS downloads.
  ## download_row_limit=1000000

  # Hue will try to close the Hive query when the user leaves the editor page.
  # This will free all the query resources in HiveServer2, but also make its results inaccessible.
  ## close_queries=false

  # Thrift version to use when communicating with HiveServer2.
  # New column format is from version 7.
  ## thrift_version=7

 

 

 

 

 

 

2. Configuration matching my cluster: non-HA cluster, local MySQL mode

The settings below are the same whichever cluster layout you choose, because Hive is at heart just a client and can even be installed outside the cluster.

Incidentally, in today's Hue the beeswax and hive modules are one and the same; the section is named [beeswax] rather than [hive] purely for historical reasons!

You also need to copy the default hive.server2.thrift.port and hive.server2.thrift.bind.host properties from hive-default.xml.template into hive-site.xml and adjust them:

        <property>
                <name>hive.server2.thrift.port</name>
                <value>10000</value>
        </property>
        <property>
                <name>hive.server2.thrift.bind.host</name>
                <value>bigdatamaster</value>
        </property>
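If you want to confirm that the two properties really made it into hive-site.xml, you can parse the file and check. Below is a minimal sketch using only Python's standard library; the XML literal stands in for reading your actual hive-site.xml:

```python
import xml.etree.ElementTree as ET

# Stand-in for the contents of hive-site.xml; in practice you would use
# ET.parse("/home/hadoop/app/hive/conf/hive-site.xml") instead.
hive_site = """<?xml version="1.0"?>
<configuration>
        <property>
                <name>hive.server2.thrift.port</name>
                <value>10000</value>
        </property>
        <property>
                <name>hive.server2.thrift.bind.host</name>
                <value>bigdatamaster</value>
        </property>
</configuration>"""

root = ET.fromstring(hive_site)
# Collect every <property> element as a name -> value mapping.
props = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}
print(props["hive.server2.thrift.port"])       # 10000
print(props["hive.server2.thrift.bind.host"])  # bigdatamaster
```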

 

 

 

 

 

###########################################################################
# Settings to configure Beeswax with Hive
###########################################################################

[beeswax]

  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=bigdatamaster

  # Port where HiveServer2 Thrift server runs on.
  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/home/hadoop/app/hive/conf

  # Timeout in seconds for thrift calls to Hive service
  ## server_conn_timeout=120

  # Choose whether to use the old GetLog() thrift call from before Hive 0.14 to retrieve the logs.
  # If false, use the FetchResults() thrift call from Hive 1.0 or more instead.
  ## use_get_log_api=false

  # Set a LIMIT clause when browsing a partitioned table.
  # A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
  ## browse_partitioned_table_limit=250

  # The maximum number of partitions that will be included in the SELECT * LIMIT sample query for partitioned tables.
  ## sample_table_max_partitions=10

  # A limit to the number of rows that can be downloaded from a query.
  # A value of -1 means there will be no limit.
  # A maximum of 65,000 is applied to XLS downloads.
  ## download_row_limit=1000000

  # Hue will try to close the Hive query when the user leaves the editor page.
  # This will free all the query resources in HiveServer2, but also make its results inaccessible.
  ## close_queries=false

  # Thrift version to use when communicating with HiveServer2.
  # New column format is from version 7.
  ## thrift_version=7
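As a quick sanity check on the values above, the [beeswax] keys can be read back with Python's standard configparser. Hue itself uses its own configuration machinery, so treat this only as an illustrative check; the snippet below stands in for your real hue.ini:

```python
import configparser

# Stand-in for the [beeswax] section of hue.ini with the values set above.
snippet = """
[beeswax]
hive_server_host=bigdatamaster
hive_server_port=10000
hive_conf_dir=/home/hadoop/app/hive/conf
"""

parser = configparser.ConfigParser()
parser.read_string(snippet)

host = parser.get("beeswax", "hive_server_host")
port = parser.getint("beeswax", "hive_server_port")
conf_dir = parser.get("beeswax", "hive_conf_dir")
print(host, port, conf_dir)  # bigdatamaster 10000 /home/hadoop/app/hive/conf
```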

 

 

 

 

Hue connects to Hive through HiveServer2 over JDBC/ODBC to run its queries and analysis, so the HiveServer2 service must be started first.

 

So start HiveServer2 (on the bigdatamaster node):

$HIVE_HOME/bin/hive --service hiveserver2


or


$HIVE_HOME/bin/hiveserver2
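Before pointing Hue at HiveServer2, it helps to confirm that the Thrift port is actually reachable from the Hue host. A minimal sketch (the helper name is mine; host and port match the configuration above):

```python
import socket

def hiveserver2_reachable(host, port=10000, timeout=3.0):
    """Return True if a TCP connection to the HiveServer2 Thrift port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# On the cluster described above you would run:
# hiveserver2_reachable("bigdatamaster", 10000)
```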

 

 

 

You should then see HiveServer2 running (console screenshots omitted here).

I won't belabor the rest; check the output on your own machine!

3. Configuration matching my cluster: non-HA cluster, remote MySQL mode

Again, the settings below are the same whichever layout you choose, because Hive is just a client and can be installed outside the cluster.

 

 

 

 


For example, my cluster consists of master, slave1, and slave2, with Hive set up in Remote mode on master and slave1.

See the official Hive documentation:

http://hive.apache.org/


On the master machine, split hive-site.xml into the following two parts.

1) Server-side configuration (for example, on master):
<?xml version="1.0"?> 
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> 

<configuration> 

<property> 
<name>hive.metastore.warehouse.dir</name> 
<value>/user/hive/warehouse</value> 
</property> 

<property> 
<name>javax.jdo.option.ConnectionURL</name> 
<value>jdbc:mysql://192.168.80.10:3306/hive?createDatabaseIfNotExist=true</value> 
</property> 

<property> 
<name>javax.jdo.option.ConnectionDriverName</name> 
<value>com.mysql.jdbc.Driver</value> 
</property> 

<property> 
<name>javax.jdo.option.ConnectionUserName</name> 
<value>root</value> 
</property> 

<property> 
<name>javax.jdo.option.ConnectionPassword</name> 
<value>123456</value> 
</property> 
</configuration>

 

 

 

 

 

2) Client-side configuration, on the slave1 machine:

<?xml version="1.0"?> 
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> 

<configuration> 

<property> 
<name>hive.metastore.warehouse.dir</name> 
<value>/user/hive/warehouse</value> 
</property> 

<property> 
<name>hive.metastore.local</name> 
<value>false</value> 
</property> 

<property> 
<name>hive.metastore.uris</name> 
<value>thrift://192.168.80.10:9083</value> 
</property> 

</configuration>

Note that on the slave1 client there is one key property: hive.metastore.local must be false.

 

 

 

Start the Hive metastore server on the master node:

hive --service metastore

or, with the port given explicitly:

hive --service metastore -p 9083

 

 

 

 

Note that you must also copy the default hive.metastore.uris property from hive-default.xml.template into hive-site.xml and modify it.

hive.metastore.uris (on the slave1 machine):

<property> 
<name>hive.metastore.uris</name> 
<value>thrift://192.168.80.10:9083</value> 
</property>
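It is also worth validating the metastore URI before restarting the client: hive.metastore.uris must be a thrift:// URI pointing at the node where the metastore service runs (the master node, 192.168.80.10, in this post's layout). A minimal sketch with the standard library:

```python
from urllib.parse import urlparse

# The URI from hive.metastore.uris; the host must be the node running
# `hive --service metastore` (the master node in this setup).
uri = "thrift://192.168.80.10:9083"

parsed = urlparse(uri)
assert parsed.scheme == "thrift", "hive.metastore.uris must use the thrift:// scheme"
print(parsed.hostname, parsed.port)  # 192.168.80.10 9083
```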

 

 

 

 

 

###########################################################################
# Settings to configure Beeswax with Hive
###########################################################################

[beeswax]

  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=bigdatamaster

  # Port where HiveServer2 Thrift server runs on.
  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/home/hadoop/app/hive/conf

  # Timeout in seconds for thrift calls to Hive service
  ## server_conn_timeout=120

  # Choose whether to use the old GetLog() thrift call from before Hive 0.14 to retrieve the logs.
  # If false, use the FetchResults() thrift call from Hive 1.0 or more instead.
  ## use_get_log_api=false

  # Set a LIMIT clause when browsing a partitioned table.
  # A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
  ## browse_partitioned_table_limit=250

  # The maximum number of partitions that will be included in the SELECT * LIMIT sample query for partitioned tables.
  ## sample_table_max_partitions=10

  # A limit to the number of rows that can be downloaded from a query.
  # A value of -1 means there will be no limit.
  # A maximum of 65,000 is applied to XLS downloads.
  ## download_row_limit=1000000

  # Hue will try to close the Hive query when the user leaves the editor page.
  # This will free all the query resources in HiveServer2, but also make its results inaccessible.
  ## close_queries=false

  # Thrift version to use when communicating with HiveServer2.
  # New column format is from version 7.
  ## thrift_version=7

 

 

The final UI (screenshots omitted).

4. Configuration matching my cluster: HA cluster, local or remote MySQL mode

As follows:

 

[beeswax]

  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=bigdata-pro01.kfk.com

  # Port where HiveServer2 Thrift server runs on.
  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/opt/modules/hive-0.13.1-cdh5.3.0/conf

  # Timeout in seconds for thrift calls to Hive service
  ## server_conn_timeout=120

  # Choose whether to use the old GetLog() thrift call from before Hive 0.14 to retrieve the logs.
  # If false, use the FetchResults() thrift call from Hive 1.0 or more instead.
  ## use_get_log_api=false

  # Set a LIMIT clause when browsing a partitioned table.
  # A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
  ## browse_partitioned_table_limit=250

  # A limit to the number of rows that can be downloaded from a query.
  # A value of -1 means there will be no limit.
  # A maximum of 65,000 is applied to XLS downloads.
  ## download_row_limit=1000000

  # Hue will try to close the Hive query when the user leaves the editor page.
  # This will free all the query resources in HiveServer2, but also make its results inaccessible.
  ## close_queries=false

  # Thrift version to use when communicating with HiveServer2.
  # New column format is from version 7.
  ## thrift_version=7

 

 

 

 

First, start HiveServer2:

[kfk@bigdata-pro01 hive-0.13.1-cdh5.3.0]$ pwd
/opt/modules/hive-0.13.1-cdh5.3.0
[kfk@bigdata-pro01 hive-0.13.1-cdh5.3.0]$ bin/hiveserver2 
Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/modules/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/modules/hbase-0.98.6-cdh5.3.0/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

 

 

 

 

Stop Hue, then start it again:

^C[kfk@bigdata-pro01 hue-3.9.0-cdh5.5.0]$ ./build/env/bin/supervisor 
[INFO] Not running as root, skipping privilege drop
starting server with options:
{'daemonize': False,
 'host': 'bigdata-pro01.kfk.com',
 'pidfile': None,
 'port': 8888,
 'server_group': 'hue',
 'server_name': 'localhost',
 'server_user': 'hue',
 'ssl_certificate': None,
 'ssl_cipher_list': 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA',
 'ssl_private_key': None,
 'threads': 40,
 'workdir': None}
Success!
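With the supervisor running, you can also verify from another shell that the Hue web UI answers on port 8888. A minimal sketch (the helper name is mine; the hostname comes from the supervisor output above):

```python
from urllib.request import urlopen
from urllib.error import URLError

def hue_web_up(host, port=8888, timeout=5.0):
    """Return True if an HTTP request to the Hue web UI gets a non-5xx answer."""
    try:
        with urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
            return resp.status < 500
    except (URLError, OSError):
        return False

# On the cluster described above you would run:
# hue_web_up("bigdata-pro01.kfk.com", 8888)
```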

 

 

 

 

 

Feel free to follow my WeChat public accounts, 大數據躺過的坑 and 人工智能躺過的坑, and my personal blogs:

http://www.cnblogs.com/zlslch/   http://www.cnblogs.com/lchzls/   http://www.cnblogs.com/sunnyDream/

Details: http://www.cnblogs.com/zlslch/p/7473861.html

Life is short and I enjoy sharing: the accounts collect practical knowledge on big data, machine learning, deep learning, AI, data mining, and data analysis, with code in Java, Scala, Python, and Shell on Linux.

QQ group for questions and discussion: 大數據和人工智能躺過的坑(總群)(161156071)

