Installing a Kylin 2.6.0 cluster on CentOS 7


1. Environment preparation

ZooKeeper 3.4.12

MySQL 5.7

Hive 2.3.4

Hadoop 2.7.3

JDK 1.8

HBase 1.3.3

2. Cluster plan

IP address      Hostname  Roles
192.168.1.101   palo101   hadoop namenode, hadoop datanode, yarn nodemanager, zookeeper, hive, hbase master, hbase region server
192.168.1.102   palo102   hadoop secondary namenode, hadoop datanode, yarn nodemanager, yarn resourcemanager, zookeeper, hive, hbase master, hbase region server
192.168.1.103   palo103   hadoop datanode, yarn nodemanager, zookeeper, hive, hbase region server, mysql

3. Download Kylin 2.6.0

wget http://mirrors.tuna.tsinghua.edu.cn/apache/kylin/apache-kylin-2.6.0/apache-kylin-2.6.0-bin-hbase1x.tar.gz   # download the Kylin 2.6.0 binary package
tar -xzvf apache-kylin-2.6.0-bin-hbase1x.tar.gz          # extract the archive
mv apache-kylin-2.6.0-bin apache-kylin-2.6.0             # rename the extracted directory (drop the trailing -bin)
mkdir /usr/local/kylin/                                  # create the target directory
mv apache-kylin-2.6.0 /usr/local/kylin/                  # move the Kylin directory under /usr/local/kylin

 

4. Add system environment variables

vim /etc/profile

Append at the end of the file:

#kylin
export KYLIN_HOME=/usr/local/kylin/apache-kylin-2.6.0
export KYLIN_CONF_HOME=$KYLIN_HOME/conf
export PATH=$PATH:$KYLIN_HOME/bin:$CATALINA_HOME/bin
export tomcat_root=$KYLIN_HOME/tomcat   # note: lower-case variable name
export hive_dependency=$HIVE_HOME/conf:$HIVE_HOME/lib/*:$HCAT_HOME/share/hcatalog/hive-hcatalog-core-2.3.4.jar   # note: lower-case variable name

Save and quit with :wq, then run source /etc/profile to make the variables take effect.
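As a quick sanity check, the exports can be replayed against a scratch file before touching /etc/profile; this sketch only verifies that the variable expansions come out as intended (the paths are the ones used in this install):

```shell
# Replay the profile additions against a scratch file and check the expansions
profile=$(mktemp)
cat > "$profile" <<'EOF'
export KYLIN_HOME=/usr/local/kylin/apache-kylin-2.6.0
export KYLIN_CONF_HOME=$KYLIN_HOME/conf
export tomcat_root=$KYLIN_HOME/tomcat
EOF
. "$profile"
echo "KYLIN_CONF_HOME=$KYLIN_CONF_HOME"
echo "tomcat_root=$tomcat_root"
```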

 

5. Configure Kylin

5.1 Configure $KYLIN_HOME/bin/kylin.sh

vim $KYLIN_HOME/bin/kylin.sh

Add at the top of the file:

export HBASE_CLASSPATH_PREFIX=${tomcat_root}/bin/bootstrap.jar:${tomcat_root}/bin/tomcat-juli.jar:${tomcat_root}/lib/*:$hive_dependency:$HBASE_CLASSPATH_PREFIX

This puts $hive_dependency on the classpath and prevents two later failures, both caused by missing Hive dependencies:
a) loading Hive tables from the Kylin web UI fails
b) step 2 of a cube build fails with an org/apache/hadoop/hive/conf/HiveConf error.

5.2 Hadoop compression settings

About snappy compression: supporting it requires a Hadoop native library compiled with snappy, which means rebuilding Hadoop from source beforehand. Snappy achieves a reasonable compression ratio, so both the intermediate and final results of a job take less storage.
The Hadoop build in this example does not support snappy, which would make later cube builds fail, so compression is turned off.

vim $KYLIN_HOME/conf/kylin_job_conf.xml

Edit the file and set both mapreduce.map.output.compress and mapreduce.output.fileoutputformat.compress to false:

    <property>
        <name>mapreduce.map.output.compress</name>
        <value>false</value>
        <description>Compress map outputs</description>
    </property>
    <property>
        <name>mapreduce.output.fileoutputformat.compress</name>
        <value>false</value>
        <description>Compress the output of a MapReduce job</description>
    </property>
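Whether the local Hadoop build really lacks snappy can be confirmed with `hadoop checknative -a` on a live node (look for the snappy line). The sketch below then double-checks, on a scratch copy, that both compression switches are off; on a real node, point the grep at $KYLIN_HOME/conf/kylin_job_conf.xml instead:

```shell
# On a live node: hadoop checknative -a   # lists native codec support, including snappy
# Scratch-copy check that both compression switches are set to false
conf=$(mktemp)
cat > "$conf" <<'EOF'
<property>
    <name>mapreduce.map.output.compress</name>
    <value>false</value>
</property>
<property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>false</value>
</property>
EOF
grep -c '<value>false</value>' "$conf"   # expect 2
```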

 

One more compression setting needs to change:

vim $KYLIN_HOME/conf/kylin.properties

Set kylin.storage.hbase.compression-codec to none, or comment it out:

#kylin.storage.hbase.compression-codec=none

5.3 Main configuration: $KYLIN_HOME/conf/kylin.properties

vim $KYLIN_HOME/conf/kylin.properties

Change it to:

## The metadata store in hbase
kylin.metadata.url=kylin_metadata@hbase

## metadata cache sync retry times
kylin.metadata.sync-retries=3

## Working folder in HDFS, better be qualified absolute path, make sure user has the right permission to this directory
kylin.env.hdfs-working-dir=/kylin

## kylin zk base path
kylin.env.zookeeper-base-path=/kylin

## DEV|QA|PROD. DEV will turn on some dev features, QA and PROD has no difference in terms of functions.
#kylin.env=DEV

## Kylin server mode, valid value [all, query, job]
## the master node runs in "all" mode; the query nodes differ only in running "query" mode
kylin.server.mode=all

## List of web servers in use, this enables one web server instance to sync up with other servers.
## the cluster instances to keep in sync
kylin.server.cluster-servers=192.168.1.101:7070,192.168.1.102:7070,192.168.1.103:7070


## Display timezone on UI, format like [GMT+N or GMT-N]
## set to China time
kylin.web.timezone=GMT+8

## Timeout value for the queries submitted through the Web UI, in milliseconds
kylin.web.query-timeout=300000

## Max count of concurrent jobs running
kylin.job.max-concurrent-jobs=10


#### ENGINE ###
## Time interval (seconds) to check hadoop job status
kylin.engine.mr.yarn-check-interval-seconds=10


## Hive database name for putting the intermediate flat tables produced by cube builds
kylin.source.hive.database-for-flat-table=kylin_flat_db


## The percentage of the sampling, default 100%
kylin.job.cubing.inmem.sampling.percent=100

## Max job retry on error, default 0: no retry
kylin.job.retry=0

## Compression codec for htable, valid value [none, snappy, lzo, gzip, lz4]
## no compression (see section 5.2)
kylin.storage.hbase.compression-codec=none

## The cut size for hbase region, in GB.
kylin.storage.hbase.region-cut-gb=5

## The hfile size of GB, smaller hfile leading to the converting hfile MR has more reducers and be faster.
## Set 0 to disable this optimization.
kylin.storage.hbase.hfile-size-gb=2

## The storage for final cube file in hbase
kylin.storage.url=hbase

## The prefix of hbase table
kylin.storage.hbase.table-name-prefix=KYLIN_

## The namespace for hbase storage
kylin.storage.hbase.namespace=default

## job jar for Kylin MR jobs and the HBase coprocessor jar, used to improve performance (added entries)
kylin.job.jar=/usr/local/kylin/apache-kylin-2.6.0/lib/kylin-job-2.6.0.jar
kylin.coprocessor.local.jar=/usr/local/kylin/apache-kylin-2.6.0/lib/kylin-coprocessor-2.6.0.jar
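Since this file must later be compared across three nodes, a filter that hides comments and blank lines makes diffs easier. This sketch runs on a scratch file; on a real node, apply the same grep to $KYLIN_HOME/conf/kylin.properties:

```shell
# Print only the effective settings (no comments, no blank lines) for node-to-node diffs
props=$(mktemp)
cat > "$props" <<'EOF'
## Kylin server mode, valid value [all, query, job]
kylin.server.mode=all

kylin.metadata.url=kylin_metadata@hbase
EOF
grep -Ev '^[[:space:]]*(#|$)' "$props"
```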

 

5.4 Copy the configured Kylin to the other two machines

scp -r /usr/local/kylin/  192.168.1.102:/usr/local
scp -r /usr/local/kylin/  192.168.1.103:/usr/local

 

5.5 On 192.168.1.102 and 192.168.1.103, change kylin.server.mode to query

vim $KYLIN_HOME/conf/kylin.properties

Change:

kylin.server.mode=query      ### the master node runs in "all" mode; this is the only setting that differs on the query nodes
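On the two query nodes the change can also be applied non-interactively. The sed substitution below is demonstrated on a scratch file; on the real nodes, point it at $KYLIN_HOME/conf/kylin.properties (for example over ssh, as sketched in the comment):

```shell
# From the master (assumes passwordless ssh), the real edit would be:
# for h in 192.168.1.102 192.168.1.103; do
#   ssh "$h" "sed -i 's/^kylin.server.mode=all$/kylin.server.mode=query/' /usr/local/kylin/apache-kylin-2.6.0/conf/kylin.properties"
# done

# Local demonstration of the same substitution on a scratch file
props=$(mktemp)
echo 'kylin.server.mode=all' > "$props"
sed -i 's/^kylin.server.mode=all$/kylin.server.mode=query/' "$props"
grep server.mode "$props"
```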

 

6. Start Kylin

6.1 Prerequisite: start the dependent services first

a) Start ZooKeeper (run on every node)

$ZOO_KEEPER_HOME/bin/zkServer.sh start

b) Start Hadoop (run on the master node)

$HADOOP_HOME/sbin/start-all.sh

c) Start the JobHistoryServer service (run on the master node)

$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver

d) Start the Hive metastore service

nohup $HIVE_HOME/bin/hive --service metastore > /dev/null 2>&1 &

e) Start the HBase cluster (run from the master node)

$HBASE_HOME/bin/start-hbase.sh

The processes after startup are:

192.168.1.101

[root@palo101 apache-kylin-2.6.0]# jps
62403 NameNode          #hdfs NameNode
31013 NodeManager       #yarn NodeManager
22325 Kafka      
54217 QuorumPeerMain    #zookeeper
7274 Jps
62589 DataNode          #hadoop datanode
28895 HRegionServer     #hbase region server
8440 HMaster            #hbase master

192.168.1.102

[root@palo102 ~]# jps
47474 QuorumPeerMain    #zookeeper
15203 NodeManager       #yarn NodeManager
15061 ResourceManager   #yarn ResourceManager
49877 Jps
6694 HRegionServer      #hbase region server
7673 Kafka
37517 SecondaryNameNode #hdfs SecondaryNameNode
37359 DataNode          #hadoop datanode

192.168.1.103

[root@palo103 ~]# jps
1185 RunJar             #hive metastore
62404 NodeManager       #yarn NodeManager
47365 HRegionServer     #hbase region server
62342 QuorumPeerMain    #zookeeper
20952 ManagerBootStrap  
52440 Kafka
31801 RunJar            #hive thrift server
47901 DataNode          #hadoop datanode
36494 Jps
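The jps listings above can be verified by a small helper that only parses text, so on a live node it can be fed "$(jps)" directly; the process names are the ones shown in the listings:

```shell
# Check that the expected daemons appear in a captured `jps` output
check_daemons() {
  jps_out="$1"; shift
  missing=0
  for proc in "$@"; do
    echo "$jps_out" | grep -qw "$proc" || { echo "missing: $proc"; missing=1; }
  done
  if [ "$missing" -eq 0 ]; then
    echo "all daemons up"
  fi
}

# On a live node: check_daemons "$(jps)" NameNode DataNode QuorumPeerMain HMaster HRegionServer
sample='62403 NameNode
62589 DataNode
54217 QuorumPeerMain'
check_daemons "$sample" NameNode DataNode QuorumPeerMain
```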

 

6.2 Check that the configuration is correct

 $KYLIN_HOME/bin/check-env.sh
[root@palo101 bin]#  $KYLIN_HOME/bin/check-env.sh
Retrieving hadoop conf dir...
KYLIN_HOME is set to /usr/local/kylin/apache-kylin-2.6.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

find-hive-dependency.sh checks the Hive dependencies, find-hbase-dependency.sh checks the HBase dependencies, and check-env.sh runs all of the dependency checks.

 

6.3 Run the following command on every node to start Kylin

 $KYLIN_HOME/bin/kylin.sh start

If startup fails with the error

Failed to find metadata store by url: kylin_metadata@hbase

the fix is:

1) Change the hbase.rootdir property in $HBASE_HOME/conf/hbase-site.xml to be consistent with the fs.defaultFS property in $HADOOP_HOME/etc/hadoop/core-site.xml

2) Start zkCli from the ZooKeeper bin directory, delete /hbase, then restart HBase

 

6.4 Log in to Kylin

http://192.168.1.101:7070/kylin; the other nodes work as well, just switch to the corresponding IP

The default credentials are ADMIN/KYLIN
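Login can also be verified from the shell. HTTP Basic auth only needs the base64 of the credentials; the curl line is left commented because it requires a live instance, and the REST endpoint path is an assumption based on Kylin's REST API documentation:

```shell
# Build the Basic auth token for the default ADMIN/KYLIN account
token=$(printf 'ADMIN:KYLIN' | base64)
echo "$token"
# Against a live node (assumes default port 7070 and Kylin's documented auth endpoint):
# curl -s -X POST -H "Authorization: Basic $token" http://192.168.1.101:7070/kylin/api/user/authentication
```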

 

After logging in, the Kylin dashboard is shown (screenshot not included).

 

 

7. FAQ

7.1 If you see an error like the following

WARNING: Failed to process JAR
[jar:file:/home/hadoop-2.7.3/contrib/capacity-scheduler/.jar!/] for

This is only a minor issue; edit ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh and comment out the loop below:

 vim ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh
#for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
#  if [ "$HADOOP_CLASSPATH" ]; then
#    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
#  else
#    export HADOOP_CLASSPATH=$f
#  fi
#done

 7.2 If you see Caused by: java.lang.ClassCastException: com.fasterxml.jackson.datatype.joda.JodaModule cannot be cast to com.fasterxml.jackson.databind.Module

 The cause is a jar version mismatch: Hive ships jackson-datatype-joda-2.4.6.jar while Kylin uses jackson-databind-2.9.5.jar.

The fix is:

mv $HIVE_HOME/lib/jackson-datatype-joda-2.4.6.jar $HIVE_HOME/lib/jackson-datatype-joda-2.4.6.jarback

i.e. rename the jar so Hive's copy is no longer picked up; see https://issues.apache.org/jira/browse/KYLIN-3129 for details.

 

7.3 If you see Failed to load keystore type JKS with path conf/.keystore due to (No such file or directory)

The fix is:

Open the apache-kylin-2.6.0/tomcat/conf/server.xml file and delete (or comment out) the https connector configuration:

 

        <!--
        <Connector port="7443" protocol="org.apache.coyote.http11.Http11Protocol"
                   maxThreads="150" SSLEnabled="true" scheme="https" secure="true"
                   keystoreFile="conf/.keystore" keystorePass="changeit"
                   clientAuth="false" sslProtocol="TLS" />
         -->

 

 

8. Getting started

8.1 Load the official sample data

 $KYLIN_HOME/bin/sample.sh

If it prints Restart Kylin Server or click Web UI => System Tab => Reload Metadata to take effect, the sample cube was created successfully.

8.2 Restart Kylin, or reload the metadata, to make the sample take effect

This example reloads the metadata from the web UI (System tab => Reload Metadata).

8.3 In Hive, look at the table structures created for the Kylin cube

$HIVE_HOME/bin/hive        # enter the hive shell client
hive>show databases;       # list the databases in hive
hive>use kylin_flat_db;    # switch to kylin's hive database
hive>show tables;          # list all tables in the kylin hive database

The output looks like:

[druid@palo101 kafka_2.12-2.1.0]$ $HIVE_HOME/bin/hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/workspace/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/workspace/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/home/workspace/apache-hive-2.3.4-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
OK
default
dw_sales
kylin_flat_db
ods_sales
Time taken: 1.609 seconds, Fetched: 4 row(s)
hive> use kylin_flat_db;
OK
Time taken: 0.036 seconds
hive> show tables;
OK
kylin_account
kylin_cal_dt
kylin_category_groupings
kylin_country
kylin_sales
Time taken: 0.321 seconds, Fetched: 5 row(s)
hive> 

Now look at HBase:

[druid@palo101 kafka_2.12-2.1.0]$ hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.3.3, rfd0d55b1e5ef54eb9bf60cce1f0a8e4c1da073ef, Sat Nov 17 21:43:34 CST 2018

hbase(main):001:0> list
TABLE
dev
kylin_metadata
test
3 row(s) in 0.3180 seconds

=> ["dev", "kylin_metadata", "test"]

HBase now contains a new table named kylin_metadata, which shows that the cube from the official sample data was created successfully!

8.4 Build the cube

Refresh http://192.168.1.101:7070/kylin and a new project named learn_kylin appears

 

 Select kylin_sales_model and start a build

The build progress can be followed on the Monitor page

 

 After the build succeeds, storage information appears in the model (it was not there before), the corresponding table can be found in HBase, and the cube status changes to READY, meaning it can be queried.

 

8.5 Querying in Kylin
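The query screenshots are not included here. As a concrete example, the query commonly run against the sample project (assuming the kylin_sales tables and cube built above) aggregates sales by date; entered on the Insight tab, it is served by the cube once its status is READY:

```sql
-- Aggregate the sample sales fact table by date
SELECT part_dt, SUM(price) AS total_sold, COUNT(DISTINCT seller_id) AS sellers
FROM kylin_sales
GROUP BY part_dt
ORDER BY part_dt;
```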

 

 

At this point, the Kylin cluster deployment is complete.

