Official source code (download it to your local machine): https://gitee.com/apache/griffin/tree/master
I. Install the following environment before starting
- JDK (1.8 or later)
- PostgreSQL or MySQL (stores metadata such as measures and jobs)
- npm (version 6.0.0+, used to build the UI module)
- Hadoop (2.6.0 or later; HDFS is needed for storage)
- Spark (version 2.2.1, runs the various data-quality computations)
- Hive (version 2.2.0; lower versions may work too but need your own testing. Only required if you use Hive as a data source, otherwise it can be skipped)
- Livy (submits Spark jobs over HTTP)
- ElasticSearch (5.0 or later; stores the time-series data generated by the quality checks)
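Before moving on, it can help to sanity-check what is already on the machine; a minimal sketch (versions and commands will differ per system):
[root@localhost /]# java -version        # expect 1.8.x
[root@localhost /]# node -v && npm -v    # npm should be 6.0.0+
[root@localhost /]# mysql --version      # or: psql --version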
II. Installation details
Everything below is installed under the /usr/local/src/ directory.
Hadoop installation
1. wget http://archive.apache.org/dist/hadoop/core/hadoop-2.7.1/hadoop-2.7.1.tar.gz
2. tar -xvf hadoop-2.7.1.tar.gz
3. Edit the configuration files; enter the directory: cd /usr/local/src/hadoop-2.7.1/etc/hadoop/
1. vim core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/src/hadoop-2.7.1/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
HDFS metadata is stored under this /usr/local/src/hadoop-2.7.1/tmp directory. If it were left under /tmp, the OS would clear /tmp on reboot and the NameNode metadata would be lost, which is why the path is changed here; remember to create the tmp directory.
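For example:
[root@localhost /]# mkdir -p /usr/local/src/hadoop-2.7.1/tmp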
2. vim hadoop-env.sh and vim yarn-env.sh
Set the JDK path in both: export JAVA_HOME=/usr/local/src/jdk1.8.0_131
3. Rename mapred-site.xml.template to mapred-site.xml
4. vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
5. vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
6. vim yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>masteractive</value>
  </property>
</configuration>
7. vim /etc/profile and add the environment variables
export HADOOP_HOME=/usr/local/src/hadoop-2.7.1
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin
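Reload the profile so the new variables take effect in the current shell:
[root@localhost /]# source /etc/profile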
---------------------------- That completes the Hadoop configuration ----------------------------
Starting Hadoop
1. Format the HDFS filesystem, then start Hadoop.
[root@localhost ~]# cd /usr/local/src/hadoop-2.7.1/bin
[root@localhost bin]# ./hadoop namenode -format
[root@localhost bin]# ../sbin/start-all.sh
Startup will prompt several times for the current Linux user's password (for SSH). Once it finishes, verify that it worked by running jps:
[root@localhost bin]# jps
Then run hadoop version; if the version number appears, congratulations, the startup succeeded.
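With HDFS up, you can also pre-create the directory that the example Hive table later in this guide points at (the path matches the LOCATION used in the Hive section below; adjust it if yours differs):
[root@localhost bin]# hdfs dfs -mkdir -p /griffin/persist
[root@localhost bin]# hdfs dfs -ls /griffin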
Spark installation
1. wget https://archive.apache.org/dist/spark/spark-3.1.1/spark-3.1.1-bin-hadoop2.7.tgz
2. tar -xvf spark-3.1.1-bin-hadoop2.7.tgz
3. vim /etc/profile and add the environment variables
#spark
export SPARK_HOME=/usr/local/src/spark-3.1.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
4. Start it directly by running the spark-shell command (source /etc/profile first):
[root@localhost /]# spark-shell
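As a further smoke test you can submit the bundled SparkPi example; the examples jar name below is an assumption based on the standard Spark 3.1.1 distribution layout, so check $SPARK_HOME/examples/jars/ for the exact file:
[root@localhost /]# spark-submit --class org.apache.spark.examples.SparkPi \
    $SPARK_HOME/examples/jars/spark-examples_2.12-3.1.1.jar 10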
Hive installation
1. wget https://archive.apache.org/dist/hive/hive-2.3.8/apache-hive-2.3.8-bin.tar.gz
2. tar -xvf apache-hive-2.3.8-bin.tar.gz
3. Enter Hive's conf directory and edit the configuration files
[root@localhost /]# cd /usr/local/src/apache-hive-2.3.8-bin/conf/
1. vim hive-site.xml
Copy the block below into it and change the database username and password to your own:
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>CUSTOM</value>
  </property>
  <!-- Hive JDBC username and password -->
  <property>
    <name>hive.jdbc_passwd.auth.zhangsan</name>
    <value>123456789</value>
    <description/>
  </property>
  <property>
    <name>hive.server2.custom.authentication.class</name>
    <value>org.apache.hadoop.hive.contrib.auth.CustomPasswdAuthenticator</value>
  </property>
  <!-- Where Hive stores its managed data -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <!-- Use the local service to connect to Hive; defaults to true -->
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <!-- JDBC URL of the metastore database -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive2?createDatabaseIfNotExist=true</value>
  </property>
  <!-- JDBC driver, i.e. the MySQL driver -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <!-- MySQL username -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <!-- MySQL password -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>12345678</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/tmp/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
  <property>
    <name>hive.metastore.port</name>
    <value>9083</value>
    <description>Hive metastore listener port</description>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
    <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
  </property>
</configuration>
2. vim hive-env.sh
Append the following at the bottom of the file:
export JAVA_HOME=/usr/local/src/jdk1.8.0_131
export HADOOP_HOME=/usr/local/src/hadoop-2.7.1
export HIVE_HOME=/usr/local/src/apache-hive-2.3.8-bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
3. Download the JDBC driver mysql-connector-java-5.1.47.jar (obtain it yourself) and put it in the /usr/local/src/apache-hive-2.3.8-bin/lib folder.
4. Log in to MySQL and set up the Hive database and permissions (the JDBC URL above sets createDatabaseIfNotExist=true, so the hive2 database can also be created automatically on first use):
1. create database hive2;
2. use mysql; update user set host='%' where user='root';
3. flush privileges;
---------------------------- That completes the Hive configuration ----------------------------
Start and test Hive
1. Run the following to initialize the Hive metastore schema; you can then log in to MySQL to see the tables it created:
[root@localhost bin]# cd /usr/local/src/apache-hive-2.3.8-bin/bin
[root@localhost bin]# schematool -initSchema -dbType mysql
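To confirm the initialization worked, a quick check from MySQL (this assumes the hive2 database name from the JDBC URL above):
[root@localhost bin]# mysql -uroot -p -e "use hive2; show tables;"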
2. Once the database initialization completes, run these two commands to start Hive:
[root@localhost bin]# ./hive --service metastore &
[root@localhost bin]# ./hive
3. Create a table in Hive:
CREATE EXTERNAL TABLE test(
id int,
name string,
age int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 'hdfs:///griffin/persist/test';
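To verify the table end to end, you can load a tab-separated sample file and query it; the file path and sample row here are purely illustrative:
[root@localhost bin]# echo -e "1\tzhangsan\t20" > /tmp/test.txt
hive> LOAD DATA LOCAL INPATH '/tmp/test.txt' INTO TABLE test;
hive> SELECT * FROM test;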
Livy installation
1. wget https://downloads.apache.org/incubator/livy/0.7.1-incubating/apache-livy-0.7.1-incubating-bin.zip
2. unzip apache-livy-0.7.1-incubating-bin.zip
3. vim /etc/profile and add the environment variables (then source /etc/profile)
#livy
export LIVY_HOME=/usr/local/src/apache-livy-0.7.1-incubating-bin
export HADOOP_CONF_DIR=/usr/local/src/hadoop-2.7.1/etc/hadoop
4. There is very little to configure; start it directly:
[root@localhost /]# cd /usr/local/src/apache-livy-0.7.1-incubating-bin/bin/
[root@localhost bin]# ./livy-server
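Livy serves its REST API on port 8998 by default; once the server is up, you can check it from another terminal, which should return an empty JSON session list:
[root@localhost /]# curl http://localhost:8998/sessions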
Elasticsearch installation
1. wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.0.0.tar.gz
2. tar -vxf elasticsearch-6.0.0.tar.gz
3. Configure elasticsearch.yml
[root@localhost config]# cd /usr/local/src/elasticsearch-6.0.0/config
[root@localhost config]# vim elasticsearch.yml
Append at the bottom:
network.host: 0.0.0.0
http.port: 9200
path.data: /home/elasticsearch/elasticsearch-1/data
path.logs: /home/elasticsearch/elasticsearch-1/logs
4. Start the service. Note that on Linux, ES will not start as the root user by default, so you need to switch to another user to start it and set access permissions first.
Grant that user (here admin; create it with useradd admin if it does not exist) ownership of these two directories:
[root@localhost /]# chown -R admin /usr/local/src/elasticsearch-6.0.0
[root@localhost /]# chown -R admin /home/elasticsearch/
5. vim /etc/sysctl.conf
Add the following setting:
vm.max_map_count=655360
Finally, remember to run:
sysctl -p
6. Switch to the non-root user and start it:
[root@localhost /]# su - admin
[admin@localhost ~]$ cd /usr/local/src/elasticsearch-6.0.0/bin
[admin@localhost bin]$ ./elasticsearch
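Once it is running, verify from another terminal; a healthy node returns a JSON blob with the node name and version:
[root@localhost /]# curl http://localhost:9200/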
**************************************************** That completes all of the dependency configuration Griffin needs ****************************************************
Next, configure Griffin itself.
For Griffin installation and configuration, see https://www.cnblogs.com/qiu-hua/p/13947941.html
For Griffin UI usage, see https://cloud.tencent.com/developer/article/1780013

Command to start Griffin: nohup java -jar griffin-service.jar --httpPort=8085 >a.log 2>&1 &
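After it starts, you can tail the log and probe the port (8085 here matches the --httpPort flag above):
[root@localhost /]# tail -f a.log
[root@localhost /]# curl http://localhost:8085/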
