DolphinScheduler Installation Steps on Linux


Download the installation package

Download directly from the official site: https://dolphinscheduler.apache.org/zh-cn/download/download.html

Reference: the official cluster-deployment documentation at https://dolphinscheduler.apache.org/zh-cn/docs/1.3.2/user_doc/cluster-deployment.html

I downloaded version 1.3.2:
apache-dolphinscheduler-incubating-1.3.2-dolphinscheduler-bin.tar.gz

Basic environment

System version:    CentOS 6.5
Regular user:      hadoop
Home directory:    /hadoop
JDK 1.8:           /hadoop/app/jdk1.8.0_281
MySQL 5.7.27:      /hadoop/app/mysql
ZooKeeper 3.5.6:   /hadoop/app/zookeeper-3.5.6
Hadoop 2.7.7:      /hadoop/app/hadoop-2.7.7

Installation machine IPs and hostnames

192.168.100.10 bigdata01
192.168.100.11 bigdata02
192.168.100.12 bigdata03

Configure passwordless sudo

As the root user, configure passwordless sudo for the hadoop user on every machine:

vi /etc/sudoers

Add:

hadoop ALL=(ALL) NOPASSWD: ALL

Comment out:

# Defaults    requiretty

Or do the same in one step:

echo 'hadoop  ALL=(ALL)  NOPASSWD: ALL' >> /etc/sudoers
sed -i 's/Defaults    requiretty/#Defaults    requiretty/g' /etc/sudoers

Note:
DolphinScheduler runs jobs as different Linux users by switching with sudo -u {linux-user}, so the deployment user must have sudo privileges, and passwordless ones at that.
If the /etc/sudoers file contains the line "Defaults requiretty", it must be commented out.
If the resource upload feature is used, the deployment user also needs read/write permissions on `HDFS or MinIO`.
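A quick sanity check (a minimal sketch; run it on each machine after editing /etc/sudoers): switching to the hadoop user and calling sudo non-interactively should print root without any password prompt.

su - hadoop
sudo -n whoami    # should print "root" with no password prompt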

Configure hostnames

As the root user, add the host mappings on all machines:

vi /etc/hosts

192.168.100.10 bigdata01
192.168.100.11 bigdata02
192.168.100.12 bigdata03
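The /etc/hosts entries only map names to IPs. If the machines' own hostnames are not yet set, a sketch for CentOS 6 (run as root, substituting the right name on each machine; the /etc/sysconfig/network edit is the CentOS 6 convention and an assumption about your image):

hostname bigdata01
sed -i 's/^HOSTNAME=.*/HOSTNAME=bigdata01/' /etc/sysconfig/network    # persists across reboots on CentOS 6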

Configure passwordless SSH

As the hadoop user on all three machines, configure passwordless SSH:

ssh-keygen -t rsa -m PEM

Press Enter at every prompt to accept the defaults; a public key file (id_rsa.pub) and a private key file (id_rsa) are then generated in the .ssh directory under the current user's home directory.

Distribute the public key:

ssh-copy-id 192.168.100.10
ssh-copy-id 192.168.100.11
ssh-copy-id 192.168.100.12

Note: once this is set up correctly, ssh bigdata01 no longer prompts for a password.

Configure the Java environment

The hadoop user has already installed the JDK at /hadoop/app/jdk1.8.0_281.
Symlink the JDK's java binary to /bin/java.

Because an OpenJDK symlink already exists there, it has to be replaced as the root user:

sudo ln -snf /hadoop/app/jdk1.8.0_281/bin/java /bin/java
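To confirm /bin/java now resolves to the intended JDK (the expected version string is inferred from the install path above):

/bin/java -version    # expect: java version "1.8.0_281"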

Database initialization

CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'root'@'%' IDENTIFIED BY '123';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'root'@'localhost' IDENTIFIED BY '123';
flush privileges;
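The same statements can be run non-interactively from the shell; a sketch, assuming the mysql client is on the PATH (init.sql here is a hypothetical file holding the four statements above):

mysql -uroot -p < init.sql
mysql -uroot -p -e "SHOW DATABASES LIKE 'dolphinscheduler';"    # verify the database exists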

Add the mysql-connector-java driver JAR

Manually add the MySQL driver JAR, mysql-connector-java-5.1.49.jar, to the lib directory of the unpacked DolphinScheduler distribution.

Download mysql-connector-java-5.1.49.jar
 

Edit the configuration file conf/datasource.properties

vi conf/datasource.properties

spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://bigdata01:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&allowMultiQueries=true
spring.datasource.username=root
spring.datasource.password=123


Run the table-creation and base-data import script

sh script/create-dolphinscheduler.sh
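To verify that the script created the schema (a quick check; DolphinScheduler's tables use the t_ds_ prefix):

mysql -h bigdata01 -uroot -p123 dolphinscheduler -e "SHOW TABLES;"    # expect a list of t_ds_* tables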

Configure runtime parameters

vi conf/env/dolphinscheduler_env.sh

export HADOOP_HOME=/hadoop/app/hadoop-2.7.7
export HADOOP_CONF_DIR=/hadoop/app/hadoop-2.7.7/etc/hadoop
#export SPARK_HOME1=/opt/soft/spark1
#export SPARK_HOME2=/opt/soft/spark2
#export PYTHON_HOME=/opt/soft/python

export JAVA_HOME=/hadoop/app/jdk1.8.0_281

#export HIVE_HOME=/opt/soft/hive
#export FLINK_HOME=/opt/soft/flink
#export DATAX_HOME=/opt/soft/datax/bin/datax.py

export PATH=$JAVA_HOME/bin:$PATH
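A quick way to confirm the file is well-formed and the configured paths resolve (a minimal sketch):

source conf/env/dolphinscheduler_env.sh
$JAVA_HOME/bin/java -version && $HADOOP_HOME/bin/hadoop version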

Edit the parameters in the one-click deployment configuration file conf/config/install_config.conf

vi conf/config/install_config.conf
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


# NOTICE :  If the following config has special characters in the variable `.*[]^${}\+?|()@#&`, Please escape, for example, `[` escape to `\[`
# set to mysql or postgresql
dbtype="mysql"

# db config
# database connection address
dbhost="bigdata01:3306"

# database username; change this to the concrete {user} value set above
username="root"

# database name
dbname="dolphinscheduler"

# database password; if it contains special characters, escape them with \; change this to the concrete {password} value set above
password="123"

# ZooKeeper address
zkQuorum="bigdata01:2181,bigdata02:2181,bigdata03:2181"

# directory to install DS into, e.g. /opt/soft/dolphinscheduler; must be different from the current directory
installPath="/hadoop/app/ds"

# which user to deploy as; use the user configured in the passwordless-sudo step above
# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
deployUser="hadoop"

# mail config, using a QQ mailbox as the example
# mail protocol
mailProtocol="SMTP"

# mail server host
mailServerHost="smtp.qq.com"

# mail server port
# note: Different protocols and encryption methods correspond to different ports; when SSL/TLS is enabled, make sure the port is correct.
mailServerPort="25"

# mailSender and mailUser can simply be set to the same value
# sender
mailSender="xxx@qq.com"

# sending user
mailUser="xxx@qq.com"

# mailbox password
# note: The mail password is the email service authorization code, not the email login password.
mailPassword="xxx"

# set to true for mailboxes using the TLS protocol, otherwise false
starttlsEnable="true"

# set to true for mailboxes using the SSL protocol, otherwise false. Note: starttlsEnable and sslEnable cannot both be true;
# only one of TLS and SSL can be in the true state.
sslEnable="false"

# note: mail server host value, same as mailServerHost above
sslTrust="smtp.qq.com"

# where business resource files such as SQL scripts are uploaded; options: HDFS, S3, NONE
# for a single machine that wants to use the local filesystem, configure HDFS, because the HDFS option also supports the local filesystem;
# if the resource upload feature is not needed, choose NONE. Note that using the local filesystem does not require deploying hadoop
# resource storage type: HDFS, S3, NONE
resourceStorageType="HDFS"

# if uploaded resources are to be stored on hadoop and the hadoop cluster's NameNode has HA enabled,
# put hadoop's core-site.xml and hdfs-site.xml into the conf directory of the install path (/hadoop/app/ds/conf) and configure the namenode cluster name;
# if the NameNode is not HA, simply replace mycluster with the concrete ip or hostname
# if resourceStorageType is HDFS, defaultFS is the namenode address; for HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3, write the S3 address, for example: s3a://dolphinscheduler,
# Note: for s3, be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://nn1:8020"


# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"

# if Yarn is not used, keep the following defaults;
# if the ResourceManager is HA, set this to the active/standby ResourceManager ips or hostnames, e.g. "192.168.xx.xx,192.168.xx.xx";
# if there is a single ResourceManager, set yarnHaIps=""
# if resourcemanager HA enable, please type the HA ips; if resourcemanager is single, make this value empty
yarnHaIps="bigdata01,bigdata02"

# if the ResourceManager is HA or Yarn is not used, keep the default;
# if there is a single ResourceManager, set the real ResourceManager hostname or ip
# if resourcemanager HA enable or not use resourcemanager, please skip this value setting; If resourcemanager is single, you only need to replace yarnIp1 with the actual resourcemanager hostname.
# singleYarnIp="yarnIp1"

# resource upload root path, supporting HDFS and S3; since hdfs supports the local filesystem, make sure the local directory exists and has read/write permissions
# resource store on HDFS/S3 path; resource files will be stored under this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and has read/write permissions. /dolphinscheduler is recommended
resourceUploadPath="/hadoop/data/dolphinscheduler"

# user with permission to create resourceUploadPath
# who has permission to create directories under the HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="hadoop"

# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"


# api server port
apiServerPort="12345"

# which machines to deploy the DS services on; use localhost for a single-machine install
# install hosts
# Note: the hostname list to install the scheduler on. If it is pseudo-distributed, just write one pseudo-distributed hostname
ips="bigdata01,bigdata02,bigdata03"

# ssh port, default 22
# Note: if the ssh port is not the default, modify it here
sshPort="22"

# which machines the master service runs on
# run master machine
# Note: list of hostnames for deploying masters
masters="bigdata01,bigdata02"

# which machines the worker service runs on, and which worker group each worker belongs to; "default" in the example below is the group name
# run worker machine
# note: need to write the worker group name of each worker, the default value is "default"
workers="bigdata01:default,bigdata02:default,bigdata03:default"

# which machine the alert service runs on
# run alert machine
# note: hostname of the machine for deploying the alert server
alertServer="bigdata01"

# which machine the backend api service runs on
# run api machine
# note: list of machine hostnames for deploying the api server
apiServers="bigdata01"

Run the one-click installation

sh install.sh

 

Note:
Copy hadoop's configuration files core-site.xml and hdfs-site.xml into /hadoop/app/ds/conf:

cp /hadoop/app/hadoop-2.7.7/etc/hadoop/core-site.xml /hadoop/app/ds/conf
cp /hadoop/app/hadoop-2.7.7/etc/hadoop/hdfs-site.xml /hadoop/app/ds/conf

Restart the services

/hadoop/app/ds/bin/stop_all.sh
/hadoop/app/ds/bin/start_all.sh
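Individual components can also be stopped and started on a single node with the daemon script shipped in the 1.3.x distribution (service names per the official docs):

/hadoop/app/ds/bin/dolphinscheduler-daemon.sh stop worker-server
/hadoop/app/ds/bin/dolphinscheduler-daemon.sh start worker-server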

Process descriptions

    MasterServer is mainly responsible for splitting the DAG and monitoring task status
    WorkerServer/LoggerServer are mainly responsible for submitting and executing tasks and updating task status; the LoggerServer lets the Rest Api view logs via RPC
    ApiServer provides the Rest Api services that the UI calls
    AlertServer provides the alerting service
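After a restart, the running processes can be checked with jps on each node. Given the masters/workers/alertServer/apiServers values configured above, roughly the following should appear (in 1.3.x the ApiServer registers as ApiApplicationServer):

jps
# bigdata01: MasterServer, WorkerServer, LoggerServer, ApiApplicationServer, AlertServer
# bigdata02: MasterServer, WorkerServer, LoggerServer
# bigdata03: WorkerServer, LoggerServer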

 

Open the web UI

Initial account / password: admin / dolphinscheduler123

http://192.168.100.10:12345/dolphinscheduler
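A quick reachability check from the shell (a sketch; a 200 response or a redirect to the login page both indicate the ApiServer is up):

curl -sI http://192.168.100.10:12345/dolphinscheduler    # response code is environment-dependent; any HTTP reply means the port is live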

 

OK, installation complete!

 

