DolphinScheduler 1.3.2 Cluster Installation Guide (on CDH 5.13.1)


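0. Introduction

Apache DolphinScheduler is a distributed, decentralized, easily extensible visual DAG workflow task scheduling system. It aims to untangle the complex dependencies in data processing pipelines so that the scheduler works out of the box.
Our goal is to use DS to run scheduled data processing that supplies data to downstream models and algorithms.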

 


 

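1. Environment dependencies

Check the base software on your servers against the official requirements. The installation dependencies are:

PostgreSQL (8.2.15+) or MySQL (5.6 or 5.7 series): either one is sufficient
JDK (1.8+): required; after installing it, configure JAVA_HOME and PATH in /etc/profile
ZooKeeper (3.4.6+): required
Hadoop (2.6+) or MinIO: optional; needed only for the resource upload feature, which can store files on Hadoop or MinIO

Do not assume that an older DS version is more compatible with older dependency versions: the dependencies are the same across versions. Newer stable releases are simpler to install and fix bugs found in older ones.
The above are the minimum requirements for the official DS binary package. Our environment is CDH 5.13.1, where some components are older than these minimums, so DS has to be compiled from source.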
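
Comparing installed versions against the minimums listed above can be scripted. The following is only a sketch: the helper names version_ge and check_min are made up for illustration, and it assumes GNU sort with the -V (version sort) option, as shipped with coreutils on CentOS 7.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on GNU `sort -V`, which orders dotted version strings numerically.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME INSTALLED MINIMUM: print one line describing the result.
check_min() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (minimum $3)"
    else
        echo "$1 $2: below minimum $3"
    fi
}
```

For example, check_min ZooKeeper 3.4.5 3.4.6 reports that the ZooKeeper 3.4.5 bundled with CDH is below the stated minimum.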

 

  

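User notes
You can use a non-root user that already exists on the servers (such as appuser) or create a dedicated dolphinscheduler user. DS tenants can only be chosen from operating system users.
This guide uses appuser as the Dolphin Scheduler user.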

 

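Cluster role planning
Because our CDH cluster is small, we used the configuration below. The service roles on server004 can be moved to other servers as your situation requires.

 

The memory usage of the Dolphin Scheduler services is shown below. Take memory usage into account when planning server roles.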

 

 

 


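2. Prerequisites


2.1 Install MySQL

Notes
Reference: Offline installation of MySQL 5.7 on CentOS 7

https://www.jellythink.com/archives/14

On the client server, check whether MySQL is already installed. Before uninstalling anything, confirm whether the existing MySQL server is used for other purposes in your cluster!

1. Check whether MySQL is installed
rpm -qa | grep mysql

2. Check whether mariadb is installed
rpm -qa | grep mariadb

3. If a package exists, uninstall it (replace xxx with the package name)
rpm -e --nodeps xxx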

 

Download mysql-5.7.30
Download page: https://downloads.mysql.com/archives/community/

 

 

 

 

 

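Upload the downloaded offline package mysql-5.7.30-linux-glibc2.12-x86_64.tar to /home/appuser/mysql

Extract the offline package

# Work inside the upload directory so that the extracted files and the symlink are created there
cd /home/appuser/mysql

# Extract the bundle
tar -xvf /home/appuser/mysql/mysql-5.7.30-linux-glibc2.12-x86_64.tar

# This yields mysql-5.7.30-linux-glibc2.12-x86_64.tar.gz; extract that as well
tar -zxvf /home/appuser/mysql/mysql-5.7.30-linux-glibc2.12-x86_64.tar.gz

# Create a symlink to make later version upgrades easier
ln -s mysql-5.7.30-linux-glibc2.12-x86_64 mysql

# Change the owner and group of everything under the mysql directory
chown -R appuser:appuser mysql/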

 

 

Create the data, log, and temporary directories

mkdir -p /home/appuser/mysql/3306/data
mkdir -p /home/appuser/mysql/3306/log
mkdir -p /home/appuser/mysql/3306/tmp

 

 

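Create the MySQL configuration file

# Go to the directory that holds the configuration file
cd /etc

# Back up my.cnf, then add the configuration items to it
cp my.cnf my.cnf.bak20201023
vi my.cnf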

 

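my.cnf contents

[client]                                           # Client settings: default connection parameters for clients
port = 3306                                        # Default connection port
socket = /home/appuser/mysql/3306/tmp/mysql.sock   # Socket file for local connections; the mysqld daemon creates it

[mysqld]                                           # Basic server settings
# Basic settings
server-id = 1                                      # Unique ID of this MySQL server; every server needs its own
port = 3306                                        # Port MySQL listens on
basedir = /home/appuser/mysql/mysql                # MySQL installation root
datadir = /home/appuser/mysql/3306/data            # Location of the MySQL data files
tmpdir  = /home/appuser/mysql/3306/tmp             # Temporary directory, used for example by LOAD DATA INFILE
socket = /home/appuser/mysql/3306/tmp/mysql.sock   # Socket file for local communication between client programs and the server
pid-file = /home/appuser/mysql/3306/log/mysql.pid  # Location of the pid file
skip_name_resolve = 1                              # Check client logins by IP address only, not by hostname
character-set-server = utf8mb4                     # Default server character set; utf8mb4 supports 4-byte characters such as emoji
transaction_isolation = READ-COMMITTED             # Transaction isolation level (MySQL's default is REPEATABLE-READ)
collation-server = utf8mb4_general_ci              # Collation rules for the character set; must match character-set-server
init_connect='SET NAMES utf8mb4'                   # Character set applied when a client connects, to prevent garbled text
lower_case_table_names = 1                         # Table name case sensitivity; 1 stores names in lowercase and compares them case-insensitively
max_connections = 400                              # Maximum number of connections
max_connect_errors = 1000                          # Maximum number of failed connection attempts
explicit_defaults_for_timestamp = true             # TIMESTAMP columns not explicitly declared NOT NULL allow NULL
max_allowed_packet = 128M                          # Maximum SQL packet size; consider raising it to 1G if BLOBs are used
interactive_timeout = 1800                         # Idle interactive connections are closed after this many seconds
wait_timeout = 1800                                # Default is 8 hours; interactive_timeout must be set as well for the change to take effect
tmp_table_size = 16M                               # Maximum size of internal in-memory temporary tables (used by large GROUP BY / ORDER BY); bigger tables spill to disk and raise I/O load, so consider a larger value such as 128M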
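
Before initializing MySQL it can help to confirm that the directories named in my.cnf really exist. The sketch below is an assumption-laden illustration, not part of MySQL: cnf_value and check_cnf_dirs are hypothetical helper names, and the parsing only handles simple "key = value  # comment" lines such as those above (the first match in the file wins).

```shell
# cnf_value FILE KEY: print the value of KEY from an option file,
# with any trailing "# comment" and surrounding whitespace removed.
cnf_value() {
    awk -F= -v key="$2" '
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key {
            v = $0
            sub(/^[^=]*=/, "", v)            # drop everything up to the first "="
            sub(/[ \t]*#.*$/, "", v)         # drop a trailing comment
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace
            print v
            exit
        }' "$1"
}

# check_cnf_dirs FILE: report whether datadir and tmpdir from FILE exist.
check_cnf_dirs() {
    for key in datadir tmpdir; do
        dir=$(cnf_value "$1" "$key")
        if [ -d "$dir" ]; then
            echo "$key ok: $dir"
        else
            echo "$key missing: $dir"
        fi
    done
}
```

Running check_cnf_dirs /etc/my.cnf after creating the directories should print an "ok" line for both datadir and tmpdir.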

 

Install the MySQL database

# Go to the MySQL bin directory
cd /home/appuser/mysql/mysql/bin

# Initialize the database, specifying the user that will run mysql
./mysqld --initialize --user=appuser

[appuser@server004 bin]$ ./mysqld --initialize --user=appuser
2020-10-23T05:28:42.613320Z 0 [Warning] InnoDB: New log files created, LSN=45790
2020-10-23T05:28:42.646524Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2020-10-23T05:28:42.701420Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 9d332734-14f0-11eb-a21e-ac61750347bf.
2020-10-23T05:28:42.701917Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2020-10-23T05:28:43.099968Z 0 [Warning] CA certificate ca.pem is self signed.
2020-10-23T05:28:43.383014Z 1 [Note] A temporary password is generated for root@localhost: GWg!<:gQL3/k

 

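It is best to specify the user that runs mysql here; otherwise MySQL may fail with insufficient permissions when it starts.
When initialization finishes, a random password for the root user has been generated (the last line of the output above).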
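
The temporary password can be pulled out of the initialization output instead of being copied by hand. This is a minimal sketch: temp_root_password is a hypothetical helper, and it assumes the message format shown in the log above and that the password contains no spaces.

```shell
# temp_root_password: read `mysqld --initialize` output on stdin and print
# the temporary root password, i.e. the last field of the matching line.
temp_root_password() {
    awk '/temporary password is generated for root@localhost/ { print $NF }'
}
```

If the initialization output was saved to a file (for example a hypothetical init.log), run temp_root_password < init.log to print the password.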

 

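Register the service to start at boot
# Copy the startup script to the init script directory
cp /home/appuser/mysql/mysql/support-files/mysql.server /etc/rc.d/init.d/mysqld

# Make the mysqld service control script executable
chmod +x /etc/rc.d/init.d/mysqld

# Add mysqld to the system services
chkconfig --add mysqld

# Check that the mysqld service is registered
chkconfig --list mysqld

# Switch to the MySQL user (appuser) and start MySQL
su appuser
service mysqld start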

 

 

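Configure environment variables
Add the MySQL bin directory to PATH so that the mysql client is easier to use.
# Switch to the MySQL user (appuser)
su - appuser

# Back up .bash_profile, then edit it and add: export PATH=$PATH:/home/appuser/mysql/mysql/bin
cp .bash_profile .bash_profile_bak2020102
vi .bash_profile

# Apply the change immediately
source .bash_profile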

 

 

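Log in and change the password
# Log in to mysql (enter the temporary password generated during initialization)
mysql -uroot -p

# Change the root password
set password for root@localhost=password("123456");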

 

 

2.2 Compile DS

Download DS 1.3.2

 

 

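Modify the properties in pom.xml. The first block below shows the original values; the second shows the values after the change, with the Hadoop and Hive JDBC versions lowered to match CDH 5.13.1.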

<hadoop.version>2.7.3</hadoop.version>
<hive.jdbc.version>2.1.0</hive.jdbc.version>
<zookeeper.version>3.4.14</zookeeper.version>
<java.version>1.8</java.version>

 

<hadoop.version>2.6.0</hadoop.version>
<hive.jdbc.version>1.1.0</hive.jdbc.version>
<zookeeper.version>3.4.14</zookeeper.version>
<java.version>1.8</java.version>

 

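ZooKeeper in CDH is version 3.4.5. I first changed the dependency in pom.xml to that version and re-downloaded the dependencies through Maven, but the project uses some methods that ZK 3.4.5 does not provide.

I then reverted this dependency to 3.4.14, and testing showed that it works with the CDH ZooKeeper.

Comment out the MySQL dependency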

<!--            <dependency>
                <groupId>mysql</groupId>
                <artifactId>mysql-connector-java</artifactId>
                <version>${mysql.connector.version}</version>
                <scope>test</scope>
            </dependency>-->

 

Reimport the modified pom.xml through Maven in IDEA

 

Package
mvn -U clean package -Prelease -Dmaven.test.skip=true

The packaged archive, apache-dolphinscheduler-incubating-1.3.2-SNAPSHOT-dolphinscheduler-bin.tar.gz, is in the target directory of the dolphinscheduler-dist module

 


 

3. Installation steps

3.1 Extract

Name of the compiled package: apache-dolphinscheduler-incubating-1.3.2-SNAPSHOT-dolphinscheduler-bin.tar.gz

# Create the deployment directory
mkdir -p /home/appuser/dolphin/package;
cd /home/appuser/dolphin/package;

# Extract (after uploading the compiled package to this directory)
tar -zxvf apache-dolphinscheduler-incubating-1.3.2-SNAPSHOT-dolphinscheduler-bin.tar.gz
mv apache-dolphinscheduler-incubating-1.3.2-SNAPSHOT-dolphinscheduler-bin  dolphinscheduler-bin

 

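3.2 Create the deployment user and grant directory permissions

We adapted this step to our situation as follows.
Creating a new user and setting its password is skipped, because the appuser user already exists.
We only need to give appuser passwordless access on server001, server002, server003 and server004 (sudo here, SSH in section 3.3) and adjust the directory permissions.

# Back up the configuration file before changing it
cp /etc/sudoers /etc/sudoers_bak20201023

# Make the file writable
chmod 640 /etc/sudoers

# Edit /etc/sudoers and append this line at the end: appuser ALL=(ALL) NOPASSWD: ALL
vi /etc/sudoers

# Restore the original permissions
chmod 440 /etc/sudoers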

 

 

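Configure the hosts mapping, set up SSH access, and adjust directory permissions
Use the first machine (hostname server004) as the deployment machine. Log in to server004 as root and add hosts entries for all machines to be deployed
vi /etc/hosts

#add ip hostname
192.168.xxx.xxx server001
192.168.xxx.xxx server002
192.168.xxx.xxx server003
192.168.xxx.xxx server004

Note: delete or comment out the 127.0.0.1 line

 

Sync /etc/hosts from server004 to all deployment machines

for ip in ds2 ds3;     # replace ds2 ds3 with the hostnames of the machines you are deploying to
do
    sudo scp -r /etc/hosts  $ip:/etc/          # you will be asked for the root password while this runs
done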
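
After syncing, it is worth confirming that every cluster hostname really has an entry in the hosts file. The following sketch uses a hypothetical helper, missing_hosts, that only inspects the file given to it; it does not query DNS or any other name service.

```shell
# missing_hosts FILE HOST...: print every HOST that has no entry in FILE.
missing_hosts() {
    file=$1
    shift
    for host in "$@"; do
        # Skip comment lines, then look for HOST as a whole field after the IP column.
        awk -v h="$host" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit found ? 0 : 1 }' "$file" || echo "$host"
    done
}
```

For example, missing_hosts /etc/hosts server001 server002 server003 server004 prints nothing when all four hosts are mapped.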

 

 

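3.3 Passwordless SSH configuration

On server004, switch to the deployment user and configure passwordless SSH login to the local machine

su appuser;

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

 

Note: once this is set up correctly, the deployment user on server004 can run ssh localhost without being asked for a password

 

On server004, set up passwordless SSH from the deployment user appuser to the other machines to be deployed

su appuser;
for ip in ds2 ds3;     # replace ds2 ds3 with the hostnames of the machines you are deploying to
do
    ssh-copy-id  $ip   # you will be asked for the deployment user's (here appuser) password on each host
done
# Alternatively, sshpass -p xxx ssh-copy-id $ip avoids typing the password

# Commands actually executed
ssh-copy-id  server001
ssh-copy-id  server002
ssh-copy-id  server003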
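
Whether passwordless login really works can be checked non-interactively. In this sketch check_passwordless and the SSH_CMD override are illustrative names introduced here; the OpenSSH options BatchMode and ConnectTimeout make ssh fail quickly instead of prompting for a password.

```shell
# SSH_CMD can be overridden (for example with a stub function) when testing.
SSH_CMD=${SSH_CMD:-ssh}

# check_passwordless HOST...: print OK or FAILED for each host.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless() {
    for host in "$@"; do
        if "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: OK"
        else
            echo "$host: FAILED"
        fi
    done
}
```

Run as the deployment user on server004, check_passwordless server001 server002 server003 should print OK for every host before continuing with the deployment.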

 

Change the ownership of the dolphinscheduler-bin directory

chown -R appuser:appuser dolphinscheduler-bin

 

 

 

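3.4 Database initialization

Log in to the database. The default database is PostgreSQL; if you choose MySQL, the defaults used here work as they are.

 

In the database command-line window, run the initialization statements below to create the database and set the access account and password. Note: replace {user} and {password} with the actual database username and password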

mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}';
mysql> flush privileges;

 

The statements we actually executed

CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'appuser'@'%' IDENTIFIED BY 'appuser';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'appuser'@'localhost' IDENTIFIED BY 'appuser';
flush privileges;

 

Create the tables and import the base data

Change the following settings in datasource.properties under the conf directory

[appuser@server004 conf]$ pwd
/home/appuser/dolphin/package/dolphinscheduler-bin/conf
[appuser@server004 conf]$ cp datasource.properties datasource.properties_bak20201026
[appuser@server004 conf]$ vi datasource.properties

 

If you choose MySQL, comment out the PostgreSQL settings (and vice versa), then fill in the database connection details

# postgre
#spring.datasource.driver-class-name=org.postgresql.Driver
#spring.datasource.url=jdbc:postgresql://localhost:5432/dolphinscheduler
# mysql
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
# Replace xxx with the database IP; localhost is fine if the database runs on this machine
spring.datasource.url=jdbc:mysql://xxx:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
# Replace xxx with the {user} value above
spring.datasource.username=xxx
# Replace xxx with the {password} value above
spring.datasource.password=xxx

 

The configuration after our changes

# mysql
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://192.168.1.14:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
spring.datasource.username=appuser
spring.datasource.password=appuser

 


 

After saving the changes, run the script in the script directory that creates the tables and imports the base data

[appuser@server004 dolphinscheduler-bin]$ pwd
/home/appuser/dolphin/package/dolphinscheduler-bin
[appuser@server004 dolphinscheduler-bin]$ sh script/create-dolphinscheduler.sh
Note: if the script fails with "/bin/java: No such file or directory", configure JAVA_HOME and PATH in /etc/profile

 

3.5 Modify runtime parameters

Edit the environment variables in dolphinscheduler_env.sh under conf/env (the upstream example assumes that all related software is installed under /opt/soft)

[appuser@server004 env]$ pwd
/home/appuser/dolphin/package/dolphinscheduler-bin/conf/env
[appuser@server004 env]$ cp dolphinscheduler_env.sh dolphinscheduler_env.sh_bak20201016
[appuser@server004 env]$ ll
total 8
-rw-r--r-- 1 appuser appuser 1301 Oct 26 10:13 dolphinscheduler_env.sh
-rw-r--r-- 1 appuser appuser 1301 Oct 26 16:47 dolphinscheduler_env.sh_bak20201016

 

export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop
export SPARK_HOME1=/opt/cloudera/parcels/CDH/lib/spark
export SPARK_HOME2=/opt/cloudera/parcels/SPARK2/lib/spark2
export PYTHON_HOME=/opt/cloudera/parcels/Anaconda/bin/python
export JAVA_HOME=/usr/java/jdk1.8.0_111-cloudera
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export FLINK_HOME=/opt/soft/flink
export DATAX_HOME=/opt/soft/datax/bin/datax.py

export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH:$FLINK_HOME/bin:$DATAX_HOME:$PATH

Note: this step is very important. JAVA_HOME and PATH, for example, must be configured; entries you do not use can be ignored or commented out
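
A quick way to catch typos in these paths is to test each one for existence. This is a minimal sketch with a hypothetical helper, report_paths, which takes NAME=path pairs so that it stays independent of which variables your environment file defines.

```shell
# report_paths NAME=PATH...: print whether each path exists on this machine.
report_paths() {
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        path=${pair#*=}    # text after the first "="
        if [ -e "$path" ]; then
            echo "ok      $name=$path"
        else
            echo "missing $name=$path"
        fi
    done
}
```

After sourcing dolphinscheduler_env.sh, a call such as report_paths "JAVA_HOME=$JAVA_HOME" "HADOOP_HOME=$HADOOP_HOME" "HIVE_HOME=$HIVE_HOME" lists which of the configured locations are missing.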

 

Symlink the JDK's java binary to /usr/bin/java (the example assumes JAVA_HOME=/app/soft/jdk; substitute your own JAVA_HOME)

sudo ln -s /app/soft/jdk/bin/java /usr/bin/java

 

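3.6 Modify the one-click deployment parameters

Edit the parameters used by the one-click deployment script install.sh, which are kept in conf/config/install_config.conf. Pay particular attention to the settings described below

[appuser@server004 config]$ pwd
/home/appuser/dolphin/package/dolphinscheduler-bin/conf/config
[appuser@server004 config]$ cp install_config.conf install_config.conf_bak20201026
[appuser@server004 config]$ vi install_config.conf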

 

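The annotated configuration below is for reference only. Several key names in it (passowrd, resUploadStartupType, hdfsPath) differ from those in the 1.3.2 install_config.conf shown afterwards (password, resourceStorageType, resourceUploadPath); follow the names in your actual file.

# Database type: mysql or postgresql
dbtype="mysql"
# Database address and port
dbhost="localhost:3306"
# Database name
dbname="dolphinscheduler"
# Database username; change it to the {user} value set above
username="xxx"
# Database password; escape special characters with \. Change it to the {password} value set above
passowrd="xxx"
# Directory to install DS into, e.g. /opt/soft/dolphinscheduler; it must differ from the current directory
installPath="/opt/soft/dolphinscheduler"
# Deployment user; use the deployment user prepared earlier (section 3.2)
deployUser="dolphinscheduler"
# ZooKeeper address; for a single local node use localhost:2181, and remember to include port 2181
zkQuorum="localhost:2181"
# Machines on which to deploy DS services; use localhost for the local machine
ips="localhost"
# Machine(s) running the master service
masters="localhost"
# Machine(s) running the worker service
workers="localhost"
# Machine running the alert service
alertServer="localhost"
# Machine(s) running the backend api service
apiServers="localhost"


# Mail configuration, using QQ mail as an example
# Mail protocol
mailProtocol="SMTP"
# Mail server address
mailServerHost="smtp.exmail.qq.com"
# Mail server port
mailServerPort="25"
# mailSender and mailUser can be set to the same value
# Sender
mailSender="xxx@qq.com"
# Sending user
mailUser="xxx@qq.com"
# Mailbox password
mailPassword="xxx"
# Set to true for mailboxes that use TLS, otherwise false
starttlsEnable="true"
# Mail server address, same value as mailServerHost above
sslTrust="smtp.exmail.qq.com"
# Set to true for mailboxes that use SSL, otherwise false. Note: starttlsEnable and sslEnable must not both be true
sslEnable="false"
# Excel download path
xlsFilePath="/tmp/xls"
# Where to upload resource files such as SQL used by jobs: HDFS, S3 or NONE. To use the local file system on a single machine, set HDFS, because HDFS supports the local file system; choose NONE if you do not need resource upload. Using the local file system does not require deploying hadoop
resUploadStartupType="HDFS"
# This example stores resources on the local file system
# Note: to store uploaded resources on hadoop when the NameNode has HA enabled, copy hadoop's core-site.xml and hdfs-site.xml into the conf directory under the install path (installPath above), /opt/soft/dolphinscheduler/conf in this example, and configure the namenode cluster name; if the NameNode is not HA, just replace mycluster with the actual ip or hostname
defaultFS="file:///data/dolphinscheduler"    # or hdfs://{ip or hostname}:8020


# If ResourceManager is HA, set this to the active and standby ResourceManager ips or hostnames, e.g. "192.168.xx.xx,192.168.xx.xx"; with a single ResourceManager, or if yarn is not used at all, set yarnHaIps="". Yarn is not used here, so it is ""
yarnHaIps=""
# With a single ResourceManager, set this to its ip or hostname; otherwise keep the default. Yarn is not used here, so the default is kept
singleYarnIp="ark1"
# Because hdfs supports the local file system, make sure the local directory exists and is readable and writable
hdfsPath="/data/dolphinscheduler"

Note: if you plan to use the resource center, run the following commands:
sudo mkdir /data/dolphinscheduler
sudo chown -R dolphinscheduler:dolphinscheduler /data/dolphinscheduler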

 

Original configuration file

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


# NOTICE: If the following config has special characters in the variable `.*[]^${}\+?|()@#&`, Please escape, for example, `[` escape to `\[`
# postgresql or mysql
dbtype="mysql"

# db config
# db address and port
dbhost="192.168.xx.xx:3306"

# db username
username="xx"

# db password
# NOTICE: if there are special characters, please use the \ to escape, for example, `[` escape to `\[`
password="xx"

# database name
dbname="dolphinscheduler"


# zk cluster
zkQuorum="192.168.xx.xx:2181,192.168.xx.xx:2181,192.168.xx.xx:2181"

# zk root directory
zkRoot="/dolphinscheduler"

# Note: the target installation path for dolphinscheduler, please not config as the same as the current path (pwd)
installPath="/data1_1T/dolphinscheduler"

# deployment user
# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
deployUser="dolphinscheduler"


# alert config
# mail server host
mailServerHost="smtp.exmail.qq.com"

# mail server port
# note: Different protocols and encryption methods correspond to different ports, when SSL/TLS is enabled, make sure the port is correct.
mailServerPort="25"

# sender
mailSender="xxxxxxxxxx"

# user
mailUser="xxxxxxxxxx"

# sender password
# note: The mail.passwd is email service authorization code, not the email login password.
mailPassword="xxxxxxxxxx"

# TLS mail protocol support
starttlsEnable="true"

# SSL mail protocol support
# only one of TLS and SSL can be in the true state.
sslEnable="false"

#note: sslTrust is the same as mailServerHost
sslTrust="smtp.exmail.qq.com"


# resource storage type:HDFS,S3,NONE
resourceStorageType="NONE"

# if resourceStorageType is HDFS,defaultFS write namenode address,HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3,write S3 address,HA,for example :s3a://dolphinscheduler,
# Note,s3 be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://mycluster:8020"

# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"

# if resourcemanager HA enable, please type the HA ips ; if resourcemanager is single, make this value empty
yarnHaIps="192.168.xx.xx,192.168.xx.xx"

# if resourcemanager HA enable or not use resourcemanager, please skip this value setting; If resourcemanager is single, you only need to replace yarnIp1 to actual resourcemanager hostname.
singleYarnIp="yarnIp1"

# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。/dolphinscheduler is recommended
resourceUploadPath="/dolphinscheduler"

# who have permissions to create directory under HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="hdfs"

# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"


# api server port
apiServerPort="12345"


# install hosts
# Note: install the scheduled hostname list. If it is pseudo-distributed, just write a pseudo-distributed hostname
ips="ds1,ds2,ds3,ds4,ds5"

# ssh port, default 22
# Note: if ssh port is not default, modify here
sshPort="22"

# run master machine
# Note: list of hosts hostname for deploying master
masters="ds1,ds2"

# run worker machine
# note: need to write the worker group name of each worker, the default value is "default"
workers="ds1:default,ds2:default,ds3:default,ds4:default,ds5:default"

# run alert machine
# note: list of machine hostnames for deploying alert server
alertServer="ds3"

# run api machine
# note: list of machine hostnames for deploying api server
apiServers="ds1"

 

The configuration file we actually used

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


# NOTICE: If the following config has special characters in the variable `.*[]^${}\+?|()@#&`, Please escape, for example, `[` escape to `\[`
# postgresql or mysql
dbtype="mysql"

# db config
# db address and port
dbhost="192.168.1.14:3306"

# db username
username="appuser"

# db password
# NOTICE: if there are special characters, please use the \ to escape, for example, `[` escape to `\[`
password="appuser"

# database name
dbname="dolphinscheduler"


# zk cluster
zkQuorum="192.168.1.11:2181,192.168.1.12:2181,192.168.1.13:2181"

# zk root directory
zkRoot="/dolphinscheduler"

# Note: the target installation path for dolphinscheduler, please not config as the same as the current path (pwd)
installPath="/home/appuser/dolphin/soft"

# deployment user
# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
deployUser="appuser"


# alert config
# mail server host
mailServerHost="smtp.exmail.qq.com"

# mail server port
# note: Different protocols and encryption methods correspond to different ports, when SSL/TLS is enabled, make sure the port is correct.
mailServerPort="25"

# sender
mailSender="xxxxxxxxxx"

# user
mailUser="xxxxxxxxxx"

# sender password
# note: The mail.passwd is email service authorization code, not the email login password.
mailPassword="xxxxxxxxxx"

# TLS mail protocol support
starttlsEnable="true"

# SSL mail protocol support
# only one of TLS and SSL can be in the true state.
sslEnable="false"

#note: sslTrust is the same as mailServerHost
sslTrust="smtp.exmail.qq.com"


# resource storage type:HDFS,S3,NONE
resourceStorageType="HDFS"

# if resourceStorageType is HDFS,defaultFS write namenode address,HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3,write S3 address,HA,for example :s3a://dolphinscheduler,
# Note,s3 be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://server001:8020"

# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"

# if resourcemanager HA enable, please type the HA ips ; if resourcemanager is single, make this value empty
yarnHaIps=""

# if resourcemanager HA enable or not use resourcemanager, please skip this value setting; If resourcemanager is single, you only need to replace yarnIp1 to actual resourcemanager hostname.
singleYarnIp="192.168.1.12"

# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。/dolphinscheduler is recommended
resourceUploadPath="/dolphinscheduler"

# who have permissions to create directory under HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="hdfs"

# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"


# api server port
apiServerPort="12345"


# install hosts
# Note: install the scheduled hostname list. If it is pseudo-distributed, just write a pseudo-distributed hostname
ips="server004"

# ssh port, default 22
# Note: if ssh port is not default, modify here
sshPort="22"

# run master machine
# Note: list of hosts hostname for deploying master
masters="server004"

# run worker machine
# note: need to write the worker group name of each worker, the default value is "default"
workers="server004:default"

# run alert machine
# note: list of machine hostnames for deploying alert server
alertServer="server004"

# run api machine
# note: list of machine hostnames for deploying api server
apiServers="server004"
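
Because install_config.conf consists of plain key="value" assignments, a few consistency checks can be run by sourcing it in a subshell before deploying. The sketch below is not part of DolphinScheduler: check_install_conf is a hypothetical helper, and the two rules it enforces (every role host appears in ips, and installPath differs from the current directory) are taken from the comments in the file itself.

```shell
# check_install_conf FILE: source FILE in a subshell and print one line per problem found.
# The parenthesised body runs in a subshell, so the sourced variables do not leak out.
check_install_conf() (
    . "$1" || exit 1
    # Every host given a role should also be listed in ips.
    roles="$masters,$workers,$alertServer,$apiServers"
    for entry in $(printf '%s' "$roles" | tr ',' ' '); do
        host=${entry%%:*}    # workers are written as host:group, keep only the host
        case ",$ips," in
            *",$host,"*) ;;
            *) echo "host $host is not listed in ips" ;;
        esac
    done
    # The file's own note says installPath must not be the current path.
    [ "$installPath" != "$PWD" ] || echo "installPath must differ from the current directory"
)
```

Pass the full path of the file, for example check_install_conf /home/appuser/dolphin/package/dolphinscheduler-bin/conf/config/install_config.conf; no output means both checks passed.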

 

3.7 Start the deployment

Run the install.sh deployment script
Switch to the deployment user (appuser in this guide), then run the one-click deployment script

cd /home/appuser/dolphin/package/dolphinscheduler-bin
sh install.sh

 

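Note:
On a first deployment, step 3 of the run (`3,stop server`) prints the following message five times; it can be ignored
sh: bin/dolphinscheduler-daemon.sh: No such file or directory
When the script finishes, it starts the following five services. Use the jps command (shipped with the Java JDK) to check whether they are running

MasterServer         ----- master service
WorkerServer         ----- worker service
LoggerServer         ----- logger service
ApiApplicationServer ----- api service
AlertServer          ----- alert service

If all of the services above started normally, the automatic deployment succeeded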
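
The jps check can be automated by comparing its output with the five expected process names. This sketch uses a hypothetical helper, missing_services, and assumes the usual jps output of one "pid name" pair per line.

```shell
# missing_services: read `jps` output on stdin and print each expected
# DolphinScheduler process name that does not appear in it.
missing_services() {
    jps_out=$(cat)
    for svc in MasterServer WorkerServer LoggerServer ApiApplicationServer AlertServer; do
        printf '%s\n' "$jps_out" \
            | awk -v s="$svc" '$2 == s { found = 1 } END { exit found ? 0 : 1 }' \
            || echo "$svc"
    done
}
```

Running jps | missing_services on the deployment host prints nothing when all five services are up.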

 

After a successful deployment you can inspect the logs, which are all stored in the logs directory

logs/
    ├── dolphinscheduler-alert-server.log
    ├── dolphinscheduler-master-server.log
    ├── dolphinscheduler-worker-server.log
    ├── dolphinscheduler-api-server.log
    └── dolphinscheduler-logger-server.log

 

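3.8 Log in to the system

Open the front-end page at http://192.168.xx.xx:12345/dolphinscheduler (replace the address with that of your own API server); the login page appears

Default username and password: admin/dolphinscheduler123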

 

 

 

 

Commands for starting and stopping the services

# Stop all services in the cluster
sh ./bin/stop-all.sh

# Start all services in the cluster
sh ./bin/start-all.sh

# Start and stop the Master
sh ./bin/dolphinscheduler-daemon.sh start master-server
sh ./bin/dolphinscheduler-daemon.sh stop master-server

# Start and stop the Worker
sh ./bin/dolphinscheduler-daemon.sh start worker-server
sh ./bin/dolphinscheduler-daemon.sh stop worker-server

# Start and stop the Api server
sh ./bin/dolphinscheduler-daemon.sh start api-server
sh ./bin/dolphinscheduler-daemon.sh stop api-server

# Start and stop the Logger
sh ./bin/dolphinscheduler-daemon.sh start logger-server
sh ./bin/dolphinscheduler-daemon.sh stop logger-server

# Start and stop the Alert server
sh ./bin/dolphinscheduler-daemon.sh start alert-server
sh ./bin/dolphinscheduler-daemon.sh stop alert-server

 

References
Apache DolphinScheduler v1.3.1 User Manual
https://www.bookstack.cn/read/dolphinscheduler-1.3.0-zh/%E9%83%A8%E7%BD%B2%E6%96%87%E6%A1%A3.md
Official DS cluster-mode deployment documentation
https://dolphinscheduler.apache.org/zh-cn/docs/1.2.1/user_doc/cluster-deployment.html
How to deploy Dolphin Scheduler 1.3.1 on CDH5
https://my.oschina.net/u/3701426/blog/4419921
GitHub DolphinScheduler 1.3.2 documentation
https://github.com/apache/incubator-dolphinscheduler-website/tree/master/docs/zh-cn/1.3.2/user_doc
Offline installation of MySQL 5.7 on CentOS 7
https://www.jellythink.com/archives/14

 


 

 

