Airflow (1): Installing Airflow on CentOS 7


Preparing the environment


 

1. Create a virtual environment with conda

conda create -n <env_name> python=<version>

conda create -n python3.6 python=3.6

2. List the virtual environments

conda info -e

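3. Switch to the environment

Linux: source activate your_env_name (the name of the virtual environment)

Windows: activate your_env_name (the name of the virtual environment)

On conda 4.4 and later, conda activate your_env_name is the recommended form on both platforms.

source activate python3.6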

4. Deactivate the environment

Linux: source deactivate (conda deactivate on conda 4.4 and later)

Windows: deactivate

Installing Airflow


 

1. Upgrade pip

pip install --upgrade pip

2. Install gcc (skip this step if it is already installed)

yum -y install gcc gcc-c++ kernel-devel

See my earlier notes for details.

3. Install the dependencies

pip3 install paramiko
yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel

4. Install Airflow

pip3 install apache-airflow  

5. Install pymysql

pip3 install pymysql

6. Configure the environment variable

# vi /etc/profile

  #airflow
  export AIRFLOW_HOME=/opt/airflow

# source /etc/profile
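Editing /etc/profile by hand works, but running the step twice leaves a duplicate export line. The sketch below shows one way to make the step repeatable. The helper name ensure_line is made up for this example, and the demo writes to a scratch file rather than the real /etc/profile.

```shell
# Sketch: append a line to a profile file only if it is not already there.
# ensure_line is a hypothetical helper, not part of Airflow or CentOS.
ensure_line() {
    local file="$1" line="$2"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; use /etc/profile (as root) for the real step.
PROFILE="$(mktemp)"
ensure_line "$PROFILE" 'export AIRFLOW_HOME=/opt/airflow'
ensure_line "$PROFILE" 'export AIRFLOW_HOME=/opt/airflow'   # second call adds nothing

# Load the setting into the current shell and confirm it
. "$PROFILE"
echo "AIRFLOW_HOME=$AIRFLOW_HOME"
```

Sourcing the file only affects the current shell; new login shells pick the variable up from /etc/profile automatically.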

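7. Initialize the database tables (a local SQLite database is used by default)

airflow initdb

Note: airflow initdb is the Airflow 1.x command; Airflow 2.x replaced it with airflow db init.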

This generates the following files in the configured AIRFLOW_HOME directory:

ls /opt/airflow

airflow.cfg airflow.db logs unittests.cfg

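8. Configure the MySQL database

Create the airflow database, then create a user and grant it privileges so that Airflow can access the database.

If MySQL is not installed yet, see my earlier note on installing MySQL on Linux.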

mysql> CREATE DATABASE airflow;
Query OK, 1 row affected (0.00 sec)

mysql> GRANT all privileges on airflow.* TO 'root'@'localhost'  IDENTIFIED BY 'root';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements
# This error is governed by validate_password_policy. Its default value is 1 (MEDIUM), so a password must meet the minimum length and contain digits, lowercase and uppercase letters, and special characters.
# For local testing you may not want such a complex password; here, for example, the root password is simply set to root.

Two global parameters have to be changed:

1) First, change the value of validate_password_policy:

mysql> set global validate_password_policy=0;
Query OK, 0 rows affected (0.00 sec)
# With policy 0 (LOW), a password is judged on its length alone, which is controlled by validate_password_length.

mysql> select @@validate_password_length;
+----------------------------+
| @@validate_password_length |
+----------------------------+
|                          8 |
+----------------------------+
1 row in set (0.00 sec)

2) Then lower validate_password_length so that a short password is accepted:

mysql> set global validate_password_length=1;
Query OK, 0 rows affected (0.00 sec)

mysql> select @@validate_password_length;
+----------------------------+
| @@validate_password_length |
+----------------------------+
|                          4 |
+----------------------------+
1 row in set (0.00 sec)
# The value is 4 rather than 1 because MySQL enforces a minimum of validate_password_number_count + validate_password_special_char_count + 2 * validate_password_mixed_case_count, which is 1 + 1 + 2 = 4 with the default settings.

mysql> GRANT all privileges on airflow.* TO 'root'@'localhost'  IDENTIFIED BY 'root';
Query OK, 0 rows affected, 1 warning (0.35 sec)

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.01 sec)

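9. Change the database configuration

Airflow requires explicit_defaults_for_timestamp to be enabled when MySQL is the metadata database:

mysql> set @@global.explicit_defaults_for_timestamp=on;

A value set this way is lost when MySQL restarts unless explicit_defaults_for_timestamp=1 is also added under [mysqld] in my.cnf.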

10. Configure Airflow

vim /opt/airflow/airflow.cfg
# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor, CeleryExecutor, DaskExecutor, KubernetesExecutor
#executor = SequentialExecutor
executor = LocalExecutor
 
# The SqlAlchemy connection string to the metadata database.
# SqlAlchemy supports many different database engine, more information
# their website
#sql_alchemy_conn = sqlite:////data/airflow/airflow.db
sql_alchemy_conn = mysql+pymysql://root:root@localhost:3306/airflow

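The executor setting accepts the following choices:

SequentialExecutor: runs tasks one at a time in a single process. It is the default executor, the only one that works with the SQLite database, and is normally used only for testing.

LocalExecutor: runs tasks locally in multiple processes.

CeleryExecutor: distributed scheduling, commonly used in production.

DaskExecutor: dynamic task scheduling, mainly used for data analysis.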
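The two values changed in airflow.cfg can also be supplied through environment variables, which Airflow reads using the pattern AIRFLOW__SECTION__KEY and which take precedence over the file. The sketch below assumes Airflow 1.x, where sql_alchemy_conn lives in the [core] section. The DB_* variable names are made up for this example.

```shell
# Sketch: override airflow.cfg settings with environment variables.
# The DB_* names are hypothetical; only the AIRFLOW__CORE__* names are read by Airflow.
DB_USER=root
DB_PASS=root      # a password with special characters would need URL-encoding
DB_HOST=localhost
DB_PORT=3306
DB_NAME=airflow

export AIRFLOW__CORE__EXECUTOR=LocalExecutor
export AIRFLOW__CORE__SQL_ALCHEMY_CONN="mysql+pymysql://${DB_USER}:${DB_PASS}@${DB_HOST}:${DB_PORT}/${DB_NAME}"

# Show the assembled connection string
echo "$AIRFLOW__CORE__SQL_ALCHEMY_CONN"
```

This keeps the database password out of airflow.cfg, at the cost of having to export the variables in every shell that starts an Airflow process.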

11. Initialize the database tables again

airflow initdb

12. Check the tables that Airflow created

mysql> use airflow;
mysql> show tables;

13. Start the services

airflow webserver
airflow scheduler

Running in the background

# Start the Airflow webserver UI; nohup keeps it running in the background
nohup airflow webserver -p 8080 > /opt/airflow/webLog.log 2>&1 &
# Start the Airflow scheduler so that scheduled tasks begin to run
nohup airflow scheduler > /opt/airflow/schedulerLog.log 2>&1 &
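If the PID of each background process is recorded when it is started, it can later be stopped without searching the ps output. The following is a minimal sketch of that idea. The helper start_bg and the .pid and .log file names are made up for this example, and the demo runs sleep instead of Airflow so that it works on any machine.

```shell
# Sketch: run a command under nohup, send its output to a log file, and record its PID.
# start_bg is a hypothetical helper, not an Airflow command.
start_bg() {
    local name="$1" run_dir="$2"
    shift 2
    nohup "$@" > "$run_dir/$name.log" 2>&1 &
    echo $! > "$run_dir/$name.pid"
}

# Demo with a harmless command in a scratch directory
RUN_DIR="$(mktemp -d)"
start_bg demo "$RUN_DIR" sleep 30

# For Airflow the calls would look like:
#   start_bg webserver /opt/airflow airflow webserver -p 8080
#   start_bg scheduler /opt/airflow airflow scheduler

# Stop the process later by its recorded PID
kill "$(cat "$RUN_DIR/demo.pid")"
```

A plain kill sends SIGTERM, which gives the process a chance to shut down cleanly; kill -9 is better kept as a last resort.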

Killing the processes

ps -ef|grep "airflow"|grep -v grep|cut -c 9-15|xargs kill -9

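Notes

"grep -v grep" removes the processes that contain the keyword "grep" from the listing, that is, the grep command itself.
"cut -c 9-15" extracts characters 9 through 15 of each input line, which is where ps -ef prints the process ID (PID); this depends on the fixed column widths of the ps output.
In "xargs kill -9", xargs passes the output of the preceding commands (the PIDs) to "kill -9" as arguments and runs that command.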
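The PID extraction can be tried safely on a single sample line before it is pointed at kill. In the sketch below the sample line is invented for illustration (it is not captured output) and follows the classic ps -ef layout with a five-character PID column. The awk variant is an alternative that selects the second whitespace-separated field, so it does not depend on column widths.

```shell
# A made-up ps -ef line for illustration only
sample='root     12345     1  0 10:00 ?        00:00:05 airflow scheduler'

# Character-based: take columns 9-15, then let xargs trim the surrounding spaces
pid_by_cut=$(printf '%s\n' "$sample" | cut -c 9-15 | xargs)

# Field-based: the PID is the second whitespace-separated field
pid_by_field=$(printf '%s\n' "$sample" | awk '{print $2}')

echo "cut: $pid_by_cut  awk: $pid_by_field"
```

On systems where PIDs are wider than the column range assumed here, the cut version can truncate the number, whereas the field-based version keeps working.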

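If the scheduler fails at startup with an error like

failed to log action with (sqlite3.operationalerror) no such table log

the shell that launched it most likely does not have AIRFLOW_HOME set, so Airflow falls back to its default home directory and an uninitialized SQLite database. Set the variable again and re-run the initialization:

export AIRFLOW_HOME=/opt/airflow

airflow initdb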
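A quick check before starting the services can catch this situation early. The sketch below is only an illustration: check_airflow_home is a made-up helper, and the demo uses a scratch directory standing in for /opt/airflow.

```shell
# Sketch: verify that AIRFLOW_HOME is set and already contains airflow.cfg.
# check_airflow_home is a hypothetical helper, not an Airflow command.
check_airflow_home() {
    if [ -z "${AIRFLOW_HOME:-}" ]; then
        echo "AIRFLOW_HOME is not set; Airflow would fall back to ~/airflow"
        return 1
    fi
    if [ ! -f "$AIRFLOW_HOME/airflow.cfg" ]; then
        echo "no airflow.cfg in $AIRFLOW_HOME; run airflow initdb first"
        return 1
    fi
    echo "using $AIRFLOW_HOME/airflow.cfg"
}

# Demo against a scratch directory standing in for /opt/airflow
AIRFLOW_HOME="$(mktemp -d)"
export AIRFLOW_HOME
check_airflow_home || true            # reports the missing airflow.cfg
touch "$AIRFLOW_HOME/airflow.cfg"
check_airflow_home                    # now passes
```

Calling the check at the top of a start script makes the failure visible before the scheduler is launched rather than in its log.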


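14. View in a browser

Open http://<server address>:8080 in a browser (8080 is the port passed to airflow webserver -p) to reach the Airflow web UI.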

