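Background: a script manages Spark jobs. Any job that is already in the RUNNING state is skipped and not submitted again.
I. Key points
The script itself matters less than the techniques it relies on.
1. List the YARN applications that are in the RUNNING state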
yarn application -list -appStates RUNNING
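The output of this command can be turned into a set of running job names. The sketch below assumes the layout that the script later in this post relies on: nine tab-separated columns per data row, with the application name at index 1 and the state at index 5. The function name `running_app_names` and the sample text are hypothetical and only serve as illustration.

```python
def running_app_names(yarn_output):
    """Return the names of applications whose State column is RUNNING."""
    names = set()
    for line in yarn_output.splitlines():
        columns = line.split('\t')
        # Assumed layout (taken from the script in this post): 9 tab-separated
        # columns, application name at index 1, state at index 5.
        if len(columns) == 9 and columns[5].strip() == 'RUNNING':
            names.add(columns[1].strip())
    return names


# Hypothetical sample output; the IDs, job names and URLs are made up.
SAMPLE_OUTPUT = (
    "Total number of applications (application-types: [] and states: [RUNNING]):2\n"
    "Application-Id\tApplication-Name\tApplication-Type\tUser\tQueue\tState\tFinal-State\tProgress\tTracking-URL\n"
    "application_0000000000000_0001\torders_etl\tSPARK\thadoop\tdefault\tRUNNING\tUNDEFINED\t10%\thttp://example.invalid:8088/proxy/1\n"
    "application_0000000000000_0002\tclicks_stream\tSPARK\thadoop\tdefault\tRUNNING\tUNDEFINED\t10%\thttp://example.invalid:8088/proxy/2\n"
)

print(sorted(running_app_names(SAMPLE_OUTPUT)))  # ['clicks_stream', 'orders_etl']
```

The summary line and the header line drop out on their own: the first has no tabs, and the header's state column reads "State" rather than "RUNNING".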
2. Run Linux commands from Python
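import os
# `ll` is usually an interactive-shell alias that /bin/sh does not know, so call `ls -l` directly.
os.system('ls -l /')               # runs the command and returns its exit status
print(os.popen('ls -l /').read())  # popen returns a file-like object; read() yields the output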
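To make the difference between the two calls concrete, here is a minimal sketch that assumes a Linux shell with `echo` available. `os.system` only reports the exit status, while `os.popen` lets you read what the command printed. The helper `run_and_capture` is a hypothetical name; it wraps `subprocess.run` from the standard library, which returns both the exit code and the output in one call.

```python
import os
import subprocess


def run_and_capture(command):
    """Run a shell command and return (exit_code, stdout_text)."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode, result.stdout


# os.system returns only a status; the command's output is not captured.
status = os.system("echo hello > /dev/null")

# os.popen returns a file-like object that yields the command's stdout.
with os.popen("echo hello") as pipe:
    popen_output = pipe.read()

# subprocess.run gives both the exit code and the captured output.
code, captured = run_and_capture("echo hello")
print(status, repr(popen_output), code, repr(captured))
```

For a script that has to parse command output, as the one in this post does, either `os.popen` or `subprocess.run` works; `os.system` alone does not, because the output never reaches Python.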
II. Complete script
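import os

# bash.txt holds one job per line in the form "job_name:submit_command".
name_bash_dict = {}
with open(r'./bash.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if ':' not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, since the command may contain colons.
        name, command = line.split(':', 1)
        name_bash_dict[name.strip()] = command.strip()

# Ask YARN for the applications that are currently RUNNING.
running_job_lines = os.popen("yarn application -list -appStates RUNNING")
for line in running_job_lines.readlines():
    column = line.split('\t')
    # Data rows have 9 tab-separated columns. The summary and header lines
    # never match this check, so no line counter is needed.
    if len(column) == 9 and column[5].strip() == 'RUNNING':
        jobName = column[1].strip()
        # pop() with a default avoids a KeyError for jobs not listed in bash.txt.
        name_bash_dict.pop(jobName, None)
running_job_lines.close()

# Submit every job that is not already running.
for v in name_bash_dict.values():
    os.system(v)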
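The script reads `./bash.txt` but the post does not show what that file looks like. Judging from the parsing code, each line is `job_name:submit_command`. The sketch below is a dry run under that assumption: it parses a few hypothetical lines, filters out the running jobs, and prints the commands instead of executing them. The function names, job names, class names and jar paths are all placeholders.

```python
def parse_job_lines(lines):
    """Parse "job_name:submit_command" lines into a dict."""
    jobs = {}
    for line in lines:
        line = line.strip()
        if ':' not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only; the command may contain colons.
        name, command = line.split(':', 1)
        jobs[name.strip()] = command.strip()
    return jobs


def commands_to_submit(jobs, running_names):
    """Return the commands for jobs whose names are not in running_names."""
    return [cmd for name, cmd in jobs.items() if name not in running_names]


# Hypothetical bash.txt contents; class names and jar paths are placeholders.
SAMPLE_LINES = [
    "orders_etl:spark-submit --master yarn --name orders_etl --class com.example.OrdersEtl hdfs:///jobs/orders.jar\n",
    "clicks_stream:spark-submit --master yarn --name clicks_stream --class com.example.ClicksStream hdfs:///jobs/clicks.jar\n",
]

jobs = parse_job_lines(SAMPLE_LINES)
for command in commands_to_submit(jobs, running_names={"orders_etl"}):
    print("would run:", command)  # dry run: print instead of os.system(command)
```

Note that the skip logic only works when the key in `bash.txt` matches the application name that YARN reports, so in this sketch each command sets `--name` to the same value as its key.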