Features
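Automated workflow
Why an automated workflow is needed
Automated workflows suit computations over large datasets. When the dataset is large, many computing tasks have to be run repeatedly, and automating them can greatly improve computational efficiency.
The workflow is driven by nightly, a tool developed in-house by the project team. It is a standalone executable with no environment dependencies and can be used as soon as it is downloaded. To use nightly, add the path of the folder that contains it to the PATH environment variable.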
nightly-ops 0.1.0

USAGE:
    nightly.exe [OPTIONS]

OPTIONS:
    -d, --debug
    -f, --flowfile <FLOWFILE>
    -h, --help                   Print help information
    -i, --input <INPUT>          Input JSON file
    -l, --list                   List available classes
    -V, --version                Print version information
Automated workflow examples
Run a workflow with:
nightly -f workflow.yaml
Simple workflow example
# workflow.yaml
args:
  - dir: "{cwd}"
tasks:
  - classname: Echo
    msg: "Hello, World!"
Complex workflow example
args:
  work_dir: "{cwd}"
tasks:
  - classname: Echo
    msg: "Start to compile run hitdic tasks"
  - classname: MakeDir
    dest: "{{work_dir}}/tasks"
  - classname: SystemCall
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["-h"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--mock", "--input=../workspace.spire", "--tdb=../start.tdb", "--output=initial.tdb", "--threads=64", "--phase=FCC", "--elements=CO,CR", "--binary=A*1000+B*T,2"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--start", "--strategy=ga_batch", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-cocr_couple", "--use_result=ga", "--used_sources=cocr_couple", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-cocr_interd", "--use_result=ga", "--used_sources=cocr_interd", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--start", "--strategy=nsga2", "--input=../workspace.spire", "--use_result=ga", "--tdb=initial.tdb", "--threads=64", "--algo_config=nsga2:maxiter=1000;nsga2:popsize=100;"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-nsga2", "--use_result=nsga2", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: true
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--start", "--strategy=nsga3", "--input=../workspace.spire", "--use_result=ga", "--tdb=initial.tdb", "--threads=64", "--algo_config=nsga3:maxiter=1000;nsga3:popsize=100;"]
    work_dir: "{{work_dir}}/tasks"
  - classname: SystemCall
    enable: true
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-nsga3", "--use_result=nsga3", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"
Automated workflow reference
SystemCall
classname: SystemCall
cmd: ls
args: ["-l", "-a"]
env_path: ["abs_path/to/folder1", "abs_path/to/folder2"]
work_dir: ./test_dir
CopyFile
classname: CopyFile
src: path/to/src
dest: path/to/dest
CopyFolder
classname: CopyFolder
src: path/to/src
dest: path/to/dest
Echo
classname: Echo
msg: "Hello, World!"
GzFolder
classname: GzFolder
src: path/to/src
dest: path/to/dest/file.tar.gz
HttpGetFile
classname: HttpGetFile
url: http://example.com/file
dest: path/to/dest/file
MakeDir
classname: MakeDir
path: path/to/dir
RenderTemplate
classname: RenderTemplate
template_path: path/to/template_file
output_path: path/to/output_file
args:
  key1: value1
  files: []
  folders: []
operators:
  - classname: ListDir
    dir: ./test_dir
    to: files
    is_directory: false
    is_file: true
  - classname: ListDir
    dir: ./test_dir
    to: folders
    is_directory: true
    is_file: false
WriteJson
classname: WriteJson
dest: path/to/file.json
json: {"key": "value", "key2": [1, 2, 3]}
operators:
  - type: replace
    path: $.key
    value: new_value
  - type: append
    path: $.key2
    value: 4
  - type: prepend
    path: $.key2
    value: 0
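To make the three operator types concrete, the sketch below shows the content that the WriteJson example would be expected to produce. This rests on two assumptions that are not confirmed here: that the operators are applied in the order listed, and that `path` is a JSONPath expression evaluated against the `json` value. The result is written in YAML flow syntax, the same notation the `json` field uses.

```yaml
# Expected content of path/to/file.json under the assumed semantics:
#   replace $.key   -> "value" becomes "new_value"
#   append  $.key2  -> 4 is added at the end of the list
#   prepend $.key2  -> 0 is added at the front of the list
{"key": "new_value", "key2": [0, 1, 2, 3, 4]}
```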
ZipFolder
classname: ZipFolder
src: path/to/src/folder
dest: path/to/dest/file.zip
UnZipFolder
classname: UnZipFolder
src: path/to/src/file.zip
dest: path/to/dest/folder
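The file tasks above can be chained in a single flow file. The following is a minimal sketch, assuming that tasks run in the order listed and that `{{work_dir}}` placeholders are expanded in `src` and `dest` the same way they are in the complex workflow example. The folder names `results` and `results_backup` are hypothetical.

```yaml
# backup.yaml (hypothetical flow file)
# Copies a results folder, then packs the copy into a .tar.gz archive.
args:
  work_dir: "{cwd}"
tasks:
  - classname: Echo
    msg: "Archiving results"
  # Copy the folder first so the archive is built from a stable snapshot.
  - classname: CopyFolder
    src: "{{work_dir}}/results"
    dest: "{{work_dir}}/results_backup"
  # Compress the copied folder.
  - classname: GzFolder
    src: "{{work_dir}}/results_backup"
    dest: "{{work_dir}}/results_backup.tar.gz"
```

It would be run the same way as the other examples: nightly -f backup.yaml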
Aliyun OSS
A .env file is required, with the following contents:
OSS_KEY_ID=xxx
OSS_KEY_SECRET=xxx
OSS_ENDPOINT=oss-cn-hongkong.aliyuncs.com
OSS_BUCKET=xxx
OSSFileDownload
classname: OSSFileDownload
filename: path/to/file/in/oss
dest: path/to/dest/file
OSSFileUpload
classname: OSSFileUpload
src: path/to/src/file
filename: path/to/file/in/oss
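The two OSS tasks can bracket a local processing flow: download an input, work on it, upload the result. The sketch below assumes that the .env file described above is in place, that tasks run in the order listed, and that `{{work_dir}}` is expanded in these fields as in the complex workflow example. The object keys under `inputs/` and `outputs/` are hypothetical.

```yaml
# oss_roundtrip.yaml (hypothetical flow file)
args:
  work_dir: "{cwd}"
tasks:
  # Fetch the archive from the bucket named by OSS_BUCKET.
  - classname: OSSFileDownload
    filename: inputs/dataset.zip
    dest: "{{work_dir}}/dataset.zip"
  # Unpack it locally.
  - classname: UnZipFolder
    src: "{{work_dir}}/dataset.zip"
    dest: "{{work_dir}}/dataset"
  # Processing steps (for example SystemCall tasks) would go here.
  # Repack the folder and upload the result.
  - classname: ZipFolder
    src: "{{work_dir}}/dataset"
    dest: "{{work_dir}}/dataset_processed.zip"
  - classname: OSSFileUpload
    src: "{{work_dir}}/dataset_processed.zip"
    filename: outputs/dataset_processed.zip
```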