Features

Automated Workflows

Why Automated Workflows Are Needed

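Automated workflows are suited to computations over large datasets. At that scale, many computing tasks have to be run repeatedly, and automating them substantially improves efficiency. The workflows are driven by nightly, a tool developed in-house by the project team. It ships as a standalone executable with no environment dependencies, so it can be used as soon as it is downloaded. To call nightly from any location, add the folder that contains it to the PATH environment variable.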

nightly-ops 0.1.0

USAGE:
    nightly.exe [OPTIONS]

OPTIONS:
    -d, --debug
    -f, --flowfile <FLOWFILE>
    -h, --help                   Print help information
    -i, --input <INPUT>          Input JSON file
    -l, --list                   List available classes
    -V, --version                Print version information
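
To confirm that nightly is reachable after its folder has been added to PATH, a small check along the following lines can help. This is a minimal Python sketch, not part of nightly itself; the helper names `with_tool_dir` and `find_nightly` are hypothetical, and the lookup relies only on the standard-library `shutil.which`.

```python
import os
import shutil


def with_tool_dir(path_value, tool_dir):
    """Return a PATH string with tool_dir appended, unless it is already listed."""
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    if tool_dir not in entries:
        entries.append(tool_dir)
    return os.pathsep.join(entries)


def find_nightly(path_value):
    """Return the full path of the nightly executable found on path_value, or None."""
    return shutil.which("nightly", path=path_value)


# Report whether nightly is visible on the current PATH.
print(find_nightly(os.environ.get("PATH", "")))
```

If the printed value is `None`, the folder containing nightly is not yet on PATH for the current session.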

Workflow Examples

Run a workflow with:

nightly -f workflow.yaml
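
When workflows are launched from a script rather than by hand, the same command can be assembled programmatically. The sketch below is one possible Python wrapper, assuming nightly is on PATH. It uses only the `-f` and `-d` flags listed in the help output above; the function names are hypothetical, and the meaning of the exit code is not documented here, so it is returned unchanged.

```python
import subprocess


def build_command(flowfile, debug=False, executable="nightly"):
    """Assemble the argument list for one nightly run."""
    command = [executable, "-f", str(flowfile)]
    if debug:
        # -d/--debug appears in the help output; its exact effect is not documented there.
        command.append("-d")
    return command


def run_workflow(flowfile, debug=False):
    """Run nightly on flowfile and return the process exit code as-is."""
    completed = subprocess.run(build_command(flowfile, debug), check=False)
    return completed.returncode
```

Calling `run_workflow("workflow.yaml")` would then be equivalent to typing the command above in a terminal.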

Simple Workflow Example

# workflow.yaml
args:
  dir: "{cwd}"

tasks:
  - classname: Echo
    msg: "Hello, World!"

Complex Workflow Example

args:
  work_dir: "{cwd}"

tasks:
  - classname: Echo
    msg: "Start to compile run hitdic tasks"

  - classname: MakeDir
    dest: "{{work_dir}}/tasks"

  - classname: SystemCall
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["-h"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--mock", "--input=../workspace.spire", "--tdb=../start.tdb", "--output=initial.tdb", "--threads=64", "--phase=FCC", "--elements=CO,CR", "--binary=A*1000+B*T,2"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--start", "--strategy=ga_batch", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-cocr_couple", "--use_result=ga", "--used_sources=cocr_couple", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-cocr_interd", "--use_result=ga", "--used_sources=cocr_interd", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--start", "--strategy=nsga2", "--input=../workspace.spire", "--use_result=ga", "--tdb=initial.tdb", "--threads=64", "--algo_config=nsga2:maxiter=1000;nsga2:popsize=100;"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: false
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-nsga2", "--use_result=nsga2", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: true
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--start", "--strategy=nsga3", "--input=../workspace.spire", "--use_result=ga", "--tdb=initial.tdb", "--threads=64", "--algo_config=nsga3:maxiter=1000;nsga3:popsize=100;"]
    work_dir: "{{work_dir}}/tasks"

  - classname: SystemCall
    enable: true
    cmd: "hitdic_opt"
    env_path: ["{{work_dir}}/../../../build/bin/"]
    args: ["--report", "--report_dir=report-nsga3", "--use_result=nsga3", "--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
    work_dir: "{{work_dir}}/tasks"
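
The hitdic_opt steps above repeat the same env_path and work_dir and differ mainly in enable and args, so a long workflow file can be generated instead of written by hand. The following Python sketch shows one way to do that for a shortened version of the NSGA-III steps. The helpers `system_call`, `render_task`, and `render_workflow` are hypothetical, and the output relies on the assumption that nightly's YAML parser accepts double-quoted scalars and flow-style lists, which `json.dumps` happens to produce.

```python
import json

# Settings repeated by the hitdic_opt steps in the example above.
ENV_PATH = ["{{work_dir}}/../../../build/bin/"]
WORK_DIR = "{{work_dir}}/tasks"


def system_call(args, enable=True, cmd="hitdic_opt"):
    """Build one SystemCall task as a dict, with keys in the order used above."""
    return {
        "classname": "SystemCall",
        "enable": enable,
        "cmd": cmd,
        "env_path": ENV_PATH,
        "args": list(args),
        "work_dir": WORK_DIR,
    }


def render_task(task):
    """Render one task as a YAML list item; values are emitted in JSON form."""
    lines = []
    for index, (key, value) in enumerate(task.items()):
        prefix = "  - " if index == 0 else "    "
        lines.append(f"{prefix}{key}: {json.dumps(value)}")
    return "\n".join(lines)


def render_workflow(tasks):
    """Render the args header followed by all tasks."""
    header = 'args:\n  work_dir: "{cwd}"\n\ntasks:\n'
    body = "\n\n".join(render_task(task) for task in tasks)
    return header + body + "\n"


common = ["--input=../workspace.spire", "--tdb=initial.tdb", "--threads=64"]
steps = [
    system_call(["--start", "--strategy=nsga3", "--use_result=ga", *common]),
    system_call(["--report", "--report_dir=report-nsga3", "--use_result=nsga3", *common]),
]
print(render_workflow(steps))
```

Redirecting the printed text into a file gives a workflow that can be passed to `nightly -f`, provided the parser assumption above holds.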

Workflow Task Reference

SystemCall

classname: SystemCall
cmd: ls
args: ["-l", "-a"]
env_path: ["abs_path/to/folder1", "abs_path/to/folder2"]
work_dir: ./test_dir

CopyFile

classname: CopyFile
src: path/to/src
dest: path/to/dest

CopyFolder

classname: CopyFolder
src: path/to/src
dest: path/to/dest

Echo

classname: Echo
msg: "Hello, World!"

GzFolder

classname: GzFolder
src: path/to/src
dest: path/to/dest/file.tar.gz

HttpGetFile

classname: HttpGetFile
url: http://example.com/file
dest: path/to/dest/file

MakeDir

classname: MakeDir
path: path/to/dir

RenderTemplate

classname: RenderTemplate
template_path: path/to/template_file
output_path: path/to/output_file
args:
    key1: value1
    files: []
    folders: []
operators:
    - classname: ListDir
      dir: ./test_dir
      to: files
      is_directory: false
      is_file: true
    - classname: ListDir
      dir: ./test_dir
      to: folders
      is_directory: true
      is_file: false

WriteJson

classname: WriteJson
dest: path/to/file.json
json: {"key": "value", "key2": [1, 2, 3]}
operators:
    - type: replace
      path: $.key
      value: new_value
    - type: append
      path: $.key2
      value: 4
    - type: prepend
      path: $.key2
      value: 0
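
The operators list reads as a sequence of edits applied to the json value before it is written. The Python sketch below illustrates one plausible reading of the example above. The semantics are assumptions inferred from the operator names, not documented behavior of nightly: operators run in order, paths are simple `$.`-prefixed dotted keys, append adds to the end of a list, and prepend inserts at the front.

```python
import copy


def resolve_parent(document, path):
    """Split a '$.a.b' style path into (parent container, final key)."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    keys = path[2:].split(".")
    parent = document
    for key in keys[:-1]:
        parent = parent[key]
    return parent, keys[-1]


def apply_operators(document, operators):
    """Return a copy of document with each operator applied in order."""
    result = copy.deepcopy(document)
    for operator in operators:
        parent, key = resolve_parent(result, operator["path"])
        kind = operator["type"]
        if kind == "replace":
            parent[key] = operator["value"]
        elif kind == "append":
            parent[key].append(operator["value"])
        elif kind == "prepend":
            parent[key].insert(0, operator["value"])
        else:
            raise ValueError(f"unknown operator type: {kind}")
    return result


document = {"key": "value", "key2": [1, 2, 3]}
operators = [
    {"type": "replace", "path": "$.key", "value": "new_value"},
    {"type": "append", "path": "$.key2", "value": 4},
    {"type": "prepend", "path": "$.key2", "value": 0},
]
print(apply_operators(document, operators))
# prints {'key': 'new_value', 'key2': [0, 1, 2, 3, 4]}
```

Under these assumptions, the file written by the WriteJson example would contain `new_value` for `key` and the list 0 through 4 for `key2`.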

ZipFolder

classname: ZipFolder
src: path/to/src/folder
dest: path/to/dest/file.zip

UnZipFolder

classname: UnZipFolder
src: path/to/src/file.zip
dest: path/to/dest/folder

Aliyun OSS

A .env file is required, with the following contents:

OSS_KEY_ID=xxx
OSS_KEY_SECRET=xxx
OSS_ENDPOINT=oss-cn-hongkong.aliyuncs.com
OSS_BUCKET=xxx
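
Because the OSS tasks depend on all four variables, it can be useful to check the .env file before starting a workflow. The following is a minimal Python sketch of such a check. It assumes plain `KEY=value` lines as shown above, and the function names are hypothetical; how nightly itself parses the file is not documented here.

```python
REQUIRED_KEYS = ("OSS_KEY_ID", "OSS_KEY_SECRET", "OSS_ENDPOINT", "OSS_BUCKET")


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text):
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


sample = "OSS_KEY_ID=xxx\nOSS_KEY_SECRET=xxx\nOSS_ENDPOINT=oss-cn-hongkong.aliyuncs.com\n"
print(missing_keys(sample))
# prints ['OSS_BUCKET']
```

An empty list means every required variable is present and non-empty.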

OSSFileDownload

classname: OSSFileDownload
filename: path/to/file/in/oss
dest: path/to/dest/file

OSSFileUpload

classname: OSSFileUpload
src: path/to/src/file
filename: path/to/file/in/oss