Implement a custom stage in and stage out

bbrauzzi edited this page Dec 20, 2022 · 5 revisions

To implement a custom stage-in and stage-out and wrap them around an application package, we highly recommend using the CWL Wrapper tool.

CWL-Wrapper

CWL Wrapper is a library used by the ADES to add the stage-in and stage-out steps to an existing application package. It can also be used as a standalone CLI tool for development, testing and validation purposes.

The CWL wrapper tool takes five inputs: four options and the user application package.

  • --maincwl: defines the template of the overall workflow
  • --rulez: defines the rules that establish connections and conventions with the user CWL
  • --stagein: data stage-in phase
  • --stageout: data stage-out phase
  • app.cwl: user application package
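As a rough illustration of how these inputs combine on the command line, the following Python sketch assembles a cwl-wrapper argument list. The helper name `build_wrapper_command` is hypothetical and not part of cwl-wrapper; only the four option names come from the list above. Options left unset are omitted, on the assumption that the tool then falls back to its defaults.

```python
import shlex
from typing import List, Optional


def build_wrapper_command(
    app_cwl: str,
    maincwl: Optional[str] = None,
    rulez: Optional[str] = None,
    stagein: Optional[str] = None,
    stageout: Optional[str] = None,
) -> List[str]:
    """Assemble a cwl-wrapper argument list (hypothetical helper).

    Options set to None are skipped, assuming cwl-wrapper then uses
    its built-in default for that template.
    """
    command = ["cwl-wrapper", app_cwl]
    options = {
        "--maincwl": maincwl,
        "--rulez": rulez,
        "--stagein": stagein,
        "--stageout": stageout,
    }
    for flag, path in options.items():
        if path is not None:
            command += [flag, path]
    return command


if __name__ == "__main__":
    # File names here are placeholders for your own templates.
    argv = build_wrapper_command(
        "app.cwl", stagein="myStageIn.cwl", stageout="myStageOut.cwl"
    )
    print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run` once cwl-wrapper is installed.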

Prerequisites

To use the cwl-wrapper tool, first install these CLI applications:

  • python
  • conda
  • cwltool
  • wget
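A quick way to confirm the prerequisites are available is to look each one up on the `PATH`. The sketch below is only a convenience check; the executable names are assumptions (in particular, `cwltool` is the name the CWL reference runner usually installs under) and may differ on your system.

```python
import shutil
from typing import Iterable, List

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ("python", "conda", "cwltool", "wget")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```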

Get Started

  • Install the cwl-wrapper tool using conda
conda install -c eoepca cwl-wrapper
  • For this tutorial we will use the s-expression app.
    To download the application package run the following wget command:
wget https://raw.githubusercontent.com/EOEPCA/app-s-expression/main/app-s-expression.dev.0.0.2.cwl
  • To generate a new application package from the s-expression app that includes the default stage-in and stage-out phases, run the following command:
cwl-wrapper app-s-expression.dev.0.0.2.cwl

The output of this command should be the following:

$graph:
- class: Workflow
  doc: Main stage manager
  id: main
  inputs:
    cbn:
      doc: Common band name
      id: cbn
      label: Common band name
      type: string
    input_reference:
      doc: Input product reference
      id: input_reference
      label: Input product reference
      type: string
    job:
      doc: the job doc
      label: the job doc
      type: string
    outputfile:
      doc: the outputfile doc
      label: the outputfile doc
      type: string
    s_expression:
      doc: s expression
      id: s_expression
      label: s expression
      type: string
    store_apikey:
      doc: the store_apikey doc
      label: the store_apikey doc
      type: string
    store_host:
      doc: the store_host doc
      label: the store_host doc
      type: string
    store_username:
      doc: the store_username doc
      label: the store_username doc
      type: string
  label: macro-cwl
  outputs:
    wf_outputs:
      outputSource:
      - node_stage_out/wf_outputs_out
      type: Directory
  requirements:
    ScatterFeatureRequirement: {}
    SubworkflowFeatureRequirement: {}
  steps:
    node_stage_in:
      in:
        input: input_reference
      out:
      - input_reference_out
      run:
        arguments:
        - position: 1
          prefix: -t
          valueFrom: ./
        baseCommand: stage-in
        class: CommandLineTool
        hints:
          DockerRequirement:
            dockerPull: eoepca/stage-in:0.2
        id: stagein
        inputs:
          input:
            inputBinding:
              position: 2
            type: string
        outputs:
          input_reference_out:
            outputBinding:
              glob: .
            type: Directory
        requirements:
          EnvVarRequirement:
            envDef:
              PATH: /opt/anaconda/envs/env_stagein/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
          ResourceRequirement: {}
    node_stage_out:
      in:
        job: job
        outputfile: outputfile
        store_apikey: store_apikey
        store_host: store_host
        store_username: store_username
        wf_outputs: on_stage/wf_outputs
      out:
      - wf_outputs_out
      run:
        baseCommand: stage-out
        class: CommandLineTool
        hints:
          DockerRequirement:
            dockerPull: eoepca/stage-out:0.2
        inputs:
          job:
            doc: the job doc
            inputBinding:
              position: 1
              prefix: --job
            label: the job doc
            type: string
          outputfile:
            doc: the outputfile doc
            inputBinding:
              position: 5
              prefix: --outputfile
            label: the outputfile doc
            type: string
          store_apikey:
            doc: the store_apikey doc
            inputBinding:
              position: 4
              prefix: --store-apikey
            label: the store_apikey doc
            type: string
          store_host:
            doc: the store_host doc
            inputBinding:
              position: 2
              prefix: --store-host
            label: the store_host doc
            type: string
          store_username:
            doc: the store_username doc
            inputBinding:
              position: 3
              prefix: --store-username
            label: the store_username doc
            type: string
          wf_outputs:
            type: Directory
        outputs:
          wf_outputs_out:
            outputBinding:
              glob: .
            type: Directory
        requirements:
          EnvVarRequirement:
            envDef:
              PATH: /opt/anaconda/envs/env_stageout/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
          ResourceRequirement: {}
    on_stage:
      in:
        cbn: cbn
        input_reference: node_stage_in/input_reference_out
        s_expression: s_expression
      out:
      - wf_outputs
      run: '#s-expression'
- baseCommand: s-expression
  class: CommandLineTool
  hints:
    DockerRequirement:
      dockerPull: eoepca/s-expression:dev0.0.2
  id: clt
  inputs:
    cbn:
      inputBinding:
        position: 3
        prefix: --cbn
      type: string
    input_reference:
      inputBinding:
        position: 1
        prefix: --input_reference
      type: Directory
    s_expression:
      inputBinding:
        position: 2
        prefix: --s-expression
      type: string
  outputs:
    results:
      outputBinding:
        glob: .
      type: Directory
  requirements:
    EnvVarRequirement:
      envDef:
        PATH: /srv/conda/envs/env_app_snuggs/bin:/srv/conda/envs/env_app_snuggs/bin:/srv/conda/bin:/srv/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    ResourceRequirement: {}
- class: Workflow
  doc: Applies s expressions to EO acquisitions
  id: s-expression
  inputs:
    cbn:
      doc: Common band name
      label: Common band name
      type: string
    input_reference:
      doc: Input product reference
      label: Input product reference
      type: Directory
    s_expression:
      doc: s expression
      label: s expression
      type: string
  label: s expressions
  outputs:
  - id: wf_outputs
    outputSource:
    - step_1/results
    type: Directory
  steps:
    step_1:
      in:
        cbn: cbn
        input_reference: input_reference
        s_expression: s_expression
      out:
      - results
      run: '#clt'
$namespaces:
  s: https://schema.org/
cwlVersion: v1.0
s:softwareVersion: 0.0.2
schemas:
- http://schema.org/version/9.0/schemaorg-current-http.rdf
  • The CWL above is the application package that the ADES would produce and execute for the s-expression app with the default stage-in and stage-out CWL files. To validate and test the generated application package on your local machine, run the following commands:

    • save the generated application package to a file called `wrapped_s-expression.cwl`
    cwl-wrapper app-s-expression.dev.0.0.2.cwl > wrapped_s-expression.cwl
    
    • validate the CWL using cwltool
      cwltool --validate wrapped_s-expression.cwl
    

    The output should look like this:

      INFO /home/test/.local/bin/cwltool 3.1.20211104071347
      INFO Resolved 'wrapped_s-expression.cwl' to 'file:///home/test/wrapped_s-expression.cwl'
      wrapped_s-expression.cwl is valid CWL.
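Beyond syntactic validation, it can help to confirm that the wrapper wired the three steps of the `main` workflow in the expected order. The sketch below is an illustration only: it copies the `in` mappings of `node_stage_in`, `on_stage` and `node_stage_out` from the generated package above into a plain dictionary and derives the execution order from them. The function name `execution_order` is hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List

# "in" mappings copied from the steps of the generated "main" workflow.
MAIN_STEPS: Dict[str, Dict[str, Dict[str, str]]] = {
    "node_stage_in": {"in": {"input": "input_reference"}},
    "node_stage_out": {
        "in": {
            "job": "job",
            "outputfile": "outputfile",
            "store_apikey": "store_apikey",
            "store_host": "store_host",
            "store_username": "store_username",
            "wf_outputs": "on_stage/wf_outputs",
        }
    },
    "on_stage": {
        "in": {
            "cbn": "cbn",
            "input_reference": "node_stage_in/input_reference_out",
            "s_expression": "s_expression",
        }
    },
}


def execution_order(steps: Dict[str, Dict[str, Dict[str, str]]]) -> List[str]:
    """Order steps so that each one runs after the steps it reads from.

    A source written as "step/output" refers to another step; a bare
    name refers to a workflow-level input and adds no dependency.
    """
    graph = {}
    for name, step in steps.items():
        graph[name] = {
            source.split("/", 1)[0]
            for source in step["in"].values()
            if "/" in source
        }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(" -> ".join(execution_order(MAIN_STEPS)))
```

For the package above, the stage-in step feeds `on_stage`, which in turn feeds the stage-out step.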
    

Custom Stage in and Stage out

You are now ready to implement your own stage-in and stage-out CWL workflows and use them in the cwl-wrapper command as follows:

cwl-wrapper app-s-expression.dev.0.0.2.cwl --stagein myStageIn.cwl --stageout myStageOut.cwl > wrapped_s-expression.cwl

Here are some considerations for writing the stagein and stageout CWL workflows:

  • The output of the stage-in CWL workflow template should be of the same type as the input of the application package (i.e. Directory or Directory[]).
  • The input of the stage-out CWL workflow template should be of the same type as the output of the application package (i.e. Directory or Directory[]).
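The two type constraints above can be checked mechanically before running the wrapper. The following sketch is a simplified illustration, not part of cwl-wrapper: the function names are hypothetical, the application package is represented as an already parsed dictionary (here a reduced copy of the `s-expression` workflow shown earlier), and only the plain string forms `Directory` and `Directory[]` are recognised. Array types written in the expanded `{type: array, items: Directory}` form are not handled.

```python
from typing import Any, Dict, List

DIRECTORY_TYPES = {"Directory", "Directory[]"}

# Reduced copy of the s-expression workflow's ports from the package above.
APP: Dict[str, Any] = {
    "inputs": {
        "cbn": {"type": "string"},
        "input_reference": {"type": "Directory"},
        "s_expression": {"type": "string"},
    },
    "outputs": [{"id": "wf_outputs", "type": "Directory"}],
}


def port_types(ports: Any) -> Dict[str, Any]:
    """Return {port_id: type} for CWL ports in map form or list form."""
    if isinstance(ports, dict):
        return {
            pid: spec if isinstance(spec, str) else spec["type"]
            for pid, spec in ports.items()
        }
    return {spec["id"]: spec["type"] for spec in ports}


def check_staging_types(
    app: Dict[str, Any], stagein_output_type: str, stageout_input_type: str
) -> List[str]:
    """List mismatches between staging templates and the application ports."""
    problems = []
    for pid, ptype in port_types(app["inputs"]).items():
        if ptype in DIRECTORY_TYPES and ptype != stagein_output_type:
            problems.append(
                f"input '{pid}' is {ptype}, stage-in produces {stagein_output_type}"
            )
    for pid, ptype in port_types(app["outputs"]).items():
        if ptype in DIRECTORY_TYPES and ptype != stageout_input_type:
            problems.append(
                f"output '{pid}' is {ptype}, stage-out expects {stageout_input_type}"
            )
    return problems


if __name__ == "__main__":
    for problem in check_staging_types(APP, "Directory[]", "Directory"):
        print(problem)
```

An empty result means the declared types line up; any entry points to a port whose type should be reconciled with the corresponding staging template.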

⏭️ You are now ready to proceed to the Get Started section.