Analysis using CLI

The platform APIs for creating and running pipelines support CWL definitions submitted in the appropriate format. CWL workflows can be written in JSON or YAML and may span multiple files. To create a workflow version with a CWL definition using the WES APIs, the workflow must be packed into a single JSON document and passed to the definition field of the create workflow version API.

Multi-file CWL Definitions

The platform APIs for creating a workflow currently require that the definition be specified in a single document. Many CWL workflows are defined using multiple files, with each file containing a single step of the workflow. To compile a multi-file CWL workflow into a single document, the open-source cwltool utility provides a mechanism for "packing" the workflow into a single JSON object. Follow the cwltool installation instructions to install the utility.

With cwltool installed, the --pack option combines a multi-file CWL workflow into a single JSON document (it also works on single-file CWL workflows). See the cwltool documentation for details on the --pack option. Once the workflow has been packed, the JSON contents can be used to create a workflow version.

CWL workflows commonly declare input arguments that must be set when launching the workflow. The WES API that launches a workflow version requires the inputs in JSON format. To obtain an input template for the packed workflow, run the cwltool --make-template option on the packed CWL JSON file. The output of --make-template may still be formatted in YAML; the yq python package may be used to convert it to JSON. Once converted, the inputs can be filled in and passed to the launch workflow request.
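The packing step described above can also be scripted. The following is a minimal Python sketch, assuming cwltool is installed and on the PATH. The helper names pack_command, parse_packed, and pack_workflow are hypothetical and are not part of cwltool or the platform CLI. The sketch runs cwltool --pack and checks that the result parses as a single JSON object before it is used as a definition.

```python
import json
import subprocess


def pack_command(cwl_path):
    """Build the cwltool command line that packs a workflow (hypothetical helper)."""
    return ["cwltool", "--pack", cwl_path]


def parse_packed(text):
    """Parse packed output and confirm it is a single JSON object."""
    # json.loads raises a ValueError subclass if the text is not JSON (for example YAML).
    document = json.loads(text)
    if not isinstance(document, dict):
        raise ValueError("packed CWL must be a single JSON object")
    return document


def pack_workflow(cwl_path):
    """Run cwltool --pack and return the packed document as a dict."""
    result = subprocess.run(
        pack_command(cwl_path), capture_output=True, text=True, check=True
    )
    return parse_packed(result.stdout)
```

Calling pack_workflow on a CWL file would return the packed document as a dict; json.dumps on that dict gives the text to save as the definition file. Unpacked YAML input to parse_packed fails fast with a ValueError rather than being submitted.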

Quick Start

This example uses the 1st-tool.cwl example CWL workflow from the CWL GitHub repository.

The 1st-tool.cwl file is written in YAML, which is not compatible with the WES APIs:

cwlVersion: v1.0
class: CommandLineTool
baseCommand: echo
inputs:
  message:
    type: string
    inputBinding:
      position: 1
outputs: []

In order to convert to JSON, the file is passed to cwltool --pack:

cwltool --pack 1st-tool.cwl > 1st-tool-pack.cwl

The 1st-tool-pack.cwl file will contain the packed workflow in JSON format:

{
    "class": "CommandLineTool",
    "baseCommand": "echo",
    "inputs": [
        {
            "type": "string",
            "inputBinding": {
                "position": 1
            },
            "id": "#main/message"
        }
    ],
    "outputs": [],
    "id": "#main",
    "cwlVersion": "v1.0"
}

This JSON can now be used to create a CWL workflow version. Save the JSON output to a file to use with the command-line interface when creating the workflow.
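Before creating the workflow version, it can help to confirm which inputs the packed document declares. The sketch below is an illustration only: input_names is a hypothetical helper, and it assumes the layout shown in the packed example above, where inputs is a list of objects whose id looks like "#main/message". Packed multi-file workflows may be laid out differently and are not handled here.

```python
import json

# The packed document from the example above.
PACKED = """
{
    "class": "CommandLineTool",
    "baseCommand": "echo",
    "inputs": [
        {
            "type": "string",
            "inputBinding": {"position": 1},
            "id": "#main/message"
        }
    ],
    "outputs": [],
    "id": "#main",
    "cwlVersion": "v1.0"
}
"""


def input_names(packed):
    """Return the declared input names without the '#main/' style prefix."""
    names = []
    for entry in packed.get("inputs", []):
        # Keep only the part of the id after the last slash.
        names.append(entry["id"].rsplit("/", 1)[-1])
    return names


print(input_names(json.loads(PACKED)))  # prints ['message']
```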

First create a workflow resource.

$ ica workflows create --name cwl-example
id                wfl.3edad558930d48a9ba141894bEXAMPLE
name              cwl-example
...

Note the workflow resource ID. Next create a workflow version, passing in the packed CWL definition file. In the example below, the file is saved at /example.cwl. Ensure the --language-name option is passed with cwl as the input value.

$ ica workflows versions create wfl.3edad558930d48a9ba141894bEXAMPLE --version 1.0.0 --definition /example.cwl --language-name cwl
id                wfv.21ca1821e09a47d1adb993b975EXAMPLE
language.name     cwl
version           1.0.0
...

This CWL workflow defines an input argument, which must be supplied when launching the workflow. The command-line interface requires the input in JSON format. A simple way to obtain a JSON-formatted input template is to run the cwltool --make-template option on the packed CWL workflow and pipe the output through yq for conversion to JSON.

cwltool --make-template 1st-tool-pack.cwl | yq r -j - > 1st-tool-packed.input.json

The 1st-tool-packed.input.json file will contain the input template to use for launching the CWL workflow:

{
  "message": "a_string"
}

Save the input JSON file to be passed into the command to launch the workflow.
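The template's placeholder values can be replaced by hand or with a short script. The following Python sketch is one way to do it; fill_inputs is a hypothetical helper, not part of cwltool or the CLI. It rejects names the template does not define, which catches typos before launch.

```python
import json


def fill_inputs(template_text, **values):
    """Return input JSON text with template placeholders replaced by real values."""
    inputs = json.loads(template_text)
    unknown = set(values) - set(inputs)
    if unknown:
        raise KeyError(f"not defined by the workflow: {sorted(unknown)}")
    inputs.update(values)
    return json.dumps(inputs, indent=2)


# The template produced in the example above.
template = '{"message": "a_string"}'
print(fill_inputs(template, message="Hello World"))
```

The printed text can be written to the input JSON file that is passed to the launch command.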

In this example, the input contains a single field, message. The value may be changed for each workflow launch. The launch command provides an --input option that takes the path to the input JSON file. In the example below, the input JSON file is saved at /input.json.

$ ica workflows versions launch wfl.3edad558930d48a9ba141894bEXAMPLE 1.0.0 --name test --input /input.json
id                                wfr.72ea78951dd0440a9a66b1948EXAMPLE
name                              test
status                            Running
workflowVersion.id                wfv.21ca1821e09a47d1adb993b975EXAMPLE
workflowVersion.language.name     cwl
workflowVersion.version           1.0.0
...

After executing the launch command, a workflow run resource will be created. The run ID can be used to monitor the workflow run. The run details can be retrieved to view the current status of the run as well as view the input, output, and any errors that may have occurred.

$ ica workflows runs get wfr.72ea78951dd0440a9a66b1948EXAMPLE
id                                wfr.72ea78951dd0440a9a66b1948EXAMPLE
input                             {"message":"a_string"}
output                            null
status                            Running
...
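For scripted monitoring, the run details shown above can be read into a dictionary. This is a rough Python sketch that assumes the two-column text layout shown in the example stays as printed; parse_run_details is a hypothetical helper. Pass it only the output lines, not the command line itself.

```python
import json


def parse_run_details(text):
    """Turn 'key   value' lines from the CLI's text output into a dict."""
    details = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:  # skips blank lines and the trailing "..."
            details[parts[0]] = parts[1].strip()
    return details


# Output copied from the example above.
SAMPLE = """\
id                                wfr.72ea78951dd0440a9a66b1948EXAMPLE
input                             {"message":"a_string"}
output                            null
status                            Running
...
"""

run = parse_run_details(SAMPLE)
print(run["status"])
print(json.loads(run["input"])["message"])
```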

During the lifetime of the run, history events are produced to provide a form of logging as each step in the workflow is executed. These history events include details like status, inputs, outputs, and duration for each step.

$ ica workflows runs history wfr.72ea78951dd0440a9a66b1948EXAMPLE
NAME            EVENTTYPE       EVENTID PREVIOUSEVENTID TIMESTAMP
test            RunStarted      0       0               2020-09-28 06:22:16.575 -0700 PDT
main_launch     Started         244884  244883          2020-09-28 06:22:58.034 -0700 PDT
main_launch     Succeeded       244885  244884          2020-09-28 06:23:00.14 -0700 PDT

Use the JSON output format to see the full contents of each of the history events.

ica workflows runs history wfr.72ea78951dd0440a9a66b1948EXAMPLE -o json
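The JSON output is the better source for scripts, but its schema is not shown here. As an illustration only, the Python sketch below parses the table output shown earlier and computes how long a step took. parse_history and step_seconds are hypothetical helpers, and the sketch assumes the column layout and timestamp format from the example, including that a fraction such as .14 means 140 milliseconds.

```python
from datetime import datetime


def parse_history(text):
    """Parse the table printed by the history command into a list of dicts."""
    events = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)  # the timestamp (5th column) contains spaces
        if len(fields) != 5:
            continue
        name, event_type, event_id, previous_id, stamp = fields
        # Drop the trailing zone abbreviation (e.g. "PDT"); the numeric offset is kept.
        stamp = stamp.strip().rsplit(" ", 1)[0]
        events.append({
            "name": name,
            "type": event_type,
            "id": int(event_id),
            "previous": int(previous_id),
            "time": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f %z"),
        })
    return events


def step_seconds(events, name):
    """Seconds between the Started and Succeeded events of one step."""
    times = {e["type"]: e["time"] for e in events if e["name"] == name}
    return (times["Succeeded"] - times["Started"]).total_seconds()


# Output copied from the example above.
HISTORY = """\
NAME            EVENTTYPE       EVENTID PREVIOUSEVENTID TIMESTAMP
test            RunStarted      0       0               2020-09-28 06:22:16.575 -0700 PDT
main_launch     Started         244884  244883          2020-09-28 06:22:58.034 -0700 PDT
main_launch     Succeeded       244885  244884          2020-09-28 06:23:00.14 -0700 PDT
"""

events = parse_history(HISTORY)
print(step_seconds(events, "main_launch"))
```

For the sample above, the main_launch step took a little over two seconds.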

Output to Project Data

Pipelines executed using the CLI do not generate outputs visible in the UI by default. To make outputs appear in the UI, set the workDirectory to a folder path within a project's home volume. See the Project section for details about a project's home volume. See the Output Directory section in the engine parameters documentation for details on how to set the workDirectory when launching a pipeline using the CLI.

{
  "engineParameters": {
     "workDirectory": "gds://projectName/runName"
  }
}
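The fragment above can be generated per run instead of edited by hand. The Python sketch below only builds that fragment; engine_parameters is a hypothetical helper, and the check for a gds:// prefix is an assumption based on the example path. How the fragment is supplied at launch is covered by the engine parameters documentation and is not assumed here.

```python
import json


def engine_parameters(work_directory):
    """Build the engineParameters fragment that sets the work directory."""
    # Assumption: work directories are gds:// paths, as in the example above.
    if not work_directory.startswith("gds://"):
        raise ValueError("workDirectory is expected to be a gds:// path")
    return {"engineParameters": {"workDirectory": work_directory}}


print(json.dumps(engine_parameters("gds://projectName/runName"), indent=2))
```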
