Analysis using CLI
The platform APIs for creating and running pipelines support CWL definitions submitted in the appropriate format. CWL workflows can be written in JSON or YAML and may span multiple files. To create a workflow version with a CWL definition using the WES APIs, the workflow must be packed into a single JSON document and passed to the definition field of the create workflow version API.
Multi-file CWL Definitions
The platform APIs for creating a workflow currently require the definition to be specified in a single document. Many CWL workflows are defined using multiple files, each file containing a single step of the workflow. To compile a multi-file CWL workflow into a single document, the open source cwltool utility provides a mechanism for "packing" the workflow into a single JSON object. Follow the cwltool installation instructions to install the utility.
With cwltool installed, the --pack option can be used to combine a multi-file CWL workflow into a single JSON document; it also works for single-file CWL workflows. Details on usage of the --pack option are given in the cwltool documentation. Once the workflow has been packed into a single JSON document, the JSON contents may be used to create a workflow version.
CWL workflows commonly contain input arguments to be set when launching the workflow. The WES API to launch the workflow version requires the inputs to be defined in JSON format. To obtain a template for the inputs to the packed workflow, cwltool provides a --make-template option, which may be used on the packed CWL JSON file. The output of --make-template may still be formatted in YAML. To convert the YAML input file to JSON, the yq Python package may be used. Once converted to JSON, the inputs can be set and passed to the launch workflow request.
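The two cwltool invocations described above can also be scripted. The following is a minimal sketch, assuming cwltool is installed and on the PATH; the helper names (pack_command, template_command, run_to_file) are hypothetical and only wrap the --pack and --make-template options named in this section.

```python
import subprocess
from pathlib import Path
from typing import List


def pack_command(workflow_path: str) -> List[str]:
    """Build the argument list for `cwltool --pack <workflow>`."""
    return ["cwltool", "--pack", workflow_path]


def template_command(packed_path: str) -> List[str]:
    """Build the argument list for `cwltool --make-template <workflow>`."""
    return ["cwltool", "--make-template", packed_path]


def run_to_file(command: List[str], output_path: str) -> Path:
    """Run a command and write its standard output to output_path.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    out = Path(output_path)
    with out.open("w", encoding="utf-8") as handle:
        subprocess.run(command, stdout=handle, check=True)
    return out
```

For example, run_to_file(pack_command("workflow.cwl"), "workflow-packed.json") would write the packed JSON to a file, and run_to_file(template_command("workflow-packed.json"), "inputs-template.yml") would write the input template. The file names here are illustrative only.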
Quick Start
In this example, the 1st-tool.cwl example CWL workflow from the CWL GitHub repository examples is used.
The 1st-tool.cwl file is written in YAML, which is not compatible with the WES APIs.
To convert it to JSON, the file is passed to cwltool --pack.
The 1st-tool-pack.cwl file will contain the packed workflow in JSON format.
This JSON can now be used to create a CWL workflow version. Save the JSON output to a file to be used with the command-line interface for creating the workflow.
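Before creating the workflow version, it can help to confirm that the saved file really is a single JSON document. The sketch below is one possible check, not part of the platform tooling. The function name is hypothetical, and the rule that a packed document carries either a top-level class or a $graph key is an assumption about cwltool's packed output.

```python
import json


def check_packed_definition(text: str) -> dict:
    """Parse packed CWL text and run basic sanity checks.

    Raises ValueError if the text is not a JSON object or lacks the
    keys this sketch expects.
    """
    try:
        document = json.loads(text)
    except json.JSONDecodeError as error:
        raise ValueError(f"definition is not valid JSON: {error}") from error
    if not isinstance(document, dict):
        raise ValueError("definition must be a single JSON object")
    if "cwlVersion" not in document:
        raise ValueError("definition has no cwlVersion field")
    # Assumption: a packed single tool carries "class", while a packed
    # multi-file workflow lists its processes under "$graph".
    if "class" not in document and "$graph" not in document:
        raise ValueError("definition has neither class nor $graph")
    return document
```

Passing the unconverted YAML file to this function raises ValueError, which catches the common mistake of submitting the original definition instead of the packed one.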
First create a workflow resource.
Note the workflow resource ID. Next, create a workflow version, passing in the packed CWL definition file. In this example, the file is saved at /example.cwl. Ensure the --language-name option is passed with cwl as the input value.
For this CWL workflow, an input argument is defined and must be supplied when launching the workflow. The input must be specified in JSON format to be used with the command-line interface. A simple way to obtain a JSON-formatted input template is to run cwltool --make-template on the packed CWL workflow and pipe the output through yq for conversion to JSON.
The 1st-tool-packed.input.json file will contain the input template to use for launching the CWL workflow.
Save the input JSON file to be passed into the command to launch the workflow.
In this example, the input contains a single field, message, whose value may be changed for each workflow launch. The command to launch the workflow provides an --input option for passing the file that contains the input JSON. In this example, the input JSON file is saved at /input.json.
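Because the value of message changes from launch to launch, the input file can be generated from the template rather than edited by hand. The following is a minimal sketch. The function name fill_inputs is hypothetical, and the template content shown in the usage note is a stand-in, not actual cwltool output.

```python
import json


def fill_inputs(template_text: str, values: dict) -> str:
    """Return input JSON with template placeholders replaced by values.

    Raises KeyError if a name in values is not present in the template,
    which guards against misspelled input names.
    """
    inputs = json.loads(template_text)
    for name, value in values.items():
        if name not in inputs:
            raise KeyError(f"unknown workflow input: {name}")
        inputs[name] = value
    return json.dumps(inputs, indent=2)
```

For example, fill_inputs('{"message": "placeholder"}', {"message": "Hello world"}) returns JSON text that can be saved to the file passed with --input.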
After executing the launch command, a workflow run resource will be created. The run ID can be used to monitor the workflow run. The run details can be retrieved to view the current status of the run as well as view the input, output, and any errors that may have occurred.
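Monitoring a run usually means retrieving the run details repeatedly until the status stops changing. The sketch below shows only the polling logic. How the status is fetched is left to a caller-supplied function, and the terminal status names are hypothetical placeholders that should be replaced with the values the platform actually reports.

```python
import time
from typing import Callable, Iterable

# Hypothetical terminal status names; replace with the platform's values.
TERMINAL_STATUSES = frozenset({"Succeeded", "Failed", "Aborted"})


def wait_for_run(
    get_status: Callable[[], str],
    terminal: Iterable[str] = TERMINAL_STATUSES,
    interval_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status until it returns a terminal status, then return it.

    Raises TimeoutError if no terminal status is seen within max_polls calls.
    """
    terminal_set = set(terminal)
    for attempt in range(max_polls):
        status = get_status()
        if status in terminal_set:
            return status
        # Do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"run not finished after {max_polls} polls")
```

Here get_status would wrap whatever call retrieves the run details for the run ID, for instance by invoking the command-line interface and reading the status from its JSON output.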
During the lifetime of the run, history events are produced to provide a form of logging as each step in the workflow is executed. These history events include details like status, inputs, outputs, and duration for each step.
Use the JSON output format to see the full contents of each of the history events.
Output to Project Data
Pipelines executed using the CLI do not generate outputs visible in the UI by default. To make outputs visible in the UI, set the workDirectory to a folder path within a project's home volume. See the Project section for details about a project's home volume. See the Output Directory section in the engine parameters documentation for details on how to set the workDirectory when launching a pipeline using the CLI.