Flow

Flow provides tooling for building and running secondary analysis pipelines. The platform supports analysis workflows constructed using Common Workflow Language (CWL). Each step of an analysis pipeline executes a containerized application using inputs passed into the pipeline or output from previous steps.

You can configure the following components in Illumina Connected Analytics Flow:

  • Tools — Pipeline components that are configured to process data input files. See Create a Tool.

  • Pipelines — One or more tools configured to process input data and generate output files. See Create a Pipeline.

  • Runs — Analysis of selected data input into a pipeline workflow. See Start a New Run.

Tools

A Tool is the definition of a containerized application with defined inputs, outputs, and execution environment details including compute resources required, environment variables, command line arguments, and more.

Import Tool

In addition to the interactive Tool builder, the platform GUI also supports working directly with the raw definition when developing a new Tool. This provides the ability to write the Tool definition manually or bring an existing Tool's definition to the platform.

A simple example CWL Tool definition is provided below.

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
label: echo
inputs:
  message:
    type: string
    default: testMessage
    inputBinding:
      position: 1
outputs:
  echoout:
    type: stdout
baseCommand:
- echo

When creating a new Tool, navigate to the Tool CWL tab to show the raw CWL definition. A CWL CommandLineTool definition can be pasted into the editor on this tab. The definition is then parsed, and the other tabs for visually editing the Tool are populated from its contents.

Create a Tool

Tools define the inputs, parameters, and outputs for the analysis. Tools are available for use by any project in the account.

  1. From the Tool Repository page, select New Tool.

  2. Configure tool settings in the tool properties tabs. See Tool Properties.

  3. Select Save.

Tool Properties

The following sections describe the tool properties that can be configured in each tab.

Refer to the CWL CommandLineTool Specification for further explanation about many of the properties described below. Not all features described in the specification are supported.

Information Tab

| Field | Entry |
| --- | --- |
| Name | The name of the tool. |
| Status | The release status of the tool. |
| Category | One or more tags to categorize the tool. Select from existing tags or type a new tag name in the field. |
| Icon | The icon for the tool. |
| Docker image | The registered Docker image for the tool. |
| Version comment | A description of changes in the updated version. |
| Regions | The regions supported by the linked Docker image. |
| Tool version | The version of the command line tool in the Docker image. |
| Release version | The version number of the tool. |
| Family | A group of tools or tool versions. |
| Links | External reference links. |

Tool Status

The release status of the tool can be one of "Draft", "Release Candidate", "Released", or "Deprecated".

| Status | Description |
| --- | --- |
| Draft | Fully editable draft. |
| Release Candidate | The tool is ready for release. Editing is locked but the tool can be cloned to create a new version. |
| Released | The tool is released. Editing is locked but the tool can be cloned to create a new version. |
| Deprecated | The tool is no longer intended for use in pipelines, but no restrictions are placed on it: it can still be added to new pipelines and continues to work in existing pipelines. The status only indicates to users that the tool should no longer be used. |

Documentation Tab

The Documentation tab provides options for configuring the HTML description for the tool. The description appears in the Tool Repository but is excluded from exported CWL definitions.

General Tool Tab

The General Tool tab provides options to configure the basic command line.

| Field | Entry |
| --- | --- |
| ID | The CWL identifier field. |
| CWL version | The CWL version in use. This field cannot be changed. |
| Base command | Components of the command. Each argument must be added on a separate line. |
| Standard out stream | The name of the file that captures Standard Out (STDOUT) stream information. |
| Standard error stream | The name of the file that captures Standard Error (STDERR) stream information. |
| Requirements | CWL features that the execution environment must support. An unsupported requirement triggers an error message. |
| Hints | CWL features that the execution environment should support. An unsupported hint triggers a warning message. |

The Hints/Requirements include CWL features to indicate capabilities expected in the Tool's execution environment.

  • Inline Javascript

    • The Tool contains a property with a JavaScript expression to resolve its value.

  • Initial workdir

    • The workdir can be any of the following types:

      • String or Expression: A string or JavaScript expression, e.g., $(inputs.InputFASTA)

      • File or Dir: A map of one or more files or directories, in the following format: {type: array, items: [File, Directory]}

      • Dirent: A script in the working directory. The Entry name field specifies the file name.

  • Scatter feature: Indicates that the workflow platform must support the scatter and scatterMethod fields.
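
The fields and requirements above can be sketched in a raw CWL definition as follows. This is a minimal illustration that has not been run on the platform: the tool label, the count.sh script, and the file names are hypothetical, and the platform may not support every feature shown.

```yaml
cwlVersion: v1.0
class: CommandLineTool
label: count-fasta-records   # hypothetical tool
# Base command: each component is a separate list item.
baseCommand: [sh, count.sh]
requirements:
  # Inline Javascript: needed for the JavaScript expression in "stdout".
  - class: InlineJavascriptRequirement
  # Initial workdir: entries placed in the working directory before the run.
  - class: InitialWorkDirRequirement
    listing:
      # Dirent holding a script; "entryname" sets the file name.
      - entryname: count.sh
        entry: |
          grep -c '^>' input.fasta
      # Dirent that stages the input file under a fixed name.
      - entryname: input.fasta
        entry: $(inputs.InputFASTA)
hints:
  # A hint expresses a preference; it is not a hard requirement.
  - class: ResourceRequirement
    coresMin: 1
    ramMin: 1024
inputs:
  InputFASTA:
    type: File
outputs:
  count:
    type: stdout
  errors:
    type: stderr
# Standard out and standard error stream file names.
stdout: '$(inputs.InputFASTA.nameroot + ".count.txt")'
stderr: count.err
```

The stdout file name uses string concatenation, which is a JavaScript expression and is the reason InlineJavascriptRequirement is listed. A plain parameter reference such as $(inputs.InputFASTA) does not need it.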

Tool Arguments Tab

The Tool Arguments tab provides options to configure base command parameters that do not require user input.

Tool arguments may be one of two types:

  • String or Expression: A literal string or JavaScript expression, e.g., --format=bam.

  • Binding: An argument constructed from the binding of an input parameter.

The following table describes the argument input fields.

| Field | Entry | Type |
| --- | --- | --- |
| Value | The literal string to be added to the base command. | String or expression |
| Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. | Binding |
| Prefix | The string prefix. | Binding |
| Item separator | The separator that is used between array values. | Binding |
| Value from | The source string or JavaScript expression. | Binding |
| Separate | The setting to require the Prefix and Value from fields to be added as separate or combined arguments. True indicates the fields must be added as separate arguments. False indicates the fields must be added as a single concatenated argument. | Binding |
| Shell quote | The setting to quote the Value from field on the command line. True indicates the value field appears in the command line. False indicates the value field is entered manually. | Binding |

Example

| Field | Value |
| --- | --- |
| Prefix | --output-filename |
| Value from | $(inputs.inputSAM.nameroot).bam |
| Input file | /tmp/storage/SRR45678_sorted.sam |
| Output file | SRR45678_sorted.bam |
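
In raw CWL, the example above might look like the following sketch. The sam-to-bam command and the outputBAM identifier are hypothetical placeholders; only the prefix, the Value from expression, and the inputSAM input come from the example.

```yaml
cwlVersion: v1.0
class: CommandLineTool
# Hypothetical converter; replace with the command in the Docker image.
baseCommand: [sam-to-bam]
arguments:
  # String or Expression argument: a literal that needs no user input.
  # No position is given, so it defaults to 0.
  - "--format=bam"
  # Binding argument: the prefix and value are emitted as two separate
  # tokens because "separate" is true.
  - prefix: "--output-filename"
    valueFrom: $(inputs.inputSAM.nameroot).bam
    position: 1
    separate: true
inputs:
  inputSAM:
    type: File
    inputBinding:
      position: 2
outputs:
  outputBAM:
    type: File
    outputBinding:
      glob: $(inputs.inputSAM.nameroot).bam
```

With an input file named SRR45678_sorted.sam, the binding should resolve to --output-filename SRR45678_sorted.bam, placed after --format=bam and before the input file path. Setting separate to false would instead produce the single token --output-filenameSRR45678_sorted.bam, so a combined prefix normally ends with a character such as "=".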

Tool Inputs Tab

The Tool Inputs tab provides options to define the input files and directories for the tool. The following table describes the input and binding fields. Selecting multi value enables type binding options for adding prefixes to the input.

| Field | Entry |
| --- | --- |
| ID | The file ID. |
| Label | A short description of the input. |
| Description | A long description of the input. |
| Type | The input type, which can be either a file or a directory. |
| Input options | Checkboxes to add the following options. Optional indicates the input is optional. Multi value indicates there is more than one input file or directory. Streamable indicates the file is read or written sequentially without seeking. |
| Secondary files | The required secondary files or directories. |
| Format | The input file format. |
| Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. |
| Prefix | The string prefix. |
| Item separator | The separator that is used between array values. |
| Value from | The source string or JavaScript expression. |
| Load contents | The setting to load file contents. When selected, the contents field is populated with up to the first 64 KiB of text from the file. |
| Separate | The setting to require the Prefix and Value from fields to be added as separate or combined arguments. True indicates the fields must be added as separate arguments. False indicates the fields must be added as a single concatenated argument. |
| Shell quote | The setting to quote the Value from field on the command line. True indicates the value field appears in the command line. False indicates the value field is entered manually. |
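
As an illustration, the input fields above correspond roughly to the following CWL inputs section. This is a sketch under assumptions: the example-aligner command, the input identifiers, and the prefixes are all hypothetical.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [example-aligner]   # hypothetical command
inputs:
  reference:
    label: Reference genome                            # Label
    doc: Indexed FASTA reference used for alignment.   # Description
    type: File
    # Secondary files: the index is expected next to the primary file.
    secondaryFiles:
      - .fai
    inputBinding:
      position: 1
      prefix: "--reference"
      separate: true      # emitted as: --reference <path>
  reads:
    label: Read files
    type: File[]          # Multi value
    streamable: true      # Streamable
    inputBinding:
      position: 2
      prefix: "--reads="
      separate: false     # emitted as: --reads=<path1>,<path2>
      itemSeparator: ","
  intervals:
    label: Target intervals
    type: File?           # Optional
    inputBinding:
      position: 3
      prefix: "--intervals"
      loadContents: true  # Load contents
outputs: []
```

Because itemSeparator joins the array into one string before the prefix is applied, the reads input yields a single argument instead of one argument per file.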

Tool Settings Tab

The Tool Settings tab provides options to define parameters that can be set at the time of execution. The following table describes the input and binding fields. Selecting multi value enables type binding options for adding prefixes to the input.

| Field | Entry |
| --- | --- |
| ID | The file ID. |
| Label | A short description of the input. |
| Description | A long description of the input. |
| Default Value | The default value to use if the tool setting is not available. |
| Type | The input type, which can be either a file or a directory. |
| Input options | Checkboxes to add the following options. Optional indicates the input is optional. Multi value indicates there is more than one input file or directory. Streamable indicates the file is read or written sequentially without seeking. |
| Secondary files | The required secondary files or directories. |
| Format | The input file format. |
| Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. |
| Prefix | The string prefix. |
| Item separator | The separator that is used between array values. |
| Value from | The source string or JavaScript expression. |
| Load contents | The setting to load file contents. When selected, the contents field is populated with up to the first 64 KiB of text from the file. |
| Separate | The setting to require the Prefix and Value from fields to be added as separate or combined arguments. True indicates the fields must be added as separate arguments. False indicates the fields must be added as a single concatenated argument. |
| Shell quote | The setting to quote the Value from field on the command line. True indicates the value field appears in the command line. False indicates the value field is entered manually. |
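
A tool setting with a default value might be expressed in CWL as in the sketch below. It assumes that settings map to non-file CWL input types such as int, string, and boolean, which the table above does not confirm. The example-aligner command, identifiers, and prefixes are hypothetical.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [example-aligner]   # hypothetical command
inputs:
  threads:
    label: Threads
    doc: Number of worker threads.
    type: int
    default: 4            # Default Value: used when no value is supplied
    inputBinding:
      position: 1
      prefix: "--threads"
  sampleName:
    label: Sample name
    type: string?         # Optional setting
    inputBinding:
      position: 2
      prefix: "--sample="
      separate: false     # emitted as: --sample=<value>
  verbose:
    label: Verbose logging
    type: boolean
    default: false
    inputBinding:
      position: 3
      prefix: "--verbose" # the prefix is added only when the value is true
outputs: []
```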

Tool Outputs Tab

The Tool Outputs tab provides options to define the parameters of output files.

The following table describes the output and binding fields. Selecting multi value enables type binding options for adding prefixes to the output.

| Field | Entry |
| --- | --- |
| ID | The file ID. |
| Label | A short description of the output. |
| Description | A long description of the output. |
| Type | The output type, which can be either a file or a directory. |
| Output options | Checkboxes to add the following options. Optional indicates the output is optional. Multi value indicates there is more than one output file or directory. Streamable indicates the file is read or written sequentially without seeking. |
| Secondary files | The required secondary files or directories. |
| Format | The output file format. |
| Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. |
| Globs | The pattern for searching file names. |
| Load contents | Automatically loads file contents. The system populates the contents field with up to the first 64 KiB of text from the file. |
| Output eval | Evaluate an expression to generate the output value. |
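
The Globs, Load contents, and Output eval fields can be combined as in the following sketch. The example-aligner command, the output identifiers, and the file names are hypothetical, and the sketch assumes the tool writes a file named mapped_count.txt that contains a single integer.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [example-aligner]   # hypothetical command
requirements:
  # Needed for the JavaScript expression in outputEval.
  - class: InlineJavascriptRequirement
inputs: []
outputs:
  alignments:
    label: Sorted alignments
    type: File
    # Secondary files: the index is collected with the primary file.
    secondaryFiles:
      - .bai
    outputBinding:
      glob: "*.bam"       # Globs: pattern for searching file names
  logs:
    type: File[]          # Multi value
    outputBinding:
      glob: "*.log"
  mappedReads:
    type: int
    outputBinding:
      glob: mapped_count.txt
      loadContents: true  # reads up to the first 64 KiB into "contents"
      # Output eval: convert the loaded text into the output value.
      outputEval: $(parseInt(self[0].contents))
```

In outputEval, self is the list of files matched by the glob, so self[0] refers to the first match.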

Tool CWL Tab

The Tool CWL tab displays the complete CWL code constructed from the values entered in the other tabs. The CWL code automatically updates when changes are made in the tool definition tabs, and any changes to the CWL code are reflected in the tool definition tabs.

❗️ Modifying data within the CWL editor can result in invalid code.

Edit a Tool

  1. From the Tool Repository page, select a tool.

  2. Select Edit.

Update Tool Status

  1. From the Tool Repository page, select a tool.

  2. Select the Information tab.

  3. From the Status drop-down menu, select a status.

  4. Select Save.

Pipelines

A Pipeline is a series of Tools with connected inputs and outputs configured to execute in a specific order.
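
In CWL terms, a pipeline of this kind corresponds to a Workflow whose steps are connected through their inputs and outputs. The following two-step sketch is illustrative only: the file names align.cwl and sort.cwl and all identifiers are hypothetical, and pipelines built on the canvas may be represented differently by the platform.

```yaml
cwlVersion: v1.0
class: Workflow
inputs:
  reads: File
  reference: File
outputs:
  sortedBam:
    type: File
    outputSource: sort/sorted
steps:
  align:
    run: align.cwl        # hypothetical Tool definition
    in:
      reads: reads
      reference: reference
    out: [aligned]
  sort:
    run: sort.cwl         # hypothetical Tool definition
    in:
      # Connecting this input to the output of "align" makes
      # "sort" run after "align".
      unsorted: align/aligned
    out: [sorted]
```

In CWL the execution order follows from these connections: a step starts once all of the inputs it is connected to are available.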

Create a Pipeline

Pipelines are created and stored within projects.

  1. Select a project.

  2. From the project menu, select Pipelines.

  3. Select Create Pipeline.

  4. Configure pipeline settings in the pipeline properties tabs.

  5. In the canvas, drag connectors to link tools to input and output files. Required tool inputs are indicated by a yellow connector.

  6. Select Save.

Pipeline Status

Pipelines can only be edited when they are in "Draft" or "Release Candidate" status. A pipeline can be moved to "Released" status only when all the Tools in the pipeline are also in "Released" status.

| Status | Description |
| --- | --- |
| Draft | Fully editable draft. |
| Release Candidate | The pipeline is ready for release. Editing is locked but the pipeline can be cloned to create a new version. |
| Released | The pipeline is released. A pipeline cannot be released if it contains unreleased tools. Editing is locked but the pipeline can be cloned to create a new version. |

Pipeline Properties

The following sections describe the pipeline properties that can be configured in each tab of the pipeline editor.

Information

The Information tab provides options for configuring basic information about the pipeline.

| Field | Entry |
| --- | --- |
| Code | The name of the pipeline. |
| Status | The release status of the pipeline. |
| Categories | One or more tags to categorize the pipeline. Select from existing tags or type a new tag name in the field. |
| Description | A short description of the pipeline. |
| Family | A group of pipeline versions. To specify a family, select Change, and then select a pipeline or pipeline family. To change the order of the pipeline, select Up or Down. The first pipeline listed is the default and the remainder of the pipelines are listed as Other versions. The current pipeline appears in the list as this pipeline. |
| Version comment | A description of changes in the updated version. |
| Links | External reference links. |

Documentation

The Documentation tab provides options for configuring the HTML description for the pipeline. The description appears in the tool repository but is excluded from exported CWL definitions.

Definition

The Definition tab provides options for configuring the pipeline. The tab consists of a visualization panel and a list of component menus.

| Menu | Description |
| --- | --- |
| Machine profiles | Compute types available to use with Tools in the pipeline. |
| Shared settings | Pipeline settings that are used in more than one tool. |
| Reference files | Descriptions of reference files used in the pipeline. |
| Input files | Descriptions of input files used in the pipeline. |
| Output files | Descriptions of output files used in the pipeline. |
| Tool | Details about the tool selected in the visualization panel. |
| Tool repository | A list of tools available to be used in the pipeline. |

Run Report

The Run Report tab provides options for configuring pipeline execution reports. The report is composed of widgets added to the tab.

Configure Pipeline Run Report

The pipeline run report appears in the pipeline execution results. The report is configured from widgets added to the Run Report tab in the pipeline editor.

  1. [Optional] Import widgets from another pipeline.

    1. Select Import from other pipeline.

    2. Select the pipeline that contains the report you want to copy.

    3. Select an import option: Replace current report or Append to current report.

    4. Select Import.

  2. From the Run Report tab, select Add widget, and then select a widget type.

  3. Configure widget details.

    | Widget | Settings |
    | --- | --- |
    | Title | Add and format title text. |
    | Run details | Add heading text and select the run metadata details to display. |
    | Free text | Add formatted free text. The widget includes options for placeholder variables that display the corresponding project values. |
    | Inline viewer | Add options to view the content of a run output file. |
    | Run comments | Add comments that can be edited after a run has been performed. |
    | Input details | Add heading text and select the input details to display. The widget includes an option to group details by input name. |
    | Project details | Add heading text and select the project details to display. |
    | Page break | Add a page break widget where page breaks should appear between report sections. |

  4. Select Save.

Free Text Placeholders

| Placeholder | Description |
| --- | --- |
| [[BB_PROJECT_NAME]] | The project name. |
| [[BB_PROJECT_OWNER]] | The project owner. |
| [[BB_PROJECT_DESCRIPTION]] | The project short description. |
| [[BB_PROJECT_INFORMATION]] | The project information. |
| [[BB_PROJECT_LOCATION]] | The project location. |
| [[BB_PROJECT_BILLING_MODE]] | The project billing mode. |
| [[BB_PROJECT_DATA_SHARING]] | The project data sharing settings. |
| [[BB_REFERENCE]] | The run reference. |
| [[BB_USERREFERENCE]] | The user run reference. |
| [[BB_PIPELINE]] | The name of the pipeline. |
| [[BB_USER_OPTIONS]] | The pipeline run user options. |
| [[BB_TECH_OPTIONS]] | The pipeline run technical options. Technical options include the TECH suffix and are not visible to end users. |
| [[BB_ALL_OPTIONS]] | All pipeline run options. Technical options include the TECH suffix and are not visible to end users. |
| [[BB_SAMPLE]] | The sample. |
| [[BB_REQUEST_DATE]] | The run request date. |
| [[BB_START_DATE]] | The run start date. |
| [[BB_DURATION]] | The run duration. |
| [[BB_REQUESTOR]] | The user requesting run execution. |
| [[BB_RUNSTATUS]] | The status of the run. |
| [[BB_ENTITLEMENTDETAIL]] | The entitlement detail used for the run. |
| [[BB_METADATA:path]] | The value of a metadata field, or the list of values of a multi-value field. |

Start a New Run

You can start a new analysis run for a single pipeline or for multiple pipelines.

Use the following instructions to start a new run for a single pipeline.

  1. Select a project.

  2. From the project menu, select Pipelines.

  3. Select the pipeline to run.

  4. Select Start a New Run.

  5. Configure run settings. See Run Properties.

  6. Select Start Run.

  7. View the run status on the Runs page.

    • Requested—The run is scheduled to begin.

    • Awaiting Input—The input file download is in progress.

    • In Progress—The run is in progress.

    • Succeeded—The run is complete.

    • Failed and Failed Final—The run has failed or was aborted.

  8. To end a run, select Abort.

  9. To perform a completed analysis run again, select Re-run.

Start a Run for Multiple Pipelines

To start a run for multiple pipelines, do as follows.

  1. Select a project.

  2. From the project menu, select Data.

  3. Select multiple data folders.

  4. Select Start a New Run.

Run Properties

The following sections describe the run properties that can be configured in each tab.

Run

The Run tab provides options for configuring basic information about the run.

| Field | Entry |
| --- | --- |
| User Reference | The unique run name. |
| User tags | One or more tags used to filter the run list. Select from existing tags or type a new tag name in the field. |
| Entitlement Bundle | Select a subscription to charge the run to. |
| Input Files | Select the input files to use in the run. |
| Settings | Provide input settings. |

Details

The Details tab provides information on the pipeline configuration.

| Menu | Description |
| --- | --- |
| Machine profiles | Compute types used with Tools in the pipeline. |
| Shared settings | Pipeline settings that are used in more than one tool. |
| Reference files | Descriptions of reference files used in the pipeline. |
| Input files | Descriptions of input files used in the pipeline. |
| Output files | Descriptions of output files used in the pipeline. |
| Tool | Details about the tool selected in the visualization panel. |
| Tool repository | A list of tools available to be used in the pipeline. |

View Run Results

You can view run results on the Runs page or in the output_folder on the Data page.

  1. Select a project, and then select the Runs page.

  2. Select a run.

  3. On the Result tab, select an output file.

  4. To preview the file, select the View tab.

  5. Add or remove any user or technical tags, and then select Save.

  6. To download, select Schedule for Download.

  7. View additional run result information on the following tabs:

    • Details—View information on the pipeline configuration.

    • Logs—Download information on the pipeline process.

    • Resources—Measure the CPU and memory usage during each step of the run.
