Flow
Flow provides tooling for building and running secondary analysis pipelines. The platform supports analysis workflows constructed using Common Workflow Language (CWL). Each step of an analysis pipeline executes a containerized application using inputs passed into the pipeline or output from previous steps.
You can configure the following components in Illumina Connected Analytics Flow:
Tools — Pipeline components that are configured to process data input files. See Create a Tool.
Pipelines — One or more tools configured to process input data and generate output files. See Create a Pipeline.
Runs — Analysis of selected data input into a pipeline workflow. See Start a New Run.
Tools
A Tool is the definition of a containerized application with defined inputs, outputs, and execution environment details including compute resources required, environment variables, command line arguments, and more.
Import Tool
In addition to the interactive Tool builder, the platform GUI also supports working directly with the raw definition when developing a new Tool. This provides the ability to write the Tool definition manually or bring an existing Tool's definition to the platform.
A simple example CWL Tool definition is provided below.
When creating a new Tool, navigate to the Tool CWL tab to show the raw CWL definition. Here a CWL CommandLineTool definition may be pasted into the editor. After pasting into the editor, the definition is parsed and the other tabs for visually editing the Tool will populate according to the definition contents.
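As an illustration of what can be pasted into the editor, the following sketch is a minimal CWL CommandLineTool that counts the lines in a file. The tool ID, the input and output names, the CWL version, and the ubuntu:20.04 image are assumptions chosen for the example. On the platform, the image would have to be a registered Docker image selected for the Tool.

```yaml
# Minimal illustrative CommandLineTool; all names and values are placeholders.
cwlVersion: v1.0
class: CommandLineTool
id: count-lines
label: Count lines
baseCommand: [wc, -l]
requirements:
  DockerRequirement:
    dockerPull: ubuntu:20.04   # assumed image, not a platform default
inputs:
  input_file:
    type: File
    inputBinding:
      position: 1              # appended after the base command
# Capture STDOUT in a named file and expose it as the Tool output.
stdout: line_count.txt
outputs:
  line_count:
    type: stdout
```

After a definition like this is parsed, the base command, input, and output would be expected to appear in the corresponding visual editing tabs.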
Create a Tool
Tools define the inputs, parameters, and outputs for the analysis. Tools are available for use by any project in the account.
From the Tool Repository page, select New Tool.
Configure tool settings in the tool properties tabs. See Tool Properties.
Select Save.
Tool Properties
The following sections describe the tool properties that can be configured in each tab.
Refer to the CWL CommandLineTool Specification for further explanation about many of the properties described below. Not all features described in the specification are supported.
Information Tab
Field | Entry |
---|---|
Name | The name of the tool. |
Status | The release status of the tool. |
Category | One or more tags to categorize the tool. Select from existing tags or type a new tag name in the field. |
Icon | The icon for the tool. |
Docker image | The registered Docker image for the tool. |
Version comment | A description of changes in the updated version. |
Regions | The regions supported by linked Docker image. |
Tool version | The version of the command line tool in the Docker image. |
Release version | The version number of the tool. |
Family | A group of tools or tool versions. |
Links | External reference links. |
Tool Status
The release status of the tool can be one of "Draft", "Release Candidate", "Released", or "Deprecated".
Status | Description |
---|---|
Draft | Fully editable draft. |
Release Candidate | The tool is ready for release. Editing is locked but the tool can be cloned to create a new version. |
Released | The tool is released. Editing is locked but the tool can be cloned to create a new version. |
Deprecated | The tool is no longer intended for use in pipelines, but no restrictions are placed on it. It can still be added to new pipelines and continues to work in existing pipelines. The status only indicates to the user that the tool should no longer be used. |
Documentation Tab
The Documentation tab provides options for configuring the HTML description for the tool. The description appears in the Tool Repository but is excluded from exported CWL definitions.
General Tool Tab
The General Tool tab provides options to configure the basic command line.
Field | Entry |
---|---|
ID | CWL identifier field |
CWL version | The CWL version in use. This field cannot be changed. |
Base command | Components of the command. Each argument must be added on a separate line. |
Standard out stream | The name of the file that captures Standard Out (STDOUT) stream information. |
Standard error stream | The name of the file that captures Standard Error (STDERR) stream information. |
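Requirements | CWL features that the execution environment must support. An unsupported requirement triggers an error message. |
Hints | CWL features that the execution environment should support. An unsupported hint triggers a warning message. |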
The Hints/Requirements include CWL features to indicate capabilities expected in the Tool's execution environment.
Inline Javascript
The Tool contains a property that uses a JavaScript expression to resolve its value.
Initial workdir
The workdir can be any of the following types:
String or Expression — A string or JavaScript expression, for example, $(inputs.InputFASTA)
File or Dir — A map of one or more files or directories, in the following format: {type: array, items: [File, Directory]}
Dirent — A script in the working directory. The Entry name field specifies the file name.
Scatter feature — Indicates that the workflow platform must support the scatter and scatterMethod fields.
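The requirement and hint types described above map onto the requirements and hints sections of the CWL definition. The fragment below is a sketch only: the run.sh script, its contents, and the resource values are assumptions, and InputFASTA is taken to be a File input defined elsewhere in the Tool.

```yaml
# Fragment of a CommandLineTool definition, not a complete Tool.
requirements:
  # Needed when a field uses a JavaScript expression rather than a
  # simple parameter reference such as $(inputs.InputFASTA).
  InlineJavascriptRequirement: {}
  InitialWorkDirRequirement:
    listing:
      # String or Expression: stage the input file in the working directory.
      - $(inputs.InputFASTA)
      # Dirent: write a script named run.sh into the working directory.
      - entryname: run.sh
        entry: |
          #!/bin/bash
          echo "working directory prepared"
hints:
  # Placeholder values; a hint that cannot be satisfied is not an error.
  ResourceRequirement:
    coresMin: 2
    ramMin: 4096   # MiB
```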
Tool Arguments Tab
The Tool Arguments tab provides options to configure base command parameters that do not require user input.
Tool arguments may be one of two types:
String or Expression — A literal string or JavaScript expression, for example, --format=bam.
Binding — An argument constructed from the binding of an input parameter.
The following table describes the argument input fields.
Field | Entry | Type |
---|---|---|
Value | The literal string to be added to the base command. | String or expression |
Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. | Binding |
Prefix | The string prefix. | Binding |
Item separator | The separator that is used between array values. | Binding |
Value from | The source string or JavaScript expression. | Binding |
Separate | The setting to require the Prefix and Value from fields to be added as separate or combined arguments. True indicates the fields must be added as separate arguments. False indicates the fields must be added as a single concatenated argument. | Binding |
Shell quote | The setting to quote the Value from field on the command line. True indicates the value field appears in the command line. False indicates the value field is entered manually. | Binding |
Example
Field | Value |
---|---|
Prefix | |
Value from | |
Input file | |
Output file | |
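The two argument types can be sketched in CWL as follows. This fragment is illustrative and is not the original example: my-tool, the --output prefix, and the sample_id input are hypothetical names.

```yaml
# Fragment of a CommandLineTool definition; my-tool is a hypothetical command.
baseCommand: [my-tool]
arguments:
  # String or Expression: a literal added to the command line as-is.
  # No position is given, so it defaults to 0.
  - "--format=bam"
  # Binding: a prefix plus a value computed from an input parameter.
  - prefix: --output
    valueFrom: $(inputs.sample_id).bam
    position: 2
    separate: true   # emit the prefix and value as two arguments
inputs:
  sample_id:
    type: string     # no inputBinding, so not added to the command line itself
```

With sample_id set to NA12878, this fragment would be expected to produce the command line my-tool --format=bam --output NA12878.bam. Setting separate to false would instead concatenate the prefix and value into the single argument --outputNA12878.bam.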
Tool Input Tab
The Tool Inputs tab provides options to define the input files and directories for the tool. The following table describes the input and binding fields. Selecting multi value enables type binding options for adding prefixes to the input.
Field | Entry |
---|---|
ID | The file ID. |
Label | A short description of the input. |
Description | A long description of the input. |
Type | The input type, which can be either a file or a directory. |
Input options | Checkboxes to add the following options. Optional indicates the input is optional. Multi value indicates there is more than one input file or directory. Streamable indicates the file is read or written sequentially without seeking. |
Secondary files | The required secondary files or directories. |
Format | The input file format. |
Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. |
Prefix | The string prefix. |
Item separator | The separator that is used between array values. |
Value from | The source string or JavaScript expression. |
Load contents | The setting to read up to the first 64 KiB of text from the file and populate the contents field with it. |
Separate | The setting to require the Prefix and Value from fields to be added as separate or combined arguments. True indicates the fields must be added as separate arguments. False indicates the fields must be added as a single concatenated argument. |
Shell quote | The setting to quote the Value from field on the command line. True indicates the value field appears in the command line. False indicates the value field is entered manually. |
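The input fields in the table correspond to properties of a CWL input parameter and its inputBinding. The following fragment is a sketch under assumed names: the alignments and intervals inputs and their prefixes are hypothetical.

```yaml
# Fragment of a CommandLineTool definition; all names are illustrative.
inputs:
  alignments:
    label: Aligned reads
    doc: One or more BAM files to process.
    type: File[]          # Multi value: an array of files
    secondaryFiles:
      - .bai              # index expected next to each BAM file
    streamable: false
    inputBinding:
      position: 1
      prefix: --input
      itemSeparator: ","  # join the array values into one argument
      separate: true
  intervals:
    type: File?           # Optional: the trailing ? allows the input to be omitted
    inputBinding:
      position: 2
      prefix: --intervals
```

Given two files a.bam and b.bam, the alignments binding would contribute --input followed by the two file paths joined with a comma. When intervals is not supplied, its prefix is left off the command line.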
Tool Settings Tab
The Tool Settings tab provides options to define parameters that can be set at the time of execution. The following table describes the input and binding fields. Selecting multi value enables type binding options for adding prefixes to the input.
Field | Entry |
---|---|
ID | The file ID. |
Label | A short description of the input. |
Description | A long description of the input. |
Default Value | The default value to use if the tool setting is not available. |
Type | The input type, which can be either a file or a directory. |
Input options | Checkboxes to add the following options. Optional indicates the input is optional. Multi value indicates there is more than one input file or directory. Streamable indicates the file is read or written sequentially without seeking. |
Secondary files | The required secondary files or directories. |
Format | The input file format. |
Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. |
Prefix | The string prefix. |
Item separator | The separator that is used between array values. |
Value from | The source string or JavaScript expression. |
Load contents | The setting to read up to the first 64 KiB of text from the file and populate the contents field with it. |
Separate | The setting to require the Prefix and Value from fields to be added as separate or combined arguments. True indicates the fields must be added as separate arguments. False indicates the fields must be added as a single concatenated argument. |
Shell quote | The setting to quote the Value from field on the command line. True indicates the value field appears in the command line. False indicates the value field is entered manually. |
Tool Outputs Tab
The Tool Outputs tab provides options to define the parameters of output files.
The following table describes the output and binding fields. Selecting multi value enables type binding options for adding prefixes to the input.
Field | Entry |
---|---|
ID | The file ID. |
Label | A short description of the output. |
Description | A long description of the output. |
Type | The output type, which can be either a file or a directory. |
Output options | Checkboxes to add the following options. Optional indicates the output is optional. Multi value indicates there is more than one output file or directory. Streamable indicates the file is read or written sequentially without seeking. |
Secondary files | The required secondary files or directories. |
Format | The output file format. |
Position | The position of the argument in the final command line. If the position is not specified, the default value is set to 0 and the arguments appear in the order they were added. |
Globs | The pattern for searching file names. |
Load contents | Reads up to the first 64 KiB of text from the file and populates the contents field with it. |
Output eval | Evaluate an expression to generate the output value. |
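The Globs, Load contents, and Output eval fields work together in the outputBinding of a CWL output. The fragment below is a sketch that assumes the tool writes a BAM file and a count.txt file containing a single integer. The output names and file names are hypothetical.

```yaml
# Fragment of a CommandLineTool definition; all names are illustrative.
requirements:
  InlineJavascriptRequirement: {}   # needed for the parseInt expression below
outputs:
  aligned:
    type: File
    outputBinding:
      glob: "*.bam"                 # Globs: pattern matched in the output directory
  read_count:
    type: int
    outputBinding:
      glob: count.txt
      loadContents: true            # read up to the first 64 KiB into contents
      # Output eval: convert the loaded text into the output value.
      outputEval: $(parseInt(self[0].contents))
```

Here self is the list of files matched by the glob, so self[0].contents holds the text loaded from count.txt.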
Tool CWL Tab
The Tool CWL tab displays the complete CWL code constructed from the values entered in the other tabs. The CWL code automatically updates when changes are made in the tool definition tabs, and any changes to the CWL code are reflected in the tool definition tabs.
❗️ Modifying data within the CWL editor can result in invalid code.
Edit a Tool
From the Tool Repository page, select a tool.
Select Edit.
Update Tool Status
From the Tool Repository page, select a tool.
Select the Information tab.
From the Status drop-down menu, select a status.
Select Save.
Pipelines
A Pipeline is a series of Tools with connected inputs and outputs configured to execute in a specific order.
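In CWL terms, connecting tool outputs to tool inputs corresponds to a Workflow whose steps reference each other. The following sketch assumes two hypothetical Tool definitions, align.cwl and summarize.cwl, with the input and output names shown. It illustrates the concept and is not the exact form the platform generates.

```yaml
# Illustrative two-step Workflow; file names and IDs are placeholders.
cwlVersion: v1.0
class: Workflow
inputs:
  reads: File
outputs:
  report:
    type: File
    outputSource: summarize/report   # pipeline output taken from the last step
steps:
  align:
    run: align.cwl
    in:
      reads: reads                   # pipeline input passed into the first step
    out: [aligned]
  summarize:
    run: summarize.cwl
    in:
      aligned: align/aligned         # output of align feeds summarize
    out: [report]
```

The execution order follows from the connections: summarize cannot start until align has produced its aligned output.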
Create a Pipeline
Pipelines are created and stored within projects.
Select a project.
From the project menu, select Pipelines.
Select Create Pipeline.
Configure pipeline settings in the pipeline properties tabs.
In the canvas, drag connectors to link tools to input and output files. Required tool inputs are indicated by a yellow connector.
Select Save.
Pipeline Status
Pipelines can only be edited when they are in "Draft" or "Release Candidate" status. A pipeline can only be moved to "Released" status when all the Tools in the pipeline are also in "Released" status.
Status | Description |
---|---|
Draft | Fully editable draft. |
Release Candidate | The pipeline is ready for release. Editing is locked but the pipeline can be cloned to create a new version. |
Released | The pipeline is released. A pipeline cannot be released if it contains unreleased tools. Editing is locked but the pipeline can be cloned to create a new version. |
Pipeline Properties
The following sections describe the pipeline properties that can be configured in each tab of the pipeline editor.
Information
The Information tab provides options for configuring basic information about the pipeline.
Field | Entry |
---|---|
Code | The name of the pipeline. |
Status | The release status of the pipeline. |
Categories | One or more tags to categorize the pipeline. Select from existing tags or type a new tag name in the field. |
Description | A short description of the pipeline. |
Family | A group of pipeline versions. To specify a family, select Change, and then select a pipeline or pipeline family. To change the order of the pipeline, select Up or Down. The first pipeline listed is the default and the remainder of the pipelines are listed as Other versions. The current pipeline appears in the list as this pipeline. |
Version comment | A description of changes in the updated version. |
Links | External reference links. |
Documentation
The Documentation tab provides options for configuring the HTML description for the pipeline. The description appears in the tool repository but is excluded from exported CWL definitions.
Definition
The Definition tab provides options for configuring the pipeline. The tab consists of a visualization panel and a list of component menus.
Menu | Description |
---|---|
Machine profiles | Compute types available to use with Tools in the pipeline. |
Shared settings | Settings for pipelines used in more than one tool. |
Reference files | Descriptions of reference files used in the pipeline. |
Input files | Descriptions of input files used in the pipeline. |
Output files | Descriptions of output files used in the pipeline. |
Tool | Details about the tool selected in the visualization panel. |
Tool repository | A list of tools available to be used in the pipeline. |
Run Report
The Run Report tab provides options for configuring pipeline execution reports. The report is composed of widgets added to the tab.
Configure Pipeline Run Report
The pipeline run report appears in the pipeline execution results. The report is configured from widgets added to the Run Report tab in the pipeline editor.
[Optional] Import widgets from another pipeline.
Select Import from other pipeline.
Select the pipeline that contains the report you want to copy.
Select an import option: Replace current report or Append to current report.
Select Import.
From the Run Report tab, select Add widget, and then select a widget type.
Configure widget details.
Widget | Settings |
---|---|
Title | Add and format title text. |
Run details | Add heading text and select the run metadata details to display. |
Free text | Add formatted free text. The widget includes options for placeholder variables that display the corresponding project values. |
Inline viewer | Add options to view the content of a run output file. |
Run comments | Add comments that can be edited after a run has been performed. |
Input details | Add heading text and select the input details to display. The widget includes an option to group details by input name. |
Project details | Add heading text and select the project details to display. |
Page break | Add a page break widget where page breaks should appear between report sections. |
Select Save.
Free Text Placeholders
Placeholder | Description |
---|---|
[[BB_PROJECT_NAME]] | The project name. |
[[BB_PROJECT_OWNER]] | The project owner. |
[[BB_PROJECT_DESCRIPTION]] | The project short description. |
[[BB_PROJECT_INFORMATION]] | The project information. |
[[BB_PROJECT_LOCATION]] | The project location. |
[[BB_PROJECT_BILLING_MODE]] | The project billing mode. |
[[BB_PROJECT_DATA_SHARING]] | The project data sharing settings. |
[[BB_REFERENCE]] | The run reference. |
[[BB_USERREFERENCE]] | The user run reference. |
[[BB_PIPELINE]] | The name of the pipeline. |
[[BB_USER_OPTIONS]] | The pipeline run user options. |
[[BB_TECH_OPTIONS]] | The pipeline run technical options. Technical options include the TECH suffix and are not visible to end users. |
[[BB_ALL_OPTIONS]] | All pipeline run options. Technical options include the TECH suffix and are not visible to end users. |
[[BB_SAMPLE]] | The sample. |
[[BB_REQUEST_DATE]] | The run request date. |
[[BB_START_DATE]] | The run start date. |
[[BB_DURATION]] | The run duration. |
[[BB_REQUESTOR]] | The user requesting run execution. |
[[BB_RUNSTATUS]] | The status of the run. |
[[BB_ENTITLEMENTDETAIL]] | The used entitlement detail. |
[[BB_METADATA:path]] | The value or list of values of a metadata field or multi-value fields. |
Start a New Run
You can start a new analysis run for an individual pipeline or start a new analysis run for multiple pipelines.
Use the following instructions to start a new run for a single pipeline.
Select a project.
From the project menu, select Pipelines.
Select the pipeline to run.
Select Start a New Run.
Configure run settings. See Run Properties.
Select Start Run.
View the run status on the Runs page.
Requested—The run is scheduled to begin.
Awaiting Input—The input file download is in progress.
In Progress—The run is in progress.
Succeeded—The run is complete.
Failed and Failed Final—The run has failed or was aborted.
To end a run, select Abort.
To perform a completed analysis run again, select Re-run.
Start a Run for Multiple Pipelines
To start a run for multiple pipelines, do as follows.
Select a project.
From the project menu, select Data.
Select multiple data folders.
Select Start a New Run.
Run Properties
The following sections describe the run properties that can be configured in each tab.
Run
The Run tab provides options for configuring basic information about the run.
Field | Entry |
---|---|
User Reference | The unique run name. |
User tags | One or more tags used to filter the run list. Select from existing tags or type a new tag name in the field. |
Entitlement Bundle | Select a subscription to charge the run to. |
Input Files | Select the input files to use in the run. |
Settings | Provide input settings. |
Details
The Details tab provides information on the pipeline configuration.
Menu | Description |
---|---|
Machine profiles | Compute types used with Tools in the pipeline. |
Shared settings | Settings for pipelines used in more than one tool. |
Reference files | Descriptions of reference files used in the pipeline. |
Input files | Descriptions of input files used in the pipeline. |
Output files | Descriptions of output files used in the pipeline. |
Tool | Details about the tool selected in the visualization panel. |
Tool repository | A list of tools available to be used in the pipeline. |
View Run Results
You can view run results on the Runs page or in the output_folder on the Data page.
Select a project, and then select the Runs page.
Select a run.
On the Result tab, select an output file.
To preview the file, select the View tab.
Add or remove any user or technical tags, and then select Save.
To download, select Schedule for Download.
View additional run result information on the following tabs:
Details—View information on the pipeline configuration.
Logs—Download information on the pipeline process.
Resources—Measure the CPU and memory usage during each step of the run.