Task Execution
The basic building blocks of pipelines are Tools - containerized applications executed on a distributed cloud infrastructure with defined compute resources and execution environment conditions. The platform runs Tools through the Task Execution Service (TES), which hosts a suite of APIs for launching and monitoring the execution of the containerized applications. The TES APIs operate on the Task resource model. During pipeline execution, the Tool definition of each step is translated to a task. Each task includes an execution specification as part of the resource model that contains all the information TES needs to provision and launch the containerized application.
Requesting Resources
The Task Execution Service supports different compute types depending on the values provided in the execution.environment.resources section of the task version or task run body.
ℹ️ Queued task runs are fulfilled as resources become available. Ordering is not guaranteed.
Type and Size
For the type and size fields, you can select from the following combinations:
| Type | Size | CPU | Memory |
| --- | --- | --- | --- |
| standard | small | 0.8 | 3 GB |
| standard | medium | 1.3 | 4.5 GB |
| standard | large | 2 | 7 GB |
| standard | xlarge | 4 | 14 GB |
| standard | xxlarge | 8 | 28 GB |
| standardHiCpu | small | 15.5 | 28 GB |
| standardHiCpu | medium | 35.5 | 68 GB |
| standardHiCpu | large | 71.5 | 140 GB |
| standardHiMem | small | 7.5 | 60 GB |
| standardHiMem | medium | 15.5 | 124 GB |
| standardHiMem | large | 47.5 | 380 GB |
| standardHiMem | xlarge | 95.5 | 764 GB |
| fpga | small | 7.5 | 118 GB |
| fpga | medium | 15.5 | 240 GB |
| fpga | large | 63.5 | 972 GB |
The fpga, large compute type is unavailable in the cac1 and aps2 regions.
The exact memory and CPU resources provisioned for a task run may vary for a given compute type in the table above. This is done to optimize for availability, ensuring a job is scheduled in a timely manner while still satisfying the minimum resources requested. It may result in slight variations in a task run's performance and duration when executed on the same inputs multiple times.
If you do not specify a resource size and type, then the task is executed on the smallest instance available when the request is made.
Tier
Compute resource tiers provide pricing options that save cost at the risk of making runs more susceptible to capacity limitations. Choosing a low-cost tier schedules the task run on a compute node that may be interrupted and re-provisioned when the system is under load. This works well for short-running jobs on smaller compute types and sizes that are tolerant to interruption. For long-running jobs on more powerful compute types and sizes, the chance of interruption increases and may severely impact total run duration.
| Tier | Description |
| --- | --- |
| economy | Lowest cost option. The run may be interrupted and will continue to be rescheduled on interruptible nodes when restarted after an interruption or failure. |
| standard | Highest cost option. The run is scheduled on a non-interruptible node and will be rescheduled to non-interruptible nodes when restarted after an interruption or failure. |
FPGA compute types are limited to the standard tier in the regions below.
- London (euw2)
- Canada (cac1)
- Singapore (aps1)
- Frankfurt (euc1)
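For illustration, a resources selection in the execution body might look like the following sketch. The type, size, and tier values come from the tables above; the nesting is inferred from the execution.environment.resources path and may not reflect the full schema:

```json
{
  "execution": {
    "environment": {
      "resources": {
        "type": "standardHiCpu",
        "size": "medium",
        "tier": "standard"
      }
    }
  }
}
```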
Environment Variables
Environment variables may be set in the container executing the task run.
Secure Environment Variables
Environment variables may be secured to hide them from log outputs and API responses. Use the SECURE_ prefix to indicate an environment variable as secure.
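As an illustrative sketch (the variables key name and nesting are assumptions based on the execution.environment path, and the values are hypothetical), a regular and a secured environment variable might be declared like this:

```json
{
  "execution": {
    "environment": {
      "variables": {
        "SAMPLE_NAME": "NA12878",
        "SECURE_API_TOKEN": "{{apiToken}}"
      }
    }
  }
}
```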
Substitution
Within the execution body, either a static value or a substitution can be provided for a field's value. A substitution is made using a string wrapped with {{<string>}} and allows the actual values to be specified at launch time using arguments. Substitutions can be reused for multiple fields and can be embedded within a string value.
Certain fields, like passwords or secrets, require substitutions to prevent secrets from being stored with the version. The value for the secret will be replaced in response bodies with "<hidden>" rather than the value itself.
The following is an example task execution specification leveraging substitutions:
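(The sketch below is illustrative only; the image object layout, command, and environment nesting are assumptions, and {{sampleId}} and {{password}} are the substitutions.)

```json
{
  "execution": {
    "image": {
      "name": "ubuntu",
      "tag": "latest"
    },
    "command": "bash",
    "args": ["-c", "echo Processing sample {{sampleId}}"],
    "environment": {
      "variables": {
        "SECURE_PASSWORD": "{{password}}"
      },
      "resources": {
        "type": "standard",
        "size": "small"
      }
    },
    "systemFiles": {
      "url": "gds://taskruns"
    }
  }
}
```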
The input arguments are then provided in the arguments of the version launch request.
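For example, assuming arguments is a simple map from substitution names to values (the exact launch request schema may differ), the launch body could include:

```json
{
  "arguments": {
    "sampleId": "NA12878",
    "password": "example-secret-value"
  }
}
```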
Logging
Throughout a task run's life-cycle, several logs and system-related files are produced with information about the execution of the job. TES requires the user to provide an external location to serve as the file store for these files. A "systemFiles" field in the execution body is used to provide a URL and optional credentials (similar to an output mapping) for storing these files.
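For illustration, assuming the "systemFiles" field takes a url key (as the comparison to an output mapping suggests), an entry pointing at a GDS location might look like this:

```json
{
  "execution": {
    "systemFiles": {
      "url": "gds://taskruns"
    }
  }
}
```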
During the execution of the task run, TES creates a folder with a name matching the task run ID (i.e., trn.<uid>) directly under the final path component of the provided URL. For example, if the above task run is executed, a folder is created at gds://taskruns/trn.<uid>, where trn.<uid> is the unique ID assigned to the task run resource. Any system-related files will be stored within the trn.<uid> folder.
stdout/stderr
During task run execution, the stdout and stderr of executing processes are redirected to /var/log/tessystemlogs/task-stdouterr.log. Other container log artifacts are placed in the /var/log/tessystemlogs folder. These log files are uploaded every 3 seconds and can be accessed while the task run is executing.
An output mapping may be specified using /var/log/tessystemlogs as the path to send the folder contents to an alternative location from the URL specified in the "systemFiles" field.
The following is an example of a task execution specification that will send logs stored in /var/log/tessystemlogs to gds://volume1/myLogs. Any other system-related files produced by the task run job will be sent to the "systemFiles" URL, gds://taskruns:
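(A sketch under the same assumed field nesting as the earlier examples; the volume names follow the description above.)

```json
{
  "execution": {
    "outputs": [
      {
        "path": "/var/log/tessystemlogs",
        "url": "gds://volume1/myLogs"
      }
    ],
    "systemFiles": {
      "url": "gds://taskruns"
    }
  }
}
```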
Other logs
The following is a table of the files produced during a task run's life-cycle. Some files are only produced under certain conditions.
| File | Description |
| --- | --- |
| bsfs-stdouterr.log | Logs associated with mounting input files. |
| logging-stdouterr.log | Logs associated with the logging container. |
| output#-stdouterr.log | Logs associated with uploading to each output location (# is replaced with the index of the output in the execution body). |
| task-stdouterr.log | stdout/stderr of the application. |
| _manifest.json | Records metadata for all uploaded files, including the relative path where the file was uploaded, the md5 checksum, the file size in bytes, and the UTC timestamp when the file was uploaded. |
| _tags.json | Records the UTC timestamp when the uploads are completed and the Task Run ID. |
Marshalling Data
Inputs
The task execution body provides an array of inputs that will be attached to volume mounts on the running instance of the task. Each object in the inputs array must contain a "path" and a "url". The path is the volume mount location in the container to which the input will be downloaded. The url is the external location the input will be downloaded from.
Input paths are made read-only at the last folder component in the path. In general, applications run on TES should use the /scratch path for intermediate files. The DRAGEN application should use the /ephemeral path for intermediate files.
Inputs must meet the following conditions:
- The path must be absolute.
- The same path must not be reused for multiple inputs.
- The path must not lead to any of the following: /, /usr, /var, /log, /lib, /usr/bin.
- HTTP-based URLs must not require authentication.
- GDS-based URLs must be accessible by the token used to launch the task run.
TES currently supports a maximum of 20,000 input files, including files mounted with a folder input.
Input File Example
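A minimal sketch of a file input using the path and url keys described above (the volume and file names are hypothetical):

```json
{
  "execution": {
    "inputs": [
      {
        "path": "/data/reference/genome.fa",
        "url": "gds://volume1/reference/genome.fa"
      }
    ]
  }
}
```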
Input Folder Example
When you specify a folder as input, the "systemFiles" property must also be set.
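A sketch of a folder input under the same assumptions (the type value folder is taken from the manifest section below; the volume and folder names are hypothetical):

```json
{
  "execution": {
    "inputs": [
      {
        "path": "/data/reference",
        "url": "gds://volume1/reference/",
        "type": "folder"
      }
    ],
    "systemFiles": {
      "url": "gds://taskruns"
    }
  }
}
```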
AWS S3 Inputs
To read inputs from a private S3 bucket, the credentials to that bucket must be provided in the credentials field of the inputs, and the storageProvider must be set to aws. A substitution is required for each of the fields in credentials when defined in a task version. The following are the valid keys that can be provided in credentials:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- AWS_SESSION_TOKEN
There are two ways to provide access keys. For permanent credentials, include the AWS_ACCESS_KEY_ID and the AWS_SECRET_ACCESS_KEY. For temporary credentials, include the AWS_ACCESS_KEY_ID, the AWS_SECRET_ACCESS_KEY, and the AWS_SESSION_TOKEN.
The following is an example of a task execution specification that reads inputs from a private S3 location:
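(A sketch using the credentials and storageProvider keys described above, with substitutions for the secret fields; the bucket and object names are hypothetical.)

```json
{
  "execution": {
    "inputs": [
      {
        "path": "/data/sample.bam",
        "url": "s3://my-private-bucket/samples/sample.bam",
        "storageProvider": "aws",
        "credentials": {
          "AWS_ACCESS_KEY_ID": "{{awsAccessKeyId}}",
          "AWS_SECRET_ACCESS_KEY": "{{awsSecretAccessKey}}"
        }
      }
    ]
  }
}
```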
Download Mode
By default, input resources are streamed to the task run job during execution. It may be preferable to force the complete download of certain resources prior to executing the command. For example, applications that use a random access pattern need the complete file contents available. Inputs may be specified as requiring download using the "mode" field with a value of "download". Available options for the mode include "download" and "stream" (default).
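For example, an input forced to download completely before the command runs might look like this sketch (file names hypothetical):

```json
{
  "path": "/data/reference/genome.fa",
  "url": "gds://volume1/reference/genome.fa",
  "mode": "download"
}
```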
Manifest type input
Each TES task has a maximum number of input files (128) allowed in the inputs list (inputs of type file). To launch a task with a very large number of inputs, you can use a single input of type manifest.
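A sketch of a manifest-type input, using the keys described below (the presigned URL is a placeholder):

```json
{
  "execution": {
    "inputs": [
      {
        "type": "manifest",
        "path": "/manifest/mount/path",
        "url": "https://<presigned-url-of-manifest-json>",
        "mode": "download"
      }
    ]
  }
}
```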
Here, mode can be either download or stream, and it is applied to all input files in the manifest. The value of url is an https-based presigned URL of the manifest JSON file (for a GDS-based manifest JSON, call the GDS API to get its presigned URL). The manifest JSON is a list of input items, each item in the following format:
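(An illustrative sketch showing a single item; the presigned URL and size value are placeholders, and the path matches the mount-path example discussed below.)

```json
[
  {
    "url": "https://<presigned-url-of-input-file>",
    "size": 3217346917,
    "path": "reference/fasta/hg38_alt_aware_nohla.fa"
  }
]
```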
Here, url is the presigned URL of each input file, size is the exact size in bytes of the input file, and path is the intended mount path relative to the path of the manifest itself inside the container. For instance, in the above example, the absolute mount path of the file hg38_alt_aware_nohla.fa is /manifest/mount/path/reference/fasta/hg38_alt_aware_nohla.fa.
Note that GDS folders and S3 folders are not supported inside a manifest. The presigned URL of each file under such a folder must be generated individually and added to the manifest JSON.
Only one input of type manifest is allowed in the inputs list of a task launch request. Any additional inputs (of type file or folder) are ignored; they should be included in the manifest JSON instead.
The manifest JSON can be gzipped. The maximum size of the input manifest JSON is 1 GB (gzipped or uncompressed).
Outputs
The task execution body provides an array of outputs to upload files and folders local to the task container to an external URL. Similar to inputs, each object in the outputs array must contain a "path" and a "url". The contents of the path will be uploaded to the mapped URL.
Requirements:
- The path must be absolute.
- The same path must not be reused for multiple outputs.
- The path and URL must lead to a folder.
- The URL scheme must match one of the following: gds://, s3://
- GDS-based URLs must be accessible by the token used to launch the task run.
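A minimal sketch of an output mapping using the path and url keys described above (the volume and folder names are hypothetical):

```json
{
  "execution": {
    "outputs": [
      {
        "path": "/output",
        "url": "gds://volume1/analysis/results"
      }
    ]
  }
}
```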
In addition to the outputs generated by the task execution, a _manifest.json and a _tags.json file are uploaded to each mounted output location. These files contain information about the files uploaded to that specific mount location.
| File | Description |
| --- | --- |
| _manifest.json | Records metadata for all uploaded files, including the relative path where the file was uploaded, the md5 checksum, the file size in bytes, and the UTC timestamp when the file was uploaded. |
| _tags.json | Records the UTC timestamp when the uploads completed and the Task Run ID. |
AWS S3 Outputs
To write outputs to a private S3 bucket, the credentials to that bucket must be provided in the "credentials" field of the "outputs", and the "storageProvider" must be set to "aws". A substitution must be provided for each of the fields in credentials when defined in a task version. The following are the valid keys that can be provided in credentials:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- AWS_SESSION_TOKEN
There are two ways to provide access keys. For permanent credentials, include the AWS_ACCESS_KEY_ID and the AWS_SECRET_ACCESS_KEY. For temporary credentials, include the AWS_ACCESS_KEY_ID, the AWS_SECRET_ACCESS_KEY, and the AWS_SESSION_TOKEN.
The following is an example of an execution specification that, when launched, will output logs to a private S3 location:
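(A sketch under the same assumptions as the S3 input example, sending the log folder to a hypothetical private bucket.)

```json
{
  "execution": {
    "outputs": [
      {
        "path": "/var/log/tessystemlogs",
        "url": "s3://my-private-bucket/task-logs",
        "storageProvider": "aws",
        "credentials": {
          "AWS_ACCESS_KEY_ID": "{{awsAccessKeyId}}",
          "AWS_SECRET_ACCESS_KEY": "{{awsSecretAccessKey}}"
        }
      }
    ]
  }
}
```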
Private Image Repositories
TES supports running Docker-based images from public or private repositories. For images stored in a private repository, such as a private Docker repo or a private AWS ECR, access must be provided through credentials or an AWS policy.
Private AWS ECR
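One way to grant access is an ECR repository policy that allows the platform account to pull the image. The sketch below uses the standard ECR pull actions; the exact set of actions TES requires is an assumption:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPlatformPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<platform_aws_account>:root"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
      ]
    }
  ]
}
```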
Substitute <platform_aws_account> with the platform AWS account ID: 079623148045.
Setting this policy allows the image to be specified for task runs by any ICA user, so it is important to ensure no private data is stored in the images in the AWS ECR.
Example AWS ECR Image
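A sketch of an image reference pointing at a private ECR repository (the image object layout is assumed; the account ID, region, repository, and tag are placeholders):

```json
{
  "execution": {
    "image": {
      "name": "<aws_account_id>.dkr.ecr.<region>.amazonaws.com/my-tool",
      "tag": "1.0"
    }
  }
}
```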
Private Docker Hub
The following is an example of a task execution specification that can be provided with the image password at launch time:
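(A sketch assuming the image credentials sit alongside the image reference in the execution object; the credential key names and nesting are assumptions. The password is a substitution so it is never stored with the version.)

```json
{
  "execution": {
    "image": {
      "name": "myorg/private-tool",
      "tag": "1.0",
      "credentials": {
        "username": "myorg-user",
        "password": "{{dockerPassword}}"
      }
    }
  }
}
```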
When this task version is launched, the password is provided in the launch arguments as follows:
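(Assuming arguments is a map from substitution names to values, as in the Substitution section above.)

```json
{
  "arguments": {
    "dockerPassword": "example-docker-hub-password"
  }
}
```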
Bring Your Own Docker
TES adopted the Kubernetes convention for launching, as follows: the Docker image's ENTRYPOINT is overridden by the "Command" field, and the Docker image's CMD is overridden by the "Args" field in the task execution body. The "Image" field should match the image name in Docker Hub. To pull images from a private repository, you can provide the image credentials in the execution object.
Currently TES does not support the array syntax for "Command". Only a string can be provided. If your Docker image requires the array syntax, it must be enabled in the image itself by specifying the ENTRYPOINT as an array, and "Command" must not be specified in the task execution specification.
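To illustrate the mapping (the image name, command, and args values are hypothetical; command is a plain string because the array syntax is not supported):

```json
{
  "execution": {
    "image": {
      "name": "myorg/my-tool",
      "tag": "1.2"
    },
    "command": "/usr/local/bin/my-tool",
    "args": ["--input", "/data/sample.bam", "--output-dir", "/output"]
  }
}
```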
Working Directory
Performance Optimizations
Instance Retry
Task runs may experience an unexpected interruption where the compute instance hosting the task run job fails. Failure causes include the following:
- Hardware failures
- Instance eviction by the cloud infrastructure
- Task run application fails with a non-zero exit code
To mitigate unexpected task run failures, a retryLimit can be provided to specify the number of times a task run should be retried if an unexpected job failure occurs. The retryLimit field is specified in the execution body as an integer between 0 and 6, with a default of 3. When developing and testing a task run, it's recommended to use a retryLimit of 0.
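For example, a development run that disables retries might set the field like this (nested within the execution body as described above):

```json
{
  "execution": {
    "retryLimit": 0
  }
}
```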