Upload Sessions

Files that are uploaded directly to the cloud provider's object store must be indexed before they appear in the platform GUI or APIs. Indexing is eventually consistent. On average, files and associated folder records are indexed within X TBD, but this can vary depending on the delivery of events from the cloud provider.

Upload sessions can be used to ensure that all files uploaded to the underlying cloud provider are immediately indexed and ready for use.

ℹ️ Important

Although using sessions when uploading data is optional, it is recommended as a best practice, especially when a subsequent process is expected to query for and use the uploaded data. One example is a pipeline in which one workflow step uploads files that the next step expects to exist immediately.

When a folder create or update request with object store access is processed, a session is started and a session ID is returned as part of the API response. The uploader then uses the provided credentials to upload data to the cloud provider and keeps track of the number of files uploaded. When the upload is complete, the user can indicate to GDS that all files have been uploaded by calling the sessions complete API:

POST: v1/folders/{folderId}/sessions/{sessionId}:complete
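As a rough illustration of the call above, the following Python sketch builds the request object. Only the path template comes from this page. The base URL, the bearer-token `Authorization` header, the empty request body, and the helper names `complete_session_url` and `build_complete_request` are assumptions made for illustration.

```python
import urllib.request
from urllib.parse import quote


def complete_session_url(base_url: str, folder_id: str, session_id: str) -> str:
    """Fill in the path template v1/folders/{folderId}/sessions/{sessionId}:complete."""
    return (
        f"{base_url.rstrip('/')}/v1/folders/{quote(folder_id, safe='')}"
        f"/sessions/{quote(session_id, safe='')}:complete"
    )


def build_complete_request(
    base_url: str, folder_id: str, session_id: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request that completes a session.

    Assumptions: bearer-token authentication and an empty request body.
    """
    return urllib.request.Request(
        complete_session_url(base_url, folder_id, session_id),
        data=b"",  # assumed: the call takes no request body
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
```

The request could then be sent with `urllib.request.urlopen(request)`. The base URL and token must be replaced with values for the actual deployment.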

At this point the session moves to either a closed or a completed state. If it is completed, all files are available for use immediately. If it is closed, GDS may run a background reconciliation of objects to ensure that everything has been indexed, and then transition the session to completed. A client should continue to poll a closed session until it transitions to completed. To do so, the client may repeat the same call as above and check the status field in the response.
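The polling behavior described above might be sketched as follows in Python. The function takes any callable that performs the complete call and returns the parsed response, so it does not depend on a particular HTTP client. The name `wait_until_completed`, the lowercase status strings `"closed"` and `"completed"`, and the default interval and attempt limit are assumptions, not documented values.

```python
import time
from typing import Callable, Mapping


def wait_until_completed(
    complete_call: Callable[[], Mapping[str, str]],
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat the complete call until the session reports a completed status.

    Returns the number of calls that were made. The status spellings
    compared here are assumed; adjust them to match the real response.
    """
    for attempt in range(1, max_attempts + 1):
        status = str(complete_call().get("status", "")).lower()
        if status == "completed":
            return attempt
        if status != "closed":
            # Anything other than closed or completed is not described in the docs.
            raise RuntimeError(f"unexpected session status: {status!r}")
        if attempt < max_attempts:
            sleep(interval_s)  # wait before polling the closed session again
    raise TimeoutError(f"session not completed after {max_attempts} attempts")
```

Passing the call in as a parameter keeps the loop easy to test with canned responses, and the attempt limit prevents a client from polling indefinitely if a session never leaves the closed state.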

Once the session is completed, users can reference the files throughout the ICA system and trigger any workflows that use them. If a user does not close a session, GDS closes the session automatically when the upload credentials expire.

The CLI handles this behavior natively.
