Create TES Basespace Upload manager service #186

Merged
merged 16 commits into from
Apr 22, 2024
6 changes: 4 additions & 2 deletions config/config.ts
@@ -8,6 +8,7 @@ import { getPostgresManagerStackProps } from './stacks/postgresManager';
import { getMetadataManagerStackProps } from './stacks/metadataManager';
import { getSequenceRunManagerStackProps } from './stacks/sequenceRunManager';
import { getFileManagerStackProps } from './stacks/fileManager';
import { getBsRunsUploadManagerStackProps } from './stacks/bsRunsUploadManager';

interface EnvironmentConfig {
name: string;
@@ -45,10 +46,10 @@ export const getEnvironmentConfig = (
metadataManagerStackProps: getMetadataManagerStackProps(),
sequenceRunManagerStackProps: getSequenceRunManagerStackProps(),
fileManagerStackProps: getFileManagerStackProps(accountName),
bsRunsUploadManagerStackProps: getBsRunsUploadManagerStackProps(accountName),
},
},
};
break;

case 'gamma':
return {
@@ -66,10 +67,10 @@ export const getEnvironmentConfig = (
metadataManagerStackProps: getMetadataManagerStackProps(),
sequenceRunManagerStackProps: getSequenceRunManagerStackProps(),
fileManagerStackProps: getFileManagerStackProps(accountName),
bsRunsUploadManagerStackProps: getBsRunsUploadManagerStackProps(accountName),
},
},
};
break;

case 'prod':
return {
@@ -87,6 +88,7 @@ export const getEnvironmentConfig = (
metadataManagerStackProps: getMetadataManagerStackProps(),
sequenceRunManagerStackProps: getSequenceRunManagerStackProps(),
fileManagerStackProps: getFileManagerStackProps(accountName),
bsRunsUploadManagerStackProps: getBsRunsUploadManagerStackProps(accountName),
},
},
};
7 changes: 7 additions & 0 deletions config/constants.ts
@@ -28,6 +28,10 @@ export const devBucket = 'umccr-temp-dev';
export const stgBucket = 'umccr-temp-stg';
export const prodBucket = 'org.umccr.data.oncoanalyser';

export const devGdsBsRunsUploadLogPath = 'gds://development/primary_data/temp/bs_runs_upload_tes/';
export const stgGdsBsRunsUploadLogPath = 'gds://staging/primary_data/temp/bs_runs_upload_tes/';
export const prodGdsBsRunsUploadLogPath = 'gds://production/primary_data/temp/bs_runs_upload_tes/';

/**
* Validate the secret name so that it doesn't end with 6 characters and a hyphen.
*
@@ -64,6 +68,9 @@ export const eventSourceQueueName = 'orcabus-event-source-queue';

export const serviceUserSecretName = 'orcabus/token-service-user'; // pragma: allowlist secret
export const jwtSecretName = 'orcabus/token-service-jwt'; // pragma: allowlist secret
export const icaAccessTokenSecretName = 'IcaSecretsPortal'; // pragma: allowlist secret

export const basespaceAccessTokenSecretName = '/manual/BaseSpaceAccessTokenSecret'; // pragma: allowlist secret

// const statelessConfig = {
// multiSchemaConstructProps: {
38 changes: 38 additions & 0 deletions config/stacks/bsRunsUploadManager.ts
@@ -0,0 +1,38 @@
import {
devGdsBsRunsUploadLogPath,
stgGdsBsRunsUploadLogPath,
prodGdsBsRunsUploadLogPath,
AccountName,
icaAccessTokenSecretName,
jwtSecretName,
basespaceAccessTokenSecretName,
eventBusName,
} from '../constants';
import { BsRunsUploadManagerConfig } from '../../lib/workload/stateless/stacks/bs-runs-upload-manager/deploy/stack';

export const getBsRunsUploadManagerStackProps = (n: AccountName): BsRunsUploadManagerConfig => {
const baseConfig = {
ica_token_secret_id: icaAccessTokenSecretName,
portal_token_secret_id: jwtSecretName,
basespace_token_secret_id: basespaceAccessTokenSecretName,
eventbus_name: eventBusName,
};

switch (n) {
case 'beta':
return {
...baseConfig,
gds_system_files_path: devGdsBsRunsUploadLogPath,
};
case 'gamma':
return {
...baseConfig,
gds_system_files_path: stgGdsBsRunsUploadLogPath,
};
case 'prod':
return {
...baseConfig,
gds_system_files_path: prodGdsBsRunsUploadLogPath,
};
}
};
41 changes: 41 additions & 0 deletions lib/workload/components/python-lambda-layer/index.ts
@@ -0,0 +1,41 @@
import { Construct } from 'constructs';
import { PythonLayerVersion } from '@aws-cdk/aws-lambda-python-alpha';
import * as lambda from 'aws-cdk-lib/aws-lambda';

export interface PythonLambdaLayerConstructProps {
layer_name: string;
layer_directory: string;
layer_description: string;
}

export class PythonLambdaLayerConstruct extends Construct {
public readonly lambda_layer_arn: string;
public readonly lambda_layer_version_obj: PythonLayerVersion;

constructor(scope: Construct, id: string, props: PythonLambdaLayerConstructProps) {
super(scope, id);

this.lambda_layer_version_obj = new PythonLayerVersion(this, 'python_lambda_layer', {
Member

@alexiswl Just a minor request, please.

I just noticed these in a post-mortem review. When you have spare time down the track, could you please follow up by refactoring the naming convention of these CDK variables? There is mixed use of snake_case and camelCase.

Ditto, see the current guidance on TypeScript:
https://github.com/umccr/orcabus/blob/main/README.md#typography

This applies to both the bsRunsUploadManager and ICAv2CopyBatchManager services that have been merged.

Member

And.. separate PRs for these refactor, pls.

Member Author

@alexiswl Apr 29, 2024

> And.. separate PRs for these refactor, pls.

Ah, that's going to cause some annoying merge conflicts. The PythonLayer construct example above will rename the attribute lambda_layer_version_obj to lambdaLayerVersionObj, while the property of the bs runs upload manager stack, lambda_layer_obj, will be renamed to lambdaLayerObj.

So the layers attribute in any of the Python functions will go from layers: [props.lambda_layer_obj.lambda_layer_version_obj] to layers: [props.lambdaLayerObj.lambdaLayerVersionObj], where lambdaLayerObj comes from one branch and lambdaLayerVersionObj comes from another branch.

I think it would be easier for this all to be in one PR.

Member

Right, sure. In that case, that's fine, Alexis. Thanks

Member

I mean, ok to be all in one go, refactor PR, yes.

layerVersionName: props.layer_name,
entry: props.layer_directory,
compatibleRuntimes: [lambda.Runtime.PYTHON_3_11],
compatibleArchitectures: [lambda.Architecture.X86_64],
license: 'GPL3',
description: props.layer_description,
bundling: {
commandHooks: {
// eslint-disable-next-line @typescript-eslint/no-unused-vars
beforeBundling(inputDir: string, outputDir: string): string[] {
return [];
},
afterBundling(inputDir: string, outputDir: string): string[] {
return [`python -m pip install ${inputDir} -t ${outputDir}`];
},
},
},
});

// Set outputs
this.lambda_layer_arn = this.lambda_layer_version_obj.layerVersionArn;
}
}
112 changes: 112 additions & 0 deletions lib/workload/stateless/stacks/bs-runs-upload-manager/Readme.md
@@ -0,0 +1,112 @@
# BS Runs Upload Manager

<!-- TOC -->
* [BS Runs Upload Manager](#bs-runs-upload-manager)
* [Summary](#summary)
* [Inputs](#inputs)
* [Example input](#example-input)
* [Lambdas in this directory](#lambdas-in-this-directory)
* [Upload V2 SampleSheet to GDS Bssh](#upload-v2-samplesheet-to-gds-bssh)
* [Launch BS Runs Upload Tes](#launch-bs-runs-upload-tes)
<!-- TOC -->

## Summary

Quick and dirty hack to push our runs from ICAv1 to our V2 BaseSpace server domain.

Once we move from V1 to V2 we probably won't need this.

This state machine copies data from V1 to V2 via `bs runs upload`.

The `bs runs upload` command triggers an autolaunch of BCLConvert in ICAv2.

The two steps of the state machine are:

1. Generate a V2 samplesheet and re-upload it
2. Launch an ICAv1 TES task that runs the `bs runs upload` command

The state machine subscribes to the orcabus.srm events and is triggered when a new run is detected.

![](images/bs_runs_upload_manager.png)
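
The two steps above can be sketched as a simple chain of functions. This is only an illustration of the data flow, not the deployed state machine definition: the type names, `runUploadPipeline` and both stub steps are hypothetical, the payload keys are borrowed from the lambda examples further down in this README, and the real steps are asynchronous Lambda invocations rather than local function calls.

```typescript
// Hypothetical payload shapes, modelled on the lambda examples in this README.
interface RunInput {
  gds_folder_path: string;
  gds_volume_name: string;
  samplesheet_name: string;
}

interface SampleSheetOutput extends RunInput {
  instrument_run_id: string;
}

interface TesLaunchOutput {
  task_run_id: string;
}

// A step maps one payload to the next (synchronous here purely for illustration).
type Step<I, O> = (input: I) => O;

// Chain the two steps in order: the output of step 1 is the input of step 2.
function runUploadPipeline(
  input: RunInput,
  uploadV2SampleSheet: Step<RunInput, SampleSheetOutput>,
  launchBsRunsUploadTes: Step<SampleSheetOutput, TesLaunchOutput>
): TesLaunchOutput {
  const withSampleSheet = uploadV2SampleSheet(input);
  return launchBsRunsUploadTes(withSampleSheet);
}

// Stub steps with placeholder values; the real lambdas call external services.
const stubUploadV2SampleSheet: Step<RunInput, SampleSheetOutput> = (input) => ({
  ...input,
  samplesheet_name: 'SampleSheet.V2.stub.csv',
  instrument_run_id: 'stub-instrument-run-id',
});

const stubLaunchBsRunsUploadTes: Step<SampleSheetOutput, TesLaunchOutput> = () => ({
  task_run_id: 'trn.stub',
});

const exampleRunInput: RunInput = {
  gds_folder_path: '/Runs/240315_A01052_0186_AH5HM5DSXC_r.YpC_0U_7-06Oom1cFl9Y5A',
  gds_volume_name: 'bssh.acddbfda498038ed99fa94fe79523959',
  samplesheet_name: 'SampleSheet.csv',
};

const pipelineResult = runUploadPipeline(
  exampleRunInput,
  stubUploadV2SampleSheet,
  stubLaunchBsRunsUploadTes
);
```

Passing the steps in as arguments keeps the chaining logic testable without touching GDS, BaseSpace or TES.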

## Inputs

The AWS Step Functions state machine takes the following parameters:

* gdsFolderPath: The path to the run folder in GDS
* gdsVolumeName: The GDS volume name
* sampleSheetName: The name of the sample sheet file

### Example input

```json
{
"run_folder_path": "/Runs/231109_A01052_0171_BHLJW7DSX7_r.NULhvzxcSEWmqZw8QljXfQ",
"run_volume_name": "bssh.acddbfda498038ed99fa94fe79523959",
"sample_sheet_name": "SampleSheet.csv"
}
```
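
A caller may want to check a payload before starting an execution. The sketch below is a hypothetical type guard, not part of this stack: `StateMachineInput` and `isStateMachineInput` are made-up names, and the keys are copied from the example input above (the parameter list uses different spellings, so treat the key names as an assumption).

```typescript
// Hypothetical shape, using the keys from the example input in this README.
interface StateMachineInput {
  run_folder_path: string;
  run_volume_name: string;
  sample_sheet_name: string;
}

const requiredInputKeys = ['run_folder_path', 'run_volume_name', 'sample_sheet_name'];

// Returns true only when every required key is present as a non-empty string.
function isStateMachineInput(value: unknown): value is StateMachineInput {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return requiredInputKeys.every((key) => {
    const field = record[key];
    return typeof field === 'string' && field.length > 0;
  });
}
```

The guard only checks shape; it does not verify that the folder or volume exists in GDS.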

### Lambdas in this directory

#### Upload V2 SampleSheet to GDS Bssh

This lambda takes an existing V1 samplesheet, converts it to a V2 samplesheet and uploads the V2 samplesheet to the GDS volume.

It uses the ssbackend API to generate the V2 samplesheet, since V2 samplesheets require some metadata that is not present in the V1 samplesheet.

**Example Input**

```json
{
"gds_folder_path": "/Runs/240315_A01052_0186_AH5HM5DSXC_r.YpC_0U_7-06Oom1cFl9Y5A",
"gds_volume_name": "bssh.acddbfda498038ed99fa94fe79523959",
"samplesheet_name": "SampleSheet.csv"
}
```

**Example Output**

```json
{
"gds_folder_path": "/Runs/240315_A01052_0186_AH5HM5DSXC_r.YpC_0U_7-06Oom1cFl9Y5A",
"gds_volume_name": "bssh.acddbfda498038ed99fa94fe79523959",
"samplesheet_name": "SampleSheet.V2.<timestamp>.csv",
"instrument_run_id": "240315_A01052_0186_AH5HM5DSXC"
}
```
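
In the example above, `instrument_run_id` matches the run folder name with its `_r.<id>` suffix removed. The sketch below shows one way to derive it; this is an assumption based only on the example values, the lambda itself may obtain the ID differently, and `instrumentRunIdFromFolderPath` is a hypothetical helper.

```typescript
// Assumption: the run folder is named "<instrument_run_id>_r.<run id>".
function instrumentRunIdFromFolderPath(gdsFolderPath: string): string {
  // Take the last non-empty path segment, e.g. "240315_..._AH5HM5DSXC_r.YpC_...".
  const segments = gdsFolderPath.split('/').filter((segment) => segment.length > 0);
  const folderName = segments.length > 0 ? segments[segments.length - 1] : '';
  // Drop the trailing "_r.<run id>" part if present.
  return folderName.replace(/_r\.[^/]*$/, '');
}

const derivedInstrumentRunId = instrumentRunIdFromFolderPath(
  '/Runs/240315_A01052_0186_AH5HM5DSXC_r.YpC_0U_7-06Oom1cFl9Y5A'
);
// derivedInstrumentRunId === '240315_A01052_0186_AH5HM5DSXC'
```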

#### Launch BS Runs Upload Tes

This lambda launches a TES task that runs the `bs runs upload` command.

**Example Input**

```json
{
"gds_folder_path": "/Runs/240315_A01052_0186_AH5HM5DSXC_r.YpC_0U_7-06Oom1cFl9Y5A",
"gds_volume_name": "bssh.acddbfda498038ed99fa94fe79523959",
"samplesheet_name": "SampleSheet.V2.<timestamp>.csv",
"instrument_run_id": "240315_A01052_0186_AH5HM5DSXC"
}
```

**Example Output**

```json
{
"task_run_id": "trn.4fd3414f98fe47c3a6cfc31a67b7418a"
}
```

#### External parameters

The following properties are required in order to deploy the state machine / stack:

* SecretsManager:
* ICA Access Token: `IcaSecretsPortal`
* Portal Token: `orcabus/token-service-jwt`
* BaseSpace Token Secret ID: `/manual/BaseSpaceAccessTokenSecret`
* Strings:
* GDS system files path root (where the TES logs go)
* EventBus Name: `OrcabusMain`
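
As a rough sketch, the parameters above could be sanity-checked before deployment. The key names mirror those used in `config/stacks/bsRunsUploadManager.ts` and the dev GDS path comes from `config/constants.ts`; the `ExternalParameters` type, `findParameterProblems` and the specific checks are hypothetical and not part of the stack.

```typescript
// Hypothetical mirror of the keys used in config/stacks/bsRunsUploadManager.ts.
type ExternalParameters = {
  ica_token_secret_id: string;
  portal_token_secret_id: string;
  basespace_token_secret_id: string;
  gds_system_files_path: string;
  eventbus_name: string;
};

const parameterKeys: (keyof ExternalParameters)[] = [
  'ica_token_secret_id',
  'portal_token_secret_id',
  'basespace_token_secret_id',
  'gds_system_files_path',
  'eventbus_name',
];

// Collect human-readable problems instead of throwing on the first one.
function findParameterProblems(params: ExternalParameters): string[] {
  const problems: string[] = [];
  for (const key of parameterKeys) {
    if (params[key].trim() === '') {
      problems.push(`${key} must not be empty`);
    }
  }
  // Assumed check: the TES log location is expected to be a GDS URI.
  if (params.gds_system_files_path.indexOf('gds://') !== 0) {
    problems.push('gds_system_files_path must start with gds://');
  }
  return problems;
}

const devParameters: ExternalParameters = {
  ica_token_secret_id: 'IcaSecretsPortal',
  portal_token_secret_id: 'orcabus/token-service-jwt',
  basespace_token_secret_id: '/manual/BaseSpaceAccessTokenSecret',
  gds_system_files_path: 'gds://development/primary_data/temp/bs_runs_upload_tes/',
  eventbus_name: 'OrcabusMain',
};

const devParameterProblems = findParameterProblems(devParameters);
```

The check only validates the shape of the values; it does not confirm that the secrets or the event bus exist in the target account.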