feat(cli): garbage collect s3 assets (under --unstable flag) (#31611)
## S3 Asset Garbage Collection

This PR introduces a new CLI command under the new `--unstable` flag. This flag ensures that users understand and opt in to experimental or incomplete CLI features.

`cdk gc` will garbage collect unused assets in your bootstrapped S3 bucket. It goes through each object in the bucket, checks whether the asset hash shows up in a CloudFormation stack, and if not, tags the object as unused and/or deletes the object (depending on your configuration).
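
The check described above can be sketched as a pure function. This is a minimal illustration, not the CLI's implementation: it assumes that object keys look like `<assetHash>.<extension>` and that a plain substring search over template bodies is good enough. The names `assetHashFromKey` and `findUnusedKeys` are hypothetical.

```typescript
// Hypothetical sketch of the "is this asset still referenced?" check.
// Assumption: object keys have the form "<assetHash>.<extension>".
function assetHashFromKey(key: string): string {
  const dot = key.indexOf('.');
  return dot === -1 ? key : key.slice(0, dot);
}

// Returns the keys whose hash does not appear in any template body.
function findUnusedKeys(objectKeys: string[], templateBodies: string[]): string[] {
  return objectKeys.filter((key) => {
    const hash = assetHashFromKey(key);
    return !templateBodies.some((body) => body.includes(hash));
  });
}

const templates = ['{"Code":{"S3Key":"abc123.zip"}}'];
console.log(findUnusedKeys(['abc123.zip', 'def456.zip'], templates)); // [ 'def456.zip' ]
```

Keeping the check free of AWS calls makes it easy to unit test; listing the bucket and fetching templates would happen around it.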

## **THIS COMMAND WILL DELETE OBJECTS IN YOUR BOOTSTRAPPED S3 BUCKET**

basic garbage collection (immediately deletes objects that are unused):

```bash
cdk gc aws://123456789012/us-east-1 \
  --unstable='gc' \
  --type='s3'
```

garbage collection with a rollback buffer (tags unused objects and deletes them only once they have been tagged for more than the specified number of days):

```bash
cdk gc aws://123456789012/us-east-1 \
  --unstable='gc' \
  --type='s3' \
  --rollback-buffer-days=30
```

garbage collection with a creation buffer (deletes unused objects only if they were created more than this many days ago):

```bash
cdk gc aws://123456789012/us-east-1 \
  --unstable='gc' \
  --type='s3' \
  --created-buffer-days=5
```

garbage collect a specific bootstrap stack:

```bash
cdk gc aws://123456789012/us-east-1 \
  --unstable='gc' \
  --type='s3' \
  --bootstrap-stack-name=cdktest-0lc2i3vebi7-bootstrap-stack
```

before actually deleting your assets, you will be prompted one last time:

```bash
Found 1 objects to delete based off of the following criteria:
- objects have been isolated for > 0 days
- objects were created > 0 days ago

Delete this batch (yes/no/delete-all)?
```

To disable this prompt, specify the `--confirm=false` option.

## Todo in another PR

- [ ] ECR asset collection

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
kaizencc authored Oct 21, 2024
1 parent be4154b commit 0a0e4ad
Showing 11 changed files with 1,940 additions and 17 deletions.
45 changes: 45 additions & 0 deletions packages/@aws-cdk-testing/cli-integ/lib/with-cdk-app.ts
@@ -331,6 +331,30 @@ export interface CdkModernBootstrapCommandOptions extends CommonCdkBootstrapComm
readonly usePreviousParameters?: boolean;
}

export interface CdkGarbageCollectionCommandOptions {
/**
* The amount of days an asset should stay isolated before deletion, to
* guard against some pipeline rollback scenarios
*
* @default 0
*/
readonly rollbackBufferDays?: number;

/**
* The type of asset that is getting garbage collected.
*
* @default 'all'
*/
readonly type?: 'ecr' | 's3' | 'all';

/**
* The name of the bootstrap stack
*
* @default 'CdkToolkit'
*/
readonly bootstrapStackName?: string;
}

export class TestFixture extends ShellHelper {
public readonly qualifier = this.randomString.slice(0, 10);
private readonly bucketsToDelete = new Array<string>();
@@ -464,6 +488,26 @@ export class TestFixture extends ShellHelper {
});
}

public async cdkGarbageCollect(options: CdkGarbageCollectionCommandOptions): Promise<string> {
const args = [
'gc',
'--unstable=gc', // TODO: remove when stabilizing
'--confirm=false',
'--created-buffer-days=0', // Otherwise all assets created during integ tests are too young
];
if (options.rollbackBufferDays) {
args.push('--rollback-buffer-days', String(options.rollbackBufferDays));
}
if (options.type) {
args.push('--type', options.type);
}
if (options.bootstrapStackName) {
args.push('--bootstrap-stack-name', options.bootstrapStackName);
}

return this.cdk(args);
}

public async cdkMigrate(language: string, stackName: string, inputPath?: string, options?: CdkCliOptions) {
return this.cdk([
'migrate',
@@ -634,6 +678,7 @@ async function ensureBootstrapped(fixture: TestFixture) {
CDK_NEW_BOOTSTRAP: '1',
},
});

ALREADY_BOOTSTRAPPED_IN_THIS_RUN.add(envSpecifier);
}

@@ -0,0 +1,202 @@
import { GetObjectTaggingCommand, ListObjectsV2Command, PutObjectTaggingCommand } from '@aws-sdk/client-s3';
import { integTest, randomString, withoutBootstrap } from '../../lib';

jest.setTimeout(2 * 60 * 60_000); // Includes the time to acquire locks, worst-case single-threaded runtime

integTest(
'Garbage Collection deletes unused assets',
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

await fixture.cdkDestroy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});

await fixture.cdkGarbageCollect({
rollbackBufferDays: 0,
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the bootstrap bucket is empty
await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }))
.then((result) => {
expect(result.Contents).toBeUndefined();
});
}),
);

integTest(
'Garbage Collection keeps in use assets',
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

await fixture.cdkGarbageCollect({
rollbackBufferDays: 0,
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the bootstrap bucket has the object
await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }))
.then((result) => {
expect(result.Contents).toHaveLength(1);
});

await fixture.cdkDestroy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Teardown complete!');
}),
);

integTest(
'Garbage Collection tags unused assets',
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

await fixture.cdkDestroy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});

await fixture.cdkGarbageCollect({
rollbackBufferDays: 100, // this will ensure that we do not delete assets immediately (and just tag them)
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the bootstrap bucket has the object and is tagged
await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }))
.then(async (result) => {
expect(result.Contents).toHaveLength(2); // also the CFN template
const key = result.Contents![0].Key;
const tags = await fixture.aws.s3.send(new GetObjectTaggingCommand({ Bucket: bootstrapBucketName, Key: key }));
expect(tags.TagSet).toHaveLength(1);
});
}),
);

integTest(
'Garbage Collection untags in-use assets',
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

// Artificially add tagging to the asset in the bootstrap bucket
const result = await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }));
const key = result.Contents!.filter((c) => c.Key?.split('.')[1] === 'zip')[0].Key; // pick the first object whose key has a `.zip` extension, i.e. the asset
await fixture.aws.s3.send(new PutObjectTaggingCommand({
Bucket: bootstrapBucketName,
Key: key,
Tagging: {
TagSet: [{
Key: 'aws-cdk:isolated',
Value: '12345',
}, {
Key: 'bogus',
Value: 'val',
}],
},
}));

await fixture.cdkGarbageCollect({
rollbackBufferDays: 100, // this will ensure that we do not delete assets immediately (and just tag them)
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the isolated object tag is removed while the other tag remains
const newTags = await fixture.aws.s3.send(new GetObjectTaggingCommand({ Bucket: bootstrapBucketName, Key: key }));

expect(newTags.TagSet).toEqual([{
Key: 'bogus',
Value: 'val',
}]);
}),
);
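
The last test above expects that an in-use asset loses its `aws-cdk:isolated` tag while unrelated tags survive. A minimal sketch of that tag filtering follows; `withoutIsolatedTag` is a hypothetical helper name, and the `Tag` shape only mirrors the `Key`/`Value` pairs used in the test.

```typescript
// Hypothetical sketch: drop only the isolation marker, keep every other tag.
interface Tag {
  Key: string;
  Value: string;
}

const ISOLATED_TAG = 'aws-cdk:isolated';

function withoutIsolatedTag(tagSet: Tag[]): Tag[] {
  return tagSet.filter((tag) => tag.Key !== ISOLATED_TAG);
}

const before: Tag[] = [
  { Key: 'aws-cdk:isolated', Value: '12345' },
  { Key: 'bogus', Value: 'val' },
];
console.log(withoutIsolatedTag(before)); // [ { Key: 'bogus', Value: 'val' } ]
```
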
73 changes: 73 additions & 0 deletions packages/aws-cdk/README.md
@@ -25,6 +25,7 @@ The AWS CDK Toolkit provides the `cdk` command-line interface that can be used t
| [`cdk watch`](#cdk-watch) | Watches a CDK app for deployable and hotswappable changes |
| [`cdk destroy`](#cdk-destroy) | Deletes a stack from an AWS account |
| [`cdk bootstrap`](#cdk-bootstrap) | Deploy a toolkit stack to support deploying large stacks & artifacts |
| [`cdk gc`](#cdk-gc) | Garbage collect assets associated with the bootstrapped stack |
| [`cdk doctor`](#cdk-doctor) | Inspect the environment and produce information useful for troubleshooting |
| [`cdk acknowledge`](#cdk-acknowledge) | Acknowledge (and hide) a notice by issue number |
| [`cdk notices`](#cdk-notices) | List all relevant notices for the application |
@@ -876,6 +877,78 @@ In order to remove that permissions boundary you have to specify the
cdk bootstrap --no-previous-parameters
```

### `cdk gc`

CDK Garbage Collection.

> [!CAUTION]
> CDK Garbage Collection is under development and therefore must be opted in via the `--unstable` flag: `cdk gc --unstable=gc`.

> [!WARNING]
> `cdk gc` currently only supports garbage collecting S3 Assets. You must specify `cdk gc --unstable=gc --type=s3` as ECR asset garbage collection has not yet been implemented.

`cdk gc` garbage collects unused S3 assets from your bootstrap bucket via the following mechanism:

- for each object in the bootstrap S3 Bucket, check to see if it is referenced in any existing CloudFormation templates
- if not, it is treated as unused and gc will either tag it or delete it, depending on your configuration.

The most basic usage looks like this:

```console
cdk gc --unstable=gc --type=s3
```

This will garbage collect S3 assets from the current bootstrapped environment(s) and immediately delete them. Note that, because the default bootstrap S3 Bucket is versioned, deleting an object does not remove its earlier versions right away; their final removal is handled by the lifecycle policy on the bucket.

Before we begin to delete your assets, you will be prompted:

```console
cdk gc --unstable=gc --type=s3

Found X objects to delete based off of the following criteria:
- objects have been isolated for > 0 days
- objects were created > 1 days ago

Delete this batch (yes/no/delete-all)?
```

Since it's quite possible that the bootstrap bucket has many objects, we work in batches of 1000 objects. To skip the
prompt either reply with `delete-all`, or use the `--confirm=false` option.

```console
cdk gc --unstable=gc --type=s3 --confirm=false
```
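
The batching and prompt flow described above can be sketched as follows. This is an illustration under assumptions, not the CLI's code: `toBatches` and `selectForDeletion` are hypothetical names, and the sketch assumes that answering `no` stops processing, which the text above does not specify.

```typescript
type Answer = 'yes' | 'no' | 'delete-all';

// Split items into consecutive batches of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Prompt once per batch until the user answers 'delete-all' or confirm is false.
// Assumption: 'no' stops processing and leaves the remaining batches untouched.
function selectForDeletion<T>(
  items: T[],
  ask: (batch: T[]) => Answer,
  confirm = true,
  batchSize = 1000,
): T[] {
  const selected: T[] = [];
  let skipPrompt = !confirm;
  for (const batch of toBatches(items, batchSize)) {
    const answer: Answer = skipPrompt ? 'yes' : ask(batch);
    if (answer === 'no') break;
    if (answer === 'delete-all') skipPrompt = true;
    selected.push(...batch);
  }
  return selected;
}

const keys = Array.from({ length: 2500 }, (_, i) => `asset-${i}.zip`);
console.log(toBatches(keys, 1000).map((batch) => batch.length)); // [ 1000, 1000, 500 ]
```

Passing `confirm = false` corresponds to `--confirm=false`: the `ask` callback is never invoked.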

If you are concerned about deleting assets too aggressively, there are multiple levers you can configure:

- rollback-buffer-days: this is the number of days an asset has to be marked as isolated before it is eligible for deletion.
- created-buffer-days: this is the number of days an asset must live before it is eligible for deletion.

When using `rollback-buffer-days`, `cdk gc` does not delete unused objects right away; it tags them with
today's date instead. It also checks whether any objects were tagged by previous runs of `cdk gc`
and deletes them if they have been tagged for longer than the buffer days.

When using `created-buffer-days`, we simply filter out any assets that have not persisted that number
of days.

```console
cdk gc --unstable=gc --type=s3 --rollback-buffer-days=30 --created-buffer-days=1
```
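
How the two buffers interact for a single unused asset can be sketched as a small decision function. This is a sketch under assumptions: `decide`, `UnusedAsset`, and the `Decision` values are hypothetical names, and the exact boundary comparisons (`<` and `>`) are inferred from the wording above rather than taken from the implementation.

```typescript
type Decision = 'skip' | 'tag' | 'keep-tagged' | 'delete';

interface UnusedAsset {
  createdAt: Date;
  isolatedSince?: Date; // set if a previous run already tagged the object
}

const DAY_MS = 24 * 60 * 60 * 1000;

function daysBetween(from: Date, to: Date): number {
  return (to.getTime() - from.getTime()) / DAY_MS;
}

function decide(
  asset: UnusedAsset,
  now: Date,
  rollbackBufferDays: number,
  createdBufferDays: number,
): Decision {
  // created-buffer-days: assets that are too young are filtered out entirely.
  if (daysBetween(asset.createdAt, now) < createdBufferDays) return 'skip';
  // No rollback buffer: unused assets are deleted immediately.
  if (rollbackBufferDays === 0) return 'delete';
  // First time this asset is seen as unused: tag it with today's date.
  if (asset.isolatedSince === undefined) return 'tag';
  // Already tagged: delete only once the buffer has elapsed.
  return daysBetween(asset.isolatedSince, now) > rollbackBufferDays ? 'delete' : 'keep-tagged';
}

const now = new Date('2024-10-21T00:00:00Z');
const asset: UnusedAsset = {
  createdAt: new Date('2024-09-01T00:00:00Z'),
  isolatedSince: new Date('2024-09-10T00:00:00Z'),
};
console.log(decide(asset, now, 30, 1)); // delete
```

In the example the asset has been isolated for 41 days, which exceeds the 30-day rollback buffer.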

You can also restrict which actions `cdk gc` performs via the `--action` option. By default, all actions
are performed, but you can specify `print`, `tag`, or `delete-tagged`.

- `print` performs no changes to your AWS account, but finds and prints the number of unused assets.
- `tag` tags any newly unused assets, but does not delete any unused assets.
- `delete-tagged` deletes assets that have been tagged for longer than the buffer days, but does not tag newly unused assets.

```console
cdk gc --unstable=gc --type=s3 --action=delete-tagged --rollback-buffer-days=30
```

This will delete assets that have been tagged as unused for more than 30 days, but will not tag additional assets.
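
The effect of each `--action` value can be summarized as a mapping from action to permitted operations. The sketch below is illustrative only: `permittedOperations` is a hypothetical helper, and `'full'` is used here as an assumed name for the default "all actions" mode, which the text above does not name.

```typescript
// 'full' is an assumed label for the default behavior (all actions performed).
type GcAction = 'print' | 'tag' | 'delete-tagged' | 'full';

function permittedOperations(action: GcAction): { tag: boolean; delete: boolean } {
  return {
    // Tagging newly unused assets happens for 'tag' and the default mode.
    tag: action === 'tag' || action === 'full',
    // Deleting assets happens for 'delete-tagged' and the default mode.
    delete: action === 'delete-tagged' || action === 'full',
  };
}

console.log(permittedOperations('delete-tagged')); // { tag: false, delete: true }
```

With `print`, both flags are false, matching the description that no changes are made to the account.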

### `cdk doctor`

Inspect the current command-line environment and configurations, and collect information that can be useful for