[ETL-654] Clean up before integration test run #121

Merged 13 commits on Jul 2, 2024

BryanFauble
Contributor

Problem:

  1. When an integration test is run multiple times on a namespaced branch (or on the staging namespace), data left over from earlier runs can affect the results.

Solution:

  1. Clean out the input bucket and the JSON intermediate bucket before running the integration test, on both namespaced branches and the main branch.
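
A minimal sketch of what such a pre-test cleanup could look like, not the code merged in this PR. It assumes each namespace's objects live under a `<namespace>/` key prefix in both buckets. The function names, bucket arguments, and prefix layout are hypothetical. `s3_client` is expected to behave like a boto3 S3 client (`get_paginator("list_objects_v2")` and `delete_objects`).

```python
from typing import Any, Dict


def empty_prefix(s3_client: Any, bucket: str, prefix: str) -> int:
    """Request deletion of every object under `prefix`; return how many keys were sent."""
    requested = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if not keys:
            continue
        # A list_objects_v2 page holds at most 1000 keys, which is also the
        # maximum number of keys a single delete_objects call accepts.
        s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
        requested += len(keys)
    return requested


def clean_before_integration_test(
    s3_client: Any, namespace: str, input_bucket: str, intermediate_bucket: str
) -> Dict[str, int]:
    """Empty the namespace's prefix in the input and JSON intermediate buckets."""
    if not namespace:
        # Guard: an empty namespace would match every key in the bucket.
        raise ValueError("namespace must be a non-empty string")
    prefix = f"{namespace}/"  # assumed key layout, hypothetical
    return {
        input_bucket: empty_prefix(s3_client, input_bucket, prefix),
        intermediate_bucket: empty_prefix(s3_client, intermediate_bucket, prefix),
    }
```

With real AWS credentials this would be called as `clean_before_integration_test(boto3.client("s3"), namespace, input_bucket, intermediate_bucket)` before the test starts. The sketch does not inspect the `Errors` field returned by `delete_objects`, and it does not remove old object versions from versioned buckets.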

Testing:

  1. Will verify against namespaced branches first, and then again once merged into the main branch.

@BryanFauble BryanFauble requested a review from a team as a code owner July 2, 2024 19:27
Contributor

@rxu17 rxu17 left a comment

LGTM! Just a couple of comments

.github/workflows/README.md (outdated, resolved)
src/scripts/manage_artifacts/clean_staging.py (outdated, resolved)
sonarqubecloud bot commented Jul 2, 2024

@@ -517,6 +541,28 @@ def glue_crawler_role(namespace):
    role_name = f"{namespace}-pytest-crawler-role"
    glue_service_policy_arn = "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"
    s3_read_policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"

    # Cleanup if the role/policy already exist
Contributor

@rxu17 rxu17 Jul 2, 2024

Oh, were we running into role/policy duplicates or conflicts here when trying to create a role that already exists?

NVM, I see what is happening. If the tests fail abruptly, the role/crawler/database may already have been created but never deleted, because the test failed before reaching the cleanup. The next time the tests run, they can hit an error.

These changes look good to me.

Contributor Author

Exactly! This is to make sure we're always running the test from a known state.
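
The "known state" idea discussed above can be sketched roughly as follows. This is an illustration, not the code from the diff: the helper name `delete_role_if_exists` is hypothetical, and `iam_client` is assumed to behave like a boto3 IAM client (`list_attached_role_policies`, `detach_role_policy`, `delete_role`, and `exceptions.NoSuchEntityException`).

```python
from typing import Any


def delete_role_if_exists(iam_client: Any, role_name: str) -> bool:
    """Remove a role left behind by an earlier run; return False if none existed."""
    try:
        attached = iam_client.list_attached_role_policies(RoleName=role_name)
        # IAM refuses to delete a role that still has managed policies
        # attached, so detach them first.
        for policy in attached["AttachedPolicies"]:
            iam_client.detach_role_policy(
                RoleName=role_name, PolicyArn=policy["PolicyArn"]
            )
        iam_client.delete_role(RoleName=role_name)
    except iam_client.exceptions.NoSuchEntityException:
        # Nothing left over from a previous run; already in the known state.
        return False
    return True
```

A fixture such as `glue_crawler_role` could call a helper like this before creating the role, so a run that crashed before teardown does not break the next one. The sketch reads only the first page of attached policies and does not handle inline policies or instance profiles, which would also block `delete_role`.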

@BryanFauble BryanFauble merged commit 2c3439f into main Jul 2, 2024
17 checks passed
@BryanFauble BryanFauble deleted the etl-654-clean-data branch July 2, 2024 22:22
3 participants