[ETL-674] Add script to compress intermediate JSON #134

philerooski · 2024-07-25T23:41:20Z

Recombines the NDJSON part sets which we wrote to our intermediate bucket and uploads a gzipped NDJSON to an adjacent location.

Data Input

Each part set is located at:

s3://<bucket-name>/<namespace>/json/dataset={dataset}/cohort={cohort}/

where {dataset} is the dataset identifier (datatype), and {cohort} is either 'adults_v1' or 'pediatric_v1'. Each file in the part set is named:

{file_identifier}.part{part_number}.ndjson
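
As an illustration of the naming scheme above, here is a minimal sketch of how part files could be grouped into part sets and ordered by part number. The regular expression and the `group_part_sets` helper are assumptions made for this example and are not taken from the script in this PR.

```python
import re
from collections import defaultdict

# Assumed pattern for {file_identifier}.part{part_number}.ndjson
PART_PATTERN = re.compile(
    r"^(?P<file_identifier>.+)\.part(?P<part_number>\d+)\.ndjson$"
)


def group_part_sets(keys):
    """Map each file identifier to its part keys, sorted by part number.

    Hypothetical helper: keys that do not match the part naming scheme
    are skipped.
    """
    parts = defaultdict(list)
    for key in keys:
        basename = key.rsplit("/", 1)[-1]
        match = PART_PATTERN.match(basename)
        if match is None:
            continue  # not a part file
        part_number = int(match["part_number"])  # numeric sort, so part2 < part10
        parts[match["file_identifier"]].append((part_number, key))
    return {fid: [k for _, k in sorted(items)] for fid, items in parts.items()}
```

Sorting on the integer part number rather than on the key string keeps `part10` after `part2`, which matters if the parts must be recombined in the order they were written.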

Data Output

Each combined and gzipped part set is written to:

s3://<bucket-name>/<namespace>/compressed_json/dataset={dataset}/cohort={cohort}/{file_identifier}.ndjson.gz
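
The recombine-and-compress step can be sketched roughly as follows. This is an in-memory illustration only: the helper names `output_key` and `gzip_parts` are hypothetical, the S3 download and upload are omitted, and adding a trailing newline to a part that lacks one is an assumption, not confirmed behavior of the script.

```python
import gzip
import io


def output_key(namespace, dataset, cohort, file_identifier):
    """Build the object key (without bucket) for a compressed part set."""
    return (
        f"{namespace}/compressed_json/dataset={dataset}/"
        f"cohort={cohort}/{file_identifier}.ndjson.gz"
    )


def gzip_parts(parts):
    """Concatenate NDJSON parts (bytes, already in part order) and gzip them."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
        for part in parts:
            gz.write(part)
            # Assumption: keep one record per line across part boundaries.
            if part and not part.endswith(b"\n"):
                gz.write(b"\n")
    return buffer.getvalue()
```

For example, `gzip_parts([b'{"a": 1}\n', b'{"b": 2}\n'])` returns the bytes of a single gzip stream that decompresses to both records, and `output_key(...)` gives the key under which those bytes would be uploaded.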

sonarqubecloud · 2024-07-26T19:03:19Z

Quality Gate passed

Issues
17 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

philerooski · 2024-07-26T19:17:26Z

@BryanFauble I amended my commit but it seems like SonarCloud ran its analysis on the previous version of the code again.

BryanFauble · 2024-07-26T19:19:40Z

> @BryanFauble I amended my commit but it seems like SonarCloud ran its analysis on the previous version of the code again.

I am not sure whether SonarCloud works properly with amended commits. I suspect that it needs a new commit hash to trigger a fresh analysis.

thomasyu888

🔥 LGTM!

philerooski requested a review from a team as a code owner July 25, 2024 23:41

philerooski temporarily deployed to develop July 25, 2024 23:43 — with GitHub Actions Inactive

philerooski temporarily deployed to develop July 25, 2024 23:44 — with GitHub Actions Inactive

Add script to compress intermediate JSON

49fb99c

philerooski force-pushed the etl-674 branch from 79a6d94 to 49fb99c Compare July 26, 2024 19:03

philerooski temporarily deployed to develop July 26, 2024 19:03 — with GitHub Actions Inactive

philerooski temporarily deployed to develop July 26, 2024 19:05 — with GitHub Actions Inactive

philerooski temporarily deployed to develop July 26, 2024 19:11 — with GitHub Actions Inactive

philerooski temporarily deployed to develop July 26, 2024 19:13 — with GitHub Actions Inactive

philerooski temporarily deployed to develop July 26, 2024 19:14 — with GitHub Actions Inactive

thomasyu888 approved these changes Aug 12, 2024

philerooski merged commit 1f47dfc into main Aug 12, 2024
17 checks passed

philerooski deleted the etl-674 branch August 12, 2024 18:52
