Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dag to create elementary report #506

Merged
merged 33 commits into from
Nov 12, 2024
Merged

Conversation

amishas157
Copy link
Contributor

@amishas157 amishas157 commented Oct 3, 2024

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with the jira ticket associated with the PR.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated the README with the added features, breaking changes, new instructions on how to use the repository.

What

This PR adds a new DAG to generate elementary report and send it to slack. The DAG will be run once a week (on Monday) and generate report for last 7 days.

Why

The obervability report will help us to monitor any pending test failures during the week.

Known limitations

The edr command is lil wonky with memory requirements. It is a known issue and observed by many community members as well.
elementary-data/elementary#761
Even though above is closed, but it is not fully resolved. Also, it requires high ephemeral storage since it downloads all the data and then compute the report.

Testing:

  • ✅ It takes 7-8 min to finish generating the report
  • Screenshot 2024-11-12 at 1 35 19 PM
  • ❓ High memory usage check from gcloud
    Screenshot 2024-11-12 at 1 27 50 PM

@amishas157 amishas157 requested a review from a team as a code owner October 3, 2024 23:47
@amishas157 amishas157 force-pushed the patch/elementary-report-dag branch from 344fe8e to 192fa70 Compare October 4, 2024 02:20
update

update
@amishas157 amishas157 force-pushed the patch/elementary-report-dag branch from 192fa70 to f60fb74 Compare October 4, 2024 02:21
@amishas157 amishas157 force-pushed the patch/elementary-report-dag branch from 202655e to 8260234 Compare October 4, 2024 02:50
Copy link
Contributor

@sydneynotthecity sydneynotthecity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the implications of running this DAG hourly. Will the report only contain data for the past day? Is it only sent if there are failures?

dags/elementary_report_dag.py Outdated Show resolved Hide resolved
@amishas157 amishas157 marked this pull request as draft October 4, 2024 17:01
@@ -124,7 +124,7 @@
"partnership_assets__account_holders_activity_fact": false,
"partnership_assets__asset_activity_fact": false
},
"dbt_image_name": "stellar/stellar-dbt:53375b5f9",
"dbt_image_name": "stellar/stellar-dbt-dev:dba580d7c",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Update this image once final dbt image is created. Also update corresponding var in prod

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -288,21 +288,28 @@
"dbt": {
"requests": {
"cpu": "1",
"ephemeral-storage": "500Mi",
"ephemeral-storage": "1Gi",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the default value is 1Gi, do we even need to set for dbt and default tasks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, good to 🔪

@amishas157 amishas157 marked this pull request as ready for review November 12, 2024 19:44
Copy link
Contributor

@sydneynotthecity sydneynotthecity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one suggestion but looks good. We won't need to upgrade worker size, correct?

@@ -288,21 +288,28 @@
"dbt": {
"requests": {
"cpu": "1",
"ephemeral-storage": "500Mi",
"ephemeral-storage": "1Gi",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the default value is 1Gi, do we even need to set for dbt and default tasks?

default_args=get_default_dag_args(),
start_date=datetime(2024, 11, 11, 0, 0),
description="This DAG creates elementary report and send it to slack",
schedule="0 0 * * MON", # Runs every Monday
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL you can specify day of week by abbreviation instead of number 🙃

I suggest shifting this schedule back to 2 or 3 UTC so that it captures the latest dbt_stellar_marts execution. Otherwise this data will be 23 hours stale for the marts tables specifically

@amishas157
Copy link
Contributor Author

one suggestion but looks good. We won't need to upgrade worker size, correct?

No, we don't need to

@amishas157 amishas157 merged commit 48c8e63 into master Nov 12, 2024
6 checks passed
@sydneynotthecity sydneynotthecity deleted the patch/elementary-report-dag branch November 21, 2024 03:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants