HUBBLE 444 - Refactor Elementary monitoring to run every 30 min #379

Closed
wants to merge 17 commits into from

@edualvess edualvess commented Jun 11, 2024

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with the jira ticket associated with the PR.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated the README with the added features, breaking changes, new instructions on how to use the repository.

What

The elementary_slack_alert_dbt_sdf_marts task was moved into a separate DAG that runs independently of the dbt models, after a set of data quality tests, as defined in Alerting for dbt. This way, problems are detected and alerts are triggered in a timely manner. The DAG is scheduled to run every 30 minutes, which aligns with the dbt_enriched_base_tables DAG.
The dbt tests that run in the new DAG use a new tag intended to cover all the unit tests related to the models' data quality.

Why

Elementary monitoring and alerting currently run at the very end of the dbt DAGs. Because the alerting depends on the DAG execution status, Elementary never alerts on a data quality issue when an upstream dbt model fails. As a result, alerts arrive late, or only after an issue has already been resolved.

Known limitations

This PR must wait for the singular_test tag to be deployed for the dbt project.

default_args=get_default_dag_args(),
start_date=datetime(2024, 6, 11, 0, 0),
description="This DAG runs dbt tests and Elementary alerts at a half-hourly cadence",
schedule_interval="*/30 * * * *",
Contributor

Can we change the interval to */15,*/45 * * * *?

Contributor Author

Sure, it makes much more sense to run in-between the dbt dags.
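
A note on the suggested expression: in standard cron syntax, a comma-separated minute field is the union of its parts, so `*/15,*/45` would match minutes 0, 15, 30 and 45 rather than only 15 and 45. The sketch below illustrates this under plain five-field cron semantics. The helper `expand_minute_field` is hypothetical (it is not part of this repository or of Airflow) and handles only `*`, `*/n` and literal minutes. If the goal is to run in between the dbt DAGs, `15,45 * * * *` may be closer to what is intended.

```python
from typing import List


def expand_minute_field(field: str) -> List[int]:
    """Expand a cron minute field into the sorted list of minutes it matches.

    Simplified for illustration: supports "*", "*/n" and literal minutes
    joined by commas. Ranges such as "10-20" are not handled.
    """
    minutes = set()
    for part in field.split(","):
        if part == "*":
            minutes.update(range(60))
        elif part.startswith("*/"):
            step = int(part[2:])
            # "*/n" starts at minute 0 and steps by n within the hour.
            minutes.update(range(0, 60, step))
        else:
            minutes.add(int(part))
    return sorted(minutes)


print(expand_minute_field("*/30"))       # [0, 30]
print(expand_minute_field("*/15,*/45"))  # [0, 15, 30, 45]
print(expand_minute_field("15,45"))      # [15, 45]
```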

task_name = f"{task_name}_with_exclude"
args.append("--exclude")
if isinstance(excluded, list):
    args.append(",".join(excluded))
Contributor

Is this supposed to be a comma or a space? I think a comma means the intersection of the items in the excluded list, whereas a space means both are excluded.

Contributor Author

You're right, it should be space-separated so that the arguments are combined as a union; I'm fixing it in the next commit.
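
For reference, the change discussed in this thread could look roughly like the sketch below. In dbt's node selection syntax, space-delimited arguments are combined as a union and comma-delimited ones as an intersection. This is a minimal sketch, not the code from this PR: the function name `with_exclude` and its signature are hypothetical, and it assumes the arguments reach dbt as a list (argv style), so each selector becomes its own element.

```python
from typing import List, Optional, Sequence, Tuple, Union


def with_exclude(
    task_name: str,
    args: Sequence[str],
    excluded: Optional[Union[str, Sequence[str]]],
) -> Tuple[str, List[str]]:
    """Return the task name and dbt args with an optional --exclude clause.

    Hypothetical helper for illustration only.
    """
    new_args = list(args)
    if not excluded:
        return task_name, new_args

    # Accept a single selector string or a list of selectors.
    selectors = [excluded] if isinstance(excluded, str) else list(excluded)

    new_args.append("--exclude")
    # One argv element per selector: dbt treats these as a union,
    # so every listed selector is excluded.
    new_args.extend(selectors)
    return f"{task_name}_with_exclude", new_args


name, cli_args = with_exclude("dbt_test", ["test"], ["tag:a", "tag:b"])
print(name)      # dbt_test_with_exclude
print(cli_args)  # ['test', '--exclude', 'tag:a', 'tag:b']
```

If the arguments are instead joined into a single shell command string, `" ".join(excluded)` would give the same union behavior.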

@edualvess edualvess marked this pull request as ready for review June 20, 2024 18:36
@edualvess edualvess requested a review from a team as a code owner June 20, 2024 18:36
@edualvess edualvess closed this Jun 27, 2024
@edualvess edualvess deleted the feature/add_dbt_data_quality_alerting_dag branch July 1, 2024 12:25