HUBBLE 444 - Refactor Elementary monitoring to run every 30 min #379
dags/dbt_data_quality_alerts_dag.py
default_args=get_default_dag_args(),
start_date=datetime(2024, 6, 11, 0, 0),
description="This DAG runs dbt tests and Elementary alerts at a half-hourly cadence",
schedule_interval="*/30 * * * *",
Can we change the interval to `*/15,*/45 * * * *`?
Sure, it makes much more sense to run in-between the dbt dags.
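A note on the cron expression, offered as a sketch rather than as the change that was actually committed: assuming the intent is to fire at minutes 15 and 45 of each hour (between half-hourly dbt runs), the minute field would be `15,45`. Under standard cron step semantics, `*/15,*/45` is the union of "every 15th minute" and "every 45th minute", which works out to every quarter hour. The helper below is a hypothetical, simplified expander for the minute field only; it is not part of Airflow or of this repository.

```python
def expand_minute_field(field: str) -> list[int]:
    """Expand a cron minute field into the sorted list of minutes it matches.

    Simplified illustration: handles '*', 'a-b', single values, '/step',
    and comma-separated lists. Not a full cron parser.
    """
    minutes: set[int] = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step_value = int(step) if step else 1
        if base == "*":
            start, end = 0, 59
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = int(base)
            # A bare value with a step (e.g. "5/15") is treated as running to 59.
            end = 59 if step else start
        minutes.update(range(start, end + 1, step_value))
    return sorted(minutes)


print(expand_minute_field("*/30"))       # [0, 30]
print(expand_minute_field("*/15,*/45"))  # [0, 15, 30, 45]
print(expand_minute_field("15,45"))      # [15, 45]
```

If the goal is to land between runs that start on the hour and half hour, `15,45 * * * *` is the expression that matches only those two minutes.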
task_name = f"{task_name}_with_exclude"
args.append("--exclude")
if isinstance(excluded, list):
args.append(",".join(excluded))
Is this supposed to be a comma or a space? I think the comma means it is the intersection of the items in the excluded list whereas space means both are excluded
You're right, it should be space-separated to provide union for the arguments; I'm fixing it in the next commit.
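For reference, a minimal sketch of what the space-separated form could look like. The function name `build_exclude_args` and the selector values are hypothetical, and the sketch assumes the argument list is passed to dbt as separate argv elements rather than joined into a single shell string. In dbt's node selection syntax, selectors separated by spaces are combined as a union, while selectors joined by commas are combined as an intersection.

```python
def build_exclude_args(excluded):
    """Return dbt CLI arguments that exclude the union of the given selectors.

    `excluded` may be None, a single selector string, or a list of selectors.
    Hypothetical helper for illustration; not the code from this PR.
    """
    if not excluded:
        return []
    if isinstance(excluded, str):
        excluded = [excluded]
    # One argv element per selector is equivalent to space-separated selectors
    # on the command line, which dbt treats as a union.
    return ["--exclude", *excluded]


# Hypothetical selectors, shown only to contrast the two forms.
selectors = ["tag:example_a", "tag:example_b"]

print(build_exclude_args(selectors))
# ['--exclude', 'tag:example_a', 'tag:example_b']  (union: both are excluded)

print(["--exclude", ",".join(selectors)])
# ['--exclude', 'tag:example_a,tag:example_b']  (intersection: only nodes matching both)
```

If the arguments are instead concatenated into one command string before execution, joining the selectors with `" ".join(excluded)` would have the same effect.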
What
The `elementary_slack_alert_dbt_sdf_marts` task was moved into a separate DAG that executes independently of the dbt models, after some data quality tests, as defined in Alerting for dbt. This way, we will detect problems and trigger alerts in a timely manner. The DAG is scheduled to run every 30 minutes, which aligns with the `dbt_enriched_base_tables` DAG.
The dbt tests running in the new DAG use a new tag designed to cover all the unit tests related to the models' data quality.
Why
Elementary monitoring and alerting currently run at the very end of the dbt DAGs. If an upstream dbt model fails, Elementary never alerts on a data quality issue, because the alerting depends on the DAG's execution status. As a result, alerts arrive late or only after an issue has already been resolved.
Known limitations
This PR must wait for the `singular_test` tag to be deployed for the dbt project.