dbt Elementary and tests configuration #11
Conversation
Yes, if you can send me those failures, that would be great. Feel free to send them through Slack or email.
A couple of nits need to be addressed before approving.

For the src_* tables, it might also be worthwhile to investigate how the query performs with batch_insert_ts as the freshness timestamp column. I have no idea whether it will do a full table scan, since that column is not the partition column, but I would like confirmation that it will not scan the entire table.

There are a couple of other tables where the configured date column is not the partition key. Given how large these tables are, can you confirm that BQ is not processing all bytes in the table when running the queries? If it is, the columns will need to be updated to the partition column (see the freshness sketch after this comment).
Similar to this PR, I think it would be nice to have some volume alerts too, for when the volume of the data drastically increases or decreases over a timespan (see the second sketch below). If you are able to implement volume alerts in the other PR, please also apply them here.
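For the partition-column question, here is a minimal sketch of a dbt source freshness config keyed off the partition column instead of batch_insert_ts. The source name, table name, and thresholds are hypothetical; only batch_run_date comes from this thread:

```yaml
sources:
  - name: crypto_stellar              # hypothetical source name
    tables:
      - name: src_accounts            # hypothetical src_* table
        loaded_at_field: batch_run_date   # partition column instead of batch_insert_ts
        freshness:
          warn_after: {count: 24, period: hour}
          error_after: {count: 48, period: hour}
```

Since source freshness boils down to selecting the max of loaded_at_field, pointing it at the partition column should let BigQuery prune partitions rather than scan the whole table, though that is worth confirming with a dry run.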
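And a hedged sketch of what a volume alert could look like with Elementary's volume_anomalies test, assuming the package is installed; the model name and bucket settings are illustrative:

```yaml
models:
  - name: history_transactions        # hypothetical model name
    tests:
      - elementary.volume_anomalies:
          timestamp_column: batch_run_date   # partition column, as discussed above
          time_bucket:
            period: day
            count: 1
```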
```yaml
- relationships:
    to: ref('stg_history_transactions')
    field: transaction_id

- name: type
  description: '{{ doc("type") }}'
  tests:
    - not_null
```
Could we add a test that confirms that `type <= 26`?
Implemented through the dbt_utils test `expression_is_true` (see the sketch below).
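For reference, a minimal sketch of the column-level dbt_utils usage, reusing the column and doc block from the diff above:

```yaml
- name: type
  description: '{{ doc("type") }}'
  tests:
    - not_null
    - dbt_utils.expression_is_true:
        expression: '<= 26'   # at column level, evaluated against the type column
```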
…tion column and added description to the tests
Once these tests are approved, I'll move forward with configuring the Slack alerts as requested.
…s and dbt utils recency for models.
…nto feature/sources_tests_updates
Wanted to confirm this before approving.
Was this ever tested or confirmed? If it does cause problems, the current batch_run_date/closed_at will work.
From what I remember, using batch_insert_ts didn't process much more than batch_run_date, due to the query Elementary builds to analyze the metrics. I compared two tests configured for the same table (account_signers) using different columns, and both processed the same amount of data. In any case, I changed the timestamp column to the partition column for all tables (basically batch_run_date and closed_at), as sketched below. Let me know if this makes sense to you as well. Thank you!
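A hedged sketch of what that change might look like on one of the Elementary tests; the table and column names come from this thread, but the specific test and its parameters are illustrative:

```yaml
models:
  - name: account_signers
    tests:
      - elementary.table_anomalies:
          timestamp_column: batch_run_date   # partition column, replacing batch_insert_ts
          table_anomalies:
            - row_count
            - freshness
```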
Initial commit for reviewing the public sources and marts test configurations.
Confirm that the period defined for testing is as expected. Also confirm that the date column chosen (closed_at, batch_insert_ts, etc.) as the reference for the defined periods is correct; a recency sketch follows below.
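For the period/date-column pairing, here is a minimal sketch using dbt_utils.recency; the model name is taken from the ref() in the diff above, while the datepart and interval are assumptions:

```yaml
models:
  - name: stg_history_transactions
    tests:
      - dbt_utils.recency:
          datepart: day
          field: closed_at      # date column used as the reference for the period
          interval: 1           # fail if the newest row is more than 1 day old
```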
While running `dbt test` on the configured sources, there were a few test failures. If you want more details and the list of failures, @sydneynotthecity, just let me know and I can provide them to you.

Also, there are a few docs that need updates. If you provide the descriptions, I can update them.