Move tests from scheduled queries / business queries to DBT #83

amishas157 · 2024-09-03T10:52:56Z

PR Checklist

PR Structure

This PR has reasonably narrow scope (if not, break it down into smaller PRs).
This PR avoids mixing refactoring changes with feature changes (split into two PRs
otherwise).
This PR's title starts with the Jira ticket associated with the PR.

Thoroughness

This PR adds tests for the most critical parts of the new functionality or fixes.
I've updated the docs and README with the added features, breaking changes, new instructions on how to use the repository.

Release planning

I've decided if this PR requires a new major/minor/patch version according to
semver, and I've changed the name of the BRANCH to release/*, feature/* or patch/*.

What

This PR:

Changes the severity of existing tests from warning to error, to make them consistent with the tests from scheduled queries
Adds a few more tests for column checks, based on the list of scheduled queries
Adds the tests from scheduled queries as generic tests

Why

To centralize the data quality tests

Related PR: https://github.com/stellar/stellar-dbt/pull/227

Known limitations

None

chowbao · 2024-09-11T18:54:42Z

models/marts/history_assets.yml

          meta:
            description: "Monitors the freshness of your table over time, as the expected time between data updates."
+      - incremental_unique_combination_of_columns:
+          combination_of_columns:
+            - batch_run_date


I'm not sure batch_run_date should be included. My assumption was that the history_assets table is unique on asset_code, asset_issuer, and asset_type; otherwise we would have "duplicate assets", where each asset would appear with multiple batch_run_dates.

Addressed in 6bd6abf
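
The uniqueness assumption in this thread can be checked directly. A minimal sketch, assuming the model is named history_assets (inferred from the YAML filename) and exposes the three columns named in the comment. This is plain dbt-style SQL for illustration, not the repository's incremental_unique_combination_of_columns macro:

```sql
-- Illustrative check only: returns one row per asset key that appears
-- more than once. A dbt test fails when its query returns any rows.
select
    asset_code
    , asset_issuer
    , asset_type
    , count(*) as row_count
from {{ ref('history_assets') }}
group by
    asset_code
    , asset_issuer
    , asset_type
having count(*) > 1
```

If batch_run_date were part of the key, the same asset loaded in two batches would not be reported here, which is the "duplicate assets" case the comment describes.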

chowbao · 2024-09-11T18:57:39Z

models/sources/src_accounts.yml

@@ -14,7 +14,6 @@ sources:
          - incremental_unique_combination_of_columns:
              combination_of_columns:
                - account_id
-                - sequence_number


I think sequence_number should be included. IIRC, the src/staging accounts table should record an account at every sequence_number where the account changed in that ledger sequence.

Got it, I will keep it then. It was not present in the scheduled query, and I treated that as the source of truth.

Addressed in 6bd6abf

chowbao · 2024-09-11T19:03:27Z

tests/bucketlist_db_size_check.sql

@@ -0,0 +1,17 @@
+{{ config(
+    severity="error"
+    , tags=["singular_test"]


FYI, this tag runs every 30 minutes in Airflow. That is a higher frequency than the cloud function and scheduled query tests used to run at, which is good.

Just mentioning this in case we get noisy alerts, where we might want to adjust the query and/or the frequency at which the tests run (possibly with a separate dbt tag).
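
A sketch of the separate-tag idea, should it be needed. The tag name singular_test_low_frequency is hypothetical and does not exist in the repository; a test carrying it could then be scheduled on its own with dbt's tag selector, for example `dbt test --select tag:singular_test_low_frequency`:

```sql
-- Hypothetical tag name, for illustration only. Swapping the tag moves
-- the test out of the 30-minute "singular_test" run.
{{ config(
    severity="error"
    , tags=["singular_test_low_frequency"]
) }}
```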

chowbao · 2024-09-11T19:05:53Z

tests/bucketlist_db_size_check.sql

+  select sequence,
+    closed_at,
+    total_byte_size_of_bucket_list / 1000000000 as bl_db_gb
+  from {{ source('crypto_stellar', 'history_ledgers') }}


I think this should be a ref('stg_history_ledgers') instead of a source. Otherwise these tests would be hardcoded to just prod, right?

Won't the test still use crypto_stellar.history_ledgers instead of test_crypto_stellar.history_ledgers?

Example:

stellar-dbt-public/models/staging/stg_history_ledgers.sql

Lines 6 to 12 in 2e63a8a

    with
        raw_table as (
            select *
            from {{ source('crypto_stellar', 'history_ledgers') }}
        )

        , history_ledgers as (

It will use the test project/dataset because the source for staging tables is overridden by the dbt_project.yml in the private dbt repo:
https://github.com/stellar/stellar-dbt/blob/master/dbt_project.yml#L63-L68

+project: "{% if target.name == 'prod' %} crypto-stellar {% else %} {{ target.project }} {% endif %}"

I don't think generic tests have such an override defined. Technically, you could add a source override for generic tests to dbt_project.yml. My preference would be to change the source in the generic tests to a ref, because it is cleaner: you only need to define a single override for the staging tables instead of two (staging + generic tests).

Got it. Yes, agree in that case we should just use ref

Addressed in 74dad5a
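
A sketch of how the test could look after the change agreed in this thread, assuming stg_history_ledgers exposes the same column names as the source table. The CTE name and the threshold in the final filter are hypothetical, since the full 17-line test is not shown here:

```sql
{{ config(
    severity="error"
    , tags=["singular_test"]
) }}

-- Reading through the staging model means the single source override in
-- dbt_project.yml decides which project and dataset are queried.
with bucketlist_db_size as (
    select
        sequence
        , closed_at
        , total_byte_size_of_bucket_list / 1000000000 as bl_db_gb
    from {{ ref('stg_history_ledgers') }}
)

select *
from bucketlist_db_size
-- Hypothetical threshold, for illustration only.
where bl_db_gb >= 12
```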

chowbao · 2024-09-11T19:06:39Z

tests/eho_by_ops.sql

+  FROM {{ source('crypto_stellar', 'history_operations') }} op
+  LEFT OUTER JOIN {{ ref('enriched_history_operations') }} eho


Same comment here

I think this should be a ref('stg_history_operations') instead of a source. Otherwise these tests would be hardcoded to just prod, right?

Edit: also, in this case there would be a mismatch between the data if run in test, because the query also uses ref('enriched_history_operations').

Addressed in 74dad5a

chowbao · 2024-09-12T17:56:43Z

tests/ledger_sequence_increment.sql

@@ -11,7 +11,7 @@ with
            , batch_id
            , closed_at
            , max(sequence) as max_sequence
-        from {{ source('crypto_stellar', 'history_ledgers') }}
+        from {{ ref('stg_history_ledgers') }}


amishas157 added 5 commits September 3, 2024 16:22

Move tests from scheduled queries

88d0fed

Move tests from scheduled queries and set freshness check to error in…

a05afc4

…stead of warning

Update generic tests

8dc3b54

remove trailing whitespace

acfc893

Move adhoc business queries

c62a947

lint

amishas157 force-pushed the patch/hubble-520/move-scheduled-queries-to-dbt-tests branch from eed3dec to c62a947 Compare September 11, 2024 09:13

amishas157 added 2 commits September 11, 2024 20:38

remove semicolon

a86ea70

Fix the reference

228f0dc

amishas157 changed the title ~~Move tests from scheduled queries~~ Move tests from scheduled queries / business queries to DBT Sep 11, 2024

amishas157 marked this pull request as ready for review September 11, 2024 17:52

amishas157 requested a review from a team as a code owner September 11, 2024 17:52

chowbao reviewed Sep 11, 2024

amishas157 added 2 commits September 12, 2024 16:02

feedback

6bd6abf

Use staging tables in test instead of source to handle test env

74dad5a

chowbao reviewed Sep 12, 2024

chowbao approved these changes Sep 12, 2024

amishas157 merged commit 0425e26 into master Sep 12, 2024
3 checks passed

sydneynotthecity deleted the patch/hubble-520/move-scheduled-queries-to-dbt-tests branch November 14, 2024 16:40
