Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unnecessary warning for exposure_schema_validity test #1708

Open
data-blade opened this issue Sep 27, 2024 · 2 comments
Open

unnecessary warning for exposure_schema_validity test #1708

data-blade opened this issue Sep 27, 2024 · 2 comments
Labels
Bug Something isn't working Triage 👀

Comments

@data-blade
Copy link

data-blade commented Sep 27, 2024

Describe the bug
given exposure_a and exposure_b,

each with just model_x in their depends_on,

each only specifying column_name.

running exposure_schema_validity will yield this warning for each exposure and column

Warning - missing node property for the exposure: <exposure> which not the only exposure depending on <model>, We're not able to verify the column level dependencies of this exposure

To Reproduce
create a simple setup with exposure_a, exposure_b and model_x as described above

Expected behavior
no warning, since model_x is the only model in the depends_on of each exposure, therefore no node property should be needed

Screenshots
n/a

Environment

  • Elementary CLI (edr) version: 0.15.1
  • Elementary dbt package version: 0.15.2
  • dbt version you're using: 1.8.7
  • Data warehouse: databricks
  • Infrastructure details: n/a

Additional context
the test passes despite the warning

Would you be willing to contribute a fix for this issue?
if time allows

@data-blade data-blade added Bug Something isn't working Triage 👀 labels Sep 27, 2024
@haritamar
Copy link
Collaborator

Hi @data-blade !
Thanks for opening this issue, it indeed looks to me like a bug - it seems that what we're currently checking is whether or not there is more than one exposure depending on the same node, but I'm not sure why that is the check.

It may take us a bit of time to get to so if you're willing to contribute a fix will be happy to point to give pointers on where to look.

@data-blade
Copy link
Author

data-blade commented Jan 30, 2025

@haritamar's observation is correct and hints at the root cause.

but since this bug never received any love and we're now automating our test coverage, we have to replace broken elementary with proprietary tests. here it is, if anybody is interested

note this assumes node: <model> instead of node: ref('<model>')

{% test expect_exposure_columns_to_exist(model) %}

	{{ config(
		store_failures = true,
  		enabled = target.name != 'prod',
	) }}

	{% if execute %}
		{% set all_nodes = graph.nodes.keys() %}
		{% set node = graph.nodes['model.'~project_name~'.'~model.identifier] %}
		{% set exposures = graph.exposures.values() %}
	{% endif %}
	{% set exposure_columns = [] %}
	{% for exposure in exposures %}
		{% if node.unique_id in exposure.depends_on.nodes %}
			{% for column in exposure.meta.referenced_columns %}
				{# detect unknown models #}
				{% if column.node and 'model.'~project_name~'.'~column.node not in all_nodes %}
					{{ exceptions.raise_compiler_error("UNKNOWN NODE: '" ~ exposure.name ~ "' depends on unknown model '" ~ column.node ~ "'") }}
				{% endif %}
				{# set the column's parent model #}
				{% if exposure.depends_on.nodes|length == 1 %}
					{% set node_ref = node.name %}
				{% else %}
					{# detect missing but required node parameters #}
					{% if not column.node %}
						{{ exceptions.raise_compiler_error("MISSING NODE: '" ~ exposure.name ~ "' depends on multiple models, but '" ~ column.column_name ~ "' has no 'node' parameter") }}
					{% endif %}
					{% set node_ref = column.node %}
				{% endif %}
				{# validate if expected column is missing from the model #}
				{% if node_ref == node.name and column.column_name not in node.columns %}
					{% do exposure_columns.append([column.column_name, exposure.name]) %}
				{% endif %}
			{% endfor %}
		{% endif %}
	{% endfor %}

	{% if exposure_columns %}
		{% for col, exp in exposure_columns %}
			select '{{ col }}' as col, '{{ exp }}' as exp
			{{ 'union all' if not loop.last }}
		{% endfor %}
	{% else %}
		select null limit 0
	{% endif %}

{% endtest %}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Triage 👀
Projects
None yet
Development

No branches or pull requests

2 participants