-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compare_column_values not working with Athena/Presto #87
Comments
I still have this issue in version 0.11.0 |
The root cause is that the group by clause does not allow to use the alias. See presto discussion A workaround solution is to add a macro at replace {% macro athena__compare_column_values(a_query, b_query, primary_key, column_to_compare, emojis, a_relation_name, b_relation_name) -%}
with a_query as (
{{ a_query }}
),
b_query as (
{{ b_query }}
),
joined as (
select
coalesce(a_query.{{ primary_key }}, b_query.{{ primary_key }}) as {{ primary_key }},
a_query.{{ column_to_compare }} as a_query_value,
b_query.{{ column_to_compare }} as b_query_value,
case
when a_query.{{ column_to_compare }} = b_query.{{ column_to_compare }} then '{% if emojis %}✅: {% endif %}perfect match'
when a_query.{{ column_to_compare }} is null and b_query.{{ column_to_compare }} is null then '{% if emojis %}✅: {% endif %}both are null'
when a_query.{{ primary_key }} is null then '{% if emojis %}🤷: {% endif %}missing from {{ a_relation_name }}'
when b_query.{{ primary_key }} is null then '{% if emojis %}🤷: {% endif %}missing from {{ b_relation_name }}'
when a_query.{{ column_to_compare }} is null then '{% if emojis %}🤷: {% endif %}value is null in {{ a_relation_name }} only'
when b_query.{{ column_to_compare }} is null then '{% if emojis %}🤷: {% endif %}value is null in {{ b_relation_name }} only'
when a_query.{{ column_to_compare }} != b_query.{{ column_to_compare }} then '{% if emojis %}❌: {% endif %}values do not match'
else 'unknown' -- this should never happen
end as match_status,
case
when a_query.{{ column_to_compare }} = b_query.{{ column_to_compare }} then 0
when a_query.{{ column_to_compare }} is null and b_query.{{ column_to_compare }} is null then 1
when a_query.{{ primary_key }} is null then 2
when b_query.{{ primary_key }} is null then 3
when a_query.{{ column_to_compare }} is null then 4
when b_query.{{ column_to_compare }} is null then 5
when a_query.{{ column_to_compare }} != b_query.{{ column_to_compare }} then 6
else 7 -- this should never happen
end as match_order
from a_query
full outer join b_query on a_query.{{ primary_key }} = b_query.{{ primary_key }}
),
aggregated as (
select
'{{ column_to_compare }}' as column_name,
match_status,
match_order,
count(*) as count_records
from joined
group by '{{ column_to_compare }}', match_status, match_order
)
select
column_name,
match_status,
count_records,
round(100.0 * count_records / sum(count_records) over (), 2) as percent_of_total
from aggregated
order by match_order
{% endmacro %} |
This was referenced Aug 5, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug
Function compare_column_values not working on Athena/Presto.
Steps to reproduce
Create test in Presto environment:
Log output
COLUMN_NOT_FOUND: line 59:10: Column 'column_name' cannot be resolved or requester is not authorized to access requested resources
System information
dbt-labs/audit_helper version: 0.9.0
Which database are you using dbt with?
Additional context
Presto does not support grouping by aliases.
Line 53 of
compare_column_values.sql
:group by column_name
should be removed in order to work with PrestoAre you interested in contributing the fix?
I can try making a PR
The text was updated successfully, but these errors were encountered: