Fix/country names #246
… fix/country_names
…with the country reference data so that Global can create standardized groupings.
Left a few notes to consider; otherwise LGTM!
    from {{ref('seed_country_standardizations')}}
),

combined as (
I can see an argument for having this in user_geos, but if we wanted to keep the principle of staging tables only referencing a single model, we could integrate the country mappings in dim_users instead. This data is not surfaced to the end user in the staging model, so I think it might make more sense if this logic was in the dimensional layer.
The user_geos table feeds dim_student_projects, not just dim_users. These are corrections we want to apply universally and they won't vary for different models, so I'd rather do them as upstream as possible. To keep the idea of staging referring to a single input model, would it make more sense to you to just do the corrections as code directly in the staging table? There aren't a lot (<15) but that number could grow over time.
Gotcha, I think how you have it makes sense then. The only other way I can think of doing it is adding a dim_user_geos model, rather than having only a staging model, and then changing all the downstream references. Thoughts?
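For reference, the seed-based lookup discussed in this thread might look roughly like the sketch below. Only `seed_country_standardizations` and the `combined` CTE name come from the diff; the upstream model `base_user_geos` and every column name are assumptions made for illustration, not the actual schema.

```sql
-- Sketch of a staging model that joins the country seed (names are assumed).
with user_geos as (
    select *
    from {{ ref('base_user_geos') }}  -- hypothetical upstream model
),

country_standardizations as (
    select
        raw_country_name,           -- hypothetical column: name as it appears in source data
        standardized_country_name   -- hypothetical column: corrected name
    from {{ ref('seed_country_standardizations') }}
),

combined as (
    select
        user_geos.user_id,          -- hypothetical column
        -- Fall back to the original value when the seed has no correction.
        coalesce(
            country_standardizations.standardized_country_name,
            user_geos.country
        ) as country
    from user_geos
    left join country_standardizations
        on lower(user_geos.country) = lower(country_standardizations.raw_country_name)
)

select * from combined
```

The left join keeps every row from `user_geos`, so countries without an entry in the seed pass through unchanged. This version is also what makes the staging model reference two inputs, which is the concern raised above.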
… fix/country_names
- Converts the seed file lookup for country name edits to a macro
- Applies macro to HOC + user_geos
- Adds additional columns to dim_country_reference
- Docs
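A macro-based version of the country name corrections could be sketched as follows. The macro name `standardize_country_name`, the upstream model `base_user_geos`, the column names, and the two mappings shown are all hypothetical placeholders; the real corrections come from the spreadsheet linked in the description and are not reproduced here.

```sql
-- macros/standardize_country_name.sql (hypothetical file and macro name)
{% macro standardize_country_name(column_name) %}
    case lower(trim({{ column_name }}))
        {# Illustrative mappings only; the real list lives in the linked sheet. #}
        when 'usa' then 'United States'
        when 'u.s.' then 'United States'
        {# Anything not listed passes through unchanged. #}
        else {{ column_name }}
    end
{% endmacro %}

-- Usage inside a staging model such as user_geos (columns are assumed):
select
    user_id,
    {{ standardize_country_name('country') }} as country
from {{ ref('base_user_geos') }}
```

Because the macro expands to a plain `case` expression at compile time, the staging model still references a single upstream model, and the same corrections can be reused in other models without a join.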
Description
Fix to standardize country names in user_geos according to Martina's sheet: https://docs.google.com/spreadsheets/d/1UB3flfF2brnsWcSdp5khbrGf3Mp1PA7t9GgQMSZrENE/edit?gid=1271278978#gid=1271278978
Secondary goals:
Links
Jira ticket(s): https://codedotorg.atlassian.net/browse/DATAOPS-1046
Testing story
e.g.
- `not_null`
- `unique`
- `dbt_utils.unique_combination_of_columns`: ["value","value","value"...]
Note: when submitting a new model for review please make sure the following have been tested:
- `dbt build -m 'your_model'` (or: has the dbt Cloud job succeeded?)
- `dbt run -m 'your_model'`
- `select 1 from 'your_model'`
Privacy
i.
ii.
iii.
PR Checklist:
--> Note: if these are not all checked, the PR will be sent back.
- … .yml.
- … (…, did `dbt docs generate` succeed?)
- `dbt docs` has been updated successfully on GitHub Pages
- … (`chore/`, `feature/`, `fix/`)