-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User Export #317
base: main
Are you sure you want to change the base?
User Export #317
Conversation
@adamribaudo-velir, while I've verified that these work locally, I've only verified with a minimal amount of data. Multi-site changes were verified by setting the same property ID several times under It would be best to verify this using a proper multi-site installation. I don't have any current clients, but I can ask some past clients. I did not write any tests. dbt recommends testing certain things. The only logic worth testing , the multi-site macro, is not testable by the new framework. I defaulted these new tables to I also ran |
macros/combine_property_data.sql
Outdated
@@ -6,41 +6,69 @@ | |||
{% if not should_full_refresh() %} | |||
{# If incremental, then use static_incremental_days variable to find earliest shard to copy #} | |||
{%- set earliest_shard_to_retrieve = (modules.datetime.date.today() - modules.datetime.timedelta(days=var('static_incremental_days')))|string|replace("-", "")|int -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dgitis I think you've introduced some duplicate code in this macro?
I'm on board with the approach, but unable to run it due to this error. Couldn't find an obvious issue. I can run the main branch fine with the same project configuration.
|
I fixed the duplicate code issues which seems to have fixed the error, not sure why though, and also updated a test that worked on my very small test site but wouldn't work on a larger site. |
I get an error when building the new staging models:
I have 1 user property defined:
I'd be ok approving a PR with JUST the base model data for user export so that package users at least have access to that data. We could sort out the most advanced stuff (pulling in properties, audiences) later. |
Description & motivation
Work-in-progress resolving #285.
The goal of this PR is to support the
pseudonymous_user_id
anduser_id
tables.My initial thought is that we should keep both the package's current user tables and new ones derived from the new user export options in GA4. The reason for this is to support both old and new installations.
README modifications are not done, but I expect we will want to add a new section on disabling user models that explains the differences between our various models and how to use them.
What defaults should we have for
+enable
on these tables?This PR supports
audiences
fields from the export. While working with a test site that has implemented audiences, the data doesn't seem to be very useful unless its ID field joins with Google Ads audience exports. Despite this, I think we should leave this in because it needs to be enabled by a variable.The
user_properties
fields were done without sample data. Naming and logic will likely need to be updated.Multi-site is not currently supported.
We also might want to move the logic in
base_ga4___*
models tostg_ga4__*
models.I also need to review package naming conventions to ensure data, like geo data, is consistent with elsewhere in the package.
To-do:
user_property
logicHere are the docs for the source tables.
Checklist
dbt test
andpython -m pytest .
to validate existing tests