How to recreate GA4 reports using this schema? #337
Comments
You're going to need to be much more specific if you want help here.
Regardless, don't expect the numbers to match exactly. There are a number of places where we have deliberately decided to diverge from GA4. Google uses HyperLogLog++ to estimate counts; the dbt-GA4 package counts them exactly. Google calculates source, medium, and campaign individually, so source/medium combinations that shouldn't have a campaign get assigned the next available campaign. The dbt-GA4 package takes all attribution parameters from the first event with a valid source. I've written a free course on how you are supposed to use the dbt-GA4 package. I think it might help you get started, but the goal of the package is to produce better numbers than GA4, so be prepared to see differences.
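To make the attribution difference concrete, here is a minimal Python sketch. It is an illustration only, not GA4's or the package's actual implementation: the event list, the field names, and both function names are assumptions made up for this example.

```python
# Hypothetical session events in time order; None means the parameter was absent.
events = [
    {"source": None, "medium": None, "campaign": None},
    {"source": "google", "medium": "organic", "campaign": None},
    {"source": "newsletter", "medium": "email", "campaign": "spring_sale"},
]

FIELDS = ("source", "medium", "campaign")


def per_field_attribution(events):
    """Resolve each field independently: the first non-null value wins per field."""
    return {f: next((e[f] for e in events if e[f] is not None), None) for f in FIELDS}


def first_valid_source_attribution(events):
    """Take every field from the first event that has a source."""
    for e in events:
        if e["source"] is not None:
            return {f: e[f] for f in FIELDS}
    return {f: None for f in FIELDS}


# Per-field resolution pairs google/organic with a campaign from a later event.
print(per_field_attribution(events))
# Row-level resolution keeps the campaign empty for google/organic.
print(first_valid_source_attribution(events))
```

With per-field resolution, the organic visit picks up `spring_sale` from the later email event; with row-level resolution it does not. That is one reason the two systems can report different source/medium/campaign combinations.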
Thanks a lot, @dgitis! I am trying to recreate the GA4 "User Acquisition" report. Could you please help me with the SQL query for this report? Additionally, it would be fantastic if you could assist with the e-commerce purchase report as well. Thank you for your assistance!
For the first report, you'll want to blend or join (the terminology depends on the reporting tool) the You'll want either From
Treat the above queries as pseudocode. The actual query will depend on your reporting tool.
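As a rough sketch of the general shape of such a join, the following runs against an in-memory SQLite database from Python. Every table and column name here (`dim_users`, `fact_users`, `user_key`, `first_source`, `first_medium`) is a hypothetical stand-in, not a model or column from the package, and the SQL dialect in your warehouse may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimension table: one row per user with first-touch attribution.
CREATE TABLE dim_users (user_key TEXT PRIMARY KEY, first_source TEXT, first_medium TEXT);
-- Hypothetical fact table: one row per user with pre-aggregated metrics.
CREATE TABLE fact_users (user_key TEXT PRIMARY KEY, sessions INTEGER, engaged_sessions INTEGER);
INSERT INTO dim_users VALUES
  ('u1', 'google', 'organic'), ('u2', 'google', 'organic'), ('u3', 'newsletter', 'email');
INSERT INTO fact_users VALUES ('u1', 3, 2), ('u2', 1, 0), ('u3', 2, 2);
""")

# Join dimension to fact on the shared key, then aggregate by first-touch channel.
rows = conn.execute("""
SELECT d.first_source,
       d.first_medium,
       COUNT(*)                AS users,   -- exact count: one row per user
       SUM(f.sessions)         AS sessions,
       SUM(f.engaged_sessions) AS engaged_sessions
FROM dim_users AS d
JOIN fact_users AS f ON f.user_key = d.user_key
GROUP BY d.first_source, d.first_medium
ORDER BY users DESC
""").fetchall()

for row in rows:
    print(row)
```

Because both tables are keyed one row per user, the join does not duplicate rows, so the sums stay correct.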
The second report requires customization, and it depends on how you set your warehouse up. Take the course I linked and set things up. Try to figure it out yourself, and then share your code. It will be much easier to help you when you have a base setup. Hint: You'll want to create an events fact table with all events in it. This table will mostly be used for funnel reporting, so you can filter out some events that aren't relevant to funnels. Treat this as pseudocode. The actual query will depend on your reporting tool.
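One possible sketch of the hint above, again using in-memory SQLite from Python. The table names (`stg_events`, `fct_events`), the columns, and the excluded event names (`scroll`, `user_engagement`) are assumptions chosen for illustration; which events to drop depends on your own funnels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical staging table of raw events.
CREATE TABLE stg_events (session_key TEXT, event_name TEXT, event_ts INTEGER);
INSERT INTO stg_events VALUES
  ('s1', 'page_view', 1), ('s1', 'scroll', 2), ('s1', 'add_to_cart', 3), ('s1', 'purchase', 4),
  ('s2', 'page_view', 1), ('s2', 'user_engagement', 2), ('s2', 'add_to_cart', 3),
  ('s3', 'page_view', 1);

-- Events fact table: everything except events assumed irrelevant to funnels.
CREATE TABLE fct_events AS
SELECT session_key, event_name, event_ts
FROM stg_events
WHERE event_name NOT IN ('scroll', 'user_engagement');
""")

# Simple unordered funnel: sessions that fired each step at least once.
funnel = conn.execute("""
SELECT event_name, COUNT(DISTINCT session_key) AS sessions
FROM fct_events
WHERE event_name IN ('page_view', 'add_to_cart', 'purchase')
GROUP BY event_name
ORDER BY sessions DESC
""").fetchall()

for step in funnel:
    print(step)
```

This funnel does not check that the steps happened in order within a session; a stricter version would compare `event_ts` between steps.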
Thanks @dgitis!!!
Hey @dgitis, I wanted to get a report of page path and the total revenue associated with it. My team decided to add user_pseudo_id to both the fact_ga4__pages and fact_ga4__items_ecommerce tables, and then join them on this field to calculate the total revenue. However, this approach doesn’t seem correct to me. Can you verify that? If possible, could you also help us identify the fields we could use in these two tables so that we can accurately calculate the total revenue for each page path after the join? Thanks in advance!
You should be using the The pages model isn't meant to be joined. It's optimized for looking at data by page URL; adding an ID to this table defeats the purpose of having a pre-aggregated table. Instead, you should create a
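A small sketch of why joining the two tables on user_pseudo_id alone can overstate revenue. The `pages` and `items` tables below are heavily simplified, hypothetical stand-ins for fact_ga4__pages and fact_ga4__items_ecommerce with a user_pseudo_id column added; the data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified page rows: one row per user and page path.
CREATE TABLE pages (user_pseudo_id TEXT, page_path TEXT);
-- Hypothetical, simplified item rows: one row per purchased item.
CREATE TABLE items (user_pseudo_id TEXT, item_revenue INTEGER);
INSERT INTO pages VALUES ('u1', '/home'), ('u1', '/product'), ('u2', '/home');
INSERT INTO items VALUES ('u1', 50), ('u2', 20);
""")

# Actual revenue, summed straight from the items table.
true_total = conn.execute("SELECT SUM(item_revenue) FROM items").fetchone()[0]

# Joining on user_pseudo_id repeats each item row once per page the user viewed.
joined = conn.execute("""
SELECT p.page_path, SUM(i.item_revenue) AS revenue
FROM pages AS p
JOIN items AS i ON i.user_pseudo_id = p.user_pseudo_id
GROUP BY p.page_path
ORDER BY p.page_path
""").fetchall()

joined_total = sum(revenue for _, revenue in joined)
print(true_total, joined, joined_total)
```

User u1's revenue is credited in full to every page that user viewed, so the per-page figures add up to more than the real total. Any per-page revenue report needs a rule for which page gets credit for a purchase, and a user-level join does not provide one.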
I am currently working on recreating some reports using the modeled schema provided. However, I am encountering some challenges in formulating the SQL queries needed to generate these reports accurately.
Could anyone please share guidance or examples on how to recreate the GA4 reports using SQL queries based on the modeled schema?