forked from google/megalista

First Party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products (Google Ads, Campaign Manager, Google Analytics).

petlove/megalista

Megalista

Sample integration code for onboarding offline/CRM data from BigQuery as custom audiences or offline conversions in Google Ads, Google Analytics 360, Google Display & Video 360, and Google Campaign Manager.

Disclaimer: This is not an officially supported Google product.

Supported integrations

  • Google Ads

    • Contact Info Customer Match (email, phone, address) [details]
    • ID-based Customer Match (device ID, user ID)
    • Offline Conversions through gclid [details]
    • Enhanced Conversions for Leads or Offline Conversions through user_identifiers [details]
    • Store Sales Direct (SSD) conversions [details]
  • Google Analytics (Universal Analytics)

  • Campaign Manager

    • Offline Conversions API (user id, device id, match id, gclid, dclid, value, quantity, and customVariables) [details]
  • Google Analytics 4

  • Display & Video

    • Contact Info Customer Match (email, phone, address) [details]
    • ID-based Customer Match (device ID)
  • Appsflyer

    • S2S Offline events API (conversion upload), to be used for audience creation and in-app events with Google Ads and DV360 [details]

How does it work

Megalista was designed to separate the configuration of conversion/audience upload rules from the engine, giving non-technical teams (e.g. Media and Business Intelligence) more freedom to set up multiple upload rules on their own.

The solution consists of two parts: (1) a configuration environment (a Google Sheet, a JSON file, or a Google Cloud Firestore collection) in which all rules are defined by mapping a data source (a BigQuery table) to a destination (a data upload endpoint), and (2) an Apache Beam workflow running on Google Dataflow, scheduled to upload the data in batch mode.
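To make the split between rules and engine more concrete, here is a minimal sketch of how a rule might pair a BigQuery source with a destination. The class names, field names, and destination type strings (`Source`, `Destination`, `UploadRule`, `destination_type`, `ads_customer_match`, and so on) are hypothetical and chosen only for illustration; the actual schemas are the ones defined in the Megalista Wiki.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Source:
    """A BigQuery table holding the rows to upload (hypothetical shape)."""
    dataset: str
    table: str


@dataclass
class Destination:
    """An upload endpoint plus its metadata (hypothetical shape)."""
    destination_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class UploadRule:
    """One configuration row: upload everything in `source` to `destination`."""
    name: str
    source: Source
    destination: Destination


def group_by_destination(rules: List[UploadRule]) -> Dict[str, List[str]]:
    """Group rule names by destination type, preserving the input order."""
    grouped: Dict[str, List[str]] = {}
    for rule in rules:
        grouped.setdefault(rule.destination.destination_type, []).append(rule.name)
    return grouped


# Illustrative rules only; the type strings are not Megalista's real identifiers.
rules = [
    UploadRule("crm_emails", Source("crm", "emails"), Destination("ads_customer_match")),
    UploadRule("store_sales", Source("crm", "sales"), Destination("ads_offline_conversion")),
    UploadRule("crm_phones", Source("crm", "phones"), Destination("ads_customer_match")),
]
```

Because a rule is just data, a non-technical team can add or change rows in the configuration without touching the batch engine that executes them.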

Prerequisites

Google Cloud Services

  • Google Cloud Platform account
    • Billing enabled
    • BigQuery enabled
    • Dataflow enabled
    • Cloud Storage enabled
    • Cloud Scheduler enabled
    • App Engine enabled
  • At least one of:
    • Google Ads API Access
    • Campaign Manager API Access
    • Google Analytics API Access
    • Display & Video API Access
  • Python3
  • Google Cloud SDK

Access Requirements

These are the minimum roles necessary to deploy Megalista:

  • OAuth Config Editor
  • BigQuery User
  • BigQuery Job User
  • BigQuery Data Viewer
  • Cloud Scheduler Admin
  • Storage Admin
  • Dataflow Admin
  • Service Account Admin
  • Logs Viewer
  • Service Consumer

APIs

Required APIs will depend on upload endpoints in use.

  • Google Sheets (required if using Sheets configuration) [link]
  • Google Analytics [link]
  • Google Analytics Reporting [link]
  • Google Ads [link]
  • Campaign Manager [link]
  • Google Cloud Firestore [link]
  • Display & Video [link]

Configure Megalista

Megalista can be configured via Google Sheets, a JSON file, or a Google Cloud Firestore collection. Expected data schemas (Sources) and metadata (Destinations) for each use case are defined in the Megalista Wiki.

Instructions for each configuration method can be found in the Megalista Wiki.

Deployment

This guide assumes you are following it inside the Google Cloud Platform Console.

Creating required access tokens

To access campaigns and user lists on Google's platforms, the Dataflow pipeline needs OAuth tokens for an account that can authenticate in those systems.

To create them, follow these steps:

  • Access the GCP console
  • Go to the APIs & Services section in the top-left menu.
  • Open the OAuth Consent Screen page and configure an Internal consent screen.
  • Then go to the Credentials page and create an OAuth client ID with the Application type set to Desktop App.
  • This will generate a Client Id and a Client secret. Save these values as they are required during the deployment
  • Run the generate_megalista_token.sh script in this folder providing these two values and follow the instructions
    • Sample: ./generate_megalista_token.sh client_id client_secret
  • This will generate the Access Token and the Refresh token
    • The user who opened the generated link and clicked on Allow must have access to the platforms that Megalista will integrate, including the configuration Sheet, if this is the chosen method for configuration.
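The link mentioned in the steps above is an OAuth 2.0 consent URL. As a rough illustration of what such a link contains (this is not the code in generate_megalista_token.sh), the sketch below builds one with the standard library. The function name `build_consent_url`, the default `redirect_uri`, and the example scope list are assumptions; the script requests its own set of scopes.

```python
from urllib.parse import urlencode

# Google's OAuth 2.0 authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_consent_url(client_id, scopes, redirect_uri="http://localhost:8080"):
    """Return the URL a user opens to grant access (hypothetical helper).

    `redirect_uri` is an assumed loopback address for a Desktop App client.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",      # authorization-code flow
        "scope": " ".join(scopes),    # scopes are space-separated
        "access_type": "offline",     # ask for a Refresh Token as well
        "prompt": "consent",
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


# Example scope only; the real script decides which scopes to request.
url = build_consent_url(
    "my-client-id",
    ["https://www.googleapis.com/auth/gmail.send"],
)
```

Requesting offline access is what makes the authorization server return a Refresh Token in addition to the short-lived Access Token.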

Deploying Pipeline

  • Download the latest Megalista code. To deploy the full Megalista pipeline, run ./deploy.sh from the deployment folder. The script requires the parameters listed below; add them to the config.json file. Some parameters have default values, which can be changed.

  • Auxiliary BigQuery dataset to create for Megalista operations

    • This dataset will be used for storing operational data and will be created by the deployment script
  • Google Cloud Storage Bucket to create

    • This Cloud Storage Bucket will be used to store the Megalista compiled binary, metadata, and temp files, and will be created by the deployment script.
  • Setup Firestore collection, URL for JSON configuration, and Setup Sheet ID

    • Only one of these three should be filled in; leave the other two blank, according to the chosen configuration method.
  • Client ID, Client Secret, Access Token and Refresh Token from the previous step.

    Disclaimer: Please store your config.json file in a secure place or delete it after the deployment.
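The "only one of these three" rule is easy to get wrong, so it can help to check config.json before running the deployment. The sketch below is one way to do that. The key names (`setup_firestore_collection`, `setup_json_url`, `setup_sheet_id`) and the function `chosen_config_source` are hypothetical; use the key names that actually appear in your config.json.

```python
# Hypothetical key names; replace them with the ones in your config.json.
CONFIG_SOURCE_KEYS = (
    "setup_firestore_collection",
    "setup_json_url",
    "setup_sheet_id",
)


def chosen_config_source(config: dict) -> str:
    """Return the single configuration key that is filled in.

    Raises ValueError when none or more than one of the keys has a
    non-blank value.
    """
    filled = [
        key for key in CONFIG_SOURCE_KEYS
        if str(config.get(key) or "").strip()
    ]
    if len(filled) != 1:
        raise ValueError(
            f"Expected exactly one of {CONFIG_SOURCE_KEYS} to be set, "
            f"found {len(filled)}: {filled}"
        )
    return filled[0]
```

A typical use would be to load the file with `json.load` and pass the resulting dictionary to the function before calling the deployment script.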

Updating the Binary

To update the binary without redoing the whole deployment process, run:

  • ./deployment/deploy_cloud.sh gcp_project_id bucket_name region service_account_email

Usage

Every upload method expects a BigQuery table with specific fields as its source, in addition to specific configuration metadata. For details on how to set up your upload routines, refer to the Megalista Wiki.

Error notifications by email

To have uploader errors captured and sent by email, edit the job in Cloud Scheduler: in the parameters section of the request body, add the notify_errors_by_email parameter set to true and the errors_destination_emails parameter set to a comma-separated list of email addresses. Add these parameters to the same list as the pre-configured ones, such as client_id and client_secret.

If the access tokens being used were generated prior to version v4.4, new access and refresh tokens must be generated to activate this feature. This is necessary because old tokens don't have the gmail.send scope.
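As a small illustration of the two notification parameters, the sketch below assembles their values from a list of addresses. The parameter names come from the text above; the helper name `error_notification_params`, the example addresses, and the choice to represent the values as strings in a dictionary are assumptions, since the exact shape of the Cloud Scheduler request body is not shown here.

```python
from typing import Dict, List


def error_notification_params(emails: List[str]) -> Dict[str, str]:
    """Build the two error-notification parameters (hypothetical helper).

    Blank entries are dropped and surrounding whitespace is trimmed so the
    result is a clean comma-separated list.
    """
    cleaned = [email.strip() for email in emails if email.strip()]
    if not cleaned:
        raise ValueError("At least one destination email is required")
    return {
        "notify_errors_by_email": "true",
        "errors_destination_emails": ",".join(cleaned),
    }


# Example addresses are placeholders.
params = error_notification_params(["ops@example.com", " bi@example.com "])
```

The resulting values would then be added alongside the existing parameters in the job's request body.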

Note about Google Ads API access

Calls to the Google Ads API will fail if the user who generated the OAuth2 credentials (Access Token and Refresh Token) doesn't have direct access to the Google Ads account that the calls target. It is not enough for the user to have access to an MCC above this account and to reach the account through the interface; the user must have permissions on the account itself.
