google-auth
google-auth-oauthlib
google-auth-httplib2
google-api-python-client
pymongo
schedule
You can install the necessary dependencies via pip:
pip install google-auth google-auth-oauthlib google-auth-httplib2 google-api-python-client pymongo schedule
- Create a project on the Google Cloud Platform Console.
- Enable the Gmail API for your project.
- Download the client configuration file credentials.json and place it in the root directory of this script.
- Set up a MongoDB Atlas cluster and obtain the connection string.
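If you have never used the Gmail API, the authentication boilerplate follows a standard pattern. Here is a minimal sketch of it; the token.json cache file name is my choice, not something the project dictates, but a script like getGmail.py typically starts with something equivalent:

```python
# Standard Gmail API OAuth flow (a sketch; token.json is an assumed cache file).
import os.path
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]

def get_gmail_service():
    creds = None
    # Reuse a cached token from a previous run if one exists.
    if os.path.exists("token.json"):
        creds = Credentials.from_authorized_user_file("token.json", SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            # credentials.json is the file downloaded from the Cloud Console.
            flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
            creds = flow.run_local_server(port=0)
        with open("token.json", "w") as token:
            token.write(creds.to_json())
    return build("gmail", "v1", credentials=creds)
```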
Once the Gmail configuration is complete, let's tackle the real challenge. I can't say how Uber Eats email receipts look in every country, but here is what mine looks like:
Ensure yours have the same format; otherwise, you might encounter errors when retrieving the Uber Eats emails. Also, if your Uber Eats emails are in English, replace the French field names with the English equivalents from your receipts.
After launching the getGmail.py script, approximately 500 Uber Eats emails are retrieved (you can adjust this number). They are sent to MongoDB as documents resembling this:
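For reference, the retrieval step boils down to something like the sketch below. The Gmail search query (from:uber.com) and the ubereats/receipts database and collection names are illustrative assumptions, not the script's actual values:

```python
# Fetch Uber Eats receipt emails and store them as raw MongoDB documents.
import os
from pymongo import MongoClient

service = get_gmail_service()  # from the OAuth sketch above
collection = MongoClient(os.environ["MONGODB_URI"])["ubereats"]["receipts"]

# Ask Gmail for up to 500 matching message ids; maxResults is the
# number you can adjust.
resp = service.users().messages().list(
    userId="me", q="from:uber.com", maxResults=500
).execute()

for ref in resp.get("messages", []):
    # Fetch the full message and store it as-is; the receipt fields are
    # parsed downstream when building the dimensional model.
    msg = service.users().messages().get(
        userId="me", id=ref["id"], format="full"
    ).execute()
    collection.insert_one(msg)
```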
The ultimate goal is to have a star schema model like this:
Navigate to the models folder, create a .env file, and fill in the values below; they configure the connections to both MongoDB and Snowflake:
MONGODB_URI=mongodb+srv://username:password@cluster0.xxxxx.mongodb.net/admin?retryWrites=true&w=majority
SNOWFLAKE_USER=
SNOWFLAKE_PASSWORD=
SNOWFLAKE_ACCOUNT=xxxx.us-east-1
SNOWFLAKE_WAREHOUSE=
SNOWFLAKE_DATABASE=
SNOWFLAKE_SCHEMA=PUBLIC
Note: You can create free accounts on Snowflake and MongoDB Atlas to have ready-to-use servers.
Once set up, launch the mongodb_connection.py and snowflake_connection.py files to test the connections. At this stage, both scripts should connect without errors.
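If you want a sense of what those test scripts check, here is a minimal smoke test. Note the assumptions: it uses the snowflake-connector-python package (not in the pip line above) and the environment variable names from the .env file:

```python
# Connection smoke test, roughly what the two test scripts do
# (an assumption; adapt to the actual scripts).
import os
from pymongo import MongoClient
import snowflake.connector

# MongoDB: ping the server to confirm the URI and credentials work.
mongo = MongoClient(os.environ["MONGODB_URI"])
mongo.admin.command("ping")
print("MongoDB connection OK")

# Snowflake: run a trivial query to confirm the account details work.
sf = snowflake.connector.connect(
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
    database=os.environ["SNOWFLAKE_DATABASE"],
    schema=os.environ["SNOWFLAKE_SCHEMA"],
)
version = sf.cursor().execute("SELECT CURRENT_VERSION()").fetchone()[0]
print("Snowflake connection OK:", version)
```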
Create the dimension tables by executing create_table_snowflake.py, then check in Snowflake that the three dimension tables and the fact table have been created.
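The DDL will be along these lines; the table and column names below are guesses at a typical Uber Eats star schema, not the exact ones in create_table_snowflake.py:

```python
# Hypothetical star-schema DDL: three dimensions plus one fact table.
DDL = [
    """CREATE TABLE IF NOT EXISTS DIM_RESTAURANT (
           RESTAURANT_ID INTEGER AUTOINCREMENT PRIMARY KEY,
           NAME STRING
       )""",
    """CREATE TABLE IF NOT EXISTS DIM_DATE (
           DATE_ID INTEGER AUTOINCREMENT PRIMARY KEY,
           ORDER_DATE DATE, YEAR INTEGER, MONTH INTEGER, DAY INTEGER
       )""",
    """CREATE TABLE IF NOT EXISTS DIM_LOCATION (
           LOCATION_ID INTEGER AUTOINCREMENT PRIMARY KEY,
           ADDRESS STRING, CITY STRING
       )""",
    """CREATE TABLE IF NOT EXISTS FACT_ORDER (
           ORDER_ID INTEGER AUTOINCREMENT PRIMARY KEY,
           RESTAURANT_ID INTEGER, DATE_ID INTEGER, LOCATION_ID INTEGER,
           TOTAL_AMOUNT NUMBER(10, 2), DELIVERY_FEE NUMBER(10, 2)
       )""",
]

cur = sf.cursor()  # connection from the smoke test above
for stmt in DDL:
    cur.execute(stmt)
```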
Execute model.py to populate the fact and dimension tables, preparing your dimensional model for analysis. You can use Power BI, Tableau, or create views for dashboarding.
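Conceptually, the load step looks like the sketch below, where parse_receipt is a hypothetical helper that extracts the fields from a raw email document, and collection and cur come from the earlier sketches:

```python
# Rough shape of the load step (illustrative, not model.py's actual code).
def last_id(table, col):
    # Naive surrogate-key lookup; good enough for a single-writer sketch.
    return cur.execute(f"SELECT MAX({col}) FROM {table}").fetchone()[0]

for doc in collection.find():
    r = parse_receipt(doc)  # hypothetical parser for the email body
    cur.execute("INSERT INTO DIM_RESTAURANT (NAME) VALUES (%s)",
                (r["restaurant"],))
    cur.execute(
        "INSERT INTO DIM_DATE (ORDER_DATE, YEAR, MONTH, DAY) VALUES (%s, %s, %s, %s)",
        (r["date"], r["date"].year, r["date"].month, r["date"].day),
    )
    cur.execute("INSERT INTO DIM_LOCATION (ADDRESS, CITY) VALUES (%s, %s)",
                (r["address"], r["city"]))
    # Wire the fact row to the surrogate keys just created.
    cur.execute(
        "INSERT INTO FACT_ORDER (RESTAURANT_ID, DATE_ID, LOCATION_ID, "
        "TOTAL_AMOUNT, DELIVERY_FEE) VALUES (%s, %s, %s, %s, %s)",
        (last_id("DIM_RESTAURANT", "RESTAURANT_ID"),
         last_id("DIM_DATE", "DATE_ID"),
         last_id("DIM_LOCATION", "LOCATION_ID"),
         r["total"], r["delivery_fee"]),
    )
```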
For manual execution, follow the steps above.
For automatic execution (a sketch of the main DAG appears after this list):
- Navigate to the airflow_docker folder.
- Launch a local server by executing docker-compose up --build -d.
- Once ready, go to localhost:8080 with username/password airflow.
- Before anything else, add connections for Snowflake and MongoDB in the Admin tab.
- Start the main DAG. If everything is okay, you should see an output similar to this:
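For reference, the main DAG might be defined roughly like this; the dag_id, task ids, and the imported helpers are assumptions, not the project's actual code:

```python
# Sketch of an extract -> dimensions -> facts pipeline DAG.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

from etl import fetch_emails, load_dimensions, load_facts  # hypothetical module

with DAG(
    dag_id="ubereats_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="fetch_gmail", python_callable=fetch_emails)
    dims = PythonOperator(task_id="load_dimensions", python_callable=load_dimensions)
    facts = PythonOperator(task_id="load_facts", python_callable=load_facts)

    # Emails must land in MongoDB before the dimensions, and the
    # dimensions before the fact table.
    extract >> dims >> facts
```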