MLA_Dashboard

Problem description

Market platforms like Mercado Libre Argentina have millions of products listed on their platform, making it difficult for shoppers to find the most popular and best-selling products. Also, with the sheer number of products on the platform, it can be overwhelming for sellers to keep track of how their own product is performing against its competitors. This project aims to solve this problem by creating a board that shows the 20 best-selling products by category in Mercado Libre Argentina. This will help both buyers and sellers to easily track and analyze the performance of products in their respective categories.

The dashboard will allow users to filter products by category and rating within each category, and will also display the number of products sold and their respective prices. Additionally, users will be able to view the total gross sales of each product in Argentine pesos. With this dashboard, both buyers and sellers can easily identify the most popular products in a given category and track their sales performance over time.

Diagram

dbt Transformation

Final Result

DWH

Currently, the data warehouse (DWH) is not optimized by partitioning and clustering, since the tables are not expected to grow in size and currently the largest table handled has only 567 rows.

How to reproduce this project

Initial setup:

cd Projects
git clone https://github.com/mary435/MLA_Dashboard.git
cd MLA_Dashboard

Connect to the MLA API. Follow these steps.

Alternatively if you don't want to connect to the api, you can use the files I download from the api to my repository in the data folder. And instead of deploying the etl_web_to_gcs_bq.py file to prefect, load this alternative that reads the files locally and loads into bigquery: elt_local_to_bq.py.

Activate your environment and install the requirements.

conda activate mla_dashboard
pip install -r requirements.txt

Start prefect server

prefect orion start

Open up your browser to http://127.0.0.1:4200 to get started.
Folow these steps to connect prefect to gcs.
To create the deployment, run the following command. Which will create the file etl_parent_flow-deployment.yaml and will create the deployment in prefect with the name etl_MLA, scheduled to run daily at 1:00 p.m.:

prefect deployment build etl_web_to_gcs_bq.py:etl_parent_flow -n etl_MLA --cron "0 13 * * *" -a

Remember that if you did not connect to the api here you must change the file. prefect deployment build elt_local_to_bq.py:etl_parent_flow -n etl_MLA --cron "0 13 * * *" -a

Start the default woker agent:

prefect agent start  --work-queue "default"

The script runs a parent flow and four child flows, with different tasks. This script communicates with the API, downloads the information to a parquet file and saves it first in a bucket and then in the DWH. This is what the radar of the flows looks like:
Create the datasets "dbt_sandbox," "production," and "staging" in BigQuery.
For the dbt transformation clone a new repository:

git clone https://github.com/mary435/MLA_dbt.git

Create a new account in dbt. And Connect dbt to BigQuery and the new repository then initialize the project. Folow these steps
Run this command in dbt and you will see how the views and tables go to the bigquery sandbox.

dbt run

Now run this command in dbt and you will see how the views and tables go to bigquery production and the documentation tab is enabled in dbt with the information of the tables.

dbt build

Configure dbt to update the data automatically every day. Folow these steps.
Open now datadtudio or lookerstudio. Select create in the upper left corner.And choose the option "Data Source".
Here choose BigQuery and the project name "mla-dashboard-zoom", then the dataset "production" and the table "fact_best_sellers" and now click connect. Now you are ready here you can create the dashboard!

Licence

Distributed under the terms of the MIT license, "MLA_Dashboard" is free and open source software.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
data		data
images		images
.gitignore		.gitignore
.prefectignore		.prefectignore
How_api_connect.md		How_api_connect.md
How_install_node.md		How_install_node.md
LICENSE		LICENSE
README.md		README.md
big_query.sql		big_query.sql
etl_gcs_to_bq.py		etl_gcs_to_bq.py
etl_local_to_bq.py		etl_local_to_bq.py
etl_parent_flow-deployment.yaml		etl_parent_flow-deployment.yaml
etl_web_to_gcs_bq.py		etl_web_to_gcs_bq.py
how_connect_dbt.md		how_connect_dbt.md
how_connect_gcs.md		how_connect_gcs.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MLA_Dashboard

Problem description

Diagram

dbt Transformation

Final Result

DWH

How to reproduce this project

Licence

About

Releases

Packages

Languages

License

mary435/MLA_Dashboard

Folders and files

Latest commit

History

Repository files navigation

MLA_Dashboard

Problem description

Diagram

dbt Transformation

Final Result

DWH

How to reproduce this project

Licence

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages