
Traffic-Analytics-Data-Warehouse-Airflow-dbt-Redash

Overview

This project builds a scalable data warehouse for a city traffic department, using swarm UAVs (drones) to collect traffic data. The data supports improving traffic flow as well as other undisclosed projects. The tech stack comprises PostgreSQL, dbt, and Airflow, following the Extract-Load-Transform (ELT) pattern.

Tech Stack Flow

Project Structure

The project structure includes:

  • data: CSV files for the raw and cleaned datasets.
  • dags: Airflow DAGs for task orchestration.
  • notebooks: Jupyter notebooks for Exploratory Data Analysis (EDA).
  • screenshots: Visual representations of the project, including tech stack flow, path for track ID, and speed comparisons.
  • scripts: Python utility scripts.
  • traffic_dbt: dbt (Data Build Tool) files and configurations.
  • docker-compose.yaml: Docker Compose file for running Airflow and its supporting services.

Airflow Data Loading with Docker

This repository contains the necessary files to set up a Dockerized Airflow environment for data loading into PostgreSQL.

Prerequisites

  • Docker
  • Docker Compose

Getting Started

  1. Clone the Repository:

    git clone https://github.com/birehan/Traffic-Analytics-Data-Warehouse-Airflow-dbt-Redash
    cd Traffic-Analytics-Data-Warehouse-Airflow-dbt-Redash
  2. Configure Environment Variables (Optional):

    If needed, you can set environment variables by creating a .env file in the project root. Adjust variables as necessary.

    Example .env file:

    AIRFLOW_UID=1001
    AIRFLOW_IMAGE_NAME=apache/airflow:2.8.0
    _PIP_ADDITIONAL_REQUIREMENTS=your_additional_requirements.txt
  3. Build and Run Airflow Services:

    docker-compose up --build
  4. Access Airflow Web Interface:

    Once the services are running, access the Airflow web interface at http://localhost:8080.

  5. Stop Airflow Services:

    When you're done, stop the Airflow services:

    docker-compose down

DAG Information

  • The Airflow DAG create_vehicle_tables creates the PostgreSQL database and tables, then loads data from a CSV file.
  • Customize the DAG or SQL scripts in the dags and dags/sql directories as needed.
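The create-and-load step can be sketched in plain Python. This is a minimal illustration only — the `vehicle_records` table and its column names are hypothetical; the actual DAG drives the SQL scripts in dags/sql through Airflow operators:

```python
import csv
import io

def build_create_table(table, columns):
    """Build a CREATE TABLE statement (column types simplified to TEXT)."""
    cols = ", ".join(f"{name} TEXT" for name in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols});"

def build_inserts(table, csv_text):
    """Build one parameterized INSERT per CSV row; the header row names the columns."""
    reader = csv.reader(io.StringIO(csv_text))
    columns = next(reader)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders});"
    return [(sql, row) for row in reader]

# Hypothetical sample mirroring the drone-traffic CSV layout
sample = "track_id,vehicle_type,avg_speed\n1,Car,36.2\n2,Bus,24.9\n"
print(build_create_table("vehicle_records", ["track_id", "vehicle_type", "avg_speed"]))
for stmt, row in build_inserts("vehicle_records", sample):
    print(stmt, row)
```

In the real pipeline these statements would be executed against PostgreSQL from within the create_vehicle_tables DAG (for example via Airflow's Postgres hook or operator) rather than printed.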
