ETL and Data Pipelines using Airflow
The project aims to de-congest the national highways by analyzing road traffic data from different toll plazas. Each highway is operated by a different toll operator with its own IT setup, so each operator produces data in a different file format. As a vehicle passes a toll plaza, its data (vehicle_id, vehicle_type, toll_plaza_id, and timestamp) is streamed to Kafka.
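The sketch below illustrates how a single toll event with those fields might be published to Kafka. The topic name (`toll`), broker address, JSON encoding, use of the `kafka-python` library, and the sample field values are all assumptions made for illustration, not details from the project itself.

```python
# Minimal sketch: publishing one toll-plaza event to Kafka.
# Topic name, broker address, and sample values are hypothetical.
import json
from datetime import datetime, timezone

from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

record = {
    "vehicle_id": "KA01AB1234",        # hypothetical vehicle identifier
    "vehicle_type": "truck",
    "toll_plaza_id": 4001,             # hypothetical plaza identifier
    "timestamp": datetime.now(timezone.utc).isoformat(),
}

producer.send("toll", value=record)    # "toll" topic name is an assumption
producer.flush()
```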
- Collect data available in different formats and consolidate it into a single file (see the Airflow DAG sketch after this list).
- Create a data pipeline that collects the streaming data from Kafka and loads it into a database (see the consumer sketch after this list).
- Confirm the submitted DAG has been picked up by Airflow using the command: `airflow dags list`.
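Below is a minimal sketch of what the consolidation DAG could look like. It assumes Airflow 2.x, a staging directory, and two hypothetical operator files (one CSV, one TSV); the actual file names, formats, and paths in the project may differ.

```python
# Hedged sketch of an Airflow DAG that merges files from different toll
# operators into one consolidated CSV. Paths and file names are assumptions.
import csv
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

STAGING = "/tmp/toll_staging"  # assumed staging directory


def consolidate_data():
    """Read each operator's file in its own format and write one unified CSV."""
    rows = []
    # Hypothetical operator A: comma-separated file
    with open(f"{STAGING}/operator_a.csv") as f:
        rows.extend(csv.reader(f))
    # Hypothetical operator B: tab-separated file
    with open(f"{STAGING}/operator_b.tsv") as f:
        rows.extend(csv.reader(f, delimiter="\t"))
    with open(f"{STAGING}/toll_data_consolidated.csv", "w", newline="") as out:
        csv.writer(out).writerows(rows)


with DAG(
    dag_id="toll_data_consolidation",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # "schedule" is the Airflow 2.4+ parameter name
    catchup=False,
) as dag:
    PythonOperator(task_id="consolidate", python_callable=consolidate_data)
```

Once the DAG file is placed in the `dags/` folder, `airflow dags list` should show `toll_data_consolidation` among the parsed DAGs.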
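For the streaming pipeline, a consumer-side sketch under stated assumptions is shown below: it reads toll events from the assumed `toll` topic and inserts them into a database. SQLite is used here only to keep the example self-contained; the table name `livetolldata`, topic name, and broker address are assumptions, and the real project may target a different database.

```python
# Hedged sketch: consume toll records from Kafka and load them into a database.
# SQLite, the table name, topic name, and broker address are assumptions.
import json
import sqlite3

from kafka import KafkaConsumer  # pip install kafka-python

conn = sqlite3.connect("toll.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS livetolldata (
           timestamp TEXT, vehicle_id TEXT, vehicle_type TEXT, toll_plaza_id INTEGER)"""
)

consumer = KafkaConsumer(
    "toll",                                # assumed topic name
    bootstrap_servers="localhost:9092",    # assumed broker address
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

# Insert each incoming event as a row in the database.
for message in consumer:
    r = message.value
    conn.execute(
        "INSERT INTO livetolldata VALUES (?, ?, ?, ?)",
        (r["timestamp"], r["vehicle_id"], r["vehicle_type"], r["toll_plaza_id"]),
    )
    conn.commit()
```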