KatTiel/data_pipeline_weather_data

ETL Data Pipeline For Current and Historical Weather Data 🌄

This project explores shifts in yearly temperature and rainfall patterns dating back to 1979.
I've chosen to focus on five beautiful destinations worldwide:

Berlin, Ko Tao, Parque Nacional Corcovado, San Diego, and Tulum.

The project is divided into two main parts:
collecting and analyzing historical weather data, and collecting and analyzing current weather data.

Prerequisites

Architecture

workflow_weather

How It Works

Historical Weather Data

Extract

Transform

Load

  • Using DBeaver, create RDS weather tables by executing the queries provided in the .sql file
  • In DBeaver, import the historical weather data into the designated table for each city
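
The two DBeaver steps above (create one table per city, then import that city's historical data) can be tried out locally with a small script. This is only a rough sketch: it uses an in-memory SQLite database as a stand-in for the RDS instance, and the table names, column names, and the `create_tables` / `import_csv` helpers are hypothetical and not taken from the project's .sql file.

```python
import csv
import io
import sqlite3

# Hypothetical table-name suffixes for the five destinations.
CITIES = ["berlin", "ko_tao", "corcovado", "san_diego", "tulum"]


def create_tables(conn):
    """Create one weather table per city (assumed schema, not the project's)."""
    for city in CITIES:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS weather_{city} ("
            "observed_at TEXT PRIMARY KEY, "
            "temperature_c REAL, "
            "rainfall_mm REAL)"
        )


def import_csv(conn, city, csv_text):
    """Insert CSV rows into the table for one city and return the row count."""
    if city not in CITIES:
        # Guard the table name, since it is interpolated into the SQL string.
        raise ValueError(f"unknown city: {city}")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (row["observed_at"], float(row["temperature_c"]), float(row["rainfall_mm"]))
        for row in reader
    ]
    conn.executemany(f"INSERT INTO weather_{city} VALUES (?, ?, ?)", rows)
    return len(rows)
```

For example, after `conn = sqlite3.connect(":memory:")` and `create_tables(conn)`, calling `import_csv(conn, "berlin", csv_text)` loads one city's file. Against the real RDS database the same idea would go through a PostgreSQL driver such as psycopg2 instead of `sqlite3`.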

Current Weather Data

This part of the project uses two AWS Lambda functions. The first function runs automatically every hour. The second function is triggered when a .csv file is uploaded to the designated S3 bucket; that upload is the final step of the first function's execution.
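
To illustrate the trigger, here is a minimal sketch of how the second function might read the bucket name and object key from the S3 upload event. It assumes the standard S3 event notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`); the helper `extract_s3_objects` and the shape of the return value are hypothetical and not taken from the project's code.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in the event (spaces become '+').
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Keep only .csv uploads; loading them into the database would follow."""
    csv_objects = [
        (bucket, key)
        for bucket, key in extract_s3_objects(event)
        if key.lower().endswith(".csv")
    ]
    # Assumption: the real function would download each file and write it to RDS here.
    return {"processed": [key for _, key in csv_objects]}
```

Keeping the event parsing in its own function makes it possible to test the trigger logic locally with a hand-written event dictionary before deploying.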

AWS Lambda functions need to be initialized with a specific Python runtime. If additional dependencies are required, such as psycopg2 for database connections, they must be uploaded as a .zip file together with the lambda_function. Pandas should be added as an AWS Lambda layer rather than being included in the uploaded dependencies.
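
As a rough sketch of the packaging step, the script below zips a folder that already contains the function file and its downloaded dependencies, with everything placed at the root of the archive. The folder layout and the `build_deployment_zip` helper are assumptions for illustration; the video referenced in this README remains the authoritative walkthrough.

```python
import zipfile
from pathlib import Path


def build_deployment_zip(source_dir, zip_path):
    """Zip every file under source_dir, using paths relative to that folder."""
    source = Path(source_dir)
    # Note: zip_path should live outside source_dir so the archive
    # does not end up containing itself.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file() and "__pycache__" not in path.parts:
                archive.write(path, path.relative_to(source).as_posix())
    return zip_path
```

A call such as `build_deployment_zip("package", "function.zip")` (both names hypothetical) would produce an archive with the function file and the dependency folders side by side at the top level.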

❗ I highly recommend downloading the dependencies for the second AWS Lambda function in the Python 3.8 runtime, as shown in the video; this resolves several compatibility issues encountered with alternative methods ❗

Extract & Transform

AWS Lambda ❗Python 3.10❗

  • Create an AWS S3 bucket for current weather data
  • Create an AWS Lambda function in Python 3.10 runtime
  • Add the Pandas layer that matches the Python 3.10 runtime
  • Upload the function and its dependencies as a .zip file, as described in the video
  • Create an AWS CloudWatch Event to activate the function at hourly intervals
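
The transform half of the hourly function could look roughly like the sketch below, which turns a list of weather readings into CSV text and derives an hourly object name for the S3 upload. This is an illustration under assumptions, not the project's code: the field names, the file-name pattern, and both helpers are hypothetical, and the standard `csv` module is used here in place of Pandas so the example runs without a Lambda layer.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column set; the project's actual columns may differ.
FIELDS = ["city", "observed_at", "temperature_c", "rainfall_mm"]


def readings_to_csv(readings):
    """Serialize readings to CSV text, keeping only the known fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for reading in readings:
        writer.writerow({field: reading[field] for field in FIELDS})
    return buffer.getvalue()


def object_key(run_time):
    """Build an assumed S3 object name with one file per hourly run."""
    return f"current_weather_{run_time.strftime('%Y-%m-%dT%H')}.csv"
```

Inside the Lambda function, the resulting text would then be uploaded to the bucket created in the first step, for example with boto3's `put_object`, using `object_key(datetime.now(timezone.utc))` as the key.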

Load

AWS Lambda ❗Python 3.8❗

License

MIT

Have fun collecting weather data for your personal analysis! 🌿
