Rain prediction is a long-standing problem for data scientists and statisticians. Despite the work of the past 10 years, no approach predicts rain perfectly. A reliable rain-prediction tool would have many uses, for example in farming and in the aviation industry.
After researching the topic and finding a suitable dataset, I have laid out what I think is the best approach for predicting rain using concepts from Big Data and Machine Learning.
In this project I have used Machine Learning to predict the likelihood of rain in Australia on a given day.
Because the dataset is fairly large, I have used Big Data concepts in this project. PySpark is the main tool for data management and analysis, and MongoDB is used as the database.
In the Machine Learning part I have used multiple algorithms and compared their performance.
The Algorithms used are:
- GBT Classifier
- Logistic Regression
- Random Forest
- Decision Tree
Python 3 is used for this project. Each step is explained in detail in the Jupyter Notebook.