This project was completed as part of my summer internship in 2024. The goal was to develop a machine learning model that predicts house prices from features such as the number of bedrooms, the number of bathrooms, the square footage, and the age of the house. The project uses Python, with scikit-learn for building the machine learning model and Streamlit for creating an interactive web application.
- file.py: This script is responsible for loading the dataset, training a linear regression model, and saving the model as a .pkl file.
- app2.py: This script creates a web application using Streamlit. The user can input house features, and the model predicts the house price based on those inputs.
- house_price_model.pkl: The trained machine learning model.
- house_price_prediction_dataset.csv: The dataset used to train the model.
The dataset (house_price_prediction_dataset.csv) contains the following columns:
- num_bedrooms: Number of bedrooms in the house.
- num_bathrooms: Number of bathrooms in the house.
- square_footage: Total square footage of the house.
- age_of_house: Age of the house in years.
- house_price: The actual house price (used as the target for prediction).
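As a rough illustration of this layout, the sketch below builds a tiny in-memory table with the same five columns and separates the features from the target. The row values are made up for the example and are not taken from the real dataset; with the actual file, one would presumably call pd.read_csv("house_price_prediction_dataset.csv") instead of constructing the frame by hand.

```python
import pandas as pd

# Column names taken from the dataset description above.
FEATURE_COLUMNS = ["num_bedrooms", "num_bathrooms", "square_footage", "age_of_house"]
TARGET_COLUMN = "house_price"

# Made-up rows, for illustration only (not real data from the CSV).
df = pd.DataFrame(
    {
        "num_bedrooms": [2, 3, 4],
        "num_bathrooms": [1, 2, 3],
        "square_footage": [850, 1400, 2100],
        "age_of_house": [30, 12, 5],
        "house_price": [150000, 240000, 380000],
    }
)

X = df[FEATURE_COLUMNS]  # model inputs
y = df[TARGET_COLUMN]    # value to predict

print(X.shape, y.shape)
```

Keeping the feature names in one list makes it easier to pass inputs to the model in the same column order at prediction time.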
- Data Loading and Preprocessing: The dataset is loaded and the relevant features (bedrooms, bathrooms, square footage, and age) are selected for model training. The target variable is the house price.
- Model Training: A Linear Regression model is trained using scikit-learn. The dataset is split into training and testing sets (80% training and 20% testing). After training, the model is saved as house_price_model.pkl using joblib.
- Web Application: A simple web application is created using Streamlit. Users can input the number of bedrooms, bathrooms, square footage, and house age, and the app predicts the house price using the pre-trained model.
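The training and prediction steps described above might look roughly like the following. This is a minimal sketch, not the project's actual file.py: the data is synthetic (generated from an arbitrary linear formula so the script runs without the CSV), the random_state value is an arbitrary choice, and the model is written to a temporary directory rather than next to the script.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

FEATURES = ["num_bedrooms", "num_bathrooms", "square_footage", "age_of_house"]

# Synthetic stand-in for the CSV: 100 rows with a made-up linear price formula.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "num_bedrooms": rng.integers(1, 6, size=100),
        "num_bathrooms": rng.integers(1, 4, size=100),
        "square_footage": rng.integers(500, 4000, size=100),
        "age_of_house": rng.integers(0, 60, size=100),
    }
)
df["house_price"] = (
    20000 * df["num_bedrooms"]
    + 15000 * df["num_bathrooms"]
    + 120 * df["square_footage"]
    - 500 * df["age_of_house"]
    + 50000
)

X, y = df[FEATURES], df["house_price"]

# 80% training / 20% testing, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R^2 on the held-out 20%

# Save with joblib, then reload the way a separate app script would.
model_path = os.path.join(tempfile.mkdtemp(), "house_price_model.pkl")
joblib.dump(model, model_path)
loaded = joblib.load(model_path)

# One hypothetical user input, in the same column order used for training.
user_input = pd.DataFrame([[3, 2, 1500, 10]], columns=FEATURES)
predicted_price = float(loaded.predict(user_input)[0])
print(f"Test R^2: {r2:.3f}, predicted price: {predicted_price:,.0f}")
```

In the Streamlit script, the four input values would presumably come from widgets such as st.number_input rather than being hard-coded, with the reloaded model called in the same way.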
Below is an example of the web application after predicting a house price based on user input:
git clone https://github.com/Parshuramsingh013/House-Price-Prediction