The goal of this project is to build a machine learning model that accurately predicts house prices using historical data and property features. The project aims to:
- Achieve a Mean Absolute Error (MAE) below $50,000.
- Provide actionable insights for buyers, sellers, and real estate agents.
- Highlight key factors influencing house prices.
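
As a rough illustration of the MAE target above, the sketch below computes the metric in plain Python and compares it against the $50,000 threshold. The helper name `mean_absolute_error`, the constant `MAE_TARGET`, and the sample prices are assumptions made up for illustration. They are not results from this project.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted prices."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one price is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Project goal from the list above: MAE below $50,000.
MAE_TARGET = 50_000

# Made-up sale prices and predictions, for illustration only.
actual_prices = [300_000, 450_000, 520_000]
predicted_prices = [310_000, 430_000, 560_000]

mae = mean_absolute_error(actual_prices, predicted_prices)
meets_target = mae < MAE_TARGET
print(f"MAE: ${mae:,.2f} (target met: {meets_target})")
```

In practice a library routine such as scikit-learn's `sklearn.metrics.mean_absolute_error` would likely replace the hand-written helper. The plain version only shows what the metric measures.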

This project uses the House Sales in King County dataset from Kaggle. The dataset provides detailed information on home sales in King County, Washington, including property features, sale prices, and geographical data.

Variable | Description |
---|---|
id | A unique identifier for a house |
date | The date the house was sold |
price | Sale price of the house (target variable for prediction) |
bedrooms | Number of bedrooms |
bathrooms | Number of bathrooms |
sqft_living | Square footage of the living space |
sqft_lot | Square footage of the lot |
floors | Total number of floors (levels) in the house |
waterfront | Indicator if the house has a view of the waterfront |
view | Number of times the property was viewed |
condition | Overall condition rating of the house |
grade | Overall grade given to the house, based on the King County grading system |
sqft_above | Square footage of the house excluding the basement |
sqft_basement | Square footage of the basement |
yr_built | Year the house was originally built |
yr_renovated | Year the house was last renovated |
zipcode | ZIP code of the location |
lat | Latitude coordinate |
long | Longitude coordinate |
sqft_living15 | Square footage of interior living space in 2015 (may reflect renovations affecting living space) |
sqft_lot15 | Square footage of the lot in 2015 (may reflect renovations affecting lot size) |
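
Before modeling, it can help to confirm that a downloaded file actually contains the columns listed in the table. The following is a minimal sketch using only the standard library. The names `EXPECTED_COLUMNS`, `read_header`, and `check_columns` are hypothetical, and the sample header is made up to show a mismatch. It is not the header of the real file.

```python
import csv
import io

# Column names taken from the variable table above.
EXPECTED_COLUMNS = [
    "id", "date", "price", "bedrooms", "bathrooms", "sqft_living",
    "sqft_lot", "floors", "waterfront", "view", "condition", "grade",
    "sqft_above", "sqft_basement", "yr_built", "yr_renovated", "zipcode",
    "lat", "long", "sqft_living15", "sqft_lot15",
]


def read_header(csv_file):
    """Return the first row of a CSV file object, or an empty list if the file is empty."""
    return next(csv.reader(csv_file), [])


def check_columns(header):
    """Return (missing, unexpected) column names relative to EXPECTED_COLUMNS."""
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    unexpected = [name for name in header if name not in EXPECTED_COLUMNS]
    return missing, unexpected


# Made-up header standing in for a real file opened with open(path, newline="").
sample = io.StringIO("id,date,price,bedrooms,sqft_living,extra_col\n")
missing, unexpected = check_columns(read_header(sample))
print("Missing columns:", missing)
print("Unexpected columns:", unexpected)
```

An empty `missing` list suggests the file matches the table. Anything in `unexpected` may point to a different version of the dataset.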