Final project for MSDS 604 Time Series Analysis.
Click here to view a detailed report.
- Andy Cheon
- Roja Immanni
- Lin Meng
- Kevin Wong
Our goal is to forecast the monthly median sold price of all homes in California from January 2016 through August 2017 (20 forecasted values).
The Zillow dataset consists of four variables:
- Median rental price of all houses in California
- Median mortgage rate
- Unemployment rate
- Median sold price of all houses in California (target)
We use a variety of time series methods, including SARIMA, VAR, and exponential smoothing models. We also perform extensive EDA and differencing to investigate any trend and seasonality present in the data. A SARIMA(1, 1, 2) x (0, 1, 2, 12) model, i.e. non-seasonal order (p, d, q) = (1, 1, 2) and seasonal order (P, D, Q, s) = (0, 1, 2, 12), achieves the lowest validation RMSE (root mean squared error), so we generate our final predictions with this model.
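
To make the differencing and validation RMSE steps concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the series is synthetic (not the Zillow data), the 48/12 train/validation split is an assumption, and the helper names `difference`, `rmse`, and `seasonal_naive_forecast` are hypothetical. It applies the differencing implied by d = 1 and D = 1 with s = 12, then scores a seasonal naive baseline the same way any candidate model would be scored.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def seasonal_naive_forecast(history, horizon, season=12):
    """Baseline forecast: repeat the last observed seasonal cycle."""
    start = len(history) - season
    return [history[start + (h % season)] for h in range(horizon)]


# Synthetic monthly series (assumption, NOT the Zillow data):
# a linear trend plus a 12-month seasonal pattern.
series = [100.0 + 2.0 * t + 10.0 * math.sin(2 * math.pi * t / 12)
          for t in range(60)]

# Seasonal difference (D = 1, s = 12) followed by a first difference (d = 1).
stationary = difference(difference(series, lag=12), lag=1)

# Hold out the last 12 months for validation (assumed split).
train, valid = series[:48], series[48:]
baseline = seasonal_naive_forecast(train, horizon=len(valid))
baseline_rmse = rmse(valid, baseline)
print(f"seasonal naive validation RMSE: {baseline_rmse:.2f}")
```

For this synthetic series the doubly differenced values are essentially zero, which is what you would expect when the trend and seasonality have been removed. A fitted model such as SARIMA would be compared against the other candidates by computing the same `rmse` on the same validation window.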