My idea for a group project is to retrieve and analyze data about the real estate market, using Zillow as the data source. Zillow's listings are publicly viewable and it also has an official API; any web scraping would need to stay within its terms of use. Given a zip code or city name, we would retrieve that market's active listings through web scraping and/or the Zillow API, and use the data to find trends and correlations between a property's attributes and its price. How does the price per square foot change with the number of bedrooms or bathrooms? Which neighborhoods have had the largest price increase in the last year (if historical data is available)? Which properties are overpriced or, even better, underpriced? There is also a text mining aspect: the listing descriptions could be mined for additional attributes and features to consider.

The project also has the potential to uncover country-wide market trends. Given a list of major cities, which are the most expensive? Which have seen the fastest price increase over time, and which have the highest density of listings per square mile? These are all potential directions we could take this project idea.
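
As a rough illustration of the price-per-square-foot and underpriced-property questions, here is a minimal Python sketch. It assumes the listings have already been collected into simple records; the field names (`id`, `price`, `sqft`, `bedrooms`), the sample values, and the 0.8 cutoff are all made up for illustration and do not reflect Zillow's actual data format or real listings.

```python
from collections import defaultdict
from statistics import median

# Hypothetical, hand-made records for illustration only; not real Zillow data.
SAMPLE_LISTINGS = [
    {"id": "a", "price": 300_000, "sqft": 1500, "bedrooms": 3},
    {"id": "b", "price": 330_000, "sqft": 1500, "bedrooms": 3},
    {"id": "c", "price": 180_000, "sqft": 1500, "bedrooms": 3},
    {"id": "d", "price": 200_000, "sqft": 800, "bedrooms": 2},
    {"id": "e", "price": 240_000, "sqft": 1000, "bedrooms": 2},
]


def price_per_sqft(listing):
    """Return the asking price divided by the living area."""
    return listing["price"] / listing["sqft"]


def median_ppsf_by_bedrooms(listings):
    """Group listings by bedroom count and take the median price per sqft."""
    groups = defaultdict(list)
    for listing in listings:
        groups[listing["bedrooms"]].append(price_per_sqft(listing))
    return {beds: median(values) for beds, values in groups.items()}


def flag_underpriced(listings, ratio=0.8):
    """Return ids of listings priced below `ratio` times the median
    price per sqft of listings with the same bedroom count.
    The 0.8 default is an arbitrary assumption, not a tuned value."""
    medians = median_ppsf_by_bedrooms(listings)
    return [
        listing["id"]
        for listing in listings
        if price_per_sqft(listing) < ratio * medians[listing["bedrooms"]]
    ]


if __name__ == "__main__":
    print(median_ppsf_by_bedrooms(SAMPLE_LISTINGS))
    print(flag_underpriced(SAMPLE_LISTINGS))
```

Comparing each listing against the median for its own bedroom count is only one simple baseline. A regression on several attributes (bathrooms, neighborhood, features mined from the description) would likely give a better estimate of what a property "should" cost.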
Vishal Aiely, Sehee Hwang, Matt Mohandiss, Marc Muszik, Selena Ya Xue