Analyses the Toronto Airbnb dataset from Sept 2020. Performs exploratory analysis and builds a machine learning model to identify the features that most influence Airbnb listing prices.

peterle93/Toronto-Airbnb-Analysis

Table of Contents

  1. Project Motivation
  2. Summary of Results
  3. Medium Blog Post
  4. Libraries
  5. File Descriptions
  6. Acknowledgements

Project Motivation

Toronto Airbnb Dataset Analysis

This project (Write a Data Science Blog Post) is part of the Udacity Data Scientist Nanodegree Program. I used the Toronto Airbnb dataset because Toronto is the city I live in, and I'm interested in using data science techniques to find ways to improve future listings. The questions analyzed are similar to ones that might come up in a business setting, and many of the approaches and skills used here carry over to future work projects.

Using the data, I answered the following questions:

  1. What are the most common amenities in the dataset?
  2. Which neighborhoods have the most listings, and which have the highest review score ratings?
  3. What is the relationship between room type and listing price?
  4. What are the most influential features of the dataset to predict the price of a listing?

The dataset describes Airbnb listing activity in Toronto. The original dataset can be found here: https://www.kaggle.com/robinkongninglo/toronto-airbnb-dataset

Summary of Results

The most common amenities in Toronto listings are:

  1. Wifi
  2. Heating
  3. Smoke Alarm
  4. Essentials
  5. Kitchen
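
A ranking like the one above can be produced by tallying amenities across listings. The following is a minimal sketch, not the notebook's actual code: it assumes a hypothetical `amenities` column in which each row holds a JSON-style list stored as a string, and it uses a few made-up rows in place of the real CSV.

```python
import json
from collections import Counter

import pandas as pd

# Toy stand-in for the listings table; the rows are invented for illustration.
# Assumption: each "amenities" cell is a JSON list stored as a string.
listings = pd.DataFrame({
    "amenities": [
        '["Wifi", "Heating", "Kitchen"]',
        '["Wifi", "Smoke alarm"]',
        '["Wifi", "Heating"]',
    ]
})


def count_amenities(amenities_column):
    """Count how many listings mention each amenity."""
    counts = Counter()
    for raw in amenities_column.dropna():
        # Parse the string into a Python list and add its items to the tally.
        counts.update(json.loads(raw))
    return counts


amenity_counts = count_amenities(listings["amenities"])
top_amenities = amenity_counts.most_common(2)
print(top_amenities)
```

If the real column uses a different encoding (for example, brace-delimited text), only the parsing line inside `count_amenities` would need to change.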

  • Waterfront Communities - The Island has the most listings, followed by Niagara and then Annex.
  • Forest Hill South, Ionview, and High Park-Swansea have the highest review score ratings.
  • Entire home/apt has the highest median price of all room types; Shared room has the lowest.
  • The features with the most influence on listing price are bedrooms, followed by Entire home/apt, then accommodates.
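
The room-type comparison above comes down to a grouped median. Here is a rough sketch under stated assumptions: the column names `room_type` and `price` are hypothetical, prices are assumed to be stored as currency strings, and the rows are made up rather than taken from the dataset.

```python
import pandas as pd

# Toy rows, invented for illustration; column names are assumptions.
listings = pd.DataFrame({
    "room_type": [
        "Entire home/apt", "Entire home/apt",
        "Private room", "Private room",
        "Shared room",
    ],
    "price": ["$150.00", "$1,250.00", "$60.00", "$80.00", "$30.00"],
})

# Strip the dollar sign and thousands separator, then convert to float.
listings["price_num"] = (
    listings["price"].str.replace(r"[$,]", "", regex=True).astype(float)
)

# Median price per room type, highest first.
median_price = (
    listings.groupby("room_type")["price_num"]
    .median()
    .sort_values(ascending=False)
)
print(median_price)
```

The median is used rather than the mean because a handful of very expensive listings can pull the mean upward without reflecting a typical price.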

Medium Blog Post

Here is the Medium blog post I have written: https://le-peter1993.medium.com/data-exploration-for-toronto-airbnb-56b5387d7007

Libraries

The Jupyter notebook uses Python 3 with the following libraries:

  1. NumPy
  2. pandas
  3. scikit-learn
  4. Matplotlib
  5. Seaborn
  6. Folium
  7. collections (standard library)
  8. math (standard library)

File Descriptions

  1. Toronto Airbnb Dataset.ipynb - Jupyter notebook with complete analysis, answers to the questions, explanations and visualisations
  2. listings_sep_09_2020.csv - Original Toronto Airbnb Dataset from Sept 2020 in csv format

Acknowledgements

  1. https://www.kaggle.com/robinkongninglo/toronto-airbnb-dataset
  2. https://towardsdatascience.com/an-extensive-guide-to-exploratory-data-analysis-ddd99a03199e
  3. https://medium.com/@josh_2774/how-do-you-become-a-developer-5ef1c1c68711
