Spider built with Scrapy and scrapy-splash to crawl Airbnb listings
This checklist is for personal use and isn't relevant to using the scraper.
- Spider can successfully parse one page of listings
- Spider can successfully parse multiple/all pages of a designated location
- Spider can take price ranges as arguments (`price_lb` and `price_ub`)
- Spider can take location as an argument
Since Airbnb uses JavaScript to render content, Scrapy on its own is sometimes not enough. We also need Splash, a JavaScript rendering service, together with scrapy-splash, a plugin created by the Scrapy team that integrates it nicely with Scrapy.
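For illustration, here is a minimal sketch of what a Splash-rendered request looks like with scrapy-splash once it is set up (installation steps below); the URL, spider name, and wait time are placeholders, not this project's actual code:

```python
import scrapy
from scrapy_splash import SplashRequest


class ExampleSpider(scrapy.Spider):
    name = 'example'

    def start_requests(self):
        # SplashRequest routes the request through the Splash server, which
        # executes the page's JavaScript before returning the rendered HTML.
        yield SplashRequest(
            'https://www.airbnb.com/s/Cancun/homes',  # placeholder URL
            callback=self.parse,
            args={'wait': 2},  # give the page time to render
        )

    def parse(self, response):
        # response.text now contains the JavaScript-rendered page.
        ...
```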
To install Splash, we need to do several things:

- Install Docker, create a Docker account (if you don't already have one), and run the Splash container in the background before crawling:

  ```
  docker run -p 8050:8050 scrapinghub/splash
  ```

  It might take a few minutes to pull the image the first time you do this. When it's done, you can type `localhost:8050` in your browser to check that it's working. If an interface opens up, you are good to go.

- Install scrapy-splash using pip:

  ```
  pip install scrapy-splash
  ```

See the scrapy-splash repository if you run into any issues.
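After installing, scrapy-splash also has to be enabled in the project's `settings.py`. The snippet below is the standard configuration from the scrapy-splash README (this repository's settings may already include it):

```python
# Standard scrapy-splash configuration, per the scrapy-splash README.
SPLASH_URL = 'http://localhost:8050'  # the Splash instance started via Docker

DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
```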
Run the spider with:

```
scrapy crawl airbnb -o {filename}.json -a city='{cityname}' -a price_lb='{pricelowerbound}' -a price_ub='{priceupperbound}'
```

- `cityname` refers to a valid city name.
- `pricelowerbound` refers to a lower bound for price, from 0 to 999.
- `priceupperbound` refers to an upper bound for price, from 0 to 999. The spider will close if `priceupperbound` is less than `pricelowerbound`.
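For context, Scrapy forwards each `-a key=value` pair to the spider's constructor as a keyword argument. Below is a hypothetical sketch of how these arguments might be consumed; the spider internals and the Airbnb URL shape are assumptions, not this repository's actual code:

```python
import scrapy
from scrapy_splash import SplashRequest


class AirbnbSpider(scrapy.Spider):
    name = 'airbnb'

    def __init__(self, city='', price_lb='0', price_ub='999', *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.city = city
        # -a values arrive as strings, so convert before comparing.
        self.price_lb = int(price_lb)
        self.price_ub = int(price_ub)
        if self.price_ub < self.price_lb:
            # One way to implement the documented behavior of closing
            # on an inverted price range.
            raise ValueError('price_ub must not be less than price_lb')

    def start_requests(self):
        # Assumed search-URL shape; the real spider may build it differently.
        url = (f'https://www.airbnb.com/s/{self.city}/homes'
               f'?price_min={self.price_lb}&price_max={self.price_ub}')
        yield SplashRequest(url, callback=self.parse, args={'wait': 2})

    def parse(self, response):
        ...  # extract listing fields here
```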
Note: Airbnb only returns a maximum of ~300 listings per specific filter (price range). To get more listings, I recommend scraping multiple times using small increments in price and concatenating the datasets.
If you would like to do multiple scrapes over a wide price range (e.g. 10-spaced intervals from 20 to 990), see `cancun.sh`, which I used to crawl a large number of listings for Cancún.
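For reference, a Python equivalent of that approach might look like the sketch below (`cancun.sh` itself is a shell script; the output filenames and city value here are illustrative):

```python
# Run one crawl per 10-wide price band from 20 to 990 and write each band
# to its own JSON file; the files can be concatenated afterwards.
import subprocess

for lb in range(20, 990, 10):
    ub = lb + 10
    subprocess.run(
        [
            'scrapy', 'crawl', 'airbnb',
            '-o', f'cancun_{lb}_{ub}.json',
            '-a', 'city=Cancun',
            '-a', f'price_lb={lb}',
            '-a', f'price_ub={ub}',
        ],
        check=True,
    )
```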
I would like to thank Ahmed Rafik for his guidance and teachings.