Building an RSS feed scraper with Python

This project illustrates the basics of web scraping by pulling information from the HackerNews RSS feed. It builds from a simple web scraper in scraping.py into an automated scraping tool in tasks.py.
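
As a rough illustration of the core idea (not the code in scraping.py, which may use different libraries), the sketch below parses RSS items using only the Python standard library. The `SAMPLE_RSS` string and the `parse_rss_items` function are hypothetical stand-ins; a real scraper would first download the feed XML instead of using a hard-coded sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for the downloaded RSS XML.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First story</title>
      <link>https://example.com/first</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>https://example.com/second</link>
      <pubDate>Tue, 02 Jan 2024 00:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_rss_items(xml_text):
    """Return a list of dicts with the title, link, and pubDate of each <item>."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child element is missing.
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    for article in parse_rss_items(SAMPLE_RSS):
        print(article["title"], article["link"])
```

Each RSS `<item>` becomes one dictionary, which makes it straightforward to write the results to a file or database later.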

Articles

  1. Building an RSS feed scraper with Python is available here.

  2. Automated web scraping with Python and Celery is available here.

Automated scraping commands

The following commands start the scheduled scraping defined with Celery in tasks.py.

Starting our RabbitMQ server (terminal #1):

rabbitmq-server

Starting the scraping (terminal #2):

celery -A tasks worker -B -l INFO

MIT License.