Building an RSS feed scraper with Python

This project illustrates the basics of web scraping by pulling information from the HackerNews RSS feed. It builds from a simple web scraper in scraping.py into an automated scraping tool in tasks.py.
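As a rough illustration of the first step, here is a minimal sketch of pulling article titles, links, and dates out of an RSS feed using only the standard library. It is not a copy of scraping.py, which may use other libraries. The feed URL and the function names `fetch_feed` and `parse_feed` are assumptions made for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed feed URL for illustration; the project may point at a different one.
FEED_URL = "https://news.ycombinator.com/rss"


def fetch_feed(url=FEED_URL, timeout=10):
    """Download the feed and return the raw XML bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_feed(xml_data):
    """Return one dict per <item> element with its title, link, and date."""
    root = ET.fromstring(xml_data)
    articles = []
    for item in root.iter("item"):
        articles.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return articles
```

Calling `parse_feed(fetch_feed())` returns a list of dictionaries that can be printed or saved to a file. Keeping the download and the parsing in separate functions means the parser can be tried on a saved copy of the feed without any network access.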

Articles

Building an RSS feed scraper with Python is available here.
Automated web scraping with Python and Celery is available here.

Automated scraping commands

The following commands start the scheduled scraping with Celery in tasks.py.

Starting our RabbitMQ server (terminal #1):

rabbitmq-server

Starting the scraping (terminal #2):

celery -A tasks worker -B -l INFO

MIT License.
