This repository has been archived by the owner on Feb 8, 2018. It is now read-only.

Writing Spiders

andbirkebaek edited this page Apr 18, 2013 · 2 revisions

For now, we recommend looking at the following spider definitions to get a feel for writing them:

Both files are extensively documented, and should give you an idea of what's involved. If you have questions, check the Feedback section and hit us up.

To generate your own spider, use the included generate.py program. From the scrapy_proj directory, run the following (make sure you are in the correct virtualenv):

python generate.py SPIDER_NAME START_URL

This will generate a basic spider named SPIDER_NAME that starts crawling at START_URL. All that remains for you to do is fill in the correct info for scraping the name, image, etc. See "python generate.py --help" for other command line options.
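To give a rough idea of what a generator like this does, here is a minimal sketch that fills a spider skeleton from SPIDER_NAME and START_URL. It is not the project's generate.py: the template text, the render_spider and main helpers, and the class-naming rule are all assumptions made for illustration, and the skeleton uses the current scrapy.Spider base class, which may not match the Scrapy version this project targets.

```python
import argparse
from string import Template

# Illustrative template only (an assumption); the spider that the project's
# generate.py actually emits is not shown on this page and may differ.
SPIDER_TEMPLATE = Template('''\
import scrapy


class ${class_name}(scrapy.Spider):
    name = "${spider_name}"
    start_urls = ["${start_url}"]

    def parse(self, response):
        # TODO: fill in the selectors for the name, image, etc.
        pass
''')


def render_spider(spider_name, start_url):
    """Return spider source code for the given name and start URL."""
    # Hypothetical naming rule: "my_shop" becomes "MyShopSpider".
    class_name = "".join(
        part.capitalize() for part in spider_name.split("_")
    ) + "Spider"
    return SPIDER_TEMPLATE.substitute(
        class_name=class_name,
        spider_name=spider_name,
        start_url=start_url,
    )


def main(argv=None):
    """Parse SPIDER_NAME and START_URL, then print the generated skeleton."""
    parser = argparse.ArgumentParser(
        description="Generate a basic spider skeleton (sketch)."
    )
    parser.add_argument("spider_name", metavar="SPIDER_NAME")
    parser.add_argument("start_url", metavar="START_URL")
    args = parser.parse_args(argv)
    print(render_spider(args.spider_name, args.start_url))
```

Calling main(["my_shop", "http://example.com/"]) prints a skeleton whose parse method is left empty, which corresponds to the part you still have to fill in by hand after generating a spider.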

We use the "fork & pull" development model for collaboration, so if you plan to contribute, fork your own repo off of ours and send us a pull request when you have something ready. Please follow "PEP 8 - Style Guide for Python Code" for any code you write.