Spidy (/spˈɪdi/) is a simple, easy-to-use command-line web crawler.
Given a list of web links, it uses Python
`requests <http://docs.python-requests.org>`__ to query the
webpages, and `lxml <http://lxml.de/index.html>`__ to extract all
links from each page. Pretty simple!
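For a sense of how that fetch-and-extract step works, here is a minimal
sketch (not spidy's actual code; the ``get_links`` helper and the example
URL are illustrative), assuming ``requests`` and ``lxml`` are installed:

.. code:: python

    import requests
    from lxml import html

    def get_links(url):
        # Illustrative helper, not part of spidy itself.
        # Query the page with requests.
        response = requests.get(url, timeout=10)
        # Parse the returned HTML with lxml.
        tree = html.fromstring(response.content)
        # Turn relative hrefs into absolute URLs so they could be crawled next.
        tree.make_links_absolute(url)
        # Extract the href of every <a> tag on the page.
        return tree.xpath('//a/@href')

    print(get_links('https://example.com'))

A real crawler repeats this over a growing queue of links; see the docs
for what spidy actually does.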
Created by rivermont (/rɪvɜːrmɒnt/) and FalconWarriorr (/fælcʌnraɪjɔːr/),
and developed with help from these awesome people (listed below).

Looking for technical documentation? Check out
`DOCS.md <https://github.com/rivermont/spidy/blob/master/docs/DOCS.md>`__.
Looking to contribute to this project? Have a look at
`CONTRIBUTING.md <https://github.com/rivermont/spidy/blob/master/docs/CONTRIBUTING.md>`__,
then check out the docs.
- The logo was designed by Cutwell.
- 3onyc - PEP8 compliance.
- DeKaN - Getting PyPI packaging to work.
- esouthren - Unit testing.
- j-setiawan - Paths that work on all operating systems.
- kylesalk - Logging file handlers.
- michellemorales - Confirmed OS X support.
- quatroka - Fixed testing bugs.
- stevelle - Respecting robots.txt.
- thatguywiththatname - README link corrections.
We used the GNU General Public License (see LICENSE) as it was the
license that best suited our needs. Honestly, if you link to this repo
and credit rivermont and FalconWarriorr, and you aren't selling spidy
in any way, then we would love for you to distribute it. Thanks!