
Yellow Page Crawler / Spider

A crawler / spider for collecting company data from Yellow Page, written in Python with Scrapy.

Installation guide for packages

You should install Scrapy first; the other packages (e.g. csv, datetime) are part of the Python standard library and are available by default.

You can check the installed packages with the following command:

pip3 list

Output:

Package            Version
------------------ ---------
Scrapy             2.5.0
...                ...

If Scrapy is not on the list, install it with:

pip3 install scrapy

Required packages

  • csv
  • datetime
  • scrapy

Development Environment

Tool     Version
-------  -------
Python   3.9.6

Run the crawler

You can run the crawler with the following command. It crawls Yellow Page with the default keyword 體檢 (health check-up), which is URL-encoded as %E9%AB%94%E6%AA%A2:

scrapy crawl clinic_spider
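
For reference, that percent-encoded form is simply the UTF-8 URL encoding of the keyword; you can verify it with a quick standard-library check (this snippet is not part of the crawler):

from urllib.parse import quote

print(quote("體檢"))  # prints %E9%AB%94%E6%AA%A2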

However, you can pass any keyword to it with the -a custom argument flag (e.g. -a {keyword}); a sketch of how the spider can pick this up follows the command:

scrapy crawl clinic_spider -a {KEYWORD_TO_BE_SEARCHED}
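
The spider source is not reproduced in this README, but a minimal sketch of how a Scrapy spider can accept such an argument is shown below. Note that Scrapy passes -a values in name=value form; the argument name keyword, the default value, the search URL, and the parsing stub are illustrative assumptions, not the project's actual code.

# Minimal sketch of a keyword-driven Scrapy spider (illustrative only).
import scrapy
from urllib.parse import quote

class ClinicSpider(scrapy.Spider):
    name = "clinic_spider"

    def __init__(self, keyword="體檢", *args, **kwargs):
        # Scrapy forwards `-a keyword=...` values as constructor kwargs.
        super().__init__(*args, **kwargs)
        self.keyword = keyword

    def start_requests(self):
        # Hypothetical search URL; the real Yellow Page URL pattern may differ.
        url = "https://www.example-yellowpage.com/search?q=" + quote(self.keyword)
        yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        # Company data extraction is omitted in this sketch.
        pass

With an argument named keyword as above, the command would be written as scrapy crawl clinic_spider -a keyword={KEYWORD_TO_BE_SEARCHED}.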

Output files

The crawler generates a result CSV file named company_YYYYMMDD_HHmmss.csv on each crawl.
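
A short sketch of how such a timestamped filename can be produced with the standard csv and datetime modules (the column names and row data below are placeholders, not the crawler's actual schema):

# Sketch: write results to company_YYYYMMDD_HHmmss.csv (placeholder columns).
import csv
from datetime import datetime

filename = datetime.now().strftime("company_%Y%m%d_%H%M%S.csv")
rows = [{"name": "Example Clinic", "phone": "1234 5678"}]  # placeholder data

with open(filename, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "phone"])
    writer.writeheader()
    writer.writerows(rows)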

Development Roadmap

  • Search: Search by Keyword (Done)
  • Search: Search by Category (Not yet started)
