CourseWebCrawler

Description

This project crawls online courses from imooc (http://www.imooc.com) and Class Central (http://www.class-central.com). The crawled content is saved in MongoDB, and course information is displayed by ranking.

Plan

TodoList

Create a Scrapy project to crawl data from the specified URLs.
Save the crawled data in MongoDB.
Display data by ranking.
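
The second and third items above can be sketched as follows. This is a minimal illustration, not the project's actual code: the field names `url` and `rating`, the `MONGO_URI` setting, and the `courses`/`course` database and collection names are all assumptions made for the example. The pipeline takes its collection as a constructor argument so that the logic can be tried without a running MongoDB server.

```python
from operator import itemgetter


class CoursePipeline:
    """Scrapy-style item pipeline; Scrapy calls process_item() per scraped item."""

    def __init__(self, collection):
        # Assumed to be a pymongo Collection, or any object with update_one().
        self.collection = collection

    @classmethod
    def from_crawler(cls, crawler):
        # Imported lazily so the rest of the sketch runs without pymongo.
        import pymongo

        # MONGO_URI and the database/collection names are hypothetical.
        client = pymongo.MongoClient(crawler.settings.get("MONGO_URI"))
        return cls(client["courses"]["course"])

    def process_item(self, item, spider):
        course = dict(item)
        # Upsert on the course URL so a re-crawl updates instead of duplicating.
        self.collection.update_one(
            {"url": course["url"]}, {"$set": course}, upsert=True
        )
        return item


def rank_courses(courses, key="rating", limit=10):
    """Return up to `limit` courses, highest value of `key` first."""
    return sorted(courses, key=itemgetter(key), reverse=True)[:limit]
```

In a real Scrapy project the pipeline would also need to be listed in the `ITEM_PIPELINES` setting. The ranking could alternatively be done in MongoDB with a sorted query; sorting in Python is used here only to keep the sketch self-contained.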

Time Schedule

Stage	Start	End	Goals
1	7/25	7/31	Plan Discussion, Environment Setup, and Proposal Draft Writing
2	8/1	8/7	Implement crawling function, crawl list page and detail pages
3	8/8	8/14	MongoDB Setup and Saving data
4	8/15	8/21	Front-end Display and Document Writing
5	8/22	8/28	User Manual Writing and Presentation Making

Resources

CourseWebCrawler
Scrapy
MongoDB
Django

License

See the LICENSE file for license rights and limitations (MIT).

Project Information

category: full stack
team: CourseWebCrawler
description: a Scrapy project to crawl valuable online courses.
stack: scrapy, mongodb, django
