CourseWebCrawler

Description

This project crawls online courses from imooc (http://www.imooc.com) and the Class Central homepage (http://www.class-central.com). The crawled content is saved in MongoDB, and course information is displayed by ranking.

Plan

TodoList

  • Create a Scrapy project to crawl data from the specified URLs.

  • Save the crawled data in MongoDB.

  • Display data by ranking.
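
The save and ranking steps in this list could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class follows the method names of Scrapy's item pipeline interface (`open_spider`, `close_spider`, `process_item`), while the field names `url` and `students`, the database and collection names, and the connection URI are all assumptions made for the example.

```python
from typing import Any, Dict, Iterable, List


class MongoCoursePipeline:
    """Scrapy-style item pipeline that upserts each crawled course into MongoDB."""

    def __init__(self, collection: Any = None,
                 mongo_uri: str = "mongodb://localhost:27017",
                 db_name: str = "courses", coll_name: str = "course") -> None:
        # All names and defaults here are placeholders, not the project's real settings.
        self.collection = collection
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.coll_name = coll_name
        self.client = None

    def open_spider(self, spider: Any) -> None:
        # Connect only if no collection was injected, so the class can be
        # exercised without a running MongoDB instance.
        if self.collection is None:
            import pymongo  # third-party; needed only for a real connection
            self.client = pymongo.MongoClient(self.mongo_uri)
            self.collection = self.client[self.db_name][self.coll_name]

    def close_spider(self, spider: Any) -> None:
        if self.client is not None:
            self.client.close()

    def process_item(self, item: Dict[str, Any], spider: Any) -> Dict[str, Any]:
        # Key on the course URL (assumed field) so that re-crawling updates
        # the existing record instead of inserting a duplicate.
        self.collection.update_one(
            {"url": item["url"]}, {"$set": dict(item)}, upsert=True
        )
        return item


def rank_courses(courses: Iterable[Dict[str, Any]],
                 key: str = "students") -> List[Dict[str, Any]]:
    """Return courses sorted from highest to lowest value of `key`."""
    return sorted(courses, key=lambda c: c.get(key, 0), reverse=True)
```

In a real Scrapy project the pipeline would be enabled through the `ITEM_PIPELINES` setting, and the ranking could instead be done in the database query. Sorting in Python is used here only to keep the sketch self-contained.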

Time Schedule

| Stage | Start | End  | Goals |
|-------|-------|------|-------|
| 1     | 7/25  | 7/31 | Plan Discussion, Environment Setup, and Proposal Draft Writing |
| 2     | 8/1   | 8/7  | Implement crawling function, crawl list page and detail pages |
| 3     | 8/8   | 8/14 | MongoDB Setup and Saving data |
| 4     | 8/15  | 8/21 | Front-end Display and Document Writing |
| 5     | 8/22  | 8/28 | User Manual Writing and Presentation Making |

Resource

License

See the LICENSE file for license rights and limitations (MIT).

Project Information

  • category: full stack
  • team: CourseWebCrawler
  • description: a Scrapy project to crawl valuable online courses.
  • stack: scrapy, mongodb, django