This project crawls online courses from imooc (http://www.imooc.com) and Class Central (http://www.class-central.com). The crawled content is saved in MongoDB, and the courses are displayed by ranking.
- Create a Scrapy project to crawl data from the specified URLs.
- Save the crawled data in MongoDB.
- Display the data by ranking.
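
The save and ranking goals above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `MongoCoursePipeline`, the helper `rank_courses`, and the item fields `url` and `rating` are all assumptions made for the example. The pipeline only assumes it is handed a collection object that exposes pymongo's `update_one` method.

```python
from operator import itemgetter


class MongoCoursePipeline:
    """Scrapy-style item pipeline that upserts each course into a collection.

    `collection` is any object with pymongo's
    `update_one(filter, update, upsert=...)` method. In a real setup it
    might come from `pymongo.MongoClient()["courses"]["imooc"]`; those
    database and collection names are hypothetical.
    """

    def __init__(self, collection):
        self.collection = collection

    def process_item(self, item, spider):
        # Scrapy calls process_item(item, spider) for every scraped item.
        doc = dict(item)
        # Upsert keyed on the course URL (assumed field) so that a
        # re-crawl updates the existing document instead of duplicating it.
        self.collection.update_one(
            {"url": doc["url"]}, {"$set": doc}, upsert=True
        )
        return item


def rank_courses(courses, key="rating", limit=10):
    """Return the top `limit` courses sorted by `key`, highest first.

    The "rating" default is an assumed field name.
    """
    return sorted(courses, key=itemgetter(key), reverse=True)[:limit]
```

Keying the upsert on the URL is one possible design choice; any field that uniquely identifies a course would serve. For large collections the ranking would more likely be done by a sorted MongoDB query than in Python.
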
Stage | Start | End | Goals |
---|---|---|---|
1 | 7/25 | 7/31 | Plan Discussion, Environment Setup, and Proposal Draft Writing |
2 | 8/1 | 8/7 | Implement crawling function, crawl list page and detail pages |
3 | 8/8 | 8/14 | MongoDB Setup and Saving data |
4 | 8/15 | 8/21 | Front-end Display and Document Writing |
5 | 8/22 | 8/28 | User Manual Writing and Presentation Making |
See the LICENSE file for license rights and limitations (MIT).
- category: full stack
- team: CourseWebCrawler
- description: a Scrapy project to crawl valuable online courses.
- stack: scrapy, mongodb, django