Code to scrape course reviews from Coursera and perform sentiment analysis and topic modeling
Code ready to use
- CourseraClass.py
- scrape_coursera_reviews.py
- scrape_coursera_urls.py
- Review_Sentiment_Analysis.ipynb
- Review_Topic_Modeling.ipynb
Experiments
- coursera_reviews_scraper.ipynb
- coursera_url_scrapper.ipynb
Usage
- To scrape all course URLs, use scrape_coursera_urls.py
- To scrape reviews for the course URLs stored in text files, use scrape_coursera_reviews.py
- To perform sentiment analysis and topic modeling, use the respective Jupyter notebooks (Review_Sentiment_Analysis.ipynb and Review_Topic_Modeling.ipynb)
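
The review scraper works from course URLs that were saved to text files earlier. As a rough illustration of that hand-off, the sketch below gathers URLs from a folder of text files. It is not the code in scrape_coursera_reviews.py: the function name `load_course_urls`, the one-URL-per-line layout, and the `.txt` extension are all assumptions made for the example.

```python
from pathlib import Path


def load_course_urls(folder):
    """Collect unique course URLs from every .txt file in `folder`.

    Assumes (hypothetically) one URL per line; the real files may differ.
    """
    urls = []
    seen = set()
    # Sort the files so the result order is reproducible between runs.
    for path in sorted(Path(folder).glob("*.txt")):
        for line in path.read_text(encoding="utf-8").splitlines():
            url = line.strip()
            # Skip blank lines and anything that is not an http(s) URL.
            if not url.startswith(("http://", "https://")):
                continue
            # Keep only the first occurrence of each URL.
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls
```

Removing duplicates before scraping avoids fetching the same course's reviews twice when a URL appears in more than one file.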
Requires
- Selenium
- Pandas
- scikit-learn
- BeautifulSoup
- A stable internet connection
Notes
- When scraping the whole Coursera site, the server sometimes redirects the browser to a different page than expected, so check the browser driver window from time to time to confirm that the scrape is still on track.
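
The manual check described above could be partly automated by comparing the browser's current address with the expected domain. The helper below is a minimal sketch and is not part of the repository: `is_on_expected_site` is a hypothetical name, and the default domain `coursera.org` is an assumption. It uses only the standard library, so it can be called with Selenium's `driver.current_url` or with any URL string.

```python
from urllib.parse import urlparse


def is_on_expected_site(current_url, expected_domain="coursera.org"):
    """Return True if `current_url` is on `expected_domain` or a subdomain of it."""
    # hostname is None for URLs without a network location (e.g. "about:blank").
    host = (urlparse(current_url).hostname or "").lower()
    return host == expected_domain or host.endswith("." + expected_domain)


# Possible use inside a scraping loop (driver is a Selenium WebDriver):
#     if not is_on_expected_site(driver.current_url):
#         print("Unexpected redirect:", driver.current_url)
```

Comparing the parsed host name rather than searching the whole URL string avoids false positives when the domain only appears in a path or query parameter.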
Sources are cited in the code.