This project was completed during my time at the General Assembly Data Science Immersive course in London.
- Capstone.html - Capstone Report and Model Outputs - This is the best way to view it.
- Capstone.ipynb - Capstone Report and Model Outputs - Uses GitHub's preview for Jupyter Notebooks.
- Capstone-presentation.pdf - An overview presentation prepared to explain the project, its results and recommendations to a non-technical audience. It includes speaker notes and animations, so if viewing I would recommend either downloading:
- Created a multi-threaded web scraper using Python libraries such as concurrent.futures (ThreadPoolExecutor).
- Created custom scikit-learn classes for smoothly executing pipelines that combine numerical and text data.
- Utilised a Python script calling the Google PageSpeed Insights API to collect web page speed data for 15,000 articles.
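
As a rough illustration of the multi-threaded scraping approach mentioned above, here is a minimal sketch built on ThreadPoolExecutor. It is not the code used in the capstone: the function names `fetch_html` and `scrape_all`, the `max_workers` default and the timeout are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download one page and return its HTML as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_all(urls, fetch=fetch_html, max_workers=8):
    """Fetch every URL concurrently; return (results, errors) keyed by URL."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per URL and remember which future belongs to which URL.
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # one failed page should not stop the rest
                errors[url] = exc
    return results, errors
```

Threads suit this kind of work because each request spends most of its time waiting on the network. Passing `fetch` as a parameter also makes it possible to try the function with a stand-in fetcher before pointing it at real pages.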