Summary of the notebooks in the root directory:
- Exploratory analysis of the growth in use of various modes of transport since 2000
- Overview of changes to the population, fuel prices and air pollution
- Comparison of the use patterns of hire and other bicycles throughout the day
- Overview of the impact of factors such as day of the week and weather on each group
- Visualisation of the increase in hire bike availability over time
- Hire bike docking stations mapped alongside Tube stations, with their levels of use
- Predictive modelling of daily hire bike use levels via linear regression
- Comparison of model accuracy when using different feature sets
- Exploration of more recent trends such as the increase in dockless hire bikes
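
As a rough illustration of the feature-set comparison mentioned above, the sketch below fits ordinary least squares to different groups of features and reports a held-out R^2 for each. It is not the notebooks' code: the data is synthetic, and the feature names (`temperature`, `rain`, `weekend`), the coefficients used to generate it, and the 80/20 split are all assumptions made for the example.

```python
import numpy as np

# Synthetic stand-in data (NOT the project's dataset); all values are invented.
rng = np.random.default_rng(0)
n_days = 365
temperature = rng.normal(12, 6, n_days)      # hypothetical daily mean temperature
rain = rng.random(n_days) < 0.3              # hypothetical wet-day flag
weekend = (np.arange(n_days) % 7) >= 5       # hypothetical weekend flag
hires = (20000 + 600 * temperature - 5000 * rain - 3000 * weekend
         + rng.normal(0, 2000, n_days))      # hypothetical daily hire counts

features = {
    "temperature": temperature,
    "rain": rain.astype(float),
    "weekend": weekend.astype(float),
}

def fit_and_score(names, train, test):
    """Fit least squares on the train rows and return R^2 on the test rows."""
    # Design matrix: an intercept column followed by the chosen features.
    X = np.column_stack([np.ones(n_days)] + [features[n] for n in names])
    coef, *_ = np.linalg.lstsq(X[train], hires[train], rcond=None)
    pred = X[test] @ coef
    ss_res = np.sum((hires[test] - pred) ** 2)
    ss_tot = np.sum((hires[test] - hires[test].mean()) ** 2)
    return 1 - ss_res / ss_tot

# Random 80/20 split into training and held-out days.
order = rng.permutation(n_days)
train, test = order[:292], order[292:]

scores = {
    "weather only": fit_and_score(["temperature", "rain"], train, test),
    "calendar only": fit_and_score(["weekend"], train, test),
    "weather + calendar": fit_and_score(["temperature", "rain", "weekend"], train, test),
}
for label, r2 in scores.items():
    print(f"{label}: R^2 = {r2:.3f}")
```

Scoring on held-out days rather than the training days keeps the comparison fair when one feature set is larger than another.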
It may take a minute or so to build the repo for interactivity.
All of the collated data required for the primary notebooks is in the data directory.
The original datasets came from various sources found online, including:
- London Datastore
- Transport for London API
- RP5 Weather
- 'What Do They Know' FOI Requests
- Her Majesty's Nautical Almanac Office
- Holiday Calendars
A few of the files were cleaned and lightly edited in spreadsheets. The notebooks in the data-prep directory were then run to produce the collated output in the data directory, which the primary notebooks use.
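
For readers unfamiliar with this kind of collation step, here is a minimal sketch of joining two daily tables on their date column. It does not reproduce the data-prep notebooks: the column names (`date`, `hires`, `temp_c`, `rain_mm`) and the rows are made-up placeholders, and the inner join on date is an assumption about how the sources might be combined.

```python
import csv
import io

# Made-up placeholder rows standing in for two cleaned source files.
hires_csv = """date,hires
2017-01-01,9000
2017-01-02,15000
2017-01-03,21000
"""
weather_csv = """date,temp_c,rain_mm
2017-01-01,4.5,2.0
2017-01-02,6.0,0.0
"""

def read_by_date(text):
    """Index the rows of a CSV by their 'date' column."""
    return {row["date"]: row for row in csv.DictReader(io.StringIO(text))}

def collate(*tables):
    """Inner-join tables on date, keeping only dates present in every table."""
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    merged = []
    for date in sorted(shared):
        row = {}
        for table in tables:
            row.update(table[date])
        merged.append(row)
    return merged

collated = collate(read_by_date(hires_csv), read_by_date(weather_csv))

# Write the collated rows back out as a single CSV (to memory here, not to disk).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(collated[0]))
writer.writeheader()
writer.writerows(collated)
print(out.getvalue())
```

An inner join drops any day missing from one of the sources; an outer join with explicit gaps would be the alternative if those days need to be kept.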
I hope to automate the collection and cleaning of the raw data, or to provide instructions, so that the process can be reproduced in full and updated periodically.
Project undertaken while studying Data Science at General Assembly, London.