- A Python program runs and extracts the data into a raw storage database. This first extraction is unstructured. Stopping and restarting the extraction is handled in the design of the Python program and its SQL queries.
- The data can then be processed, modelled and stored in a second, structured relational database where querying is optimized.
- This data can now be visualized using JavaScript. The D3 (Data-Driven Documents) JavaScript library was chosen for the visualisation.
First, `fgather.py` was used to retrieve the data source and insert it into a raw unstructured database (`rawfdata.sqlite`).
A method to stop and continue retrieving where the program left off was introduced to handle large data retrieval.
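The stop-and-continue idea can be sketched as follows. This is only a minimal illustration, not the actual contents of `fgather.py`: the table name `raw_records`, the helper names, and the `fake_fetch` stand-in for the real data source are all assumptions. The resume point is derived from the highest id already stored, and each record is committed as soon as it is written, so an interruption loses no completed work.

```python
import sqlite3


def open_raw_db(path=":memory:"):
    """Open the raw database and create the (hypothetical) table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        "id INTEGER PRIMARY KEY, payload TEXT)"
    )
    return conn


def next_id(conn):
    """Resume point: one past the highest id already stored."""
    row = conn.execute("SELECT MAX(id) FROM raw_records").fetchone()
    return 1 if row[0] is None else row[0] + 1


def gather(conn, fetch, limit):
    """Fetch up to `limit` new records, committing after each one."""
    start = next_id(conn)
    stored = 0
    for record_id in range(start, start + limit):
        payload = fetch(record_id)
        if payload is None:  # the source has no more data
            break
        conn.execute(
            "INSERT INTO raw_records (id, payload) VALUES (?, ?)",
            (record_id, payload),
        )
        conn.commit()  # a crash after this point keeps the record
        stored += 1
    return stored


def fake_fetch(record_id):
    """Hypothetical stand-in for the real data source (five records only)."""
    return f"record-{record_id}" if record_id <= 5 else None
```

Calling `gather` a second time on the same database continues from the first missing id rather than starting over. With a file path instead of `:memory:`, the same holds across separate runs of the program.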
Next, the data needed to be restructured, so `fmodel.py` read the raw database and inserted its contents into a new relational database, optimized for data retrieval.
Again a method was introduced to allow the program to pick up from where it left off if interruptions occur.
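A modelling step that can be resumed might look like the sketch below. The schema here (`raw_records`, `cities`, `flights`) and the comma-separated `origin,destination` payload format are assumptions made for illustration; the real `fmodel.py` and its tables may differ. Each modelled row keeps the id of the raw row it came from, so the next run can skip everything up to the highest id already processed.

```python
import sqlite3

RAW_SCHEMA = """
CREATE TABLE IF NOT EXISTS raw_records (id INTEGER PRIMARY KEY, payload TEXT);
"""

MODEL_SCHEMA = """
CREATE TABLE IF NOT EXISTS cities (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE IF NOT EXISTS flights (
    raw_id INTEGER PRIMARY KEY,
    origin_id INTEGER REFERENCES cities(id),
    dest_id INTEGER REFERENCES cities(id)
);
"""


def city_id(conn, name):
    """Return the id for a city, inserting it on first sight."""
    conn.execute("INSERT OR IGNORE INTO cities (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM cities WHERE name = ?", (name,)).fetchone()
    return row[0]


def model(raw_conn, model_conn, batch=100):
    """Model up to `batch` raw rows that have not been processed yet."""
    last = model_conn.execute(
        "SELECT COALESCE(MAX(raw_id), 0) FROM flights"
    ).fetchone()[0]
    rows = raw_conn.execute(
        "SELECT id, payload FROM raw_records WHERE id > ? ORDER BY id LIMIT ?",
        (last, batch),
    ).fetchall()
    for raw_id, payload in rows:
        origin, dest = payload.split(",")  # assumed payload format
        model_conn.execute(
            "INSERT INTO flights (raw_id, origin_id, dest_id) VALUES (?, ?, ?)",
            (raw_id, city_id(model_conn, origin), city_id(model_conn, dest)),
        )
        model_conn.commit()  # progress survives an interruption
    return len(rows)


def demo_databases():
    """Build two in-memory databases with a few made-up raw rows."""
    raw_conn = sqlite3.connect(":memory:")
    raw_conn.executescript(RAW_SCHEMA)
    raw_conn.executemany(
        "INSERT INTO raw_records (payload) VALUES (?)",
        [("Dublin,London",), ("London,Paris",), ("Dublin,London",)],
    )
    raw_conn.commit()
    model_conn = sqlite3.connect(":memory:")
    model_conn.executescript(MODEL_SCHEMA)
    return raw_conn, model_conn
```

Storing city names once in `cities` and referring to them by id from `flights` is one common way to get the "optimized for data retrieval" property mentioned above, since joins and counts then work on small integer keys.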
Once the data was modelled, it could be analysed, for example by plotting flight frequency on a chart or by visualizing connections between cities.
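As an illustration of this kind of analysis, the sketch below counts flights per route and returns the result in a shape that could be serialized to JSON and handed to a D3 chart. The `cities` and `flights` tables, the sample rows, and the `source`/`target`/`count` field names are assumptions for the example, not the project's actual schema or output format.

```python
import json
import sqlite3


def build_demo_db():
    """Create a small in-memory relational database with made-up flights."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE flights (
            raw_id INTEGER PRIMARY KEY,
            origin_id INTEGER REFERENCES cities(id),
            dest_id INTEGER REFERENCES cities(id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO cities (id, name) VALUES (?, ?)",
        [(1, "Dublin"), (2, "London"), (3, "Paris")],
    )
    conn.executemany(
        "INSERT INTO flights (raw_id, origin_id, dest_id) VALUES (?, ?, ?)",
        [(1, 1, 2), (2, 2, 3), (3, 1, 2)],
    )
    conn.commit()
    return conn


def route_frequencies(conn):
    """Count flights per (origin, destination) pair, most frequent first."""
    rows = conn.execute(
        "SELECT o.name, d.name, COUNT(*) AS n "
        "FROM flights f "
        "JOIN cities o ON o.id = f.origin_id "
        "JOIN cities d ON d.id = f.dest_id "
        "GROUP BY o.name, d.name "
        "ORDER BY n DESC, o.name, d.name"
    ).fetchall()
    return [{"source": o, "target": d, "count": n} for o, d, n in rows]


def to_json(conn):
    """Serialize the route counts so a JavaScript front end could load them."""
    return json.dumps(route_frequencies(conn), indent=2)
```

The same list can feed either kind of view: the `count` values drive a frequency chart, while the `source` and `target` pairs describe the links between cities.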