Home
D3 is a JavaScript library for creating beautiful, interactive visualisations in HTML. Although not tied to the Web per se, it is predominantly used for data-driven manipulation of Web content, especially SVG documents embedded in HTML. D3 is the fourth iteration of a visualisation library; its precursors are Prefuse (Java, 2005), Flare (ActionScript, 2007) and Protovis (JavaScript, 2009), in all of which the author of D3 had a leading role.
D3 was created in 2010 by Mike Bostock and sponsored by his employer, The New York Times. It has since received great attention and is used in various scenarios, especially for data visualisations. It has gained considerable traction in the relatively new discipline of data journalism.
Scalable Vector Graphics (SVG) is an XML-based vector image format that supports interactivity and animation. Embedding SVG inside HTML documents makes it possible to dynamically create shapes other than rectangles, such as circles and Bézier curves. SVG can be scripted via the DOM API and styled with CSS. Together with D3, it lets you build interactive graphics in the browser.
To minimize the computation needed on the client, the data is preprocessed and converted to a structure that is best suited to the JavaScript which generates the circular plot. While CSV is well suited to data interchange, JSON (JavaScript Object Notation) can hold structured data. An important step is to filter out countries which have very small migration flows. Otherwise the graphic becomes too complex, with too many elements, and becomes unresponsive. Many fine shapes also make the plot difficult to read and distract from the important flows.
The interchange CSV looks like this:
originregion_name,destinationregion_name,origin_iso,origin_name,destination_iso,destination_name,countryflow_1990,countryflow_1995,countryflow_2000,countryflow_2005
North America,North America,CAN,Canada,CAN,Canada,0,0,0,0
North America,North America,CAN,Canada,USA,United States,1509,190436,238,28
North America,North America,USA,United States,CAN,Canada,56108,635,84430,96074
North America,North America,USA,United States,USA,United States,0,0,0,0
Filtering is done by reading a CSV file which defines which countries are visible (1 = yes; 0 = no), e.g.:
iso,show
USA,1
FIN,0
In this case, USA will be shown, whereas FIN will be hidden. The result is exactly like the input CSV, except that rows where either origin_iso or destination_iso has a 0 in the countries filter CSV are filtered out.
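The filtering step could be sketched as follows. This is only an illustration under stated assumptions, not the library's actual code: parseCsv and filterFlows are hypothetical names, and the parser assumes fields without quotes or embedded commas.

```javascript
// Hypothetical helper: parse a simple CSV (no quoted fields) into row objects
// keyed by the column names of the header line.
function parseCsv(text) {
  const [header, ...lines] = text.trim().split(/\r?\n/);
  const columns = header.split(',');
  return lines.map(line => {
    const values = line.split(',');
    return Object.fromEntries(columns.map((name, i) => [name, values[i]]));
  });
}

// Hypothetical filter: drop every flow row whose origin or destination
// is marked with show = 0 in the countries filter CSV.
function filterFlows(flowsCsv, filterCsv) {
  const hidden = new Set(
    parseCsv(filterCsv)
      .filter(row => row.show === '0')
      .map(row => row.iso)
  );
  return parseCsv(flowsCsv).filter(
    row => !hidden.has(row.origin_iso) && !hidden.has(row.destination_iso)
  );
}
```

In this sketch, countries that do not appear in the filter CSV at all are kept; whether that matches the real preprocessing is an assumption.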
This CSV is then used as input to the compile step. Here we create a data structure which can be consumed by the client in an efficient manner.
The resulting JSON looks like this:
{
"regions": [0, 3, 36, 61, 74, 88, 96, 101, 110, 113],
"names": [
"North America",
"Canada",
"United States",
"Africa",
"Angola",
...
"Venezuela"
],
"matrix": {
"2005": [
[ 139950, ... 8621 ],
[ 51564, ... 458 ],
...
],
"1990": [
...
]
}
}
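A compile step producing a structure of this shape might look like the sketch below. It is not the library's implementation: the function name compile and its parameters are hypothetical, rows are assumed to be objects keyed by the CSV column names, and it assumes that each flow is added at every region/country combination so that totals are available whether a region is shown as a whole or as individual countries.

```javascript
// Hypothetical compile step. `rows` are objects keyed by the CSV column names;
// `years` lists the suffixes of the countryflow_* columns, e.g. ['1990', '2005'].
function compile(rows, years) {
  // Collect the countries of each region in order of first appearance.
  const countriesByRegion = new Map();
  const remember = (region, country) => {
    if (!countriesByRegion.has(region)) countriesByRegion.set(region, []);
    const list = countriesByRegion.get(region);
    if (!list.includes(country)) list.push(country);
  };
  for (const r of rows) {
    remember(r.originregion_name, r.origin_name);
    remember(r.destinationregion_name, r.destination_name);
  }

  // names: each region name followed by its countries.
  // regions: the position of each region name within names.
  const names = [];
  const regions = [];
  const regionIndex = new Map();
  const countryIndex = new Map();
  for (const [region, countries] of countriesByRegion) {
    regionIndex.set(region, names.length);
    regions.push(names.length);
    names.push(region);
    for (const country of countries) {
      countryIndex.set(country, names.length);
      names.push(country);
    }
  }

  // One square matrix per year, indexed like names.
  const matrix = {};
  for (const year of years) {
    const m = names.map(() => names.map(() => 0));
    for (const r of rows) {
      const value = Number(r['countryflow_' + year]) || 0;
      const from = [regionIndex.get(r.originregion_name), countryIndex.get(r.origin_name)];
      const to = [regionIndex.get(r.destinationregion_name), countryIndex.get(r.destination_name)];
      // Assumption: accumulate at region->region, region->country,
      // country->region and country->country.
      for (const i of from) for (const j of to) m[i][j] += value;
    }
    matrix[year] = m;
  }
  return { regions, names, matrix };
}
```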
To reduce the number of chords displayed at any time, the data is accumulated as region flows.
The graph starts collapsed and the user can expand a region to see individual country flows.
At most two regions are expanded at any time: when the user expands a new region while
two are already open, the one that was expanded first gets closed. To achieve this, the region
flows are stored in the flow matrix, each followed by the corresponding country flows. A regions
index keeps track of where the region flows are. Expanding a region is then done by displaying
all flows in the matrix between the current region index and the next region index. To display
labels, region and country names are listed.
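The expand/collapse logic described above can be sketched as follows. All function names are hypothetical and the code is an illustration of the indexing idea, not the library's code: regions is the regions index from the JSON, total is the length of names, and expanded holds the positions (within regions) of the currently expanded regions, oldest first.

```javascript
// Expand a region, keeping at most `limit` regions open.
// The region that was expanded first is closed when the limit is exceeded.
function expandRegion(expanded, region, limit = 2) {
  if (expanded.includes(region)) return expanded;
  const next = [...expanded, region];
  return next.length > limit ? next.slice(1) : next;
}

// Matrix indices to display: a collapsed region contributes its own index,
// an expanded region contributes the country indices up to the next region index.
function visibleIndices(regions, total, expanded) {
  const indices = [];
  regions.forEach((start, k) => {
    const end = k + 1 < regions.length ? regions[k + 1] : total;
    if (expanded.includes(k)) {
      for (let i = start + 1; i < end; i++) indices.push(i);
    } else {
      indices.push(start);
    }
  });
  return indices;
}

// Cut the square sub-matrix for the visible indices out of one year's matrix.
function subMatrix(matrix, indices) {
  return indices.map(i => indices.map(j => matrix[i][j]));
}
```

With the regions index from the JSON above, expanding the first region would select indices 1 and 2 (Canada and United States) in place of index 0 (North America).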
An implementation of these tasks as well as a description and usage instructions can be found in the Circular Migration Plot Library.
While D3 provides helpful layouts for generating chords,
they had to be extended to fit the requirements of migration flow charts.
One major difference between the chart provided by D3 and the needs of migration
flow charts is that migration flow charts display two chords for every pair,
one per direction: one for emigration and one for immigration.
(A chord is a shape which shows a single flow.
Its geometry consists of two arcs connected by two Bézier curves.)
The other difference is that the chords end with a slightly smaller radius,
to distinguish direction.
In order to display tooltips and numbers, the flow data was added to the data generated
by the layout.
The modified chord layout can be found together with the extended chord shape
in the lib/ folder of the Circular Migration Plot Library.
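To illustrate the chord geometry, the sketch below builds an SVG path from two arcs joined by two quadratic Bézier curves through the centre, with the target end drawn at a slightly smaller radius. It is a simplified stand-in, not the extended chord shape from the library: chordPath, its parameters and the default inset of 10 are assumptions, and angles are taken in radians, measured clockwise from 12 o'clock.

```javascript
// Convert an angle (radians, clockwise from 12 o'clock) and a radius to "x,y".
function point(angle, radius) {
  const a = angle - Math.PI / 2;
  return (radius * Math.cos(a)).toFixed(2) + ',' + (radius * Math.sin(a)).toFixed(2);
}

// SVG arc command from the current point to endAngle along a circle of `radius`.
function arcTo(radius, startAngle, endAngle) {
  const largeArc = endAngle - startAngle > Math.PI ? 1 : 0;
  return 'A' + radius + ',' + radius + ' 0 ' + largeArc + ',1 ' + point(endAngle, radius);
}

// Hypothetical chord shape: source arc at full radius, target arc pulled in
// by `inset` so that the direction of the flow is visible.
function chordPath(source, target, radius, inset = 10) {
  const targetRadius = radius - inset;
  return 'M' + point(source.startAngle, radius)
    + arcTo(radius, source.startAngle, source.endAngle)
    + 'Q0,0 ' + point(target.startAngle, targetRadius)
    + arcTo(targetRadius, target.startAngle, target.endAngle)
    + 'Q0,0 ' + point(source.startAngle, radius)
    + 'Z';
}
```

The returned string can be used as the d attribute of an SVG path element, for example inside a group translated to the centre of the plot.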
TODO
- OpenWeb technologies made interactive charts possible / made them widely used
- Interactive visualisations sometimes make data explorable at all