```r
# Convert the OD table into LINESTRINGs joining the centroids of each pair of MSOAs
west_yorkshire_od <- od_to_sf(west_yorkshire_od, west_yorkshire_msoa)
#> 0 origins with no match in zone ids
#> 42084 destinations with no match in zone ids
#> points not in od data removed.
```
```r
library(igraph) # shortest_paths() and as_ids()

west_yorkshire_msoa$traffic <- 0 # initialize the traffic column
pb <- txtProgressBar(min = 1, max = nrow(west_yorkshire_od), style = 3)
for (i in seq_len(nrow(west_yorkshire_od))) {
  origin <- west_yorkshire_od[["geo_code1"]][i]
  destination <- west_yorkshire_od[["geo_code2"]][i]
  flow <- west_yorkshire_od[["all"]][i] # the "new" traffic value for this OD line

  if (origin == destination) { # this should be an "intrazonal" flow
    idx <- which(west_yorkshire_msoa$geo_code == origin)
    # previous value plus the new traffic value
    west_yorkshire_msoa$traffic[idx] <- west_yorkshire_msoa$traffic[idx] + flow
    setTxtProgressBar(pb, i)
    next
  }

  # Estimate the shortest path (edge weights are used if the graph has them)
  idx_geo_code <- as_ids(
    shortest_paths(west_yorkshire_graph, from = origin, to = destination)$vpath[[1]]
  )
  idx <- which(west_yorkshire_msoa$geo_code %in% idx_geo_code)
  # previous values plus the new traffic count, divided by the number of
  # MSOAs in the shortest path
  west_yorkshire_msoa$traffic[idx] <- west_yorkshire_msoa$traffic[idx] + flow / length(idx)
  setTxtProgressBar(pb, i)
}
close(pb)
```
```r
# OD line number 53778
west_yorkshire_od[53778, 1:3]
#> Simple feature collection with 1 feature and 3 fields
#> geometry type:  LINESTRING
#> dimension:      XY
#> bbox:           xmin: 415279.3 ymin: 418009.9 xmax: 430001.7 ymax: 433379.7
#> projected CRS:  OSGB 1936 / British National Grid
#>       geo_code1 geo_code2 all                       geometry
#> 53778 E02006875 E02002299  18 LINESTRING (430001.7 433379...
```
Hi! The following reprex should present the datasets.
First, I download MSOA data for the West-Yorkshire region (polygonal data)
The MSOA areas are defined here. Then, I download OD data for the West-Yorkshire region
and transform them, adding a LINESTRING between the centroids of each pair of MSOAs
This is the result
Each row should represent the number of commuters that, every day, go from one MSOA to the same or another MSOA. The data come from the 2011 census. For example, in the following map I represent the OD lines that "arrive" at "E02006875", which is the code for the MSOA at Leeds city center. Clearly, most of the commuters live near their workplace.
Now I download car crashes data for England
and filter the events located in the West Yorkshire region. The polygonal boundary for the West Yorkshire region is created by taking the union (`st_union`) of the MSOA zones, after rounding all coordinates to the nearest 10 metres.
I can now count the occurrences in each MSOA
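The counting step could look something like the sketch below. It is not the code from the reprex: the two square zones and the three crash points are made-up toy data, and `lengths(st_intersects(...))` is just one common way of counting points per polygon with `sf`.

```r
library(sf)

# Two hypothetical unit-square zones side by side (toy data, not real MSOAs)
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}
zones <- st_sf(
  geo_code = c("A", "B"),
  geometry = st_sfc(square(0), square(1))
)

# Three hypothetical crash locations: two inside zone A, one inside zone B
crashes <- st_sfc(
  st_point(c(0.2, 0.5)), st_point(c(0.7, 0.1)), st_point(c(1.5, 0.5))
)

# st_intersects() lists, for each zone, the indices of the points it contains;
# lengths() turns that list into one count per zone
zones$n_crashes <- lengths(st_intersects(zones, crashes))
```

The resulting `n_crashes` column is the kind of variable that can then be mapped as a choropleth.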
and plot the result. The first map represents the location of all car crashes that occurred in West Yorkshire during 2019, while the second map is a choropleth map of car crash counts in each MSOA zone.
I don't live in Yorkshire but I think that the second map clearly highlights the cities of Leeds, Bradford and Wakefield. The same procedure can be repeated for 2011-2018, creating a simple animation. First I need to download and filter car crashes data for 2011-2019. Warning: the following command may take some time to run
Then, I need to estimate the number of car crashes per MSOA per year:
And create the animation
The car crashes are concentrated more or less always in the same areas (for obvious reasons) and they seem to be more and more concentrated around the big cities.
Now I want to estimate a traffic measure for each MSOA zone (to be used as an offset in a car crashes model, off-topic here). The problem with raw OD data is that they ignore the fact that people may travel to their workplace through several MSOAs (@Robinlovelace please correct me here). So I estimate a traffic measure using the following approximation. First I build a neighbours list starting from the MSOA regions:
I think this is more or less equivalent to `queen_nb_west_yorkshire <- st_relate(msoa_west_yorkshire, pattern = "F***T****")` since we set `st_precision(msoa_west_yorkshire) <- units::set_units(10, "m")`. I'm not sure how to visualise `nb` objects with `tmap` (but I think it could be a nice extension, if absent), so I will use base R + `spdep` for the moment:

The lines connect the centroid of each MSOA with the centroids of all neighbouring MSOAs. This structure is useful since it uniquely determines the adjacency matrix of a graph that can be used to estimate the (approximate) shortest path between two MSOAs (i.e. the Origin and the Destination).
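To make the queen-contiguity idea concrete, here is a small sketch on a made-up 2 x 2 grid (the grid is an assumption for illustration only, standing in for the MSOA polygons). It uses the same `"F***T****"` pattern mentioned above: interiors must not intersect, boundaries must.

```r
library(sf)

# A hypothetical 2 x 2 grid of square cells standing in for the MSOA polygons
bbox_poly <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
grid <- st_make_grid(st_sfc(bbox_poly), n = c(2, 2))

# Queen contiguity: interiors do not intersect ("F") but boundaries do ("T"),
# so cells sharing an edge or only a corner both count as neighbours
queen <- st_relate(grid, grid, pattern = "F***T****")
n_neighbours <- lengths(queen)
```

On this grid every cell touches the other three (two along an edge, one at a corner), so the neighbours list has three entries per cell and a cell is never its own neighbour.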
That graph has 299 vertices (the number of MSOA in West Yorkshire) and 1716 edges (the lines visualised in the previous plot). Now I'm going to "decorate" that graph by assigning to each edge a weight that is equal to the geographical distance between the centroids that define the edge. First I need the edgelist:
and then I can estimate the distances
I can also assign a name to each vertex equal to the corresponding MSOA code:
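The three steps (edgelist, distances, vertex names) might be sketched as follows with `igraph`. Everything here is toy data: the four centroids and the neighbour pairs are hypothetical, and plain Euclidean distance on the coordinates is used instead of whatever distance function the reprex used.

```r
library(igraph)

# Hypothetical centroids of four zones (toy coordinates, in metres)
centroids <- data.frame(
  geo_code = c("A", "B", "C", "D"),
  x = c(0, 3, 3, 0),
  y = c(0, 0, 4, 4),
  stringsAsFactors = FALSE
)

# Hypothetical neighbour pairs: the four sides plus the diagonal A-C
edges <- data.frame(
  from = c("A", "B", "C", "D", "A"),
  to   = c("B", "C", "D", "A", "C"),
  stringsAsFactors = FALSE
)

# Vertex names are taken from the first column of `vertices`
g <- graph_from_data_frame(edges, directed = FALSE, vertices = centroids)

# Weight of each edge = Euclidean distance between the centroids it joins
el <- as_edgelist(g) # two-column character matrix of vertex names
from_idx <- match(el[, 1], centroids$geo_code)
to_idx <- match(el[, 2], centroids$geo_code)
E(g)$weight <- sqrt(
  (centroids$x[from_idx] - centroids$x[to_idx])^2 +
    (centroids$y[from_idx] - centroids$y[to_idx])^2
)
```

Because the edge attribute is called `weight`, functions such as `distances()` and `shortest_paths()` pick it up by default.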
This is the result, with an analogous interpretation as before
Now I'm going to loop over all OD lines, calculate the shortest path between the Origin and Destination MSOA, and assign to each MSOA in the shortest path a traffic measure that is proportional to the number of people that commute using that OD line. I know that it sounds a little confusing but I can write it better if you like the idea:
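To show the allocation rule on something small enough to check by hand, here is a toy version of the idea. The three zones, the chain graph and the OD table are all hypothetical; only the rule itself (split each flow equally among the zones on the shortest path, keep intrazonal flows in their own zone) follows the description above.

```r
library(igraph)

# Hypothetical chain of zones A - B - C with unit edge weights (toy data)
g <- graph_from_data_frame(
  data.frame(
    from = c("A", "B"), to = c("B", "C"), weight = c(1, 1),
    stringsAsFactors = FALSE
  ),
  directed = FALSE
)
zones <- data.frame(
  geo_code = c("A", "B", "C"), traffic = 0,
  stringsAsFactors = FALSE
)

# Hypothetical OD table: `all` is the number of commuters on each OD line
od <- data.frame(
  geo_code1 = c("A", "A", "B"),
  geo_code2 = c("A", "C", "C"),
  all       = c(10, 30, 8),
  stringsAsFactors = FALSE
)

for (i in seq_len(nrow(od))) {
  origin <- od$geo_code1[i]
  destination <- od$geo_code2[i]
  if (origin == destination) {
    path <- origin # intrazonal flow: the path is the zone itself
  } else {
    path <- as_ids(shortest_paths(g, from = origin, to = destination)$vpath[[1]])
  }
  idx <- which(zones$geo_code %in% path)
  # each zone on the path receives an equal share of the flow
  zones$traffic[idx] <- zones$traffic[idx] + od$all[i] / length(idx)
}
```

Here the A to C flow of 30 is split three ways and the B to C flow of 8 two ways, so the traffic column sums to the total number of commuters (48).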
For example, if `i = 53778`, then the number of daily commuters is equal to 18, while the shortest path between E02006875 and E02002299 is given by
It would be nice to represent the "shortest path" like the previous map (and check the results) but I'm not sure how to do that right now.
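One possible way to draw such a path, sketched under assumptions: take the zone codes returned by the shortest-path step, look up their centroids in path order, and join them into a single LINESTRING. The centroid coordinates and `path_codes` below are made up, and EPSG:27700 is assumed because the OD lines are in British National Grid.

```r
library(sf)

# Hypothetical centroids (toy values in British National Grid style coordinates)
centroids <- data.frame(
  geo_code = c("A", "B", "C"),
  x = c(430000, 425000, 415000),
  y = c(433000, 428000, 418000),
  stringsAsFactors = FALSE
)

# Hypothetical output of as_ids(shortest_paths(...)$vpath[[1]])
path_codes <- c("A", "B", "C")

# Order the centroids along the path and join them into one LINESTRING
path_rows <- match(path_codes, centroids$geo_code)
path_xy <- unname(as.matrix(centroids[path_rows, c("x", "y")]))
path_line <- st_sfc(st_linestring(path_xy), crs = 27700)
```

The resulting `path_line` could then be overlaid on the MSOA map, next to the straight OD line, to check whether the estimated path looks plausible.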
Now, we can compare the traffic estimates with the car crashes counts:
The scales are quite different (obviously), but they highlight the same areas (which is not incredible, but still...)
I think that the examples presented here are nice since they are related to a "real-world application" with open data (with some licence limitation maybe, I don't know). I think that the visualisation component should be stressed much more (and the maps related to neighborhood matrices should be implemented with `tmap`). This example is probably too difficult for a book not related to road safety or models for car crashes, but if you like it I think we can organize something!

Created on 2020-10-10 by the reprex package (v0.3.0)