Merge pull request #48 from ITSLeeds/max-dist
max_dist and max_dest for points_to_od()
Robinlovelace authored Aug 19, 2024
2 parents 5800895 + 4fbc5ff commit 75055ac
Showing 10 changed files with 206 additions and 72 deletions.
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -19,3 +19,4 @@
^\.binder$
^\.vscode$
^CRAN-SUBMISSION$
^ad-hoc-tests$
5 changes: 3 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: od
Title: Manipulate and Map Origin-Destination Data
Version: 0.4.4
Version: 0.5.0
Authors@R: c(
person("Robin", "Lovelace", email = "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0001-5679-6536")),
@@ -27,6 +27,7 @@ Depends: R (>= 3.4.0)
Imports:
sfheaders,
methods,
nngeo,
vctrs
Suggests:
sf,
@@ -35,6 +36,6 @@ Suggests:
tinytest,
covr,
lwgeom
RoxygenNote: 7.2.3
RoxygenNote: 7.3.2
VignetteBuilder: knitr
Roxygen: list(markdown = TRUE)
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,3 +1,7 @@
# od 0.5.0 (2024-08)

* New `max_dist` argument in `points_to_od()` (also applicable to `points_to_odl()`) to limit the distance between origins and destinations. Credit to Malcolm Morgan @mem48 for this contribution, closing 4-year-old issue #18.

# od 0.4.4 (2024-03)

* Fix minor issue with geometry checking, result of upstream changes
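As a rough illustration of what the `max_dist` entry above describes, the following base R sketch builds every origin-destination pair for a few made-up points and then drops pairs further apart than a threshold. The data frame `pts`, the helper `pairs_within()` and the planar (Euclidean) distances are assumptions for illustration only; the package itself works on `sf` objects and delegates the search to `nngeo::st_nn()`.

```r
# Hypothetical points with planar coordinates (not from the od package)
pts = data.frame(id = c("a", "b", "c"), x = c(0, 3, 10), y = c(0, 4, 0),
                 stringsAsFactors = FALSE)

# Illustrative helper: all OD pairs no further apart than max_dist
pairs_within = function(pts, max_dist = Inf) {
  idx = expand.grid(o = seq_len(nrow(pts)), d = seq_len(nrow(pts)))
  dist = sqrt((pts$x[idx$o] - pts$x[idx$d])^2 +
              (pts$y[idx$o] - pts$y[idx$d])^2)
  keep = dist <= max_dist
  data.frame(O = pts$id[idx$o[keep]], D = pts$id[idx$d[keep]],
             dist = dist[keep], stringsAsFactors = FALSE)
}

od_all = pairs_within(pts)                 # every combination of points
od_near = pairs_within(pts, max_dist = 5)  # only pairs within 5 units
```

With the default `max_dist = Inf` the number of rows grows with the square of the number of points, which is why a distance cap can shrink the result substantially for large inputs.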
10 changes: 8 additions & 2 deletions R/od-funs.R
@@ -77,8 +77,14 @@ od_to_sfc = function(x,

#' Create matrices representing origin-destination coordinates
#'
#' This function takes a wide range of input data types (spatial lines, points or text strings)
#' and returns a data frame of coordinates representing origin (ox, oy) and destination (dx, dy) points.
#' This function takes an 'od data frame' with the first
#' two columns matching IDs of spatial objects, and
#' matches them with objects representing origins and destinations
#' in a wide range of input data types (spatial lines, points or text strings).
#' It returns a data frame of coordinates representing movement between all origin (ox, oy) and destination (dx, dy) points.
#'
#' See [points_to_od()] for a function that creates
#' an 'od data frame' from a set (or two sets) of points.
#' @param p Points representing origins and destinations
#' @param pd Points representing destinations, if different from origin points
#' @param sfnames Should output column names be compatible with the sf package?
107 changes: 66 additions & 41 deletions R/points_to_od.R
@@ -1,20 +1,31 @@
#' Convert a series of points into a dataframe of origins and destinations
#'
#' Takes a series of geographical points and converts them into a data.frame
#' representing the potential flows, or 'spatial interaction', between every combination
#' of points.
#' representing the potential flows, or 'spatial interaction', between every
#' combination of points.
#'
#' `points_to_odl()` generates the same output but returns
#' a geographic object representing desire lines in the class `sf`.
#' `points_to_odl()` generates the same output but returns a geographic object
#' representing desire lines in the class `sf`.
#'
#' @param p A spatial points object or a matrix of coordinates representing points
#' @param pd Optional spatial points object or matrix objects representing destinations
#' @param interzone_only Should the result only include interzonal OD pairs, in which
#' the ID of the origin is different from the ID of the destination zone?
#' `FALSE` by default
#' @param ids_only Should a data frame with only 2 columns (origin and destination IDs)
#' be returned? The default is `FALSE`, meaning the result should also contain the
#' coordinates of the start and end points of each OD pair.
#' @param p A spatial points object or a matrix of coordinates representing
#' points
#' @param pd Optional spatial points object representing
#' destinations.
#' `pd` is ignored if `p` is a matrix.
#' If `pd` is not provided, `p` is used as the destination points.
#' @param interzone_only Should the result only include interzonal OD pairs, in
#' which the ID of the origin is different from the ID of the destination
#' zone? `FALSE` by default
#' @param ids_only Should a data frame with only 2 columns (origin and
#' destination IDs) be returned? The default is `FALSE`, meaning the result
#' should also contain the coordinates of the start and end points of each OD
#' pair.
#' @param max_dist Numeric, maximum distance to consider. Default Inf.
#' Not applicable when `p` is a matrix.
#' @param max_dest The maximum number of destinations for each origin (numeric)
#' sorted from closest to furthest. Default is Inf. Alternative to max_dist
#' for limiting the number of ODs.
#' Not applicable when `p` is a matrix.
#' @export
#' @examples
#' library(sf)
@@ -23,48 +34,61 @@
#' points_to_od(p, ids_only = TRUE)
#' (l = points_to_odl(p, interzone_only = TRUE))
#' plot(l)
#' library(sf) # for subsetting sf objects:
#' points_to_od(od_data_centroids[1:2, ], od_data_centroids[3, ])
#' l = points_to_odl(od_data_centroids[1:2, ], od_data_centroids[3, ])
#' plot(l)
#' (od = points_to_od(p, interzone_only = TRUE))
#' l2 = od_to_sf(od, od_data_centroids)
#' l2$v = 1
#' (l2_oneway = od_oneway(l2))
#' plot(l2)
points_to_od = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALSE) {
#' sf::st_length(l2)
#' # With max_dist:
#' (l3 = points_to_odl(p, max_dist = 10000))
#' sf::st_length(l3)
points_to_od = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALSE,
max_dist = Inf, max_dest = Inf) {
# to work with other classes at some point, possibly, it's a generic:
UseMethod("points_to_od")
}
#' @export
points_to_od.sf = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALSE) {
points_to_od.sf = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALSE,
max_dist = Inf, max_dest = Inf) {

single_geometry = is.null(pd)
if(single_geometry) {
pd = p
ids = p[[1]]
if(any(duplicated(ids))) {
warning("Duplicated ids found in first column of origins")
}
odf = data.frame(
stringsAsFactors = FALSE,
expand.grid(p[[1]], pd[[1]], stringsAsFactors = FALSE)[2:1]
)
} else {
ids = p[[1]]
if(any(duplicated(ids))) {
warning("Duplicated ids found in first column of origins")
}
ids = pd[[1]]
if(any(duplicated(ids))) {

if(any(duplicated(p[[1]]))) {
warning("Duplicated ids found in first column of origins")
}

if(any(sf::st_geometry_type(p) != "POINT")){
message("Converting p to centroids")
suppressWarnings(p <- sf::st_centroid(p))
}

if(!single_geometry){
if(any(duplicated(pd[[1]]))) {
warning("Duplicated ids found in first column of destinations")
}
odf = data.frame(
stringsAsFactors = FALSE,
expand.grid(p[[1]], pd[[1]], stringsAsFactors = FALSE)
)
if(any(sf::st_geometry_type(pd) != "POINT")){
message("Converting pd to centroids")
suppressWarnings(pd <- sf::st_centroid(pd))
}
}

names(odf) = c("O", "D")
if(single_geometry) {
pd = p
}

if(max_dest > nrow(pd)){
max_dest = nrow(pd)
}

nn <- nngeo::st_nn(p, pd, k = max_dest, maxdist = max_dist, returnDist = FALSE,
progress = FALSE)
odf = data.frame(O = rep(p[[1]], lengths(nn)),
D = pd[[1]][unlist(nn, use.names = FALSE)])


if(interzone_only) {
odf = od_interzone(odf)
}
@@ -79,15 +103,16 @@ points_to_od.sf = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALS
cbind(odf, odc)
}
#' @export
points_to_od.matrix = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALSE) {
points_to_od.matrix = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALSE, max_dist = NULL, max_dest = NULL) {
coords_to_od(p, interzone_only = interzone_only, ids_only = ids_only)
}
#' @rdname points_to_od
#' @inheritParams points_to_od
#' @inheritParams odc_to_sf
#' @param ... Additional arguments passed to `points_to_od()`
#' @export
points_to_odl = function(p, pd = NULL, interzone_only = FALSE, ids_only = FALSE, crs = 4326) {
odf = points_to_od(p, pd, interzone_only, ids_only)
points_to_odl = function(p, pd = NULL, crs = 4326, ...) {
odf = points_to_od(p, pd, ...)
odc_to_sf(odf[3:6], d = odf[1:2], crs = crs)
}
#' Convert coordinates into a data frame of origins and destinations
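The new `points_to_od.sf()` body builds the OD table from a list holding, for each origin, the indices of its matched destinations (`rep(p[[1]], lengths(nn))` paired with `unlist(nn)`). The sketch below mimics that step in base R with a hypothetical `nearest_idx()` helper standing in for `nngeo::st_nn()`. It uses planar distances on made-up coordinates, so it approximates the idea rather than reproducing the package's code.

```r
# Made-up origins/destinations on a line (planar coordinates, illustration only)
ids = c("a", "b", "c", "d")
xy = cbind(x = c(0, 1, 5, 6), y = c(0, 0, 0, 0))

# Hypothetical stand-in for nngeo::st_nn(): for each origin, return the indices
# of up to k destinations, closest first, no further away than max_dist
nearest_idx = function(xy, k = Inf, max_dist = Inf) {
  d = as.matrix(dist(xy))
  lapply(seq_len(nrow(d)), function(i) {
    ord = order(d[i, ])
    ord = ord[d[i, ord] <= max_dist]
    ord[seq_len(min(k, length(ord)))]
  })
}

nn = nearest_idx(xy, k = 2, max_dist = 2)

# Same construction as in the diff: repeat each origin once per match
odf = data.frame(O = rep(ids, lengths(nn)),
                 D = ids[unlist(nn, use.names = FALSE)],
                 stringsAsFactors = FALSE)
```

Here `k` plays the role of `max_dest` and `max_dist` the role of the distance cap; either one alone is enough to keep the result from growing to every combination of points.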
1 change: 1 addition & 0 deletions ad-hoc-tests/.gitignore
@@ -0,0 +1 @@
/.quarto/
69 changes: 69 additions & 0 deletions ad-hoc-tests/test-max-dist-speedup.qmd
@@ -0,0 +1,69 @@
---
format: gfm
---

This document tests the new `max_dist` functionality in PR [#48](https://github.com/ITSLeeds/od/pull/48).

Let's start by running the test documented in the PR with the installed (CRAN) version of the package.

```{r}
remotes::install_cran("od")
library(sf)
```

# Test 1: 1000 points


```{r}
p = pct::get_centroids_ew()
p = p[1:1000,]
system.time(r1 <- od::points_to_od(p))
head(r1)
nrow(r1)
```

Now let's test the new `max_dist` functionality.


```{r}
if (!file.exists("DESCRIPTION")) {
setwd("..")
}
devtools::load_all()
system.time(r2 <- points_to_od(p))
head(r2)
nrow(r2)
```


```{r}
system.time(r3 <- points_to_od(p, max_dist = 1000))
head(r3)
nrow(r3)
```

The benchmark shows that the new `max_dist` functionality is faster than the original implementation for large datasets.

Let's compare the results.


```{r}
waldo::compare(head(r1), head(r2))
r2_sorted = r2 |>
dplyr::arrange(desc(O), desc(D))
r1_sorted = r1 |>
dplyr::arrange(desc(O), desc(D))
waldo::compare(head(r1_sorted), head(r2_sorted))
```
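Because the rewritten function can return the same OD pairs in a different row order, the comparison above sorts both results first. A package-free sketch of that idea, using two small made-up data frames (`old` and `new` are hypothetical, not outputs of `points_to_od()`), might look like this:

```r
# Two made-up OD tables containing the same pairs in a different row order
old = data.frame(O = c("a", "a", "b", "b"), D = c("a", "b", "a", "b"),
                 stringsAsFactors = FALSE)
new = old[c(2, 1, 4, 3), ]

# Illustrative helper: sort by O then D and reset the row names
sort_od = function(od) {
  od = od[order(od$O, od$D), , drop = FALSE]
  rownames(od) = NULL
  od
}

# Row-by-row comparison fails before sorting and succeeds afterwards
same_unsorted = identical(old$O, new$O) && identical(old$D, new$D)
same_sorted = identical(sort_od(old), sort_od(new))
```

Resetting the row names matters here: without it, two data frames with identical contents can still differ in their `row.names` attribute after reordering.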

Let's plot the results for the `max_dist = 1000` case.


```{r}
r3_sf = od::od_to_sf(r3, p)
plot(sf::st_geometry(p), col = "red")
plot(sf::st_geometry(r3_sf), add = TRUE)
```

# Test 2: od_coordinates
16 changes: 9 additions & 7 deletions man/coords_to_od.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 9 additions & 2 deletions man/od_coordinates.Rd
