-
Notifications
You must be signed in to change notification settings - Fork 2
/
README.Rmd
129 lines (93 loc) · 4.34 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# homelocator
## Overview
<div style="text-align: justify">
The goal of `homelocator` is to provide a consistent framework and interface for the adoption of different approaches for identifying home locations for users. With the package, you are able to write structured, algorithmic 'recipes' to identify home locations according to your research requirements. The package also has a number of built-in 'recipes' that have been translated from approaches in the existing literature.
</div>
## Installation
``` r
# Install development version from GitHub
install_github("spatialnetworkslab/homelocator")
```
## Example
These are some basic examples that show you how to use common functions in the package.
### Validate input dataset
You need to make sure the input dataset includes three essential attributes:
- a unique identifier for the person or user
- a unique identifier for the spatial location for the data point
- a timestamp that reflects the time the data point was created
You can use `validate_dataset()` to validate your input dataset before starting identifying meaningful locations. In this function, you need to specify the names of three essential attribute that used in your dataset.
```{r}
# Load homelocator library
library(homelocator)
```
```{r message=FALSE}
# Load other needed libraries
library(tidyverse)
library(here)
```
```{r validate}
# load test sample dataset
data("test_sample", package = "homelocator")
df_validated <- validate_dataset(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id")
head(df_validated)
```
### Nesting users for parallel computing
To speed up computing progress, you can nest the validated dataset by user so that the subsequent location inference can be applied to each user at the same time.
```{r nesting}
df_nested <- nest_verbose(df_validated, c("created_at", "grid_id"))
head(df_nested)
head(df_nested$data[[1]])
```
### Enrich variables from timestamp
Add additional needed varialbes derived from the timestamp column. These are often used/needed as intermediate variables in home location algorithms, such as year, month, day, day of the week and hour of the day, etc.
```{r}
df_enriched <- enrich_timestamp(df_nested, timestamp = "created_at")
head(df_enriched$data[[1]])
```
### Use built-in recipes
Current available recipes, where `HMLC` is the default recipe used in `identify_location`:
- `HMLC`:
- Weighs data points across multiple time frames to ‘score’ potentially meaningful locations for each user
- `FREQ`
- Selects the most frequently 'visited' location assuming a user is active mainly around their home location.
- `OSNA`: [Efstathiades et al.2015](https://doi.org/10.1080/10630731003597306)
- Finds the most 'popular' location during 'rest', 'active' and 'leisure time. Here we focus on 'rest' and 'leisure' time to find the most possible home location for each user.
- `APDM`: [Ahas et al. 2010](https://doi.org/10.1080/10630731003597306)
- Calculates the average and standard deviation of start time data points by a single user, in a single location.
#### HMLC
```{r eval=F}
# default recipe: homelocator -- HMLC
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "HMLC")
```
#### FREQ
```{r eval=F}
# recipe: Frequency -- FREQ
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id",
show_n_loc = 1, recipe = "FREQ")
```
#### OSNA
```{r eval=F}
# recipe: Online Social Network Activity -- OSNA
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id",
show_n_loc = 1, recipe = "OSNA")
```
#### APDM
```{r eval=F}
# recipe: Online Social Network Activity -- APDM
## APDM recipe strictly returns the most likely home location
## It is important to create your location neighbors table before you use the recipe!!
## example: st_queen <- function(a, b = a) st_relate(a, b, pattern = "F***T****")
## neighbors <- st_queen(df_sf) ===> convert result to dataframe
data("df_neighbors", package = "homelocator")
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id",
show_n_loc = 1, recipe = "APDM")
```