README.Rmd

---
output: github_document
---


```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# homelocator

## Overview 

<div style="text-align: justify"> 
The goal of `homelocator` is to provide a consistent framework and interface for the adoption of different approaches for identifying home locations for users. With the package, you are able to write structured, algorithmic 'recipes' to identify home locations according to your research requirements. The package also has a number of built-in 'recipes' that have been translated from approaches in the existing literature.
</div>


## Installation

``` r
# Install development version from GitHub
install_github("spatialnetworkslab/homelocator")
```
## Example

These are some basic examples that show you how to use common functions in the package. 

### Validate input dataset

You need to make sure the input dataset includes three essential attributes: 

  - a unique identifier for the person or user 
  - a unique identifier for the spatial location for the data point 
  - a timestamp that reflects the time the data point was created 

You can use `validate_dataset()` to validate your input dataset before starting identifying meaningful locations. In this function, you need to specify the names of three essential attribute that used in your dataset. 

```{r}
# Load homelocator library
library(homelocator)
```

```{r message=FALSE}
# Load other needed libraries
library(tidyverse)
library(here)
```


```{r validate}
# load test sample dataset 
data("test_sample", package = "homelocator")
df_validated <- validate_dataset(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id")
head(df_validated)
```


### Nesting users for parallel computing 

To speed up computing progress, you can nest the validated dataset by user so that the subsequent location inference can be applied to each user at the same time. 

```{r nesting}
df_nested <- nest_verbose(df_validated, c("created_at", "grid_id"))
head(df_nested)
head(df_nested$data[[1]])
```


### Enrich variables from timestamp

Add additional needed varialbes derived from the timestamp column. These are often used/needed as intermediate variables in home location algorithms, such as year, month, day, day of the week and hour of the day, etc. 


```{r}
df_enriched <- enrich_timestamp(df_nested, timestamp = "created_at")
head(df_enriched$data[[1]])
```


### Use built-in recipes 

Current available recipes, where `HMLC` is the default recipe used in `identify_location`: 

- `HMLC`: 
  - Weighs data points across multiple time frames to ‘score’ potentially meaningful locations for each user 
- `FREQ`
  - Selects the most frequently 'visited' location assuming a user is active mainly around their home location. 
- `OSNA`: [Efstathiades et al.2015](https://doi.org/10.1080/10630731003597306)
  - Finds the most 'popular' location during 'rest', 'active' and 'leisure time. Here we focus on 'rest' and 'leisure' time to find the most possible home location for each user. 
- `APDM`: [Ahas et al. 2010](https://doi.org/10.1080/10630731003597306)
  - Calculates the average and standard deviation of start time data points by a single user, in a single location. 

#### HMLC
```{r eval=F}
# default recipe: homelocator -- HMLC
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", show_n_loc = 1, recipe = "HMLC")
```

#### FREQ
```{r eval=F}
# recipe: Frequency -- FREQ
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", 
                  show_n_loc = 1, recipe = "FREQ")
```

#### OSNA
```{r eval=F}
# recipe: Online Social Network Activity -- OSNA
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", 
                  show_n_loc = 1, recipe = "OSNA")
```

#### APDM
```{r eval=F}
# recipe: Online Social Network Activity -- APDM
## APDM recipe strictly returns the most likely home location
## It is important to create your location neighbors table before you use the recipe!!
## example: st_queen <- function(a, b = a) st_relate(a, b, pattern = "F***T****")
##          neighbors <- st_queen(df_sf) ===> convert result to dataframe 
data("df_neighbors", package = "homelocator")
identify_location(test_sample, user = "u_id", timestamp = "created_at", location = "grid_id", 
                  show_n_loc = 1, recipe = "APDM")
```