Should we change the way random walks are generated to allow dimensions #53
Replies: 6 comments
-
Here is an attempt at a new function to replace the current: #' Generate Multiple Random Normal Walks in Multiple Dimensions
#'
#' @family Generator Functions
#'
#' @author Steven P. Sanderson II, MPH
#'
#' @description
#' The `random_normal_walk` function generates multiple random walks in 1, 2, or 3 dimensions.
#' Each walk is a sequence of steps where each step is a random draw from a normal distribution.
#' The user can specify the number of walks, the number of steps in each walk, and the
#' parameters of the normal distribution (mean and standard deviation). The function
#' also allows for sampling a proportion of the steps and optionally sampling with replacement.
#'
#' @param .num_walks An integer specifying the number of random walks to generate. Default is 25.
#' @param .n An integer specifying the number of steps in each walk. Default is 100.
#' @param .mu A numeric value indicating the mean of the normal distribution. Default is 0.
#' @param .sd A numeric value indicating the standard deviation of the normal distribution. Default is 0.1.
#' @param .initial_value A numeric value indicating the initial value of the walks. Default is 0.
#' @param .samp A logical value indicating whether to sample the normal distribution values. Default is TRUE.
#' @param .replace A logical value indicating whether sampling is with replacement. Default is TRUE.
#' @param .sample_size A numeric value between 0 and 1 specifying the proportion of `.n` to sample. Default is 0.8.
#' @param .dimensions An integer specifying the number of dimensions (1, 2, or 3). Default is 1.
#'
#' @return A tibble containing the generated random walks with columns depending on the number of dimensions:
#' \itemize{
#' \item `walk_number`: Factor representing the walk number.
#' \item `step_number`: Step index.
#' \item `y`: If `.dimensions = 1`, the value of the walk at each step.
#' \item `x`, `y`: If `.dimensions = 2`, the values of the walk in two dimensions.
#' \item `x`, `y`, `z`: If `.dimensions = 3`, the values of the walk in three dimensions.
#' }
#'
#' The following are also returned based upon how many dimensions there are and could be any of x, y and or z:
#' \itemize{
#' \item `walk_number`: Factor representing the walk number.
#' \item `x`: Step index.
#' \item `y`: Normal distribution values.
#' \item `cum_sum`: Cumulative sum of `y`.
#' \item `cum_prod`: Cumulative product of `y`.
#' \item `cum_min`: Cumulative minimum of `y`.
#' \item `cum_max`: Cumulative maximum of `y`.
#' }
#'
#' The tibble includes attributes for the function parameters.
#'
#' @examples
#' # Generate 10 random walks in 1 dimension with 50 steps each
#' set.seed(123)
#' random_normal_walk(.num_walks = 10, .n = 50)
#'
#' # Generate random walks in 2 dimensions
#' set.seed(123)
#' random_normal_walk(.num_walks = 10, .n = 50, .dimensions = 2)
#'
#' # Generate random walks in 3 dimensions
#' set.seed(123)
#' random_normal_walk(.num_walks = 10, .n = 50, .dimensions = 3)
#'
#' @export
random_normal_walk <- function(.num_walks = 25, .n = 100, .mu = 0, .sd = 0.1,
.initial_value = 0, .samp = TRUE, .replace = TRUE,
.sample_size = 0.8, .dimensions = 1) {
# Defensive checks
if (.num_walks < 0) {
rlang::abort(".num_walks cannot be less than 0", use_cli_format = TRUE)
}
if (.n < 0) {
rlang::abort(".n cannot be less than 0", use_cli_format = TRUE)
}
if (.mu < 0) {
rlang::abort(".mu cannot be less than 0", use_cli_format = TRUE)
}
if (.sd < 0) {
rlang::abort(".sd cannot be less than 0", use_cli_format = TRUE)
}
if (.sample_size < 0 || .sample_size > 1) {
rlang::abort(".sample_size cannot be less than 0 or more than 1",
use_cli_format = TRUE)
}
if (!.dimensions %in% c(1, 2, 3)) {
rlang::abort("Number of dimensions must be 1, 2, or 3.", use_cli = TRUE)
}
# Variables
num_walks <- as.integer(.num_walks)
n <- as.integer(.n)
mu <- as.numeric(.mu)
sd <- as.numeric(.sd)
initial_value <- as.numeric(.initial_value)
replace <- as.logical(.replace)
samp <- as.logical(.samp)
samp_size <- round(.sample_size * n, 0)
periods <- if (samp) samp_size else n
# Define dimension names
dim_names <- switch(.dimensions,
`1` = c("y"),
`2` = c("x", "y"),
`3` = c("x", "y", "z"))
# Function to generate a single random walk
generate_walk <- function(walk_num) {
# Generate random steps for each dimension
rand_steps <- purrr::map(
dim_names,
~ if (samp) {
sample(stats::rnorm(n, mu, sd), size = periods, replace = replace)
} else {
stats::rnorm(periods, mu, sd)
}
)
# Set column names
rand_steps <- stats::setNames(rand_steps, dim_names)
rand_steps <- purrr::map(rand_steps, \(x) dplyr::as_tibble(x)) |>
purrr::list_cbind()
colnames(rand_steps) <- dim_names
rand_steps <- purrr::map(
rand_steps, \(x) x |>
unlist(use.names = FALSE)) |>
dplyr::as_tibble()
# Combine into a tibble
dplyr::tibble(
walk_number = factor(walk_num),
step_number = 1:periods
) |>
dplyr::bind_cols(rand_steps)
}
# Generate all walks
res <- map_dfr(1:num_walks, generate_walk)
res <- res |>
dplyr::group_by(walk_number) |>
std_cum_sum_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_prod_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_min_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_max_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_mean_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
dplyr::ungroup()
# Add attributes
attr(res, "n") <- n
attr(res, "num_walks") <- num_walks
attr(res, "mu") <- mu
attr(res, "sd") <- sd
attr(res, "initial_value") <- initial_value
attr(res, "replace") <- replace
attr(res, "samp") <- samp
attr(res, "samp_size") <- samp_size
attr(res, "periods") <- periods
attr(res, "fns") <- "random_normal_walk"
attr(res, "dimensions") <- .dimensions
# Return the result
return(res)
} Sample Output: > random_normal_walk(.dimensions = 1) |>
+ glimpse()
Rows: 2,000
Columns: 8
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ y <dbl> -0.1002532219, -0.0797601456, 0.1147956309, -0.0575512640, -0.0071935353, -0…
$ cum_sum_y <dbl> -0.100253222, -0.180013367, -0.065217737, -0.122769001, -0.129962536, -0.333…
$ cum_prod_y <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_y <dbl> -0.1002532, -0.1002532, -0.1002532, -0.1002532, -0.1002532, -0.2030538, -0.2…
$ cum_max_y <dbl> -0.10025322, -0.07976015, 0.11479563, 0.11479563, 0.11479563, 0.11479563, 0.…
$ cum_mean_y <dbl> -1.002532e-01, -9.000668e-02, -2.173925e-02, -3.069225e-02, -2.599251e-02, -…
> random_normal_walk(.dimensions = 2) |>
+ glimpse()
Rows: 2,000
Columns: 14
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> 6.738246e-03, 7.403996e-02, -4.919611e-02, -7.715558e-02, -2.005732e-01, -1.…
$ y <dbl> -0.019300710, -0.122338203, 0.045713054, 0.142202671, 0.010560777, -0.160084…
$ cum_sum_x <dbl> 0.006738246, 0.080778210, 0.031582097, -0.045573484, -0.246146678, -0.398783…
$ cum_sum_y <dbl> -0.019300710, -0.141638913, -0.095925859, 0.046276812, 0.056837589, -0.10324…
$ cum_prod_x <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_y <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_x <dbl> 0.006738246, 0.006738246, -0.049196113, -0.077155581, -0.200573194, -0.20057…
$ cum_min_y <dbl> -0.01930071, -0.12233820, -0.12233820, -0.12233820, -0.12233820, -0.16008473…
$ cum_max_x <dbl> 0.006738246, 0.074039964, 0.074039964, 0.074039964, 0.074039964, 0.074039964…
$ cum_max_y <dbl> -0.01930071, -0.01930071, 0.04571305, 0.14220267, 0.14220267, 0.14220267, 0.…
$ cum_mean_x <dbl> 0.006738246, 0.040389105, 0.010527366, -0.011393371, -0.049229336, -0.066463…
$ cum_mean_y <dbl> -0.0193007100, -0.0708194567, -0.0319752864, 0.0115692029, 0.0113675177, -0.…
> random_normal_walk(.dimensions = 3) |>
+ glimpse()
Rows: 2,000
Columns: 20
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> -0.1257393403, 0.1444225094, -0.0614083652, 0.0951444936, 0.0676052630, -0.0…
$ y <dbl> -0.083598254, -0.089906891, -0.055312913, 0.044652238, 0.134638354, 0.123363…
$ z <dbl> 0.0722389305, 0.0023475955, -0.0002119357, 0.0320583149, 0.0707547864, -0.07…
$ cum_sum_x <dbl> -0.12573934, 0.01868317, -0.04272520, 0.05241930, 0.12002456, 0.06812772, 0.…
$ cum_sum_y <dbl> -0.08359825, -0.17350515, -0.22881806, -0.18416582, -0.04952747, 0.07383572,…
$ cum_sum_z <dbl> 0.07223893, 0.07458653, 0.07437459, 0.10643291, 0.17718769, 0.10357439, 0.17…
$ cum_prod_x <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_y <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_z <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_x <dbl> -0.1257393, -0.1257393, -0.1257393, -0.1257393, -0.1257393, -0.1257393, -0.1…
$ cum_min_y <dbl> -0.08359825, -0.08990689, -0.08990689, -0.08990689, -0.08990689, -0.08990689…
$ cum_min_z <dbl> 0.0722389305, 0.0023475955, -0.0002119357, -0.0002119357, -0.0002119357, -0.…
$ cum_max_x <dbl> -0.1257393, 0.1444225, 0.1444225, 0.1444225, 0.1444225, 0.1444225, 0.1444225…
$ cum_max_y <dbl> -0.08359825, -0.08359825, -0.05531291, 0.04465224, 0.13463835, 0.13463835, 0…
$ cum_max_z <dbl> 0.07223893, 0.07223893, 0.07223893, 0.07223893, 0.07223893, 0.07223893, 0.07…
$ cum_mean_x <dbl> -0.125739340, 0.009341585, -0.014241732, 0.013104824, 0.024004912, 0.0113546…
$ cum_mean_y <dbl> -0.083598254, -0.086752573, -0.076272686, -0.046041455, -0.009905493, 0.0123…
$ cum_mean_z <dbl> 0.072238931, 0.037293263, 0.024791530, 0.026608226, 0.035437538, 0.017262398… |
Beta Was this translation helpful? Give feedback.
-
Here is another example of an updated random_normal_walk_with_drift() random_normal_drift_walk <- function(.num_walks = 25, .n = 100, .mu = 0, .sd = 1,
.drift = 0.1, .initial_value = 0, .dimensions = 1) {
# Convert inputs to appropriate types
num_walks <- as.integer(.num_walks)
num_steps <- as.integer(.n)
mu <- as.numeric(.mu)
sd <- as.numeric(.sd)
drift <- as.numeric(.drift)
initial_value <- as.numeric(.initial_value)
# Checks
if (num_walks <= 0) {
rlang::abort("Number of walks must be a positive integer.", use_cli = TRUE)
}
if (num_steps <= 0) {
rlang::abort("Number of steps must be a positive integer.", use_cli = TRUE)
}
if (sd <= 0) {
rlang::abort("Standard deviation must be a positive number.", use_cli = TRUE)
}
if (is.na(mu)) {
rlang::abort("Mean must be a number.", use_cli = TRUE)
}
if (is.na(drift)) {
rlang::abort("Drift must be a number.", use_cli = TRUE)
}
if (is.na(initial_value)) {
rlang::abort("Initial value must be a number.", use_cli = TRUE)
}
if (!.dimensions %in% c(1, 2, 3)) {
rlang::abort("Number of dimensions must be 1, 2, or 3.", use_cli = TRUE)
}
# Create drift sequences for each dimension
dr <- purrr::map(
1:.dimensions,
~ seq(from = drift, to = drift * num_steps, length.out = num_steps)
)
# Define dimension names
dim_names <- switch(.dimensions,
`1` = c("y"),
`2` = c("x", "y"),
`3` = c("x", "y", "z"))
# Function to generate a single random walk with drift for multiple dimensions
single_random_walk_with_drift <- function(num_steps, mu, sd, dr) {
walks_per_dim <- purrr::map2(dr, dim_names, function(drift_seq, dim) {
wn <- stats::rnorm(n = num_steps, mean = mu, sd = sd)
rw <- cumsum(wn)
res <- wn + rw + drift_seq
res
})
# Set Column Names
rand_steps <- stats::setNames(walks_per_dim, dim_names)
rand_steps <- purrr::map(rand_steps, \(x) dplyr::as_tibble(x)) |>
purrr::list_cbind()
colnames(rand_steps) <- dim_names
rand_steps <- purrr::map(
rand_steps, \(x) x |>
unlist(use.names = FALSE)) |>
dplyr::as_tibble()
}
# Generate all walks for each dimension
walks <- purrr::map(
1:num_walks,
~ single_random_walk_with_drift(num_steps, mu, sd, dr)
)
# Create a tibble with all walks for all dimensions
res <- dplyr::bind_rows(walks, .id = "walk_number") |>
dplyr::mutate(walk_number = as.factor(walk_number)) |>
dplyr::group_by(walk_number) |>
dplyr::mutate(step_number = 1:num_steps) |>
dplyr::select(walk_number, step_number, dplyr::all_of(dim_names)) |>
std_cum_sum_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_prod_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_min_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_max_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_mean_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
dplyr::ungroup()
# Add attributes
attr(res, "n") <- num_steps
attr(res, "num_walks") <- num_walks
attr(res, "mu") <- mu
attr(res, "sd") <- sd
attr(res, "drift") <- drift
attr(res, "fns") <- "random_normal_drift_walk"
attr(res, "dimensions") <- .dimensions
return(res)
}
> random_normal_drift_walk() |> dplyr::glimpse()
Rows: 2,500
Columns: 8
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ y <dbl> -0.704207824, 0.005405342, -1.630592090, 2.540589878, -1.391934499, 0.230809…
$ cum_sum_y <dbl> -0.7042078, -0.6988025, -2.3293946, 0.2111953, -1.1807392, -0.9499297, -3.74…
$ cum_prod_y <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_y <dbl> -0.7042078, -0.7042078, -1.6305921, -1.6305921, -1.6305921, -1.6305921, -2.7…
$ cum_max_y <dbl> -0.704207824, 0.005405342, 0.005405342, 2.540589878, 2.540589878, 2.54058987…
$ cum_mean_y <dbl> -0.70420782, -0.34940124, -0.77646486, 0.05279883, -0.23614784, -0.15832161,…
> random_normal_drift_walk(.dimensions = 2) |> dplyr::glimpse()
Rows: 2,500
Columns: 14
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> 1.3000643, -0.5741537, 1.2487231, -1.8004786, 2.5158755, 0.8701070, -0.27102…
$ y <dbl> 0.84614643, 0.90587021, 4.98652494, 2.05560699, 2.91931159, 3.88258039, 0.17…
$ cum_sum_x <dbl> 1.3000643, 0.7259106, 1.9746337, 0.1741550, 2.6900305, 3.5601376, 3.2891090,…
$ cum_sum_y <dbl> 0.8461464, 1.7520166, 6.7385416, 8.7941486, 11.7134602, 15.5960405, 15.77078…
$ cum_prod_x <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_y <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_x <dbl> 1.3000643, -0.5741537, -0.5741537, -1.8004786, -1.8004786, -1.8004786, -1.80…
$ cum_min_y <dbl> 0.8461464, 0.8461464, 0.8461464, 0.8461464, 0.8461464, 0.8461464, 0.1747481,…
$ cum_max_x <dbl> 1.300064, 1.300064, 1.300064, 1.300064, 2.515875, 2.515875, 2.515875, 2.5158…
$ cum_max_y <dbl> 0.8461464, 0.9058702, 4.9865249, 4.9865249, 4.9865249, 4.9865249, 4.9865249,…
$ cum_mean_x <dbl> 1.30006431, 0.36295528, 0.65821123, 0.04353876, 0.53800610, 0.59335626, 0.46…
$ cum_mean_y <dbl> 0.8461464, 0.8760083, 2.2461805, 2.1985371, 2.3426920, 2.5993401, 2.2529698,…
>random_normal_drift_walk(.dimensions = 3) |> dplyr::glimpse()
Rows: 2,500
Columns: 20
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> -0.782059844, -3.404440324, -3.190737459, -0.872538854, -3.275845149, -0.158…
$ y <dbl> -0.943494797, -2.753556318, -2.827901366, -3.170963854, -6.324274173, -6.328…
$ z <dbl> -1.0782543, 0.9098405, -0.9205521, 0.1392489, 0.5606996, -0.3060099, -1.6772…
$ cum_sum_x <dbl> -0.7820598, -4.1865002, -7.3772376, -8.2497765, -11.5256216, -11.6845978, -1…
$ cum_sum_y <dbl> -0.9434948, -3.6970511, -6.5249525, -9.6959163, -16.0201905, -22.3483258, -2…
$ cum_sum_z <dbl> -1.0782543, -0.1684139, -1.0889659, -0.9497171, -0.3890174, -0.6950273, -2.3…
$ cum_prod_x <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_y <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_z <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_x <dbl> -0.7820598, -3.4044403, -3.4044403, -3.4044403, -3.4044403, -3.4044403, -3.4…
$ cum_min_y <dbl> -0.9434948, -2.7535563, -2.8279014, -3.1709639, -6.3242742, -6.3281353, -6.3…
$ cum_min_z <dbl> -1.078254, -1.078254, -1.078254, -1.078254, -1.078254, -1.078254, -1.677289,…
$ cum_max_x <dbl> -0.7820598, -0.7820598, -0.7820598, -0.7820598, -0.7820598, -0.1589761, 1.35…
$ cum_max_y <dbl> -0.9434948, -0.9434948, -0.9434948, -0.9434948, -0.9434948, -0.9434948, -0.9…
$ cum_max_z <dbl> -1.0782543, 0.9098405, 0.9098405, 0.9098405, 0.9098405, 0.9098405, 0.9098405…
$ cum_mean_x <dbl> -0.78205984, -2.09325008, -2.45907921, -2.06244412, -2.30512433, -1.94743296…
$ cum_mean_y <dbl> -0.9434948, -1.8485256, -2.1749842, -2.4239791, -3.2040381, -3.7247210, -3.8…
$ cum_mean_z <dbl> -1.078254341, -0.084206938, -0.362988647, -0.237429272, -0.077803488, -0.115… |
Beta Was this translation helpful? Give feedback.
-
Here is an example of test_discrete_walk <- function(.num_walks = 25, .n = 100, .upper_bound = 1,
.lower_bound = -1, .upper_probability = 0.5,
.initial_value = 100, .dimensions = 1) {
# Variables
num_walks <- as.integer(.num_walks)
periods <- as.integer(.n)
upper_bound <- as.numeric(.upper_bound)
lower_bound <- as.numeric(.lower_bound)
upper_probability <- as.numeric(.upper_probability)
lower_probability <- 1 - upper_probability
initial_value <- as.numeric(.initial_value)
dimensions <- as.integer(.dimensions)
# Checks
if (!is.integer(num_walks) | num_walks < 1) {
rlang::abort(
message = "The number of walks must be an integer greater than 0.",
use_cli_format = TRUE
)
}
if (!is.integer(periods) | periods < 1) {
rlang::abort(
message = "The number of periods must be an integer greater than 0.",
use_cli_format = TRUE
)
}
if (!is.numeric(upper_bound)) {
rlang::abort(
message = "The upper bound must be a numeric value.",
use_cli_format = TRUE
)
}
if (!is.numeric(lower_bound)) {
rlang::abort(
message = "The lower bound must be a numeric value.",
use_cli_format = TRUE
)
}
if (!is.numeric(upper_probability) | upper_probability < 0 | upper_probability > 1) {
rlang::abort(
message = "The upper probability must be a numeric value between 0 and 1.",
use_cli_format = TRUE
)
}
if (!is.numeric(initial_value)) {
rlang::abort(
message = "The initial value must be a numeric value.",
use_cli_format = TRUE
)
}
if (!is.integer(dimensions) | dimensions < 1 | dimensions > 3) {
rlang::abort(
message = "The number of dimensions must be an integer between 1 and 3.",
use_cli_format = TRUE
)
}
if (!.dimensions %in% c(1, 2, 3)) {
rlang::abort(
message = "The number of dimensions must be an integer between 1 and 3.",
use_cli_format = TRUE
)
}
# Define dimension names
dim_names <- switch(.dimensions,
`1` = c("y"),
`2` = c("x", "y"),
`3` = c("x", "y", "z"))
# Generate walks for each dimension
single_discrete_walk <- function(periods, upper_bound, lower_bound,
upper_probability, lower_probability){
rand_steps <- purrr::map(
dim_names,
~ replicate(
n = periods,
sample(
x = c(upper_bound, lower_bound),
size = 1,
prob = c(upper_probability, lower_probability))
)
)
# Set Column Names
rand_steps <- stats::setNames(rand_steps, dim_names)
rand_steps <- purrr::map(rand_steps, \(x) dplyr::as_tibble(x)) |>
purrr::list_cbind()
colnames(rand_steps) <- dim_names
rand_steps <- purrr::map(
rand_steps, \(x) x |>
unlist(use.names = FALSE)) |>
dplyr::as_tibble()
# Combine into a tibble
dplyr::tibble(
walk_number = factor(num_walks),
step_number = 1:periods
) |>
dplyr::bind_cols(rand_steps)
}
# Generate walks
walks <- purrr::map(
1:num_walks,
~ single_discrete_walk(periods, upper_bound, lower_bound,
upper_probability, lower_probability)
)
# Create a tibble with all walks for all dimensions
res <- dplyr::bind_rows(walks, .id = "walk_number") |>
dplyr::mutate(walk_number = as.factor(walk_number)) |>
dplyr::group_by(walk_number) |>
dplyr::select(walk_number, step_number, dplyr::all_of(dim_names)) |>
std_cum_sum_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_prod_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_min_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_max_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_mean_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
dplyr::ungroup()
# Attributes
attr(res, "n") <- periods
attr(res, "num_walks") <- .num_walks
attr(res, "upper_bound") <- upper_bound
attr(res, "lower_bound") <- lower_bound
attr(res, "upper_probability") <- upper_probability
attr(res, "lower_probability") <- lower_probability
attr(res, "initial_value") <- initial_value
attr(res, "fns") <- "discrete_walk"
attr(res, "dimension") <- dimensions
# Return
return(res)
}
> test_discrete_walk() |> dplyr::glimpse()
Rows: 2,500
Columns: 8
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ y <dbl> 1, 1, 1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, …
$ cum_sum_y <dbl> 101, 102, 103, 102, 103, 102, 103, 104, 103, 104, 103, 102, 101, 100, 101, 1…
$ cum_prod_y <dbl> 200, 400, 800, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_y <dbl> 101, 101, 101, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 9…
$ cum_max_y <dbl> 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 1…
$ cum_mean_y <dbl> 101.00000, 101.00000, 101.00000, 100.50000, 100.60000, 100.33333, 100.42857,…
> test_discrete_walk(.dimensions = 2) |> dplyr::glimpse()
Rows: 2,500
Columns: 14
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> 1, 1, 1, 1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, -1, 1, -1, 1, 1, 1, -…
$ y <dbl> 1, -1, 1, 1, -1, 1, 1, 1, 1, 1, -1, 1, -1, -1, -1, -1, 1, -1, -1, 1, 1, -1, …
$ cum_sum_x <dbl> 101, 102, 103, 104, 103, 104, 103, 104, 103, 102, 103, 104, 105, 104, 103, 1…
$ cum_sum_y <dbl> 101, 100, 101, 102, 101, 102, 103, 104, 105, 106, 105, 106, 105, 104, 103, 1…
$ cum_prod_x <dbl> 200, 400, 800, 1600, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_y <dbl> 200, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ cum_min_x <dbl> 101, 101, 101, 101, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, …
$ cum_min_y <dbl> 101, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99,…
$ cum_max_x <dbl> 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 1…
$ cum_max_y <dbl> 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 1…
$ cum_mean_x <dbl> 101.00000, 101.00000, 101.00000, 101.00000, 100.60000, 100.66667, 100.42857,…
$ cum_mean_y <dbl> 101.0000, 100.0000, 100.3333, 100.5000, 100.2000, 100.3333, 100.4286, 100.50…
> test_discrete_walk(.dimensions = 3) |> dplyr::glimpse()
Rows: 2,500
Columns: 20
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> -1, -1, -1, 1, 1, 1, -1, -1, 1, 1, -1, -1, 1, -1, -1, -1, -1, -1, 1, -1, 1, …
$ y <dbl> 1, -1, 1, 1, 1, 1, -1, -1, -1, 1, -1, 1, -1, -1, -1, 1, 1, 1, -1, 1, 1, -1, …
$ z <dbl> 1, 1, 1, -1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, -1, 1, 1, -1, -1, -1, 1, …
$ cum_sum_x <dbl> 99, 98, 97, 98, 99, 100, 99, 98, 99, 100, 99, 98, 99, 98, 97, 96, 95, 94, 95…
$ cum_sum_y <dbl> 101, 100, 101, 102, 103, 104, 103, 102, 101, 102, 101, 102, 101, 100, 99, 10…
$ cum_sum_z <dbl> 101, 102, 103, 102, 103, 104, 103, 102, 101, 100, 101, 102, 101, 102, 103, 1…
$ cum_prod_x <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_prod_y <dbl> 200, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ cum_prod_z <dbl> 200, 400, 800, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ cum_min_x <dbl> 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, …
$ cum_min_y <dbl> 101, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99,…
$ cum_min_z <dbl> 101, 101, 101, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 9…
$ cum_max_x <dbl> 99, 99, 99, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101,…
$ cum_max_y <dbl> 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 1…
$ cum_max_z <dbl> 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 1…
$ cum_mean_x <dbl> 99.00000, 99.00000, 99.00000, 99.50000, 99.80000, 100.00000, 99.85714, 99.75…
$ cum_mean_y <dbl> 101.00000, 100.00000, 100.33333, 100.50000, 100.60000, 100.66667, 100.42857,…
$ cum_mean_z <dbl> 101.0000, 101.0000, 101.0000, 100.5000, 100.6000, 100.6667, 100.4286, 100.25…
> test_discrete_walk(.dimensions = 4)
Error in `test_discrete_walk()`:
! The number of dimensions must be an integer between 1 and 3.
Run `rlang::last_trace()` to see where the error occurred.
Called from: signal_abort(cnd, .file)
tst <- test_discrete_walk(.dimensions = 2, .num_walks = 9, .n = 250)
tst |>
ggplot(aes(x = cum_sum_x, y = cum_sum_y)) +
facet_wrap(~ walk_number, scales = "free") +
geom_path(aes(color = step_number), linewidth = 0.5) +
geom_point(data = tst |>
group_by(walk_number) |>
filter(step_number == min(step_number) | step_number == max(step_number)) |>
ungroup(),
aes(color = step_number), size = 3) +
scale_color_viridis_c(option = "plasma") +
theme_minimal() |
Beta Was this translation helpful? Give feedback.
-
Here is an example of Brownian Motion: #' Brownian Motion
#'
#' @family Generator Functions
#'
#' @author Steven P. Sanderson II, MPH
#'
#' @description Create a Brownian Motion Tibble
#'
#' @details Brownian Motion, also known as the Wiener process, is a
#' continuous-time random process that describes the random movement of particles
#' suspended in a fluid. It is named after the physicist Robert Brown,
#' who first described the phenomenon in 1827.
#'
#' The equation for Brownian Motion can be represented as:
#'
#' W(t) = W(0) + sqrt(t) * Z
#'
#' Where W(t) is the Brownian motion at time t, W(0) is the initial value of the
#' Brownian motion, sqrt(t) is the square root of time, and Z is a standard
#' normal random variable.
#'
#' Brownian Motion has numerous applications, including modeling stock prices in
#' financial markets, modeling particle movement in fluids, and modeling random
#' walk processes in general. It is a useful tool in probability theory and
#' statistical analysis.
#'
#' @param .n Total time of the simulation.
#' @param .num_walks Total number of simulations.
#' @param .delta_time Time step size.
#' @param .initial_value Integer representing the initial value.
#' @param .dimensions The default is 1. Allowable values are 1, 2 and 3.
#'
#' @examples
#' library(ggplot2)
#'
#' set.seed(123)
#' brownian_motion()
#'
#' set.seed(123)
#' brownian_motion() |>
#' ggplot(aes(x = x, y = y, group = walk_number, color = walk_number)) +
#' geom_line() +
#' labs(title = "Brownian Motion", x = "Time", y = "Value") +
#' theme_minimal() +
#' theme(legend.position = "none")
#'
#' @return A tibble containing the generated random walks with columns depending on the number of dimensions:
#' \itemize{
#' \item `walk_number`: Factor representing the walk number.
#' \item `step_number`: Step index.
#' \item `y`: If `.dimensions = 1`, the value of the walk at each step.
#' \item `x`, `y`: If `.dimensions = 2`, the values of the walk in two dimensions.
#' \item `x`, `y`, `z`: If `.dimensions = 3`, the values of the walk in three dimensions.
#' }
#'
#' The following are also returned based upon how many dimensions there are and could be any of x, y and or z:
#' \itemize{
#' \item `cum_sum`: Cumulative sum of `dplyr::all_of(.dimensions)`.
#' \item `cum_prod`: Cumulative product of `dplyr::all_of(.dimensions)`.
#' \item `cum_min`: Cumulative minimum of `dplyr::all_of(.dimensions)`.
#' \item `cum_max`: Cumulative maximum of `dplyr::all_of(.dimensions)`.
#' \item `cum_mean`: Cumulative mean of `dplyr::all_of(.dimensions)`.
#' }
#'
#' @name brownian_motion
NULL
#' @export
#' @rdname brownian_motion
brownian_motion_v2 <- function(.num_walks = 25, .n = 100, .delta_time = 1,
.initial_value = 0, .dimensions = 1) {
# Tidyeval ----
num_sims <- as.numeric(.num_walks)
t <- as.numeric(.n)
initial_value <- as.numeric(.initial_value)
delta_time <- as.numeric(.delta_time)
# Checks
if (!is.numeric(num_sims) | !is.numeric(t) | !is.numeric(initial_value) |
!is.numeric(delta_time)){
rlang::abort(
message = "The parameters `.num_walks`, `.n`, `.delta_time`, and `.initial_value` must be numeric.",
use_cli_format = TRUE
)
}
# .num_walks and .n must be >= 1
if (num_sims < 1 | t < 1){
rlang::abort(
message = "The parameters of `.num_walks` and `.n` must be >= 1.",
use_cli_format = TRUE
)
}
# .delta_time must be > 0
if (delta_time <= 0){
rlang::abort(
message = "The parameter `.delta_time` must be > 0.",
use_cli_format = TRUE
)
}
if (!is.logical(return_tibble)){
rlang::abort(
message = "The parameter `.return_tibble` must be either TRUE/FALSE",
use_cli_format = TRUE
)
}
if (!.dimensions %in% c(1, 2, 3)) {
rlang::abort("Number of dimensions must be 1, 2, or 3.", use_cli = TRUE)
}
# Define dimension names
dim_names <- switch(.dimensions,
`1` = c("y"),
`2` = c("x", "y"),
`3` = c("x", "y", "z"))
# Matrix of random draws - one for each simulation
generate_brownian_motion <- function(num_sims) {
rand_steps <- purrr::map(
dim_names,
~ stats::rnorm(t, mean = 0, sd = sqrt(delta_time))
)
# Set column names
rand_steps <- stats::setNames(rand_steps, dim_names)
rand_steps <- purrr::map(rand_steps, \(x) dplyr::as_tibble(x)) |>
purrr::list_cbind()
colnames(rand_steps) <- dim_names
rand_steps <- purrr::map(
rand_steps, \(x) x |>
unlist(use.names = FALSE)) |>
dplyr::as_tibble()
# Combine into a tibble
dplyr::tibble(
walk_number = factor(num_sims),
step_number = 1:t
) |>
dplyr::bind_cols(rand_steps)
}
# Get the Brownian Motion and convert to price paths
res <- purrr::map(1:num_sims, ~ generate_brownian_motion(.x)) |>
dplyr::bind_rows() |>
dplyr::select(walk_number, step_number, dplyr::any_of(dim_names)) |>
dplyr::group_by(walk_number) |>
std_cum_sum_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_prod_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_min_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_max_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_mean_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
dplyr::ungroup()
# Return ----
attr(res, "n") <- .n
attr(res, "num_walks") <- .num_walks
attr(res, "delta_time") <- .delta_time
attr(res, "initial_value") <- .initial_value
attr(res, "return_tibble") <- .return_tibble
attr(res, "fns") <- "brownian_motion"
attr(res, "dimension") <- 1
return(res)
} Output: > brownian_motion_v2()
# A tibble: 2,500 × 8
walk_number step_number y cum_sum_y cum_prod_y cum_min_y cum_max_y cum_mean_y
<fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 0.463 0.463 0 0.463 0.463 0.463
2 1 2 0.365 0.829 0 0.365 0.463 0.414
3 1 3 1.11 1.93 0 0.365 1.11 0.645
4 1 4 -0.204 1.73 0 -0.204 1.11 0.433
5 1 5 0.701 2.43 0 -0.204 1.11 0.486
6 1 6 0.599 3.03 0 -0.204 1.11 0.505
7 1 7 -0.101 2.93 0 -0.204 1.11 0.419
8 1 8 -0.117 2.81 0 -0.204 1.11 0.352
9 1 9 -0.658 2.15 0 -0.658 1.11 0.239
10 1 10 2.28 4.44 0 -0.658 2.28 0.444
# ℹ 2,490 more rows
# ℹ Use `print(n = ...)` to see more rows
> brownian_motion_v2(.dimensions = 2)
# A tibble: 2,500 × 14
walk_number step_number x y cum_sum_x cum_sum_y cum_prod_x cum_prod_y cum_min_x
<fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 -1.93 0.0384 -1.93 0.0384 0 0 -1.93
2 1 2 2.09 -1.03 0.160 -0.995 0 0 -1.93
3 1 3 -0.914 -1.16 -0.754 -2.16 0 0 -1.93
4 1 4 -1.21 -2.18 -1.96 -4.34 0 0 -1.93
5 1 5 -1.77 -0.184 -3.73 -4.52 0 0 -1.93
6 1 6 -1.25 0.316 -4.97 -4.21 0 0 -1.93
7 1 7 0.789 0.826 -4.19 -3.38 0 0 -1.93
8 1 8 1.22 -0.653 -2.97 -4.03 0 0 -1.93
9 1 9 -0.275 -0.787 -3.24 -4.82 0 0 -1.93
10 1 10 -1.35 -0.971 -4.59 -5.79 0 0 -1.93
# ℹ 2,490 more rows
# ℹ 5 more variables: cum_min_y <dbl>, cum_max_x <dbl>, cum_max_y <dbl>, cum_mean_x <dbl>,
# cum_mean_y <dbl>
# ℹ Use `print(n = ...)` to see more rows
> brownian_motion_v2(.dimensions = 3)
# A tibble: 2,500 × 20
walk_number step_number x y z cum_sum_x cum_sum_y cum_sum_z cum_prod_x
<fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 0.425 -0.189 -0.253 0.425 -0.189 -0.253 0
2 1 2 1.89 0.350 -1.23 2.31 0.161 -1.48 0
3 1 3 -0.383 -1.29 -0.533 1.93 -1.13 -2.01 0
4 1 4 -0.539 1.02 -0.838 1.39 -0.106 -2.85 0
5 1 5 -0.285 -0.706 -0.0792 1.10 -0.812 -2.93 0
6 1 6 -2.65 -0.176 0.00443 -1.55 -0.988 -2.93 0
7 1 7 0.869 0.292 -0.760 -0.676 -0.696 -3.69 0
8 1 8 0.707 0.405 -1.34 0.0307 -0.291 -5.03 0
9 1 9 0.686 0.952 0.176 0.717 0.661 -4.85 0
10 1 10 0.663 -0.578 0.209 1.38 0.0826 -4.65 0
# ℹ 2,490 more rows
# ℹ 11 more variables: cum_prod_y <dbl>, cum_prod_z <dbl>, cum_min_x <dbl>, cum_min_y <dbl>,
# cum_min_z <dbl>, cum_max_x <dbl>, cum_max_y <dbl>, cum_max_z <dbl>, cum_mean_x <dbl>,
# cum_mean_y <dbl>, cum_mean_z <dbl>
# ℹ Use `print(n = ...)` to see more rows |
Beta Was this translation helpful? Give feedback.
-
Geometric Brownian Motion: #' Geometric Brownian Motion
#'
#' @family Generator Functions
#'
#' @author Steven P. Sanderson II, MPH
#'
#' @description Create a Geometric Brownian Motion.
#'
#' @details Geometric Brownian Motion (GBM) is a statistical method for modeling
#' the evolution of a given financial asset over time. It is a type of stochastic
#' process, which means that it is a system that undergoes random changes over
#' time.
#'
#' GBM is widely used in the field of finance to model the behavior of stock
#' prices, foreign exchange rates, and other financial assets. It is based on
#' the assumption that the asset's price follows a random walk, meaning that it
#' is influenced by a number of unpredictable factors such as market trends,
#' news events, and investor sentiment.
#'
#' The equation for GBM is:
#'
#' dS/S = mdt + sdW
#'
#' where S is the price of the asset, t is time, m is the expected return on the
#' asset, s is the volatility of the asset, and dW is a small random change in
#' the asset's price.
#'
#' GBM can be used to estimate the likelihood of different outcomes for a given
#' asset, and it is often used in conjunction with other statistical methods to
#' make more accurate predictions about the future performance of an asset.
#'
#' This function provides the ability of simulating and estimating the parameters
#' of a GBM process. It can be used to analyze the behavior of financial
#' assets and to make informed investment decisions.
#'
#' @param .n Total time of the simulation, how many `n` points in time.
#' @param .num_walks Total number of simulations.
#' @param .delta_time Time step size.
#' @param .initial_value Integer representing the initial value.
#' @param .mu Expected return
#' @param .sigma Volatility
#' @param .dimensions The default is 1. Allowable values are 1, 2 and 3.
#'
#' @examples
#' library(ggplot2)
#'
#' set.seed(123)
#' geometric_brownian_motion()
#'
#' set.seed(123)
#' geometric_brownian_motion() |>
#' ggplot(aes(x = x, y = y, group = walk_number, color = walk_number)) +
#' geom_line() +
#' labs(title = "Geometric Brownian Motion", x = "Time", y = "Value") +
#' theme_minimal() +
#' theme(legend.position = "none")
#'
#' @return A tibble containing the generated random walks with columns depending
#' on the number of dimensions:
#' \itemize{
#' \item `walk_number`: Factor representing the walk number.
#' \item `step_number`: Step index.
#' \item `y`: If `.dimensions = 1`, the value of the walk at each step.
#' \item `x`, `y`: If `.dimensions = 2`, the values of the walk in two dimensions.
#' \item `x`, `y`, `z`: If `.dimensions = 3`, the values of the walk in three dimensions.
#' }
#'
#' The following are also returned based upon how many dimensions there are and
#' could be any of x, y and or z:
#' \itemize{
#' \item `cum_sum`: Cumulative sum of `dplyr::all_of(.dimensions)`.
#' \item `cum_prod`: Cumulative product of `dplyr::all_of(.dimensions)`.
#' \item `cum_min`: Cumulative minimum of `dplyr::all_of(.dimensions)`.
#' \item `cum_max`: Cumulative maximum of `dplyr::all_of(.dimensions)`.
#' \item `cum_mean`: Cumulative mean of `dplyr::all_of(.dimensions)`.
#' }
#'
#' @name geometric_brownian_motion
NULL
#' @export
#' @rdname geometric_brownian_motion
geometric_brownian_motion_v2 <- function(.num_walks = 25, .n = 100,
.mu = 0, .sigma = 0.1,
.initial_value = 100,
.delta_time = 0.003,
.dimensions = 1) {
# Tidyeval ----
# Thank you to https://robotwealth.com/efficiently-simulating-geometric-brownian-motion-in-r/
num_sims <- as.numeric(.num_walks)
t <- as.numeric(.n)
mu <- as.numeric(.mu)
sigma <- as.numeric(.sigma)
initial_value <- as.numeric(.initial_value)
delta_time <- as.numeric(.delta_time)
# Checks ----
if (!is.numeric(num_sims) | !is.numeric(t) | !is.numeric(mu) |
!is.numeric(sigma) | !is.numeric(initial_value) | !is.numeric(delta_time)){
rlang::abort(
message = "The parameters of `.n', `.num_walks`, `.mu`, `.sigma`,
`.initial_value`, and `.delta_time` must be numeric.",
use_cli_format = TRUE
)
}
# .mu and .sigma and .detla_time must be >= 0
if (mu < 0 | sigma < 0 | delta_time < 0){
rlang::abort(
message = "The parameters of `.mu`, `.sigma`, and `.delta_time` must be >= 0.",
use_cli_format = TRUE
)
}
# .num_walks and .n must be >= 1
if (num_sims < 1 | t < 1){
rlang::abort(
message = "The parameters of `.num_walks` and `.n` must be >= 1.",
use_cli_format = TRUE
)
}
if (!.dimensions %in% c(1, 2, 3)) {
rlang::abort("Number of dimensions must be 1, 2, or 3.", use_cli = TRUE)
}
# Define dimension names
dim_names <- switch(.dimensions,
`1` = c("y"),
`2` = c("x", "y"),
`3` = c("x", "y", "z"))
# matrix of random draws - one for each day for each simulation
generate_gbm <- function(num_sims){
rand_steps <- purrr::map(
dim_names,
~ exp((mu - sigma * sigma / 2) * delta_time + sigma * stats::rnorm(t) * sqrt(delta_time)) |>
cumprod()
)
# Set column names
rand_steps <- stats::setNames(rand_steps, dim_names)
rand_steps <- purrr::map(rand_steps, \(x) dplyr::as_tibble(x)) |>
purrr::list_cbind()
colnames(rand_steps) <- dim_names
rand_steps <- purrr::map(
rand_steps, \(x) x |>
unlist(use.names = FALSE)) |>
dplyr::as_tibble()
# Combine into a tibble
dplyr::tibble(
walk_number = factor(num_sims),
step_number = 1:t
) |>
dplyr::bind_cols(rand_steps)
}
res <- purrr::map(1:num_sims, ~ generate_gbm(.x)) |>
dplyr::bind_rows() |>
dplyr::select(walk_number, step_number, dplyr::all_of(dim_names)) |>
dplyr::group_by(walk_number) |>
std_cum_min_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_max_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
std_cum_mean_augment(.value = dplyr::all_of(dim_names), .initial_value = initial_value) |>
dplyr::ungroup()
# Return
attr(res, "n") <- .n
attr(res, "num_walks") <- .num_walks
attr(res, "mean") <- .mu
attr(res, "sigma") <- .sigma
attr(res, "initial_value") <- .initial_value
attr(res, "delta_time") <- .delta_time
attr(res, "fns") <- "geometric_brownian_motion"
attr(res, "dimension") <- .dimensions
return(res)
} Output: > geometric_brownian_motion_v2() |> dplyr::glimpse()
Rows: 2,500
Columns: 6
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ y <dbl> 1.0032995, 0.9950700, 0.9950563, 0.9939038, 0.9992407, 0.9955925, 0.9961966,…
$ cum_min_y <dbl> 101.0033, 100.9951, 100.9951, 100.9939, 100.9939, 100.9939, 100.9939, 100.99…
$ cum_max_y <dbl> 101.0033, 101.0033, 101.0033, 101.0033, 101.0033, 101.0033, 101.0033, 101.00…
$ cum_mean_y <dbl> 101.0033, 100.9992, 100.9978, 100.9968, 100.9973, 100.9970, 100.9969, 100.99…
> geometric_brownian_motion_v2(.dimensions = 2) |> dplyr::glimpse()
Rows: 2,500
Columns: 10
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> 0.9900641, 0.9942326, 0.9967875, 1.0028786, 0.9901350, 0.9843250, 0.9902295,…
$ y <dbl> 0.9949847, 0.9999695, 0.9929160, 1.0004247, 0.9977902, 0.9912838, 0.9940214,…
$ cum_min_x <dbl> 100.9901, 100.9901, 100.9901, 100.9901, 100.9901, 100.9843, 100.9843, 100.98…
$ cum_min_y <dbl> 100.9950, 100.9950, 100.9929, 100.9929, 100.9929, 100.9913, 100.9913, 100.99…
$ cum_max_x <dbl> 100.9901, 100.9942, 100.9968, 101.0029, 101.0029, 101.0029, 101.0029, 101.00…
$ cum_max_y <dbl> 100.9950, 101.0000, 101.0000, 101.0004, 101.0004, 101.0004, 101.0004, 101.00…
$ cum_mean_x <dbl> 100.9901, 100.9921, 100.9937, 100.9960, 100.9948, 100.9931, 100.9927, 100.99…
$ cum_mean_y <dbl> 100.9950, 100.9975, 100.9960, 100.9971, 100.9972, 100.9962, 100.9959, 100.99…
> geometric_brownian_motion_v2(.dimensions = 3) |> dplyr::glimpse()
Rows: 2,500
Columns: 14
$ walk_number <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ step_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 2…
$ x <dbl> 0.9994094, 1.0055157, 1.0106583, 1.0017301, 0.9962237, 0.9957554, 1.0023433,…
$ y <dbl> 0.9929238, 0.9964699, 0.9971869, 1.0032907, 1.0052320, 1.0103826, 1.0215538,…
$ z <dbl> 0.9920106, 0.9802242, 0.9715859, 0.9724813, 0.9825850, 0.9928515, 0.9939654,…
$ cum_min_x <dbl> 100.9994, 100.9994, 100.9994, 100.9994, 100.9962, 100.9958, 100.9958, 100.99…
$ cum_min_y <dbl> 100.9929, 100.9929, 100.9929, 100.9929, 100.9929, 100.9929, 100.9929, 100.99…
$ cum_min_z <dbl> 100.9920, 100.9802, 100.9716, 100.9716, 100.9716, 100.9716, 100.9716, 100.97…
$ cum_max_x <dbl> 100.9994, 101.0055, 101.0107, 101.0107, 101.0107, 101.0107, 101.0107, 101.01…
$ cum_max_y <dbl> 100.9929, 100.9965, 100.9972, 101.0033, 101.0052, 101.0104, 101.0216, 101.02…
$ cum_max_z <dbl> 100.9920, 100.9920, 100.9920, 100.9920, 100.9920, 100.9929, 100.9940, 100.99…
$ cum_mean_x <dbl> 100.9994, 101.0025, 101.0052, 101.0043, 101.0027, 101.0015, 101.0017, 101.00…
$ cum_mean_y <dbl> 100.9929, 100.9947, 100.9955, 100.9975, 100.9990, 101.0009, 101.0039, 101.00…
$ cum_mean_z <dbl> 100.9920, 100.9861, 100.9813, 100.9791, 100.9798, 100.9820, 100.9837, 100.98…
geometric_brownian_motion_v2(.dimensions = 2) |>
ggplot(aes(x = x, y = y)) +
facet_wrap(~ walk_number, scales = "free") +
geom_path() +
theme_minimal() |
Beta Was this translation helpful? Give feedback.
-
I am implementing this into the |
Beta Was this translation helpful? Give feedback.
-
should we change the way random walks are generated in order to allow users to choose a type and thereby only need to call a function like 1D, 2D or 3D
What it really comes down to is how do we efficiently generate random walks of dimensions 1, 2 or 3 without having a function for each dimension
Beta Was this translation helpful? Give feedback.
All reactions