---
title: "Get started"
output:
  rmarkdown::html_vignette:
    code_folding: "show"
vignette: >
  %\VignetteIndexEntry{Get started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  cache.lazy = FALSE
)
```

```{r setup}
library(gt)
library(gMCP)
library(graphicalMCP)
```

## Installation

graphicalMCP is not on CRAN, so install it from GitHub with

```{r eval=FALSE}
# install.packages("pak")
pak::pak("Gilead-BioStats/graphicalMCP@dev")
```

## Testing a graph

Start with a simple successive graph that has two primary and two secondary hypotheses.

```{r create-graph-1}
ss_graph <- simple_successive_2(c("A1", "B1", "A2", "B2"))
p_values <- c(.012, .005, .013, .0135)
ss_graph
```

### Bonferroni

This graph can be tested most simply with the default weighted Bonferroni test. When testing at the global alpha level 0.025, we can reject hypotheses A1 and B1, but not A2 or B2.

```{r bonferroni-mix-1}
graph_test_closure(ss_graph, p = p_values, alpha = .025)
```

### Simes

The Simes test relies on non-negative dependence between test statistics to let hypotheses borrow weight from one another. Its strength over Bonferroni is most apparent when multiple p-values fall below the global alpha level but above their local alpha in some intersection(s). In the following case, B1 and B2 are not rejected by the Bonferroni test for intersection `B1 ∩ B2` because each p-value is greater than `alpha * w` for that hypothesis. However, the Simes test rejects `B1 ∩ B2` because the weight from B1 is added to the weight for B2.

```{r simes-all-1}
graph_test_closure(ss_graph, p = p_values, alpha = .025, test_types = "s")
```
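
To see the difference concretely, here are the two rejection rules applied by hand to a single two-hypothesis intersection. The weights and p-values below are hypothetical, chosen only to illustrate the comparison; they are not taken from the closure above.

```{r simes-by-hand}
# Hand-applied rejection rules for one two-hypothesis intersection
# (hypothetical weights and p-values, for illustration only)
alpha <- .025
w <- c(h1 = .5, h2 = .5)
p <- c(h1 = .014, h2 = .02)

# Weighted Bonferroni rejects if any p-value is at most alpha * w
any(p <= alpha * w)

# Weighted Simes rejects if any ordered p-value is at most alpha times
# the cumulative weight of hypotheses with smaller or equal p-values
ord <- order(p)
any(p[ord] <= alpha * cumsum(w[ord]))
```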

### Parametric

If the correlation matrix of the test statistics is partially or fully known, a parametric test can be used for any group of hypotheses whose correlations are fully known. Here a `c` value is calculated for B1 & B2 that raises their testing thresholds slightly above the Bonferroni bounds. This example also demonstrates how to use different test types for different groups of hypotheses.

```{r parametric-1}
corr1 <- list(NA, rbind(c(1, .5), c(.5, 1)))
graph_test_closure(ss_graph,
  p = p_values,
  alpha = .025,
  groups = list(1:2, 3:4),
  test_types = c("b", "p"),
  corr = corr1
)
```
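
As a rough sketch of where `c` comes from: it scales the Bonferroni bounds up until the probability of any test statistic exceeding its bound equals the alpha allocated to the group. The snippet below assumes equal weights of 0.5, the correlation above, and the mvtnorm package; the package's internal calculation is equivalent in spirit, but this is not its implementation.

```{r c-value-sketch}
# Find c so that the chance of exceeding any inflated Bonferroni bound
# equals alpha * sum(w); weights/correlation assumed to match B1 & B2
library(mvtnorm)

w <- c(.5, .5)
alpha <- .025
rho <- rbind(c(1, .5), c(.5, 1))

excess <- function(cc) {
  bounds <- qnorm(1 - cc * alpha * w) # inflated one-sided critical values
  (1 - pmvnorm(upper = bounds, corr = rho)) - alpha * sum(w)
}

uniroot(excess, c(1, 1 / max(w)))$root # c is slightly larger than 1
```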

### Adjusted significance vs adjusted p-values

There are two equivalent testing methods: one uses the weights to adjust the significance levels, and the other adjusts the p-values. The adjusted p-values method is sometimes more efficient, so it is the standard method. Additional details about the adjusted p-value calculations can be displayed by setting `verbose = TRUE`.

```{r verbose}
graph_test_closure(
  ss_graph,
  p = p_values,
  alpha = .025,
  corr = corr1,
  groups = list(1:2, 3:4),
  test_types = c("s", "p"),
  verbose = TRUE
)
```

The adjusted weights method tests every hypothesis in the closure rather than summarizing adjusted p-values before testing. Setting `test_values = TRUE` displays the values used in each of these tests, which can give more detailed information than the adjusted p-values about why a hypothesis failed to be rejected. However, it comes at some cost in computation time. Note that the `Intersection` columns in the adjusted p-value and adjusted weights sections serve as a persistent index between the two sets of output.

```{r test_values}
graph_test_closure(
  ss_graph,
  p = p_values,
  alpha = .025,
  corr = corr1,
  groups = list(1:2, 3:4),
  test_types = c("s", "p"),
  verbose = TRUE,
  test_values = TRUE
)
```

### Bonferroni shortcut

Bonferroni-only testing admits a shortcut that avoids testing the full closure of the graph. Call `graph_test_shortcut()` to use this method. In this case, `verbose = TRUE` adds the intermediate graphs after each deletion, and `test_values = TRUE` gives test details in the order in which hypotheses are deleted. Overall hypothesis rejection results can be found in the `Test summary` section.

```{r sequential}
graph_test_shortcut(
  graph = ss_graph,
  p = p_values,
  alpha = .025,
  verbose = TRUE,
  test_values = TRUE
)
```

## Test print options

The print method for test results includes a couple of additional options. Each section of the results is indented 2 spaces by default for readability; this can be adjusted with `indent`. (Note that some portions of the output require a minimum indent of 1 space and will ignore an indent of 0.) Numeric values are rounded to 6 significant figures to limit the amount of space used, but this can be changed with the `precision` argument. Only the printed format is affected, not the underlying values.

```{r print-indent}
mix_test <- graph_test_closure(
  ss_graph,
  p = p_values,
  alpha = .025,
  groups = list(c(1, 4), 2:3),
  test_types = c("b", "s"),
  verbose = TRUE,
  test_values = TRUE
)
print(mix_test)
print(mix_test, indent = 6, precision = 10)
```

## Power simulations

It's not always obvious from a given graph how easy or difficult it will be to reject each hypothesis. One way to understand this better is to run a power simulation. The essence of a power simulation is to generate many sets of p-values from a chosen distribution, then test the graph against each set to see how it performs.
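
Conceptually this is just a loop: simulate test statistics, convert them to p-values, test the graph, and tally rejections. The sketch below assumes independent normal test statistics with mean 2 under the alternative, and assumes the test result object exposes its rejection vector as `$outputs$rejected`; `graph_calculate_power()` does the same job far more efficiently.

```{r power-by-hand}
# A hand-rolled power simulation (sketch only): simulate one-sided normal
# test statistics under an assumed alternative, convert them to p-values,
# and record which hypotheses the graph rejects each time
set.seed(123)
rejections <- replicate(100, {
  z <- rnorm(4, mean = 2) # assumed alternative: independent, mean 2
  p_sim <- pnorm(z, lower.tail = FALSE)
  graph_test_shortcut(ss_graph, p = p_sim, alpha = .025)$outputs$rejected
})
rowMeans(rejections) # estimated local power per hypothesis
```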

### Bonferroni (Shortcut)

As with testing, the default for power simulations is Bonferroni testing at alpha level .025. The default number of simulations (100) will typically need to be increased for meaningful results. To run as efficiently as possible, Bonferroni power calculations use a custom shortcut method that is heavily optimized with vectorization and early exits.

```{r power-bonf}
graph_calculate_power(ss_graph, sim_n = 1e5)
```

### Other tests (Closure)

All valid testing strategies are available for power calculations as well: Simes, parametric, Bonferroni, or any combination of the three. Note that groups consisting of a single hypothesis will be silently converted to Bonferroni. Furthermore, a testing strategy composed entirely of Bonferroni groups will be silently converted to shortcut testing for optimal performance.

```{r power-mix}
corr2 <- matrix(.5, nrow = 4, ncol = 4)
diag(corr2) <- 1
corr2 <- list(corr2)
graph_calculate_power(
  ss_graph,
  test_groups = list(1:4),
  test_types = "p",
  test_corr = corr2,
  sim_n = 1e5
)
```

### Simulation options

In addition to the testing-related options, there are options that control how p-values are simulated from the multivariate normal distribution. These parameters specify the assumed "true" distribution of the test statistics.

```{r power-sims}
s_corr1 <- rbind(
  c(1, .5, .5, .25),
  c(.5, 1, .25, .5),
  c(.5, .25, 1, .5),
  c(.25, .5, .5, 1)
)
graph_calculate_power(
  ss_graph,
  test_groups = list(1:4),
  test_types = "p",
  test_corr = corr2,
  marginal_power = c(.4, .4, .9, .9),
  sim_corr = s_corr1,
  sim_seed = 52423, # Set a seed if you need consistent p-values
  sim_n = 1e5
)
```

### Power output

The four standard power calculations are:

- Local power: the rate at which each individual hypothesis is rejected across simulations
- Expected no. of rejections: the average number of rejections per simulation
- Power to reject 1 or more: the proportion of simulations that reject at least one hypothesis
- Power to reject all: the proportion of simulations that reject every hypothesis
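
Each of these is a simple summary of the matrix of per-simulation test results. As a sketch, assume a logical matrix `res` with one row per simulation and one column per hypothesis, filled with random values here purely for illustration:

```{r power-summaries-sketch}
# Hypothetical results matrix: rows are simulations, columns are hypotheses
res <- matrix(runif(4000) < .5, nrow = 1000, ncol = 4)

colMeans(res)           # local power
mean(rowSums(res))      # expected no. of rejections
mean(rowSums(res) >= 1) # power to reject 1 or more
mean(rowSums(res) == 4) # power to reject all
```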

In addition to these standard calculations, a user can define arbitrary "success" power calculations. For instance, in the simple successive example, perhaps the clinical trial will only be considered a success if both primary hypotheses are rejected.

```{r power-success-1}
graph_calculate_power(
  ss_graph,
  test_groups = list(1:4),
  test_types = "p",
  test_corr = corr2,
  marginal_power = c(.4, .4, .9, .9),
  sim_corr = s_corr1,
  sim_seed = 52423,
  sim_n = 1e5,
  sim_success = function(x) x[1] & x[2]
)
```

Because success functions can be anything, this parameter has few limitations, and it is up to the user to ensure their functions mean what they intend. Each user-defined function is applied to each simulation's test results, and the resulting vector is averaged.

Here are a few more success function examples, some of which imitate the standard calculations. When using logical operators, make sure to use the vectorized operators (`|`, `&`) rather than the single-value operators (`||`, `&&`). Hypotheses can be weighted as well.

```{r power-success-2}
graph_calculate_power(
  ss_graph,
  test_groups = list(1:4),
  test_types = "p",
  test_corr = corr2,
  marginal_power = c(.4, .4, .9, .9),
  sim_corr = s_corr1,
  sim_seed = 52423, # Set a seed if you need consistent p-values
  sim_n = 1e5,
  sim_success = list(
    A1 = function(x) x[1],
    B1 = function(x) x[2],
    `Expected no. of rejections` = function(x) x[1] + x[2] + x[3] + x[4],
    `1 or more` = function(x) x[1] | x[2] | x[3] | x[4],
    `All` = function(x) x[1] & x[2] & x[3] & x[4],
    `All 'A' or all 'B'` = function(x) (x[1] & x[3]) | (x[2] & x[4]),
    `Weighted` = function(x) .75 * (x[1] & x[3]) + .25 * (x[2] & x[4])
  )
)
```

### Verbose simulation details

Finally, power calculations have a `verbose` option, which adds the full simulation and test results matrices to the output.

```{r power-verbose}
graph_calculate_power(
  ss_graph,
  test_groups = list(1:4),
  test_types = "p",
  test_corr = corr2,
  marginal_power = c(.4, .4, .9, .9),
  sim_corr = s_corr1,
  sim_seed = 52423, # Set a seed if you need consistent p-values
  sim_n = 1e5,
  sim_success = list(
    A1 = function(x) x[1],
    B1 = function(x) x[2],
    `Expected no. of rejections` = function(x) x[1] + x[2] + x[3] + x[4],
    `1 or more` = function(x) x[1] | x[2] | x[3] | x[4],
    `All` = function(x) x[1] & x[2] & x[3] & x[4],
    `All 'A' or all 'B'` = function(x) (x[1] & x[3]) | (x[2] & x[4]),
    `Weighted` = function(x) .75 * (x[1] & x[3]) + .25 * (x[2] & x[4])
  ),
  verbose = TRUE
)
```

## Power print options

Like test results, the `print()` method for power output accepts the same `indent` and `precision` arguments.