Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simulation / validation vignette #2

Open
ncullen93 opened this issue Jun 16, 2022 · 1 comment
Open

simulation / validation vignette #2

ncullen93 opened this issue Jun 16, 2022 · 1 comment

Comments

@ncullen93
Copy link

ncullen93 commented Jun 16, 2022

Hey Mike,

Just wanted to throw in a little example code that could maybe turn into a vignette. This example uses ADNI data to grab pilot estimates of covariance, etc from an lme model, then simulates data based on those estimates and calculates power to detect a treatment effect over 1000 trials. This is compared to the closed-form estimate from longpower. I think it's a great way to validate the formulas and show how power and such can also be calculated using simulation which is more flexible.

#install.packages('~/Downloads/ADNIMERGE_0.0.1.tar.gz', repos=NULL, type='source')
library(ADNIMERGE)
library(nlme)
library(tidyverse)
library(mvtnorm)
library(longpower)

# 1: PROCESS DATA
df <- ADNIMERGE::adnimerge %>%
  filter(
    VISCODE %in% c('bl','m06','m12','m18', 'm24'),
    DX.bl %in% c('EMCI','LMCI'),
    as.numeric(ABETA.bl) < 880,
    AGE >= 55, AGE <= 85
  ) %>%
  tibble() %>%
  dplyr::select(
    RID, VISCODE, Years.bl, DX.bl, AGE, CDRSB
  ) %>%
  filter(complete.cases(.)) %>%
  mutate(
    month = factor(VISCODE, levels=c('bl','m06','m12','m18', 'm24'),
                   labels=c('0','6','12','18','24')),
    visit = as.numeric(month)
  ) %>%
  rename(
    YEARS_bl = Years.bl,
    DX_bl = DX.bl
  )

# 2: GET MODEL ESTIMATES FROM PILOT DATA
pilot_lme <- lme(CDRSB ~ YEARS_bl,
                 data = df,
                 random = ~ YEARS_bl | RID,
                 control = nlme::lmeControl(
                   maxIter = 1e10,
                   msMaxIter = 1000,
                   opt = "optim"
                 ),
                 method='ML')

# 3. SIMULATE DATA USING MODEL ESTIMATES
simulate_data <- function(fit_lme, n_per_arm, treatment_effect) {
  # get intercept-slope covariance matrix and residual error
  covmat <- unname(as.matrix(getVarCov(fit_lme)[1:2,1:2]))
  sigma_residual <- fit_lme$sigma
  Beta <- fixef(fit_lme)
  # treatment effect
  Beta['YEARS_bl:group'] <- -Beta['YEARS_bl'] * treatment_effect

  n_arm <- 2
  n <- n_arm * n_per_arm
  visits <- c(0, 0.5, 1, 1.5, 2)

  data <-
    data.frame(
      id = 1:n,
      group = rep(c(0,1), each=n/n_arm)
    ) %>%
    tibble() %>%
    bind_cols(
      rmvnorm(n, mean=c(0,0), sigma=covmat) %>%
        data.frame() %>%
        setNames(c('r_i', 'r_s'))
    ) %>%
    right_join(
      expand.grid(id = 1:n, YEARS_bl=visits),
      by = 'id'
    ) %>%
    select(c(id, YEARS_bl), everything()) %>%
    mutate(
      r_e = rnorm(n*length(visits), sd = sigma_residual)
    ) %>%
    arrange(id, YEARS_bl) %>%
    mutate(
      # fixed effects
      CDRSB = (model.matrix(~ YEARS_bl + YEARS_bl:group)[,names(Beta)] %*% Beta)[,1],
      # ranodm effects
      CDRSB = CDRSB + r_i + r_s*YEARS_bl + r_e
    )
  data
}

# longpower - calculate n for a given random power
res <- lmmpower(pilot_lme, power=0.47, pct.change=0.3, t=seq(0,2,0.5))
n_exp <- round(res$n[1], 0) #143
print(n_exp)

# simulation - calculate power for the previously calculated n (it should match)
sig_vec <- c()
ntrials <- 1000
for (i in 1:ntrials) {
  print(i)
  data_sim <- simulate_data(pilot_lme, n_per_arm=143, treatment_effect=0.3)
  x <- summary(lme(CDRSB~YEARS_bl+group:YEARS_bl,
                   data=data_sim,
                   random=~YEARS_bl|id,
                   control = nlme::lmeControl(
                     maxIter = 1e10,
                     msMaxIter = 1000,
                     opt = "optim"
                   ),
                   method='ML'))
  pval <- coef(x)['YEARS_bl:group','p-value']
  sig_vec <- c(sig_vec, pval)
}
pwr <- mean(sig_vec < 0.05) # 0.47 or something very close
print(pwr)
@mcdonohue
Copy link
Owner

Thanks, Nick.

Something like this would be very useful, but I don't think CRAN would allow a reference to ADNIMERGE (since it's not on CRAN). Maybe demo with some other dataset instead?

It's also problematic to have vignettes that take a long time to run, but it looks like others have thought about solutions to that problem: https://www.kloppenborg.ca/2021/06/long-running-vignettes/

Mike

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants