Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect Efron approximation for competing risks #268

Open
SiddRoy opened this issue Jul 25, 2024 · 2 comments
Open

Incorrect Efron approximation for competing risks #268

SiddRoy opened this issue Jul 25, 2024 · 2 comments

Comments

@SiddRoy
Copy link

SiddRoy commented Jul 25, 2024

The competing risk vignette has an example that shows how we can fit competing risks model by manually stacking data in order to more easily share coefficients. However, the resulting cumulative hazard curves with new data do not match.

After looking into the code, the Efron approximation code in the coxsurv1 (and 2) used in multihaz doesn't match the correct approximation code in agsurv4 and agsurv5.

Making the following changes in coxsurv1 resulted in matching results between the 2 approaches.

  1. line 150: want k < 10 to reset n[8:9] = 0 and avoid accumulating components from previous intervals.
  2. line 183: meanwt = n[5] / n[3]
  3. Efron sums on 185-186 should be adding 1/(n[2] - k * meanwt)
  4. Based on 3, we would want to also use reciprocal terms on 179-180, 188-189 and change miltihaz function to use cn[,,5] / cn[,,9] for hazard computation similar to coxsurv.fit and agsurv.

Reproducible code showing the difference in cumulative hazards

library(survival)
library(tidyverse)

# example from competing risks vignette: setup mgus2
mgus2$etime <- with(mgus2, ifelse(pstat==0, futime, ptime))
event <- with(mgus2, ifelse(pstat==0, 2*death, 1))
mgus2$event <- factor(event, 0:2, labels=c("censor", "pcm", "death"))

# stacked vs. multistate CR: DATA setup by removing NA's 
mgus3 <- filter(mgus2, !is.na(mspike), !is.na(hgb))
temp1 <- mgus3 %>% mutate(tim = etime, status = (event == 'pcm') * 1, group = 'pcm')
temp2 <- mgus3 %>% mutate(tim = etime, status = (event == 'death') * 1, group = 'death')
stacked <- bind_rows(temp1, temp2)

# stacked vs. multistate CR: Cox fit with same hgb coefficient 
hgfit <-  coxph(list(Surv(etime, event) ~ age + sex + mspike,
                     1:2 + 1:3 ~ hgb / common),
                data = mgus3, id = id, x = TRUE, model = TRUE)
allfit <- coxph(Surv(tim, status) ~ hgb + (age + sex+ mspike) * strata(group),
                data=stacked)

# Compare cox results
max(abs(hgfit$loglik- allfit$loglik)) # 0

# stacked vs. multistate CR: Survival Curves
newd <- mgus3[1, ] %>% select(id, age, sex, mspike, hgb) %>% mutate(age = 35, mspike = .6, hgb = 13.3)
sf_MS <- survfit(hgfit, newdata = newd, se.fit = TRUE)

newd2 <- bind_rows(
  mutate(newd, group = 'pcm'),
  mutate(newd, group = 'death'))
sf_strat <- survfit(allfit, newdata = newd2)

# These should be the same also!
sf_MS$cumhaz[1:10,1, ]
matrix(sf_strat$cumhaz, 267)[1:10, ]
@therneau
Copy link
Owner

Impressive. I'll try to add your test to my set of test cases, though I'll need to remove all the tidyverse stuff.

@SiddRoy
Copy link
Author

SiddRoy commented Jul 26, 2024

Thanks :)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants