[feature request] add likelihood ratio test #164

clarkliming opened this issue Nov 2, 2022 · 8 comments
Labels: discussion, enhancement (New feature or request)

Comments

clarkliming (Collaborator) commented Nov 2, 2022

Currently we have Wald tests for mmrm, and I wonder if we should include likelihood ratio tests too. The implementation could be similar to lmerTest:

a <- mmrm(formula, data)   # full model
b <- mmrm(formula2, data)  # reduced model
pchisq(a$likelihood - b$likelihood, df = xx)  # df to be determined

We could further wrap this into something like:

# Sketch only: update_formula() and df are placeholders to be worked out.
lrt <- function(fit, contrast, ...) {
  fit_formula2 <- update_formula(fit$formula, contrast)  # drop the tested terms
  fit2 <- mmrm(fit_formula2, fit$data)
  pchisq(fit$likelihood - fit2$likelihood, df)
}

To do:

  • Understand what kind of adjustment of degrees of freedom would be needed here
    • can check how the lmerTest package does it
clarkliming changed the title to "[feature request] add likelihood ratio test" on Nov 2, 2022
danielinteractive added the enhancement (New feature or request) and discussion labels on Nov 2, 2022
danielinteractive (Collaborator) commented:
I am not sure how the degrees of freedom would be chosen here. I have read, e.g., the following in the Satterthwaite vignette by Christensen:
[screenshot of the relevant passage from the Satterthwaite vignette]

DanChaltiel commented Jan 29, 2023

I agree with OP that an LRT would be very useful, for instance when testing a categorical variable as a whole.

In lmerTest, degrees of freedom are computed in contestMD.lmerModLmerTest() (code). Both the Kenward-Roger and Satterthwaite methods are handled there (lines 409 and 436, respectively).

Also, note nlme:::anova.gls() (code), which seems to simply use a number of df equal to the number of coefficients estimated (1 for a numeric variable or a 2-level factor, 2 for a 3-level factor, ...).
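
For illustration, a minimal nlme sketch of that convention (dat, y, visit, arm, and id are hypothetical; both models must be fitted with ML rather than REML for the likelihoods to be comparable across fixed-effects specifications):

library(nlme)

# Nested mean models with the same correlation structure; ML is
# required because REML likelihoods are not comparable across
# different fixed effects.
m_reduced <- gls(y ~ visit, data = dat, method = "ML",
                 correlation = corSymm(form = ~ 1 | id))
m_full <- gls(y ~ visit + arm, data = dat, method = "ML",
              correlation = corSymm(form = ~ 1 | id))

# anova.gls() reports the LRT with df equal to the difference in the
# number of estimated coefficients (1 here if arm has two levels).
anova(m_reduced, m_full)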

danielinteractive (Collaborator) commented:
I think it is important to get the methodology right here. Intuitively, I would guess that comparing two nested models (i.e. the same covariance model, but nested sets of covariates) should be OK. On the other hand, when e.g. the covariance models are different and not nested (e.g. AR1 vs. CS, as opposed to unstructured vs. AR1), or the covariates are not nested, it could be problematic.
@chstock What would be your perspective on this?

wlandau commented Feb 27, 2023

It seems reasonable to me to throw an informative error if the covariance structures are different. The majority of use cases would likely use the same covariance structure and only care about comparing different sets of fixed effects.
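
A minimal sketch of such a guard (fit1, fit2, and the cov_type field are placeholders, not necessarily the actual mmrm API):

# Hypothetical guard inside an lrt() helper: refuse to compare fits
# that use different covariance structures.
if (!identical(fit1$cov_type, fit2$cov_type)) {
  stop(
    "LRT comparison requires identical covariance structures, got: ",
    fit1$cov_type, " vs. ", fit2$cov_type
  )
}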

DanChaltiel commented:
If you want to be exaggeratedly thorough, an LRT might also apply when covariance structures are "nested" (i.e. one is a special case of the other). For instance, there is an example in the SAS documentation comparing AR(1) and Toeplitz. I guess this would be a less significant feature though.
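
A hedged sketch of what such a comparison could look like with mmrm, using the package's fev_data example (component() is assumed here to return the variance parameter estimates via "theta_est"; the chi-squared reference is only asymptotic):

library(mmrm)

# Same mean model, nested covariance structures: AR(1) is a special
# case of Toeplitz, so the two likelihoods are comparable.
fit_ar1 <- mmrm(FEV1 ~ ARMCD + ar1(AVISIT | USUBJID), data = fev_data)
fit_toep <- mmrm(FEV1 ~ ARMCD + toep(AVISIT | USUBJID), data = fev_data)

stat <- 2 * (as.numeric(logLik(fit_toep)) - as.numeric(logLik(fit_ar1)))
df <- length(component(fit_toep, "theta_est")) -
  length(component(fit_ar1, "theta_est"))
pchisq(stat, df = df, lower.tail = FALSE)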

chstock (Collaborator) commented Feb 27, 2023

Sounds generally worthwhile to me too, although in practice most people do not seem to rely on it as a small-sample adjustment for the MMRM. It's 2*(log_likelihood_full - log_likelihood_reduced) that is chi-squared distributed, right? That looks different from the code above. The df might just be the difference in the number of parameters. Since the test compares two nested models, there is an issue to be aware of when a covariate that is dropped from the reduced model has missing values: the reduced model could then end up being fitted on a different (larger) dataset.
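
In code, a minimal hedged sketch of that computation (lrt_pvalue is a hypothetical helper, assuming both models were fitted by ML on the same observations and that logLik() and coef() methods are available):

# Hypothetical helper: LRT p-value for two nested ML fits. Invalid if
# the reduced model was fitted on more rows because a dropped
# covariate had missing values.
lrt_pvalue <- function(fit_full, fit_reduced) {
  stat <- 2 * (as.numeric(logLik(fit_full)) - as.numeric(logLik(fit_reduced)))
  df <- length(coef(fit_full)) - length(coef(fit_reduced))
  pchisq(stat, df = df, lower.tail = FALSE)
}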

danielinteractive (Collaborator) commented:
Thanks @DanChaltiel, agree that this would be nice. Basically we need a function that compares two models and assesses whether or not they are nested.

Thanks @chstock, so the function might need to refit the models if it finds that the data used differ. That would be possible, since we have all the information we need in the fit objects.
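
A hedged sketch of such a refit, reusing the fit$formula and fit$data fields assumed in the pseudocode at the top of the thread (refit_reduced is a placeholder name, not the package API):

# Placeholder helper: refit the reduced model on exactly the rows that
# are complete for the full model's variables, so that both
# likelihoods are computed on the same data.
refit_reduced <- function(fit_full, reduced_formula) {
  vars <- all.vars(fit_full$formula)
  rows <- stats::complete.cases(fit_full$data[, vars])
  mmrm::mmrm(reduced_formula, data = fit_full$data[rows, ])
}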

Generalized commented Oct 30, 2023

Forgive me, if my notes are of no interest to you.

For the general linear model (GLS, mixed, or OLS), the Wald test and the LRT are pretty much equivalent, if I'm not terribly wrong (I saw the mathematical explanation long ago; it is quite extensive but shows the equivalence), as long as the log-likelihood curve is nicely quadratic (this causes issues in non-Gaussian GLMs, by the way). The restricted vs. unrestricted model comparison done via the Wilks LRT (on the vertical axis of the log-likelihood plot) will generally agree with the coefficient-based comparison via Wald (on the horizontal axis). The F statistic you already have tests the reduction in residual variance (= deviance in a GLM), and to account for its anti-conservativeness in small samples there are already the Satterthwaite and KR adjustments, invented exactly for this kind of issue. I wonder how big the differences vs. the LRT would be?

Under the null hypothesis the three approaches (Wald, LRT, and Rao) will agree: looking at the log-likelihood curve, as the difference between the true and estimated parameter on the horizontal axis goes to 0 (Wald), the vertical distance (LRT) also goes to 0, and the tangent (Rao score) becomes "flat", with its slope going to 0. Bigger discrepancies usually appear under the alternative hypothesis, at low p-values. In my experience this usually happened below most classic significance levels (0.1, 0.05, 0.01, 0.005, 0.001), as long as none of the conditional distributions of the covariates was too weird (e.g. log-normal with huge variance). It happened to me mostly with ordinal data fitted via GEE proportional odds or logistic regression, but that is not a surprise for binomial models...

Last but not least, the LRT needs multiple fits. One may say: "oh, no big deal! Today's computers are fast!" OK, fine! But MMRM can be used under the multiple imputation framework. Now imagine: 20 datasets, 30 iterations, multiple endpoints, unstructured covariance, and all of this doubled (in the best scenario) by the LRT. I already tried that, and my laptop "invited me" to make a coffee and watch some Netflix ;]

/ Sure, yes, I do know that MMRM itself handles MAR/MCAR, but it doesn't allow for sensitivity/tipping-point analyses, won't handle compound primary endpoints, and doesn't allow one to impute multiple outcomes in a chain to maximize use of the available information. /

Also, bearing in mind that there are already so many factors that can affect the inference, namely:

  • method of calculating the DoF (between/within, residual, Satterthwaite, KR, containment, and maybe more, I don't remember now),
  • covariance structure,
  • standard errors: naive, empirical, bias-adjusted, jackknife, ...,
  • adjustment for covariates ("intersected" with all visits and arms, or only at baseline),
  • selection of the optimizer engine and its parameters,

it's really possible to get different results from the same MMRM analysis of the same data with just different settings. The question, then, is how truly beneficial the new feature will be, given that it adds to the existing complexity.

Well, yes, this would be a beneficial addition, especially for comparing covariance structures and main effects (I guess this links to the "type 2/3 ANOVA" topic).

PS: if you really consider it, please look at, and possibly extend, the awesome glmglrt package:
https://rdrr.io/cran/glmglrt/man/p_value_contrast.html
https://rdrr.io/cran/glmglrt/man/confint_contrast.html

Currently the LRT is supported only for GLMs (and LMs via the Gaussian distribution and identity link), while Wald is available for many other models and estimation methods, including nlme::gls (I used the two in the pre-MMRM era :-) ).

PS: and if you do this, it would be great to have a convenient way to specify contrasts to test not just the main effects but also (mostly!) the levels of categorical predictors, i.e. specific simple effects (currently only the Wald approach via emmeans is available). That's what glmglrt allows for.
