initialize package with std_lm #2

clarkliming · 2023-07-05T15:12:00Z

close #1
To illustrate the package structure, std_lm is included here. It is essentially the same as lm but uses a robust covariance matrix (calling summary prints the result to the console).

This PR is not complete yet, but it provides a framework for how collaboration should happen.

clarkliming · 2023-07-07T07:43:35Z

Hi team, I added a basic framework for these analyses.

In general, there will be wrappers around lm and glm for the standardization methods, and a robust covariance will be used. The treatment variable has to be specified explicitly, and only two levels are allowed (otherwise we cannot do the standardization correctly).

Example:

std_lm(Sepal.Length ~ Species, data = subset(iris, Species != "virginica"), trt = "Species")

Another possibility is to use a special function in the formula:

std_lm(Sepal.Length ~ trt(Species), data = subset(iris, Species != "virginica"))
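A minimal sketch of how a `trt()` marker in the formula could be detected, assuming base R's `terms()` with its `specials` argument. The helper name `find_trt` is hypothetical and not part of this PR.

```r
# Hypothetical helper: return the name of the variable wrapped in trt().
find_trt <- function(formula) {
  tt <- terms(formula, specials = "trt")
  idx <- attr(tt, "specials")$trt
  if (is.null(idx) || length(idx) != 1L) {
    stop("exactly one trt() term is required")
  }
  # attr(tt, "variables") is the call list(<response>, <term>, ...);
  # idx counts variables with the response first, so +1 skips `list`.
  trt_call <- attr(tt, "variables")[[idx + 1L]]
  all.vars(trt_call)
}

find_trt(Sepal.Length ~ trt(Species) + Sepal.Width)
# returns "Species"
```

The formula is only parsed here, so `trt` does not need to exist as a function for this step; fitting the model would still require the marker to be removed or defined.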

IPTW is not implemented yet, but it should be similar.

Any ideas on the basic structure of the package?

xinzhn · 2023-07-13T02:20:57Z

Hi Liming,

For unconditional treatment effect of linear models, I suggest that we follow the recommendation in the FDA final guidance as quoted below.

> Nominal standard errors are often the default method in most statistical software packages. Even if the model is incorrectly specified, they are acceptable in two arm trials with 1:1 randomization. However, in other settings, these standard errors can be inaccurate when the model is misspecified. Therefore, the Agency recommends that sponsors consider use of a robust standard error method such as the Huber-White “sandwich” standard error when the model does not include treatment by covariate interactions (Rosenblum and van der Laan 2009; Lin 2013). Other robust standard error methods proposed in the literature can also cover cases with interactions (Ye et al. 2022). An appropriate nonparametric bootstrap procedure can also be used (Efron and Tibshirani 1993).

Accordingly, for the asymptotic standard error, the package should provide:

- Nominal standard errors, and possibly robust sandwich standard errors as well, for two-arm trials with 1:1 randomization. We should also advise users not to include treatment-by-covariate interactions, since the model without such interactions is optimal in this case¹.
- Robust sandwich standard errors (already implemented) for other cases when treatment-by-covariate interactions are NOT included.
- The ANHECOVA estimates and the associated robust standard error² when treatment-by-covariate interactions are included.
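To make the difference between nominal and Huber-White standard errors concrete, here is a rough base-R sketch of the HC0 sandwich estimator for an lm fit. The helper `hc0_vcov` is hypothetical and is not necessarily how the package computes its robust covariance.

```r
dat <- droplevels(subset(iris, Species != "virginica"))
fit <- lm(Sepal.Length ~ Species + Sepal.Width, data = dat)

# HC0 sandwich: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
hc0_vcov <- function(fit) {
  X <- model.matrix(fit)
  e <- residuals(fit)
  bread <- solve(crossprod(X))
  meat <- crossprod(X * e)  # each row of X scaled by its residual
  bread %*% meat %*% bread
}

nominal_se <- sqrt(diag(vcov(fit)))      # assumes constant error variance
robust_se <- sqrt(diag(hc0_vcov(fit)))   # valid under heteroscedasticity
cbind(nominal = nominal_se, robust = robust_se)
```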

For the input of std_lm, I suggest adding options for whether to 1) include treatment-by-covariate interactions in the regression model and 2) account for stratified randomization in the standard error estimation. To include those interactions, the simplest way is to fit separate regression models to each treatment group² ³.
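The "separate models per arm" idea can be sketched as follows, assuming a single covariate and using the iris subset from the earlier example. This illustrates only the point estimates; the associated robust standard error is not shown.

```r
dat <- droplevels(subset(iris, Species != "virginica"))

# Fit the outcome model within each arm, then average its predictions
# over ALL subjects to get the standardized mean for that arm.
arm_means <- sapply(levels(dat$Species), function(arm) {
  fit_arm <- lm(Sepal.Length ~ Sepal.Width, data = dat[dat$Species == arm, ])
  mean(predict(fit_arm, newdata = dat))
})
arm_means
```

Fitting per arm gives the same standardized means as a single model with a full treatment-by-covariate interaction.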

For the output of std_lm, I suggest including a vector of mean outcomes for all treatment groups and its estimated covariance matrix. The unconditional treatment effect as a difference, ratio, or odds ratio (for binary outcomes) can then be obtained by another function that takes the output of std_lm, with its standard error computed using the delta method based on the estimated covariance matrix from std_lm. This saves the effort of developing separate functions for different summary measures. We can follow a similar approach to develop other estimators of unconditional treatment effects with a similar interface.
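As a sketch of the delta-method step, assume the fit returns a mean vector `mu` and its covariance matrix `V`. The numbers below are made up for illustration, and `delta_se` is a hypothetical helper.

```r
# Illustrative inputs only: arm means and their estimated covariance.
mu <- c(ref = 0.30, trt = 0.45)
V <- matrix(c(0.0020, 0.0002,
              0.0002, 0.0025), nrow = 2,
            dimnames = list(names(mu), names(mu)))

# Delta method: Var(g(mu)) is approximately grad' V grad.
delta_se <- function(grad, V) sqrt(drop(t(grad) %*% V %*% grad))

# Difference: g = mu_trt - mu_ref, gradient (-1, 1)
diff_est <- mu[["trt"]] - mu[["ref"]]
diff_se <- delta_se(c(-1, 1), V)

# Log ratio: g = log(mu_trt) - log(mu_ref)
log_rr_est <- log(mu[["trt"]] / mu[["ref"]])
log_rr_se <- delta_se(c(-1 / mu[["ref"]], 1 / mu[["trt"]]), V)

# Log odds ratio: g = logit(mu_trt) - logit(mu_ref), d logit(p)/dp = 1/(p(1-p))
log_or_est <- qlogis(mu[["trt"]]) - qlogis(mu[["ref"]])
log_or_se <- delta_se(c(-1 / (mu[["ref"]] * (1 - mu[["ref"]])),
                         1 / (mu[["trt"]] * (1 - mu[["trt"]]))), V)
```

Working on the log scale for the ratio and odds ratio is a common choice, since confidence limits can then be exponentiated back.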

For IPTW, we may consider building on the PWS package, which uses the standard error estimator from the paper⁴ cited by the FDA final guidance.

Looking forward to hearing thoughts from you and others.
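For orientation, a bare-bones IPTW point estimate on simulated data might look like the sketch below. It uses only base R, does not use the package mentioned above, and omits the standard error, which needs to account for the estimated propensity score.

```r
set.seed(1)
n <- 200
x <- rnorm(n)                       # baseline covariate
a <- rbinom(n, 1, 0.5)              # 1:1 randomized treatment
y <- 1 + 0.5 * a + x + rnorm(n)     # simulated outcome

# Estimated probability of treatment given the covariate
ps <- fitted(glm(a ~ x, family = binomial()))

# Inverse probability of the treatment actually received
w <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))

mu1 <- weighted.mean(y[a == 1], w[a == 1])
mu0 <- weighted.mean(y[a == 0], w[a == 0])
mu1 - mu0   # IPTW estimate of the mean difference
```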

1. Lin, W. (2013), Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique. The Annals of Applied Statistics, 7: 295-318. https://doi.org/10.1214/12-aoas583
2. Ye, T., Shao, J., Yi, Y. and Zhao, Q. (2022), Toward better practice of covariate adjustment in analyzing randomized clinical trials. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2022.2049278
3. Tsiatis, A.A., Davidian, M., Zhang, M. and Lu, X. (2008), Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: A principled yet flexible approach. Statist. Med., 27: 4658-4677. https://doi.org/10.1002/sim.3113
4. Williamson, E.J., Forbes, A. and White, I.R. (2014), Variance reduction in randomised trials by inverse probability weighting using the propensity score. Statist. Med., 33: 721-737. https://doi.org/10.1002/sim.5991

danielinteractive · 2023-07-20T14:29:58Z

design/structure.Rmd

+
+# Implementation of standardization method for linear models
+
+The function name is `lm_std`, and the arguments are quite similar to `lm`.


General idea: is there any possibility that we start from the lm result and then pipe it to a package function to get the covariate adjustment on top?

It could be, but we would lose some consistency with other methods. For standardization methods, we usually need to modify the data, create counterfactual treatment assignments, and predict the outcomes (and sometimes check that the data has the correct structure, such as a binary treatment). For IPTW, we need to provide weights obtained from the probability of treatment, whereas other methods do not use weights. In addition, new methods would still need a new regression interface. So, to keep the covariate adjustment methods consistent, it might be good to create wrappers like this.

Cool, makes sense. Then a consistent prefix with std_ would be nice.
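The standardization steps described in this reply (check the treatment, fit, set everyone to each arm, predict, average) could be sketched roughly as follows. `std_sketch` is a hypothetical name and not the PR's `std_lm`.

```r
std_sketch <- function(formula, data, trt) {
  data[[trt]] <- droplevels(as.factor(data[[trt]]))
  lv <- levels(data[[trt]])
  if (length(lv) != 2L) stop("treatment must have exactly two levels")

  fit <- lm(formula, data = data)

  # Counterfactual data: assign every subject to each arm in turn,
  # predict the outcome, and average the predictions.
  means <- sapply(lv, function(l) {
    cf <- data
    cf[[trt]] <- factor(l, levels = lv)
    mean(predict(fit, newdata = cf))
  })
  list(fit = fit, means = means)
}

res <- std_sketch(Sepal.Length ~ Species + Sepal.Width,
                  data = subset(iris, Species != "virginica"),
                  trt = "Species")
res$means
```

For a linear model without interactions, the difference of the two standardized means equals the treatment coefficient; the wrapper earns its keep for glm fits and models with interactions.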

I like the idea of the class of method coming first in the prefix, if that works for others, i.e.

std_<model/method> for standardization

ipw_<model/method> for inverse probability weighting, etc.

This would follow the classification of methods in the Morris et al. overview paper.

It feels intuitive, especially when using autocomplete, i.e. searching within the method class.

> any possibility that we start from lm result and then pipe this to a package function

I like this idea a lot. I'm just getting up-to-speed trying to understand the package's interface goals - apologies if I'm over-simplifying things to make this work. Would something like this be feasible:

data %>% cov_adjust(by = "Species") %>% lm(Sepal.Length ~ Species)
# calls lm.covadj_spec with signature lm(<covadj_spec>, formula)

data %>% cov_adjust(by = "Species", weights = <weights>) %>% lm(Sepal.Length ~ Species)

I could see this being more comfortable if the basic parameters are reused across many methods.
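A rough sketch of the dispatch idea above, with two loudly stated assumptions: `stats::lm` is not an S3 generic, so this uses a separately named, hypothetical generic `fit_lm`, and the `covadj_spec` and `covadj_fit` classes are invented here for illustration. It uses the native pipe, which needs R 4.1 or later.

```r
# Hypothetical spec object carrying the data plus adjustment settings.
cov_adjust <- function(data, by, weights = NULL) {
  structure(list(data = data, by = by, weights = weights),
            class = "covadj_spec")
}

# Hypothetical generic; dispatches on the spec object.
fit_lm <- function(x, ...) UseMethod("fit_lm")

fit_lm.covadj_spec <- function(x, formula, ...) {
  d <- x$data
  # Store weights as a data column so lm() can find them by name.
  d$.w <- if (is.null(x$weights)) rep(1, nrow(d)) else x$weights
  fit <- stats::lm(formula, data = d, weights = .w)
  structure(list(fit = fit, by = x$by), class = "covadj_fit")
}

dat <- droplevels(subset(iris, Species != "virginica"))
res <- dat |> cov_adjust(by = "Species") |> fit_lm(Sepal.Length ~ Species)
coef(res$fit)
```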

Hm, interesting idea @dgkf! @clarkliming, what do you think?

This sounds good but may lead to other confusion: a standardization method is not actually a linear model, and a standardized glm is not a glm; they focus on estimating the treatment effect. So using lm as the generic would be a little inappropriate, I think. But a similar grammar may be adopted to define the treatment variable (after all, the treatment effect is the key).

clarkliming · 2023-09-04T06:48:03Z

Hi @xinzhn, there are some updates to the current design, following your suggestions:

- Checks are added to warn about treatment*covariate interaction terms (ANHECOVA is not implemented yet, so a warning is reported here).
- The original fit is stored, so you can still access the original data whenever you want.
- A generic called treatment_effect is added, which uses "vcov_method" to select the robust standard error (use "constant" to obtain the nominal standard errors) and uses trt + ref to specify the treatment effect of trt compared to ref.
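To illustrate the interface described in the last item, here is a hedged sketch of such a generic with a method for plain lm fits. The argument names `variable`, `trt`, `ref`, and `vcov_method` follow the description above where possible, but the body is an assumption (HC0 sandwich, covariates treated as fixed) and not the code in this PR.

```r
treatment_effect <- function(object, ...) UseMethod("treatment_effect")

treatment_effect.lm <- function(object, variable, trt, ref,
                                vcov_method = c("robust", "constant"), ...) {
  vcov_method <- match.arg(vcov_method)
  data <- model.frame(object)

  # Design matrix with every subject assigned to the given arm.
  design <- function(level) {
    cf <- data
    cf[[variable]] <- factor(level, levels = levels(data[[variable]]))
    model.matrix(delete.response(terms(object)), cf)
  }
  contrast <- colMeans(design(trt) - design(ref))

  V <- if (vcov_method == "constant") {
    vcov(object)                       # nominal covariance
  } else {
    X <- model.matrix(object)
    e <- residuals(object)
    bread <- solve(crossprod(X))
    bread %*% crossprod(X * e) %*% bread   # HC0 sandwich
  }
  c(estimate = sum(contrast * coef(object)),
    se = sqrt(drop(t(contrast) %*% V %*% contrast)))
}

dat <- droplevels(subset(iris, Species != "virginica"))
fit <- lm(Sepal.Length ~ Species + Sepal.Width, data = dat)
treatment_effect(fit, "Species", trt = "versicolor", ref = "setosa")
treatment_effect(fit, "Species", trt = "versicolor", ref = "setosa",
                 vcov_method = "constant")
```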

@bailliem the names are updated, and std_glm is also added in another branch. Other methods will be added later, but they should follow the same naming conventions.

clarkliming added 2 commits July 5, 2023 15:07

initialize package with std_lm

b183df5

add trt argument

d5d9fdd

clarkliming requested review from xinzhn, baillma3, xidongdxi and przybal2 July 7, 2023 07:38

clarkliming added 2 commits July 19, 2023 06:52

update design

3cd33f4

update design

3dbf3e8

danielinteractive reviewed Jul 20, 2023


clarkliming added 5 commits August 28, 2023 13:12

update structure

e905d84

minor correction

2dae844

update the docs

0c808b9

update structure

cbb010b

update treatment_effect function

e1b1d34

clarkliming closed this Jan 25, 2024
