---
output:
  github_document:
    pandoc_args: [
      "--mathjax"
    ]
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# bspme
<!-- badges: start -->
<!-- badges: end -->
Go to the package website: [[link]](https://changwoo-lee.github.io/bspme/)
See a vignette with NO2 exposure and simulated health data: [[link]](https://changwoo-lee.github.io/bspme/articles/no2-exposure-and-health-data-analysis.html)
See `bspme_1.0.1.pdf` for the PDF version of the package manual.
**bspme** is an R package that provides fast, scalable inference tools for **B**ayesian **sp**atial exposure **m**easurement **e**rror models, namely Bayesian linear and generalized linear models in the presence of spatial exposure measurement error in the covariate(s).
These models typically arise from a two-stage Bayesian analysis of environmental exposures and health outcomes.
From a first-stage model, predictions of the covariate of interest ("exposure") and their uncertainty information (typically contained in MCMC samples) are obtained and used to form a multivariate normal prior distribution $X\sim N(\mu, \Sigma)$ for exposure in a second-stage regression model.
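As a minimal sketch of how such a prior might be assembled, the base R code below computes the mean vector and sample covariance from a matrix of first-stage predictive draws. The draws are simulated stand-ins, and the names `n_loc`, `n_draw`, `coords`, `true_cov`, and `draws` are hypothetical and not part of **bspme**.

``` r
set.seed(1)
n_loc  <- 5     # hypothetical number of subject locations
n_draw <- 2000  # hypothetical number of first-stage MCMC draws

# Simulated stand-in for first-stage output: rows are draws, columns are locations.
coords   <- cbind(runif(n_loc), runif(n_loc))
true_cov <- exp(-as.matrix(dist(coords)) / 0.5)  # exponential covariance
draws    <- matrix(rnorm(n_draw * n_loc), n_draw, n_loc) %*% chol(true_cov)

# Mean and covariance of the MVN prior X ~ N(mu, Sigma).
mu_hat    <- colMeans(draws)
Sigma_hat <- cov(draws)
```

Inverting `Sigma_hat` directly gives a dense sample precision matrix, which is the naive choice discussed next.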
Naive, non-sparse choices of the precision matrix $Q = \Sigma^{-1}$ of the multivariate normal (such as a sample precision matrix) make MCMC posterior inference infeasible for a large number of subjects $n$, because the computational cost associated with the $n$-dimensional MVN prior grows cubically in $n$.
With a sparse precision matrix $Q$ obtained from the Vecchia approximation, the **bspme** package offers fast, scalable algorithms to conduct posterior inference for large health datasets, with the number of subjects $n$ possibly reaching tens of thousands.
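To illustrate the idea, here is a minimal base R sketch of a Vecchia-type approximation. It is not the package's implementation. Each location is conditioned on at most `m` nearest previously ordered locations, which gives $Q = (I - A)^\top D^{-1} (I - A)$ with a sparse, strictly lower-triangular $A$. The function name `vecchia_precision_sketch`, the use of the given row order as the ordering, and the dense matrix storage are all assumptions made for illustration.

``` r
# Hypothetical helper: Vecchia-type precision matrix from a covariance matrix.
vecchia_precision_sketch <- function(Sigma, coords, m = 3) {
  n <- nrow(Sigma)
  A <- matrix(0, n, n)   # regression coefficients on conditioning sets
  d <- numeric(n)        # conditional variances
  d[1] <- Sigma[1, 1]
  dist_mat <- as.matrix(dist(coords))
  for (i in seq_len(n)[-1]) {
    prev <- seq_len(i - 1)
    # up to m nearest neighbours among previously ordered locations
    nb <- prev[order(dist_mat[i, prev])][seq_len(min(m, i - 1))]
    b <- solve(Sigma[nb, nb, drop = FALSE], Sigma[nb, i])
    A[i, nb] <- b
    d[i] <- Sigma[i, i] - sum(Sigma[i, nb] * b)
  }
  IA <- diag(n) - A
  crossprod(IA, IA / d)  # t(I - A) %*% diag(1 / d) %*% (I - A)
}

set.seed(1)
n <- 50
coords <- cbind(runif(n), runif(n))
Sigma <- exp(-as.matrix(dist(coords)) / 0.3)  # exponential covariance

Q_sparse <- vecchia_precision_sketch(Sigma, coords, m = 3)
Q_full   <- vecchia_precision_sketch(Sigma, coords, m = n - 1)

mean(Q_sparse == 0)              # fraction of exactly zero entries
max(abs(Q_full - solve(Sigma)))  # should be near zero up to floating-point error
```

With `m = n - 1` every location conditions on all earlier ones, so the construction reproduces the exact precision matrix. A small `m` trades some accuracy for sparsity. In practice a sparse matrix class would be used instead of the dense storage shown here.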
For more details, please see the following paper:
> Lee, C. J., Symanski, E., Rammah, A., Kang, D. H., Hopke, P. K., & Park, E. S. (2024). A scalable two-stage Bayesian approach accounting for exposure measurement error in environmental epidemiology. arXiv preprint arXiv:2401.00634. <https://arxiv.org/abs/2401.00634>
## Installation
You can install the R package bspme with the following code:
``` r
# install.packages("devtools")
devtools::install_github("changwoo-lee/bspme", build_vignettes = TRUE)
```
To browse the vignettes, run:
``` r
browseVignettes("bspme")
```
## Functionality
| Function | Description |
| ---------------------- | -------------------------------------------------------------------------|
| `blm_me()` | Bayesian linear regression models with spatial exposure measurement error. |
| `bglm_me()` | Bayesian generalized linear models with spatial exposure measurement error. |
| `vecchia_cov()`        | Runs the Vecchia approximation given a covariance matrix.                  |
To see the function documentation in R, run the following lines:
``` r
?blm_me
?bglm_me
?vecchia_cov
```
## Datasets
| Dataset call | Description |
| ---------------------- | -------------------------------------------------------------------------|
| `data("NO2_Jan2012")` | Daily average NO2 concentrations in and around Harris County, Texas, in Jan 2012. |
| `data("health_sim")` | Simulated health data associated with ln(NO2) concentration on Jan 10, 2012. |
## Examples
Please see the vignette "NO2-exposure-and-health-data-analysis".
## Acknowledgements
This work was supported by the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health (NIH) under R01ES031990.