Commit

readme update
changwoo-lee committed Dec 28, 2023
1 parent 8b1e263 commit d389ffe
Showing 2 changed files with 50 additions and 35 deletions.
22 changes: 15 additions & 7 deletions README.Rmd
@@ -22,6 +22,8 @@ knitr::opts_chunk$set(
<!-- badges: end -->
Go to package website: [[link]](https://changwoo-lee.github.io/bspme/)

See a vignette with ozone exposure data: [[link]](https://changwoo-lee.github.io/bspme/articles/Ozone-exposure-and-health-data-analysis.html)

**bspme** is an R package that provides a set of functions for **B**ayesian **sp**atially correlated **m**easurement **e**rror models, the Bayesian linear models with the presence of (spatially) correlated measurement error of covariate(s). For more details, please see the following paper:


@@ -36,13 +38,19 @@ $$
Y_i = \beta_0 + \beta_x X_i + \beta_z Z_i + \epsilon_i, \quad \epsilon_i \stackrel{iid}{\sim} N(0, \sigma^2_Y), \quad i=1,\dots,n,
$$

For example in the context of environmental epidemiology, covariate $X_i$ can be an exposure to air pollution of subject $i$ at different locations, $Z_i$ can be demographic information, and $Y_i$ can be associated health outcome. Since exposure $X_i$ are not directly measured at health subject locations but are predicted from air pollution monitoring station locations, this induces spatially correlated measurement error in $X_i$. Also, the uncertainty information should be taken account, which depends on the proximity of the monitoring station to the subject location. One way to incorporate this information is to use a multivariate normal prior on the covariate $(X_1,\dots,X_n)$ (MVN prior approach) adopted by **bspme** package, the **B**ayesian **sp**atially correlated **m**easurement **e**rror models.
In the context of environmental epidemiology, the covariate $X_i$ can be an exposure to air pollution of subject $i$ at different locations, $Z_i$ can be demographic information, and $Y_i$ can be the associated health outcome. Since the exposures $X_i$, $i=1,\dots,n$, are not directly measured at health subject locations but are predicted from air pollution monitoring station locations, this induces spatially correlated measurement error in $X_i$. Also, the uncertainty information, which depends on the proximity of the monitoring station to the subject location, should be taken into account. One way to incorporate this information is to use a multivariate normal prior on the covariate $(X_1,\dots,X_n)$,
$$
(X_1,\dots,X_n)\sim \mathrm{N}_n(\mathbf{m}, \mathbf{Q}^{-1}),
$$
with some mean $\mathbf{m}$ and precision (inverse covariance) matrix $\mathbf{Q}$, referred to as an MVN prior approach.
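
For reference, the log-density of this prior can be written directly in terms of the precision matrix. This is just the standard multivariate normal density, and the shorthand $\mathbf{X}$ is notation introduced here for illustration:
$$
\log p(\mathbf{X}) = \frac{1}{2}\log\det\mathbf{Q} - \frac{1}{2}(\mathbf{X}-\mathbf{m})^\top \mathbf{Q}\,(\mathbf{X}-\mathbf{m}) - \frac{n}{2}\log(2\pi), \quad \mathbf{X} = (X_1,\dots,X_n)^\top.
$$
The prior therefore enters computations only through $\log\det\mathbf{Q}$ and products with $\mathbf{Q}$, both of which are cheaper when $\mathbf{Q}$ has few nonzero entries.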

The **bspme** package provides fast, scalable computational tools for Bayesian linear regression models with spatially correlated measurement errors represented as a MVN prior distribution. The posterior inference is carried out using Markov chain Monte Carlo (MCMC) methods. When implemented naively, running the MCMC is impossible because of the $O(n^3)$ computational cost associated with the $n$-dimensional MVN prior, where $n$ is the number of subjects. By adopting sparse MVN prior that has sparse precision matrices based on Vecchia approximation, the **bspme** package provides a fast algorithm to carry out posterior inference for large datasets, with the number of subjects $n$ possibly as big as tens of thousands.
The **bspme** package provides fast, scalable inference tools for Bayesian spatially correlated measurement error models, where measurement error information is represented as an MVN prior distribution with a **sparse precision matrix** $\mathbf{Q}$.
Naive choices of $\mathbf{Q}$, such as a sample precision matrix, make the MCMC posterior inference algorithm infeasible to run for a large number of subjects $n$ because of the $O(n^3)$ computational cost associated with the $n$-dimensional MVN prior.
With the sparse precision matrix $\mathbf{Q}$ obtained from the Vecchia approximation, the **bspme** package offers a fast, scalable algorithm to conduct posterior inference for large health datasets, with the number of subjects $n$ possibly reaching tens of thousands.
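
To illustrate why a sparse $\mathbf{Q}$ helps, here is a minimal sketch that evaluates an MVN log-density using only sparse operations. It is not the package's implementation: it assumes the **Matrix** package is installed, uses a tridiagonal AR(1) precision matrix as a simple stand-in for a Vecchia-type sparse precision, and the helper names `ar1_precision()` and `mvn_logdens_prec()` are hypothetical.

```r
library(Matrix)

# Tridiagonal precision matrix of an AR(1) process with unit marginal variance,
# i.e. the exact inverse of the covariance Sigma[i, j] = rho^|i - j|.
# (Toy stand-in for a sparse precision matrix; not part of bspme.)
ar1_precision <- function(n, rho) {
  bandSparse(
    n, k = c(0, 1),
    diagonals = list(
      c(1, rep(1 + rho^2, n - 2), 1) / (1 - rho^2),  # main diagonal
      rep(-rho, n - 1) / (1 - rho^2)                 # first off-diagonal
    ),
    symmetric = TRUE
  )
}

# Log-density of N_n(m, Q^{-1}) at x, computed from the precision matrix Q
# without ever forming the dense n x n covariance matrix.
mvn_logdens_prec <- function(x, m, Q) {
  d <- x - m
  logdet_Q <- as.numeric(determinant(Q, logarithm = TRUE)$modulus)
  quad <- sum(d * as.vector(Q %*% d))
  0.5 * logdet_Q - 0.5 * quad - 0.5 * length(x) * log(2 * pi)
}

set.seed(1)
n <- 5000
Q <- ar1_precision(n, rho = 0.8)
x <- rnorm(n)
mvn_logdens_prec(x, m = rep(0, n), Q = Q)
```

In this toy case the precision matrix stores on the order of $n$ nonzero entries rather than $n^2$, so the log-determinant and quadratic form avoid dense $O(n^3)$ factorizations.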

## Installation

You can install the development version of bspme like so:
You can install the development version of bspme with the following code:

``` r
# install.packages("devtools")
@@ -55,17 +63,17 @@ devtools::install_github("changwoo-lee/bspme")
| Function | Description |
| ---------------------- | -------------------------------------------------------------------------|
| `blinreg_me()` | Bayesian normal linear regression models with (spatially) correlated measurement errors |
| `blogireg_me()` | Bayesian generalized linear regression models with (spatially) correlated measurement errors |
| `vecchia_cov()` | Vecchia approximation given a covariance matrix |
| `blogireg_me()` | Bayesian logistic regression models with (spatially) correlated measurement errors |
| `vecchia_cov()` | Perform Vecchia approximation given an MVN covariance matrix |



## Datasets

| Dataset call | Description |
| ---------------------- | -------------------------------------------------------------------------|
| `data("ozone")` | 1987 midwest ozone exposure data |
| `data("health_sim")` | Simulated health data |
| `data("ozone")` | 1987 midwest ozone exposure data |
| `data("health_sim")` | Simulated health data corresponding to ozone exposure data |

## Examples

63 changes: 35 additions & 28 deletions README.md
@@ -8,6 +8,9 @@

Go to package website: [\[link\]](https://changwoo-lee.github.io/bspme/)

See a vignette with ozone exposure data:
[\[link\]](https://changwoo-lee.github.io/bspme/articles/Ozone-exposure-and-health-data-analysis.html)

**bspme** is an R package that provides a set of functions for
**B**ayesian **sp**atially correlated **m**easurement **e**rror models,
the Bayesian linear models with the presence of (spatially) correlated
@@ -27,34 +30,38 @@ $$
Y_i = \beta_0 + \beta_x X_i + \beta_z Z_i + \epsilon_i, \quad \epsilon_i \stackrel{iid}{\sim} N(0, \sigma^2_Y), \quad i=1,\dots,n,
$$

For example in the context of environmental epidemiology, covariate
$X_i$ can be an exposure to air pollution of subject $i$ at different
locations, $Z_i$ can be demographic information, and $Y_i$ can be
associated health outcome. Since exposure $X_i$ are not directly
In the context of environmental epidemiology, the covariate $X_i$ can be
an exposure to air pollution of subject $i$ at different locations,
$Z_i$ can be demographic information, and $Y_i$ can be the associated
health outcome. Since the exposures $X_i$, $i=1,\dots,n$, are not directly
measured at health subject locations but are predicted from air
pollution monitoring station locations, this induces spatially
correlated measurement error in $X_i$. Also, the uncertainty information,
which depends on the proximity of the monitoring station to the subject
location, should be taken into account. One way to incorporate this
information is to use a multivariate normal prior on the covariate
$(X_1,\dots,X_n)$ (MVN prior approach) adopted by **bspme** package, the
**B**ayesian **sp**atially correlated **m**easurement **e**rror models.

The **bspme** package provides fast, scalable computational tools for
Bayesian linear regression models with spatially correlated measurement
errors represented as a MVN prior distribution. The posterior inference
is carried out using Markov chain Monte Carlo (MCMC) methods. When
implemented naively, running the MCMC is impossible because of the
$O(n^3)$ computational cost associated with the $n$-dimensional MVN
prior, where $n$ is the number of subjects. By adopting sparse MVN prior
that has sparse precision matrices based on Vecchia approximation, the
**bspme** package provides a fast algorithm to carry out posterior
inference for large datasets, with the number of subjects $n$ possibly
as big as tens of thousands.
$(X_1,\dots,X_n)$, $$
(X_1,\dots,X_n)\sim \mathrm{N}_n(\mathbf{m}, \mathbf{Q}^{-1}),
$$ with some mean $\mathbf{m}$ and precision (inverse covariance) matrix
$\mathbf{Q}$, referred to as an MVN prior approach.

The **bspme** package provides fast, scalable inference tools for
Bayesian spatially correlated measurement error models, where
measurement error information is represented as an MVN prior
distribution with a **sparse precision matrix** $\mathbf{Q}$. Naive
choices of $\mathbf{Q}$, such as a sample precision matrix, make the
MCMC posterior inference algorithm infeasible to run for a large number
of subjects $n$ because of the $O(n^3)$ computational cost associated
with the $n$-dimensional MVN prior. With the sparse precision matrix
$\mathbf{Q}$ obtained from the Vecchia approximation, the **bspme**
package offers a fast, scalable algorithm to conduct posterior inference
for large health datasets, with the number of subjects $n$ possibly
reaching tens of thousands.

## Installation

You can install the development version of bspme like so:
You can install the development version of bspme with the following
code:

``` r
# install.packages("devtools")
@@ -63,18 +70,18 @@ devtools::install_github("changwoo-lee/bspme")

## Functionality

| Function | Description |
|-----------------|----------------------------------------------------------------------------------------------|
| `blinreg_me()` | Bayesian normal linear regression models with (spatially) correlated measurement errors |
| `blogireg_me()` | Bayesian generalized linear regression models with (spatially) correlated measurement errors |
| `vecchia_cov()` | Vecchia approximation given a covariance matrix |
| Function | Description |
|-----------------|-----------------------------------------------------------------------------------------|
| `blinreg_me()` | Bayesian normal linear regression models with (spatially) correlated measurement errors |
| `blogireg_me()` | Bayesian logistic regression models with (spatially) correlated measurement errors |
| `vecchia_cov()` | Perform Vecchia approximation given an MVN covariance matrix |

## Datasets

| Dataset call | Description |
|----------------------|----------------------------------|
| `data("ozone")` | 1987 midwest ozone exposure data |
| `data("health_sim")` | Simulated health data |
| Dataset call | Description |
|----------------------|------------------------------------------------------------|
| `data("ozone")` | 1987 midwest ozone exposure data |
| `data("health_sim")` | Simulated health data corresponding to ozone exposure data |

## Examples
