-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
322 lines (231 loc) · 13.7 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(pactr)
```
# pactr: An Interface to the Pandemic PACT Data Repository <img src='man/figures/logo.png' width='200px' align='right' />
<!-- badges: start -->
[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![R-CMD-check](https://github.com/OxfordIHTM/pactr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/OxfordIHTM/pactr/actions/workflows/R-CMD-check.yaml)
[![test-coverage.yaml](https://github.com/OxfordIHTM/pactr/actions/workflows/test-coverage.yaml/badge.svg)](https://github.com/OxfordIHTM/pactr/actions/workflows/test-coverage.yaml)
[![Codecov test coverage](https://codecov.io/gh/OxfordIHTM/pactr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/OxfordIHTM/pactr?branch=main)
[![CodeFactor](https://www.codefactor.io/repository/github/OxfordIHTM/pactr/badge)](https://www.codefactor.io/repository/github/OxfordIHTM/pactr)
[![DOI](https://zenodo.org/badge/847474685.svg)](https://zenodo.org/badge/latestdoi/847474685)
<!-- badges: end -->
The [Pandemic PACT](https://www.pandemicpact.org/) monitors and analyses global funding and research evidence related to diseases with pandemic potential, as well as broader research preparedness efforts, and is equipped to pivot in response to outbreaks. It collects, curates, codes, and analyses data in alignment with WHO priority diseases and other selected illnesses, including pandemic influenza, mpox, and plague. Pandemic PACT aims to guide policy and decision-making for research funders, policymakers, researchers, multilateral agencies. The Pandemic PACT data is publicly available for download from its [website](https://www.pandemicpact.org/) and from [Figshare](https://portal.sds.ox.ac.uk/pandemicpact). This package provides an application programming interface (API) to both the research programme's Figshare repository and website data download facility to provide programmatic access to its publicly available tracker data along with its other data products.
## What does `{pactr}` do?
`{pactr}` provides functions to interface programmatically with Pandemic PACT's data products either through its Figshare repository or its website.
The functions for Figshare interface are wrappers to specific functions of the [`{deposits}` package](https://docs.ropensci.org/deposits/index.html) which provides the underlying universal interface to various online research data deposition services including Figshare. Current Figshare-specific functionalities available in `{pactr}` are:
1. Listing of outputs/assets available from the Pandemic PACT Figshare repository (*experimental*);
2. Downloading of outputs/assets available from the Pandemic PACT Figshare repository (*experimental*);
3. Reading of dataset outputs/assets available from the Pandemic PACT Figshare repository (*experimental*); and,
4. Processing of Pandemic PACT data (*experimental*).
The functions for interfacing with the data available from the Pandemic PACT website allow for downloading, reading, and processing. Current website data-specific functionalities available in `{pactr}` are:
1. Downloading of Pandemic PACT dataset available from the website (*stable*);
2. Reading of Pandemic PACT dataset available from the website (*stable*); and,
3. Processing of Pandemic PACT dataset available from the website (*experimental*).
## Motivation
The main motivation for the development of `{pactr}` is to create a standardised programmatic interface to Pandemic PACT's data for those performing research or investigation relevant to Pandemic PACT's objectives. Standardised programmatic interface, in turn, allow for reproducible scientific workflows based on the Pandemic PACT dataset.
## Installation
`{pactr}` is not yet on [CRAN](https://cran.r-project.org) but can be installed through the [Oxford iHealth R Universe](https://oxfordihtm.r-universe.dev) with:
```R
install.packages(
"pactr",
repos = c("https://oxfordihtm.r-universe.dev", "https://cloud.r-project.org")
)
```
## Usage - Figshare workflow
### Set a Figshare client
The `{pactr}` Figshare workflow always starts with the setting up of a Figshare client. This requires creating a Figshare account and then creating a personal access token [here](https://figshare.com/account/applications).
Once a Figshare token is created, it needs to be stored as a local environment variable. This can be done using the following command in R:
```R
Sys.setenv("FIGSHARE_TOKEN"="YOUR_TOKEN_HERE")
```
Once this token has been set as described above, the following command can be run to setup a Figshare client:
```{r setup-client}
pact_client <- pact_client_set()
```
Once a Figshare client has been setup, you can now perform the functionalities provided by the `{pactr}` package.
### List outputs/assets
To list available outputs/assets from the Pandemic PACT Figshare repository, issue the following command:
```{r usage-1a, eval = FALSE}
pact_list(pact_client)
```
The output is a `data.frame` containing metadata regarding contents of the Figshare Pandemic PACT group. The information within the metadata are those provided by [Figshare's application programming interface (API)](https://docs.figshare.com/) and are either set by the Pandemic PACT data team or by Figshare. The data.frame would look as follows:
```{r usage-1b, echo = FALSE}
pact_list(pact_client) |>
tibble::tibble()
```
This function is useful in getting an overview of what is currently available in the Pandemic PACT Figshare repository.
### Download outputs/assets
To download a specific output/asset - say the scoping review data - from the Pandemic PACT Figshare repository, issue the following commands:
```{r usage-2a, eval = FALSE}
## Get the unique identifier for the scoping review data from Figshare ----
file_id <- pact_list(pact_client) |>
subset(title == "Scoping Review Data", select = id) |>
unlist()
pact_download_figshare(id = file_id, path = ".")
```
This will download the file `Scoping_Review-Data.xlsx` from the Pandemic PACT Figshare repository into the current working directory.
### Read the Pandemic PACT tracker dataset and data dictionary
To read the Pandemic PACT tracker dataset into R, issue the following command:
```{r usage-3a, eval = FALSE}
pact_read_figshare(pact_client)
```
which outputs a data.frame with 4637 records and 860 fields.
```{r usage-3b, echo = FALSE, cache = TRUE}
figshare_data <- pact_read_figshare(pact_client) |>
tibble::tibble()
figshare_data
```
This function reads the *labelled* tracker dataset by default. If the raw dataset is required, then issue the following command:
```{r usage-3c, eval = FALSE}
pact_read_figshare(pact_client, tracker_type = "raw")
```
which outputs a data.frame with 4638 records and 860 fields.
```{r usage-3d, echo = FALSE, cache = TRUE}
figshare_data_raw <- pact_read_figshare(pact_client, tracker_type = "raw") |>
tibble::tibble()
figshare_data_raw
```
### Process the Pandemic PACT tracker dataset
```{r usage-3e, eval = FALSE}
pact_read_figshare(pact_client) |>
pact_process_figshare() |>
tibble::tibble()
```
```{r usage-3f, echo = FALSE}
figshare_data |>
pact_process_figshare() |>
tibble::tibble()
```
For a more detailed discussion of the usage and limitations of the `{pactr}` Figshare functions, see this [vignette](https://oxford-ihtm.io/pactr/articles/figshare-workflow.html).
## Usage - website data workflow
### Download the Pandemic PACT tracker dataset from the website
To download the Pandemic PACT tracker dataset available from its website, the following command can be used:
```{r usage-website-1a, eval = FALSE}
## Save the dataset from website to a temporary directory ----
pact_download_website(path = tempdir())
```
which will return the path to the downloaded dataset:
```{r usage-website-1b, echo = FALSE}
## Save the dataset from website to a temporary directory ----
pact_download_website(path = tempdir())
```
### Read the Pandemic PACT tracker dataset from the website
Instead of downloading, the Pandemic PACT dataset available from its website can be read into R directly as follows:
```{r usage-website-2a, eval = FALSE}
pact_read_website()
```
which results in the following:
```{r usage-website-2b, echo = FALSE}
pact_read_website()
```
### Process the Pandemic PACT tracker dataset from the website
The package includes functions that will process the Pandemic PACT tracker dataset into specific structures and aggregations that will allow for further plotting and reporting of similar outputs that are currently presented in the Pandemic PACT website.
For example, the following will process the Pandemic PACT tracker dataset into an aggregated dataset structure that can be used to create a similar plot to the one presented in the [website](https://www.pandemicimpact.org/visualise#disease).
```{r usage-website-3a, eval = FALSE}
pact_read_website() |>
pact_process_website() |>
pact_table_topic_group(topic = "Disease", group = "GrantStartYear")
```
which produces the following output:
```{r usage-website-3b, echo = FALSE, cache = FALSE}
pact_data <- pact_read_website() |>
pact_process_website()
tidy_df <- pact_data |>
pact_table_topic_group(topic = "Disease", group = "GrantStartYear")
tidy_df
```
which in turn can be plotted as follows:
```{r usage-website-3c, echo = FALSE, fig.height = 8}
disease_df <- tidy_df |>
dplyr::mutate(
Disease = factor(
x = Disease,
levels = c(
"COVID-19", "Disease X", "Crimean-Congo haemorrhagic fever",
"Ebola virus disease", "Hendra virus infection", "Lassa fever",
"Marburg virus disease",
"Middle East Respiratory Syndrome Coronavirus (MERS-CoV)",
"Mpox", "Nipah and henipaviral disease", "Pandemic-prone influenza",
"Plague", "Rift Valley Fever", "Severe Acute Respiratory Syndrome (SARS)", "Zika virus disease", "Congenital Zika virus disease", "Other",
"Unspecified", "Not applicable"
),
labels = c(
"COVID-19", "Disease X", "CCHF", "Ebola", "Hendra virus", "Lassa fever",
"Marburg", "MERS-CoV", "Mpox", "NiV and Henipa", "Pandemic influenza",
"Plague", "RVF", "SARS", "Zika", "Zika (congenital)", "Other",
"Unspecified", "Not applicable"
)
)
)
ggplot2::ggplot(
data = disease_df,
mapping = ggplot2::aes(
x = GrantStartYear, y = n_grants, group = Disease, color = Disease
)
) +
ggplot2::geom_line() +
ggplot2::labs(x = "Year of Award Start", y = "Number of Grants") +
oxthema::theme_oxford() +
ggplot2::theme(
legend.position = "top",
legend.title = ggplot2::element_blank(),
legend.text = ggplot2::element_text(size = 8),
legend.spacing.y = ggplot2::unit(2, 'points'),
panel.border = ggplot2::element_blank(),
panel.grid.minor = ggplot2::element_blank()
)
```
or alternatively:
```{r usage-website-3d, echo = FALSE, fig.height = 12}
disease_df |>
ggplot2::ggplot(
mapping = ggplot2::aes(x = GrantStartYear, y = n_grants, group = Disease)
) +
ggplot2::geom_line(
colour = oxthema::get_oxford_colour("sky"),
linewidth = 0.75, alpha = 0.7
) +
ggplot2::scale_y_continuous(breaks = scales::breaks_pretty()) +
ggplot2::facet_wrap(. ~ Disease, nrow = 5, ncol = 4, scales = "free_y") +
ggplot2::labs(x = "Year of Award Start", y = "Number of Grants") +
ggplot2::theme_bw() +
ggplot2::theme(
strip.background = ggplot2::element_rect(
fill = oxthema::get_oxford_colour("Oxford blue"),
colour = oxthema::get_oxford_colour("Oxford blue")
),
strip.text = ggplot2::element_text(colour = "#FFFFFF", size = 9),
panel.border = ggplot2::element_blank(),
panel.grid.minor = ggplot2::element_blank(),
axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5, hjust = 1),
axis.ticks = ggplot2::element_blank()
)
```
For a more detailed discussion of the usage and limitations of the `{pactr}` website dataset functions, see this [vignette](https://oxford-ihtm.io/pactr/articles/website-dataset-workflow.html).
## Citation
To cite the `{pactr}` package, please use the suggested citation provided by a call to the `citation()` function as follows:
```{r citation-package}
citation("pactr")
```
To cite the Pandemic PACT Tracker dataset, please use the suggested citation provided by a call to the `pact_cite()` function as follows:
```{r citation-data}
## cite the labelled version of the tracker dataset
pact_cite(pact_client, id = 24763548)
```
## Disclaimer
This project is an independent effort by [Oxford iHealth](https://oxford-ihtm.io) in support of related analytics and research using the Pandemic PACT dataset. This project is not recognised by the Pandemic PACT project. Any issues or problems arising from using the `{pactr}` package or from participating or contributing to the development of this project are the responsibility of the authors and maintainers of this project and should be addressed to them accordingly and not to the Pandemic PACT project.
## Community guidelines
Feedback, bug reports and feature requests are welcome; file issues or seek support [here](https://github.com/OxfordIHTM/pactr/issues). If you would like to contribute to the package, please see our [contributing guidelines](https://oxford-ihtm.io/pactr/CONTRIBUTING.html).
This project is released with a [Contributor Code of Conduct](https://oxford-ihtm/pactr/CODE_OF_CONDUCT.html). By participating in this project you agree to abide by its terms.