Merge branch 'develop' into test-metrics-quantile
nikosbosse authored Jan 5, 2024
2 parents 85733bf + 545b40c commit 520785d
Showing 40 changed files with 185 additions and 75 deletions.
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -18,3 +18,4 @@
^\.devcontainer$
^CODE_OF_CONDUCT\.md$
^inst/manuscript/output$
^CRAN-SUBMISSION$
21 changes: 21 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,21 @@
<!-- Thanks for opening this pull request! Below we have provided a suggested template for PRs to this repository and a checklist to complete before opening a PR -->

## Description

This PR closes #<issue-number>.

[Describe the changes that you made in this pull request.]

## Checklist

- [ ] My PR is based on a package issue and I have explicitly linked it.
- [ ] I have included the target issue or issues in the PR title as follows: *issue-number*: PR title
- [ ] I have tested my changes locally.
- [ ] I have added or updated unit tests where necessary.
- [ ] I have updated the documentation if required.
- [ ] I have built the package locally and rebuilt the docs using roxygen2.
- [ ] My code follows the established coding standards and I have run `lintr::lint_package()` to check for style issues introduced by my changes.
- [ ] I have added a news item linked to this PR.
- [ ] I have reviewed CI checks for this PR and addressed them as far as I am able.

<!-- Thanks again for this PR - @scoringutils dev team -->
1 change: 1 addition & 0 deletions .gitignore
@@ -18,3 +18,4 @@ docs
..bfg-report/
.DS_Store
.vscode
README.html
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -41,8 +41,8 @@ Description:
Scoring metrics can be used either through a convenient data.frame format,
or can be applied as individual functions in a vector / matrix format.
All functionality has been implemented with a focus on performance and is
robustly tested. Find more information about scoringutils in the
accompanying paper (Bosse et al., 2022) <arXiv:2205.07090v1>.
robustly tested. Find more information about the package in the
accompanying paper (<doi:10.48550/arXiv.2205.07090>).
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
18 changes: 14 additions & 4 deletions NEWS.md
@@ -40,14 +40,24 @@ The update introduces breaking changes. If you want to keep using the older vers
- added documentation for the return value of `summarise_scores()`.
- Added unit tests for `interval_coverage_quantile()` and `interval_coverage_dev_quantile()` in order to make sure that the functions provide the correct warnings when insufficient quantiles are provided.

# scoringutils 1.2.2

## Package updates
- `scoringutils` now depends on R 3.6. The change was made because the packages `testthat` and `lifecycle`, which `scoringutils` uses, now require R 3.6. We also updated the GitHub Actions CI check to work with R 3.6.
- Added a new PR template with a checklist of things to be included in PRs to facilitate the development and review process.

## Bug fixes
- Fixes a bug with `set_forecast_unit()` where the function only worked with a data.table, but not a data.frame, as an input.
- The metrics table in the vignette [Details on the metrics implemented in `scoringutils`](https://epiforecasts.io/scoringutils/articles/metric-details.html) had duplicated entries. This was fixed by removing the duplicated rows.

# scoringutils 1.2.1

## Package updates
- This minor update fixes a few issues related to gh actions and the vignettes displayed at epiforecasts.io/scoringutils. It
- gets rid of the preferably package in _pkgdown.yml. The theme had a toggle between light and dark theme that didn't work properly
- updates the gh pages deploy action to v4 and also cleans up files when triggered
- introduces a gh action to automatically render the Readme from Readme.Rmd
- removes links to vignettes that have been renamed
- Gets rid of the preferably package in _pkgdown.yml. The theme had a toggle between light and dark theme that didn't work properly
- Updates the gh pages deploy action to v4 and also cleans up files when triggered
- Introduces a gh action to automatically render the Readme from Readme.Rmd
- Removes links to vignettes that have been renamed

# scoringutils 1.2.0

4 changes: 3 additions & 1 deletion R/available_forecasts.R
@@ -29,7 +29,9 @@
#' @export
#' @keywords check-forecasts
#' @examples
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#'
#' get_forecast_counts(example_quantile,
#' by = c("model", "target_type")
14 changes: 7 additions & 7 deletions R/data.R
@@ -20,7 +20,7 @@
#' \item{horizon}{forecast horizon in weeks}
#' }
# nolint start
#' @source \url{https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
#' @source \url{https://github.com/european-modelling-hubs/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
# nolint end
"example_quantile"

@@ -47,7 +47,7 @@
#' \item{horizon}{forecast horizon in weeks}
#' }
# nolint start
#' @source \url{https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
#' @source \url{https://github.com/european-modelling-hubs/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
# nolint end
"example_point"

@@ -74,7 +74,7 @@
#' \item{sample_id}{id for the corresponding sample}
#' }
# nolint start
#' @source \url{https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
#' @source \url{https://github.com/european-modelling-hubs/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
# nolint end
"example_continuous"

@@ -101,7 +101,7 @@
#' \item{sample_id}{id for the corresponding sample}
#' }
# nolint start
#' @source \url{https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
#' @source \url{https://github.com/european-modelling-hubs/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
# nolint end
"example_integer"

@@ -134,7 +134,7 @@
#' \item{predicted}{predicted value}
#' }
# nolint start
#' @source \url{https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
#' @source \url{https://github.com/european-modelling-hubs/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
# nolint end
"example_binary"

@@ -159,7 +159,7 @@
#' \item{horizon}{forecast horizon in weeks}
#' }
# nolint start
#' @source \url{https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
#' @source \url{https://github.com/european-modelling-hubs/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
# nolint end
"example_quantile_forecasts_only"

@@ -181,7 +181,7 @@
#' \item{location_name}{name of the country for which a prediction was made}
#' }
# nolint start
#' @source \url{https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
#' @source \url{https://github.com/european-modelling-hubs/covid19-forecast-hub-europe/commit/a42867b1ea152c57e25b04f9faa26cfd4bfd8fa6/}
# nolint end
"example_truth_only"

4 changes: 3 additions & 1 deletion R/pairwise-comparisons.R
@@ -51,7 +51,9 @@
#' @author Johannes Bracher, \email{johannes.bracher@@kit.edu}
#' @keywords scoring
#' @examples
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#'
#' scores <- score(example_quantile)
#' pairwise <- pairwise_comparison(scores, by = "target_type")
4 changes: 3 additions & 1 deletion R/pit.R
@@ -62,7 +62,9 @@
#' @seealso [pit()]
#' @importFrom stats runif
#' @examples
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#'
#' ## continuous predictions
#' observed <- rnorm(20, mean = 1:20)
13 changes: 9 additions & 4 deletions R/plot.R
@@ -24,7 +24,9 @@
#' @examples
#' library(ggplot2)
#' library(magrittr) # pipe operator
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#'
#' scores <- score(example_quantile) %>%
#' summarise_scores(by = c("model", "target_type")) %>%
@@ -577,11 +579,12 @@ make_na <- make_NA
#' @importFrom data.table dcast
#' @export
#' @examples
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#' data_coverage <- add_coverage(example_quantile)
#' summarised <- summarise_scores(data_coverage, by = c("model", "range"))
#' plot_interval_coverage(summarised)

plot_interval_coverage <- function(scores,
colour = "model") {
## overall model calibration - empirical interval coverage
@@ -830,7 +833,9 @@ plot_pairwise_comparison <- function(comparison_result,
#' @importFrom stats density
#' @return vector with the scoring values
#' @examples
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#'
#' # PIT histogram in vector based format
#' observed <- rnorm(30, mean = 1:30)
4 changes: 3 additions & 1 deletion R/score.R
@@ -32,7 +32,9 @@
#' @importFrom stats na.omit
#' @examples
#' library(magrittr) # pipe operator
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#'
#' validated <- as_forecast(example_quantile)
#' score(validated) %>%
4 changes: 3 additions & 1 deletion R/summarise_scores.R
@@ -28,7 +28,9 @@
#' to the names of the columns of the original data specified in `by` or
#' `across` using the `fun` passed to `summarise_scores()`.
#' @examples
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#' \dontshow{
#' data.table::setDTthreads(2) # restricts number of cores used on CRAN
#' }
#' library(magrittr) # pipe operator
#' \dontrun{
#' scores <- score(example_continuous)
9 changes: 9 additions & 0 deletions R/zzz.R
@@ -0,0 +1,9 @@
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(
    "Note: scoringutils is currently undergoing major development changes ",
    "(with an update planned for the first quarter of 2024). We would very ",
    "much appreciate your opinions and feedback on what should be included in ",
    "this major update: ",
    "https://github.com/epiforecasts/scoringutils/discussions/333"
  )
}
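
Because the `.onAttach` hook above emits its note through `packageStartupMessage()`, users can silence or capture it with base R tools. A minimal sketch follows, assuming a hypothetical stand-in function `on_attach_demo()` (not part of scoringutils) so that it runs without the package installed:

```r
# Hypothetical stand-in for a package's .onAttach hook (illustration only)
on_attach_demo <- function() {
  packageStartupMessage("Note: example startup message")
}

# Startup messages can be silenced by the caller
suppressPackageStartupMessages(on_attach_demo())

# They are conditions of class "packageStartupMessage",
# so they can also be captured and inspected
msg <- tryCatch(
  on_attach_demo(),
  packageStartupMessage = function(m) conditionMessage(m)
)
```

With the real package, the equivalent call would presumably be `suppressPackageStartupMessages(library(scoringutils))`.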
2 changes: 1 addition & 1 deletion README.Rmd
@@ -30,7 +30,7 @@ library(knitr)

The `scoringutils` package provides a collection of metrics and proper scoring rules and aims to make it simple to score probabilistic forecasts against observed values.

You can find additional information and examples in the papers [Evaluating Forecasts with scoringutils in R](https://arxiv.org/abs/2205.07090) [Scoring epidemiological forecasts on transformed scales](https://www.medrxiv.org/content/10.1101/2023.01.23.23284722v1) as well as the Vignettes ([Getting started](https://epiforecasts.io/scoringutils/articles/scoringutils.html), [Details on the metrics implemented](https://epiforecasts.io/scoringutils/articles/metric-details.html) and [Scoring forecasts directly](https://epiforecasts.io/scoringutils/articles/scoring-forecasts-directly.html)).
You can find additional information and examples in the papers [Evaluating Forecasts with scoringutils in R](https://arxiv.org/abs/2205.07090) and [Scoring epidemiological forecasts on transformed scales](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011393) as well as the Vignettes ([Getting started](https://epiforecasts.io/scoringutils/articles/scoringutils.html), [Details on the metrics implemented](https://epiforecasts.io/scoringutils/articles/metric-details.html) and [Scoring forecasts directly](https://epiforecasts.io/scoringutils/articles/scoring-forecasts-directly.html)).

The `scoringutils` package offers convenient automated forecast evaluation through the function `score()`. The function operates on data.frames (it uses `data.table` internally for speed and efficiency) and can easily be integrated in a workflow based on `dplyr` or `data.table`. It also provides experienced users with a set of reliable lower-level scoring metrics operating on vectors/matrices they can build upon in other applications. In addition it implements a wide range of flexible plots designed to cover many use cases.

52 changes: 28 additions & 24 deletions README.md
@@ -20,7 +20,7 @@ You can find additional information and examples in the papers
[Evaluating Forecasts with scoringutils in
R](https://arxiv.org/abs/2205.07090) [Scoring epidemiological forecasts
on transformed
scales](https://www.medrxiv.org/content/10.1101/2023.01.23.23284722v1)
scales](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011393)
as well as the Vignettes ([Getting
started](https://epiforecasts.io/scoringutils/articles/scoringutils.html),
[Details on the metrics
@@ -143,17 +143,20 @@ example_quantile %>%
digits = 2
) %>%
kable()
#> Some rows containing NA values may be removed. This is fine if not unexpected.
#> Some rows containing NA values may be removed. This is fine if not unexpected.
#> Some rows containing NA values may be removed. This is fine if not unexpected.
```

| model | target_type | wis | overprediction | underprediction | dispersion | bias | coverage_50 | coverage_90 | coverage_deviation | ae_median | relative_skill | scaled_rel_skill |
|:----------------------|:------------|------:|---------------:|----------------:|-----------:|--------:|------------:|------------:|-------------------:|----------:|---------------:|-----------------:|
| EuroCOVIDhub-baseline | Cases | 28000 | 14000.0 | 10000.0 | 4100 | 0.0980 | 0.33 | 0.82 | -0.120 | 38000 | 1.30 | 1.6 |
| EuroCOVIDhub-baseline | Deaths | 160 | 66.0 | 2.1 | 91 | 0.3400 | 0.66 | 1.00 | 0.120 | 230 | 2.30 | 3.8 |
| EuroCOVIDhub-ensemble | Cases | 18000 | 10000.0 | 4200.0 | 3700 | -0.0560 | 0.39 | 0.80 | -0.100 | 24000 | 0.82 | 1.0 |
| EuroCOVIDhub-ensemble | Deaths | 41 | 7.1 | 4.1 | 30 | 0.0730 | 0.88 | 1.00 | 0.200 | 53 | 0.60 | 1.0 |
| UMass-MechBayes | Deaths | 53 | 9.0 | 17.0 | 27 | -0.0220 | 0.46 | 0.88 | -0.025 | 78 | 0.75 | 1.3 |
| epiforecasts-EpiNow2 | Cases | 21000 | 12000.0 | 3300.0 | 5700 | -0.0790 | 0.47 | 0.79 | -0.070 | 28000 | 0.95 | 1.2 |
| epiforecasts-EpiNow2 | Deaths | 67 | 19.0 | 16.0 | 32 | -0.0051 | 0.42 | 0.91 | -0.045 | 100 | 0.98 | 1.6 |
| model | target_type | wis | overprediction | underprediction | dispersion | bias | interval_coverage_50 | interval_coverage_90 | interval_coverage_deviation | ae_median | relative_skill | scaled_rel_skill |
|:----------------------|:------------|------:|---------------:|----------------:|-----------:|--------:|---------------------:|---------------------:|----------------------------:|----------:|---------------:|-----------------:|
| EuroCOVIDhub-baseline | Cases | 28000 | 14000.0 | 10000.0 | 4100 | 0.0980 | 0.33 | 0.82 | -0.120 | 38000 | 1.30 | 1.6 |
| EuroCOVIDhub-baseline | Deaths | 160 | 66.0 | 2.1 | 91 | 0.3400 | 0.66 | 1.00 | 0.120 | 230 | 2.30 | 3.8 |
| EuroCOVIDhub-ensemble | Cases | 18000 | 10000.0 | 4200.0 | 3700 | -0.0560 | 0.39 | 0.80 | -0.100 | 24000 | 0.82 | 1.0 |
| EuroCOVIDhub-ensemble | Deaths | 41 | 7.1 | 4.1 | 30 | 0.0730 | 0.88 | 1.00 | 0.200 | 53 | 0.60 | 1.0 |
| UMass-MechBayes | Deaths | 53 | 9.0 | 17.0 | 27 | -0.0220 | 0.46 | 0.88 | -0.025 | 78 | 0.75 | 1.3 |
| epiforecasts-EpiNow2 | Cases | 21000 | 12000.0 | 3300.0 | 5700 | -0.0790 | 0.47 | 0.79 | -0.070 | 28000 | 0.95 | 1.2 |
| epiforecasts-EpiNow2 | Deaths | 67 | 19.0 | 16.0 | 32 | -0.0051 | 0.42 | 0.91 | -0.045 | 100 | 0.98 | 1.6 |

`scoringutils` contains additional functionality to transform forecasts,
to summarise scores at different levels, to visualise them, and to
@@ -175,27 +178,28 @@ example_quantile %>%
score %>%
summarise_scores(by = c("model", "target_type", "scale")) %>%
head()
#> Some rows containing NA values may be removed. This is fine if not unexpected.
#> model target_type scale wis overprediction
#> 1: EuroCOVIDhub-ensemble Cases natural 11550.70664 3650.004755
#> 2: EuroCOVIDhub-baseline Cases natural 22090.45747 7702.983696
#> 3: epiforecasts-EpiNow2 Cases natural 14438.43943 5513.705842
#> 4: EuroCOVIDhub-ensemble Deaths natural 41.42249 7.138247
#> 5: EuroCOVIDhub-baseline Deaths natural 159.40387 65.899117
#> 6: UMass-MechBayes Deaths natural 52.65195 8.978601
#> underprediction dispersion bias coverage_50 coverage_90
#> 1: 4237.177310 3663.52458 -0.05640625 0.3906250 0.8046875
#> 2: 10284.972826 4102.50094 0.09726562 0.3281250 0.8203125
#> 3: 3260.355639 5664.37795 -0.07890625 0.4687500 0.7890625
#> 4: 4.103261 30.18099 0.07265625 0.8750000 1.0000000
#> 5: 2.098505 91.40625 0.33906250 0.6640625 1.0000000
#> 6: 16.800951 26.87239 -0.02234375 0.4609375 0.8750000
#> coverage_deviation ae_median
#> 1: -0.10230114 17707.95312
#> 2: -0.11437500 32080.48438
#> 3: -0.06963068 21530.69531
#> 4: 0.20380682 53.13281
#> 5: 0.12142045 233.25781
#> 6: -0.02488636 78.47656
#> underprediction dispersion bias interval_coverage_50
#> 1: 4237.177310 3663.52458 -0.05640625 0.3906250
#> 2: 10284.972826 4102.50094 0.09726562 0.3281250
#> 3: 3260.355639 5664.37795 -0.07890625 0.4687500
#> 4: 4.103261 30.18099 0.07265625 0.8750000
#> 5: 2.098505 91.40625 0.33906250 0.6640625
#> 6: 16.800951 26.87239 -0.02234375 0.4609375
#> interval_coverage_90 interval_coverage_deviation ae_median
#> 1: 0.8046875 -0.10230114 17707.95312
#> 2: 0.8203125 -0.11437500 32080.48438
#> 3: 0.7890625 -0.06963068 21530.69531
#> 4: 1.0000000 0.20380682 53.13281
#> 5: 1.0000000 0.12142045 233.25781
#> 6: 0.8750000 -0.02488636 78.47656
```

## Citation
Binary file modified data/metrics.rda
Binary file not shown.
7 changes: 4 additions & 3 deletions inst/WORDLIST
@@ -1,16 +1,18 @@
AJ
al
Bosse
Bracher
CMD
COVID
CRPS
Camacho
Comput
Cori
DSS
Dawid
ECDC
Eggo
EpiNow
et
EuroCOVIDhub
Gneiting
Höhle
@@ -44,9 +46,7 @@ facetted
facetting
frac
ggplot
implict
jss
matriced
medRxiv
metacran
miscalibrated
@@ -58,6 +58,7 @@ pval
pvalues
rel
scoringRules
scoringutils
standalone
u
underprediction