Updating the manuscript for the scoringutils paper #528

Closed
wants to merge 23 commits into from
23 commits
81389fa
fix orientation of the log score
nikosbosse Sep 6, 2023
955a538
Address some comments made by reviewers
nikosbosse Sep 6, 2023
62c2a54
temporarily delete metrics-detailed.rds to work around merge conflicts
nikosbosse Nov 15, 2023
5e5286e
temporarily delete metrics-detailed.rds to avoid merge conflict
nikosbosse Nov 15, 2023
1d3ae89
Temporarily change create-metric-tables.R to avoid merge conflict
nikosbosse Nov 15, 2023
dcf9a9d
Merge pull request #440 from epiforecasts/scoringutils-review-manuscript
nikosbosse Nov 15, 2023
cbccd92
Update metric description of the log score to reconcile different ver…
nikosbosse Nov 15, 2023
a527407
update manuscript introduction
nikosbosse Nov 15, 2023
c80cd2e
update introduction of the mansucript
nikosbosse Nov 15, 2023
611b511
fix code in manuscript
nikosbosse Nov 15, 2023
434f1ad
add citation
nikosbosse Nov 16, 2023
3a71580
Draft section on package structure
nikosbosse Nov 16, 2023
42937a1
small manuscript update
nikosbosse Nov 18, 2023
170453d
rework manuscript introduction
nikosbosse Nov 20, 2023
892a9c0
update introduction slightly
nikosbosse Nov 21, 2023
1321507
Write package overview
nikosbosse Nov 22, 2023
8f56536
small correction to documentation for score, update of manuscript sec…
nikosbosse Nov 22, 2023
26df724
Update manuscript
nikosbosse Nov 24, 2023
1ec0458
Create a stub for a new vignette based on scoring things from the man…
nikosbosse Nov 25, 2023
c23cc7d
Update manuscript
nikosbosse Nov 25, 2023
242397a
Update manuscript
nikosbosse Nov 25, 2023
0a54940
Update manuscript
nikosbosse Nov 27, 2023
c17af10
Merge branch 'develop' into rework-manuscript
nikosbosse Dec 7, 2023
2 changes: 2 additions & 0 deletions .gitignore
@@ -12,6 +12,8 @@ inst/manuscript/manuscript.aux
inst/manuscript/manuscript.blg
inst/manuscript/manuscript.pdf
inst/manuscript/manuscript.tex
inst/manuscript/manuscript.out
inst/manuscript/manuscript.bbl
inst/manuscript/manuscript_files/
docs
..bfg-report/
2 changes: 1 addition & 1 deletion R/score.R
@@ -48,7 +48,7 @@
 #' of the data belong together and jointly form a single forecasts. This is
 #' easy e.g. for point forecast, where there is one row per forecast. For
 #' quantile or sample-based forecasts, however, there are multiple rows that
-#' belong to single forecast.
+#' belong to a single forecast.
 #'
 #' The *forecast unit* or *unit of a single forecast* is then described by the
 #' combination of columns that uniquely identify a single forecast.
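
The idea of a forecast unit described in this roxygen block can be illustrated with a minimal base R sketch. The data frame, its column names, and the choice of unit columns are made up for illustration and are not taken from scoringutils.

```r
# Hypothetical quantile forecasts: each forecast spans several rows,
# one row per quantile level.
forecasts <- data.frame(
  location = c("A", "A", "A", "B", "B", "B"),
  target_date = as.Date("2023-11-15"),
  model = "example-model",
  quantile = c(0.25, 0.5, 0.75, 0.25, 0.5, 0.75),
  predicted = c(8, 10, 12, 18, 20, 25)
)

# Columns assumed to define the forecast unit in this toy example
forecast_unit <- c("location", "target_date", "model")

# Each unique combination of the unit columns identifies one forecast
n_forecasts <- nrow(unique(forecasts[, forecast_unit]))
n_forecasts
```

Here six rows collapse into two forecasts, because `quantile` and `predicted` vary within a forecast and are therefore not part of the unit.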
Binary file modified data/metrics.rda
Binary file not shown.
10 changes: 5 additions & 5 deletions inst/create-metric-tables.R
@@ -49,7 +49,7 @@ log_score <- list(
 `C` = r"($\checkmark$)",
 `B` = r"($\checkmark$)",
 `Q` = r"($-$)",
-`Properties` = "Proper scoring rule, smaller is better, only evaluates predictive density at observed value (local), penalises over-confidence severely, susceptible to outliers",
+`Properties` = "Proper scoring rule, smaller is better, equals negative log of the predictive density at observed value (local), penalises over-confidence severely, susceptible to outliers",
 `References` = ""
 )

@@ -278,14 +278,14 @@ crps <- list(

 log_score <- list(
 `Metric` = "Log score",
-`Explanation` = r"(The Log score is a proper scoring rule that is simply computed as the log of the predictive density evaluated at the observed value. It is given as
-$$ \text{log score} = \log f(y), $$
+`Explanation` = r"(The Log score is a proper scoring rule that is computed as the negative log of the predictive density evaluated at the observed value. It is given as
+$$ \text{log score} = -\log f(y), $$
 where $f$ is the predictive density function and y is the observed value. For integer-valued forecasts, the log score can be computed as
-$$ \text{log score} = \log p_y, $$
+$$ \text{log score} = -\log p_y, $$
 where $p_y$ is the probability assigned to outcome p by the forecast F.

 **Usage and caveats**:
-Larger values are better, but sometimes the sign is reversed. The log score is sensitive to outliers, as individual negative log score contributions quickly can become very large if the event falls in the tails of the predictive distribution, where $f(y)$ (or $p_y$) is close to zero. Whether or not that is desirable depends ont the application. In scoringutils, the log score cannot be used for integer-valued forecasts, as the implementation requires a predictive density. In contrast to the crps, the log score is a local scoring rule: it's value only depends only on the probability that was assigned to the actual outcome. This property may be desirable for inferential purposes, for example in a Bayesian context (Winkler et al., 1996). In settings where forecasts inform decision making, it may be more appropriate to score forecasts based on the entire predictive distribution.)"
+Smaller values are better, but sometimes the sign is reversed. The log score is sensitive to outliers, as individual log score contributions can become very large if the event falls in a range of the predictive distribution where $f(y)$ (or $p_y$) is close to zero. Whether or not that is desirable depends ont the application. In scoringutils, the log score cannot be used for integer-valued forecasts, as the implementation requires a predictive density. In contrast to the crps, the log score is a local scoring rule: it's value only depends only on the probability that was assigned to the actual outcome. This property may be desirable for inferential purposes, for example in a Bayesian context (Winkler et al., 1996). In settings where forecasts inform decision making, it may be more appropriate to score forecasts based on the entire predictive distribution.)"
 )

wis <- list(
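
The sign convention adopted in this hunk (log score as the negative log predictive density, smaller is better) can be checked with a small sketch. It assumes a standard normal predictive distribution purely for illustration, and `log_score_normal()` is a hypothetical helper, not a scoringutils function.

```r
# Log score under the "smaller is better" convention: -log f(y).
# Assumption: the forecast is a normal distribution with known mean and sd.
log_score_normal <- function(observed, mean = 0, sd = 1) {
  -dnorm(observed, mean = mean, sd = sd, log = TRUE)
}

at_mode <- log_score_normal(0)  # observation at the centre of the forecast
in_tail <- log_score_normal(3)  # observation three standard deviations out

at_mode  # 0.5 * log(2 * pi), about 0.919
in_tail  # about 5.419
```

The tail observation is penalised by an extra 4.5 units relative to the one at the mode, which is the sensitivity to low-density outcomes that the explanation text describes.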
583 changes: 250 additions & 333 deletions inst/manuscript/manuscript.Rmd

Large diffs are not rendered by default.

194 changes: 0 additions & 194 deletions inst/manuscript/manuscript.aux

This file was deleted.

419 changes: 0 additions & 419 deletions inst/manuscript/manuscript.bbl

This file was deleted.

48 changes: 0 additions & 48 deletions inst/manuscript/manuscript.blg

This file was deleted.

23 changes: 0 additions & 23 deletions inst/manuscript/manuscript.out

This file was deleted.

Binary file added inst/manuscript/output/flowchart-score.png
Binary file added inst/manuscript/output/illustration-score.png
Binary file added inst/manuscript/output/input-score.png
Binary file added inst/manuscript/output/pairwise-comparisons.png
Binary file added inst/manuscript/output/workflow.png
156 changes: 145 additions & 11 deletions inst/manuscript/references.bib
@@ -1,14 +1,3 @@
-@Article{scoringRules,
-title = {Evaluating Probabilistic Forecasts with {scoringRules}},
-author = {Alexander Jordan and Fabian Kr\"uger and Sebastian Lerch},
-journal = {Journal of Statistical Software},
-year = {2019},
-volume = {90},
-number = {12},
-pages = {1--37},
-doi = {10.18637/jss.v090.i12},
-}
-
 @Manual{Metrics,
 title = {Metrics: Evaluation Metrics for Machine Learning},
 author = {Ben Hamner and Michael Frasco},
@@ -78,3 +67,148 @@ @Manual{kableExtra
 note = {R package version 1.3.4},
 url = {https://CRAN.R-project.org/package=kableExtra},
 }
+
+@Article{scoringRules,
+title = {Evaluating Probabilistic Forecasts with {scoringRules}},
+author = {Alexander Jordan and Fabian Kr\"uger and Sebastian Lerch},
+journal = {Journal of Statistical Software},
+year = {2019},
+volume = {90},
+number = {12},
+pages = {1--37},
+doi = {10.18637/jss.v090.i12},
+}
+
+@Manual{predtools,
+title = {predtools: Prediction Model Tools},
+author = {Mohsen Sadatsafavi and Abdollah Safari and Tae Yoon Lee},
+year = {2023},
+note = {R package version 0.0.3},
+url = {https://CRAN.R-project.org/package=predtools},
+}
+
+@Manual{probably,
+title = {probably: Tools for Post-Processing Class Probability Estimates},
+author = {Max Kuhn and Davis Vaughan and Edgar Ruiz},
+year = {2023},
+note = {R package version 1.0.2},
+url = {https://CRAN.R-project.org/package=probably},
+}
+
+@Manual{yardstick,
+title = {yardstick: Tidy Characterizations of Model Performance},
+author = {Max Kuhn and Davis Vaughan and Emil Hvitfeldt},
+year = {2023},
+note = {R package version 1.2.0},
+url = {https://CRAN.R-project.org/package=yardstick},
+}
+
+@Manual{GLMMadaptive,
+title = {GLMMadaptive: Generalized Linear Mixed Models using Adaptive Gaussian
+Quadrature},
+author = {Dimitris Rizopoulos},
+year = {2023},
+note = {R package version 0.9-0},
+url = {https://CRAN.R-project.org/package=GLMMadaptive},
+}
+
+@Article{surveillance,
+author = {Sebastian Meyer and Leonhard Held and Michael Höhle},
+title = {Spatio-Temporal Analysis of Epidemic Phenomena Using the {R} Package {surveillance}},
+journal = {Journal of Statistical Software},
+year = {2017},
+volume = {77},
+number = {11},
+pages = {1--55},
+doi = {10.18637/jss.v077.i11},
+}
+
+@Article{surveillance2,
+author = {Maëlle Salmon and Dirk Schumacher and Michael Höhle},
+title = {Monitoring Count Time Series in {R}: Aberration Detection in Public Health Surveillance},
+journal = {Journal of Statistical Software},
+year = {2016},
+volume = {70},
+number = {10},
+pages = {1--35},
+doi = {10.18637/jss.v070.i10},
+}
+
+@Manual{cvGEE,
+title = {cvGEE: Cross-Validated Predictions from GEE},
+author = {Dimitris Rizopoulos},
+year = {2019},
+note = {R package version 0.3-0},
+url = {https://CRAN.R-project.org/package=cvGEE},
+}
+
+@Article{scoring,
+title = {Choosing a Strictly Proper Scoring Rule},
+author = {Edgar C. Merkle and Mark Steyvers},
+journal = {Decision Analysis},
+year = {2013},
+volume = {10},
+pages = {292--304},
+}
+
+@Manual{verification,
+title = {verification: Weather Forecast Verification Utilities},
+author = {NCAR - Research Applications Laboratory},
+year = {2015},
+note = {R package version 1.42},
+url = {https://CRAN.R-project.org/package=verification},
+}
+
+@Manual{SpecsVerification,
+title = {SpecsVerification: Forecast Verification Routines for Ensemble Forecasts of Weather
+and Climate},
+author = {Stefan Siegert},
+year = {2020},
+note = {R package version 0.5-3},
+url = {https://CRAN.R-project.org/package=SpecsVerification},
+}
+
+@Article{tidyverse,
+title = {Welcome to the {tidyverse}},
+author = {Hadley Wickham and Mara Averick and Jennifer Bryan and Winston Chang and Lucy D'Agostino McGowan and Romain François and Garrett Grolemund and Alex Hayes and Lionel Henry and Jim Hester and Max Kuhn and Thomas Lin Pedersen and Evan Miller and Stephan Milton Bache and Kirill Müller and Jeroen Ooms and David Robinson and Dana Paige Seidel and Vitalie Spinu and Kohske Takahashi and Davis Vaughan and Claus Wilke and Kara Woo and Hiroaki Yutani},
+year = {2019},
+journal = {Journal of Open Source Software},
+volume = {4},
+number = {43},
+pages = {1686},
+doi = {10.21105/joss.01686},
+}
+
+@Manual{tidymodels,
+title = {Tidymodels: a collection of packages for modeling and machine learning using tidyverse principles.},
+author = {Max Kuhn and Hadley Wickham},
+url = {https://www.tidymodels.org},
+year = {2020},
+}
+
+@Article{checkmate,
+title = {{checkmate}: Fast Argument Checks for Defensive {R} Programming},
+author = {Michel Lang},
+journal = {The R Journal},
+year = {2017},
+doi = {10.32614/RJ-2017-028},
+pages = {437--445},
+volume = {9},
+number = {1},
+}
+
+@Manual{data.table,
+title = {data.table: Extension of `data.frame`},
+author = {Matt Dowle and Arun Srinivasan},
+year = {2023},
+note = {R package version 1.14.8},
+url = {https://CRAN.R-project.org/package=data.table},
+}
+
+@Manual{fabletools,
+title = {fabletools: Core Tools for Packages in the 'fable' Framework},
+author = {Mitchell O'Hara-Wild and Rob Hyndman and Earo Wang},
+year = {2023},
+note = {R package version 0.3.4},
+url = {https://CRAN.R-project.org/package=fabletools},
+}