Issue #403: Rename available_forecasts() to get_forecast_counts() #511

Merged: 12 commits merged on Dec 14, 2023
7 changes: 3 additions & 4 deletions NAMESPACE
@@ -1,6 +1,5 @@
# Generated by roxygen2: do not edit by hand

S3method(plot,scoringutils_available_forecasts)
S3method(print,scoringutils_check)
S3method(quantile_to_interval,data.frame)
S3method(quantile_to_interval,numeric)
@@ -19,8 +18,6 @@ export(add_coverage)
export(add_pairwise_comparison)
export(ae_median_quantile)
export(ae_median_sample)
export(avail_forecasts)
export(available_forecasts)
export(available_metrics)
export(bias_quantile)
export(bias_range)
@@ -31,6 +28,7 @@ export(crps_sample)
export(dispersion)
export(dss_sample)
export(get_duplicate_forecasts)
export(get_forecast_counts)
export(get_forecast_type)
export(get_forecast_unit)
export(interval_coverage_deviation_quantile)
@@ -49,8 +47,8 @@ export(overprediction)
export(pairwise_comparison)
export(pit)
export(pit_sample)
export(plot_avail_forecasts)
export(plot_correlation)
export(plot_forecast_counts)
export(plot_heatmap)
export(plot_interval_coverage)
export(plot_pairwise_comparison)
@@ -88,6 +86,7 @@ importFrom(checkmate,assert_list)
importFrom(checkmate,assert_logical)
importFrom(checkmate,assert_number)
importFrom(checkmate,assert_numeric)
importFrom(checkmate,assert_string)
importFrom(checkmate,assert_vector)
importFrom(checkmate,check_atomic_vector)
importFrom(checkmate,check_data_frame)
10 changes: 5 additions & 5 deletions NEWS.md
@@ -25,10 +25,10 @@ The update introduces breaking changes. If you want to keep using the older vers
- `add_coverage()` was reworked completely. Its new purpose is now to add coverage information to the raw forecast data (essentially fulfilling some of the functionality that was previously covered by `score_quantile()`)
- The function `find_duplicates()` was renamed to `get_duplicate_forecasts()`
- Changes to `avail_forecasts()` and `plot_avail_forecasts()`:
- The function `avail_forecasts()` was renamed to `available_forecasts()` for consistency with `available_metrics()`. The old function, `avail_forecasts()` is still available as an alias, but will be removed in the future.
- For clarity, the output column in `avail_forecasts()` was renamed from "Number forecasts" to "count".
- `available_forecasts()` now also displays combinations where there are 0 forecasts, instead of silently dropping corresponding rows.
- `plot_avail_forecasts()` has been deprecated in favour of an S3 method for `plot()`. An alias is still available, but will be removed in the future.
- The function `avail_forecasts()` was renamed to `get_forecast_counts()`. This represents a change in the naming convention where we aim to name functions that provide the user with additional useful information about the data with a prefix "get_". See Issues #403 and #521 and PR #511 by @nikosbosse, reviewed by @seabbs, for details.
- For clarity, the output column in `get_forecast_counts()` was renamed from "Number forecasts" to "count".
- `get_forecast_counts()` now also displays combinations where there are 0 forecasts, instead of silently dropping corresponding rows.
- `plot_avail_forecasts()` was renamed `plot_forecast_counts()` in line with the change in the function name. The `x` argument no longer has a default value, as the value will depend on the data provided by the user.
- The deprecated `..density..` was replaced with `after_stat(density)` in ggplot calls.
- Files ending in ".Rda" were renamed to ".rds" where appropriate when used together with `saveRDS()` or `readRDS()`.
- added documentation for the return value of `summarise_scores()`.
@@ -188,7 +188,7 @@ to a function `summarise_scores()`
- New function `check_forecasts()` to analyse input data before scoring
- New function `correlation()` to compute correlations between different metrics
- New function `add_coverage()` to add coverage for specific central prediction intervals.
- New function `available_forecasts()` allows to visualise the number of available forecasts.
- New function `avail_forecasts()` allows users to visualise the number of available forecasts.
- New function `find_duplicates()` to find duplicate forecasts which cause an error.
- All plotting functions were renamed to begin with `plot_`. Arguments were
simplified.
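
The zero-count behaviour described in the NEWS entries above (combinations with 0 forecasts are now kept rather than silently dropped) can be sketched on toy data. This is a minimal illustration that uses data.table directly, not scoringutils itself; the table `forecasts` and the helper `count_with_zeros()` are hypothetical names invented here, and the real `get_forecast_counts()` may differ in detail.

```r
library(data.table)

# Hypothetical toy data: model "B" has no "deaths" forecasts.
forecasts <- data.table(
  model       = c("A", "A", "B"),
  target_type = c("cases", "deaths", "cases")
)

# Hypothetical helper (not part of scoringutils): count rows per group and
# keep combinations that never occur, with a count of 0.
count_with_zeros <- function(data, by) {
  counts <- data[, .(count = .N), by = by]

  # Build every combination of the unique values of the grouping columns.
  unique_values <- lapply(by, function(col) unique(data[[col]]))
  names(unique_values) <- by
  all_combinations <- do.call(CJ, unique_values)

  # Right join: combinations without forecasts get NA, which is set to 0.
  out <- merge(counts, all_combinations, by = by, all.y = TRUE)
  out[, count := nafill(count, fill = 0)]
  out[]
}

result <- count_with_zeros(forecasts, by = c("model", "target_type"))
print(result)
```

In this sketch the combination of model "B" and target type "deaths" appears in `result` with a count of 0 instead of being absent.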
24 changes: 3 additions & 21 deletions R/available_forecasts.R
@@ -31,10 +31,10 @@
#' @examples
#' data.table::setDTthreads(1) # only needed to avoid issues on CRAN
#'
#' available_forecasts(example_quantile,
#' get_forecast_counts(example_quantile,
#' by = c("model", "target_type")
#' )
available_forecasts <- function(data,
get_forecast_counts <- function(data,
by = NULL,
collapse = c("quantile", "sample_id")) {

@@ -58,7 +58,7 @@ available_forecasts <- function(data,
data <- data[data[, .I[1], by = collapse_by]$V1]

# count number of rows = number of forecasts
out <- data[, .(`count` = .N), by = by]
out <- data[, .(count = .N), by = by]

# make sure that all combinations in "by" are included in the output (with
# count = 0). To achieve that, take the unique values in data and expand grid
@@ -70,23 +70,5 @@ available_forecasts <- function(data,
out <- merge(out, out_empty, by = by, all.y = TRUE)
out[, count := nafill(count, fill = 0)]

class(out) <- c("scoringutils_available_forecasts", class(out))

return(out[])
}

#' @title Count Number of Available Forecasts `r lifecycle::badge("deprecated")`
#' @details `r lifecycle::badge("deprecated")` Deprecated in 1.2.2. Use
#' [available_forecasts()] instead.
#' @inherit available_forecasts
#' @keywords check-forecasts
#' @export
avail_forecasts <- function(data,
by = NULL,
collapse = c("quantile", "sample")) {
lifecycle::deprecate_warn(
"1.2.2", "avail_forecasts()",
"available_forecasts()"
)
available_forecasts(data, by, collapse)
}
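
The line `data[data[, .I[1], by = collapse_by]$V1]` in the hunk above keeps the first row of each group, so that a forecast stored as several quantile or sample rows is counted once. A minimal sketch of that idiom follows. The toy table and its columns are hypothetical, and the way `collapse_by` is derived here is an assumption, since the corresponding lines are not visible in this diff.

```r
library(data.table)

# Hypothetical toy data: two forecasts, each stored as several quantile rows.
toy <- data.table(
  model      = c("A", "A", "A", "B", "B"),
  quantile   = c(0.25, 0.5, 0.75, 0.25, 0.75),
  prediction = c(1, 2, 3, 4, 5)
)

# Assumption: group by every column except those that vary within a forecast.
collapse_by <- setdiff(colnames(toy), c("quantile", "prediction"))

# .I holds row numbers; .I[1] is the first row number of each group and
# lands in the automatically named column V1.
first_rows <- toy[, .I[1], by = collapse_by]$V1
one_row_per_forecast <- toy[first_rows]

# Each remaining row is one forecast, so .N counts forecasts per model.
counts <- one_row_per_forecast[, .(count = .N), by = "model"]
print(counts)
```

Selecting rows through `.I` avoids copying each group's data, which is why this pattern is common in data.table code for "first row per group" operations.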
100 changes: 32 additions & 68 deletions R/plot.R
@@ -470,7 +470,7 @@ plot_predictions <- function(data,
# it separately here to deal with the case when only the median is provided
# (in which case ggdist::geom_lineribbon() will fail)
if (0 %in% range) {
select_median <- (forecasts$range %in% 0 & forecasts$boundary == "lower")
select_median <- (forecasts$range == 0 & forecasts$boundary == "lower")
median <- forecasts[select_median]

if (nrow(median) > 0) {
@@ -941,54 +941,58 @@ plot_pit <- function(pit,
#'
#' @description
#' Visualise Where Forecasts Are Available
#' @inheritParams print.scoringutils_check
#' @param x an S3 object of class "scoringutils_available_forecasts"
#' as produced by [available_forecasts()]
#' @param yvar character vector of length one that denotes the name of the column
#' @param forecast_counts a data.table (or similar) with a column `count`
#' holding forecast counts, as produced by [get_forecast_counts()]
#' @param x character vector of length one that denotes the name of the column
#' to appear on the x-axis of the plot.
#' @param y character vector of length one that denotes the name of the column
#' to appear on the y-axis of the plot. Default is "model".
#' @param xvar character vector of length one that denotes the name of the column
#' to appear on the x-axis of the plot. Default is "forecast_date".
#' @param make_xvar_factor logical (default is TRUE). Whether or not to convert
#' @param x_as_factor logical (default is TRUE). Whether or not to convert
#' the variable on the x-axis to a factor. This has an effect e.g. if dates
#' are shown on the x-axis.
#' @param show_numbers logical (default is `TRUE`) that indicates whether
#' @param show_counts logical (default is `TRUE`) that indicates whether
#' or not to show the actual count numbers on the plot
#' @return ggplot object with a plot of forecast counts
#' @importFrom ggplot2 ggplot scale_colour_manual scale_fill_manual
#' geom_tile scale_fill_gradient .data
#' @importFrom data.table dcast .I .N
#' @importFrom checkmate assert_string assert_logical assert
#' @export
#' @examples
#' library(ggplot2)
#' available_forecasts <- available_forecasts(
#' forecast_counts <- get_forecast_counts(
#' example_quantile, by = c("model", "target_type", "target_end_date")
#' )
#' plot(
#' available_forecasts, xvar = "target_end_date", show_numbers = FALSE
#' plot_forecast_counts(
#' forecast_counts, x = "target_end_date", show_counts = FALSE
#' ) +
#' facet_wrap("target_type")

plot.scoringutils_available_forecasts <- function(x,
yvar = "model",
xvar = "forecast_date",
make_xvar_factor = TRUE,
show_numbers = TRUE,
...) {
x <- as.data.table(x)

if (make_xvar_factor) {
x[, eval(xvar) := as.factor(get(xvar))]
plot_forecast_counts <- function(forecast_counts,
x,
y = "model",
x_as_factor = TRUE,
show_counts = TRUE) {

forecast_counts <- ensure_data.table(forecast_counts)
assert_string(y)
assert_string(x)
assert(check_columns_present(forecast_counts, c(y, x)))
assert_logical(x_as_factor)
assert_logical(show_counts)

if (x_as_factor) {
forecast_counts[, eval(x) := as.factor(get(x))]
}

setnames(x, old = "count", new = "Count")
setnames(forecast_counts, old = "count", new = "Count")

plot <- ggplot(
x,
aes(y = .data[[yvar]], x = .data[[xvar]])
forecast_counts,
aes(y = .data[[y]], x = .data[[x]])
) +
geom_tile(aes(fill = `Count`),
width = 0.97, height = 0.97
) +
width = 0.97, height = 0.97) +
scale_fill_gradient(
low = "grey95", high = "steelblue",
na.value = "lightgrey"
@@ -1001,54 +1005,14 @@ plot.scoringutils_available_forecasts <- function(x,
)
) +
theme(panel.spacing = unit(2, "lines"))

if (show_numbers) {
if (show_counts) {
plot <- plot +
geom_text(aes(label = `Count`))
}

return(plot)
}


#' @title Visualise Where Forecasts Are Available `r lifecycle::badge("deprecated")`
#'
#' @description
#' Old version of [plot.scoringutils_available_forecasts()] for compatibility.
#' @inheritParams plot.scoringutils_available_forecasts
#' @param available_forecasts an S3 object of class "scoringutils_available_forecasts"
#' as produced by [available_forecasts()]
#' @param y character vector of length one that denotes the name of the column
#' to appear on the y-axis of the plot. Default is "model".
#' @param x character vector of length one that denotes the name of the column
#' to appear on the x-axis of the plot. Default is "forecast_date".
#' @param make_x_factor logical (default is TRUE). Whether or not to convert
#' the variable on the x-axis to a factor. This has an effect e.g. if dates
#' are shown on the x-axis.
#' @export
plot_avail_forecasts <- function(available_forecasts,
y = "model",
x = "forecast_date",
make_x_factor = TRUE,
show_numbers = TRUE) {

lifecycle::deprecate_warn(
"1.2.2", "plot_avail_forecasts()",
"plot()"
)

plot.scoringutils_available_forecasts(
x = available_forecasts,
yvar = y,
xvar = x,
make_xvar_factor = make_x_factor,
show_numbers = show_numbers
)
}
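
The core of `plot_forecast_counts()` is a tile plot whose axes are chosen by column name through the `.data` pronoun. The sketch below reproduces that pattern with ggplot2 alone, on a hypothetical counts table; it leaves out the input checks and theme settings of the package function and is not the package implementation.

```r
library(ggplot2)

# Hypothetical counts table, shaped like the output described in the docs
# above: one row per combination plus a `count` column.
forecast_counts <- data.frame(
  model           = c("A", "A", "B", "B"),
  target_end_date = as.Date(
    c("2021-01-02", "2021-01-09", "2021-01-02", "2021-01-09")
  ),
  count           = c(3, 3, 2, 0)
)

# Column names are passed as strings, as in the `x` and `y` arguments.
x <- "target_end_date"
y <- "model"

# Convert the x variable to a factor so each date gets its own tile column.
forecast_counts[[x]] <- as.factor(forecast_counts[[x]])

p <- ggplot(forecast_counts, aes(x = .data[[x]], y = .data[[y]])) +
  geom_tile(aes(fill = count), width = 0.97, height = 0.97) +
  scale_fill_gradient(
    low = "grey95", high = "steelblue", na.value = "lightgrey"
  ) +
  geom_text(aes(label = count))
```

Printing `p` draws the plot. Because `x` has no default in the renamed function, callers name the x-axis column explicitly, as done here with `target_end_date`.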




#' @title Plot Correlation Between Metrics
#'
#' @description
4 changes: 2 additions & 2 deletions R/validate.R
@@ -111,7 +111,7 @@ validate.scoringutils_sample <- function(data, ...) {
#' - checks there are no duplicate forecasts
#' - if appropriate, checks the number of samples / quantiles is the same
#' for all forecasts
#' @inheritParams available_forecasts
#' @inheritParams get_forecast_counts
#' @return returns the input, with a few new attributes that hold additional
#' information, messages and warnings
#' @importFrom data.table ':=' is.data.table setattr
@@ -162,7 +162,7 @@ validate_general <- function(data) {
#' - makes sure that a column called `model` exists and if not creates one
#' - assigns a class
#'
#' @inheritParams available_forecasts
#' @inheritParams get_forecast_counts
#' @param classname name of the class to be created
#' @return An object of the class indicated by `classname`
#' @export
2 changes: 1 addition & 1 deletion inst/manuscript/R/00-standalone-Figure-replication.R
@@ -537,7 +537,7 @@ p2 + p1 + p_true +
available_forecasts(data = example_integer,
by = c("model", "target_type", "forecast_date")) |>
plot_available_forecasts(x = "forecast_date",
show_numbers = FALSE) +
show_counts = FALSE) +
facet_wrap(~ target_type) +
labs(y = "Model", x = "Forecast date")

2 changes: 1 addition & 1 deletion inst/manuscript/manuscript.Rmd
@@ -432,7 +432,7 @@ library("ggplot2")
available_forecasts(data = example_integer,
by = c("model", "target_type", "forecast_date")) |>
plot_available_forecasts(x = "forecast_date",
show_numbers = FALSE) +
show_counts = FALSE) +
facet_wrap(~ target_type) +
labs(y = "Model", x = "Forecast date")
```
47 changes: 0 additions & 47 deletions man/avail_forecasts.Rd

This file was deleted.

8 changes: 4 additions & 4 deletions man/available_forecasts.Rd → man/get_forecast_counts.Rd
