Issue #604 - Add support for nominal forecasts #837

Merged
merged 46 commits into from
Aug 10, 2024

Contributor

@nikosbosse nikosbosse commented Jun 2, 2024

Description

This PR closes #604.

Nominal forecasts are forecasts for outcomes that can fall in one of several unordered categories. This PR implements support for nominal forecasts (see #604, #607, and #608).

Specifically, the PR

  • creates a new nominal_forecast class with
    • an assert_input_nominal function that checks the inputs passed to a scoring function
    • a check_input_nominal function that does the same thing without raising an error. UPDATE: I think I deleted this, as I didn't use it for any checks. See "Clean up input checks" (#840) for some discussion on when to check what.
    • an assert_forecast.forecast_nominal function that checks that a data.table complies with the required input format
    • a default list of metrics, provided via metrics_nominal
    • a new method score.forecast_nominal
  • adds new example data
  • updates as_forecast() to accept a new predicted_label argument.
  • updates get_forecast_type() and adds a check function to make sure that the forecast type is nominal
  • implements the log score for nominal forecasts
  • adds tests
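
To make the log score and the input check concrete, here is a minimal base R sketch for a single forecast. It is not the scoringutils implementation: `logs_nominal_sketch()` and its argument layout are hypothetical, and only the argument names `observed`, `predicted`, and `predicted_label` mirror the ones used in this PR. The sketch assumes the log score is the negative log of the probability assigned to the observed category.

```r
# Hypothetical helper for illustration; not the function added in this PR.
# `predicted`: one probability per category, `predicted_label`: the matching
# category names, `observed`: the category that actually occurred.
logs_nominal_sketch <- function(observed, predicted, predicted_label) {
  # Input checks in the spirit of assert_input_nominal (assumed behaviour)
  stopifnot(
    length(predicted) == length(predicted_label),
    all(predicted >= 0),
    isTRUE(all.equal(sum(predicted), 1)),
    length(observed) == 1,
    observed %in% predicted_label
  )
  # Negative log of the probability assigned to the observed category
  -log(predicted[predicted_label == observed])
}

labels <- c("low", "moderate", "high", "very high")
probs <- c(0.1, 0.2, 0.6, 0.1)
logs_nominal_sketch("high", probs, labels) # equals -log(0.6)
```

Lower values are better; a forecast that puts probability 1 on the observed category scores 0.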

Note:
Throughout the process, I noticed that, sadly, scoringutils is currently not "easily extensible": there are quite a few hoops to jump through to make this go smoothly. Some of this will be simplified in the future when we implement a separate as_forecast_nominal() function instead of a single as_forecast() function that has to do all the guesswork.


Still missing (likely for a future PR)

  • Updating the manuscript to include nominal forecasts
  • Other kinds of docs
    • Creating a vignette that walks through a hubVerse example
  • A helper function that completes the forecast so that users don't have to specify every single option (see "Define input format for categorical forecasts", #608)
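
One way such a completion helper could look is sketched below in base R. Everything here is an assumption made for illustration: the name `complete_nominal_sketch()`, its arguments, and in particular the choice to fill unspecified categories with probability 0 are not taken from the PR or from #608.

```r
# Hypothetical helper: expand a partially specified nominal forecast to all
# possible categories. Assumption: categories the user left out get probability 0.
complete_nominal_sketch <- function(predicted, predicted_label, levels) {
  stopifnot(
    length(predicted) == length(predicted_label),
    all(predicted_label %in% levels),
    !anyDuplicated(predicted_label)
  )
  # Start from zero for every category, then fill in the supplied values
  full <- setNames(numeric(length(levels)), levels)
  full[as.character(predicted_label)] <- predicted
  data.frame(
    predicted_label = factor(levels, levels = levels),
    predicted = unname(full)
  )
}

completed <- complete_nominal_sketch(
  predicted = c(0.7, 0.3),
  predicted_label = c("low", "high"),
  levels = c("low", "moderate", "high", "very high")
)
completed
```

Filling with 0 has a cost under the log score: if a category with probability 0 is observed, the score is infinite, so a real helper might need a different default.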

One current code example:

# remotes::install_github("epiforecasts/scoringutils@multiclass")
library(dplyr)
library(hubExamples)
library(scoringutils)

# Keep only the categorical (pmf) forecasts and the matching observations
pred <- hubExamples::forecast_outputs |>
  dplyr::filter(output_type == "pmf")
obs <- hubExamples::forecast_target_observations |>
  dplyr::filter(output_type == "pmf")
# Join on all shared columns (dplyr reports which columns it used)
hubex <- dplyr::full_join(pred, obs)

hubex |>
  dplyr::group_by(model_id, location, reference_date, horizon, target_end_date, target, output_type) |>
  dplyr::mutate(
    # Within each forecast, the observed category is the one flagged with 1
    observation = output_type_id[observation == 1],
    observation = factor(observation, levels = c("low", "moderate", "high", "very high")),
    output_type_id = factor(output_type_id, levels = c("low", "moderate", "high", "very high"))
  ) |>
  as_forecast(
    model = "model_id", observed = "observation",
    predicted = "value", predicted_label = "output_type_id"
  ) |>
  score()

Checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have included the target issue or issues in the PR title as follows: issue-number: PR title
  • I have tested my changes locally.
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required.
  • I have built the package locally and rebuilt the docs using roxygen2.
  • My code follows the established coding standards and I have run lintr::lint_package() to check for style issues introduced by my changes.
  • I have added a news item linked to this PR.
  • I have reviewed CI checks for this PR and addressed them as far as I am able.

@nikosbosse nikosbosse changed the title from "Multiclass DON'T MERGE" to "Draft for supporting nominal forecasts" Jun 7, 2024
@nikosbosse nikosbosse marked this pull request as draft June 7, 2024 14:47
@nikosbosse nikosbosse changed the title from "Draft for supporting nominal forecasts" to "Issue #604 - Add support for nominal forecasts" Jun 14, 2024
@nikosbosse nikosbosse requested a review from nickreich June 14, 2024 05:25
@nikosbosse nikosbosse requested a review from seabbs July 23, 2024 08:13
R/forecast.R (outdated, resolved)
Collaborator

@nickreich nickreich left a comment

I didn't closely review much of the code related to the formal S3 class setup because I'm not that familiar with the structure/functions used there, but I reviewed the tests and the general set-up of the nominal forecast type, and things look good to me, give or take a few very small optional suggested changes.

Contributor

@seabbs seabbs left a comment

This looks really good I think and also appears correct to me. I don't have substantive comments about this PR aside from one instance of missing docs.

I did, however, use it to review the changes currently needed to add a new class. This has improved by splitting out as_forecast, but there are still a few pain points. It looks like we can deal with nearly all of them by using a bit more S3, which is great.

I think we have discussed this before, but I think this would be much easier to review/parse, and easier for someone new to do, if all the bits that define a specific as_forecast_type were in the same file rather than being split by generic method.
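
As a rough illustration of the "one file per forecast type" layout, the base R sketch below keeps the constructor, the validation method, and the scoring method for one type together. All names ending in `_sketch` are hypothetical and do not exist in scoringutils; the checks and the score are simplified assumptions, not the package's logic.

```r
# --- everything for one hypothetical forecast type lives in this one file ---

# Generics (in a package these would be defined once, elsewhere)
assert_forecast_sketch <- function(x, ...) UseMethod("assert_forecast_sketch")
score_sketch <- function(x, ...) UseMethod("score_sketch")

# Constructor: tag the data with a type-specific class, then validate it
as_forecast_nominal_sketch <- function(data) {
  out <- structure(data, class = c("forecast_nominal_sketch", class(data)))
  assert_forecast_sketch(out)
  out
}

# Validation method for this type (simplified: only checks column names)
assert_forecast_sketch.forecast_nominal_sketch <- function(x, ...) {
  stopifnot(all(c("observed", "predicted", "predicted_label") %in% names(x)))
  invisible(NULL)
}

# Scoring method for this type: log score of a single forecast
score_sketch.forecast_nominal_sketch <- function(x, ...) {
  -log(x$predicted[x$predicted_label == x$observed])
}

fc <- as_forecast_nominal_sketch(data.frame(
  observed = "high",
  predicted_label = c("low", "high"),
  predicted = c(0.25, 0.75)
))
score_sketch(fc) # equals -log(0.75)
```

With this layout, adding another type would mean adding one more file with its own constructor and methods, while the generics stay untouched.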

R/check-inputs-scoring-functions.R (outdated, resolved)
R/get_-functions.R (resolved)
R/get_-functions.R (resolved)
R/get_-functions.R (resolved)
@nikosbosse
Contributor Author

@seabbs some excellent points in your review here. I think moving towards as_forecast_<type>() really was the right call and should allow us to simplify things here quite a bit.

I suggest addressing your points before implementing ordinal forecasts (pinging @nickreich and @elray1 for awareness) as that will make it easier to create the new ordinal class. Since Nick and Evan care about the ordinal forecasts more than the nominal ones I also suggest addressing your points before merging this.

@seabbs
Contributor

seabbs commented Aug 6, 2024

> I also suggest addressing your points before merging this.

I don't mind either way here but I agree it would be a good idea to use the ordinal forecasts as a test case. If it were me I think I would look to merge this, make a new issue with the pain points identified, address in a PR, and then implement ordinal?

@nikosbosse nikosbosse merged commit 867a2ff into main Aug 10, 2024
9 checks passed
@nikosbosse nikosbosse deleted the multiclass branch August 10, 2024 11:42
Development

Successfully merging this pull request may close these issues.

Meta-issue: Create a new scoringutils workflow for scoring pmf forecasts
3 participants