Issue #585: allow users to specify columns and forecast unit in as_forecast() #641

Merged: nikosbosse merged 9 commits into main from update-as_forecast() on Feb 26, 2024

Contributor

@nikosbosse nikosbosse commented Feb 21, 2024

Description

This PR closes #585

As discussed in #585, it is convenient for users to be able to specify the forecast unit, as well as any column renaming they need, as part of as_forecast().

This PR

  • adds additional arguments observed, predicted, model, forecast_unit, quantile_level, sample_id to as_forecast() that allow the users to specify the desired forecast unit as well as desired changes to column names
  • adds checks to validate the inputs to as_forecast()
  • adds tests to check the behaviour is as expected
  • updates the NEWS file
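
A rough sketch of the idea behind the new arguments, written in base R rather than with the package itself. The helper `rename_forecast_columns()` and the example column names (`truth`, `pred`, `note`) are hypothetical and only illustrate what specifying columns and a forecast unit could look like; the actual implementation in R/validate.R may well differ.

```r
# Hypothetical helper, not part of scoringutils: rename user columns to the
# standard names and keep only the columns that define the forecast unit.
rename_forecast_columns <- function(data,
                                    observed = NULL,
                                    predicted = NULL,
                                    forecast_unit = NULL) {
  # NULL entries are dropped by c(), so only supplied names are renamed.
  renames <- c(observed = observed, predicted = predicted)
  for (new_name in names(renames)) {
    old_name <- renames[[new_name]]
    if (!old_name %in% names(data)) {
      stop("Column '", old_name, "' not found in data")
    }
    names(data)[names(data) == old_name] <- new_name
  }
  if (!is.null(forecast_unit)) {
    # Keep the forecast unit plus the standard columns, drop everything else.
    keep <- c(forecast_unit, "observed", "predicted")
    data <- data[, intersect(keep, names(data)), drop = FALSE]
  }
  data
}

raw <- data.frame(
  location = c("DE", "DE"),
  target_date = c("2024-01-01", "2024-01-08"),
  truth = c(10, 12),
  pred = c(9, 13),
  note = c("a", "b")
)

clean <- rename_forecast_columns(
  raw,
  observed = "truth",
  predicted = "pred",
  forecast_unit = c("location", "target_date")
)
names(clean)
```

In this sketch the `note` column is dropped because it is not part of the requested forecast unit.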

Additional thoughts and considerations:

  • the order of the arguments could be different. For example, we might want forecast_unit to be the first argument. Strong opinions? At the moment I put it there because I felt it was more natural to first specify the columns to be renamed and then the forecast unit and then the special columns. But 🤷
  • We could in principle create extra methods for sample and quantile-based forecasts (and then move the arguments sample_id and quantile_level to those methods). As mentioned in #585, I feel this would lead to unnecessary complexity (having to call as_forecast() --> as_forecast.default() --> as_forecast() --> as_forecast.forecast_sample() just to hide an argument that is clearly explained in the docs).

Checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have included the target issue or issues in the PR title as follows: issue-number: PR title
  • I have tested my changes locally.
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required.
  • I have built the package locally and run rebuilt docs using roxygen2.
  • My code follows the established coding standards and I have run lintr::lint_package() to check for style issues introduced by my changes.
  • I have added a news item linked to this PR.
  • I have reviewed CI checks for this PR and addressed them as far as I am able.

codecov bot commented Feb 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.73%. Comparing base (4d3f003) to head (23b708e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #641      +/-   ##
==========================================
+ Coverage   87.53%   87.73%   +0.20%     
==========================================
  Files          21       21              
  Lines        1757     1786      +29     
==========================================
+ Hits         1538     1567      +29     
  Misses        219      219              


@toshiakiasakura

This comment was marked as resolved.

@nikosbosse

This comment was marked as resolved.

@toshiakiasakura

This comment was marked as resolved.

@nikosbosse

This comment was marked as resolved.

@toshiakiasakura

This comment was marked as resolved.

@seabbs

This comment was marked as resolved.

R/validate.R (resolved)
Contributor

@seabbs seabbs left a comment

Aside from my comment about the default args I like this as is. Happy to watch the discussion play out.

@toshiakiasakura

This comment was marked as resolved.

R/validate.R Outdated
Comment on lines 49 to 74
as_forecast.default <- function(data,
                                observed = NULL,
                                predicted = NULL,
                                model = NULL,
                                forecast_unit = NULL,
                                quantile_level = NULL,
                                sample_id = NULL,
                                ...) {
Collaborator

@toshiakiasakura toshiakiasakura Feb 23, 2024

Is there any reason to accept other arguments via ...? If users misspell an argument name, it seems better to me to omit ... so that unknown arguments are rejected.

Would it be better to include a type = NULL argument for users who want to be sure about the forecast type? For example, if type = "quantile" is specified, we could raise an error if the quantile_level argument is not given, or simply raise an error if the returned class does not match this argument. I think this is related to #603.

Contributor Author

Ah those are good points!

Contributor Author

@nikosbosse nikosbosse Feb 23, 2024

I think as_forecast.default() needs ... because the generic has it, and a method always needs to include all arguments of its generic. But I like the idea of having a forecast_type argument.

Contributor

Is that true for ... args though? I'm not sure it is.

Contributor Author

Just checked, it's true for ... args as well:

checking S3 generic/method consistency (1.7s)
   as_forecast:
     function(data, ...)
   as_forecast.default:
     function(data, forecast_unit, forecast_type, observed, predicted,
              model, quantile_level, sample_id)
   See section ‘Generic functions and methods’ in the ‘Writing R
   Extensions’ manual.   
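
To illustrate the consistency rule reported by the check, here is a minimal sketch with a toy generic. The names `as_thing()` and `as_thing.default()` are hypothetical stand-ins for as_forecast() and as_forecast.default(); the point is only that a method can add arguments but keeps `...` when the generic has it, and that a misspelled argument then ends up in `...` without any message.

```r
# Toy generic and method (hypothetical names), mirroring the shape of
# as_forecast() / as_forecast.default().
as_thing <- function(data, ...) {
  UseMethod("as_thing")
}

# The method adds an argument of its own but keeps `...` from the generic.
as_thing.default <- function(data, forecast_unit = NULL, ...) {
  list(data = data, forecast_unit = forecast_unit)
}

# Every argument of the generic also appears in the method.
generic_args_present <- all(
  names(formals(as_thing)) %in% names(formals(as_thing.default))
)

# A misspelled argument name is absorbed by `...` and silently ignored,
# so forecast_unit keeps its default value of NULL.
res <- as_thing(1:3, forecast_units = "location")
is.null(res$forecast_unit)
```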

Collaborator

So instead, would it be better to raise an error for unexpected arguments manually?
Something like the following at the beginning of the function:

extra_args <- setdiff(names(list(...)), names(formals(as_forecast.default)))
if (length(extra_args) > 0) {
  stop(paste("Unknown argument(s):", paste(extra_args, collapse = ", ")))
}
# Current implementation.

I am also happy with it as it is!
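
For reference, a self-contained sketch of how such a manual check could behave. The helper `reject_extra_args()` and the function `as_thing_strict()` are hypothetical and not part of the package; this only illustrates the suggestion and is not what was merged.

```r
# Hypothetical helper: stop if anything was passed through `...`.
reject_extra_args <- function(...) {
  extra <- list(...)
  if (length(extra) > 0) {
    labels <- names(extra)
    # names() is NULL when no extra argument is named.
    if (is.null(labels)) labels <- rep("", length(extra))
    labels[labels == ""] <- "<unnamed>"
    stop("Unknown argument(s): ", paste(labels, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Toy function that keeps `...` in its signature but rejects unknown input.
as_thing_strict <- function(data, observed = NULL, ...) {
  reject_extra_args(...)
  data
}

# A misspelled argument name now produces an error instead of being ignored.
msg <- tryCatch(
  as_thing_strict(1, obsrved = "x"),
  error = function(e) conditionMessage(e)
)
msg
```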

Contributor Author

Hm. Since additional arguments just have no effect at all, I would personally not throw an error for this. Given that the ... are there I would not expect an error as a user if I provide an additional argument.

@nikosbosse

This comment was marked as resolved.

@nikosbosse
Contributor Author

nikosbosse commented Feb 23, 2024

I just pushed a new commit that updates the documentation a bit.

Open questions (either before merging this PR or for a new PR):

  • should as_forecast() have a forecast_type argument? This could be used to warn the user if their forecast type does not match the one they want. In the case of binary/point this may be useful in particular
  • do we want the arguments to be observed = NULL or observed = "observed" (see discussion above)

@seabbs
Contributor

seabbs commented Feb 23, 2024

I agree on making it easy for the user here, though it is good to have a back and forth on the different options.

I like the idea of allowing people to manually specify the type as a safety check.

I think either default arg option is fine so happy to leave as is

@nikosbosse
Contributor Author

Perfect. I made the following updates:

  • as_forecast() now has an argument forecast_type
  • added documentation and updated tests for that

Also note that compared to the very first proposal, I changed the order of the arguments, which now is

as_forecast.default <- function(data,
                                forecast_unit = NULL,
                                forecast_type = NULL,
                                observed = NULL,
                                predicted = NULL,
                                model = NULL,
                                quantile_level = NULL,
                                sample_id = NULL,
                                ...) {

It felt more natural that way, as forecast_unit and forecast_type are the arguments you should use almost every time, and then all arguments related to renaming are grouped together.
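
As a sketch of the safety check that forecast_type is meant to provide: the helper `check_forecast_type()` below is hypothetical, and whether the package raises an error or a warning on a mismatch is an assumption here (the thread mentions both options).

```r
# Hypothetical helper: compare the detected forecast type with the one the
# user asked for and fail loudly on a mismatch.
check_forecast_type <- function(detected, desired = NULL) {
  if (!is.null(desired) && !identical(detected, desired)) {
    stop(
      "Forecast type determined to be '", detected,
      "', but the desired type is '", desired, "'.",
      call. = FALSE
    )
  }
  invisible(detected)
}

check_forecast_type("quantile", desired = "quantile")  # types agree
check_forecast_type("point")                           # no desired type given

# A mismatch, e.g. binary data that was detected as a point forecast.
mismatch <- tryCatch(
  check_forecast_type("point", desired = "binary"),
  error = function(e) conditionMessage(e)
)
mismatch
```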

@nikosbosse
Contributor Author

nikosbosse commented Feb 23, 2024

Unsure where the failing snapshots on macOS-latest are coming from. Can't reproduce that locally...
Seems like some update somewhere outside of our control triggered this... Even old tests that were previously passing fail now. I guess we just have to wait?

@nikosbosse
Contributor Author

Update: given that macOS-latest is failing on main as well I'd be tempted to merge in this PR and #633 regardless...
That would allow me to keep developing :)

@seabbs
Contributor

seabbs commented Feb 26, 2024

This snapshot failure is due to the recent ggplot2 update. I think we can ignore it.

R/validate.R (resolved)
@nikosbosse
Contributor Author

Are we happy to merge as is (bypassing branch protection)?

@seabbs seabbs force-pushed the update-as_forecast() branch from 393cc54 to 23b708e on February 26, 2024 22:42
@nikosbosse nikosbosse merged commit 66f139f into main Feb 26, 2024
10 of 12 checks passed
@nikosbosse nikosbosse deleted the update-as_forecast() branch February 26, 2024 23:54
Successfully merging this pull request may close these issues.

Discussion: Let as_forecast explicitly specify column names from user inputted data.
4 participants