-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue #403: Rename available_forecasts()
to get_forecast_counts()
#511
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #511 +/- ##
===========================================
+ Coverage 81.22% 81.87% +0.64%
===========================================
Files 20 20
Lines 1726 1716 -10
===========================================
+ Hits 1402 1405 +3
+ Misses 324 311 -13 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good but the overlap in s3 naming scheme might be potentially confusing to people.
R/available_forecasts.R
Outdated
@@ -70,23 +70,7 @@ available_forecasts <- function(data, | |||
out <- merge(out, out_empty, by = by, all.y = TRUE) | |||
out[, count := nafill(count, fill = 0)] | |||
|
|||
class(out) <- c("scoringutils_available_forecasts", class(out)) | |||
class(out) <- c("forecast_counts", class(out)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have a dangerous overlap with the s3 classes being used already (i.e. forecast_sample
, forecast_quantile
etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point, we should definitely avoid that.
scoringutils_counts
? counts
? prediction_counts
?
I think I like the third one best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to prediction_counts
for now. Maybe it would be good to devise a more systematic naming convention.
At the moment we have our 4 main classes for the forecast types, but then we're also planning to introduce all kinds of one-hit-wonder classes for plotting. Maybe prefixing them with scu_
or something like that would be a good idea?
Updates since last review:
|
Does this suggest the name of the function should also be changed? |
Just to be inconvenient I am also now wondering if |
Hmm. I would really prefer to change the class name again to something else, rather than changing the function |
I'm not that obsessed with what we call it FYI as long as there is no name clash but this does seem to indicate a bit of a logic hole in the overall naming scheme. |
Hm yeah so far we only have a consistent naming for the 4 forecast types. |
avail_forecasts()
to get_forecast_counts()
available_forecasts()
to get_forecast_counts()
Following our recent discussion to revert back to regular plotting functions instead of S3 methods we need to update |
Is the reason for this that linting issues etc have been fixed in this PR? I am not really a fan of merging in none final stuff due to these kind of issues but if that reduces the burden for now I guess will have to go with it. |
99.9% of linting issues were fixed in another PR, but here was one left over due to the long name of the plot method (i.e. > 30 characters). |
okk, I made additional changes following up with your comment about merging in non-final stuff. Created a function plot_forecast_counts() which replaces a previously implemented S3 method for plot(). I also removed the default argument for the variable on the x axis, as this really depends on the input data. The previous value, forecast_date worked well with the example data, but I'm not sure a default really makes that much sense here. |
Todo: add an informative error message if "x" is not given |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. A couple of discussion points.
|
||
```{r} | ||
available_forecasts(example_quantile, by = c("model", "target_type")) | ||
get_forecast_counts(example_quantile, by = c("model", "target_type")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we are settled on as_forecast
I would imagine we want to showcase using that and then running this without the by vs doing it on the unconverted example data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately the result should be exactly the same, regardless of whether you run as_forecast()
on the data - if I'm not mistaken.
At the moment get_forecast_counts()
does some shenanigans with a forecast_unit
attribute, but eventually I think it should just call get_forecast_unit()
internally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my point is that if you demonstrated this workflow with that flow you could avoid setting by
here and also demonstrate our proposed workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not sure I understand. If you don't set by
, then by
will just be the forecast unit and then you end up with a big data.table that has counts either 0 or 1 - regardless of whether you call as_forecast()
before or not.
But of course we can update the example to call as_forecast()
before (or validate()
now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant explicitly setting the forecast unit. I also think this could be part of the as_forecast
workflow as this would be quite clean but that is a separate issue.
Recent changes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All fine. I think you shuold use as_forecast
so that there is one standard way of working with forecasts that is clear to users. Ideally resolve that before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All fine. I think you shuold use as_forecast
so that there is one standard way of working with forecasts that is clear to users. Ideally resolve that before merging.
@seabbs currently not clear to me what you meant by the following:
Could you have a quick look at the unresolved conversation above please and let me know whether you think something should happen or whether this is good to merge? Also happy to open a new issue. |
Merge or not as you wish. I think there is a broader package problem of presenting users with many many workflow options and not being clear which is the main route through the package. |
Merging this now - opened up a new issue, #530 to discuss further |
Description
This PR closes #403. This PR closes #521.
It
avail_forecasts()
andplot_avail_forecasts()
which were aliases for the previously existing functions. My thought in removing them was that we mostly decided not to make backwards compatible changes anyway, and keeping them in increases code / maintenance complexityavailable_forecasts()
toget_forecast_counts()
everywhereplot_forecast_counts()
which replaces a previously implemented S3 method forplot()
. I also removed the default argument for the variable on the x axis, as this really depends on the input data. The previous value,forecast_date
worked well with the example data, but I'm not sure a default really makes that much sense here.avail_forecast()
and has been replaced byavailable_forecasts()
a while ago when we updated the name to `available_forecasts().The PR is related to #510 as the length of the current plotting function is one of the things the linter complains about.
Checklist
I have added or updated unit tests where necessary.lintr::lint_package()
to check for style issues introduced by my changes.