Issue #403: Rename available_forecasts() to get_forecast_counts() #511

Merged
nikosbosse merged 12 commits into develop from rename-avail_forecasts
Dec 14, 2023

Conversation

nikosbosse
Contributor

@nikosbosse nikosbosse commented Nov 28, 2023

Description

This PR closes #403. This PR closes #521.

It

  • removes the functions avail_forecasts() and plot_avail_forecasts(), which were aliases for the previously existing functions. My thinking in removing them was that we had mostly decided not to keep changes backwards compatible anyway, and keeping the aliases increases code and maintenance complexity.
  • renames available_forecasts() to get_forecast_counts() everywhere
  • creates a function plot_forecast_counts() which replaces a previously implemented S3 method for plot(). I also removed the default argument for the variable on the x axis, as this really depends on the input data. The previous value, forecast_date, worked well with the example data, but I'm not sure a default really makes sense here.
  • updates the News item (and also fixes a mistake in a previous news item: I think the function introduced back then was avail_forecast(), which was replaced a while ago when we updated the name to available_forecasts()).
  • fixes a few linting issues (essentially replicating changes in Fix linting #510)

The PR is related to #510 as the length of the current plotting function is one of the things the linter complains about.

Checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have included the target issue or issues in the PR title as follows: issue-number: PR title
  • I have tested my changes locally.
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required.
  • I have built the package locally and rebuilt docs using roxygen2.
  • My code follows the established coding standards and I have run lintr::lint_package() to check for style issues introduced by my changes.
  • I have added a news item linked to this PR.
  • I have reviewed CI checks for this PR and addressed them as far as I am able.

codecov bot commented Nov 28, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (accb868) 81.22% compared to head (edeeae7) 81.87%.
Report is 9 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #511      +/-   ##
===========================================
+ Coverage    81.22%   81.87%   +0.64%     
===========================================
  Files           20       20              
  Lines         1726     1716      -10     
===========================================
+ Hits          1402     1405       +3     
+ Misses         324      311      -13     

@nikosbosse nikosbosse mentioned this pull request Nov 28, 2023
9 tasks
Contributor

@seabbs seabbs left a comment

Looks good, but the overlap in the S3 naming scheme could be confusing to people.

    @@ -70,23 +70,7 @@ available_forecasts <- function(data,
      out <- merge(out, out_empty, by = by, all.y = TRUE)
      out[, count := nafill(count, fill = 0)]

    - class(out) <- c("scoringutils_available_forecasts", class(out))
    + class(out) <- c("forecast_counts", class(out))
Contributor

This seems to have a dangerous overlap with the S3 classes already in use (e.g. forecast_sample, forecast_quantile).

Contributor Author

Very good point, we should definitely avoid that.
scoringutils_counts? counts? prediction_counts?

I think I like the third one best.

Contributor Author

I changed it to prediction_counts for now. Maybe it would be good to devise a more systematic naming convention.
At the moment we have our 4 main classes for the forecast types, but then we're also planning to introduce all kinds of one-hit-wonder classes for plotting. Maybe prefixing them with scu_ or something like that would be a good idea?
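
To make the naming concern concrete, here is a minimal sketch of how a class called forecast_counts could be mistaken for a forecast type. The helper guess_forecast_type() is hypothetical and not part of scoringutils; it assumes, purely for illustration, that some code identifies forecast types by the forecast_ class prefix.

```r
# Hypothetical helper (not a scoringutils function): infers the forecast type
# from the first class name that starts with "forecast_".
guess_forecast_type <- function(x) {
  hits <- grep("^forecast_", class(x), value = TRUE)
  if (length(hits) == 0) {
    return(NA_character_)
  }
  sub("^forecast_", "", hits[1])
}

# Empty placeholder objects that only carry a class attribute.
quantile_forecast <- structure(list(), class = c("forecast_quantile", "list"))
counts_old_name <- structure(list(), class = c("forecast_counts", "list"))
counts_new_name <- structure(list(), class = c("prediction_counts", "list"))

type_quantile <- guess_forecast_type(quantile_forecast)
type_old <- guess_forecast_type(counts_old_name) # looks like a forecast type
type_new <- guess_forecast_type(counts_new_name) # no match
```

Under that assumption, the old class name is indistinguishable from a genuine forecast type, while prediction_counts (or a prefixed name such as scu_counts) is not picked up.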

@nikosbosse
Contributor Author

Updates since last review:

  • changed the class forecast_counts to prediction_counts to avoid a clash with the naming convention for the other classes.

@nikosbosse nikosbosse requested a review from seabbs November 29, 2023 11:03
@seabbs
Contributor

seabbs commented Nov 29, 2023

changed the class forecast_counts to prediction_counts to avoid a clash with the naming convention for the other classes.

Does this suggest the name of the function should also be changed?

@seabbs
Contributor

seabbs commented Nov 29, 2023

Just to be inconvenient, I am now also wondering whether get_forecast_unit should be renamed to get_prediction_unit.

@nikosbosse
Contributor Author

Hmm. I would really prefer to change the class name again to something else, rather than changing the function get_forecast_unit - I think the term was used a lot in the past and apart from this conflict there is no real reason to change it. We have get_forecast_unit, we have get_forecast_type and I think they play nicely together.
Maybe call the new class fcount?

@seabbs
Contributor

seabbs commented Nov 29, 2023

FYI, I'm not that obsessed with what we call it as long as there is no name clash, but this does seem to indicate a bit of a logic hole in the overall naming scheme.

@nikosbosse
Contributor Author

Hm yeah so far we only have a consistent naming for the 4 forecast types.
In this instance (and a few others) we're basically creating a class with the sole purpose of providing a plot method for the output.
In principle we could just not do it if we want to have as few classes as possible (Seb mentioned Hugo had strong feelings about this for example).
If we stick with a dedicated class, maybe we could consistently just give it a class name equal to the function name? So in this instance the output would be of class get_forecast_counts.
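
As a rough illustration of the "class name equals function name" idea, the sketch below tags the output of a counting function with a class named after the function. Both count_by_model() and its plot method are hypothetical stand-ins written in base R, not the scoringutils implementation.

```r
# Hypothetical stand-in for a counting function (not scoringutils code).
count_by_model <- function(data) {
  out <- as.data.frame(table(model = data$model), responseName = "count")
  # Tag the output with a class named after the function that created it.
  class(out) <- c("count_by_model", class(out))
  out
}

# A plot method can then be registered under the same name.
plot.count_by_model <- function(x, ...) {
  barplot(x$count, names.arg = as.character(x$model), ...)
  invisible(x)
}

counts <- count_by_model(data.frame(model = c("a", "a", "b")))
counts_class <- class(counts)
```

The convention makes it obvious where an object came from, at the cost of one extra class per function.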

@nikosbosse nikosbosse mentioned this pull request Nov 30, 2023
@nikosbosse nikosbosse changed the title from "Issue #403: Rename avail_forecasts() to get_forecast_counts()" to "Issue #403: Rename available_forecasts() to get_forecast_counts()" Dec 5, 2023
@nikosbosse
Contributor Author

Following our recent discussion to revert to regular plotting functions instead of S3 methods, we need to update get_forecast_counts() and the associated plot method. I suggest doing this in a different PR (see #521) and merging this now (so that CI checks can finally pass for all PRs).

@seabbs
Contributor

seabbs commented Dec 5, 2023

Is the reason for this that linting issues etc. have been fixed in this PR? I am not really a fan of merging in non-final stuff due to these kinds of issues, but if that reduces the burden for now I guess I will have to go with it.

@nikosbosse
Contributor Author

nikosbosse commented Dec 5, 2023

99.9% of linting issues were fixed in another PR, but there was one left over due to the long name of the plot method (i.e. > 30 characters).
If you prefer I can also do the reverting back in this PR. I just thought it would be cleaner if this PR renames the function (and drops the old ones) and the next PR handles the plot method and the docs etc.
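
For reference, the name lengths involved can be checked with nchar(). The method name below is inferred from the old class name shown in the diff and the usual generic.class naming of S3 methods; exactly how the linter counts the characters of an S3 method name is not confirmed here.

```r
old_class <- "scoringutils_available_forecasts"
old_method <- paste0("plot.", old_class) # inferred S3 method name
new_function <- "plot_forecast_counts"

# Number of characters in each name.
name_lengths <- c(
  old_class = nchar(old_class),       # 32
  old_method = nchar(old_method),     # 37
  new_function = nchar(new_function)  # 20
)
```

Either way the old name is counted, it exceeds the 30 character limit mentioned above, while the plain function name stays well below it.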

@nikosbosse
Contributor Author

OK, I made additional changes following up on your comment about merging in non-final stuff.

Created a function plot_forecast_counts() which replaces a previously implemented S3 method for plot(). I also removed the default argument for the variable on the x axis, as this really depends on the input data. The previous value, forecast_date, worked well with the example data, but I'm not sure a default really makes sense here.

@nikosbosse nikosbosse requested a review from seabbs December 5, 2023 12:47
@nikosbosse
Contributor Author

Todo: add an informative error message if "x" is not given
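
One possible shape for such a check is sketched below, using missing() to detect the absent argument. The function name plot_counts_sketch() and the wording of the message are made up for illustration; the actual function may handle this differently.

```r
# Hypothetical sketch of the argument check (not the scoringutils code).
plot_counts_sketch <- function(counts, x) {
  if (missing(x)) {
    stop(
      "Argument `x` is missing. Please name the column of `counts` ",
      "to show on the x axis, e.g. x = \"forecast_date\".",
      call. = FALSE
    )
  }
  # Placeholder for the actual plotting code.
  invisible(x)
}

# Calling without `x` now fails with a readable message instead of an
# obscure "argument is missing, with no default" error further down.
message_without_x <- tryCatch(
  plot_counts_sketch(data.frame(count = 1)),
  error = function(e) conditionMessage(e)
)
```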

Contributor

@seabbs seabbs left a comment

Overall looks good. A couple of discussion points.

      ```{r}
    - available_forecasts(example_quantile, by = c("model", "target_type"))
    + get_forecast_counts(example_quantile, by = c("model", "target_type"))
Contributor

Once we have settled on as_forecast, I would imagine we want to showcase using that and then running this without the by argument, rather than doing it on the unconverted example data.

Contributor Author

Ultimately the result should be exactly the same, regardless of whether you run as_forecast() on the data - if I'm not mistaken.
At the moment get_forecast_counts() does some shenanigans with a forecast_unit attribute, but eventually I think it should just call get_forecast_unit() internally

Contributor

My point is that if you demonstrated this with the as_forecast workflow, you could avoid setting by here and also demonstrate our proposed workflow.

Contributor Author

Still not sure I understand. If you don't set by, then by will just be the forecast unit and then you end up with a big data.table that has counts either 0 or 1 - regardless of whether you call as_forecast() before or not.

But of course we can update the example to call as_forecast() before (or validate() now)
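
The effect of the grouping columns on the counts can be sketched with data.table on toy data. The function count_forecasts() below is a simplified, hypothetical stand-in that mirrors the merge() and nafill() steps visible in the diff; it assumes one row per forecast, and the column names are made up.

```r
library(data.table)

# Toy data: one row per forecast (an assumption of this sketch).
forecasts <- data.table(
  model = c("a", "a", "a", "b"),
  location = c("DE", "FR", "DE", "DE"),
  forecast_date = c("2021-05-03", "2021-05-03", "2021-05-10", "2021-05-03")
)

# Hypothetical stand-in for the counting step (not the scoringutils code).
count_forecasts <- function(data, cols) {
  out <- data[, .(count = .N), by = cols]
  # Build every combination of the grouping values so absent ones appear too.
  values <- lapply(cols, function(col) unique(data[[col]]))
  names(values) <- cols
  out_empty <- do.call(CJ, values)
  out <- merge(out, out_empty, by = cols, all.y = TRUE)
  # Combinations without any forecast get a count of 0.
  out[, count := nafill(count, fill = 0L)]
  out[]
}

# Grouping by all columns (the full forecast unit here): counts are 0 or 1.
by_unit <- count_forecasts(forecasts, c("model", "location", "forecast_date"))

# Grouping by model only: counts are aggregated per model.
by_model <- count_forecasts(forecasts, "model")
```

With the full unit as grouping, the table only records presence or absence of each forecast, which is why a coarser grouping is usually more informative.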

Contributor

I meant explicitly setting the forecast unit. I also think this could be part of the as_forecast workflow as this would be quite clean but that is a separate issue.

@nikosbosse
Contributor Author

Recent changes:

  • changed the order of the x and y argument
  • made it clear that the input data expects a column "count"
  • updated the input checks again to check that the required columns are present
  • renamed arguments make_x_factor --> x_as_factor and show_numbers --> show_counts
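
A column check of the kind described in the list above might look roughly like this. The helper check_count_columns() and its error message are hypothetical; only the requirement of a "count" column and the x and y arguments come from the comment.

```r
# Hypothetical input check (not the scoringutils implementation).
check_count_columns <- function(counts, x, y) {
  required <- c("count", x, y)
  missing_cols <- setdiff(required, names(counts))
  if (length(missing_cols) > 0) {
    stop(
      "Missing column(s) in input data: ",
      paste(missing_cols, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

counts <- data.frame(model = "a", forecast_date = "2021-05-03", count = 1)

# Passes silently when all columns exist.
ok <- check_count_columns(counts, x = "forecast_date", y = "model")

# Fails with a message naming the absent column.
failure <- tryCatch(
  check_count_columns(counts, x = "horizon", y = "model"),
  error = function(e) conditionMessage(e)
)
```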

@nikosbosse nikosbosse requested a review from seabbs December 6, 2023 22:38
Contributor

@seabbs seabbs left a comment

All fine. I think you should use as_forecast so that there is one standard way of working with forecasts that is clear to users. Ideally resolve that before merging.

@nikosbosse
Contributor Author

@seabbs currently not clear to me what you meant by the following:

All fine. I think you should use as_forecast so that there is one standard way of working with forecasts that is clear to users. Ideally resolve that before merging.

Could you have a quick look at the unresolved conversation above please and let me know whether you think something should happen or whether this is good to merge? Also happy to open a new issue.

@seabbs
Contributor

seabbs commented Dec 14, 2023

Merge or not as you wish. I think there is a broader package problem of presenting users with many many workflow options and not being clear which is the main route through the package.

@nikosbosse
Contributor Author

Merging this now - opened up a new issue, #530 to discuss further

@nikosbosse nikosbosse merged commit c11ce14 into develop Dec 14, 2023
11 checks passed
@nikosbosse nikosbosse deleted the rename-avail_forecasts branch December 14, 2023 12:49