`pilot_tune_grid`: function for running a few quick fits #722

Conversation
Heard. I'm into this idea. 🙂

Here's an initial pass on review! I'd say thumbs up on moving on to unit tests; let's hold off on `pilot_fit_resamples()` pending further discussion about how agnostic to the tuning approach this function should be.
```r
#' parameters that correspond to a model or recipe and fits the model on only
#' one of the resample splits, for purposes of quick error checking.
#'
#' @param object A `parsnip` model specification or a [workflows::workflow()].
```
Let's `@inheritParams` these.
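For illustration, inheriting the shared parameter docs from `tune_grid()` could look like the sketch below (which topic to inherit from, and the generic's shape, are my assumptions, not the PR's actual code):

```r
#' Pilot a grid tuning setup on a single resample
#'
#' @inheritParams tune_grid
#' @export
pilot_tune_grid <- function(object, ...) {
  UseMethod("pilot_tune_grid")
}
```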
```r
#' See [tune_grid()] for details of defaults and computation.
#'
#' @export
pilot_tune_grid <- function(object, ...) {
```
Some hesitation from me on the name. Ostensibly this could be useful in racing and simulated annealing as well, and in that case I think this name would suggest we ought to also have methods for those other tuning approaches, so maybe a name that's agnostic to the tuning approach? I had thought `tune_pilot()` for a moment, but I like your convention of breaking from the `tune_*()` pattern to be clear this ought not be used in place of others matching the pattern. Max suggested `pilot_candidates()`. Not sure if I have strong suggestions on names yet.
```r
)

# Cut grid down to extreme values only for piloting
grid <- grid |>
```
```diff
- grid <- grid |>
+ grid <- grid %>%
```

Let's use the magrittr pipe—we don't yet require R 4.1.0 from our users!

> Line 18 in 74854a5: `R (>= 4.0)`
And here I thought I was being hip and modern in this PR... because I haven't actually converted to the base pipe in my own work yet! I'll swap it back.
```r
# ------------------------------------------------------------------------------

pilot_tune_grid_workflow <- function(workflow,
```
I appreciate your eye for the way other tuning functions do this in your architecting.🙂
copy-paste-tweak ftw
```r
# Cut grid down to extreme values only for piloting
grid <- grid |>
  arrange(across(matches(".")))
```
I think we might want to look into a more performant variant of this line if we imagine this helper being iterated over, as in `workflow_map()`.
Also, we try to call almost everything by namespace in case it is ever used in PSOCK clusters for parallel processing.
I'll pull it into a helper function (`choose_grid_candidates` or something) and then that helper function can be changed in the future.
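A minimal sketch of what such a helper might look like (the name `choose_grid_candidates` is only the placeholder floated above), folding in the review notes about namespaced calls and a faster `vctrs`-based slice:

```r
# Hypothetical helper: keep only the "extreme" candidates for piloting.
# Sorting by every column and taking the first and last rows approximates
# the all-minimum and all-maximum grid points.
choose_grid_candidates <- function(grid) {
  grid <- dplyr::arrange(grid, dplyr::across(dplyr::matches(".")))
  vctrs::vec_slice(grid, c(1L, nrow(grid)))
}
```

Everything is called by namespace, per the note above about PSOCK clusters, so the helper works without attaching dplyr or vctrs.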
```r
  wflow <- add_formula(wflow, preprocessor)
}

tune_grid(
```
I think we'd like this to call `pilot_tune_grid.workflow()`, via `pilot_tune_grid()`?
Ah yes, that was the intent, thanks.
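In sketch form, routing through the S3 generic rather than calling `tune_grid()` directly might look like this (the method body and argument names are illustrative assumptions, not the PR's actual code):

```r
# Hypothetical workflow method: reduce the grid and resamples for the
# pilot run, then delegate the real work to tune_grid().
pilot_tune_grid.workflow <- function(object, resamples, grid, ...) {
  # ... trim `grid` to extreme candidates and `resamples` to one split ...
  tune_grid(object, resamples = resamples, grid = grid, ...)
}
```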
```r
resamples <- new_bare_tibble(resamples)

# Use only the first resampling split
resamples <- resamples |> slice(1)
```
```diff
- resamples <- resamples |> slice(1)
+ resamples <- resamples |> vctrs::vec_slice(1)
```

With an eye for performance.
There are rules for modifying rset objects. Almost always, slicing them will revert the results to basic tibbles. For example, if we have a 10-fold CV rset and get rid of some rows, it isn't the same thing anymore, and we drop its rsample classes (and perhaps some attributes too).

We do this in racing, and this function is there to restructure/reclass the object.
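The class-dropping behavior described here can be seen directly (a small demo, assuming `rsample` is installed; outputs described in comments rather than shown verbatim):

```r
library(rsample)

# A 10-fold CV rset carries rsample-specific classes
folds <- vfold_cv(mtcars, v = 10)
class(folds)      # includes "vfold_cv" and "rset" ahead of the tibble classes

# Removing rows reverts the object to a plain tibble, per rsample's rules
sliced <- folds[1, ]
class(sliced)     # the rsample-specific classes are gone
```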
I believe the classes are handled correctly here via the extraction of `pull_rset_attributes()` and passing a bare tibble directly to `tune_grid_loop()`.
This was my understanding too, that `resamples` gets stripped down to a plain tibble first and so I could slice it as such. But I only understand the architecture here enough to be dangerous, so please do let me know if I should approach this differently!
```r
# Cut grid down to extreme values only for piloting
grid <- grid |>
  arrange(across(matches("."))) |>
  slice(1, nrow(grid))
```
```diff
- slice(1, nrow(grid))
+ vctrs::vec_slice(c(1, nrow(grid)))
```

(Note: `vctrs::vec_slice()` takes a single index vector, so the two row positions need to be combined with `c()`.)
I'm positive about doing this, but we need to consider the use case and goals. Would it be: …
"All of the grid points", as you said, may not be the goal. It wouldn't be hard to figure out which candidates are on the outskirts of the hyper-parameter space and test some of those. Also, should we use the resamples? Would the training set (or some random fraction) be just as good? I was thinking of |
I like the idea that this new function will have a similar interface to functions like …
Yes, I don't think we'd want to try all grid points - just looking for some breaking on the extremes before running the full tuning and workflow map. Right now it's written to try one grid set with all hyperparameters minimized, and one with all maximized. I agree that there's probably a smarter way to select which ones to try, but it may be beyond my skills. :)
It seems really important to me that we at least use the same splitting approach as will be used in the full tune. For example, if the user is using 10 folds with a stratification, that's the type and size of resample we should "pilot" on.
Sure - but there's a different structure to piloting a planned grid search vs piloting a workflowset that doesn't include any tuning, so presumably those would need to be two different named functions regardless.
Given the cautiously optimistic replies, I'll try to steal some time this weekend to make the review changes and write a few unit tests and stuff. Regarding the names, I initially had …
Sorry for the inactivity. We're focused on myriad survival analysis changes right now.
This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
Hi friends,

This is a cute li'l add-on function that I want to propose, based on this fosstodon post.

Basically it takes the setup of a `tune_grid()` process, and runs the fit/validate on only the largest and smallest grid values, and only on the first resampling split. The idea is to quick-test the workflow spec(s) on the grid values where they are most likely to break, in order to spot mistakes and poor choices in the recipe or grid before running a massive or costly tuning process.

It only required some very trivial tweaks to `tune_grid()`, and you might argue that it's not necessary; certainly a savvy user could easily chop down their `grid` and their `resamples` manually. To that I say: (1) if they are relying on the automatic gridding, then it's a few extra fiddly steps to make and then reduce a grid manually; and (2) there's something really elegant to me about running `workflow_map("pilot_tune_grid")` with all of your workflowsets and chosen options, and then, once you've troubleshot all the issues that come up, you only need to change `"pilot_tune_grid"` to `"tune_grid"` to run the whole shebang.

It also might make sense to add some options, like using 2 resample splits instead of 1, or using all extreme combinations of grid parameters instead of just "all minimum values" and "all maximum values". But at some point, yeah, the user should just make the reduced grid themselves.

Anyhoo. If y'all decide you like it, I'm happy to do the first pass on unit tests and maybe making a `pilot_fit_resamples()` (mainly for use with workflowsets; not much value to it on one model).
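For concreteness, the swap-one-string workflow described above might look like this (hypothetical usage, since `pilot_tune_grid` is not merged; `wf_set`, `folds`, and `my_grid` are placeholder objects the reader would supply):

```r
library(workflowsets)

# Pilot run: only the extreme grid values, only the first resample split
pilot_res <- workflow_map(wf_set, "pilot_tune_grid",
                          resamples = folds, grid = my_grid)

# After fixing whatever the pilot surfaces, change one string
# to launch the full tuning run with identical options
full_res <- workflow_map(wf_set, "tune_grid",
                         resamples = folds, grid = my_grid)
```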