-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clarify hyperparameters associated with extracted objects #743
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, here's a short summary of what we tossed around in our conversation just now:
collect metrics()
gives you the metrics based on the predictions (obviously) - and those are based on what was requested (here: different values for the number of neighbors).
collect_extracts()
promises to give you access to what was actually fitted (we should probably make that clearer in its description, too).
We discuss some options that we discarded pretty quickly:
- Putting some cosmetics on: in the output of
collect_extract()
, we could make the content ofneighbors
reflect what was requested to match the content of.config
but that would not really resolve the tension arising from what was requested not being what was fitted. It would just kick the point of realization about that difference a little further down the line by obfuscating it a little more. Bad idea. - (Trying to) Hide the submodel trick: we could lean into "what was requested is the only thing the user sees" and try to extract the relevant submodel, aka the requested model instead of the fitted model. That is a lot of work and/or we litter it with bugs. Bad idea.
- Don't use the submodel trick: that would mean the difference that's causing us to think about all of this goes away. Cost: it'll be slooooow. Bad idea.
One alternative that was more interesting:
We also discussed making collect_extracts()
return only what was fitted, so remove rows with requested configs that we did not need to fit. That leans fully into the idea of collect_extracts()
being about what was actually fitted. But what anchors all the results is what was requested, and we want users to easily be able to connect all outputs to that. Thus, we use that as the thing that dictates the format, even when it here means duplicating some entries.
Conclusion: we decided to go with the current behavior and aim to document it better. I'm approving to signal that but think that it's worth considering wordsmithing the proposed documentation a bit more to reflect some more of the clarity we gained in our discussion. (I would but my brain is currently doing a fun overflow 😝 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice changes! One round of wordsmithing for your consideration :D
Co-authored-by: Hannah Frick <[email protected]>
This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue. |
Closes #724, closes #733.
I went with the approach labeled
1.
in #724 (comment), but am open to other options. :)