Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify hyperparameters associated with extracted objects #743

Merged
merged 6 commits into from
Oct 11, 2023

Conversation

simonpcouch
Copy link
Contributor

Closes #724, closes #733.

I went with the approach labeled 1. in #724 (comment), but am open to other options. :)

Copy link
Member

@hfrick hfrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, here's a short summary of what we tossed around in our conversation just now:

collect metrics() gives you the metrics based on the predictions (obviously) - and those are based on what was requested (here: different values for the number of neighbors).
collect_extracts() promises to give you access to what was actually fitted (we should probably make that clearer in its description, too).

We discuss some options that we discarded pretty quickly:

  • Putting some cosmetics on: in the output of collect_extract(), we could make the content of neighbors reflect what was requested to match the content of .config but that would not really resolve the tension arising from what was requested not being what was fitted. It would just kick the point of realization about that difference a little further down the line by obfuscating it a little more. Bad idea.
  • (Trying to) Hide the submodel trick: we could lean into "what was requested is the only thing the user sees" and try to extract the relevant submodel, aka the requested model instead of the fitted model. That is a lot of work and/or we litter it with bugs. Bad idea.
  • Don't use the submodel trick: that would mean the difference that's causing us to think about all of this goes away. Cost: it'll be slooooow. Bad idea.

One alternative that was more interesting:
We also discussed making collect_extracts() return only what was fitted, so remove rows with requested configs that we did not need to fit. That leans fully into the idea of collect_extracts() being about what was actually fitted. But what anchors all the results is what was requested, and we want users to easily be able to connect all outputs to that. Thus, we use that as the thing that dictates the format, even when it here means duplicating some entries.

Conclusion: we decided to go with the current behavior and aim to document it better. I'm approving to signal that but think that it's worth considering wordsmithing the proposed documentation a bit more to reflect some more of the clarity we gained in our discussion. (I would but my brain is currently doing a fun overflow 😝 )

R/collect.R Outdated Show resolved Hide resolved
R/collect.R Outdated Show resolved Hide resolved
R/collect.R Outdated Show resolved Hide resolved
@simonpcouch simonpcouch requested review from hfrick and removed request for hfrick October 10, 2023 18:04
Copy link
Member

@hfrick hfrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes! One round of wordsmithing for your consideration :D

R/collect.R Outdated Show resolved Hide resolved
@simonpcouch simonpcouch merged commit 4e34edc into main Oct 11, 2023
9 checks passed
@simonpcouch simonpcouch deleted the fix-724-733 branch October 11, 2023 13:59
@github-actions
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Oct 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

audit extracts submodels FIXME hyperparameters associated toextracted objects are incorrect
2 participants