-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[reconfigurator] Executor: only do cleanup for zones that are ready_for_cleanup
#7713
Conversation
) { | ||
// We expect to only be called with expunged zones that are ready | ||
// for cleanup; skip any with a different disposition. | ||
if !config.disposition.is_ready_for_cleanup() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this conditional be a bug if it was invoked? Should we at least be logging something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I could arguments for any of these options:
- Document this function as "panics if called with a zone that isn't ready for cleanup" and assert.
- Document this function as "returns an error for any zone that isn't ready for cleanup".
- Keep this check and log a warning.
- Keep this check and don't log anything.
I prefer option 4, I think? "Call this method with whatever set of zones you want, and I locally choose to act on the ones I know it's safe to act on". Honestly I'd kind of prefer this take a &Blueprint
instead of an iterator of zones to avoid this problem altogether, but that makes the tests more awkward. Maybe this should be split into a pub(crate)
version that takes a &Blueprint
and a private version that takes this iterator; the &Blueprint
one could do the filtering, and tests could call the private one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to take a stab at cleaning this up in a small followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great cleanup. Love to see the progress here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
ready_for_cleanup
ready_for_cleanup
Builds on #7713, and is followup from #7713 (comment). In #7652 I changed all the executor substeps to take iterators instead of `&BTreeMap` references that no longer existed, but that introduced a weird split where the top-level caller had to filter the blueprint down to just the items that the inner functions expected. @smklein pointed out one place where the inner code was being extra defensive, which was just more confusing than anything. This PR removes that split: the top-level executor now always passes a full `&Blueprint` down, and the inner modules are responsible for doing their own filtering as appropriate. To easy testing, I kept the versions that take an iterator of already-filtered items as private `*_impl` functions that the new functions-that-take-a-full-`Blueprint` themselves call too.
Changes the three steps that required the earlier
PUT /omicron-zones
execution step to succeed because they needed to clean up after zones that were definitely shut down to now ready theready_for_cleanup
blueprint disposition (which is set by the planner after it confirms via inventory that the zones are dead).While I was here, I removed both the special case error-to-warning implementation in a couple of steps and the
finalize
step that compiled those. I believe all of that predates the ability foromdb
to interpret the update-engine event report. Now that it can, emitting step warnings directly is fine. I spun this branch up in a4x2 and then shut down one of the sled-agents; that results in this report the executor:Closes #6999, under the assumption that as of this PR we've identified all the inter-step dependencies and broken them.