-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: consolidate VM creation APIs #806
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is the first step in a long road towards two goals: 1. Get live migration targets to initialize from their sources 2. Allow the control plane to configure instances using instance specs to access all the features they present that instance ensure requests do not This particular commit does a couple of different things: - Define the One True Instance Initialization Request type - Change the key type in instance specs from String to a special SpecKey type # One True Init Type There are now three ways for Propolis clients to initialize a VM: - From its config TOML - From an instance spec provided as an argument - Via live migration The first option is meant for local development. The second is meant to be used when starting new VMs. The third is used when migrating a VM. The "migrate" option allows the client to specify new backend configurations for some or all of the backends in the source's instance spec. This allows the control plane to substitute things like Crucible generation numbers or OPTE port names that change when a VMM migrates. For this scheme to work, the control plane needs to be able to name the backends that it's replacing (so that the source and target can line up the backend replacements during the sync phase). This requires a more disciplined approach to instance spec component naming. # SpecKey Today instance specs use strings as their key names. Using UUIDs instead has the nice property that it becomes very easy to correlate control plane objects (e.g. a Disk or a NetworkInterface record) with the Propolis components that instantiate them. It is also cheaper to compare UUIDs directly than to compare their string representations. Unfortunately, it's not possible to change the existing SpecKey type alias from String to Uuid, for (at least) two reasons: 1. The config TOML method of initializing an instance accepts arbitrary strings as device and backend names. Having to generate UUIDs for these would be a hassle for local development. 2. More importantly: Some control plane objects need more than one spec element; for example, a disk has both a device and a backend component, and they can't use the same ID. To get around these problems, this commit introduces the `SpecKey` enum, which has `Uuid` and `Name` variants and uses adapters from the `serde_with` crate to serialize and deserialize it using its `Display` and `FromStr` implementations. This allows the key to be used in the components map while keeping the map `Serialize`. String keys that can convert to UUIDs are stored in that representation; otherwise they're stored as name strings. This allows Nexus to use an ID for one half of a device and then append a disambiguator to the other component. This commit fixes the build for most of the instance spec-related functions in the server, but does not deal with anything in the VM state machine or the migration protocol, where the build is still completely on the floor.
Tidy up a few more easy-to-fix errors (usually use of strings where we want the use of SpecKeys) before tackling the bigger problem of how to rearrange VM initialization.
This is needed because the spec is not known right away during a migration in.
The VCR replacement endpoint takes a disk name (in the request body) and a "disk ID" (in the request path). The former item is used to look up the Crucible element in the instance spec; the latter is used to look it up in the VM's Crucible backend map. Change this around as follows: - The VCR replacement request path takes a SpecKey. The request body no longer takes a name. - Crucible backend components are now identified by their SpecKeys. The UUID obtained from the Crucible backend itself is no longer used for this purpose. - Enable metrics for Crucible disks only if they're identified by a UUID. This is a constraint of the virtual disk provider, which expects disk IDs to be used in its time-series identifiers. In practice this means that Nexus will need to identify Crucible backends using disk IDs and use some other name for the corresponding devices (which is probably what we were going to do anyway, so no big deal). - Change the other Crucible-dependent endpoints (snapshot, volume status) to use similar path parameters.
Refactor the `VmEnsure` type state machine so that it no longer assumes that an instance spec is present at all times (it might be injected later during live migration). The biggest challenge here is that it's no longer possible to run the entire VM initialization process on the VMM runtime, because there might not be an instance spec when initialization starts, and the VMM runtime can't be created until there's a spec that specifies how many vCPUs the VM will have. To get around this, move some things around so that the `VmEnsure` state machine creates the runtime instead of running on the previously-created runtime. Slightly tweak the migration destination code so that it's ready to provide an instance spec to the ensure logic if the sync phase produces one from the preamble (it doesn't, but after the next commit it will!).
Instead of comparing the source and target instance specs for migration compatibility, have the target read the source's spec, substitute in any new backend information the target has, and then attempt to initialize itself from the resulting spec. This deletes 900+ lines of compat checking code that's no longer needed. At this point the server builds, so it should be possible to run tests on it and then regenerate the OpenAPI endpoints, after which we'll see if the new instance ensure API is worth a darn or not.
Clean up the first couple of bits of build fallout from the new API spec: - Crucible client types no longer appear directly in the propolis-client generated types (because DiskRequest is gone and instance specs expect to get serialized VCRs instead). Have PHD depend directly on crucible-client-types for these. - Remove most of the InstanceEnsureRequest validation code from the mock server. Most of these checks are for the validity of slot inputs, and slot conversions aren't part of the API anymore. The mock serial port logic (which is the bit that the Omicron integration tests really make use of) remains unchanged.
There's no longer a mode of operation where the server combines config TOML devices with ensure requests to create a complete VM configuration. Move the TOML-entry-to-spec-component conversion logic into the `propolis-server-config` crate so that it's available to clients who want it, then remove config-TOML device parsing from the server entirely. The server still needs the config TOML contents to get the bootrom path. That won't change, at least for now.
Change PHD's instance ensure paths to use the new VM creation APIs. Remove two tests that are no longer relevant: the "incompatible VMs" migration test is no longer useful now that targets inherit configuration from their sources, and the "ensure API" test is no longer useful now that everyone has to use a single ensure API. One minor annoyance here (of which I've made a note) is that instance GET requests always return their specs in an Option, causing a couple of test cases to sprout new `expect`s. Maybe we can improve on this. Tested with Alpine and Debian 11 guests.
propolis-server no longer has an option to read VM device configuration from a TOML file. This is a useful option for local development. The goal of this change is to lay the groundwork for propolis-cli to read existing config TOMLs and repackage them into instance specs that can be used with the new VM creation API. propolis-server reads config TOMLs using the propolis-server-config crate, which (now) contains logic to convert them to propolis-api-types instance specs. This logic is obnoxious to move to propolis-cli, because the CLI uses propolis-client, whose spec types are Progenitor-generated and don't match the propolis-api-types types. Instead of building dozens of `From` impls or festooning the propolis-client generator with `replace` directives, this commit completely breaks propolis-server's dependency on propolis-server-config. This is possible because the server only reads two pieces of information from the config TOML, both of which are easily replaced: 1. The bootrom path is now specified as a command-line argument to the server (where the config TOML path would have gone in times before). 2. The "migration failure" test device can now be specified in an instance spec (though the server will only accept it if it was built in the correct configuration, i.e. without the omicron-build feature). Once this dependency is removed, propolis-server-config can take a dependency on propolis-client without creating a circular dependency or layering violation. (It would probably also make sense to rename the crate to something like `propolis-config-toml`, but I haven't done that here.) This in turn allows propolis-cli to pick up propolis-server-config and use it to convert config TOMLs into propolis-client instance specs.
This test was assuming that it could write a new instance spec for its target and that the target would pick it up. The right way to do that is with the replacement list, so use that instead.
Now that config TOMLs don't directly configure propolis-server binaries, the old name makes less sense, so switch to a newer, more sensible name.
Up in Omicron, Nexus modules that need to work with volume construction requests import `sled_agent_client::types::VolumeConstructionRequest`. Sled agent's generated client includes the VCR type because the sled agent API includes a `propolis_client::types::DiskRequest`, which (when it existed) contained a `crucible_client_types::DiskRequest`. The advantage of this scheme is that Nexus code that works with VCRs is guaranteed to use the same VCR definition as Propolis does. To maintain it, have `propolis_client` re-export its VCR definition directly (since it no longer appears in the generated types).
The propolis-config-toml library wasn't inserting any SoftNpuPort devices into the specs it generated, which is a regression from the previous version of the config code. I need to figure out why I didn't catch this in Falcon testing! Also, reorganize things a bit to clarify the following: - a `SoftNpuPort` is a data port on a virtual ASIC - the data ports have names that are distinct from their instance spec keys SoftNPU treats port names as part of the overall virtual ASIC configuration, so they should go in the actual port specs; spec keys exist solely to identify components, and changing a component's key shouldn't affect how it operates (provided that any other components that refer to it also have their names updated).
This was referenced Nov 13, 2024
This is split into multiple PRs by the series beginning with #807. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
(N.B. I'm still seeing if I can spin parts of this off into separate PRs, but I'm putting this up as-is for now (a) in case folks want to take a look as-is, and (b) to have commit hashes to refer to in my falcon/omicron clones so I can stop using path references everywhere.)
Consolidate the two separate instance-ensure APIs into a single API using the approach described in #804. There are several related changes in this PR, which are:
InstanceEnsureRequest
described in two instance ensure APIs is one too many #804.SpecKey
enum for their component map keys. This enum has UUID and string variants. This allows the control plane to supply UUID keys when it's possible and fall back to strings when it's not, and it allows Propolis to store/compare UUIDs in binary form when the control plane supplies them.propolis-config-toml
crate and teachpropolis-cli
to read config TOMLs and create specs from them.The main mechanical challenge in this PR is that it's no longer possible to create the VMM tokio runtime before starting the state driver task, because the runtime's thread count depends on the guest's vCPU count, and during a migration that's not known until after the migration protocol has begun to run. This is not intractable but accounts for the most significant changes to
propolis-server/src/lib/vm
.Tests (so far):
propolis-cli
and verified they had the expected components and the expected number of tokio runtime threadscargo test
in Omicron (exercises the stripped-down mock server)Fixes #804. Fixes #776. Fixes #772.