(1/5) [nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented) #7443

smklein · 2025-01-30T21:11:32Z

Pulled out of #7076

Exposes Affinity/Anti-Affinity Groups in the API, but as unimplemented.

hawkw

Overall, this looks good to me --- more or less precisely what I would expect based on what's specified in RFD 522. Nothing controversial here.

hawkw · 2025-01-30T22:45:42Z

common/src/api/external/mod.rs

+    /// If the affinity request cannot be satisfied, allow it anyway.
+    ///
+    /// This enables a "best-effort" attempt to satisfy the affinity policy.
+    Allow,
+
+    /// If the affinity request cannot be satisfied, fail explicitly.
+    Fail,


I notice that RFD 522 calls these policies for affinity groups "Allow" and "Fail", but for anti-affinity groups, we call them "Allow" and "Alert". I wonder if the different naming there is intentional --- my guess is that it isn't --- and whether we should change the RFD to use the same names for both affinity and anti-affinity, since it seems like we're going to use one enum for policies in the API.

Ah, yeah, perhaps I misinterpreted this comment here: #7076 (comment)

My naming choice here was intentional, because we cannot implement this with "Alert" just yet (if you're being generous to me, or "chose not to" if you aren't).

I think it's possible in the future that we tweak the "Policy = Fail" to cause an alert, or possible for us to add "Policy = Alert", which is the same as "Policy = Fail", with the only distinction being triggers in the alerting system? Not sure. I just omitted it because it would be a misnomer until it gets integrated later.

Would you be okay punting on saying "we will implement alerting here"? I'd rather document it explicitly when ti gets integrated, rather than speculate on where alerts might land.

I think it's possible in the future that we tweak the "Policy = Fail" to cause an alert, or possible for us to add "Policy = Alert", which is the same as "Policy = Fail", with the only distinction being triggers in the alerting system? Not sure. I just omitted it because it would be a misnomer until it gets integrated later.

IIUC the intent was for "Alert" to be "allow, but tell me you did that". Personally, I think we ought to stick with the "allow" and "fail" terminology, since we may want to alert in both cases ("we violated an affinity requirement because you said it was okay to" and "we didn't allow an instance to be scheduled because it would have violated an affinity requirement" are both things it seems reasonable to eventually want an alert about). So, my preference would be to go update the RFD to use "Allow" and "Fail" for both affinity and anti-affinity. IMO, the policy should describe whether or not we want an instance to be scheduled if it's impossible to satisfy its affinity constraint, and whether or not we generate an alert is a separate axis.

In case it's unclear I was not trying to suggest that we go back to calling it "alert" when I previously agreed that we shouldn't back in November. Instead, I was just suggesting that we should probably update the RFD to use the same names as this PR for both affinity and anti-affinity. :)

Updated these names on the RFD in https://github.com/oxidecomputer/rfd/pull/812/commits/361e0eeee741e8aa97fb501cc7da6b7a089992db

nexus/external-api/src/lib.rs

david-crespo · 2025-01-30T23:18:01Z

common/src/api/external/mod.rs

+#[serde(tag = "type", content = "value", rename_all = "snake_case")]
+pub enum AffinityGroupMember {
+    Instance(Uuid),
+}


It would be helpful to add a doc comment to the variant saying it's the value is an instance ID. The "format": "uuid" bit in OpenAPI helps, but it's not as good as a description.

Here's the OpenAPI schema from below:

"AffinityGroupMember": { "oneOf": [ { "type": "object", "properties": { "type": { "type": "string", "enum": [ "instance" ] }, "value": { "type": "string", "format": "uuid" } }, "required": [ "type", "value" ] } ] }

and here is the TypeScript code we generate for it

export type AffinityGroupMember = { type: "instance"; value: string };

Compare to

omicron/nexus/types/src/external_api/params.rs

Lines 1047 to 1060 in d062c60

/// Parameters for creating an external IP address for instances.

#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]

#[serde(tag = "type", rename_all = "snake_case")]

pub enum ExternalIpCreate {

/// An IP address providing both inbound and outbound access. The address is

/// automatically-assigned from the provided IP Pool, or the current silo's

/// default pool if not specified.

Ephemeral { pool: Option<NameOrId> },

/// An IP address providing both inbound and outbound access. The address is

/// an existing floating IP object assigned to the current project.

///

/// The floating IP must not be in use by another instance or service.

Floating { floating_ip: NameOrId },

}

which produces

export type ExternalIpCreate = /** An IP address providing both inbound and outbound access. The address is automatically-assigned from the provided IP Pool, or the current silo's default pool if not specified. */ | { pool?: NameOrId; type: "ephemeral" } /** An IP address providing both inbound and outbound access. The address is an existing floating IP object assigned to the current project. The floating IP must not be in use by another instance or service. */ | { floatingIp: NameOrId; type: "floating" };

Though actually, the presence of NameOrId in ExternalIpCreate makes me think there is a probably a way I can get the generator to do

type Uuid = string

and use that for value: Uuid. Doesn't actually change the typesafety since it's just an alias for string, but it does help the doc problem.

It'd be helpful for the Go SDK to have the doc comments as well. I haven't looked at the rust SDK in a while but I'm guessing it'd be useful for it to have the doc comments as well?

Added some doc comments in 6cfca2d

For what it's worth - there are spots where we accept the Instance name or UUID as a parameter, in paths, but we don't do that here. This struct is only used in responses, where I'm using UUIDs exclusively.

That choice is somewhat arbitrary - I could return the name instead - but I figured that the UUID would at least be more stable.

Thanks @smklein! UUID is definitely preferred from a terraform provider point of view. It relies on stability

Ah, yeah. I didn’t realize it’s only in responses, where it will be very obvious that’s it’s an ID.

david-crespo

Looks very good.

andrewjstone · 2025-01-31T17:30:54Z

common/src/api/external/mod.rs

+#[serde(tag = "type", content = "value", rename_all = "snake_case")]
+pub enum AffinityGroupMember {
+    /// An instance belonging to this group, identified by UUID.
+    Instance(Uuid),


Should we use a typed Uuid (InstanceUuid) here instead or do we avoid those in external APIs?

We don't currently have many in the external API - I think I plumbed one through the operator API with support bundles - but I think I should be able to convert this one.

Updated in 4b08032

nexus/external-api/src/lib.rs

smklein added 2 commits January 30, 2025 12:08

[nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented)

c9fb7a6

fmt

8dc0825

smklein mentioned this pull request Jan 30, 2025

[nexus] Affinity and Anti-Affinity Groups #7076

Open

44 tasks

smklein requested review from david-crespo and andrewjstone January 30, 2025 21:24

hawkw approved these changes Jan 30, 2025

View reviewed changes

tags

4e9cebc

david-crespo reviewed Jan 30, 2025

View reviewed changes

david-crespo approved these changes Jan 30, 2025

View reviewed changes

doc comments

6cfca2d

andrewjstone approved these changes Jan 31, 2025

View reviewed changes

smklein added 2 commits January 31, 2025 14:14

Merge branch 'main' into affinity-api

050b4c5

typed UUID

4b08032

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(1/5) [nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented) #7443

(1/5) [nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented) #7443

smklein commented Jan 30, 2025

hawkw left a comment

hawkw Jan 30, 2025

smklein Jan 30, 2025

hawkw Jan 30, 2025 •

edited

Loading

hawkw Jan 30, 2025

smklein Jan 30, 2025

david-crespo Jan 30, 2025

david-crespo Jan 30, 2025 •

edited

Loading

karencfv Jan 30, 2025

smklein Jan 31, 2025

smklein Jan 31, 2025

karencfv Jan 31, 2025

david-crespo Jan 31, 2025

david-crespo left a comment

andrewjstone Jan 31, 2025

smklein Jan 31, 2025

	/// Parameters for creating an external IP address for instances.
	#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
	#[serde(tag = "type", rename_all = "snake_case")]
	pub enum ExternalIpCreate {
	/// An IP address providing both inbound and outbound access. The address is
	/// automatically-assigned from the provided IP Pool, or the current silo's
	/// default pool if not specified.
	Ephemeral { pool: Option<NameOrId> },
	/// An IP address providing both inbound and outbound access. The address is
	/// an existing floating IP object assigned to the current project.
	///
	/// The floating IP must not be in use by another instance or service.
	Floating { floating_ip: NameOrId },
	}

(1/5) [nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented) #7443

Are you sure you want to change the base?

(1/5) [nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented) #7443

Conversation

smklein commented Jan 30, 2025

hawkw left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hawkw Jan 30, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

david-crespo Jan 30, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

david-crespo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hawkw Jan 30, 2025 •

edited

Loading

david-crespo Jan 30, 2025 •

edited

Loading