Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(1/5) [nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented) #7443

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

smklein
Copy link
Collaborator

@smklein smklein commented Jan 30, 2025

Pulled out of #7076

Exposes Affinity/Anti-Affinity Groups in the API, but as unimplemented.

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks good to me --- more or less precisely what I would expect based on what's specified in RFD 522. Nothing controversial here.

Comment on lines +1324 to +1330
/// If the affinity request cannot be satisfied, allow it anyway.
///
/// This enables a "best-effort" attempt to satisfy the affinity policy.
Allow,

/// If the affinity request cannot be satisfied, fail explicitly.
Fail,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that RFD 522 calls these policies for affinity groups "Allow" and "Fail", but for anti-affinity groups, we call them "Allow" and "Alert". I wonder if the different naming there is intentional --- my guess is that it isn't --- and whether we should change the RFD to use the same names for both affinity and anti-affinity, since it seems like we're going to use one enum for policies in the API.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, perhaps I misinterpreted this comment here: #7076 (comment)

My naming choice here was intentional, because we cannot implement this with "Alert" just yet (if you're being generous to me, or "chose not to" if you aren't).

I think it's possible in the future that we tweak the "Policy = Fail" to cause an alert, or possible for us to add "Policy = Alert", which is the same as "Policy = Fail", with the only distinction being triggers in the alerting system? Not sure. I just omitted it because it would be a misnomer until it gets integrated later.

Would you be okay punting on saying "we will implement alerting here"? I'd rather document it explicitly when ti gets integrated, rather than speculate on where alerts might land.

Copy link
Member

@hawkw hawkw Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's possible in the future that we tweak the "Policy = Fail" to cause an alert, or possible for us to add "Policy = Alert", which is the same as "Policy = Fail", with the only distinction being triggers in the alerting system? Not sure. I just omitted it because it would be a misnomer until it gets integrated later.

IIUC the intent was for "Alert" to be "allow, but tell me you did that". Personally, I think we ought to stick with the "allow" and "fail" terminology, since we may want to alert in both cases ("we violated an affinity requirement because you said it was okay to" and "we didn't allow an instance to be scheduled because it would have violated an affinity requirement" are both things it seems reasonable to eventually want an alert about). So, my preference would be to go update the RFD to use "Allow" and "Fail" for both affinity and anti-affinity. IMO, the policy should describe whether or not we want an instance to be scheduled if it's impossible to satisfy its affinity constraint, and whether or not we generate an alert is a separate axis.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case it's unclear I was not trying to suggest that we go back to calling it "alert" when I previously agreed that we shouldn't back in November. Instead, I was just suggesting that we should probably update the RFD to use the same names as this PR for both affinity and anti-affinity. :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nexus/external-api/src/lib.rs Outdated Show resolved Hide resolved
#[serde(tag = "type", content = "value", rename_all = "snake_case")]
pub enum AffinityGroupMember {
Instance(Uuid),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be helpful to add a doc comment to the variant saying it's the value is an instance ID. The "format": "uuid" bit in OpenAPI helps, but it's not as good as a description.

Here's the OpenAPI schema from below:

"AffinityGroupMember": {
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "enum": [
            "instance"
          ]
        },
        "value": {
          "type": "string",
          "format": "uuid"
        }
      },
      "required": [
        "type",
        "value"
      ]
    }
  ]
}

and here is the TypeScript code we generate for it

export type AffinityGroupMember = { type: "instance"; value: string };

Compare to

/// Parameters for creating an external IP address for instances.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ExternalIpCreate {
/// An IP address providing both inbound and outbound access. The address is
/// automatically-assigned from the provided IP Pool, or the current silo's
/// default pool if not specified.
Ephemeral { pool: Option<NameOrId> },
/// An IP address providing both inbound and outbound access. The address is
/// an existing floating IP object assigned to the current project.
///
/// The floating IP must not be in use by another instance or service.
Floating { floating_ip: NameOrId },
}

which produces

export type ExternalIpCreate =
  /** An IP address providing both inbound and outbound access. The address is automatically-assigned from the provided IP Pool, or the current silo's default pool if not specified. */
  | { pool?: NameOrId; type: "ephemeral" }
  /** An IP address providing both inbound and outbound access. The address is an existing floating IP object assigned to the current project.

The floating IP must not be in use by another instance or service. */
  | { floatingIp: NameOrId; type: "floating" };

Copy link
Contributor

@david-crespo david-crespo Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though actually, the presence of NameOrId in ExternalIpCreate makes me think there is a probably a way I can get the generator to do

type Uuid = string

and use that for value: Uuid. Doesn't actually change the typesafety since it's just an alias for string, but it does help the doc problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be helpful for the Go SDK to have the doc comments as well. I haven't looked at the rust SDK in a while but I'm guessing it'd be useful for it to have the doc comments as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some doc comments in 6cfca2d

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth - there are spots where we accept the Instance name or UUID as a parameter, in paths, but we don't do that here. This struct is only used in responses, where I'm using UUIDs exclusively.

That choice is somewhat arbitrary - I could return the name instead - but I figured that the UUID would at least be more stable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smklein! UUID is definitely preferred from a terraform provider point of view. It relies on stability

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah. I didn’t realize it’s only in responses, where it will be very obvious that’s it’s an ID.

Copy link
Contributor

@david-crespo david-crespo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good.

#[serde(tag = "type", content = "value", rename_all = "snake_case")]
pub enum AffinityGroupMember {
/// An instance belonging to this group, identified by UUID.
Instance(Uuid),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use a typed Uuid (InstanceUuid) here instead or do we avoid those in external APIs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently have many in the external API - I think I plumbed one through the operator API with support bundles - but I think I should be able to convert this one.

Updated in 4b08032

nexus/external-api/src/lib.rs Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants