Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(1/5) [nexus] Add Affinity/Anti-Affinity Groups to API (unimplemented) #7443

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 54 additions & 0 deletions common/src/api/external/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -982,6 +982,10 @@ impl JsonSchema for Hostname {
pub enum ResourceType {
AddressLot,
AddressLotBlock,
AffinityGroup,
AffinityGroupMember,
AntiAffinityGroup,
AntiAffinityGroupMember,
AllowList,
BackgroundTask,
BgpConfig,
Expand Down Expand Up @@ -1312,6 +1316,56 @@ pub enum InstanceAutoRestartPolicy {
BestEffort,
}

// AFFINITY GROUPS

#[derive(Clone, Copy, Debug, Deserialize, Serialize, PartialEq, JsonSchema)]
#[serde(rename_all = "snake_case")]
pub enum AffinityPolicy {
/// If the affinity request cannot be satisfied, allow it anyway.
///
/// This enables a "best-effort" attempt to satisfy the affinity policy.
Allow,

/// If the affinity request cannot be satisfied, fail explicitly.
Fail,
Comment on lines +1329 to +1335
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that RFD 522 calls these policies for affinity groups "Allow" and "Fail", but for anti-affinity groups, we call them "Allow" and "Alert". I wonder if the different naming there is intentional --- my guess is that it isn't --- and whether we should change the RFD to use the same names for both affinity and anti-affinity, since it seems like we're going to use one enum for policies in the API.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, perhaps I misinterpreted this comment here: #7076 (comment)

My naming choice here was intentional, because we cannot implement this with "Alert" just yet (if you're being generous to me, or "chose not to" if you aren't).

I think it's possible in the future that we tweak the "Policy = Fail" to cause an alert, or possible for us to add "Policy = Alert", which is the same as "Policy = Fail", with the only distinction being triggers in the alerting system? Not sure. I just omitted it because it would be a misnomer until it gets integrated later.

Would you be okay punting on saying "we will implement alerting here"? I'd rather document it explicitly when ti gets integrated, rather than speculate on where alerts might land.

Copy link
Member

@hawkw hawkw Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's possible in the future that we tweak the "Policy = Fail" to cause an alert, or possible for us to add "Policy = Alert", which is the same as "Policy = Fail", with the only distinction being triggers in the alerting system? Not sure. I just omitted it because it would be a misnomer until it gets integrated later.

IIUC the intent was for "Alert" to be "allow, but tell me you did that". Personally, I think we ought to stick with the "allow" and "fail" terminology, since we may want to alert in both cases ("we violated an affinity requirement because you said it was okay to" and "we didn't allow an instance to be scheduled because it would have violated an affinity requirement" are both things it seems reasonable to eventually want an alert about). So, my preference would be to go update the RFD to use "Allow" and "Fail" for both affinity and anti-affinity. IMO, the policy should describe whether or not we want an instance to be scheduled if it's impossible to satisfy its affinity constraint, and whether or not we generate an alert is a separate axis.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case it's unclear I was not trying to suggest that we go back to calling it "alert" when I previously agreed that we shouldn't back in November. Instead, I was just suggesting that we should probably update the RFD to use the same names as this PR for both affinity and anti-affinity. :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

/// Describes the scope of affinity for the purposes of co-location.
#[derive(Clone, Copy, Debug, Deserialize, Serialize, PartialEq, JsonSchema)]
#[serde(rename_all = "snake_case")]
pub enum FailureDomain {
/// Instances are considered co-located if they are on the same sled
Sled,
}

#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema, PartialEq)]
#[serde(tag = "type", content = "value", rename_all = "snake_case")]
pub enum AffinityGroupMember {
Instance(Uuid),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use a typed Uuid (InstanceUuid) here instead or do we avoid those in external APIs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently have many in the external API - I think I plumbed one through the operator API with support bundles - but I think I should be able to convert this one.

Updated in 4b08032

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be helpful to add a doc comment to the variant saying it's the value is an instance ID. The "format": "uuid" bit in OpenAPI helps, but it's not as good as a description.

Here's the OpenAPI schema from below:

"AffinityGroupMember": {
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "enum": [
            "instance"
          ]
        },
        "value": {
          "type": "string",
          "format": "uuid"
        }
      },
      "required": [
        "type",
        "value"
      ]
    }
  ]
}

and here is the TypeScript code we generate for it

export type AffinityGroupMember = { type: "instance"; value: string };

Compare to

/// Parameters for creating an external IP address for instances.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ExternalIpCreate {
/// An IP address providing both inbound and outbound access. The address is
/// automatically-assigned from the provided IP Pool, or the current silo's
/// default pool if not specified.
Ephemeral { pool: Option<NameOrId> },
/// An IP address providing both inbound and outbound access. The address is
/// an existing floating IP object assigned to the current project.
///
/// The floating IP must not be in use by another instance or service.
Floating { floating_ip: NameOrId },
}

which produces

export type ExternalIpCreate =
  /** An IP address providing both inbound and outbound access. The address is automatically-assigned from the provided IP Pool, or the current silo's default pool if not specified. */
  | { pool?: NameOrId; type: "ephemeral" }
  /** An IP address providing both inbound and outbound access. The address is an existing floating IP object assigned to the current project.

The floating IP must not be in use by another instance or service. */
  | { floatingIp: NameOrId; type: "floating" };

Copy link
Contributor

@david-crespo david-crespo Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though actually, the presence of NameOrId in ExternalIpCreate makes me think there is a probably a way I can get the generator to do

type Uuid = string

and use that for value: Uuid. Doesn't actually change the typesafety since it's just an alias for string, but it does help the doc problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be helpful for the Go SDK to have the doc comments as well. I haven't looked at the rust SDK in a while but I'm guessing it'd be useful for it to have the doc comments as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some doc comments in 6cfca2d

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth - there are spots where we accept the Instance name or UUID as a parameter, in paths, but we don't do that here. This struct is only used in responses, where I'm using UUIDs exclusively.

That choice is somewhat arbitrary - I could return the name instead - but I figured that the UUID would at least be more stable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smklein! UUID is definitely preferred from a terraform provider point of view. It relies on stability

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah. I didn’t realize it’s only in responses, where it will be very obvious that’s it’s an ID.


impl SimpleIdentity for AffinityGroupMember {
fn id(&self) -> Uuid {
match self {
AffinityGroupMember::Instance(id) => *id,
}
}
}

#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema, PartialEq)]
#[serde(tag = "type", content = "value", rename_all = "snake_case")]
pub enum AntiAffinityGroupMember {
Instance(Uuid),
}

impl SimpleIdentity for AntiAffinityGroupMember {
fn id(&self) -> Uuid {
match self {
AntiAffinityGroupMember::Instance(id) => *id,
}
}
}

// DISKS

/// View of a Disk
Expand Down
21 changes: 21 additions & 0 deletions nexus/external-api/output/nexus_tags.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,24 @@
API operations found with tag "affinity"
OPERATION ID METHOD URL PATH
affinity_group_create POST /v1/affinity-groups
affinity_group_delete DELETE /v1/affinity-groups/{affinity_group}
affinity_group_list GET /v1/affinity-groups
affinity_group_member_instance_add POST /v1/affinity-groups/{affinity_group}/members/instance/{instance}
affinity_group_member_instance_delete DELETE /v1/affinity-groups/{affinity_group}/members/instance/{instance}
affinity_group_member_instance_view GET /v1/affinity-groups/{affinity_group}/members/instance/{instance}
affinity_group_member_list GET /v1/affinity-groups/{affinity_group}/members
affinity_group_update PUT /v1/affinity-groups/{affinity_group}
affinity_group_view GET /v1/affinity-groups/{affinity_group}
anti_affinity_group_create POST /v1/anti-affinity-groups
anti_affinity_group_delete DELETE /v1/anti-affinity-groups/{anti_affinity_group}
anti_affinity_group_list GET /v1/anti-affinity-groups
anti_affinity_group_member_instance_add POST /v1/anti-affinity-groups/{anti_affinity_group}/members/instance/{instance}
anti_affinity_group_member_instance_delete DELETE /v1/anti-affinity-groups/{anti_affinity_group}/members/instance/{instance}
anti_affinity_group_member_instance_view GET /v1/anti-affinity-groups/{anti_affinity_group}/members/instance/{instance}
anti_affinity_group_member_list GET /v1/anti-affinity-groups/{anti_affinity_group}/members
anti_affinity_group_update PUT /v1/anti-affinity-groups/{anti_affinity_group}
anti_affinity_group_view GET /v1/anti-affinity-groups/{anti_affinity_group}

API operations found with tag "disks"
OPERATION ID METHOD URL PATH
disk_bulk_write_import POST /v1/disks/{disk}/bulk-write
Expand Down
225 changes: 225 additions & 0 deletions nexus/external-api/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,13 @@ const PUT_UPDATE_REPOSITORY_MAX_BYTES: usize = 4 * GIB;
allow_other_tags = false,
policy = EndpointTagPolicy::ExactlyOne,
tags = {
"affinity" = {
description = "Affinity and anti-affinity groups give control over instance placement.",
external_docs = {
url = "http://docs.oxide.computer/api/affinity"
}

},
"disks" = {
description = "Virtual disks are used to store instance-local data which includes the operating system.",
external_docs = {
Expand Down Expand Up @@ -1257,6 +1264,224 @@ pub trait NexusExternalApi {
disk_to_detach: TypedBody<params::DiskPath>,
) -> Result<HttpResponseAccepted<Disk>, HttpError>;

// Affinity Groups

/// List affinity groups
#[endpoint {
method = GET,
path = "/v1/affinity-groups",
tags = ["affinity"],
}]
async fn affinity_group_list(
rqctx: RequestContext<Self::Context>,
query_params: Query<PaginatedByNameOrId<params::ProjectSelector>>,
) -> Result<HttpResponseOk<ResultsPage<views::AffinityGroup>>, HttpError>;

/// Fetch an affinity group
#[endpoint {
method = GET,
path = "/v1/affinity-groups/{affinity_group}",
tags = ["affinity"],
}]
async fn affinity_group_view(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AffinityGroupPath>,
) -> Result<HttpResponseOk<views::AffinityGroup>, HttpError>;

/// List members of an affinity group
#[endpoint {
method = GET,
path = "/v1/affinity-groups/{affinity_group}/members",
tags = ["affinity"],
}]
async fn affinity_group_member_list(
rqctx: RequestContext<Self::Context>,
query_params: Query<PaginatedById<params::OptionalProjectSelector>>,
path_params: Path<params::AffinityGroupPath>,
) -> Result<HttpResponseOk<ResultsPage<AffinityGroupMember>>, HttpError>;

/// Fetch an affinity group member
#[endpoint {
method = GET,
path = "/v1/affinity-groups/{affinity_group}/members/instance/{instance}",
tags = ["affinity"],
}]
async fn affinity_group_member_instance_view(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AffinityInstanceGroupMemberPath>,
) -> Result<HttpResponseOk<AffinityGroupMember>, HttpError>;

/// Add a member to an affinity group
#[endpoint {
method = POST,
andrewjstone marked this conversation as resolved.
Show resolved Hide resolved
path = "/v1/affinity-groups/{affinity_group}/members/instance/{instance}",
tags = ["affinity"],
}]
async fn affinity_group_member_instance_add(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AffinityInstanceGroupMemberPath>,
) -> Result<HttpResponseCreated<AffinityGroupMember>, HttpError>;

/// Remove a member from an affinity group
#[endpoint {
method = DELETE,
path = "/v1/affinity-groups/{affinity_group}/members/instance/{instance}",
tags = ["affinity"],
}]
async fn affinity_group_member_instance_delete(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AffinityInstanceGroupMemberPath>,
) -> Result<HttpResponseDeleted, HttpError>;

/// Create an affinity group
#[endpoint {
method = POST,
path = "/v1/affinity-groups",
tags = ["affinity"],
}]
async fn affinity_group_create(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::ProjectSelector>,
new_affinity_group_params: TypedBody<params::AffinityGroupCreate>,
) -> Result<HttpResponseCreated<views::AffinityGroup>, HttpError>;

/// Update an affinity group
#[endpoint {
method = PUT,
path = "/v1/affinity-groups/{affinity_group}",
tags = ["affinity"],
}]
async fn affinity_group_update(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AffinityGroupPath>,
updated_group: TypedBody<params::AffinityGroupUpdate>,
) -> Result<HttpResponseOk<views::AffinityGroup>, HttpError>;

/// Delete an affinity group
#[endpoint {
method = DELETE,
path = "/v1/affinity-groups/{affinity_group}",
tags = ["affinity"],
}]
async fn affinity_group_delete(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AffinityGroupPath>,
) -> Result<HttpResponseDeleted, HttpError>;

/// List anti-affinity groups
#[endpoint {
method = GET,
path = "/v1/anti-affinity-groups",
tags = ["affinity"],
}]
async fn anti_affinity_group_list(
rqctx: RequestContext<Self::Context>,
query_params: Query<PaginatedByNameOrId<params::ProjectSelector>>,
) -> Result<HttpResponseOk<ResultsPage<views::AntiAffinityGroup>>, HttpError>;

/// Fetch an anti-affinity group
#[endpoint {
method = GET,
path = "/v1/anti-affinity-groups/{anti_affinity_group}",
tags = ["affinity"],
}]
async fn anti_affinity_group_view(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AntiAffinityGroupPath>,
) -> Result<HttpResponseOk<views::AntiAffinityGroup>, HttpError>;

/// List members of an anti-affinity group
#[endpoint {
method = GET,
path = "/v1/anti-affinity-groups/{anti_affinity_group}/members",
tags = ["affinity"],
}]
async fn anti_affinity_group_member_list(
rqctx: RequestContext<Self::Context>,
query_params: Query<PaginatedById<params::OptionalProjectSelector>>,
path_params: Path<params::AntiAffinityGroupPath>,
) -> Result<HttpResponseOk<ResultsPage<AntiAffinityGroupMember>>, HttpError>;

/// Fetch an anti-affinity group member
#[endpoint {
method = GET,
path = "/v1/anti-affinity-groups/{anti_affinity_group}/members/instance/{instance}",
tags = ["affinity"],
}]
async fn anti_affinity_group_member_instance_view(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AntiAffinityInstanceGroupMemberPath>,
) -> Result<HttpResponseOk<AntiAffinityGroupMember>, HttpError>;

/// Add a member to an anti-affinity group
#[endpoint {
method = POST,
path = "/v1/anti-affinity-groups/{anti_affinity_group}/members/instance/{instance}",
tags = ["affinity"],
}]
async fn anti_affinity_group_member_instance_add(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AntiAffinityInstanceGroupMemberPath>,
) -> Result<HttpResponseCreated<AntiAffinityGroupMember>, HttpError>;

/// Remove a member from an anti-affinity group
#[endpoint {
method = DELETE,
path = "/v1/anti-affinity-groups/{anti_affinity_group}/members/instance/{instance}",
tags = ["affinity"],
}]
async fn anti_affinity_group_member_instance_delete(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AntiAffinityInstanceGroupMemberPath>,
) -> Result<HttpResponseDeleted, HttpError>;

/// Create an anti-affinity group
#[endpoint {
method = POST,
path = "/v1/anti-affinity-groups",
tags = ["affinity"],
}]
async fn anti_affinity_group_create(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::ProjectSelector>,
new_affinity_group_params: TypedBody<params::AntiAffinityGroupCreate>,
) -> Result<HttpResponseCreated<views::AntiAffinityGroup>, HttpError>;

/// Update an anti-affinity group
#[endpoint {
method = PUT,
path = "/v1/anti-affinity-groups/{anti_affinity_group}",
tags = ["affinity"],
}]
async fn anti_affinity_group_update(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AntiAffinityGroupPath>,
updated_group: TypedBody<params::AntiAffinityGroupUpdate>,
) -> Result<HttpResponseOk<views::AntiAffinityGroup>, HttpError>;

/// Delete an anti-affinity group
#[endpoint {
method = DELETE,
path = "/v1/anti-affinity-groups/{anti_affinity_group}",
tags = ["affinity"],
}]
async fn anti_affinity_group_delete(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::OptionalProjectSelector>,
path_params: Path<params::AntiAffinityGroupPath>,
) -> Result<HttpResponseDeleted, HttpError>;

// Certificates

/// List certificates for external endpoints
Expand Down
Loading
Loading