Skip to content

Commit

Permalink
[nexus] Don't fail instances in create saga unwind (#7437)
Browse files Browse the repository at this point in the history
When the `instance_create` saga's `sic_create_instance_record` action
unwinds, it executes the compensating action
[`sic_delete_instance_record`][1].

This action moves the instance's state to `Failed` prior to actually
calling into `project_delete_instance` to delete it, [here][2]. This
is because we presently only allow instances to be deleted when they
are in the `Stopped` or `Failed` states, as noted [here][3].

Because we must first transition the instance to `Failed` in order to
delete it, there is an intermediate state when the instance
record created by an unwinding saga exists but is in the `Failed`
state. Instances in the `Failed` state are eligible to be restarted by
the `instance_reincarnation` background task. That task queries for all
`Failed`` instances and creates `instance_start` sagas to attempt to
restart them.

Therefore, if the `instance_reincarnation` background task runs during
the window of time between when an unwinding `instance_create` saga
marks the instance record as `Failed` and when it actually attempts to
call `project_delete_instance`, the instance can transition to
`Starting` by the new instance-start saga. This results in the attempt
to delete it failing, causing the unwinding saga to get stuck. This is
not great --- it causes a test flake (see #7326), but it's actually a
real bug, as it can result in a saga unable to unwind.

This commit fixes this by moving most of `project_delete_instance` to a
new function, `project_delete_instance_in_state`, which accepts a list
of states in which the instance may be deleted as an argument.
`project_delete_instance` now calls that function with the "normal" list
of states, but unwinding instance-create sagas are additionally able to
allow the instance record to be deleted while it's `Creating` in a
single atomic database operation, avoiding the transient `Failed` state.
This fixes #7326.

Unfortunately, it's quite challenging to have a regression test for
this, because it would require interrupting the unwinding saga's
`sic_delete_instance_record` mid-activation, which we don't really have
a nice mechanism for.

Also, there was a comment in the `sic_delete_instance_record` action
that stated we should be looking up the instance record to delete by
ID rather than by project and name. The comment references issue #1536.
While I was here changing this action, I figured I'd go ahead and change
this as well. My assumption is that the previous thing predates the
`LookupPath::instance_id` method?

[1]: https://github.com/oxidecomputer/omicron/blob/ec4b5dc3c0c45b667e57a52389d82382b0b59112/nexus/src/app/sagas/instance_create.rs#L970
[2]: https://github.com/oxidecomputer/omicron/blob/ec4b5dc3c0c45b667e57a52389d82382b0b59112/nexus/src/app/sagas/instance_create.rs#L1010-L1033
[3]: https://github.com/oxidecomputer/omicron/blob/ec4b5dc3c0c45b667e57a52389d82382b0b59112/nexus/src/app/sagas/instance_create.rs#L987-L988
  • Loading branch information
hawkw authored Jan 30, 2025
1 parent e5bf48f commit d062c60
Show file tree
Hide file tree
Showing 2 changed files with 67 additions and 49 deletions.
36 changes: 31 additions & 5 deletions nexus/db-queries/src/db/datastore/instance.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1309,18 +1309,44 @@ impl DataStore {
opctx: &OpContext,
authz_instance: &authz::Instance,
) -> DeleteResult {
opctx.authorize(authz::Action::Delete, authz_instance).await?;

// This is subject to change, but for now we're going to say that an
// instance must be "stopped" or "failed" in order to delete it. The
// delete operation sets "time_deleted" (just like with other objects)
// and also sets the state to "destroyed".
self.project_delete_instance_in_states(
&opctx,
authz_instance,
&[InstanceState::NoVmm, InstanceState::Failed],
)
.await
}

/// Delete the provided `authz_instance`, as long as it is in one of the
/// provided set of [`InstanceState`]s.
///
/// A.K.A. "[`project_delete_instance`], hard mode". Typically, instances
/// may only be deleted if they are in the [`InstanceState::NoVmm`] or
/// `[`InstanceState::Failed`] states. This method exists separately in
/// order to allow an unwinding `instance-create` saga to delete the
/// instance record without transiently moving it to
/// [`InstanceState::Failed`], in which it becomes eligible for automatic
/// restart.
///
/// In general, callers should use [`project_delete_instance`] instead,
/// unless you really know what you're doing.
///
/// [`project_delete_instance`]: Self::project_delete_instance
pub async fn project_delete_instance_in_states(
&self,
opctx: &OpContext,
authz_instance: &authz::Instance,
ok_to_delete_instance_states: &'static [InstanceState],
) -> DeleteResult {
use db::schema::{disk, instance};

let stopped = InstanceState::NoVmm;
let failed = InstanceState::Failed;
opctx.authorize(authz::Action::Delete, authz_instance).await?;

let destroyed = InstanceState::Destroyed;
let ok_to_delete_instance_states = vec![stopped, failed];

let detached_label = api::external::DiskState::Detached.label();
let ok_to_detach_disk_states =
Expand Down
80 changes: 36 additions & 44 deletions nexus/src/app/sagas/instance_create.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@ use crate::app::{
};
use crate::external_api::params;
use nexus_db_model::{ExternalIp, NetworkInterfaceKind};
use nexus_db_queries::db::identity::Resource;
use nexus_db_queries::db::lookup::LookupPath;
use nexus_db_queries::db::queries::network_interface::InsertError as InsertNicError;
use nexus_db_queries::{authn, authz, db};
Expand Down Expand Up @@ -977,60 +976,53 @@ async fn sic_delete_instance_record(
&sagactx,
&params.serialized_authn,
);

let instance_id = sagactx.lookup::<InstanceUuid>("instance_id")?;
let instance_name = sagactx
.lookup::<db::model::Instance>("instance_record")?
.name()
.clone()
.into();

// We currently only support deleting an instance if it is stopped or
// failed, so update the state accordingly to allow deletion.
// TODO-correctness TODO-security It's not correct to re-resolve the
// instance name now. See oxidecomputer/omicron#1536.

let result = LookupPath::new(&opctx, &datastore)
.project_id(params.project_id)
.instance_name(&instance_name)
.instance_id(instance_id.into_untyped_uuid())
.fetch()
.await;

// Although, as mentioned in the comment above, we should not be doing the
// lookup by name here, we do want this operation to be idempotent.
//
// As such, if the instance has already been deleted, we should return with
// a no-op.
let (authz_instance, db_instance) = match result {
Ok((.., authz_instance, db_instance)) => (authz_instance, db_instance),
// This action must be idempotent, so that an unwinding saga doesn't get
// stuck if it executes twice. Therefore, if the instance has already been
// deleted, we should return with a no-op, rather than an error.
let authz_instance = match result {
Ok((.., authz_instance, _)) => authz_instance,
Err(err) => match err {
Error::ObjectNotFound { .. } => return Ok(()),
_ => return Err(err.into()),
},
};

let runtime_state = db::model::InstanceRuntimeState {
nexus_state: db::model::InstanceState::Failed,
// Must update the generation, or the database query will fail.
//
// The runtime state of the instance record is only changed as a result
// of the successful completion of the saga, or in this action during
// saga unwinding. So we're guaranteed that the cached generation in the
// saga log is the most recent in the database.
gen: db::model::Generation::from(db_instance.runtime_state.gen.next()),
..db_instance.runtime_state
};

let updated =
datastore.instance_update_runtime(&instance_id, &runtime_state).await?;

if !updated {
warn!(
osagactx.log(),
"failed to update instance runtime state from creating to failed",
);
}

// Actually delete the record.
datastore.project_delete_instance(&opctx, &authz_instance).await?;
//
// Note that we override the list of states in which deleting the instance
// is allowed. Typically, instances may only be deleted when they are
// `NoVmm` or `Failed`. Here, however, the instance is in the `Creating`
// state, as we are in the process of creating it.
//
// We would like to avoid transiently changing it to `NoVmm` or `Failed`,
// as this would create a temporary period of time during which an instance
// whose instance-create saga failed could be restarted. In particular,
// marking the instance as `Failed` makes it eligible for auto-restart, so
// if the `instance_reincarnation` background task queries for reincarnable
// instances during that period, it could attempt to restart the instance,
// preventing this saga from unwinding successfully.
//
// Therefore, we override the set of states in which the instance it is
// deleted, which is okay to do because we will only attempt to delete a
// `Creating` instance when unwinding an instance-create saga.
datastore
.project_delete_instance_in_states(
&opctx,
&authz_instance,
&[
db::model::InstanceState::NoVmm,
db::model::InstanceState::Failed,
db::model::InstanceState::Creating,
],
)
.await?;

Ok(())
}
Expand Down

0 comments on commit d062c60

Please sign in to comment.