[8/5] Reduce VMM reservation contention #7533

Open
wants to merge 12 commits into base: vmm-reserve-bench

smklein
Collaborator

@smklein smklein commented Feb 12, 2025

#7498 was introduced to benchmark the cost of concurrent instance provisioning, and it demonstrated that, under contention, performance can degrade significantly on the VMM reservation pathway.

This PR optimizes that pathway by removing the VMM reservation transaction and replacing it with a sequence of non-transactional queries:

  1. First, we query to see if the VMM reservation has already succeeded (for idempotency)
  2. Next, we query for all viable sled targets and affinity information (sled_find_targets_query)
  3. After parsing that data and picking a sled, we call sled_insert_resource_query to INSERT a desired VMM record, and to re-validate our constraints.

This change significantly improves performance in the vmm-reservation benchmark, while upholding the necessary constraints implicit to VMM provisioning.
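
As a rough illustration of the three steps above, here is a minimal in-memory sketch. It is not the code in this PR: `Db`, `Sled`, `Reservation`, `find_targets`, `insert_if_valid`, and `reserve` are hypothetical stand-ins, the only constraint modeled is reservoir RAM, and retrying the next candidate after a failed insert is an assumption made for the sketch.

```rust
/// Hypothetical, simplified stand-in for a sled row.
#[derive(Clone, Debug)]
struct Sled {
    id: u32,
    reservoir_size: u64,
}

/// Hypothetical, simplified stand-in for a VMM reservation row.
#[derive(Clone, Debug, PartialEq)]
struct Reservation {
    vmm_id: u32,
    sled_id: u32,
    reservoir_ram: u64,
}

/// In-memory stand-in for the database (assumption for illustration only).
struct Db {
    sleds: Vec<Sled>,
    reservations: Vec<Reservation>,
}

impl Db {
    /// Sum of reservoir RAM already reserved on a sled.
    fn used_ram(&self, sled_id: u32) -> u64 {
        self.reservations
            .iter()
            .filter(|r| r.sled_id == sled_id)
            .map(|r| r.reservoir_ram)
            .sum()
    }

    /// Step 1: has this VMM already been given a reservation?
    fn existing(&self, vmm_id: u32) -> Option<Reservation> {
        self.reservations.iter().find(|r| r.vmm_id == vmm_id).cloned()
    }

    /// Step 2: list sleds that currently have room
    /// (plays the role of sled_find_targets_query).
    fn find_targets(&self, ram: u64) -> Vec<u32> {
        self.sleds
            .iter()
            .filter(|s| self.used_ram(s.id) + ram <= s.reservoir_size)
            .map(|s| s.id)
            .collect()
    }

    /// Step 3: insert only if the constraint still holds at insert time
    /// (plays the role of sled_insert_resource_query).
    fn insert_if_valid(&mut self, r: Reservation) -> bool {
        let fits = self.sleds.iter().any(|s| {
            s.id == r.sled_id
                && self.used_ram(s.id) + r.reservoir_ram <= s.reservoir_size
        });
        if fits {
            self.reservations.push(r);
        }
        fits
    }
}

/// Runs the three steps without wrapping them in a single transaction.
fn reserve(db: &mut Db, vmm_id: u32, ram: u64) -> Option<Reservation> {
    if let Some(r) = db.existing(vmm_id) {
        return Some(r); // idempotent: return the earlier reservation
    }
    for sled_id in db.find_targets(ram) {
        let r = Reservation { vmm_id, sled_id, reservoir_ram: ram };
        // Assumption: if re-validation fails, move on to the next candidate.
        if db.insert_if_valid(r.clone()) {
            return Some(r);
        }
    }
    None
}
```

In this sketch, calling `reserve` twice with the same `vmm_id` returns the same reservation and inserts only one record. A request that no longer fits on any sled returns `None`.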

@hawkw hawkw self-requested a review February 12, 2025 21:18
COALESCE(SUM(CAST(sled_resource_vmm.reservoir_ram AS INT8)), 0) + "
).param().sql(" <= sled.reservoir_size
),
our_aa_groups AS (
Collaborator Author

So, here's a small detail that might be worth fixing...

This query is used alongside sled_find_targets_query, so we do:

  1. sled_find_targets_query
  2. Pick a candidate sled (in Rust)
  3. sled_insert_resource_query, to insert the sled reservation if it's still valid

So, for the "really bad cases" (e.g., no space, affinity group changes with policy = fail), we prevent the reservation if some concurrent action has changed the state of the world from underneath us.

HOWEVER, it is technically possible that we pick an unfavorable sled due to a concurrent operation.

For example:

  1. We get a set of sleds for an instance in an anti-affinity group with policy = allow
  2. We pick a sled, S, for our VMM (maybe no other instances are using S)
  3. Someone else concurrently provisions another VMM to S, and that VMM's instance belongs to our anti-affinity group.
  4. We call sled_insert_resource_query, thinking that this is a reasonable choice for a sled target. And with this query as-written, S is still technically an allowed choice (as long as our VMM still fits). However, because of the concurrent action in step (3), it's not a "good" choice - there's a member of our anti-affinity group co-located with us. Ideally, we would try a different sled.

Today, this means we can pick a less-than-favorable sled target. We do still prevent co-locating anti-affinity group members with policy = fail, and we still prevent separating affinity group members with policy = fail. It's the more permissive policy = allow case that could use some cleanup.
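
To make the gap concrete, here is a small sketch of an insert-time anti-affinity check that distinguishes three outcomes. It is hypothetical and not part of the query in this PR: `Policy`, `InsertCheck`, and `check_anti_affinity` are illustrative names, and group membership is reduced to a list of group IDs present on the sled.

```rust
/// Anti-affinity group policy, as described in the comment above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Policy {
    Allow,
    Fail,
}

/// Hypothetical result of re-validating a sled choice at insert time.
#[derive(Debug, PartialEq)]
enum InsertCheck {
    /// No member of our anti-affinity group is on the sled.
    Good,
    /// A group member is co-located, but policy = allow permits it.
    AllowedButUnfavorable,
    /// A group member is co-located and policy = fail forbids it.
    Rejected,
}

/// `groups_on_sled` holds the anti-affinity group IDs of instances that are
/// on the sled at insert time, including any added concurrently since the
/// candidate list was read.
fn check_anti_affinity(
    our_group: u32,
    policy: Policy,
    groups_on_sled: &[u32],
) -> InsertCheck {
    let colocated = groups_on_sled.contains(&our_group);
    match (colocated, policy) {
        (false, _) => InsertCheck::Good,
        (true, Policy::Allow) => InsertCheck::AllowedButUnfavorable,
        (true, Policy::Fail) => InsertCheck::Rejected,
    }
}
```

As described above, the query as written treats the middle case as a success. If the insert surfaced it separately, the caller could choose to try a different sled before accepting the co-location.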

@smklein
Collaborator Author

smklein commented Feb 14, 2025

Before:

image

After:

Untitled drawing (2)

(Please note the difference in scale on the X-axis; it is significant.)

@smklein smklein marked this pull request as ready for review February 14, 2025 22:54
@smklein smklein requested a review from gjcolombo February 14, 2025 22:56
@smklein
Collaborator Author

smklein commented Feb 14, 2025

Similarly, with affinity + anti-affinity groups:

Before:

image

After:

image

(The results align with those of the group-less benchmarks)
