Nexus overprovisions disks with control plane zones on them #7225
Tagging this for R13 because I don't think it's a blocker (not a regression, right?) but it's important to keep on our radar. (Maybe it's better to not have a milestone and discuss at the product roundtable? CC @askfongjojo @morlandi7.)
Some ideas discussed so far:
I haven't spent much time in this area of code and I apologize if I've got stuff wrong! Options 1-3 are all pretty related and they all assume:
So far my vote would be to go with (2), using a hardcoded limit on the disk space used by the control plane. Then we'd enforce that at allocation time with a database constraint and at runtime with quotas/reservations (#7227).
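As a rough illustration of what the allocation-time half of option (2) could look like, here is a minimal sketch. The constant value and all names (`CONTROL_PLANE_RESERVATION`, `region_fits`) are hypothetical, not the actual Nexus code or its real budget:

```rust
// Sketch of option (2): set aside a fixed, hardcoded control plane budget
// on every zpool, and reject Crucible region allocations that would eat
// into it. The 250 GiB figure is an illustrative placeholder.

const GIB: u64 = 1 << 30;

/// Hypothetical hardcoded space reserved for control plane zones per pool.
const CONTROL_PLANE_RESERVATION: u64 = 250 * GIB;

/// Returns true if a new region of `region_size` bytes fits on a pool of
/// `pool_total` bytes that already has `size_used` bytes of regions on it,
/// without dipping into the control plane reservation.
fn region_fits(pool_total: u64, size_used: u64, region_size: u64) -> bool {
    // Saturating arithmetic so a reservation larger than the pool simply
    // means no capacity, rather than an underflow panic.
    let usable = pool_total.saturating_sub(CONTROL_PLANE_RESERVATION);
    size_used.saturating_add(region_size) <= usable
}
```

The same invariant (`size_used + region_size <= pool_total - reservation`) is what a database CHECK constraint would express at allocation time, with ZFS quotas/reservations (#7227) backstopping it at runtime.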
Maybe related? #6110.
For what it's worth, option 4 may not be that complicated. The following region allocation CTE change should do it:

```diff
diff --git a/nexus/db-queries/src/db/queries/region_allocation.rs b/nexus/db-queries/src/db/queries/region_allocation.rs
index a9130d87f..239014303 100644
--- a/nexus/db-queries/src/db/queries/region_allocation.rs
+++ b/nexus/db-queries/src/db/queries/region_allocation.rs
@@ -218,7 +218,8 @@ pub fn allocation_query(
   candidate_datasets AS (
     SELECT DISTINCT ON (dataset.pool_id)
       dataset.id,
-      dataset.pool_id
+      dataset.pool_id,
+      dataset.size_used
     FROM (dataset INNER JOIN candidate_zpools ON (dataset.pool_id = candidate_zpools.pool_id))
     WHERE (
       ((dataset.time_deleted IS NULL) AND
@@ -235,7 +236,8 @@ pub fn allocation_query(
   shuffled_candidate_datasets AS (
     SELECT
       candidate_datasets.id,
-      candidate_datasets.pool_id
+      candidate_datasets.pool_id,
+      candidate_datasets.size_used
     FROM candidate_datasets
     ORDER BY md5((CAST(candidate_datasets.id as BYTEA) || ").param().sql(")) LIMIT ").param().sql("
   ),")
@@ -257,7 +259,8 @@ pub fn allocation_query(
       NULL AS port,
       ").param().sql(" AS read_only,
       FALSE as deleting
-    FROM shuffled_candidate_datasets")
+    FROM shuffled_candidate_datasets
+    ORDER BY shuffled_candidate_datasets.size_used ASC")
 // Only select the *additional* number of candidate regions for the required
 // redundancy level
 .sql("
```

Though there's some interaction here with the supplied
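To make the intent of the `ORDER BY shuffled_candidate_datasets.size_used ASC` change concrete, here is a small sketch modeling its effect outside the database: among the shuffled candidates, the least-used datasets are preferred. The struct and function names are hypothetical, used only for illustration:

```rust
// Sketch of option (4)'s effect: prefer the emptiest datasets when
// choosing where to place new Crucible regions, mirroring
// `ORDER BY shuffled_candidate_datasets.size_used ASC`.

#[derive(Debug, Clone)]
struct CandidateDataset {
    id: &'static str,
    /// Bytes of region data already placed on this dataset's pool.
    size_used: u64,
}

/// Pick `redundancy` datasets, least-used first. The real query shuffles
/// candidates first (for tie-breaking spread) and then orders by usage;
/// this sketch only models the final ordering step.
fn pick_least_used(
    mut candidates: Vec<CandidateDataset>,
    redundancy: usize,
) -> Vec<CandidateDataset> {
    candidates.sort_by_key(|d| d.size_used);
    candidates.truncate(redundancy);
    candidates
}
```

The tradeoff this doesn't capture is the interaction with the shuffle and the supplied limit noted above: ordering strictly by usage reduces the randomness that spreads load across pools of similar fullness.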
This appears to be the root cause of #7221: Nexus provisioned so many Crucible regions on one disk that the CockroachDB zone, whose root filesystem and database filesystem were both on the same pool, ran out of space.
There are many ways to deal with this, with different tradeoffs. I'll post a few notes from the earlier control plane discussion.