storcon: improve resource rebalancing across Pageservers #10488
Labels:
c/storage/controller (Component: Storage Controller)
t/feature (Issue type: feature, for new features or requests)
During releases, for example, tenant shards are moved around as Pageservers restart. This can lead to load imbalances, where some Pageservers carry much higher load than others and may run into resource exhaustion. Imbalances can also develop gradually as tenant workloads change.
Currently, such imbalances must be resolved manually, by moving tenants around. This is slow, laborious, and often doesn't happen at all.
The storage controller should monitor Pageserver resource usage and attempt to balance resource usage and avoid overload across the following dimensions:
This must be combined with other constraints such as AZ affinity.
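To make the request concrete, here is a minimal sketch of a greedy rebalancing pass. It is an illustration only, not the storage controller's actual scheduler. All names (`Shard`, `Pageserver`, `plan_moves`, `preferred_az`) are hypothetical, load is collapsed into a single assumed scalar score rather than multiple resource dimensions, and AZ affinity is treated as a hard constraint, which may be stricter than what the real controller wants.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    shard_id: str
    load: float        # hypothetical scalar load score (assumption)
    preferred_az: str  # hypothetical AZ affinity (assumption)


@dataclass
class Pageserver:
    node_id: str
    az: str
    shards: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return sum(s.load for s in self.shards)


def plan_moves(nodes, max_moves=10):
    """Greedily pick shard moves that narrow the load gap between two nodes.

    Moves are applied to the in-memory model and returned as
    (shard_id, source_node_id, destination_node_id) tuples.
    """
    moves = []
    for _ in range(max_moves):
        best = None  # (gap reduction, shard, src, dst)
        for src in nodes:
            for dst in nodes:
                gap = src.load - dst.load
                if src is dst or gap <= 0:
                    continue
                for shard in src.shards:
                    # Respect AZ affinity: only move within the preferred AZ.
                    if shard.preferred_az != dst.az:
                        continue
                    # Skip moves that would not narrow the gap.
                    if shard.load >= gap:
                        continue
                    reduction = gap - abs(gap - 2 * shard.load)
                    if reduction > 0 and (best is None or reduction > best[0]):
                        best = (reduction, shard, src, dst)
        if best is None:
            break  # no remaining move improves the balance
        _, shard, src, dst = best
        src.shards.remove(shard)
        dst.shards.append(shard)
        moves.append((shard.shard_id, src.node_id, dst.node_id))
    return moves


# Usage: node "a" is overloaded, "b" is idle in the same AZ, "c" is in another AZ.
a = Pageserver("a", "az1", [Shard("s1", 5.0, "az1"),
                            Shard("s2", 3.0, "az1"),
                            Shard("s3", 2.0, "az1")])
b = Pageserver("b", "az1")
c = Pageserver("c", "az2")
moves = plan_moves([a, b, c])
```

In this example only `s1` moves, from `a` to `b`, leaving both at the same load, while `c` stays empty because every shard prefers `az1`. The `max_moves` cap reflects that shard migrations have a cost, so a real implementation would likely limit how much it moves per pass.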