Consider configuring kueue waitForPodsReady #191

avrittrohwer · 2024-09-23T17:08:02Z

kueue supports all-or-nothing scheduling: https://kueue.sigs.k8s.io/docs/tasks/manage/setup_wait_for_pods_ready/

Large multi-pod workloads that need every pod to be running to make progress (e.g. single-program-multiple-data workloads) can deadlock capacity if the physical availability of resources does not match the configured kueue quotas. The kueue waitForPodsReady feature configures kueue to additionally monitor the pod readiness condition for workloads. If not all pods become ready within a configured timeout, the workload is evicted and requeued.
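To make the timeout behavior concrete, here is a minimal Python sketch of the all-or-nothing check described above. It is not kueue's implementation: the `Workload` type, its fields, and `should_evict` are hypothetical names for illustration, and it assumes the timer is measured from the moment the workload is admitted.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Workload:
    """Hypothetical snapshot of a multi-pod workload (not a kueue type)."""
    name: str
    pods_total: int
    pods_ready: int
    since_admission: timedelta  # assumed: time elapsed since admission


def should_evict(w: Workload, timeout: timedelta) -> bool:
    """Return True when the timeout has elapsed and some pod is still not ready."""
    all_ready = w.pods_ready >= w.pods_total
    return (not all_ready) and w.since_admission > timeout


# 6 of 8 pods ready after 6 minutes: evicted under a 5m timeout,
# but left alone under a 10m timeout.
stuck = Workload("spmd-job", pods_total=8, pods_ready=6,
                 since_admission=timedelta(minutes=6))
print(should_evict(stuck, timedelta(minutes=5)))
print(should_evict(stuck, timedelta(minutes=10)))
```

The point of the sketch is that a partially running workload holds quota without making progress, so evicting and requeuing it frees that capacity for a workload that can start in full.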


PBundyra · 2024-09-25T09:12:14Z

Hi @avrittrohwer! I like the idea. Do you suggest using the default waitForPodsReady settings, or making it configurable through an xpk flag? I'm leaning towards enabling it by default with the default values.

avrittrohwer · 2024-09-25T14:27:26Z

I'm not sure a single waitForPodsReady configuration would be good in all scenarios. For example, the default waitForPodsReady.timeout is 5m; if the cluster is using node auto-provisioning, that timeout is likely too short.

The kueue configuration is stored in a configmap (https://kueue.sigs.k8s.io/docs/installation/#install-a-custom-configured-released-version), so users could easily just update that configmap in their cluster. Another idea is to introduce a config directory concept in xpk: we could keep a yaml representation of the kueue configmap which users could edit on disk (and commit to source control), and xpk could then take care of ensuring the cluster state matches the state on disk.
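As a rough illustration of the "cluster state matches the state on disk" idea, here is a minimal Python sketch that compares a desired configuration with the live one and reports the differences. Everything here is an assumption for illustration: `config_drift` is a hypothetical helper, not an xpk or kueue API, the configs are shown as already-parsed dicts, and the `10m` value is just an example.

```python
def config_drift(desired: dict, live: dict, prefix: str = "") -> dict:
    """Return {dotted.path: (desired_value, live_value)} for every differing key."""
    drift = {}
    for key in sorted(set(desired) | set(live)):
        path = f"{prefix}.{key}" if prefix else key
        want, have = desired.get(key), live.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            # Recurse into nested sections such as waitForPodsReady.
            drift.update(config_drift(want, have, path))
        elif want != have:
            drift[path] = (want, have)
    return drift


# Desired state as it might be kept on disk vs. what the cluster currently has.
desired = {"waitForPodsReady": {"enable": True, "timeout": "10m"}}
live = {"waitForPodsReady": {"enable": True, "timeout": "5m"}}
print(config_drift(desired, live))
```

A tool like xpk could use a report like this to decide whether the configmap needs to be re-applied, but how it would read the configmap and apply the change is left open here.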

PBundyra · 2024-09-30T06:19:45Z

WDYT @44past4
