Commit

address comments
Arvind Thirumurugan committed Jan 21, 2025
1 parent 63ff989 commit 1feaa25
Showing 2 changed files with 40 additions and 210 deletions.
10 changes: 7 additions & 3 deletions docs/concepts/EvictionAndDisruptionBudget/README.md
@@ -52,10 +52,14 @@ Users are allowed to specify one of two fields in the `ClusterResourcePlacementD
- MaxUnavailable - specifies the maximum number of clusters in which a placement can be unavailable due to voluntary disruptions.
- MinAvailable - specifies the minimum number of clusters in which placements are available despite voluntary disruptions.

For both `MaxUnavailable` and `MinAvailable`, the user can specify the number of clusters as an integer or as a percentage of the total number of clusters in the fleet.

> **Note:** For both MaxUnavailable and MinAvailable, involuntary disruptions are not subject to the disruption budget but will still count against it.
When specifying a disruption budget for a particular `ClusterResourcePlacement`, the user needs to consider the following cases:

- For `PickFixed` CRP, whether a `ClusterResourcePlacementDisruptionBudget` is specified or not, if an eviction is carried out, the user will receive an invalid eviction error message in the eviction status.
- For `PickAll` CRP, if a `ClusterResourcePlacementDisruptionBudget` is specified and the `MaxUnavailable` field is set, the user will receive a misconfigured placement disruption budget error message in the eviction status because total number of clusters selected is non-deterministic.
- For `PickN` CRP, if a `ClusterResourcePlacementDisruptionBudget` is specified, the user can either set `MaxUnavailable` or `MinAvailable` field since the fields are mutually exclusive.
- For `PickFixed` CRP, whether a `ClusterResourcePlacementDisruptionBudget` is specified or not, if an eviction object is created, the user will receive an invalid eviction error message in the eviction status.
- For `PickAll` CRP, if a `ClusterResourcePlacementDisruptionBudget` is specified, the user will receive a misconfigured placement disruption budget error message in the eviction status in the following cases, because the total number of clusters selected is non-deterministic:
  - If the `MaxUnavailable` field is set, either as an integer or as a percentage
  - If the `MinAvailable` field is set as a percentage
- For `PickN` CRP, if a `ClusterResourcePlacementDisruptionBudget` is specified, the user can set either the `MaxUnavailable` or the `MinAvailable` field, as an integer or as a percentage, but not both, since the fields are mutually exclusive.
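
The cases listed above can be summarized as a small decision function. This is only a sketch of the rules as this README describes them, not the eviction controller's code: the `Budget` class, the `check_eviction` function, and the returned strings are hypothetical names chosen for illustration, and rejecting a budget with both fields set via `ValueError` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Union

# An integer count (e.g. 1) or a percentage string (e.g. "50%").
IntOrPercent = Union[int, str]


@dataclass
class Budget:
    """Hypothetical stand-in for a ClusterResourcePlacementDisruptionBudget spec."""
    max_unavailable: Optional[IntOrPercent] = None
    min_available: Optional[IntOrPercent] = None


def is_percentage(value: Optional[IntOrPercent]) -> bool:
    return isinstance(value, str) and value.endswith("%")


def check_eviction(placement_type: str, budget: Optional[Budget]) -> str:
    """Return the outcome the README describes for an eviction."""
    # PickFixed: evictions are invalid whether or not a budget exists.
    if placement_type == "PickFixed":
        return "invalid eviction"
    if budget is None:
        return "ok"
    # Assumption: a budget setting both fields is rejected outright,
    # because the fields are mutually exclusive.
    if budget.max_unavailable is not None and budget.min_available is not None:
        raise ValueError("MaxUnavailable and MinAvailable are mutually exclusive")
    if placement_type == "PickAll":
        # The number of selected clusters is non-deterministic, so any
        # MaxUnavailable value and a percentage MinAvailable are misconfigured.
        if budget.max_unavailable is not None:
            return "misconfigured placement disruption budget"
        if is_percentage(budget.min_available):
            return "misconfigured placement disruption budget"
    # PickN (and PickAll with an integer MinAvailable): the budget is usable.
    return "ok"
```

For example, under these assumptions `check_eviction("PickAll", Budget(min_available=1))` passes the configuration check, while `check_eviction("PickAll", Budget(min_available="50%"))` reports a misconfigured budget.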
240 changes: 33 additions & 207 deletions docs/howtos/eviction-placement-disruption-budget.md
@@ -32,94 +32,12 @@ spec:
placementType: PickAll
```
The CRP status after applying should look something like this:
The `CRP` status after applying should look something like this:

```yaml
status:
conditions:
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: found all cluster needed as specified by the scheduling policy, found
1 cluster(s)
observedGeneration: 2
reason: SchedulingPolicyFulfilled
status: "True"
type: ClusterResourcePlacementScheduled
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: All 1 cluster(s) start rolling out the latest resource
observedGeneration: 2
reason: RolloutStarted
status: "True"
type: ClusterResourcePlacementRolloutStarted
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: No override rules are configured for the selected resources
observedGeneration: 2
reason: NoOverrideSpecified
status: "True"
type: ClusterResourcePlacementOverridden
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: Works(s) are succcesfully created or updated in 1 target cluster(s)'
namespaces
observedGeneration: 2
reason: WorkSynchronized
status: "True"
type: ClusterResourcePlacementWorkSynchronized
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: The selected resources are successfully applied to 1 cluster(s)
observedGeneration: 2
reason: ApplySucceeded
status: "True"
type: ClusterResourcePlacementApplied
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: The selected resources in 1 cluster(s) are available now
observedGeneration: 2
reason: ResourceAvailable
status: "True"
type: ClusterResourcePlacementAvailable
observedResourceIndex: "0"
placementStatuses:
- clusterName: kind-cluster-1
conditions:
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: 'Successfully scheduled resources for placement in "kind-cluster-1"
(affinity score: 0, topology spread score: 0): picked by scheduling policy'
observedGeneration: 2
reason: Scheduled
status: "True"
type: Scheduled
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: Detected the new changes on the resources and started the rollout process
observedGeneration: 2
reason: RolloutStarted
status: "True"
type: RolloutStarted
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: No override rules are configured for the selected resources
observedGeneration: 2
reason: NoOverrideSpecified
status: "True"
type: Overridden
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: All of the works are synchronized to the latest
observedGeneration: 2
reason: AllWorkSynced
status: "True"
type: WorkSynchronized
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: All corresponding work objects are applied
observedGeneration: 2
reason: AllWorkHaveBeenApplied
status: "True"
type: Applied
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: All corresponding work objects are available
observedGeneration: 2
reason: AllWorkAreAvailable
status: "True"
type: Available
selectedResources:
- kind: Namespace
name: test-ns
version: v1
kubectl get crp test-crp
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
test-crp 2 True 2 True 2 5m49s
```

Let's now add a taint to the member cluster to ensure this cluster is not picked again by the scheduler once we evict resources from it.
@@ -151,45 +69,23 @@ spec:
clusterName: kind-cluster-1
```

the eviction status lets us know if the eviction was successful:
The eviction object should look like this if the eviction was successful:

```yaml
status:
conditions:
- lastTransitionTime: "2025-01-19T12:10:01Z"
message: Eviction is valid
observedGeneration: 1
reason: ClusterResourcePlacementEvictionValid
status: "True"
type: Valid
- lastTransitionTime: "2025-01-19T12:10:01Z"
message: Eviction is allowed, no ClusterResourcePlacementDisruptionBudget specified
observedGeneration: 1
reason: ClusterResourcePlacementEvictionExecuted
status: "True"
type: Executed
kubectl get crpe test-eviction
NAME VALID EXECUTED
test-eviction True True
```

since the eviction is successful, the resources should be removed from the cluster, let's take a look at the CRP object's status to verify:
Since the eviction was successful, the resources should have been removed from the cluster. Let's take a look at the `CRP` object status to verify:

```yaml
status:
conditions:
- lastTransitionTime: "2025-01-19T11:43:31Z"
message: found all cluster needed as specified by the scheduling policy, found
0 cluster(s)
observedGeneration: 2
reason: SchedulingPolicyFulfilled
status: "True"
type: ClusterResourcePlacementScheduled
observedResourceIndex: "0"
selectedResources:
- kind: Namespace
name: test-ns
version: v1
kubectl get crp test-crp
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
test-crp 2 True 2 15m
```

The status shows that the resources have been removed from the cluster and the only reason the scheduler doesn't re-pick the cluster is because of the taint we added.
From the object we can tell that the resources were evicted, since the `AVAILABLE` column is empty. If more information is needed, the `ClusterResourcePlacement` object's status can be checked.

## Protecting resources from voluntary disruptions using ClusterResourcePlacementDisruptionBudget

@@ -217,94 +113,12 @@ spec:
numberOfClusters: 1
```

The CRP status after applying should look something like this:
The `CRP` object after applying should look something like this:

```yaml
status:
conditions:
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: found all cluster needed as specified by the scheduling policy, found
1 cluster(s)
observedGeneration: 2
reason: SchedulingPolicyFulfilled
status: "True"
type: ClusterResourcePlacementScheduled
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: All 1 cluster(s) start rolling out the latest resource
observedGeneration: 2
reason: RolloutStarted
status: "True"
type: ClusterResourcePlacementRolloutStarted
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: No override rules are configured for the selected resources
observedGeneration: 2
reason: NoOverrideSpecified
status: "True"
type: ClusterResourcePlacementOverridden
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: Works(s) are succcesfully created or updated in 1 target cluster(s)'
namespaces
observedGeneration: 2
reason: WorkSynchronized
status: "True"
type: ClusterResourcePlacementWorkSynchronized
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: The selected resources are successfully applied to 1 cluster(s)
observedGeneration: 2
reason: ApplySucceeded
status: "True"
type: ClusterResourcePlacementApplied
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: The selected resources in 1 cluster(s) are available now
observedGeneration: 2
reason: ResourceAvailable
status: "True"
type: ClusterResourcePlacementAvailable
observedResourceIndex: "0"
placementStatuses:
- clusterName: kind-cluster-1
conditions:
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: 'Successfully scheduled resources for placement in "kind-cluster-1"
(affinity score: 0, topology spread score: 0): picked by scheduling policy'
observedGeneration: 2
reason: Scheduled
status: "True"
type: Scheduled
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: Detected the new changes on the resources and started the rollout process
observedGeneration: 2
reason: RolloutStarted
status: "True"
type: RolloutStarted
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: No override rules are configured for the selected resources
observedGeneration: 2
reason: NoOverrideSpecified
status: "True"
type: Overridden
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: All of the works are synchronized to the latest
observedGeneration: 2
reason: AllWorkSynced
status: "True"
type: WorkSynchronized
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: All corresponding work objects are applied
observedGeneration: 2
reason: AllWorkHaveBeenApplied
status: "True"
type: Applied
- lastTransitionTime: "2025-01-19T12:36:54Z"
message: All corresponding work objects are available
observedGeneration: 2
reason: AllWorkAreAvailable
status: "True"
type: Available
selectedResources:
- kind: Namespace
name: test-ns
version: v1
kubectl get crp test-crp
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
test-crp 2 True 2 True 2 8s
```

Now we will create a `ClusterResourcePlacementDisruptionBudget` object to protect resources on the member cluster from voluntary disruption:
@@ -332,18 +146,30 @@ spec:
clusterName: kind-cluster-1
```

let's take a look at the status to see if the eviction was executed,
> **Note:** When a `ClusterResourcePlacementEviction` object is reconciled, the eviction controller tries to get the corresponding `ClusterResourcePlacementDisruptionBudget` object to check whether the specified `MaxUnavailable` or `MinAvailable` allows the eviction to be executed.

Let's take a look at the eviction object to see if the eviction was executed:

```yaml
kubectl get crpe test-eviction
NAME VALID EXECUTED
test-eviction True False
```

From the eviction object we can see that the eviction was not executed.

Let's take a look at the `ClusterResourcePlacementEviction` object status to see why the eviction was not executed:

```yaml
status:
conditions:
- lastTransitionTime: "2025-01-19T12:48:42Z"
- lastTransitionTime: "2025-01-21T15:52:29Z"
message: Eviction is valid
observedGeneration: 1
reason: ClusterResourcePlacementEvictionValid
status: "True"
type: Valid
- lastTransitionTime: "2025-01-19T12:48:42Z"
- lastTransitionTime: "2025-01-21T15:52:29Z"
message: 'Eviction is blocked by specified ClusterResourcePlacementDisruptionBudget,
availablePlacements: 1, totalPlacements: 1'
observedGeneration: 1
@@ -352,4 +178,4 @@ status:
type: Executed
```

from the eviction status we can clearly see the eviction was blocked by the `ClusterResourcePlacementDisruptionBudget` object which protected resources from being evicted from the MemberCluster.
The eviction status clearly states that the eviction was blocked by the specified `ClusterResourcePlacementDisruptionBudget`.
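
The `availablePlacements: 1, totalPlacements: 1` figures in the condition message can be worked through with a small sketch. The budget's spec is not shown in this diff, so the values used below (`min_available=1`, `max_unavailable=0`) are assumptions for illustration only. The function names and the exact comparison are also hypothetical; they follow the usual disruption-budget arithmetic rather than the eviction controller's source.

```python
from typing import Optional


def disruptions_allowed(
    total: int,
    available: int,
    *,
    max_unavailable: Optional[int] = None,
    min_available: Optional[int] = None,
) -> int:
    """Number of further voluntary disruptions an integer budget permits."""
    # Exactly one of the two fields must be set (they are mutually exclusive).
    if (max_unavailable is None) == (min_available is None):
        raise ValueError("set exactly one of max_unavailable or min_available")
    if max_unavailable is not None:
        # Placements that are already unavailable, for any reason,
        # consume part of the budget.
        unavailable = total - available
        return max(max_unavailable - unavailable, 0)
    return max(available - min_available, 0)


def eviction_allowed(total: int, available: int, **budget: Optional[int]) -> bool:
    """An eviction is one more disruption, so at least one must be allowed."""
    return disruptions_allowed(total, available, **budget) > 0


# One placement, one available, assumed MinAvailable of 1: evicting would
# leave 0 available placements, so the eviction is blocked.
blocked = not eviction_allowed(1, 1, min_available=1)
```

Under the same assumptions, `eviction_allowed(3, 2, max_unavailable=1)` is also false: a placement that is already unavailable, for instance because of an involuntary disruption, counts against the budget even though it was not subject to it.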
