Document UpgradeSuspensionWindow #336

Merged
merged 2 commits into from
Jun 19, 2024
@@ -222,6 +222,21 @@ openshift_upgrade_controller_cluster_version_info{channel="stable-4.14",cluster_
openshift_upgrade_controller_cluster_version_overlay_timestamp_seconds{channel="stable-4.15",cluster_id="XXX",from="2022-12-04T14:00:00Z"} 1.6701624e+09
----

=== The controller should be able to block certain time windows for upgrades (for example public holidays) [[block-upgrade-time-windows]]

The controller should be able to block certain time windows for upgrades.
This allows us to prevent upgrades during public holidays or other special events.

An `UpgradeSuspensionWindow` object blocks upgrades for a specific time window.
The objects it applies to are defined through selectors.

Matching `UpgradeConfig` objects won't create `UpgradeJob` objects during the time window.

Matching `UpgradeJob` objects won't start the upgrade during the time window.
Skipped jobs are marked as successful with the reason skipped.
Success and finish hooks are executed as normal; start hooks aren't run for skipped jobs.
Member

How is it handled currently with start hooks for a noop upgradejob? I assume start and success/finish hooks are all executed, right?

Do we need to ensure that we don't have success/finish hooks that expect certain actions to be executed in a corresponding start hook? I'm thinking about a certain customer's scale-down then scale-up requirements.

Contributor Author

@bastjan bastjan Jun 19, 2024

Start, Success, and Finish are all executed for noop jobs.
Skipped jobs would only execute Success and Finish.

The hook would see the reason for Success and Finish and we should extend the script there.

Member

Ok, just something that we need to be aware of before we roll this out. It introduces a new corner case where only some of the hooks are executed, thereby potentially violating some expectations on the behavior that we had previously.

Contributor Author

Yes. I would not expect too many of those weird jobs though, since they should™️ be idempotent anyway.

If the job was owned by an `UpgradeConfig` object, that object creates a new job with the current (possibly the same) version in the next non-suspended time window.
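
The blocking rules in this section can be sketched in a few lines of Python. This is only an illustration, not the controller's implementation: `SuspensionWindow`, `labels_match`, `is_suspended`, and `hooks_for_successful_job` are hypothetical names, the half-open interval (start inclusive, end exclusive) is an assumption, and the hook list for skipped jobs follows the review thread above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class SuspensionWindow:
    """Hypothetical in-memory stand-in for an UpgradeSuspensionWindow."""
    start: datetime
    end: datetime
    match_labels: Dict[str, str] = field(default_factory=dict)


def labels_match(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """matchLabels semantics: every selector pair must be present on the object."""
    return all(labels.get(key) == value for key, value in selector.items())


def is_suspended(window: SuspensionWindow, labels: Dict[str, str], now: datetime) -> bool:
    """True if an object with `labels` is blocked by `window` at time `now`.

    Assumption: the window is treated as half-open, [start, end).
    """
    return labels_match(window.match_labels, labels) and window.start <= now < window.end


def hooks_for_successful_job(skipped: bool) -> List[str]:
    """Hooks run for a successful job, per the review discussion.

    Skipped jobs never start, so only Success and Finish are executed.
    """
    return ["Success", "Finish"] if skipped else ["Start", "Success", "Finish"]


window = SuspensionWindow(
    start=datetime(2023, 12, 25, tzinfo=timezone.utc),
    end=datetime(2024, 1, 8, tzinfo=timezone.utc),
    match_labels={"upgrade-config": "cluster-upgrade"},
)
labels = {"upgrade-config": "cluster-upgrade"}

during = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
at_end = datetime(2024, 1, 8, 0, 0, tzinfo=timezone.utc)

print(is_suspended(window, labels, during))    # True
print(is_suspended(window, labels, at_end))    # False, the end is exclusive in this sketch
print(hooks_for_successful_job(skipped=True))  # ['Success', 'Finish']
```

Note that a hook script pairing a scale-down in Start with a scale-up in Success or Finish would see only the second half for a skipped job, which is the corner case raised in the review.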

=== When's an upgrade job considered successful? [[upgrade-success]]

The controller monitors the `ClusterVersion/version` for the `Available` condition.
@@ -455,6 +470,30 @@ Use `ttlSecondsAfterFinished` to delete the job after a certain time.
<8> There is no automatic timeout for jobs.
Use `activeDeadlineSeconds` to set a timeout.

=== UpgradeSuspensionWindow

The `UpgradeSuspensionWindow` CRD blocks upgrades during a defined time window.

[source,yaml]
----
apiVersion: managedupgrade.appuio.io/v1beta1
kind: UpgradeSuspensionWindow
metadata:
  name: end-of-year-holidays-2023
spec:
  start: "2023-12-25T00:00:00Z"
  end: "2024-01-08T00:00:00Z"
  reason: "End of year holidays"
  configSelector: <1>
    matchLabels:
      upgrade-config: cluster-upgrade
  jobSelector: <2>
    matchLabels:
      upgrade-config: cluster-upgrade
----
<1> The selector to match the `UpgradeConfig` objects to block.
<2> The selector to match the `UpgradeJob` objects to block.
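
Before applying such a manifest, it can help to sanity-check the `start` and `end` timestamps. The following is a minimal sketch using only the Python standard library. `parse_window` is a hypothetical helper, and whether the controller itself rejects a window whose `end` isn't after its `start` isn't stated here, so that check is an assumption.

```python
from datetime import datetime

# Timestamp layout used in the example manifest; %z accepts the trailing "Z".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_window(spec: dict):
    """Parse `start` and `end` from a spec mapping and check their order."""
    start = datetime.strptime(spec["start"], TIMESTAMP_FORMAT)
    end = datetime.strptime(spec["end"], TIMESTAMP_FORMAT)
    if end <= start:
        # Assumed sanity check, not documented controller behavior.
        raise ValueError("end must be after start")
    return start, end


spec = {"start": "2023-12-25T00:00:00Z", "end": "2024-01-08T00:00:00Z"}
start, end = parse_window(spec)
print(end - start)  # 14 days, 0:00:00
```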

== Resources

- https://access.redhat.com/labs/ocpupgradegraph/update_channel[Red Hat OCP Upgrade Graph]