From df69db9bb44f7ac2026c38b0cdf5a6a3425b0e67 Mon Sep 17 00:00:00 2001
From: Sebastian Widmer
Date: Mon, 13 May 2024 13:43:26 +0200
Subject: [PATCH 1/5] Decision: OCP minor version tracking

---
 .../decisions/ocp-minor-version-tracking.adoc | 79 +++++++++++++++++++
 docs/modules/ROOT/partials/nav.adoc           |  1 +
 2 files changed, 80 insertions(+)
 create mode 100644 docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc

diff --git a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
new file mode 100644
index 00000000..edd61d5a
--- /dev/null
+++ b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
@@ -0,0 +1,79 @@
+= OpenShift Minor Version Upgrade Tracking
+
+== Problem
+
+Upgrading OpenShift minor versions requires a lot of coordination between our OpenShift team, customers, and other teams like AppCat and AppFlow teams.
+It's difficult to track the status of each cluster.
+We need to weekly check if and when we can switch which cluster to a new version and not forget the required pull requests.
+
+=== Goals
+
+* Allowing an overview of the status of each cluster
+* Automating the process of changing the release channel
+
+=== Non-Goals
+
+* Automating customer or team approval for the upgrade
+
+== Proposals
+
+=== Automation in Jira to set cluster facts or update the tenant git repository
+
+Jira is the main tool for tracking customer communication and the status of the upgrade.
+
+We link the Jira ticket to the cluster and set the Lieutenant cluster fact or update the tenant git repository with the required information.
+
+We don't have much experience with Jira automation or workflow, so we don't know how much effort this would be.
+Additionally Jira isn't in our operational control, so we would need to work with other equally busy teams to get this done.
+
+We're also not if Jira is here to stay or if we will switch to another tool in the nearish future.
+
+=== Automation in GitLab with metrics export to Grafana dashboard
+
+We create automation to schedule a pull request merge in the tenant git repository to change the release channel.
+
+The pull request automation would ideally also export metrics about the pull request status and expected merge time to a Grafana dashboard.
+
+=== Automation in upgrade controller with Grafana dashboard
+
+Allow the upgrade controller to schedule `ClusterVersion` changes.
+This could be in the form of a base version with overlaid patches for the `ClusterVersion` resource.
+
+We would need an additional metric for current channel and future versions.
+
+[source,yaml]
+----
+apiVersion: managedupgrade.appuio.io/v1beta1
+kind: ClusterVersion
+metadata:
+  name: version
+  namespace: appuio-openshift-upgrade-controller
+spec:
+  template:
+    spec:
+      channel: stable-4.14
+      clusterID: XXX
+  patches:
+  - from: "2024-07-12T00:00:00Z"
+    patch:
+      spec:
+        channel: stable-4.15
+----
+
+== Decision
+
+We add automation to the upgrade controller to schedule `ClusterVersion` changes and implement the required metrics for the Grafana dashboard.
+
+== Rationale
+
+The upgrade controller is already in place and we've got experience with the codebase.
+
+All relevant timestamps are in a single place and thus easier to track.
+
+We don't want to build Jira automation or GitLab automation for a single use case.
+
+== References
+
+- https://github.com/appuio/openshift-upgrade-controller/blob/master/api/v1beta1/clusterversion_types.go
+- xref:oc4:ROOT:explanations/decisions/scheduled-mr-merges.adoc[]
+- https://ticket.vshn.net/browse/SYN-1387[Build automation for merging MRs at specified times (internal)]
diff --git a/docs/modules/ROOT/partials/nav.adoc b/docs/modules/ROOT/partials/nav.adoc
index d0ee7f35..f9a15ec2 100644
--- a/docs/modules/ROOT/partials/nav.adoc
+++ b/docs/modules/ROOT/partials/nav.adoc
@@ -245,6 +245,7 @@
 *** xref:oc4:ROOT:explanations/decisions/multi-team-alert-routing-base-alerts.adoc[]
 ** xref:oc4:ROOT:explanations/decisions/shipping-metrics-to-centralized-instance.adoc[]
 ** xref:oc4:ROOT:explanations/decisions/scheduled-mr-merges.adoc[]
+** xref:oc4:ROOT:explanations/decisions/ocp-minor-version-tracking.adoc[]
 ** xref:oc4:ROOT:explanations/decisions/subscription-tracking.adoc[]
 ** xref:oc4:ROOT:explanations/decisions/admin-kubeconfig.adoc[]
 ** xref:oc4:ROOT:explanations/decisions/cloudscale-cilium-egressip.adoc[]

From 64c0d09db93a6a4593d3f841f6d829f06ccc97d0 Mon Sep 17 00:00:00 2001
From: Sebastian Widmer
Date: Mon, 13 May 2024 13:53:18 +0200
Subject: [PATCH 2/5] Update docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc

Co-authored-by: Adrian Haas <11636405+haasad@users.noreply.github.com>
---
 .../explanations/decisions/ocp-minor-version-tracking.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
index edd61d5a..00c314c9 100644
--- a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
+++ b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
@@ -26,7 +26,7 @@ We link the Jira ticket to the cluster and set the Lieutenant cluster fact or up
 We don't have much experience with Jira automation or workflow, so we don't know how much effort this would be.
 Additionally Jira isn't in our operational control, so we would need to work with other equally busy teams to get this done.
 
-We're also not if Jira is here to stay or if we will switch to another tool in the nearish future.
+We're also not sure if Jira is here to stay or if we will switch to another tool in the nearish future.
 
 === Automation in GitLab with metrics export to Grafana dashboard
 

From e0866e151b96311274bec1dfda1aeee71085b546 Mon Sep 17 00:00:00 2001
From: Sebastian Widmer
Date: Mon, 13 May 2024 13:57:54 +0200
Subject: [PATCH 3/5] Apply suggestions from code review

Co-authored-by: Simon Gerber
---
 .../decisions/ocp-minor-version-tracking.adoc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
index 00c314c9..08a8a3e0 100644
--- a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
+++ b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
@@ -4,7 +4,7 @@
 
 Upgrading OpenShift minor versions requires a lot of coordination between our OpenShift team, customers, and other teams like AppCat and AppFlow teams.
 It's difficult to track the status of each cluster.
-We need to weekly check if and when we can switch which cluster to a new version and not forget the required pull requests.
+We need to check weekly if and when we can switch which cluster to a new version and not forget the required pull requests.
 
 === Goals
 
@@ -17,11 +17,11 @@ We need to weekly check if and when we can switch which cluster to a new version
 
 == Proposals
 
-=== Automation in Jira to set cluster facts or update the tenant git repository
+=== Automation in Jira to set cluster facts or update the tenant Git repository
 
 Jira is the main tool for tracking customer communication and the status of the upgrade.
 
-We link the Jira ticket to the cluster and set the Lieutenant cluster fact or update the tenant git repository with the required information.
+We link the Jira ticket to the cluster and set the Lieutenant cluster fact or update the tenant Git repository with the required information.
 
 We don't have much experience with Jira automation or workflow, so we don't know how much effort this would be.
 Additionally Jira isn't in our operational control, so we would need to work with other equally busy teams to get this done.
@@ -30,7 +30,7 @@ We're also not sure if Jira is here to stay or if we will switch to another tool
 
 === Automation in GitLab with metrics export to Grafana dashboard
 
-We create automation to schedule a pull request merge in the tenant git repository to change the release channel.
+We create automation to schedule a pull request merge in the tenant Git repository to change the release channel.
 
 The pull request automation would ideally also export metrics about the pull request status and expected merge time to a Grafana dashboard.
 
From f57118e1ac24c78bedbf5968fa960b659ec6de27 Mon Sep 17 00:00:00 2001
From: Sebastian Widmer
Date: Mon, 13 May 2024 13:57:09 +0200
Subject: [PATCH 4/5] Reference admin ack issue

---
 .../pages/explanations/decisions/ocp-minor-version-tracking.adoc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
index 08a8a3e0..471b35e2 100644
--- a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
+++ b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
@@ -14,6 +14,7 @@ We need to check weekly if and when we can switch which cluster to a new version
 === Non-Goals
 
 * Automating customer or team approval for the upgrade
+* Automating the admin acknowledgment sometimes needed for the upgrade, this will be handled by an upgrade hook https://github.com/appuio/component-openshift-upgrade-controller/issues/51[appuio/component-openshift-upgrade-controller#51].
 
 == Proposals
 

From b403934ecb7ceb434daf079e9be644eaeee77666 Mon Sep 17 00:00:00 2001
From: Sebastian Widmer
Date: Mon, 13 May 2024 14:00:47 +0200
Subject: [PATCH 5/5] Reference CV patch as implementation idea

---
 .../explanations/decisions/ocp-minor-version-tracking.adoc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
index 471b35e2..702c201e 100644
--- a/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
+++ b/docs/modules/ROOT/pages/explanations/decisions/ocp-minor-version-tracking.adoc
@@ -42,6 +42,8 @@ This could be in the form of a base version with overlaid patches for the `Clust
 
 We would need an additional metric for current channel and future versions.
 
+==== Implementation Idea
+
 [source,yaml]
 ----
 apiVersion: managedupgrade.appuio.io/v1beta1