Update Kubernetes self-hosted upgrade docs
mdlinville committed Nov 5, 2024
1 parent 1f61531 commit 05fb164
Showing 11 changed files with 510 additions and 615 deletions.
40 changes: 40 additions & 0 deletions src/current/_includes/common/upgrade/finalize-kubernetes.md
@@ -0,0 +1,40 @@
{% assign major_version_numeric = page.version.version | remove_first: 'v' %}

To finalize a major-version upgrade:

1. Connect to the cluster using the SQL shell:

~~~ shell
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach sql \
--certs-dir=/cockroach-certs \
--host=cockroachdb-public
~~~

1. Run the following command:

{% include_cached copy-clipboard.html %}
~~~ sql
> RESET CLUSTER SETTING cluster.preserve_downgrade_option;
~~~

A series of migration jobs runs to enable certain types of features and changes in the new major version that cannot be rolled back. These include changes to system schemas, indexes, and descriptors, and enabling certain types of improvements and new features. Until the upgrade is finalized, these features and functions will not be available and the command `SHOW CLUSTER SETTING version` will return the previous version.

You can monitor the progress of the migration in the DB Console [Jobs page]({% link {{ page.version.version }}/ui-jobs-page.md %}). Migration jobs have names in the format `{{ major_version_numeric }}-{migration-id}`. If a migration job fails or stalls, Cockroach Labs can use the migration ID to help diagnose and troubleshoot the problem. Each major version has different migration jobs with different IDs.

The amount of time required for finalization depends on the amount of data in the cluster, because finalization runs various internal maintenance and migration tasks. During this time, the cluster will experience a small amount of additional load.

{{site.data.alerts.callout_info}}
Finalization is not complete until all [schema change]({% link {{ page.version.version }}/online-schema-changes.md %}) jobs reach a terminal state. Finalization can take as long as the longest-running schema change.
{{site.data.alerts.end}}

When all migration jobs have completed, the upgrade is complete.
1. To confirm that finalization has completed, check the cluster version:
{% include_cached copy-clipboard.html %}
~~~ sql
> SHOW CLUSTER SETTING version;
~~~
If the cluster continues to report that it is on the previous version, finalization has not completed. If auto-finalization is enabled but finalization has not completed, check for the existence of [decommissioning nodes]({% link {{ page.version.version }}/node-shutdown.md %}?filters=decommission#status-change) where decommission has stalled. In most cases, issuing the `decommission` command again resolves the issue. If you have trouble upgrading, [contact Support](https://cockroachlabs.com/support/hc/).
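
The version check above can be scripted. The following is a minimal sketch, not part of the documented procedure: the function names `major_version_numeric` and `is_finalized` are hypothetical, the version strings are examples only, and it assumes the reported cluster version is a plain string with no leading `v`, mirroring the `remove_first: 'v'` filter used in this include.

```shell
#!/bin/sh
# Sketch only: compare a reported cluster version against a target major version.
# All names and version strings here are illustrative assumptions.
set -eu

# Strip a leading "v" (for example, "v24.2" becomes "24.2").
major_version_numeric() {
  printf '%s\n' "${1#v}"
}

# Succeed only when the reported version equals the target version.
is_finalized() {
  [ "$1" = "$2" ]
}

target=$(major_version_numeric "v24.2")  # example target, not taken from the docs
reported="24.1"                          # example value of SHOW CLUSTER SETTING version

if is_finalized "$reported" "$target"; then
  echo "finalization complete"
else
  echo "cluster still reports $reported; finalization has not completed"
fi
```

In practice, `reported` would come from the output of `SHOW CLUSTER SETTING version` in the SQL shell rather than from a hard-coded value.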
10 changes: 6 additions & 4 deletions src/current/_includes/common/upgrade/finalize-self-hosted.md
@@ -1,4 +1,6 @@
To finalize a major-version upgrade, run the following command. Replace `{VERSION}` new major version, such as `{{ page.version.version }}:
{% assign major_version_numeric = page.version.version | remove_first: 'v' %}

To finalize a major-version upgrade:

1. Connect to the cluster using the SQL shell:

@@ -7,14 +9,14 @@ To finalize a major-version upgrade, run the following command. Replace `{VERSIO
cockroach sql
~~~

1. Run the following command. Replace `{VERSION}` with the new major version, such as `{{ page.version.version }}`.
1. Run the following command. Replace `{VERSION}` with the new major version, such as `{{ major_version_numeric }}`.

{% include_cached copy-clipboard.html %}
~~~ shell
~~~ sql
SET CLUSTER SETTING version = '{VERSION}';
~~~

A series of migration jobs runs to enable certain types of features and changes in the new major version that cannot be rolled back. These include changes to system schemas, indexes, and descriptors, and enabling certain types of improvements and new features. Until the upgrade is finalized, these features and functions will not be available and the command `SHOW CLUSTER SETTING version` will return `{{ previous_version_numeric }}`.
A series of migration jobs runs to enable certain types of features and changes in the new major version that cannot be rolled back. These include changes to system schemas, indexes, and descriptors, and enabling certain types of improvements and new features. Until the upgrade is finalized, these features and functions will not be available and the command `SHOW CLUSTER SETTING version` will return the previous version.

You can monitor the progress of the migration in the DB Console [Jobs page]({% link {{ page.version.version }}/ui-jobs-page.md %}). Migration jobs have names in the format `{{ major_version_numeric }}-{migration-id}`. If a migration job fails or stalls, Cockroach Labs can use the migration ID to help diagnose and troubleshoot the problem. Each major version has different migration jobs with different IDs.

@@ -0,0 +1,264 @@
To perform a major upgrade:

<section class="filter-content" markdown="1" data-scope="operator">

1. Change the container image in the custom resource:

~~~
image:
  name: cockroachdb/cockroach:{{page.release_info.version}}
~~~
1. Apply the new settings to the cluster:
{% include_cached copy-clipboard.html %}
~~~ shell
kubectl apply -f example.yaml
~~~
The Operator will perform the staged update.
1. To check the status of the rolling upgrade, run `kubectl get pods`.
1. Verify that all pods have been upgraded:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl get pods \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}'
~~~
You can also check the CockroachDB version of each node in the [DB Console]({% link {{ page.version.version }}/ui-cluster-overview-page.md %}#node-details).
1. Before beginning a major-version upgrade, the Operator disables auto-finalization by setting the cluster setting `cluster.preserve_downgrade_option` to the cluster's current major version. Before finalizing an upgrade, follow your organization's testing procedures to decide whether to [finalize](#finalize-a-major-version-upgrade) or [roll back](#roll-back-a-major-version-upgrade) the upgrade. After finalization begins, you can no longer roll back to the cluster's previous major version.
</section>
<section class="filter-content" markdown="1" data-scope="manual">
1.
1. Add a [partition](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#staging-an-update) to the update strategy defined in the StatefulSet. Only the pods numbered greater than or equal to the partition value will be updated. For a cluster with 3 pods (e.g., `cockroachdb-0`, `cockroachdb-1`, `cockroachdb-2`) the partition value should be 2:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl patch statefulset cockroachdb \
-p='{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":2}}}}'
~~~
~~~
statefulset.apps/cockroachdb patched
~~~
1. Change the container image in the StatefulSet:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl patch statefulset cockroachdb \
--type='json' \
-p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"cockroachdb/cockroach:{{page.release_info.version}}"}]'
~~~
~~~
statefulset.apps/cockroachdb patched
~~~
1. To check the status of the rolling upgrade, run `kubectl get pods`.
1. After the pod has been restarted with the new image, start the CockroachDB [built-in SQL client]({% link {{ page.version.version }}/cockroach-sql.md %}):
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach sql \
--certs-dir=/cockroach-certs \
--host=cockroachdb-public
~~~
1. Run the following SQL query to verify that the number of underreplicated ranges is zero:
{% include_cached copy-clipboard.html %}
~~~ sql
SELECT sum((metrics->>'ranges.underreplicated')::DECIMAL)::INT AS ranges_underreplicated FROM crdb_internal.kv_store_status;
~~~
~~~
ranges_underreplicated
--------------------------
0
(1 row)
~~~
This indicates that it is safe to proceed to the next pod.
1. Exit the SQL shell:
{% include_cached copy-clipboard.html %}
~~~ sql
> \q
~~~
1. Decrement the partition value by 1 to allow the next pod in the cluster to update:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl patch statefulset cockroachdb \
-p='{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":1}}}}'
~~~
~~~
statefulset.apps/cockroachdb patched
~~~
1. Repeat steps 4-8 until all pods have been restarted and are running the new image (the final partition value should be `0`).
1. Check the image of each pod to confirm that all have been upgraded:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl get pods \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}'
~~~
~~~
cockroachdb-0 cockroachdb/cockroach:{{page.release_info.version}}
cockroachdb-1 cockroachdb/cockroach:{{page.release_info.version}}
cockroachdb-2 cockroachdb/cockroach:{{page.release_info.version}}
...
~~~
You can also check the CockroachDB version of each node in the [DB Console]({% link {{ page.version.version }}/ui-cluster-overview-page.md %}#node-details).
1. If auto-finalization is disabled, the upgrade is not complete until you [finalize the upgrade](#finalize-a-major-version-upgrade).
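
The partition countdown in the steps above can be previewed as a loop. The following is a minimal dry-run sketch that only prints the `kubectl patch` commands, one per pod, from the highest partition down to `0`. It assumes a StatefulSet named `cockroachdb`; the function names `partition_patch` and `plan_rolling_upgrade` are hypothetical and not part of the documented procedure.

```shell
#!/bin/sh
# Sketch only: print (do not run) the partition patches for a staged upgrade.
# Assumes a StatefulSet named "cockroachdb"; function names are illustrative.
set -eu

# Emit the patch body for a given partition value.
partition_patch() {
  printf '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":%d}}}}' "$1"
}

# Walk the partition from (replicas - 1) down to 0, one pod per step.
plan_rolling_upgrade() {
  partition=$(( $1 - 1 ))
  while [ "$partition" -ge 0 ]; do
    printf "kubectl patch statefulset cockroachdb -p='%s'\n" "$(partition_patch "$partition")"
    partition=$(( partition - 1 ))
  done
}

# Preview the patches for a 3-pod cluster.
plan_rolling_upgrade 3
```

The sketch deliberately stops at printing. Between each patch, the under-replicated ranges check described above still needs to return `0` before you proceed to the next pod.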
</section>
<section class="filter-content" markdown="1" data-scope="helm">
1. Add a [partition](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#staging-an-update) to the update strategy defined in the StatefulSet. Only the pods numbered greater than or equal to the partition value will be updated. For a cluster with 3 pods (e.g., `cockroachdb-0`, `cockroachdb-1`, `cockroachdb-2`) the partition value should be 2:
{% include_cached copy-clipboard.html %}
~~~ shell
$ helm upgrade \
my-release \
cockroachdb/cockroachdb \
--set statefulset.updateStrategy.rollingUpdate.partition=2
~~~
1. Connect to the cluster using the SQL shell:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach sql \
--certs-dir=/cockroach-certs \
--host=my-release-cockroachdb-public
~~~
1. Remove the cluster initialization job from when the cluster was created:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl delete job my-release-cockroachdb-init
~~~
1. Change the container image in the StatefulSet:
{% include_cached copy-clipboard.html %}
~~~ shell
$ helm upgrade \
my-release \
cockroachdb/cockroachdb \
--set image.tag={{page.release_info.version}} \
--reuse-values
~~~
~~~
NAME READY STATUS RESTARTS AGE
my-release-cockroachdb-0 1/1 Running 0 2m
my-release-cockroachdb-1 1/1 Running 0 3m
my-release-cockroachdb-2 0/1 ContainerCreating 0 25s
my-release-cockroachdb-init-nwjkh 0/1 ContainerCreating 0 6s
...
~~~
{{site.data.alerts.callout_info}}
Ignore the pod for cluster initialization. It is re-created as a byproduct of the StatefulSet configuration but does not impact your existing cluster.
{{site.data.alerts.end}}
1. After the pod has been restarted with the new image, start the CockroachDB [built-in SQL client]({% link {{ page.version.version }}/cockroach-sql.md %}):
{% if page.secure == true %}
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach sql \
--certs-dir=/cockroach-certs \
--host=my-release-cockroachdb-public
~~~
{% else %}
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl run cockroachdb -it \
--image=cockroachdb/cockroach \
--rm \
--restart=Never \
-- sql \
--insecure \
--host=my-release-cockroachdb-public
~~~
{% endif %}
1. Run the following SQL query to verify that the number of underreplicated ranges is zero:
{% include_cached copy-clipboard.html %}
~~~ sql
SELECT sum((metrics->>'ranges.underreplicated')::DECIMAL)::INT AS ranges_underreplicated FROM crdb_internal.kv_store_status;
~~~
~~~
ranges_underreplicated
--------------------------
0
(1 row)
~~~
This indicates that it is safe to proceed to the next pod.
1. Exit the SQL shell:
{% include_cached copy-clipboard.html %}
~~~ sql
> \q
~~~
1. Decrement the partition value by 1 to allow the next pod in the cluster to update:
{% include_cached copy-clipboard.html %}
~~~ shell
$ helm upgrade \
my-release \
cockroachdb/cockroachdb \
--set statefulset.updateStrategy.rollingUpdate.partition=1
~~~
1. Repeat steps 4-8 until all pods have been restarted and are running the new image (the final partition value should be `0`).
1. Check the image of each pod to confirm that all have been upgraded:
{% include_cached copy-clipboard.html %}
~~~ shell
$ kubectl get pods \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}'
~~~
~~~
my-release-cockroachdb-0 cockroachdb/cockroach:{{page.release_info.version}}
my-release-cockroachdb-1 cockroachdb/cockroach:{{page.release_info.version}}
my-release-cockroachdb-2 cockroachdb/cockroach:{{page.release_info.version}}
...
~~~
You can also check the CockroachDB version of each node in the [DB Console]({% link {{ page.version.version }}/ui-cluster-overview-page.md %}#node-details).
1. If auto-finalization is disabled, the upgrade is not complete until you [finalize the upgrade](#finalize-a-major-version-upgrade).
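
The image check in the steps above can be automated by filtering the tab-separated `kubectl get pods` output. The following is a minimal sketch under stated assumptions: the function name `pods_not_on_image` is hypothetical, the sample pod lines and image tags are made-up illustration data, and the input is expected in the `name<TAB>image` form produced by the `jsonpath` expression shown earlier.

```shell
#!/bin/sh
# Sketch only: list pods whose image differs from the expected one.
# The function name and the sample data below are illustrative assumptions.
set -eu

# Reads "pod<TAB>image" lines on stdin and prints the names of pods that are
# not running the image given as $1. Exits non-zero if any such pod is found.
pods_not_on_image() {
  awk -F '\t' -v want="$1" '$2 != want { print $1; stale = 1 } END { exit stale }'
}

# Sample input standing in for real `kubectl get pods -o jsonpath=...` output.
printf 'my-release-cockroachdb-0\texample/image:new\nmy-release-cockroachdb-1\texample/image:old\n' \
  | pods_not_on_image 'example/image:new' \
  || echo "some pods are not upgraded yet"
```

To use it against a live cluster, pipe the output of the `kubectl get pods` command from the step above into `pods_not_on_image` with the full image name you expect.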
</section>
@@ -0,0 +1 @@
To roll back a patch upgrade, repeat the steps in [Perform a patch upgrade](#perform-a-patch-upgrade), but set the container image for the pods to the previous patch version.