Commit

Reorganize Ceph assemblies
Ceph assemblies are now better reorganized to follow a simple
rule/struct.
A main ceph-cluster migration assembly is included in main, and it
contains a quick intro and the (ordered) list of procedures (including
the cardinality section that is critical here and will be improved in a
follow up patch).
This way the ceph doc is very easy to access and maintain. There are
also fixes to wrong references (e.g. horizon != Ceph dashboard).

Signed-off-by: Francesco Pantano <[email protected]>
fmount committed May 29, 2024
1 parent 09ef890 commit 81880a9
Showing 9 changed files with 157 additions and 121 deletions.
44 changes: 44 additions & 0 deletions docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
ifdef::context[:parent-context: {context}]

[id="ceph-migration_{context}"]

= Migrating the {CephCluster} Cluster

:context: migrating-ceph

:toc: left
:toclevels: 3

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]

In the context of data plane adoption, where the {rhos_prev_long}
({OpenStackShort}) services are redeployed in {OpenShift}, you migrate a
{OpenStackPreviousInstaller}-deployed {CephCluster} cluster by using a process
called “externalizing” the {CephCluster} cluster.

There are two deployment topologies that include an internal {CephCluster}
cluster:

* {OpenStackShort} includes dedicated {CephCluster} nodes to host object
storage daemons (OSDs)

* Hyperconverged Infrastructure (HCI), where Compute and Storage services are
colocated on hyperconverged nodes

In either scenario, there are some {Ceph} processes that are deployed on
{OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW),
Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS
Ganesha. To migrate your {CephCluster} cluster, you must decommission the
Controller nodes and move the {Ceph} daemons to a set of target nodes that are
already part of the {CephCluster} cluster.

include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]

include::assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]

include::../modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]

include::assembly_migrating-ceph-rgw.adoc[leveloffset=+1]

include::assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
Original file line number Diff line number Diff line change
@@ -4,10 +4,6 @@

= Migrating the monitoring stack component to new nodes within an existing {Ceph} cluster

In the context of data plane adoption, where the {rhos_prev_long} ({OpenStackShort}) services are
redeployed in {OpenShift}, a {OpenStackPreviousInstaller}-deployed {CephCluster} cluster will undergo a migration in a process we are calling “externalizing” the {CephCluster} cluster.
There are two deployment topologies, broadly, that include an “internal” {CephCluster} cluster today: one is where {OpenStackShort} includes dedicated {CephCluster} nodes to host object storage daemons (OSDs), and the other is Hyperconverged Infrastructure (HCI) where Compute nodes
double up as {CephCluster} nodes. In either scenario, there are some {Ceph} processes that are deployed on {OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha.
The Ceph Dashboard module adds web-based monitoring and administration to the
Ceph Manager.
With {OpenStackPreviousInstaller}-deployed {Ceph} this component is enabled as part of the overcloud deploy and it’s composed by:
39 changes: 28 additions & 11 deletions docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
@@ -4,22 +4,39 @@

= Migrating Red Hat Ceph Storage RBD to external RHEL nodes

For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are running {Ceph} version 6 or later, you must migrate the daemons that are included in the {rhos_prev_long} control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.
For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are
running {Ceph} version 6 or later, you must migrate the daemons that are
included in the {rhos_prev_long} control plane into the existing external Red
Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include
the Compute nodes for an HCI environment or dedicated storage nodes.

To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must meet the following requirements:
To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must
meet the following requirements:

* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
* NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based deployment to cephadm. For more information, see xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha cluster].
* Both the {Ceph} public and cluster networks are propagated, with {OpenStackPreviousInstaller}, to the target nodes.
* Ceph MDS, Ceph Monitoring stack, Ceph MDS, Ceph RGW and other services have been migrated already to the target nodes;
* {Ceph} is running version 6 or later and is managed by Cephadm
* NFS Ganesha is migrated from a {OpenStackPreviousInstaller}-based
deployment to cephadm. For more information, see
xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha
cluster].
* Both the {Ceph} public and cluster networks are propagated, with
{OpenStackPreviousInstaller}, to the target nodes.
* Ceph MDS, Ceph Monitoring stack, Ceph RGW and other services are
migrated to the target nodes.
ifeval::["{build}" != "upstream"]
* The daemons distribution follows the cardinality constraints described in the doc link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations]
* The daemons distribution follows the cardinality constraints that are
described in link:https://access.redhat.com/articles/1548993[Red Hat Ceph
Storage: Supported configurations]
endif::[]
* The Ceph cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`
* The procedure keeps the mon IP addresses by moving them to the {Ceph} nodes
* Drain the existing Controller nodes
* Deploy additional monitors to the existing nodes, and promote them as
_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it.

During the procedure to migrate the Ceph Mon daemons, the following actions
occur:

* the mon IP addresses are moved to the target {Ceph} nodes
* the existing Controller nodes are drained and decommissioned
* additional monitors are deployed to the target nodes, and they are promoted
as `_admin` nodes that can be used to manage the {CephCluster} cluster and
perform day 2 operations.
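
The health requirement listed above can be checked from a script before the migration starts. The following is a minimal sketch and not part of the documented procedure: the `health_is_ok` helper is a hypothetical name, and the usage line assumes that `ceph health` is reachable through `cephadm shell` on the node where you run it.

```shell
# Hypothetical helper: succeeds only when the status text passed as the
# first argument starts with HEALTH_OK.
health_is_ok() {
    case "$1" in
        HEALTH_OK*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage on a cluster node (assumes the ceph CLI is available):
#   health_is_ok "$(sudo cephadm shell -- ceph health)" || echo "Cluster is not healthy"
```

Passing the status text as an argument keeps the check separate from the way the status is collected, so the same helper works with output captured from any node.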

include::../modules/proc_migrating-mgr-from-controller-nodes.adoc[leveloffset=+1]

2 changes: 0 additions & 2 deletions docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
@@ -11,8 +11,6 @@ To migrate Ceph Object Gateway (RGW), your environment must meet the following r
* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
* An undercloud is still available, and the nodes and networks are managed by {OpenStackPreviousInstaller}.

include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]

include::../modules/proc_completing-prerequisites-for-migrating-ceph-rgw.adoc[leveloffset=+1]

include::../modules/proc_migrating-the-rgw-backends.adoc[leveloffset=+1]
15 changes: 0 additions & 15 deletions docs_user/assemblies/ceph_migration.adoc

This file was deleted.

8 changes: 1 addition & 7 deletions docs_user/main.adoc
@@ -24,10 +24,4 @@ include::assemblies/assembly_adopting-the-data-plane.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]

include::modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-rgw.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
include::assemblies/assembly_migrating-ceph-cluster.adoc[leveloffset=+1]
26 changes: 14 additions & 12 deletions docs_user/modules/con_ceph-daemon-cardinality.adoc
@@ -2,19 +2,19 @@

= {Ceph} daemon cardinality

{Ceph} 6 and later applies strict constraints in the way daemons can be colocated within the same node.
{Ceph} 6 and later applies strict constraints in the way daemons can be
colocated within the same node.
ifeval::["{build}" != "upstream"]
For more information, see link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations].
endif::[]
The resulting topology depends on the available hardware, as well as the amount of {Ceph} services present in the Controller nodes which are going to be retired.
ifeval::["{build}" != "upstream"]
For more information about the procedure that is required to migrate the RGW component and keep an HA model using the Ceph ingress daemon, see link:{defaultCephURL}/object_gateway_guide/index#high-availability-for-the-ceph-object-gateway[High availability for the Ceph Object Gateway] in _Object Gateway Guide_.
endif::[]
ifeval::["{build}" != "downstream"]
The following document describes the procedure required to migrate the RGW component (and keep an HA model using the https://docs.ceph.com/en/latest/cephadm/services/rgw/#high-availability-service-for-rgw[Ceph Ingress daemon] in a common {OpenStackPreviousInstaller} scenario where Controller nodes represent the
https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/tasks/rgw.yaml#L26-L30[spec placement] where the service is deployed.
endif::[]
As a general rule, the number of services that can be migrated depends on the number of available nodes in the cluster. The following diagrams cover the distribution of the {Ceph} daemons on the {Ceph} nodes where at least three nodes are required in a scenario that sees only RGW and RBD, without the {dashboard_first_ref}:
The resulting topology depends on the available hardware, as well as the number
of {Ceph} services present in the Controller nodes that are going to be
retired.
As a general rule, the number of services that can be migrated depends on the
number of available nodes in the cluster. The following diagrams cover the
distribution of the {Ceph} daemons on the {Ceph} nodes where at least three
nodes are required in a scenario that includes only RGW and RBD, without the
{Ceph} Dashboard:

----
| | | |
@@ -24,7 +24,8 @@ As a general rule, the number of services that can be migrated depends on the nu
| osd | mon/mgr/crash | rgw/ingress |
----

With the {dashboard}, and without {rhos_component_storage_file_first_ref} at least four nodes are required. The {dashboard} has no failover:
With the {dashboard}, and without {rhos_component_storage_file_first_ref}, at
least 4 nodes are required. The {Ceph} dashboard has no failover:

----
| | | |
@@ -35,7 +36,8 @@ With the {dashboard}, and without {rhos_component_storage_file_first_ref} at lea
| osd | rgw/ingress | (free) |
----
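
The minimum node counts that this section gives for each scenario can be captured in a small lookup. This is only an illustrative sketch: the `min_ceph_nodes` function and its `yes`/`no` arguments are hypothetical, and the combination that the section does not describe (file storage service without the dashboard) is deliberately reported as unknown rather than guessed.

```shell
# Hypothetical helper.
# Usage: min_ceph_nodes <dashboard: yes|no> <file_storage: yes|no>
# Prints the minimum number of nodes stated in this section.
min_ceph_nodes() {
    dashboard="$1"
    file_storage="$2"
    if [ "$dashboard" = "yes" ] && [ "$file_storage" = "yes" ]; then
        echo 5   # dashboard and file storage service
    elif [ "$dashboard" = "yes" ] && [ "$file_storage" = "no" ]; then
        echo 4   # dashboard, no file storage service
    elif [ "$dashboard" = "no" ] && [ "$file_storage" = "no" ]; then
        echo 3   # only RGW and RBD
    else
        echo "unknown"   # combination not covered by this section
        return 1
    fi
}
```

For example, `min_ceph_nodes yes no` prints `4`.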

With the {dashboard} and the {rhos_component_storage_file}, 5 nodes minimum are required, and the {dashboard} has no failover:
With the {Ceph} dashboard and the {rhos_component_storage_file}, 5 nodes
minimum are required, and the {Ceph} dashboard has no failover:

----
| | | |
68 changes: 34 additions & 34 deletions docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
@@ -1,11 +1,16 @@
[id="migrating-mgr-from-controller-nodes_{context}"]
= Migrating Ceph Manager daemons to {Ceph} nodes

= Migrating Ceph Mgr daemons to {Ceph} nodes
The following section describes how to move Ceph Manager daemons from the
{rhos_prev_long} Controller nodes to a set of target nodes. Target nodes might
be pre-existing {Ceph} nodes, or {OpenStackShort} Compute nodes if {Ceph} is
deployed by {OpenStackPreviousInstaller} with an HCI topology.
This procedure assumes that Cephadm and the {Ceph} Orchestrator are the tools
that drive the Ceph Manager migration. As done with the other Ceph daemons
(MDS, Monitoring and RGW), the procedure uses the Ceph spec to modify the
placement and reschedule the daemons. Ceph Manager runs in an active/passive
fashion and is also responsible for providing many modules, including the
orchestrator.

The following section describes how to move Ceph Mgr daemons from the
OpenStack controller nodes to a set of target nodes. Target nodes might be
pre-existing {Ceph} nodes, or OpenStack Compute nodes if Ceph is deployed by
{OpenStackPreviousInstaller} with an HCI topology.

.Prerequisites

@@ -17,57 +22,50 @@ you do not have to run a stack update.

.Procedure

This procedure assumes that cephadm and the orchestrator are the tools that
drive the Ceph Mgr migration. As done with the other Ceph daemons (MDS,
Monitoring and RGW), the procedure uses the Ceph spec to modify the placement
and reschedule the daemons. Ceph Mgr is run in an active/passive fashion, and
it's also responsible to provide many modules, including the orchestrator.
. Before you start the migration, use SSH to log in to the target node and enable
the firewall rules that are required to reach a Ceph Manager service.

. Before start the migration, ssh into the target node and enable the firewall
rules required to reach a Mgr service.
[source,bash]
+
----
dports="6800:7300"
ssh heat-admin@<target_node> sudo iptables -I INPUT \
-p tcp --match multiport --dports $dports -j ACCEPT;
----

[NOTE]
+
Repeat the previous action for each target_node.
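
Repeating the firewall step for several target nodes can be scripted. The following sketch only prints the command for each node (a dry run) so that you can review it before running anything. The `fw_cmd` and `print_fw_cmds` helpers and the node names in the usage line are hypothetical. The port range and the `iptables` arguments are the ones used in the step above.

```shell
# Port range used by the firewall rule in the step above.
dports="6800:7300"

# Hypothetical helper: print the firewall command for one target node.
fw_cmd() {
    printf 'ssh heat-admin@%s sudo iptables -I INPUT -p tcp --match multiport --dports %s -j ACCEPT\n' "$1" "$dports"
}

# Hypothetical helper: print one command per target node passed as an argument.
print_fw_cmds() {
    for node in "$@"; do
        fw_cmd "$node"
    done
}

# Example usage with placeholder hostnames:
#   print_fw_cmds cephstorage-0 cephstorage-1 cephstorage-2
```

Printing the commands instead of running them keeps the sketch free of side effects. After you review the output, you can pipe it to `sh`.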

. Check the rules are properly applied and persist them:
. Check that the rules are properly applied and persist them:
+
[source,bash]
----
sudo iptables-save
sudo systemctl restart iptables
----
+
. Prepare the target node to host the new Ceph Mgr daemon, and add the `mgr`
. Prepare the target node to host the new Ceph Manager daemon, and add the `mgr`
label to the target node:
+
[source,bash]
----
ceph orch host label add <target_node> mgr
----
+
- Replace <target_node> with the hostname of the hosts listed in the {Ceph}
through the `ceph orch host ls` command.

Repeat this action for each node that will be host a Ceph Mgr daemon.
* Replace `<target_node>` with the hostname of a host in the {CephCluster}
cluster, as listed by the `ceph orch host ls` command.
Get the Ceph Mgr spec and update the `placement` section to use `label` as the
main scheduling strategy.
Repeat the actions described above for each `<target_node>` that will host a
Ceph Manager daemon.

. Get the Ceph Mgr spec:
. Get the Ceph Manager spec:
+
[source,bash]
----
sudo cephadm shell -- ceph orch ls --export mgr > mgr.yaml
----

.Edit the retrieved spec and add the `label: mgr` section:
. Edit the retrieved spec and add the `label: mgr` section to the `placement`
section:
+
[source,yaml]
----
@@ -78,23 +76,25 @@ placement:
----

. Save the spec in `/tmp/mgr.yaml`
. Apply the spec with cephadm using the orchestrator:
. Apply the spec with cephadm by using the orchestrator:
+
----
sudo cephadm shell -m /tmp/mgr.yaml -- ceph orch apply -i /mnt/mgr.yaml
----

According to the number of nodes where the `mgr` label is added, you will see a
Ceph Mgr daemon count that matches the number of hosts.
As a result of this procedure, you see a Ceph Manager daemon count that matches
the number of hosts where the `mgr` label is added.

. Verify new Ceph Mgr have been created in the target_nodes:
. Verify that the new Ceph Manager daemons are created in the target nodes:
+
----
ceph orch ps | grep -i mgr
ceph -s
----
+
[NOTE]
The procedure does not shrink the Ceph Mgr daemons: the count is grown by the
number of target nodes, and the xref:migrating-mon-from-controller-nodes[Ceph Mon migration procedure]
will decommission the stand-by Ceph Mgr instances.
The procedure does not shrink the Ceph Manager daemons. The count is grown by
the number of target nodes, and migrating Ceph Monitor daemons to {Ceph} nodes
decommissions the stand-by Ceph Manager instances. For more information, see
xref:migrating-mon-from-controller-nodes_migrating-ceph-rbd[Migrating Ceph Monitor
daemons to {Ceph} nodes].
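
To compare the Ceph Manager daemon count with the number of hosts that carry the `mgr` label, the verification output can be counted instead of read by eye. This is a minimal sketch: the `count_mgr_daemons` helper is hypothetical, and it assumes that the first column of the `ceph orch ps` output holds daemon names that start with `mgr.` for Ceph Manager daemons.

```shell
# Hypothetical helper: read `ceph orch ps` output on stdin and print the
# number of lines whose first column is a mgr daemon name.
count_mgr_daemons() {
    awk '$1 ~ /^mgr\./ { n++ } END { print n + 0 }'
}

# Example usage on a cluster node (assumes the ceph CLI is available):
#   sudo cephadm shell -- ceph orch ps | count_mgr_daemons
```

Because the procedure grows the daemon count rather than shrinking it, expect the result to include the Ceph Manager daemons that still run on the Controller nodes.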