Reorganize Ceph assemblies
The Ceph assemblies are reorganized to follow a simple
rule and structure.
A main ceph-cluster migration assembly is included in main, and it
contains a quick intro and the (ordered) list of procedures (including
the cardinality section that is critical here and will be improved in a
follow up patch).
This way the ceph doc is very easy to access and maintain. There are
also fixes to wrong references (e.g. horizon != Ceph dashboard).

Signed-off-by: Francesco Pantano <[email protected]>
fmount committed May 28, 2024
1 parent 09ef890 commit 249528b
Showing 7 changed files with 86 additions and 50 deletions.
44 changes: 44 additions & 0 deletions docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
@@ -0,0 +1,44 @@
ifdef::context[:parent-context: {context}]

[id="ceph-migration_{context}"]

= Migrating the {CephCluster} Cluster

:context: migrating-ceph

:toc: left
:toclevels: 3

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]

In the context of data plane adoption, where the {rhos_prev_long}
({OpenStackShort}) services are redeployed in {OpenShift}, you migrate a
{OpenStackPreviousInstaller}-deployed {CephCluster} cluster by using a process
called “externalizing” the {CephCluster} cluster.

There are two deployment topologies that include an internal {CephCluster}
cluster:

* {OpenStackShort} includes dedicated {CephCluster} nodes to host object
storage daemons (OSDs)

* Hyperconverged Infrastructure (HCI), where Compute and Storage services are
colocated on hyperconverged nodes

In either scenario, there are some {Ceph} processes that are deployed on
{OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW),
Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS
Ganesha. To migrate your {CephCluster} cluster, you must decommission the
Controller nodes and move the {Ceph} daemons to a set of target nodes that are
already part of the {CephCluster} cluster.

include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]

include::assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]

include::../modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]

include::assembly_migrating-ceph-rgw.adoc[leveloffset=+1]

include::assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,6 @@

= Migrating the monitoring stack component to new nodes within an existing {Ceph} cluster

In the context of data plane adoption, where the {rhos_prev_long} ({OpenStackShort}) services are
redeployed in {OpenShift}, a {OpenStackPreviousInstaller}-deployed {CephCluster} cluster will undergo a migration in a process we are calling “externalizing” the {CephCluster} cluster.
There are two deployment topologies, broadly, that include an “internal” {CephCluster} cluster today: one is where {OpenStackShort} includes dedicated {CephCluster} nodes to host object storage daemons (OSDs), and the other is Hyperconverged Infrastructure (HCI) where Compute nodes
double up as {CephCluster} nodes. In either scenario, there are some {Ceph} processes that are deployed on {OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha.
The Ceph Dashboard module adds web-based monitoring and administration to the
Ceph Manager.
With {OpenStackPreviousInstaller}-deployed {Ceph}, this component is enabled as part of the overcloud deployment and it is composed of:
37 changes: 27 additions & 10 deletions docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
Expand Up @@ -4,22 +4,39 @@

= Migrating Red Hat Ceph Storage RBD to external RHEL nodes

For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are running {Ceph} version 6 or later, you must migrate the daemons that are included in the {rhos_prev_long} control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.
For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are
running {Ceph} version 6 or later, you must migrate the daemons that are
included in the {rhos_prev_long} control plane into the existing external Red
Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include
the Compute nodes for an HCI environment or dedicated storage nodes.

To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must meet the following requirements:
To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must
meet the following requirements:

* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
* NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based deployment to cephadm. For more information, see xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha cluster].
* Both the {Ceph} public and cluster networks are propagated, with {OpenStackPreviousInstaller}, to the target nodes.
* Ceph MDS, Ceph Monitoring stack, Ceph RGW and other services have already been migrated to the target nodes.
* NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based
deployment to cephadm. For more information, see
xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha
cluster].
* Both the {Ceph} public and cluster networks are propagated, with
{OpenStackPreviousInstaller}, to the target nodes.
* Ceph MDS, Ceph Monitoring stack, Ceph RGW and other services have
already been migrated to the target nodes.
ifeval::["{build}" != "upstream"]
* The daemons distribution follows the cardinality constraints described in the doc link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations]
* The daemons distribution follows the cardinality constraints described in the
doc link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage:
Supported configurations]
endif::[]
* The Ceph cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`
* The procedure keeps the mon IP addresses by moving them to the {Ceph} nodes
* Drain the existing Controller nodes
* Deploy additional monitors to the existing nodes, and promote them as
_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it.

The high-level procedure that migrates the Ceph Mon daemons is based on the
following assumptions:

* It keeps the mon IP addresses by moving them to the target {Ceph} nodes
* It drains the existing Controller nodes that are supposed to be decommissioned
* It deploys additional monitors to the target nodes, and promotes them as
`_admin` nodes that administrators can use to manage the {CephCluster} cluster
and perform day 2 operations.
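
The requirement that `ceph -s` returns `HEALTH_OK` can be checked from a script before the migration starts. The following is a minimal sketch, not part of the documented procedure. It assumes that the JSON form of the status output (for example, from `ceph -s --format json`) reports the overall state under `health.status`; verify that field path against your {Ceph} release. The function name `cluster_is_healthy` and the sample document are hypothetical.

```python
import json


def cluster_is_healthy(status_json: str) -> bool:
    """Return True only when the reported overall health is HEALTH_OK.

    Assumption: the status document exposes the state at health.status.
    Any other value, or a missing field, is treated as not healthy.
    """
    status = json.loads(status_json)
    return status.get("health", {}).get("status") == "HEALTH_OK"


# Hand-written, trimmed sample shaped like the assumed JSON status output.
sample = '{"health": {"status": "HEALTH_OK", "checks": {}}}'
print(cluster_is_healthy(sample))
```

Treating a missing field as unhealthy is a deliberate choice in this sketch: the migration should not proceed when the health state cannot be determined.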

include::../modules/proc_migrating-mgr-from-controller-nodes.adoc[leveloffset=+1]

2 changes: 0 additions & 2 deletions docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
Expand Up @@ -11,8 +11,6 @@ To migrate Ceph Object Gateway (RGW), your environment must meet the following r
* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
* An undercloud is still available, and the nodes and networks are managed by {OpenStackPreviousInstaller}.

include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]

include::../modules/proc_completing-prerequisites-for-migrating-ceph-rgw.adoc[leveloffset=+1]

include::../modules/proc_migrating-the-rgw-backends.adoc[leveloffset=+1]
15 changes: 0 additions & 15 deletions docs_user/assemblies/ceph_migration.adoc

This file was deleted.

8 changes: 1 addition & 7 deletions docs_user/main.adoc
Expand Up @@ -24,10 +24,4 @@ include::assemblies/assembly_adopting-the-data-plane.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]

include::modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-rgw.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
include::assemblies/assembly_migrating-ceph-cluster.adoc[leveloffset=+1]
26 changes: 14 additions & 12 deletions docs_user/modules/con_ceph-daemon-cardinality.adoc
Expand Up @@ -2,19 +2,19 @@

= {Ceph} daemon cardinality

{Ceph} 6 and later applies strict constraints in the way daemons can be colocated within the same node.
{Ceph} 6 and later applies strict constraints in the way daemons can be
colocated within the same node.
ifeval::["{build}" != "upstream"]
For more information, see link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations].
endif::[]
The resulting topology depends on the available hardware, as well as the amount of {Ceph} services present in the Controller nodes which are going to be retired.
ifeval::["{build}" != "upstream"]
For more information about the procedure that is required to migrate the RGW component and keep an HA model using the Ceph ingress daemon, see link:{defaultCephURL}/object_gateway_guide/index#high-availability-for-the-ceph-object-gateway[High availability for the Ceph Object Gateway] in _Object Gateway Guide_.
endif::[]
ifeval::["{build}" != "downstream"]
The following document describes the procedure required to migrate the RGW component (and keep an HA model using the https://docs.ceph.com/en/latest/cephadm/services/rgw/#high-availability-service-for-rgw[Ceph Ingress daemon]) in a common {OpenStackPreviousInstaller} scenario where Controller nodes represent the
https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/tasks/rgw.yaml#L26-L30[spec placement] where the service is deployed.
endif::[]
As a general rule, the number of services that can be migrated depends on the number of available nodes in the cluster. The following diagrams cover the distribution of the {Ceph} daemons on the {Ceph} nodes where at least three nodes are required in a scenario that sees only RGW and RBD, without the {dashboard_first_ref}:
The resulting topology depends on the available hardware, as well as the number
of {Ceph} services present on the Controller nodes that are going to be
retired.
As a general rule, the number of services that can be migrated depends on the
number of available nodes in the cluster. The following diagrams cover the
distribution of the {Ceph} daemons on the {Ceph} nodes where at least three
nodes are required in a scenario that sees only RGW and RBD, without the
{Ceph} Dashboard:

----
| | | |
Expand All @@ -24,7 +24,8 @@ As a general rule, the number of services that can be migrated depends on the nu
| osd | mon/mgr/crash | rgw/ingress |
----

With the {dashboard}, and without {rhos_component_storage_file_first_ref} at least four nodes are required. The {dashboard} has no failover:
With the {dashboard} and without {rhos_component_storage_file_first_ref}, at
least four nodes are required. The {Ceph} dashboard has no failover:

----
| | | |
Expand All @@ -35,7 +36,8 @@ With the {dashboard}, and without {rhos_component_storage_file_first_ref} at lea
| osd | rgw/ingress | (free) |
----
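
The minimum node counts that this module states for each service combination (three, four, and five nodes) can be summarized as a small lookup. The following is an illustrative sketch only. The function name `min_ceph_nodes` and its parameters are hypothetical, and `file_storage_service` stands for the service that the `{rhos_component_storage_file}` attribute resolves to. The combination that the module does not describe returns `None` instead of a guessed value.

```python
from typing import Optional


def min_ceph_nodes(dashboard: bool, file_storage_service: bool) -> Optional[int]:
    """Return the minimum number of Ceph nodes stated for a scenario.

    Only the three combinations described in the cardinality module are
    covered; any other combination returns None.
    """
    if not dashboard and not file_storage_service:
        return 3  # only RGW and RBD, without the dashboard
    if dashboard and not file_storage_service:
        return 4  # dashboard has no failover
    if dashboard and file_storage_service:
        return 5  # dashboard has no failover
    return None  # file storage service without dashboard: not described


for scenario in [(False, False), (True, False), (True, True), (False, True)]:
    print(scenario, min_ceph_nodes(*scenario))
```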

With the {dashboard} and the {rhos_component_storage_file}, 5 nodes minimum are required, and the {dashboard} has no failover:
With the {Ceph} dashboard and the {rhos_component_storage_file}, a minimum of
five nodes is required, and the {Ceph} dashboard has no failover:

----
| | | |
