From b23ac58ee8d951ac9df507907af6d70eb22a1c34 Mon Sep 17 00:00:00 2001 From: Goutham Pacha Ravi Date: Fri, 15 Mar 2024 16:27:26 -0700 Subject: [PATCH 1/2] Add CephFS-via-NFS migration guide This setup concerns an OpenStack-adjacent data plane service called "ceph-nfs". On RHOSP 17/Wallaby, this service encapsulates an NFS-Ganesha server and a VIP and is orchestrated by pacemaker and systemd. When adopting to RHOSP 18, we no longer support deploying this service on RHOSP. We document how deployers can replace this service with a new clustered NFS service natively supported on Ceph. Implements: OSPRH-2238 --- .../openstack-ceph_backend_configuration.adoc | 59 ++++- .../modules/openstack-manila_adoption.adoc | 226 +++++++++++++++++- docs_user/modules/openstack-rolling_back.adoc | 11 + .../openstack-stop_openstack_services.adoc | 41 +++- 4 files changed, 317 insertions(+), 20 deletions(-) diff --git a/docs_user/modules/openstack-ceph_backend_configuration.adoc b/docs_user/modules/openstack-ceph_backend_configuration.adoc index 3575acfb7..cc399c047 100644 --- a/docs_user/modules/openstack-ceph_backend_configuration.adoc +++ b/docs_user/modules/openstack-ceph_backend_configuration.adoc @@ -66,10 +66,8 @@ EOF The content of the file should look something like this: -____ [source,yaml] ---- ---- apiVersion: v1 kind: Secret metadata: @@ -80,17 +78,17 @@ stringData: [client.openstack] key = caps mgr = "allow *" - caps mon = "profile rbd" - caps osd = "profile rbd pool=images" + caps mon = "allow r, profile rbd" + caps osd = "pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data' ceph.conf: | [global] fsid = 7a1719e8-9c59-49e2-ae2b-d7eb08c695d4 mon_host = 10.1.1.2,10.1.1.3,10.1.1.4 ---- -____ Configure `extraMounts` within the `OpenStackControlPlane` CR: +[source,yaml] ---- oc patch openstackcontrolplane openstack --type=merge --patch ' spec: @@ -122,6 +120,57 @@ spec: Configuring some OpenStack services to use Ceph backend may require the FSID value. You can fetch the value from the config like so: +[source,bash] ---- CEPH_FSID=$(oc get secret ceph-conf-files -o json | jq -r '.data."ceph.conf"' | base64 -d | grep fsid | sed -e 's/fsid = //') ---- + +[id="creating-a-ceph-nfs-cluster_{context}"] +== Creating a Ceph NFS cluster + +If you use the Ceph via NFS backend with OpenStack Manila, prior to adoption, +you must create a new clustered NFS service on the Ceph cluster. This service +will replace the standalone, pacemaker-controlled `ceph-nfs` service that was +used on Red Hat OpenStack Platform 17.1. + +* You may identify a subset of the ceph nodes to deploy the new clustered NFS +service. +* This cluster must be deployed on the `StorageNFS` isolated network so that +it is easier for clients to mount their existing shares through the new NFS +export locations. Replace the ``{{ VIP }}`` in the following example with an +IP address from the `StorageNFS` isolated network +* You can pick an appropriate size for the NFS cluster. The NFS service +provides active/active high availability when the cluster size is more than +one node. It is recommended that the ``{{ cluster_size }}`` is at least one +less than the number of hosts identified. This solution has been well tested +with a 3-node NFS cluster. +* The `ingress-mode` argument must be set to ``haproxy-protocol``. No other +ingress-mode will be supported. This ingress mode will allow enforcing client +restrictions through OpenStack Manila. 
+
+* For more information on deploying the clustered Ceph NFS service, see the
+link:https://docs.ceph.com/en/latest/cephadm/services/nfs/[ceph orchestrator
+documentation]
+
+[source,bash]
+----
+$CEPH_SSH cephadm shell
+
+# wait for shell to come up, then execute:
+ceph orch host ls
+
+# Identify the hosts that can host the NFS service.
+# Repeat the following command to label each host identified:
+ceph orch host label add <hostname> nfs
+
+# Set the appropriate {{ cluster_size }} and {{ VIP }}:
+ceph nfs cluster create cephfs \
+    “{{ cluster_size }} label:nfs” \
+    --ingress \
+    --virtual-ip={{ VIP }} \
+    --ingress-mode=haproxy-protocol
+
+# Check the status of the nfs cluster with these commands:
+ceph nfs cluster ls
+ceph nfs cluster info cephfs
+----
diff --git a/docs_user/modules/openstack-manila_adoption.adoc b/docs_user/modules/openstack-manila_adoption.adoc
index b5c35a3f1..9d9b513a2 100644
--- a/docs_user/modules/openstack-manila_adoption.adoc
+++ b/docs_user/modules/openstack-manila_adoption.adoc
@@ -36,29 +36,83 @@ For example, if a Ceph cluster was used in the deployment, the "storage"
 network refers to the Ceph cluster's public network, and Manila's share
 manager service needs to be able to reach it.
 
+== Changes to CephFS via NFS
+
+If the Red Hat OpenStack Platform 17.1 deployment uses CephFS via NFS as a
+backend for Manila, a `ceph-nfs` service is deployed and managed by Director
+on the RHOSP controller nodes. This service cannot be directly imported into
+RHOSP 18. On RHOSP 18, Manila only supports using a "clustered" NFS service
+that is directly managed on the Ceph cluster. Adoption with this service will
+therefore involve a data path disruption to existing NFS clients. The timing
+of this disruption can be controlled by the deployer independent of this
+adoption procedure.
+
+On RHOSP 17.1, pacemaker controls the high availability of the `ceph-nfs`
+service. This service is assigned a Virtual IP (VIP) address that is also
+managed by pacemaker. The VIP is typically created on an isolated `StorageNFS`
+network. There are ordering and collocation constraints established between
+this VIP, `ceph-nfs` and Manila's share manager service on the
+controller nodes. Prior to adopting Manila, pacemaker's ordering and
+collocation constraints must be adjusted to separate the share manager service.
+This establishes `ceph-nfs` with its VIP as an isolated, standalone NFS service
+that can be decommissioned at will after completing the OpenStack adoption.
+
+Red Hat Ceph Storage 7.0 introduced a native `clustered Ceph NFS service`. This
+service has to be deployed on the Ceph cluster using the Ceph orchestrator
+prior to adopting Manila. This NFS service will eventually replace the
+standalone NFS service from RHOSP 17.1 in your deployment. When manila is
+adopted into the RHOSP 18 environment, it will establish all the existing
+exports and client restrictions on the new clustered Ceph NFS service. Clients
+can continue to read and write data on their existing NFS shares, and are not
+affected until the old standalone NFS service is decommissioned. This
+switchover window allows clients to re-mount the same share from the new
+clustered Ceph NFS service during a scheduled downtime.
+
+In order to ensure that existing clients can easily switchover to the new NFS
+service, it is necessary that the clustered Ceph NFS service is assigned an
+IP address from the same isolated `StorageNFS` network. 
Doing this will ensure +that NFS users aren't expected to make any networking changes to their +existing workloads. These users only need to discover and re-mount their shares +using new export paths. When the adoption procedure is complete, OpenStack +users can query Manila's API to list the export locations on existing shares to +identify the `preferred` paths to mount these shares. These `preferred` paths +will correspond to the new clustered Ceph NFS service in contrast to other +non-preferred export paths that continue to be displayed until the old +isolated, standalone NFS service is decommissioned. + +See xref:creating-a-ceph-nfs-cluster_{context}[Creating a Ceph NFS cluster] +for instructions on setting up a clustered NFS service. + == Prerequisites -* Ensure that manila systemd services (api, cron, scheduler) are +* Ensure that manila systemd services (`api`, `cron`, `scheduler`) are stopped. For more information, see xref:stopping-openstack-services_{context}[Stopping OpenStack services]. -* Ensure that manila pacemaker services ("openstack-manila-share") are +* If the deployment uses CephFS via NFS as a storage backend, ensure that +pacemaker ordering and collocation constraints are adjusted. For more +information, see xref:stopping-openstack-services_{context}[Stopping OpenStack services]. +* Ensure that manila's pacemaker service (`openstack-manila-share`) is stopped. For more information, see xref:stopping-openstack-services_{context}[Stopping OpenStack services]. * Ensure that the database migration has completed. For more information, see xref:migrating-databases-to-mariadb-instances_{context}[Migrating databases to MariaDB instances]. * Ensure that OpenShift nodes where `manila-share` service will be deployed can reach the management network that the storage system is in. +* If the deployment uses CephFS via NFS as a storage backend, ensure that +a new clustered Ceph NFS service is deployed on the Ceph cluster with the help +of Ceph orchestrator. For more information, see +xref:creating-a-ceph-nfs-cluster_{context}[Creating a Ceph NFS cluster]. * Ensure that services such as keystone and memcached are available prior to adopting manila services. * If tenant-driven networking was enabled (`driver_handles_share_servers=True`), -ensure that neutron has been deployed prior to -adopting manila services. +ensure that neutron has been deployed prior to adopting manila services. == Procedure - Manila adoption === Copying configuration from the RHOSP 17.1 deployment Define the `CONTROLLER1_SSH` environment variable, if it link:stop_openstack_services.md#variables[hasn't been -defined] already. Then copy the -configuration file from RHOSP 17.1 for reference. +defined] already. Then copy the configuration file from RHOSP 17.1 for +reference. +[source,bash] ---- $CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/manila/etc/manila/manila.conf | awk '!/^ *#/ && NF' > ~/manila.conf ---- @@ -73,8 +127,8 @@ environment: (`[database]`), service authentication (`auth_strategy`, `[keystone_authtoken]`), message bus configuration (`transport_url`, `control_exchange`), the default paste config -(`api_paste_config`) and inter-service communication configuration (` -[neutron]`, `[nova]`, `[cinder]`, `[glance]` `[oslo_messaging_*]`). So +(`api_paste_config`) and inter-service communication configuration ( +`[neutron]`, `[nova]`, `[cinder]`, `[glance]` `[oslo_messaging_*]`). So all of these can be ignored. * Ignore the `osapi_share_listen` configuration. 
In RHOSP 18, you rely on OpenShift routes and ingress. @@ -116,6 +170,8 @@ file called `policy.yaml`. path: policy.yaml ---- +* You must preserve the value of the `host` option under the `[DEFAULT]` +section as `hostgroup`. * The Manila API service needs the `enabled_share_protocols` option to be added in the `customServiceConfig` section in `manila: template: manilaAPI`. * If you had scheduler overrides, add them to the `customServiceConfig` @@ -149,6 +205,7 @@ using custom container images. [DEFAULT] debug = true enabled_share_backends = netapp + host = hostgroup [netapp] driver_handles_share_servers = False share_backend_name = netapp @@ -161,6 +218,7 @@ using custom container images. [DEFAULT] debug = true enabled_share_backends=pure-1 + host = hostgroup [pure-1] driver_handles_share_servers = False share_backend_name = pure-1 @@ -175,6 +233,7 @@ using custom container images. usernames, it is recommended to use OpenShift secrets, and the `customServiceConfigSecrets` key. An example: +[source,yaml] ---- cat << __EOF__ > ~/netapp_secrets.conf @@ -186,6 +245,10 @@ netapp_password = secret_netapp_password netapp_vserver = mydatavserver __EOF__ +---- + +[source,bash] +--- oc create secret generic osp-secret-manila-netapp --from-file=~/netapp_secrets.conf -n openstack ---- @@ -205,6 +268,7 @@ config example using the secret you created above. [DEFAULT] debug = true enabled_share_backends = netapp + host = hostgroup [netapp] driver_handles_share_servers = False share_backend_name = netapp @@ -230,6 +294,9 @@ count of the `manilaShares` service/s to 1. * Ensure that the appropriate storage management network is specified in the `manilaShares` section. The example below connects the `manilaShares` instance with the CephFS backend driver to the `storage` network. +* Prior to adopting the `manilaShares` service for CephFS via NFS, ensure that +you have a clustered Ceph NFS service created. You will need to provide the +name of the service as ``cephfs_nfs_cluster_id``. === Deploying the manila control plane @@ -270,9 +337,10 @@ spec: customServiceConfig: | [DEFAULT] enabled_share_backends = tripleo_ceph - [tripleo_ceph] + host = hostgroup + [cephfs] driver_handles_share_servers=False - share_backend_name=tripleo_ceph + share_backend_name=cephfs share_driver=manila.share.drivers.cephfs.driver.CephFSDriver cephfs_conf_path=/etc/ceph/ceph.conf cephfs_auth_id=openstack @@ -284,6 +352,65 @@ spec: __EOF__ ---- +Below is an example that uses CephFS via NFS. In this example: + +* The `cephfs_ganesha_server_ip` option is preserved from the configuration on +the old RHOSP 17.1 environment. +* The `cephfs_nfs_cluster_id` option is set with the name of the NFS cluster +created on Ceph. 
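If you want to double-check these two values before building the patch, the
following sketch may help. It assumes the new NFS cluster was created with the
name `cephfs` as in the earlier example, that the `ceph` commands run from a
`cephadm shell` on the Ceph cluster, and that the old configuration was copied
to `~/manila.conf` as described above; adjust the names to your environment:

[source,bash]
----
# Inside a cephadm shell: list the NFS clusters known to Ceph. The name
# reported here is the value to use for cephfs_nfs_cluster_id.
ceph nfs cluster ls

# Show details of the new cluster, including the ingress (virtual) IP,
# which should be an address on the StorageNFS network.
ceph nfs cluster info cephfs

# On the workstation where the RHOSP 17.1 configuration was copied, recover
# the old VIP that is preserved as cephfs_ganesha_server_ip.
grep cephfs_ganesha_server_ip ~/manila.conf
----

With those values confirmed, the complete control plane patch for this backend
can look like the following example: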
+ + +[source,yaml] +---- +cat << __EOF__ > ~/manila.patch +spec: + manila: + enabled: true + apiOverride: + route: {} + template: + databaseInstance: openstack + secret: osp-secret + manilaAPI: + replicas: 3 + customServiceConfig: | + [DEFAULT] + enabled_share_protocols = cephfs + override: + service: + internal: + metadata: + annotations: + metallb.universe.tf/address-pool: internalapi + metallb.universe.tf/allow-shared-ip: internalapi + metallb.universe.tf/loadBalancerIPs: 172.17.0.80 + spec: + type: LoadBalancer + manilaScheduler: + replicas: 3 + manilaShares: + cephfs: + replicas: 1 + customServiceConfig: | + [DEFAULT] + enabled_share_backends = cephfs + host = hostgroup + [cephfs] + driver_handles_share_servers=False + share_backend_name=tripleo_ceph + share_driver=manila.share.drivers.cephfs.driver.CephFSDriver + cephfs_conf_path=/etc/ceph/ceph.conf + cephfs_auth_id=openstack + cephfs_cluster_name=ceph + cephfs_protocol_helper_type=NFS + cephfs_nfs_cluster_id=cephfs + cephfs_ganesha_server_ip=172.17.5.47 + networkAttachments: + - storage +__EOF__ +---- + +[source,bash] ---- oc patch openstackcontrolplane openstack --type=merge --patch-file=~/manila.patch ---- @@ -292,16 +419,19 @@ oc patch openstackcontrolplane openstack --type=merge --patch-file=~/manila.patc === Inspect the resulting manila service pods +[source,bash] ---- oc get pods -l service=manila ---- === Check that Manila API service is registered in Keystone +[source,bash] ---- openstack service list | grep manila ---- +[source,bash] ---- openstack endpoint list | grep manila @@ -315,6 +445,7 @@ openstack endpoint list | grep manila Test the health of the service: +[source,bash] ---- openstack share service list openstack share pool list --detail @@ -322,6 +453,7 @@ openstack share pool list --detail Check on existing workloads: +[source,bash] ---- openstack share list openstack share snapshot list @@ -329,6 +461,80 @@ openstack share snapshot list You can create further resources: +[source,bash] ---- openstack share create cephfs 10 --snapshot mysharesnap --name myshareclone +openstack share create nfs 10 --name mynfsshare +openstack share export location list mynfsshare +---- + +== Decommissioning the old standalone Ceph NFS service + +If the deployment uses CephFS via NFS, you must inform your OpenStack users +that the old, standalone NFS service will be decommissioned. Users can discover +the new export locations for their pre-existing shares by querying Manila's API. +To stop using the old NFS server, they need to unmount and remount their +shared file systems on each client. If users are consuming Manila shares via +the Manila CSI plugin for OpenShift, this migration can be done by scaling down +the application pods and scaling them back up. Clients spawning new workloads +must be discouraged from using share exports via the old NFS service. Manila +will no longer communicate with the old NFS service, and so it cannot apply or +alter any export rules on the old NFS service. + +Since the old NFS service will no longer be supported by future software +upgrades, it is recommended that the decommissioning period is short. + +Once the old NFS service is no longer used, you can adjust the configuration +for the `manila-share` service to remove the `cephfs_ganesha_server_ip` option. +Doing this will restart the `manila-share` process and remove the export +locations that pertained to the old NFS service from all the shares. 
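Before the option is removed, every remaining client has to switch to an
export path served by the new clustered NFS service. A minimal sketch of that
switchover from a user's point of view is shown below; the share name
`mynfsshare` and the mount point `/mnt/mynfsshare` are illustrative only, and
the actual path must be taken from the share's own export locations:

[source,bash]
----
# As the OpenStack user that owns the share, list its export locations and
# note the path marked as preferred; it points at the new NFS service.
openstack share export location list mynfsshare

# On each client, unmount the old path and mount the preferred one during a
# scheduled maintenance window.
sudo umount /mnt/mynfsshare
sudo mount -t nfs <preferred-export-path> /mnt/mynfsshare
----

Once clients have re-mounted their shares, a patch along the following lines
removes `cephfs_ganesha_server_ip` from the `manila-share` configuration: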
+
+[source,yaml]
+----
+cat << __EOF__ > ~/manila.patch
+spec:
+  manila:
+    enabled: true
+    apiOverride:
+      route: {}
+    template:
+      manilaShares:
+        cephfs:
+          replicas: 1
+          customServiceConfig: |
+            [DEFAULT]
+            enabled_share_backends = cephfs
+            host = hostgroup
+            [cephfs]
+            driver_handles_share_servers=False
+            share_backend_name=cephfs
+            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
+            cephfs_conf_path=/etc/ceph/ceph.conf
+            cephfs_auth_id=openstack
+            cephfs_cluster_name=ceph
+            cephfs_protocol_helper_type=NFS
+            cephfs_nfs_cluster_id=cephfs
+          networkAttachments:
+          - storage
+__EOF__
+
+----
+
+[source,bash]
+----
+oc patch openstackcontrolplane openstack --type=merge --patch-file=~/manila.patch
+----
+
+To clean up the standalone `ceph-nfs` service from the old OpenStack control
+plane nodes, you can disable the pacemaker resources associated with the
+service and remove them from pacemaker's management. Replace `<VIP>` in the
+following commands with the IP address assigned to the `ceph-nfs` service in
+your environment.
+
+[source,bash]
+----
+sudo pcs resource disable ceph-nfs
+sudo pcs resource disable ip-<VIP>
+sudo pcs resource unmanage ceph-nfs
+sudo pcs resource unmanage ip-<VIP>
+----
+
diff --git a/docs_user/modules/openstack-rolling_back.adoc b/docs_user/modules/openstack-rolling_back.adoc
index 0a47d4f54..7bd83a992 100644
--- a/docs_user/modules/openstack-rolling_back.adoc
+++ b/docs_user/modules/openstack-rolling_back.adoc
@@ -136,6 +136,17 @@ for i in {1..3}; do
 done
 ----
 
+If the Ceph NFS service is running on the deployment as an OpenStack Manila
+backend, you must restore the pacemaker ordering and colocation constraints
+involving the "openstack-manila-share" service:
+
+---
+
+sudo pcs constraint order start ceph-nfs then openstack-manila-share kind=Optional id=order-ceph-nfs-openstack-manila-share-Optional
+sudo pcs constraint colocation add openstack-manila-share with ceph-nfs score=INFINITY id=colocation-openstack-manila-share-ceph-nfs-INFINITY
+
+---
+
 Now you can verify that the source cloud is operational again, e.g. by running
 `openstack` CLI commands or using the Horizon Dashboard.
 
diff --git a/docs_user/modules/openstack-stop_openstack_services.adoc b/docs_user/modules/openstack-stop_openstack_services.adoc
index edbc9e5c8..4063f4a2e 100644
--- a/docs_user/modules/openstack-stop_openstack_services.adoc
+++ b/docs_user/modules/openstack-stop_openstack_services.adoc
@@ -15,13 +15,22 @@ Some services are easy to stop because they only perform short asynchronous oper
 Since gracefully stopping all services is non-trivial and beyond the scope of this guide, the following procedure uses the force method and presents recommendations on how to check some things in the services.
 
-Note that you should not stop the infrastructure management services yet, such as database, RabbitMQ, and HAProxy Load Balancer, nor should you stop the
-Nova compute service, containerized modular libvirt daemons and Swift storage backend services.
+Note that you should not stop the infrastructure management services yet, such
+as:
+
+- database
+- RabbitMQ
+- HAProxy Load Balancer
+- ceph-nfs
+- Nova compute service
+- containerized modular libvirt daemons
+- Swift storage backend services.
 
 == Variables
 
 Define the shell variables used in the following steps. The values are illustrative and refer to a single node standalone director deployment. 
Use values that are correct for your environment: +[source,bash] ---- CONTROLLER1_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100" CONTROLLER2_SSH="" @@ -35,6 +44,7 @@ You can stop OpenStack services at any moment, but you might leave your environm Ensure that there are no ongoing instance live migrations, volume migrations (online or offline), volume creation, backup restore, attaching, detaching, etc. +[source,bash] ---- openstack server list --all-projects -c ID -c Status |grep -E '\| .+ing \|' openstack volume list --all-projects -c ID -c Status |grep -E '\| .+ing \|'| grep -vi error @@ -51,13 +61,34 @@ Also collect the services topology specific configuration before stopping servic You can stop OpenStack services at any moment, but you might leave your environment in an undesired state. You should confirm that there are no ongoing operations. 1. Connect to all the controller nodes. -2. Stop the control plane services. -3. Verify the control plane services are stopped. +2. Remove any constraints between infrastructure and OpenStack control plane +services. +3. Stop the control plane services. +4. Verify the control plane services are stopped. The cinder-backup service on OSP 17.1 could be running as Active-Passive under pacemaker or as Active-Active, so you must check how it is running and stop it. -These steps can be automated with a simple script that relies on the previously defined environmental variables and function: +If the deployment enables CephFS via NFS as a backend for the OpenStack Shared +File System service (Manila), there are pacemaker ordering and co-location +constraints that govern the Virtual IP address assigned to the `ceph-nfs` +service, the `ceph-nfs` service itself and `manila-share` service. +These constraints must be removed: + +[source,bash] +---- +# check the co-location and ordering constraints concerning "manila-share" +sudo pcs constraint list --full + +# remove these constraints +sudo pcs constraint remove colocation-openstack-manila-share-ceph-nfs-INFINITY +sudo pcs constraint remove order-ceph-nfs-openstack-manila-share-Optional +---- + +The following steps to disable OpenStack control plane services can be +automated with a simple script that relies on the previously defined +environmental variables and function: +[source,bash] ---- # Update the services list to be stopped From 4024705bfdcc5a740c26bb26c595ac2e000ae6be Mon Sep 17 00:00:00 2001 From: Goutham Pacha Ravi Date: Mon, 18 Mar 2024 16:33:54 -0700 Subject: [PATCH 2/2] Address review feedback and prettify o/p --- .../openstack-ceph_backend_configuration.adoc | 106 ++++++++++++++++-- .../modules/openstack-manila_adoption.adoc | 4 +- docs_user/modules/openstack-rolling_back.adoc | 4 +- 3 files changed, 99 insertions(+), 15 deletions(-) diff --git a/docs_user/modules/openstack-ceph_backend_configuration.adoc b/docs_user/modules/openstack-ceph_backend_configuration.adoc index cc399c047..af1d36e18 100644 --- a/docs_user/modules/openstack-ceph_backend_configuration.adoc +++ b/docs_user/modules/openstack-ceph_backend_configuration.adoc @@ -37,9 +37,13 @@ became far simpler and hence, more became more secure with RHOSP 18. * It is simpler to create a common ceph secret (keyring and ceph config file) and propagate the secret to all services that need it. +TIP: To run `ceph` commands, you must use SSH to connect to a Ceph +storage node and run `sudo cephadm shell`. 
This brings up a ceph orchestrator +container that allows you to run administrative commands against the ceph +cluster. If Director deployed the ceph cluster, you may launch the cephadm +shell from an OpenStack controller node. + ---- -$CEPH_SSH cephadm shell -# wait for shell to come up, then execute: ceph auth caps client.openstack \ mgr 'allow *' \ mon 'allow r, profile rbd' \ @@ -133,14 +137,93 @@ you must create a new clustered NFS service on the Ceph cluster. This service will replace the standalone, pacemaker-controlled `ceph-nfs` service that was used on Red Hat OpenStack Platform 17.1. -* You may identify a subset of the ceph nodes to deploy the new clustered NFS -service. -* This cluster must be deployed on the `StorageNFS` isolated network so that + +=== Ceph node preparation + +* You must identify the ceph nodes to deploy the new clustered NFS service. +* This service must be deployed on the `StorageNFS` isolated network so that it is easier for clients to mount their existing shares through the new NFS -export locations. Replace the ``{{ VIP }}`` in the following example with an -IP address from the `StorageNFS` isolated network -* You can pick an appropriate size for the NFS cluster. The NFS service -provides active/active high availability when the cluster size is more than +export locations. +* You must propagate the `StorageNFS` network to the target nodes +where the `ceph-nfs` service will be deployed. See link:https://docs.openstack.org/project-deploy-guide/tripleo-docs/wallaby/features/network_isolation.html#deploying-the-overcloud-with-network-isolation[Deploying +an Overcloud with Network Isolation with TripleO] and link:https://docs.openstack.org/project-deploy-guide/tripleo-docs/wallaby/post_deployment/updating_network_configuration_post_deployment.html[Applying +network configuration changes after deployment] for the background to these +tasks. The following steps will be relevant if the Ceph Storage nodes were +deployed via Director. +** Identify the node definition file used in the environment. This is +the input file associated with the `openstack overcloud node provision` +command. For example, this file may be called `overcloud-baremetal-deploy.yaml` +** Edit the networks associated with the `CephStorage` nodes to include the +`StorageNFS` network: ++ +[source,yaml] +---- +- name: CephStorage + count: 3 + hostname_format: cephstorage-%index% + instances: + - hostname: cephstorage-0 + name: ceph-0 + - hostname: cephstorage-1 + name: ceph-1 + - hostname: cephstorage-2 + name: ceph-2 + defaults: + profile: ceph-storage + network_config: + template: /home/stack/network/nic-configs/ceph-storage.j2 + network_config_update: true + networks: + - network: ctlplane + vif: true + - network: storage + - network: storage_mgmt + - network: storage_nfs +---- +** Edit the network configuration template file for the `CephStorage` nodes +to include an interface connecting to the `StorageNFS` network. In the +example above, the path to the network configuration template file is +`/home/stack/network/nic-configs/ceph-storage.j2`. This file is modified +to include the following NIC template: ++ +[source,yaml] +---- +- type: vlan + device: nic2 + vlan_id: {{ storage_nfs_vlan_id }} + addresses: + - ip_netmask: {{ storage_nfs_ip }}/{{ storage_nfs_cidr }} + routes: {{ storage_nfs_host_routes }} +---- +** Re-run the `openstack overcloud node provision` command to update the +`CephStorage` nodes. 
++ +[source,bash] +---- +openstack overcloud node provision \ + --stack overcloud \ + --network-config -y \ + -o overcloud-baremetal-deployed-storage_nfs.yaml \ + --concurrency 2 \ + /home/stack/network/baremetal_deployment.yaml +---- +** When the update is complete, ensure that the `CephStorage` nodes have a +new interface created and tagged with the appropriate VLAN associated with +`StorageNFS`. + +=== Ceph NFS cluster creation + +* Identify an IP address from the `StorageNFS` network to use as the Virtual IP +address for the Ceph NFS service. This IP address must be provided in place of +the `{{ VIP }}` in the example below. You can query used IP addresses with: + +[source,bash] +---- +openstack port list -c "Fixed IP Addresses" --network storage_nfs +---- + +* Pick an appropriate size for the NFS cluster. The NFS service provides +active/active high availability when the cluster size is more than one node. It is recommended that the ``{{ cluster_size }}`` is at least one less than the number of hosts identified. This solution has been well tested with a 3-node NFS cluster. @@ -150,10 +233,11 @@ restrictions through OpenStack Manila. * For more information on deploying the clustered Ceph NFS service, see the link:https://docs.ceph.com/en/latest/cephadm/services/nfs/[ceph orchestrator documentation] +* The following commands are run inside a `cephadm shell` to create a clustered +Ceph NFS service. [source,bash] ---- -$CEPH_SSH cephadm shell # wait for shell to come up, then execute: ceph orch host ls @@ -164,7 +248,7 @@ ceph orch host label add nfs # Set the appropriate {{ cluster_size }} and {{ VIP }}: ceph nfs cluster create cephfs \ - “{{ cluster_size }} label:nfs” \ + "{{ cluster_size }} label:nfs" \ --ingress \ --virtual-ip={{ VIP }} --ingress-mode=haproxy-protocol diff --git a/docs_user/modules/openstack-manila_adoption.adoc b/docs_user/modules/openstack-manila_adoption.adoc index 9d9b513a2..c45d6a0f9 100644 --- a/docs_user/modules/openstack-manila_adoption.adoc +++ b/docs_user/modules/openstack-manila_adoption.adoc @@ -60,7 +60,7 @@ that can be decommissioned at will after completing the OpenStack adoption. Red Hat Ceph Storage 7.0 introduced a native `clustered Ceph NFS service`. This service has to be deployed on the Ceph cluster using the Ceph orchestrator prior to adopting Manila. This NFS service will eventually replace the -standalone NFS service from RHOSP 17.1 in your deployment. When manila is +standalone NFS service from RHOSP 17.1 in your deployment. When Manila is adopted into the RHOSP 18 environment, it will establish all the existing exports and client restrictions on the new clustered Ceph NFS service. Clients can continue to read and write data on their existing NFS shares, and are not @@ -85,7 +85,7 @@ for instructions on setting up a clustered NFS service. == Prerequisites -* Ensure that manila systemd services (`api`, `cron`, `scheduler`) are +* Ensure that Manila systemd services (`api`, `cron`, `scheduler`) are stopped. For more information, see xref:stopping-openstack-services_{context}[Stopping OpenStack services]. * If the deployment uses CephFS via NFS as a storage backend, ensure that pacemaker ordering and collocation constraints are adjusted. 
For more diff --git a/docs_user/modules/openstack-rolling_back.adoc b/docs_user/modules/openstack-rolling_back.adoc index 7bd83a992..6c31da2fa 100644 --- a/docs_user/modules/openstack-rolling_back.adoc +++ b/docs_user/modules/openstack-rolling_back.adoc @@ -140,12 +140,12 @@ If the Ceph NFS service is running on the deployment as a OpenStack Manila backend, you must restore the pacemaker ordering and colocation constraints involving the "openstack-manila-share" service: ---- +---- sudo pcs constraint order start ceph-nfs then openstack-manila-share kind=Optional id=order-ceph-nfs-openstack-manila-share-Optional sudo pcs constraint colocation add openstack-manila-share with ceph-nfs score=INFINITY id=colocation-openstack-manila-share-ceph-nfs-INFINITY ---- +---- Now you can verify that the source cloud is operational again, e.g. by running `openstack` CLI commands or using the Horizon Dashboard.