diff --git a/dev/.timestamp-images b/dev/.timestamp-images new file mode 100644 index 000000000..e69de29bb diff --git a/dev/images/.gitkeep b/dev/images/.gitkeep new file mode 100644 index 000000000..e69de29bb diff --git a/dev/index.html b/dev/index.html new file mode 100644 index 000000000..d28750e1a --- /dev/null +++ b/dev/index.html @@ -0,0 +1,1465 @@ + + + + + + + +Data Plane Adoption contributor documentation + + + + + + + +
+
+

Development environment

+
+
+

The Adoption development environment utilizes +install_yamls +for CRC VM creation and for creation of the VM that hosts the source +Wallaby (or OSP 17.1) OpenStack in Standalone configuration.

+
+
+

Preparing the host

+
+

Install pre-commit hooks before contributing:

+
+
+
+
pip install pre-commit
+pre-commit install
+
+
+
+

Get dataplane adoption repo:

+
+
+
+
git clone https://github.com/openstack-k8s-operators/data-plane-adoption.git ~/data-plane-adoption
+
+
+
+

Get install_yamls:

+
+
+
+
git clone https://github.com/openstack-k8s-operators/install_yamls.git ~/install_yamls
+
+
+
+

Install tools for operator development:

+
+
+
+
cd ~/install_yamls/devsetup
+make download_tools
+
+
+
+
+
+

Deploying CRC

+
+

CRC environment for virtual workloads

+
+
+
cd ~/install_yamls/devsetup
+PULL_SECRET=$HOME/pull-secret.txt CPUS=12 MEMORY=40000 DISK=100 make crc
+
+eval $(crc oc-env)
+oc login -u kubeadmin -p 12345678 https://api.crc.testing:6443
+
+make crc_attach_default_interface
+
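Optionally, sanity-check the cluster before moving on (standard crc and oc commands; output varies by environment):

crc status
+oc get nodes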
+
+
+
+

CRC environment with OpenStack Ironic

+
+ + + + + +
+ + +This section is specific to deploying Nova with Ironic backend. Skip +it if you want to deploy Nova normally. +
+
+
+

Create the BMaaS network (crc-bmaas) and virtual baremetal nodes controlled by a Redfish BMC emulator.

+
+
+
+
cd ~/install_yamls
+make nmstate
+make namespace
+cd devsetup  # back to install_yamls/devsetup
+make bmaas BMAAS_NODE_COUNT=2
+
+
+
+

A node definition YAML file to use with the openstack baremetal +create <file>.yaml command can be generated for the virtual baremetal +nodes by running the bmaas_generate_nodes_yaml make target. Store it +in a temp file for later.

+
+
+
+
make bmaas_generate_nodes_yaml | tail -n +2 | tee /tmp/ironic_nodes.yaml
+
+
+
+

Set variables to deploy edpm Standalone with additional network +(baremetal) and compute driver ironic.

+
+
+
+
cat << EOF > /tmp/additional_nets.json
+[
+  {
+    "type": "network",
+    "name": "crc-bmaas",
+    "standalone_config": {
+      "type": "ovs_bridge",
+      "name": "baremetal",
+      "mtu": 1500,
+      "vip": true,
+      "ip_subnet": "172.20.1.0/24",
+      "allocation_pools": [
+        {
+          "start": "172.20.1.100",
+          "end": "172.20.1.150"
+        }
+      ],
+      "host_routes": [
+        {
+          "destination": "192.168.130.0/24",
+          "nexthop": "172.20.1.1"
+        }
+      ]
+    }
+  }
+]
+EOF
+export EDPM_COMPUTE_ADDITIONAL_NETWORKS=$(jq -c . /tmp/additional_nets.json)
+export STANDALONE_COMPUTE_DRIVER=ironic
+export EDPM_COMPUTE_CEPH_ENABLED=false  # Optional
+export EDPM_COMPUTE_CEPH_NOVA=false # Optional
+export EDPM_COMPUTE_SRIOV_ENABLED=false # Without this the standalone deploy fails when compute driver is ironic.
+export NTP_SERVER=<ntp server>  # The default pool.ntp.org does not work on the Red Hat network; set this variable as appropriate for your environment
+
+
+
+

If EDPM_COMPUTE_CEPH_ENABLED=false is set, TripleO configures Glance with Swift as a backend. If EDPM_COMPUTE_CEPH_NOVA=false is set, TripleO configures Nova/Libvirt with a local storage backend.

+
+
+
+
+
+

Deploying TripleO Standalone

+
+

Use the install_yamls devsetup +to create a virtual machine (edpm-compute-0) connected to the isolated networks.

+
+
+ + + + + +
+ + +To use OSP 17.1 content to deploy TripleO Standalone, follow the +guide for setting up downstream content +for make standalone. +
+
+
+

To use Wallaby content instead, run the following:

+
+
+
+
cd ~/install_yamls/devsetup
+make standalone
+
+
+
+

To deploy with TLS everywhere enabled, run the following instead:

+
+
+
+
cd ~/install_yamls/devsetup
+EDPM_COMPUTE_CEPH_ENABLED=false TLS_ENABLED=true DNS_DOMAIN=ooo.test make standalone
+
+
+
+

This will disable the Ceph deployment, as the CI is not deploying with Ceph.

+
+
+

Snapshot/revert

+
+

When the deployment of the Standalone OpenStack is finished, it’s a +good time to snapshot the machine, so that multiple Adoption attempts +can be done without having to deploy from scratch.

+
+
+
+
cd ~/install_yamls/devsetup
+make standalone_snapshot
+
+
+
+

And when you wish to revert the Standalone deployment to the +snapshotted state:

+
+
+
+
cd ~/install_yamls/devsetup
+make standalone_revert
+
+
+
+

A similar snapshot could be taken of the CRC virtual machine, but on the CRC side the developer environment can be reset sufficiently via the install_yamls *_cleanup targets. This is further detailed in the section: Reset the environment to pre-adoption state

+
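For illustration, a CRC-side reset typically amounts to running the relevant install_yamls cleanup targets before redeploying. The target names below are assumptions; verify them against the Makefile in your checkout:

cd ~/install_yamls
+# Target names are examples; check `make help` or the Makefile.
+make openstack_cleanup
+make crc_storage_cleanup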
+
+
+
+
+

Network routing

+
+

Route VLAN20 to have access to the MariaDB cluster:

+
+
+
+
EDPM_BRIDGE=$(sudo virsh dumpxml edpm-compute-0 | grep -oP "(?<=bridge=').*(?=')")
+sudo ip link add link $EDPM_BRIDGE name vlan20 type vlan id 20
+sudo ip addr add dev vlan20 172.17.0.222/24
+sudo ip link set up dev vlan20
+
+
+
+
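To verify the route, ping an address on the internal API network. The address below assumes the default devsetup allocation (the standalone node’s internal_api address); adjust it to your environment:

ping -c 2 172.17.0.100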

To adopt the Swift service as well, route VLAN23 to have access to the storage +backend services:

+
+
+
+
EDPM_BRIDGE=$(sudo virsh dumpxml edpm-compute-0 | grep -oP "(?<=bridge=').*(?=')")
+sudo ip link add link $EDPM_BRIDGE name vlan23 type vlan id 23
+sudo ip addr add dev vlan23 172.20.0.222/24
+sudo ip link set up dev vlan23
+
+
+
+
+
+

Creating a workload to adopt

+
+

To run openstack commands from the host without installing the package or copying the configuration file from the virtual machine, create an alias:

+
+
+
+
alias openstack="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100 OS_CLOUD=standalone openstack"
+
+
+
+

Virtual machine steps

+
+

Create a test VM instance with a test volume attachment:

+
+
+
+
cd ~/data-plane-adoption
+OS_CLOUD_IP=192.168.122.100 OS_CLOUD_NAME=standalone \
+    bash tests/roles/development_environment/files/pre_launch.bash
+
+
+
+

This also creates a test Cinder volume, a backup from it, and a snapshot of it.

+
+
+
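Optionally, use the openstack alias defined earlier to confirm that the test resources were created (standard OpenStack client listing commands):

openstack server list
+openstack volume list
+openstack volume backup list
+openstack volume snapshot list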

Create a Barbican secret:

+
+
+
+
openstack secret store --name testSecret --payload 'TestPayload'
+
+
+
+
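You can confirm that the secret was stored:

openstack secret list | grep testSecret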

If you are using the Ceph backend, confirm that the image UUID can be seen in Ceph’s images pool:

+
+
+
+
ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100 sudo cephadm shell -- rbd -p images ls -l
+
+
+
+
+

Ironic steps

+
+ + + + + +
+ + +This section is specific to deploying Nova with Ironic backend. Skip +it if you deployed Nova normally. +
+
+
+
+
# Enroll baremetal nodes
+make bmaas_generate_nodes_yaml | tail -n +2 | tee /tmp/ironic_nodes.yaml
+scp -i $HOME/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa /tmp/ironic_nodes.yaml root@192.168.122.100:
+ssh -i $HOME/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100
+
+export OS_CLOUD=standalone
+openstack baremetal create /root/ironic_nodes.yaml
+export IRONIC_PYTHON_AGENT_RAMDISK_ID=$(openstack image show deploy-ramdisk -c id -f value)
+export IRONIC_PYTHON_AGENT_KERNEL_ID=$(openstack image show deploy-kernel -c id -f value)
+for node in $(openstack baremetal node list -c UUID -f value); do
+  openstack baremetal node set $node \
+    --driver-info deploy_ramdisk=${IRONIC_PYTHON_AGENT_RAMDISK_ID} \
+    --driver-info deploy_kernel=${IRONIC_PYTHON_AGENT_KERNEL_ID} \
+    --resource-class baremetal \
+    --property capabilities='boot_mode:uefi'
+done
+
+# Create a baremetal flavor
+openstack flavor create baremetal --ram 1024 --vcpus 1 --disk 15 \
+  --property resources:VCPU=0 \
+  --property resources:MEMORY_MB=0 \
+  --property resources:DISK_GB=0 \
+  --property resources:CUSTOM_BAREMETAL=1 \
+  --property capabilities:boot_mode="uefi"
+
+# Create image
+IMG=Fedora-Cloud-Base-38-1.6.x86_64.qcow2
+URL=https://download.fedoraproject.org/pub/fedora/linux/releases/38/Cloud/x86_64/images/$IMG
+curl -o /tmp/${IMG} -L $URL
+DISK_FORMAT=$(qemu-img info /tmp/${IMG} | grep "file format:" | awk '{print $NF}')
+openstack image create --container-format bare --disk-format ${DISK_FORMAT} Fedora-Cloud-Base-38 < /tmp/${IMG}
+
+export BAREMETAL_NODES=$(openstack baremetal node list -c UUID -f value)
+# Manage nodes
+for node in $BAREMETAL_NODES; do
+  openstack baremetal node manage $node
+done
+
+# Wait for nodes to reach "manageable" state
+watch openstack baremetal node list
+
+# Inspect baremetal nodes
+for node in $BAREMETAL_NODES; do
+  openstack baremetal introspection start $node
+done
+
+# Wait for inspection to complete
+watch openstack baremetal introspection list
+
+# Provide nodes
+for node in $BAREMETAL_NODES; do
+  openstack baremetal node provide $node
+done
+
+# Wait for nodes to reach "available" state
+watch openstack baremetal node list
+
+# Create an instance on baremetal
+openstack server show baremetal-test || {
+    openstack server create baremetal-test --flavor baremetal --image Fedora-Cloud-Base-38 --nic net-id=provisioning --wait
+}
+
+# Check instance status and network connectivity
+openstack server show baremetal-test
+ping -c 4 $(openstack server show baremetal-test -f json -c addresses | jq -r .addresses.provisioning[0])
+
+
+
+
+
+
+

Installing the OpenStack operators

+
+
+
cd ..  # back to install_yamls
+make crc_storage
+make input
+make openstack
+
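To confirm that the operators are running, check their pods. The openstack-operators namespace is the install_yamls default and is assumed here:

oc get pods -n openstack-operators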
+
+
+
+
+

Performing the adoption procedure

+
+

To simplify the adoption procedure, copy the deployment passwords that you use in the backend services deployment phase of the data plane adoption.

+
+
+
+
scp -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100:/root/tripleo-standalone-passwords.yaml ~/
+
+
+
+

The development environment is now set up. You can go to the Adoption documentation and perform the adoption manually, or run the test suite against your environment.

+
+
+
+
+

Resetting the environment to pre-adoption state

+
+

The development environment must be rolled back before you can execute another Adoption run.

+
+
+

Delete the data-plane and control-plane resources from the CRC VM

+
+
+
+
oc delete --ignore-not-found=true --wait=false openstackdataplanedeployment/openstack
+oc delete --ignore-not-found=true --wait=false openstackdataplanedeployment/openstack-nova-compute-ffu
+oc delete --ignore-not-found=true --wait=false openstackcontrolplane/openstack
+oc patch openstackcontrolplane openstack --type=merge --patch '
+metadata:
+  finalizers: []
+' || true
+
+while oc get pod | grep rabbitmq-server-0; do
+    sleep 2
+done
+while oc get pod | grep openstack-galera-0; do
+    sleep 2
+done
+
+oc delete --wait=false pod ovn-copy-data || true
+oc delete --wait=false pod mariadb-copy-data || true
+oc delete secret osp-secret || true
+
+
+
+

Revert the standalone VM to the snapshotted state

+
+
+
+
cd ~/install_yamls/devsetup
+make standalone_revert
+
+
+
+

Clean up and initialize the storage PVs in the CRC VM

+
+
+
+
cd ..
+for i in {1..3}; do make crc_storage_cleanup crc_storage && break || sleep 5; done
+
+
+
+
+
+

Experimenting with an additional compute node

+
+

The following is not on the critical path of preparing the development +environment for Adoption, but it shows how to make the environment +work with an additional compute node VM.

+
+
+

The remaining steps should be completed on the hypervisor hosting crc +and edpm-compute-0.

+
+
+

Deploy NG Control Plane with Ceph

+
+

Export the Ceph configuration from edpm-compute-0 into a secret.

+
+
+
+
SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100"
+KEY=$($SSH "cat /etc/ceph/ceph.client.openstack.keyring | base64 -w 0")
+CONF=$($SSH "cat /etc/ceph/ceph.conf | base64 -w 0")
+
+cat <<EOF > ceph_secret.yaml
+apiVersion: v1
+data:
+  ceph.client.openstack.keyring: $KEY
+  ceph.conf: $CONF
+kind: Secret
+metadata:
+  name: ceph-conf-files
+  namespace: openstack
+type: Opaque
+EOF
+
+oc create -f ceph_secret.yaml
+
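Verify that the secret exists in the openstack namespace:

oc get secret ceph-conf-files -n openstack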
+
+
+

Deploy the NG control plane with Ceph as backend for Glance and +Cinder. As described in +the install_yamls README, +use the sample config located at +https://github.com/openstack-k8s-operators/openstack-operator/blob/main/config/samples/core_v1beta1_openstackcontrolplane_network_isolation_ceph.yaml +but make sure to replace the _FSID_ in the sample with the one from +the secret created in the previous step.

+
+
+
+
curl -o /tmp/core_v1beta1_openstackcontrolplane_network_isolation_ceph.yaml https://raw.githubusercontent.com/openstack-k8s-operators/openstack-operator/main/config/samples/core_v1beta1_openstackcontrolplane_network_isolation_ceph.yaml
+FSID=$(oc get secret ceph-conf-files -o json | jq -r '.data."ceph.conf"' | base64 -d | grep fsid | sed -e 's/fsid = //') && echo $FSID
+sed -i "s/_FSID_/${FSID}/" /tmp/core_v1beta1_openstackcontrolplane_network_isolation_ceph.yaml
+oc apply -f /tmp/core_v1beta1_openstackcontrolplane_network_isolation_ceph.yaml
+
+
+
+

An NG control plane which uses the same Ceph backend should now be functional. If you create a test image on the NG system to confirm it works from the configuration above, be sure to read the warning in the next section.

+
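A minimal sketch of creating such a test image follows; the credentials match the example later in this section, and the CirrOS image URL and version are only an example:

export OS_CLOUD=default
+export OS_PASSWORD=12345678
+# Example image only; any small test image will do.
+curl -L -o /tmp/cirros.img http://download.cirros-cloud.net/0.6.2/cirros-0.6.2-x86_64-disk.img
+openstack image create cirros --disk-format qcow2 --container-format bare --file /tmp/cirros.img

If you create such an image, remember to delete it before the adoption, as explained in the next section.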
+
+

Before beginning adoption testing or development you may wish to +deploy an EDPM node as described in the following section.

+
+
+
+

Warning about two OpenStacks and one Ceph

+
+

Though workloads can be created in the NG deployment to test, be +careful not to confuse them with workloads from the Wallaby cluster +to be migrated. The following scenario is now possible.

+
+
+

A Glance image exists on the Wallaby OpenStack to be adopted.

+
+
+
+
[stack@standalone standalone]$ export OS_CLOUD=standalone
+[stack@standalone standalone]$ openstack image list
++--------------------------------------+--------+--------+
+| ID                                   | Name   | Status |
++--------------------------------------+--------+--------+
+| 33a43519-a960-4cd0-a593-eca56ee553aa | cirros | active |
++--------------------------------------+--------+--------+
+[stack@standalone standalone]$
+
+
+
+

If you now create an image with the NG cluster, then a Glance image will exist on the NG OpenStack, which will adopt the workloads of the Wallaby cluster.

+
+
+
+
[fultonj@hamfast ng]$ export OS_CLOUD=default
+[fultonj@hamfast ng]$ export OS_PASSWORD=12345678
+[fultonj@hamfast ng]$ openstack image list
++--------------------------------------+--------+--------+
+| ID                                   | Name   | Status |
++--------------------------------------+--------+--------+
+| 4ebccb29-193b-4d52-9ffd-034d440e073c | cirros | active |
++--------------------------------------+--------+--------+
+[fultonj@hamfast ng]$
+
+
+
+

Both Glance images are stored in the same Ceph pool.

+
+
+
+
ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100 sudo cephadm shell -- rbd -p images ls -l
+Inferring fsid 7133115f-7751-5c2f-88bd-fbff2f140791
+Using recent ceph image quay.rdoproject.org/tripleowallabycentos9/daemon@sha256:aa259dd2439dfaa60b27c9ebb4fb310cdf1e8e62aa7467df350baf22c5d992d8
+NAME                                       SIZE     PARENT  FMT  PROT  LOCK
+33a43519-a960-4cd0-a593-eca56ee553aa         273 B            2
+33a43519-a960-4cd0-a593-eca56ee553aa@snap    273 B            2  yes
+4ebccb29-193b-4d52-9ffd-034d440e073c       112 MiB            2
+4ebccb29-193b-4d52-9ffd-034d440e073c@snap  112 MiB            2  yes
+
+
+
+

However, as far as each Glance service is concerned, each has one image. Thus, to avoid confusion during adoption, the test Glance image on the NG OpenStack should be deleted.

+
+
+
+
openstack image delete 4ebccb29-193b-4d52-9ffd-034d440e073c
+
+
+
+

Connecting the NG OpenStack to the existing Ceph cluster is part of the adoption procedure so that data migration can be minimized, but be sure you understand the implications of the above example.

+
+
+
+

Deploy edpm-compute-1

+
+

edpm-compute-0 is not available as a standard EDPM system to be managed by edpm-ansible or openstack-operator, because it hosts the Wallaby deployment which will be adopted; after adoption it will only host the Ceph server.

+
+
+

Use the install_yamls devsetup +to create additional virtual machines and be sure +that the EDPM_COMPUTE_SUFFIX is set to 1 or greater. +Do not set EDPM_COMPUTE_SUFFIX to 0 or you could delete +the Wallaby system created in the previous section.

+
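A minimal sketch, assuming the edpm_compute target and EDPM_COMPUTE_SUFFIX variable provided by the install_yamls devsetup (verify the target name in your checkout):

cd ~/install_yamls/devsetup
+# Suffix 0 is the Wallaby standalone VM; use 1 or greater for additional nodes.
+EDPM_COMPUTE_SUFFIX=1 make edpm_compute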
+
+

When deploying EDPM nodes, add an extraMounts entry like the following in the OpenStackDataPlaneNodeSet CR nodeTemplate so that they are configured to use the same Ceph cluster.

+
+
+
+
    edpm-compute:
+      nodeTemplate:
+        extraMounts:
+        - extraVolType: Ceph
+          volumes:
+          - name: ceph
+            secret:
+              secretName: ceph-conf-files
+          mounts:
+          - name: ceph
+            mountPath: "/etc/ceph"
+            readOnly: true
+
+
+
+

An NG data plane which uses the same Ceph backend should now be functional. Be careful not to confuse new workloads created to test the NG OpenStack with workloads from the Wallaby OpenStack, as described in the previous section.

+
+
+
+

Begin Adoption Testing or Development

+
+

We should now have:

+
+
+
    +
  • +

    An NG glance service based on Antelope running on CRC

    +
  • +
  • +

    A TripleO-deployed glance service running on edpm-compute-0

    +
  • +
  • +

    Both services have the same Ceph backend

    +
  • +
  • +

    Each service has its own independent database

    +
  • +
+
+
+

The environment above is assumed to be available in the Glance Adoption documentation. You may now follow other Data Plane Adoption procedures described in the documentation. The same pattern can be applied to other services.

+
+
+
+
+
+
+

Contributing to documentation

+
+
+

Rendering documentation locally

+
+

Install docs build requirements:

+
+
+
+
make docs-dependencies
+
+
+
+

To render the user-facing documentation site locally:

+
+
+
+
make docs-user
+
+
+
+

To render the contributor documentation site locally:

+
+
+
+
make docs-dev
+
+
+
+

The built HTML files are in docs_build/adoption-user and +docs_build/adoption-dev directories respectively.

+
+
+

There are some additional make targets for convenience. The following +targets, in addition to rendering the docs, will also open the +resulting HTML in your browser so that you don’t have to look for it:

+
+
+
+
make docs-user-open
+# or
+make docs-dev-open
+
+
+
+

The following targets set up an inotify watch on the documentation sources, and when it detects a modification, the HTML is re-rendered. This is so that you can use the "edit source - save source - refresh browser page" loop when working on the docs, without having to run make docs-* repeatedly.

+
+
+
+
make docs-user-watch
+# or
+make docs-dev-watch
+
+
+
+

Preview of downstream documentation

+
+

To render a preview of what should serve as the base for downstream +docs (e.g. with downstream container image URLs), prepend +BUILD=downstream to your make targets. For example:

+
+
+
+
BUILD=downstream make docs-user
+
+
+
+
+
+

Patterns and tips for contributing to documentation

+
+
    +
  • +

    Pages concerning individual components/services should make sense in +the context of the broader adoption procedure. While adopting a +service in isolation is an option for developers, let’s write the +documentation with the assumption the adoption procedure is being +done in full, going step by step (one doc after another).

    +
  • +
  • +

    The procedure should be written with production use in mind. This +repository could be used as a starting point for product +technical documentation. We should not tie the documentation to +something that wouldn’t translate well from dev envs to production.

    +
    +
      +
    • +

      This includes not assuming that the source environment is +Standalone, and the destination is CRC. We can provide examples for +Standalone/CRC, but it should be possible to use the procedure +with fuller environments in a way that is obvious from the docs.

      +
    • +
    +
    +
  • +
  • +

    If possible, try to make code snippets copy-pastable. Use shell +variables if the snippets should be parametrized. Use oc rather +than kubectl in snippets.

    +
  • +
  • +

    Focus on the "happy path" in the docs as much as possible, +troubleshooting info can go into the Troubleshooting page, or +alternatively a troubleshooting section at the end of the document, +visibly separated from the main procedure.

    +
  • +
  • +

    The full procedure will inevitably happen to be quite long, so let’s +try to be concise in writing to keep the docs consumable (but not to +a point of making things difficult to understand or omitting +important things).

    +
  • +
  • +

    A bash alias can be created for long commands; however, when implementing them in the test roles, you should transform them to avoid "command not found" errors. From:

    +
    +
    +
    alias openstack="oc exec -t openstackclient -- openstack"
    +
    +openstack endpoint list | grep network
    +
    +
    +
    +

    To:

    +
    +
    +
    +
    alias openstack="oc exec -t openstackclient -- openstack"
    +
    +${BASH_ALIASES[openstack]} endpoint list | grep network
    +
    +
    +
  • +
+
+
+
+
+
+

Tests

+
+
+

Test suite information

+
+

The adoption docs repository also includes a test suite for Adoption. +There are targets in the Makefile which can be used to execute the +test suite:

+
+
+
    +
  • +

    test-minimal - a minimal test scenario, the eventual set of +services in this scenario should be the "core" services needed to +launch a VM. This scenario assumes local storage backend for +services like Glance and Cinder.

    +
  • +
  • +

    test-with-ceph - like minimal but with Ceph storage backend for +Glance and Cinder.

    +
  • +
+
+
+
+

Configuring the test suite

+
+
    +
  • +

    Create tests/vars.yaml and tests/secrets.yaml by copying the included samples (tests/vars.sample.yaml, tests/secrets.sample.yaml), as shown in the example after this list.

    +
  • +
  • +

    Walk through the tests/vars.yaml and tests/secrets.yaml files +and see if you need to edit any values. If you are using the +documented development environment, majority of the defaults should +work out of the box. The comments in the YAML files will guide you +regarding the expected values. You may want to double check that +these variables suit your environment:

    +
    +
      +
    • +

      install_yamls_path

      +
    • +
    • +

      tripleo_passwords

      +
    • +
    • +

      controller*_ssh

      +
    • +
    • +

      edpm_privatekey_path

      +
    • +
    • +

      timesync_ntp_servers

      +
    • +
    +
    +
  • +
+
+
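The copy step referenced in the first item above is simply:

cd ~/data-plane-adoption
+cp tests/vars.sample.yaml tests/vars.yaml
+cp tests/secrets.sample.yaml tests/secrets.yaml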
+
+

Running the tests

+
+

The interface between the execution infrastructure and the test suite is a set of Ansible inventory and variables files. Inventory and variable samples are provided. To run the tests, follow this procedure:

+
+
+
    +
  • +

    Install dependencies and create a venv:

    +
    +
    +
    sudo dnf -y install python-devel
    +python3 -m venv venv
    +source venv/bin/activate
    +pip install openstackclient osc_placement jmespath
    +ansible-galaxy collection install community.general
    +
    +
    +
  • +
  • +

    Run make test-with-ceph (the documented development environment +does include Ceph).

    +
    +

    If you are using a Ceph-less environment, you should run make test-minimal.

    +
    +
  • +
+
+
+
+

Making patches to the test suite

+
+

Please be aware of the following when changing the test suite:

+
+
+
    +
  • +

    The test suite should follow the docs as much as possible.

    +
    +

    The purpose of the test suite is to verify what the user would run +if they were following the docs. We don’t want to loosely rewrite +the docs into Ansible code following Ansible best practices. We want +to test the exact same bash commands/snippets that are written in +the docs. This often means that we should be using the shell +module and do a verbatim copy/paste from docs, instead of using the +best Ansible module for the task at hand.

    +
    +
  • +
+
+
+
+
+
+ + + + + + + \ No newline at end of file diff --git a/index.html b/index.html new file mode 100644 index 000000000..cf61afc12 --- /dev/null +++ b/index.html @@ -0,0 +1,55 @@ + + + + + + + + + +
+ User + Contributor +
+ +
+ +
+ + + diff --git a/user/.timestamp-images b/user/.timestamp-images new file mode 100644 index 000000000..e69de29bb diff --git a/user/downstream.html b/user/downstream.html new file mode 100644 index 000000000..6f6224faf --- /dev/null +++ b/user/downstream.html @@ -0,0 +1,12174 @@ + + + + + + + +Adopting a Red Hat OpenStack Platform 17.1 deployment + + + + + + + +
+
+

Red Hat OpenStack Services on OpenShift 18.0 adoption overview

+
+
+

Adoption is the process of migrating a Red Hat OpenStack Platform (RHOSP) 17.1 overcloud to a Red Hat OpenStack Services on OpenShift 18.0 data plane. To ensure that you understand the entire adoption process and how to sufficiently prepare your RHOSP environment, review the prerequisites, adoption process, and post-adoption tasks.

+
+
+ + + + + +
+ + +It is important to read the whole adoption guide before you start +the adoption. You should form an understanding of the procedure, +prepare the necessary configuration snippets for each service ahead of +time, and test the procedure in a representative test environment +before you adopt your main environment. +
+
+
+

Adoption limitations

+
+

The adoption process does not support the following features:

+
+
+
    +
  • +

    Red Hat OpenStack Platform (RHOSP) 17.1 multi-cell deployments

    +
  • +
  • +

    Fast Data path

    +
  • +
  • +

    instanceHA

    +
  • +
  • +

    Auto-scaling

    +
  • +
  • +

    DCN

    +
  • +
  • +

    Designate

    +
  • +
  • +

    Octavia

    +
  • +
+
+
+

If you plan to adopt the Key Manager service (barbican) or a FIPs environment, review the following limitations:

+
+
+
    +
  • +

    The Key Manager service does not yet support all of the crypto plug-ins available in director.

    +
  • +
  • +

    When you adopt a RHOSP 17.1 FIPS environment to RHOSO 18.0, your adopted cluster remains a FIPS cluster. There is no option to change the FIPS status during adoption. If your cluster is FIPS-enabled, you must deploy a FIPS Red Hat OpenShift Container Platform (RHOCP) cluster to adopt your RHOSP 17.1 FIPS control plane. For more information about enabling FIPS in RHOCP, see Support for FIPS cryptography in the RHOCP Installing guide.

    +
  • +
+
+
+
+

Adoption prerequisites

+
+

Before you begin the adoption procedure, complete the following prerequisites:

+
+
+
+
Planning information
+
+
+ +
+
+
Back-up information
+
+
+ +
+
+
Compute
+
+
+ +
+
+
ML2/OVS
+
+
+
    +
  • +

    If you use the Modular Layer 2 plug-in with Open vSwitch mechanism driver (ML2/OVS), migrate it to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver. For more information, see Migrating to the OVN mechanism driver.

    +
  • +
+
+
+
Tools
+
+
+
    +
  • +

    Install the oc command line tool on your workstation.

    +
  • +
  • +

    Install the podman command line tool on your workstation.

    +
  • +
+
+
+
RHOSP 17.1 hosts
+
+
+
    +
  • +

    All control plane and data plane hosts of the RHOSP 17.1 cloud are up and running, and continue to run throughout the adoption procedure.

    +
  • +
+
+
+
+
+
+
+

Guidelines for planning the adoption

+
+

When planning to adopt a Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 environment, consider the scope of the change. An adoption is similar in scope to a data center upgrade. Different firmware levels, hardware vendors, hardware profiles, networking interfaces, storage interfaces, and so on affect the adoption process and can cause changes in behavior during the adoption.

+
+
+

Review the following guidelines to adequately plan for the adoption and increase the chance that you complete the adoption successfully:

+
+
+ + + + + +
+ + +All commands in the adoption documentation are examples. Do not copy and paste the commands verbatim. +
+
+
+
    +
  • +

    To minimize the risk of an adoption failure, reduce the number of environmental differences between the staging environment and the production sites.

    +
  • +
  • +

    If the staging environment is not representative of the production sites, or a staging environment is not available, then you must plan to include contingency time in case the adoption fails.

    +
  • +
  • +

    Review your custom Red Hat OpenStack Platform (RHOSP) service configuration at every major release.

    +
    +
      +
    • +

      Every major RHOSO release upgrades through multiple RHOSP versions.

      +
    • +
    • +

      Each new RHOSP version might deprecate configuration options or change the format of the configuration.

      +
    • +
    +
    +
  • +
  • +

    Prepare a Method of Procedure (MOP) that is specific to your environment to reduce the risk of variance or omitted steps when running the adoption process.

    +
  • +
  • +

    You can use representative hardware in a staging environment to prepare a MOP and validate any content changes.

    +
    +
      +
    • +

      Include a cross-section of firmware versions, additional interface or device hardware, and any additional software in the representative staging environment to ensure that it is broadly representative of the variety that is present in the production environments.

      +
    • +
    • +

      Ensure that you validate any Red Hat Enterprise Linux update or upgrade in the representative staging environment.

      +
    • +
    +
    +
  • +
  • +

    Use Satellite for localized and version-pinned RPM content where your data plane nodes are located.

    +
  • +
  • +

    In the production environment, use the content that you tested in the staging environment.

    +
  • +
+
+
+
+

Adoption process overview

+
+

Familiarize yourself with the steps of the adoption process and the optional post-adoption tasks.

+
+ +
+
Post-adoption tasks
+ +
+
+
+

Identity service authentication

+
+

If you have custom policies enabled, contact Red Hat Support before adopting a director OpenStack deployment. You must complete the following steps for adoption:

+
+
+
    +
  1. +

    Remove custom policies.

    +
  2. +
  3. +

    Run the adoption.

    +
  4. +
  5. +

    Re-add custom policies by using the new SRBAC syntax.

    +
  6. +
+
+
+

After you adopt a director-based OpenStack deployment to a Red Hat OpenStack Services on OpenShift deployment, the Identity service performs user authentication and authorization by using Secure RBAC (SRBAC). If SRBAC is already enabled, then there is no change to how you perform operations. If SRBAC is disabled, then adopting a director-based OpenStack deployment might change how you perform operations due to changes in API access policies.

+
+
+

For more information on SRBAC, see Secure role based access control in Red Hat OpenStack Services on OpenShift in Performing security operations.

+
+
+
+

Configuring the network for the RHOSO deployment

+
+

With Red Hat OpenShift Container Platform (RHOCP), the network is a critical aspect of the deployment, and it is important to plan it carefully. The general network requirements for the Red Hat OpenStack Platform (RHOSP) services are not much different from the ones in a director deployment, but the way you handle them is.

+
+
+ + + + + +
+ + +For more information about the network architecture and configuration, see +Deploying Red Hat OpenStack Platform 18.0 Development Preview 3 on Red Hat OpenShift Container Platform and About +networking in OpenShift Container Platform 4.15 Documentation. This document will address concerns specific to adoption. +
+
+
+

When adopting a new RHOSP deployment, it is important to align the network +configuration with the adopted cluster to maintain connectivity for existing +workloads.

+
+
+

The following logical configuration steps will incorporate the existing network +configuration:

+
+
+
    +
  • +

    configure RHOCP worker nodes to align VLAN tags and IPAM +configuration with the existing deployment.

    +
  • +
  • +

    configure Control Plane services to use compatible IP ranges for +service and load balancing IPs.

    +
  • +
  • +

    configure Data Plane nodes to use corresponding compatible configuration for +VLAN tags and IPAM.

    +
  • +
+
+
+

Specifically,

+
+
+
    +
  • +

    IPAM configuration will either be reused from the +existing deployment or, depending on IP address availability in the +existing allocation pools, new ranges will be defined to be used for the +new control plane services. If so, IP routing will be configured between +the old and new ranges. For more information, see Planning your IPAM configuration.

    +
  • +
  • +

    VLAN tags will be reused from the existing deployment.

    +
  • +
+
+
+

Retrieving the network configuration from your existing deployment

+
+

Let’s first determine which isolated networks are defined in the existing +deployment. You can find the network configuration in the network_data.yaml +file. For example,

+
+
+
+
- name: InternalApi
+  mtu: 1500
+  vip: true
+  vlan: 20
+  name_lower: internal_api
+  dns_domain: internal.mydomain.tld.
+  service_net_map_replace: internal
+  subnets:
+    internal_api_subnet:
+      ip_subnet: '172.17.0.0/24'
+      allocation_pools: [{'start': '172.17.0.4', 'end': '172.17.0.250'}]
+
+
+
+

You should make a note of the VLAN tag used (vlan key) and the IP range +(ip_subnet key) for each isolated network. The IP range will later be split +into separate pools for control plane services and load balancer IP addresses.

+
+
+

You should also determine the list of IP addresses already consumed in the adopted environment. Consult the tripleo-ansible-inventory.yaml file to find this information. In the file, for each listed host, note the IP and VIP addresses consumed by the node.

+
+
+

For example,

+
+
+
+
Standalone:
+  hosts:
+    standalone:
+      ...
+      internal_api_ip: 172.17.0.100
+    ...
+  ...
+standalone:
+  children:
+    Standalone: {}
+  vars:
+    ...
+    internal_api_vip: 172.17.0.2
+    ...
+
+
+
+

In the example above, note that 172.17.0.2 and 172.17.0.100 are consumed and won’t be available for the new control plane services, at least until the adoption is complete.

+
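A quick way to collect these addresses is to grep the inventory for the IP and VIP keys; this is only a sketch, so adjust the key patterns to match your file:

grep -E '_ip:|_vip:' tripleo-ansible-inventory.yaml | sort -u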
+
+

Repeat the process for each isolated network and each host in the +configuration.

+
+
+
+

At the end of this process, you should have the following information:

+
+
+
    +
  • +

    A list of isolated networks used in the existing deployment.

    +
  • +
  • +

    For each of the isolated networks, the VLAN tag and IP ranges used for +dynamic address allocation.

    +
  • +
  • +

    A list of existing IP address allocations used in the environment. You will +later exclude these addresses from allocation pools available for the new +control plane services.

    +
  • +
+
+
+
+

Planning your IPAM configuration

+
+

The new deployment model puts additional burden on the size of IP allocation +pools available for Red Hat OpenStack Platform (RHOSP) services. This is because each service deployed +on Red Hat OpenShift Container Platform (RHOCP) worker nodes will now require an IP address from the IPAM pool (in +the previous deployment model, all services hosted on a controller node shared +the same IP address.)

+
+
+

Since the new control plane deployment has different requirements as to the +number of IP addresses available for services, it may even be impossible to +reuse the existing IP ranges used in adopted environment, depending on its +size. Prudent planning is required to determine which options are available in +your particular case.

+
+
+

The total number of IP addresses required for the new control plane services, +in each isolated network, is calculated as a sum of the following:

+
+
+
    +
  • +

    The number of RHOCP worker nodes. (Each node will require 1 IP address in +NodeNetworkConfigurationPolicy custom resources (CRs).)

    +
  • +
  • +

    The number of IP addresses required for the data plane nodes. (Each node will require +an IP address from NetConfig CRs.)

    +
  • +
  • +

    The number of IP addresses required for control plane services. (Each service +will require an IP address from NetworkAttachmentDefinition CRs.) This +number depends on the number of replicas for each service.

    +
  • +
  • +

    The number of IP addresses required for load balancer IP addresses. (Each +service will require a VIP address from IPAddressPool CRs.)

    +
  • +
+
+
+

As of the time of writing, the simplest single worker node RHOCP deployment +(CRC) has the following IP ranges defined (for the internalapi network):

+
+
+
    +
  • +

    1 IP address for the single worker node;

    +
  • +
  • +

    1 IP address for the data plane node;

    +
  • +
  • +

    NetworkAttachmentDefinition CRs for control plane services: +X.X.X.30-X.X.X.70 (41 addresses);

    +
  • +
  • +

    IPAllocationPool CRs for load balancer IPs: X.X.X.80-X.X.X.90 (11 +addresses).

    +
  • +
+
+
+

This comes to a total of 54 IP addresses (1 + 1 + 41 + 11) allocated to the internalapi allocation pools.

+
+
+

The exact requirements may differ depending on the list of RHOSP services +to be deployed, their replica numbers, as well as the number of RHOCP +worker nodes and data plane nodes.

+
+
+

Additional IP addresses may be required in future RHOSP releases, so it is +advised to plan for some extra capacity, for each of the allocation pools used +in the new environment.

+
+
+

Once you know the required IP pool size for the new deployment, you can choose +one of the following scenarios to handle IPAM allocation in the new +environment.

+
+
+

The first listed scenario is more general and implies using new IP ranges, +while the second scenario implies reusing the existing ranges. The end state of +the former scenario is using the new subnet ranges for control plane services, +but keeping the old ranges, with their node IP address allocations intact, for +data plane nodes.

+
+
+

Regardless of the IPAM scenario, the VLAN tags used in the existing deployment will be reused in the new deployment. Depending on the scenario, the IP address ranges to be used for control plane services will be either reused from the old deployment or defined anew. Adjust the configuration as described in Configuring isolated networks.

+
+
+
Scenario 1: Using new subnet ranges
+
+

This scenario is compatible with any existing subnet configuration, and can be +used even when the existing cluster subnet ranges don’t have enough free IP +addresses for the new control plane services.

+
+
+

The general idea here is to define new IP ranges for control plane services that belong to a different subnet that was not used in the existing cluster. Then, configure link local IP routing between the old and new subnets to allow old and new service deployments to communicate. This involves using the director mechanism on the pre-adopted cluster to configure additional link local routes there. This allows the EDPM deployment to reach the adopted nodes using their old subnet addresses.

+
+
+

The new subnet should be sized appropriately to accommodate the new control +plane services, but otherwise doesn’t have any specific requirements as to the +existing deployment allocation pools already consumed. Actually, the +requirements as to the size of the new subnet are lower than in the second +scenario, as the old subnet ranges are kept for the adopted nodes, which means +they don’t consume any IP addresses from the new range.

+
+
+

In this scenario, you will configure NetworkAttachmentDefinition custom resources (CRs) to use a +different subnet from what will be configured in NetConfig CR for the same +networks. The former range will be used for control plane services, +while the latter will be used to manage IPAM for data plane nodes.

+
+
+

You will need to make sure that the adopted node IP addresses don’t change during the adoption process. This is achieved by listing the addresses in fixedIP fields in the OpenstackDataplaneNodeSet per-node section.

+
+
+
+

Before proceeding, configure host routes on the adopted nodes for the +control plane subnets.

+
+
+

To achieve this, you will need to re-run openstack overcloud node provision with additional routes entries added to network_config. (This change should be applied for every adopted node configuration.) For example, you may add the following to net_config.yaml:

+
+
+
+
network_config:
+  - type: ovs_bridge
+    name: br-ctlplane
+    routes:
+    - ip_netmask: 0.0.0.0/0
+      next_hop: 192.168.1.1
+    - ip_netmask: 172.31.0.0/24 (1)
+      next_hop: 192.168.1.100 (2)
+
+
+
+ + + + + + + + + +
1The new control plane subnet.
2The control plane IP address of the adopted data plane node.
+
+
+

Do the same for other networks that will need to use different subnets for the +new and old parts of the deployment.

+
+
+

Once done, run openstack overcloud node provision to apply the new configuration.

+
+
+

Note that network configuration changes are not applied by default to avoid the risk of network disruption. You will have to enforce the changes by setting StandaloneNetworkConfigUpdate: true in the director configuration files.

+
+
+

Once openstack overcloud node provision is complete, you should see new link local routes to the +new subnet on each node. For example,

+
+
+
+
# ip route | grep 172
+172.31.0.0/24 via 192.168.122.100 dev br-ctlplane
+
+
+
+
+

The next step is to configure similar routes for the old subnet for control plane services attached to the networks. This is done by adding routes entries to +NodeNetworkConfigurationPolicy CRs for each network. For example,

+
+
+
+
      - destination: 192.168.122.0/24 (1)
+        next-hop-interface: ospbr (2)
+
+
+
+ + + + + + + + + +
1The isolated network’s original subnet on the data plane.
2The RHOCP worker network interface that corresponds to the isolated network on the data plane.
+
+
+

Once applied, you should eventually see the following route added to your Red Hat OpenShift Container Platform (RHOCP) nodes.

+
+
+
+
# ip route | grep 192
+192.168.122.0/24 dev ospbr proto static scope link
+
+
+
+
+

At this point, you should be able to ping the adopted nodes from RHOCP nodes +using their old subnet addresses; and vice versa.

+
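For example, from a host on the RHOCP side you can ping the adopted node on its old control plane address (taken from the earlier examples); the reverse direction can be checked the same way with an address allocated from the new subnet:

ping -c 2 192.168.122.100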
+
+
+

Finally, during the data plane adoption, you will have to take care of several aspects:

+
+
+
    +
  • +

    in network_config, add link local routes to the new subnets, for example:

    +
  • +
+
+
+
+
  nodeTemplate:
+    ansible:
+      ansibleUser: root
+      ansibleVars:
+        additional_ctlplane_host_routes:
+        - ip_netmask: 172.31.0.0/24
+          next_hop: '{{ ctlplane_ip }}'
+        edpm_network_config_template: |
+          network_config:
+          - type: ovs_bridge
+            routes: {{ ctlplane_host_routes + additional_ctlplane_host_routes }}
+            ...
+
+
+
+
    +
  • +

    list the old IP addresses as ansibleHost and fixedIP, for example:

    +
  • +
+
+
+
+
  nodes:
+    standalone:
+      ansible:
+        ansibleHost: 192.168.122.100
+        ansibleUser: ""
+      hostName: standalone
+      networks:
+      - defaultRoute: true
+        fixedIP: 192.168.122.100
+        name: ctlplane
+        subnetName: subnet1
+
+
+
+
    +
  • +

    expand SSH range for the firewall configuration to include both subnets:

    +
  • +
+
+
+
+
        edpm_sshd_allowed_ranges:
+        - 192.168.122.0/24
+        - 172.31.0.0/24
+
+
+
+

This is to allow SSH access from the new subnet to the adopted nodes as well as +the old one.

+
+
+
+

Since you are applying new network configuration to the nodes, consider also +setting edpm_network_config_update: true to enforce the changes.

+
+
+
+

Note that the examples above are incomplete and should be incorporated into +your general configuration.

+
+
+
+
Scenario 2: Reusing existing subnet ranges
+
+

This scenario is only applicable when the existing subnet ranges have enough IP addresses for the new control plane services. On the other hand, it allows you to avoid the additional routing configuration between the old and new subnets that is required in Scenario 1: Using new subnet ranges.

+
+
+

The general idea here is to instruct the new control plane services to use the +same subnet as in the adopted environment, but define allocation pools used by +the new services in a way that would exclude IP addresses that were already +allocated to existing cluster nodes.

+
+
+

This scenario implies that the remaining IP addresses in the existing subnet are enough for the new control plane services. If not, Scenario 1: Using new subnet ranges should be used instead. For more information, see Planning your IPAM configuration.

+
+
+

No special routing configuration is required in this scenario; the only thing +to pay attention to is to make sure that already consumed IP addresses don’t +overlap with the new allocation pools configured for Red Hat OpenStack Platform control plane services.

+
+
+

If you are especially constrained by the size of the existing subnet, you may +have to apply elaborate exclusion rules when defining allocation pools for the +new control plane services. For more information, see

+
+
+
+
+

Configuring isolated networks

+
+

Before you begin replicating your existing VLAN and IPAM configuration in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must have the following IP address allocations for the new control plane services:

+
+
+
    +
  • +

    1 IP address for each isolated network on each Red Hat OpenShift Container Platform (RHOCP) worker node. You configure these IP addresses in the NodeNetworkConfigurationPolicy custom resources (CRs) for the RHOCP worker nodes. For more information, see Configuring RHOCP worker nodes.

    +
  • +
  • +

    1 IP range for each isolated network for the data plane nodes. You configure these ranges in the NetConfig CRs for the data plane nodes. For more information, see Configuring data plane nodes.

    +
  • +
  • +

    1 IP range for each isolated network for control plane services. These ranges +enable pod connectivity for isolated networks in the NetworkAttachmentDefinition CRs. For more information, see Configuring the networking for control plane services.

    +
  • +
  • +

    1 IP range for each isolated network for load balancer IP addresses. These IP ranges define load balancer IP addresses for MetalLB in the IPAddressPool CRs. For more information, see Configuring the networking for control plane services.

    +
  • +
+
+
+ + + + + +
+ + +The exact list and configuration of isolated networks in the following procedures should reflect the actual Red Hat OpenStack Platform environment. The number of isolated networks might differ from the examples used in the procedures. The IPAM scheme might also differ. Only the parts of the configuration that are relevant to configuring networks are shown. The values that are used in the following procedures are examples. Use values that are specific to your configuration. +
+
+
+
Configuring isolated networks on RHOCP worker nodes
+
+

To connect service pods to isolated networks on Red Hat OpenShift Container Platform (RHOCP) worker nodes that run Red Hat OpenStack Platform services, physical network configuration on the hypervisor is required.

+
+
+

This configuration is managed by the NMState operator, which uses NodeNetworkConfigurationPolicy custom resources (CRs) to define the desired network configuration for the nodes.

+
+
+
Procedure
+
    +
  • +

    For each RHOCP worker node, define a NodeNetworkConfigurationPolicy CR that describes the desired network configuration. For example:

    +
    +
    +
    apiVersion: v1
    +items:
    +- apiVersion: nmstate.io/v1
    +  kind: NodeNetworkConfigurationPolicy
    +  spec:
    +    desiredState:
    +      interfaces:
    +      - description: internalapi vlan interface
    +        ipv4:
    +          address:
    +          - ip: 172.17.0.10
    +            prefix-length: 24
    +          dhcp: false
    +          enabled: true
    +        ipv6:
    +          enabled: false
    +        name: enp6s0.20
    +        state: up
    +        type: vlan
    +        vlan:
    +          base-iface: enp6s0
    +          id: 20
    +          reorder-headers: true
    +      - description: storage vlan interface
    +        ipv4:
    +          address:
    +          - ip: 172.18.0.10
    +            prefix-length: 24
    +          dhcp: false
    +          enabled: true
    +        ipv6:
    +          enabled: false
    +        name: enp6s0.21
    +        state: up
    +        type: vlan
    +        vlan:
    +          base-iface: enp6s0
    +          id: 21
    +          reorder-headers: true
    +      - description: tenant vlan interface
    +        ipv4:
    +          address:
    +          - ip: 172.19.0.10
    +            prefix-length: 24
    +          dhcp: false
    +          enabled: true
    +        ipv6:
    +          enabled: false
    +        name: enp6s0.22
    +        state: up
    +        type: vlan
    +        vlan:
    +          base-iface: enp6s0
    +          id: 22
    +          reorder-headers: true
    +    nodeSelector:
    +      kubernetes.io/hostname: ocp-worker-0
    +      node-role.kubernetes.io/worker: ""
    +
    +
    +
  • +
+
+
+
+
Configuring isolated networks on control plane services
+
+

After the NMState operator creates the desired hypervisor network configuration for isolated networks, you must configure the Red Hat OpenStack Platform (RHOSP) services to use the configured interfaces. You define a NetworkAttachmentDefinition custom resource (CR) for each isolated network. In some clusters, these CRs are managed by the Cluster Network Operator, in which case you use Network CRs instead. For more information, see +Cluster Network Operator in Networking.

+
+
+
Procedure
+
    +
  1. +

    Define a NetworkAttachmentDefinition CR for each isolated network. +For example:

    +
    +
    +
    apiVersion: k8s.cni.cncf.io/v1
    +kind: NetworkAttachmentDefinition
    +metadata:
    +  name: internalapi
    +  namespace: openstack
    +spec:
    +  config: |
    +    {
    +      "cniVersion": "0.3.1",
    +      "name": "internalapi",
    +      "type": "macvlan",
    +      "master": "enp6s0.20",
    +      "ipam": {
    +        "type": "whereabouts",
    +        "range": "172.17.0.0/24",
    +        "range_start": "172.17.0.20",
    +        "range_end": "172.17.0.50"
    +      }
    +    }
    +
    +
    +
    + + + + + +
    + + +Ensure that the interface name and IPAM range match the configuration that you used in the NodeNetworkConfigurationPolicy CRs. +
    +
    +
  2. +
  3. +

    Optional: When reusing existing IP ranges, you can exclude part of the range that is used in the existing deployment by using the exclude parameter in the NetworkAttachmentDefinition pool. For example:

    +
    +
    +
    apiVersion: k8s.cni.cncf.io/v1
    +kind: NetworkAttachmentDefinition
    +metadata:
    +  name: internalapi
    +  namespace: openstack
    +spec:
    +  config: |
    +    {
    +      "cniVersion": "0.3.1",
    +      "name": "internalapi",
    +      "type": "macvlan",
    +      "master": "enp6s0.20",
    +      "ipam": {
    +        "type": "whereabouts",
    +        "range": "172.17.0.0/24",
    +        "range_start": "172.17.0.20", (1)
    +        "range_end": "172.17.0.50", (2)
    +        "exclude": [ (3)
    +          "172.17.0.24/32",
    +          "172.17.0.44/31"
    +        ]
    +      }
    +    }
    +
    +
    +
    + + + + + + + + + + + + + +
    1Defines the start of the IP range.
    2Defines the end of the IP range.
    3Excludes part of the IP range. This example excludes IP addresses 172.17.0.24/32 and 172.17.0.44/31 from the allocation pool.
    +
    +
  4. +
  5. +

    If your RHOSP services require load balancer IP addresses, define the pools for these services in an IPAddressPool CR. For example:

    +
    + + + + + +
    + + +The load balancer IP addresses belong to the same IP range as the control plane services, and are managed by MetalLB. This pool should also be aligned with the RHOSP configuration. +
    +
    +
    +
    +
    - apiVersion: metallb.io/v1beta1
    +  kind: IPAddressPool
    +  spec:
    +    addresses:
    +    - 172.17.0.60-172.17.0.70
    +
    +
    +
    +

    Define IPAddressPool CRs for each isolated network that requires load +balancer IP addresses.

    +
    +
  6. +
  7. +

    Optional: When reusing existing IP ranges, you can exclude part of the range by listing multiple entries in the addresses section of the IPAddressPool. For example:

    +
    +
    +
    - apiVersion: metallb.io/v1beta1
    +  kind: IPAddressPool
    +  spec:
    +    addresses:
    +    - 172.17.0.60-172.17.0.64
    +    - 172.17.0.66-172.17.0.70
    +
    +
    +
    +

    The example above would exclude the 172.17.0.65 address from the allocation +pool.

    +
    +
  8. +
+
+
+
+
Configuring isolated networks on data plane nodes
+
+

Data plane nodes are configured by the OpenStack Operator and your OpenStackDataPlaneNodeSet custom resources (CRs). The OpenStackDataPlaneNodeSet CRs define your desired network configuration for the nodes.

+
+
+

Your Red Hat OpenStack Services on OpenShift (RHOSO) network configuration should reflect the existing Red Hat OpenStack Platform (RHOSP) network setup. You must pull the net_config.yaml files from each RHOSP node and reuse them when you define the OpenStackDataPlaneNodeSet CRs. The format of the configuration does not change, so you can put network templates under edpm_network_config_template variables, either for all nodes or for each node.

+
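A sketch of pulling the rendered configuration from a node over SSH; the /etc/os-net-config/config.yaml path is an assumption that is common on director-deployed nodes, so verify the actual file name and location in your environment:

scp root@<node-ip>:/etc/os-net-config/config.yaml net_config_<node>.yaml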
+
+

To ensure that the latest network configuration is used during the data plane adoption, you should also set edpm_network_config_update: true in the nodeTemplate field of the OpenStackDataPlaneNodeSet CR.

+
+
+
Procedure
+
    +
  1. +

    Configure a NetConfig CR with your desired VLAN tags and IPAM configuration. For example:

    +
    +
    +
    apiVersion: network.openstack.org/v1beta1
    +kind: NetConfig
    +metadata:
    +  name: netconfig
    +spec:
    +  networks:
    +  - name: internalapi
    +    dnsDomain: internalapi.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.17.0.250
    +        start: 172.17.0.100
    +      cidr: 172.17.0.0/24
    +      vlan: 20
    +  - name: storage
    +    dnsDomain: storage.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.18.0.250
    +        start: 172.18.0.100
    +      cidr: 172.18.0.0/24
    +      vlan: 21
    +  - name: tenant
    +    dnsDomain: tenant.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.19.0.250
    +        start: 172.19.0.100
    +      cidr: 172.19.0.0/24
    +      vlan: 22
    +
    +
    +
  2. +
  3. +

    Optional: In the NetConfig CR, list multiple ranges for the allocationRanges field to exclude some of the IP addresses, for example, to accommodate IP addresses that are already consumed by the adopted environment:

    +
    +
    +
    apiVersion: network.openstack.org/v1beta1
    +kind: NetConfig
    +metadata:
    +  name: netconfig
    +spec:
    +  networks:
    +  - name: internalapi
    +    dnsDomain: internalapi.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.17.0.199
    +        start: 172.17.0.100
    +      - end: 172.17.0.250
    +        start: 172.17.0.201
    +      cidr: 172.17.0.0/24
    +      vlan: 20
    +
    +
    +
    +

    This example excludes the 172.17.0.200 address from the pool.

    +
    +
  4. +
+
+
+
+
+
+

Storage requirements

+
+

Storage in a Red Hat OpenStack Platform (RHOSP) deployment refers to the following types:

+
+
+
    +
  • +

    The storage that is needed for the service to run

    +
  • +
  • +

    The storage that the service manages

    +
  • +
+
+
+

Before you can deploy the services in Red Hat OpenStack Services on OpenShift (RHOSO), you must review the storage requirements, plan your Red Hat OpenShift Container Platform (RHOCP) node selection, prepare your RHOCP nodes, and so on.

+
+
+

Storage driver certification

+
+

Before you adopt your Red Hat OpenStack Platform 17.1 deployment to a Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 deployment, confirm that your deployed storage drivers are certified for use with RHOSO 18.0.

+
+
+

For information on software certified for use with RHOSO 18.0, see the Red Hat Ecosystem Catalog.

+
+
+
+

Block Storage service guidelines

+
+

Prepare to adopt your Block Storage service (cinder):

+
+
+
    +
  • +

    Take note of the Block Storage service back ends that you use.

    +
  • +
  • +

    Determine all the transport protocols that the Block Storage service back ends use, such as RBD, iSCSI, FC, NFS, NVMe-TCP, and so on. You must consider them when you place the Block Storage services, to ensure that the right storage transport-related binaries are running on the Red Hat OpenShift Container Platform (RHOCP) nodes. For more information about each storage transport protocol, see RHOCP preparation for Block Storage service adoption.

    +
  • +
  • +

    Deploy a separate Block Storage service volume service for each Block Storage service volume back end.

    +
    +

    For example, if you have an LVM back end and a Ceph back end, you have two entries in cinderVolumes, and you cannot set global defaults for all of the volume services. You must define a service for each of them:

    +
    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  cinder:
    +    enabled: true
    +    template:
    +      cinderVolumes:
    +        lvm:
    +          customServiceConfig: |
    +            [DEFAULT]
    +            debug = True
    +            [lvm]
    +< . . . >
    +        ceph:
    +          customServiceConfig: |
    +            [DEFAULT]
    +            debug = True
    +            [ceph]
    +< . . . >
    +
    +
    +
    + + + + + +
    + + +Check that all configuration options are still valid for RHOSO 18.0 version. Configuration options might be deprecated, removed, or added. This applies to both back-end driver-specific configuration options and other generic options. +
    +
    +
  • +
+
+
+
+

Limitations for adopting the Block Storage service

+
+

Before you begin the Block Storage service (cinder) adoption, review the following limitations:

+
+
+
    +
  • +

    There is no global nodeSelector option for all Block Storage service volumes. You must specify the nodeSelector for each back end.

    +
  • +
  • +

    There are no global customServiceConfig or customServiceConfigSecrets options for all Block Storage service volumes. You must specify these options for each back end.

    +
  • +
  • +

    Support for Block Storage service back ends that require kernel modules that are not included in Red Hat Enterprise Linux is not tested in Red Hat OpenStack Services on OpenShift (RHOSO).

    +
  • +
+
+
+
+

RHOCP preparation for Block Storage service adoption

+
+

Before you deploy Red Hat OpenStack Services on OpenShift (RHOSO) on Red Hat OpenShift Container Platform (RHOCP) nodes, ensure that the networks are ready, decide which RHOCP nodes to restrict, and make any necessary changes to the RHOCP nodes.

+
+
+
+
Node selection
+
+

You might need to restrict the RHOCP nodes where the Block Storage service volume and backup services run.

+
+

An example of when you need to restrict nodes for a specific Block Storage service is when you deploy the Block Storage service with the LVM driver. In that scenario, the LVM data where the volumes are stored only exists on a specific host, so you need to pin the Block Storage volume service to that specific RHOCP node. Running the service on any other RHOCP node does not work. You cannot use the RHOCP host node name to restrict the LVM back end. You need to identify the node by using a unique label, an existing label, or a new label:

+
+
+
+
$ oc label nodes worker0 lvm=cinder-volumes
+
+
+
+
+
apiVersion: core.openstack.org/v1beta1
+kind: OpenStackControlPlane
+metadata:
+  name: openstack
+spec:
+  secret: osp-secret
+  storageClass: local-storage
+  cinder:
+    enabled: true
+    template:
+      cinderVolumes:
+        lvm-iscsi:
+          nodeSelector:
+            lvm: cinder-volumes
+< . . . >
+
+
+
+

For more information about node selection, see About node selectors.

+
+
+ + + + + +
+ + +
+

If your nodes do not have enough local disk space for temporary images, you can use a remote NFS location by setting the extra volumes feature, extraMounts. A sketch follows this note.

+
+
+
+
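The following is a minimal sketch of such an extraMounts entry. The NFS server address, export path, extraVolType value, and the mount path (the default cinder image conversion directory) are assumptions; adjust them, and the propagation list, to match your environment:

apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: openstack
spec:
  extraMounts:
  - extraVol:
    - extraVolType: Nfs
      propagation:
      - CinderVolume
      volumes:
      - name: cinder-conversion
        nfs:
          server: 192.168.122.3
          path: /export/cinder
      mounts:
      - name: cinder-conversion
        mountPath: /var/lib/cinder/conversion
        readOnly: false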
+
Transport protocols
+
+

Some changes to the storage transport protocols might be required for RHOCP:

+
+
    +
  • +

    If you use a MachineConfig to make changes to RHOCP nodes, the nodes reboot.

    +
  • +
  • +

    Check the back-end sections that are listed in the enabled_backends configuration option in your cinder.conf file to determine the enabled storage back-end sections.

    +
  • +
  • +

    Depending on the back end, you can find the transport protocol by viewing the volume_driver or target_protocol configuration options, as illustrated in the sketch after this item.

    +
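    For example, in a cinder.conf fragment similar to the following (the back-end names and values are hypothetical), the RBD driver in the [ceph] section implies the RBD transport, while the [lvm1] section declares iSCSI explicitly through target_protocol:

    [DEFAULT]
    enabled_backends = ceph,lvm1

    [ceph]
    # The driver class itself identifies the transport: RBD in this case
    volume_driver = cinder.volume.drivers.rbd.RBDDriver

    [lvm1]
    volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
    # The transport is declared explicitly for this driver
    target_protocol = iscsi
    target_helper = lioadm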
  • +
  • +

    The iscsid service, multipathd service, and NVMe-TCP kernel modules start automatically on data plane nodes.

    +
    +
    +
    NFS
    +
    +
    +
      +
    • +

      RHOCP connects to NFS back ends without additional changes.

      +
    • +
    +
    +
    +
    Rados Block Device and Red Hat Ceph Storage
    +
    +
    +
      +
    • +

      RHOCP connects to Red Hat Ceph Storage back ends without additional changes. You must provide credentials and configuration files to the services.

      +
    • +
    +
    +
    +
    iSCSI
    +
    +
    +
      +
    • +

      To connect to iSCSI volumes, the iSCSI initiator must run on the RHOCP hosts where the volume and backup services run. The Linux Open-iSCSI initiator does not support network namespaces, so you must run only one instance of the service, which is shared by normal RHOCP usage, the RHOCP CSI plugins, and the RHOSP services.

      +
    • +
    • +

      If you are not already running iscsid on the RHOCP nodes, then you must apply a MachineConfig. For example:

      +
      +
      +
      apiVersion: machineconfiguration.openshift.io/v1
      +kind: MachineConfig
      +metadata:
      +  labels:
      +    machineconfiguration.openshift.io/role: worker
      +    service: cinder
      +  name: 99-master-cinder-enable-iscsid
      +spec:
      +  config:
      +    ignition:
      +      version: 3.2.0
      +    systemd:
      +      units:
      +      - enabled: true
      +        name: iscsid.service
      +
      +
      +
    • +
    • +

      If you use labels to restrict the nodes where the Block Storage services run, you must use a MachineConfigPool to limit the effects of the MachineConfig to the nodes where your services might run, as in the sketch after this item. For more information, see About node selectors.

      +
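      A minimal MachineConfigPool sketch, following the standard RHOCP custom-pool pattern, is shown below. The pool name, the extra role value (cinder-storage), and the node-role label are assumptions for illustration; you must also label the target nodes with that node role and change the machineconfiguration.openshift.io/role label on the MachineConfig from worker to the custom role so that it applies only to this pool:

      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfigPool
      metadata:
        name: cinder-storage
      spec:
        machineConfigSelector:
          matchExpressions:
          - key: machineconfiguration.openshift.io/role
            operator: In
            values:
            - worker
            - cinder-storage
        nodeSelector:
          matchLabels:
            node-role.kubernetes.io/cinder-storage: ""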
    • +
    • +

      If you are using a single node deployment to test the process, replace worker with master in the MachineConfig.

      +
    • +
    • +

      For production deployments that use iSCSI volumes, configure multipathing for better I/O.

      +
    • +
    +
    +
    +
    FC
    +
    +
    +
      +
    • +

      The Block Storage service volume and Block Storage service backup services must run in an RHOCP host that has host bus adapters (HBAs). If some nodes do not have HBAs, then use labels to restrict where these services run. For more information, see About node selectors.

      +
    • +
    • +

      If you have virtualized RHOCP clusters that use FC, you must expose the host HBAs inside the virtual machines.

      +
    • +
    • +

      For production deployments that use FC volumes, configure multipathing for better I/O.

      +
    • +
    +
    +
    +
    NVMe-TCP
    +
    +
    +
      +
    • +

      To connect to NVMe-TCP volumes, load NVMe-TCP kernel modules on the RHOCP hosts.

      +
    • +
    • +

      If you do not already load the nvme-fabrics module on the RHOCP nodes where the volume and backup services are going to run, then you must apply a MachineConfig. For example:

      +
      +
      +
      apiVersion: machineconfiguration.openshift.io/v1
      +kind: MachineConfig
      +metadata:
      +  labels:
      +    machineconfiguration.openshift.io/role: worker
      +    service: cinder
      +  name: 99-master-cinder-load-nvme-fabrics
      +spec:
      +  config:
      +    ignition:
      +      version: 3.2.0
      +    storage:
      +      files:
      +        - path: /etc/modules-load.d/nvme_fabrics.conf
      +          overwrite: false
      +          # Mode must be decimal, this is 0644
      +          mode: 420
      +          user:
      +            name: root
      +          group:
      +            name: root
      +          contents:
      +            # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397.
      +            # This is the rfc2397 text/plain string format
      +            source: data:,nvme-fabrics
      +
      +
      +
    • +
    • +

      If you use labels to restrict the nodes where Block Storage services run, use a MachineConfigPool to limit the effects of the MachineConfig to the nodes where your services run. For more information, see About node selectors.

      +
    • +
    • +

      If you use a single node deployment to test the process, replace worker with master in the MachineConfig.

      +
    • +
    • +

      Only load the nvme-fabrics module because it loads the transport-specific modules, such as TCP, RDMA, or FC, as needed.

      +
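      As a quick check after a node reboots, you can confirm from a debug shell that the fabrics module, and only the transport it pulled in, is loaded. This uses standard oc debug and Linux tooling; the node name worker0 is an example:

      $ oc debug node/worker0 -- chroot /host sh -c 'lsmod | grep nvme'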
    • +
    • +

      For production deployments that use NVMe-TCP volumes, use multipathing for better I/O. For NVMe-TCP volumes, RHOCP uses native multipathing, called ANA.

      +
    • +
    • +

      After the RHOCP nodes reboot and load the nvme-fabrics module, you can confirm that the operating system is configured and that it supports ANA by checking the host:

      +
      +
      +
      $ cat /sys/module/nvme_core/parameters/multipath
      +
      +
      +
      + + + + + +
      + + +ANA does not use the Linux Multipathing Device Mapper, but multipathd must run on the Compute nodes for the Compute service (nova) to be able to use multipathing. Multipathing is automatically configured on data plane nodes when they are provisioned. +
      +
      +
    • +
    +
    +
    +
    Multipathing
    +
    +
    +
      +
    • +

      Multipathing is recommended for iSCSI and FC protocols. To configure multipathing on these protocols, you perform the following tasks:

      +
      +
        +
      • +

        Prepare the RHOCP hosts

        +
      • +
      • +

        Configure the Block Storage services

        +
      • +
      • +

        Prepare the Compute service nodes

        +
      • +
      • +

        Configure the Compute service

        +
      • +
      +
      +
    • +
    • +

      To prepare the RHOCP hosts, ensure that the Linux Multipath Device Mapper is configured and running on the RHOCP hosts by using MachineConfig. For example:

      +
      +
      +
      # Includes the /etc/multipathd.conf contents and the systemd unit changes
      +apiVersion: machineconfiguration.openshift.io/v1
      +kind: MachineConfig
      +metadata:
      +  labels:
      +    machineconfiguration.openshift.io/role: worker
      +    service: cinder
      +  name: 99-master-cinder-enable-multipathd
      +spec:
      +  config:
      +    ignition:
      +      version: 3.2.0
      +    storage:
      +      files:
      +        - path: /etc/multipath.conf
      +          overwrite: false
      +          # Mode must be decimal, this is 0600
      +          mode: 384
      +          user:
      +            name: root
      +          group:
      +            name: root
      +          contents:
      +            # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397.
      +            # This is the rfc2397 text/plain string format
      +            source: data:,defaults%20%7B%0A%20%20user_friendly_names%20no%0A%20%20recheck_wwid%20yes%0A%20%20skip_kpartx%20yes%0A%20%20find_multipaths%20yes%0A%7D%0A%0Ablacklist%20%7B%0A%7D
      +    systemd:
      +      units:
      +      - enabled: true
      +        name: multipathd.service
      +
      +
      +
    • +
    • +

      If you use labels to restrict the nodes where Block Storage services run, you need to use a MachineConfigPool to limit the effects of the MachineConfig to only the nodes where your services run. For more information, see About node selectors.

      +
    • +
    • +

      If you are using a single node deployment to test the process, replace worker with master in the MachineConfig.

      +
    • +
    • +

      The Block Storage service volume and backup services are configured by default to use multipathing. If you need to control this behavior explicitly, see the sketch after this item.

      +
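      If you ever need to set the behavior explicitly, for example to disable it for a single back end, the relevant cinder option is use_multipath_for_image_xfer. The following fragment is a sketch only; the back-end name is an assumption, and by default you do not need to add this configuration:

      apiVersion: core.openstack.org/v1beta1
      kind: OpenStackControlPlane
      metadata:
        name: openstack
      spec:
        cinder:
          template:
            cinderVolumes:
              lvm-iscsi:
                customServiceConfig: |
                  [lvm-iscsi]
                  # Explicitly control multipath usage for this back end
                  use_multipath_for_image_xfer = false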
    • +
    +
    +
    +
    +
    +
  • +
+
+
+
+
+
+
+

Converting the Block Storage service configuration

+
+

In your previous deployment, you use the same cinder.conf file for all the services. To prepare your Block Storage service (cinder) configuration for adoption, split this single-file configuration into individual configurations for each Block Storage service. Review the following information to guide you in converting your previous configuration:

+
+
+
    +
  • +

    Determine which part of the configuration is generic for all the Block Storage services and remove anything that would change when deployed in Red Hat OpenShift Container Platform (RHOCP), such as the connection option in the [database] section, the transport_url and log_dir options in the [DEFAULT] section, and the whole [coordination] and [barbican] sections. The remaining generic configuration goes into the customServiceConfig option, or into a Secret custom resource (CR) that is then used in the customServiceConfigSecrets section, at the cinder: template: level.

    +
  • +
  • +

    Determine if there is a scheduler-specific configuration and add it to the customServiceConfig option in cinder: template: cinderScheduler.

    +
  • +
  • +

    Determine if there is an API-specific configuration and add it to the customServiceConfig option in cinder: template: cinderAPI.

    +
  • +
  • +

    If the Block Storage service backup is deployed, add the Block Storage service backup configuration options to the customServiceConfig option, or to a Secret CR that you can add to the customServiceConfigSecrets section, at the cinder: template: cinderBackup: level. Remove the host configuration in the [DEFAULT] section to support multiple replicas later. A combined sketch of the generic, API, scheduler, and backup configuration follows this item.

    +
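    The following sketch, with illustrative option values, shows how the split can look: the shared options sit at the template level and the API, scheduler, and backup specific options sit on their own services:

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    metadata:
      name: openstack
    spec:
      cinder:
        enabled: true
        template:
          # Generic options shared by all Block Storage services
          customServiceConfig: |
            [DEFAULT]
            debug = True
          cinderAPI:
            customServiceConfig: |
              [DEFAULT]
              osapi_volume_workers = 3
          cinderScheduler:
            customServiceConfig: |
              [DEFAULT]
              scheduler_max_attempts = 3
          cinderBackup:
            customServiceConfig: |
              [DEFAULT]
              backup_driver = cinder.backup.drivers.ceph.CephBackupDriver
              backup_ceph_pool = backups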
  • +
  • +

    Determine the individual volume back-end configuration for each of the drivers. The configuration is in the specific driver section, and it includes the [backend_defaults] section and FC zoning sections if you use them. The Block Storage service operator does not support a global customServiceConfig option for all volume services. Each back end has its own section under cinder: template: cinderVolumes, and the configuration goes in the customServiceConfig option or in a Secret CR and is then used in the customServiceConfigSecrets section.

    +
  • +
  • +

    If any of the Block Storage service volume drivers require a custom vendor image, find the location of the image in the Red Hat Ecosystem Catalog, and create or modify an OpenStackVersion CR to specify the custom image by using the key from the cinderVolumes section.

    +
    +

    For example, if you have the following configuration:

    +
    +
    +
    +
    spec:
    +  cinder:
    +    enabled: true
    +    template:
    +      cinderVolume:
    +        pure:
    +          customServiceConfigSecrets:
    +            - openstack-cinder-pure-cfg
    +< . . . >
    +
    +
    +
    +

    Then the OpenStackVersion CR that describes the container image for that back end looks like the following example:

    +
    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackVersion
    +metadata:
    +  name: openstack
    +spec:
    +  customContainerImages:
    +    cinderVolumeImages:
    +      pure: registry.connect.redhat.com/purestorage/openstack-cinder-volume-pure-rhosp-18-0
    +
    +
    +
    + + + + + +
    + + +The name of the OpenStackVersion must match the name of your OpenStackControlPlane CR. +
    +
    +
  • +
  • +

    If your Block Storage services use external files, for example, for a custom policy, or to store credentials or SSL certificate authority bundles to connect to a storage array, make those files available to the right containers. Use Secrets or ConfigMap to store the information in RHOCP and then in the extraMounts key. For example, for Red Hat Ceph Storage credentials that are stored in a Secret called ceph-conf-files, you patch the top-level extraMounts key in the OpenstackControlPlane CR:

    +
    +
    +
    spec:
    +  extraMounts:
    +  - extraVol:
    +    - extraVolType: Ceph
    +      mounts:
    +      - mountPath: /etc/ceph
    +        name: ceph
    +        readOnly: true
    +      propagation:
    +      - CinderVolume
    +      - CinderBackup
    +      - Glance
    +      volumes:
    +      - name: ceph
    +        projected:
    +          sources:
    +          - secret:
    +              name: ceph-conf-files
    +
    +
    +
  • +
  • +

    For a service-specific file, such as the API policy, you add the configuration on the service itself. In the following example, you include the CinderAPI configuration that references the policy you are adding from a ConfigMap called my-cinder-conf that has a policy key with the contents of the policy. A sketch that shows one way to create that ConfigMap follows the example:

    +
    +
    +
    spec:
    +  cinder:
    +    enabled: true
    +    template:
    +      cinderAPI:
    +        customServiceConfig: |
    +           [oslo_policy]
    +           policy_file=/etc/cinder/api/policy.yaml
    +      extraMounts:
    +      - extraVol:
    +        - extraVolType: Ceph
    +          mounts:
    +          - mountPath: /etc/cinder/api
    +            name: policy
    +            readOnly: true
    +          propagation:
    +          - CinderAPI
    +          volumes:
    +          - name: policy
    +            projected:
    +              sources:
    +              - configMap:
    +                  name: my-cinder-conf
    +                  items:
    +                    - key: policy
    +                      path: policy.yaml
    +
    +
    +
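    One way to create that ConfigMap from an existing policy file is with oc create configmap and the --from-file option; the local file name policy.yaml is an assumption:

    $ oc create configmap my-cinder-conf -n openstack --from-file=policy=policy.yaml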
  • +
+
+
+
+

Changes to CephFS through NFS

+
+

Before you begin the adoption, review the following information to understand the changes to CephFS through NFS between Red Hat OpenStack Platform (RHOSP) 17.1 and Red Hat OpenStack Services on OpenShift (RHOSO) 18.0:

+
+
+
    +
  • +

    If the RHOSP 17.1 deployment uses CephFS through NFS as a back end for Shared File Systems service (manila), you cannot directly import the ceph-nfs service on the RHOSP Controller nodes into RHOSO 18.0. In RHOSO 18.0, the Shared File Systems service only supports using a clustered NFS service that is directly managed on the Red Hat Ceph Storage cluster. Adoption with the ceph-nfs service involves a data path disruption to existing NFS clients.

    +
  • +
  • +

    On RHOSP 17.1, Pacemaker controls the high availability of the ceph-nfs service. This service is assigned a Virtual IP (VIP) address that is also managed by Pacemaker. The VIP is typically created on an isolated StorageNFS network. The Controller nodes have ordering and collocation constraints established between this VIP, ceph-nfs, and the Shared File Systems service (manila) share manager service. Prior to adopting Shared File Systems service, you must adjust the Pacemaker ordering and collocation constraints to separate the share manager service. This establishes ceph-nfs with its VIP as an isolated, standalone NFS service that you can decommission after completing the RHOSO adoption.

    +
  • +
  • +

    In Red Hat Ceph Storage 7, a native clustered Ceph NFS service has to be deployed on the Red Hat Ceph Storage cluster by using the Ceph Orchestrator prior to adopting the Shared File Systems service. This NFS service eventually replaces the standalone NFS service from RHOSP 17.1 in your deployment. When the Shared File Systems service is adopted into the RHOSO 18.0 environment, it establishes all the existing exports and client restrictions on the new clustered Ceph NFS service. Clients can continue to read and write data on existing NFS shares, and are not affected until the old standalone NFS service is decommissioned. After the service is decommissioned, you can re-mount the same share from the new clustered Ceph NFS service during a scheduled downtime.

    +
  • +
  • +

    To ensure that NFS users are not required to make any networking changes to their existing workloads, assign an IP address from the same isolated StorageNFS network to the clustered Ceph NFS service. NFS users only need to discover and re-mount their shares by using new export paths. When the adoption is complete, RHOSO users can query the Shared File Systems service API to list the export locations on existing shares to identify the preferred paths to mount these shares. These preferred paths correspond to the new clustered Ceph NFS service in contrast to other non-preferred export paths that continue to be displayed until the old isolated, standalone NFS service is decommissioned.

    +
  • +
+
+
+

For more information on setting up a clustered NFS service, see Creating an NFS Ganesha cluster.

+
+
+
+
+

Comparing configuration files between deployments

+
+

To help you manage the configuration for your director and Red Hat OpenStack Platform (RHOSP) services, you can compare the configuration files between your director deployment and the Red Hat OpenStack Services on OpenShift (RHOSO) cloud by using the os-diff tool.

+
+
+
Prerequisites
+
    +
  • +

    Golang is installed and configured on your environment:

    +
    +
    +
    dnf install -y golang-github-openstack-k8s-operators-os-diff
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Configure the /etc/os-diff/os-diff.cfg file and the /etc/os-diff/ssh.config file according to your environment. To allow os-diff to connect to your clouds and pull files from the services that you describe in the config.yaml file, you must set the following options in the os-diff.cfg file:

    +
    +
    +
    [Default]
    +
    +local_config_dir=/tmp/
    +service_config_file=config.yaml
    +
    +[Tripleo]
    +
    +ssh_cmd=ssh -F ssh.config (1)
    +director_host=standalone (2)
    +container_engine=podman
    +connection=ssh
    +remote_config_path=/tmp/tripleo
    +local_config_path=/tmp/
    +
    +[Openshift]
    +
    +ocp_local_config_path=/tmp/ocp
    +connection=local
    +ssh_cmd=""
    +
    +
    +
    + + + + + + + + + +
    1Instructs os-diff to access your director host through SSH. The default value is ssh -F ssh.config. However, you can set the value without an ssh.config file, for example, ssh -i /home/user/.ssh/id_rsa stack@my.undercloud.local.
    2The host to use to access your cloud. The podman/docker binary must be installed on this host and be allowed to interact with the running containers. You can leave this key blank.
    +
    +
  2. +
  3. +

    If you use a host file to connect to your cloud, configure the ssh.config file to allow os-diff to access your RHOSP environment, for example:

    +
    +
    +
    Host *
    +    IdentitiesOnly yes
    +
    +Host virthost
    +    Hostname virthost
    +    IdentityFile ~/.ssh/id_rsa
    +    User root
    +    StrictHostKeyChecking no
    +    UserKnownHostsFile=/dev/null
    +
    +
    +Host standalone
    +    Hostname standalone
    +    IdentityFile <path to SSH key>
    +    User root
    +    StrictHostKeyChecking no
    +    UserKnownHostsFile=/dev/null
    +
    +Host crc
    +    Hostname crc
    +    IdentityFile ~/.ssh/id_rsa
    +    User stack
    +    StrictHostKeyChecking no
    +    UserKnownHostsFile=/dev/null
    +
    +
    +
    +
      +
    • +

      Replace <path to SSH key> with the path to your SSH key. You must provide a value for IdentityFile to get full working access to your RHOSP environment.

      +
    • +
    +
    +
  4. +
  5. +

    If you use an inventory file to connect to your cloud, generate the ssh.config file from your Ansible inventory, for example, tripleo-ansible-inventory.yaml file:

    +
    +
    +
    $ os-diff configure -i tripleo-ansible-inventory.yaml -o ssh.config --yaml
    +
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Test your connection:

    +
    +
    +
    $ ssh -F ssh.config standalone
    +
    +
    +
  • +
+
+
+
+
+
+

Migrating TLS-e to the RHOSO deployment

+
+
+

If you enabled TLS everywhere (TLS-e) in your Red Hat OpenStack Platform (RHOSP) 17.1 deployment, you must migrate TLS-e to the Red Hat OpenStack Services on OpenShift (RHOSO) deployment.

+
+
+

The RHOSO deployment uses the cert-manager operator to issue, track, and renew the certificates. In the following procedure, you extract the CA signing certificate from the FreeIPA instance that you use to provide the certificates in the RHOSP environment, and then import them into cert-manager in the RHOSO environment. As a result, you minimize the disruption on the Compute nodes because you do not need to install a new chain of trust.

+
+
+

You then decommission the previous FreeIPA node and no longer use it to issue certificates. This might not be possible if you use the IPA server to issue certificates for non-RHOSP systems.

+
+
+ + + + + +
+ + +
+
    +
  • +

    The following procedure was reproduced on a FreeIPA 4.10.1 server. The location of the files and directories might change depending on the version.

    +
  • +
  • +

    If the signing keys are stored in a hardware security module (HSM) instead of an NSS shared database (NSSDB), and the keys are retrievable, special HSM utilities might be required.

    +
  • +
+
+
+
+
+
Prerequisites
+
    +
  • +

    Your RHOSP deployment is using TLS-e.

    +
  • +
  • +

    Ensure that the back-end services on the new deployment are not started yet.

    +
  • +
  • +

    Define the following shell variables. The values are examples and refer to a single-node standalone director deployment. Replace these example values with values that are correct for your environment:

    +
    +
    +
    IPA_SSH="ssh -i <path_to_ssh_key> root@<freeipa-server-ip-address>"
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    To locate the CA certificate and key, list all the certificates inside your NSSDB:

    +
    +
    +
    $IPA_SSH certutil -L -d /etc/pki/pki-tomcat/alias
    +
    +
    +
    +
      +
    • +

      The -L option lists all certificates.

      +
    • +
    • +

      The -d option specifies where the certificates are stored.

      +
      +

      The command produces an output similar to the following example:

      +
      +
      +
      +
      Certificate Nickname                                         Trust Attributes
      +                                                             SSL,S/MIME,JAR/XPI
      +
      +caSigningCert cert-pki-ca                                    CTu,Cu,Cu
      +ocspSigningCert cert-pki-ca                                  u,u,u
      +Server-Cert cert-pki-ca                                      u,u,u
      +subsystemCert cert-pki-ca                                    u,u,u
      +auditSigningCert cert-pki-ca                                 u,u,Pu
      +
      +
      +
    • +
    +
    +
  2. +
  3. +

    Export the certificate and key from the /etc/pki/pki-tomcat/alias directory. The following example uses the caSigningCert cert-pki-ca certificate:

    +
    +
    +
    $IPA_SSH pk12util -o /tmp/freeipa.p12 -n 'caSigningCert\ cert-pki-ca' -d /etc/pki/pki-tomcat/alias -k /etc/pki/pki-tomcat/alias/pwdfile.txt -w /etc/pki/pki-tomcat/alias/pwdfile.txt
    +
    +
    +
    + + + + + +
    + + +
    +

    The command generates a P12 file with both the certificate and the key. The /etc/pki/pki-tomcat/alias/pwdfile.txt file contains the password that protects the key. You can use the password to both extract the key and generate the new file, /tmp/freeipa.p12. You can also choose another password. If you choose a different password for the new file, replace the parameter of the -w option, or use the -W option followed by the password, in clear text.

    +
    +
    +

    With that file, you can also get the certificate and the key by using the openssl pkcs12 command, as shown in the sketch below.

    +
    +
    +
    +
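    For example, the following sketch extracts the certificate and the key into local PEM files; the output file names are arbitrary and the commands mirror the oc patch pipeline that is used later in this procedure:

    $IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 \
        -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt \
        -nokeys | openssl x509 > /tmp/rootca.crt

    $IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 \
        -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt \
        -nocerts -noenc | openssl rsa > /tmp/rootca.key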
  4. +
  5. +

    Create the secret that contains the root CA:

    +
    +
    +
    $ oc create secret generic rootca-internal -n openstack
    +
    +
    +
  6. +
  7. +

    Import the certificate and the key from FreeIPA:

    +
    +
    +
    $ oc patch secret rootca-internal -n openstack -p="{\"data\":{\"ca.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}"
    +
    +$ oc patch secret rootca-internal -n openstack -p="{\"data\":{\"tls.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}"
    +
    +$ oc patch secret rootca-internal -n openstack -p="{\"data\":{\"tls.key\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nocerts -noenc | openssl rsa | base64 -w 0`\"}}"
    +
    +
    +
  8. +
  9. +

    Create the cert-manager issuer and reference the secret:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: cert-manager.io/v1
    +kind: Issuer
    +metadata:
    +  name: rootca-internal
    +  namespace: openstack
    +  labels:
    +    osp-rootca-issuer-public: ""
    +    osp-rootca-issuer-internal: ""
    +    osp-rootca-issuer-libvirt: ""
    +    osp-rootca-issuer-ovn: ""
    +spec:
    +  ca:
    +    secretName: rootca-internal
    +EOF
    +
    +
    +
  10. +
  11. +

    Delete the previously created p12 files:

    +
    +
    +
    $IPA_SSH rm /tmp/freeipa.p12
    +
    +
    +
  12. +
+
+
+
Verification
+
    +
  • +

    Verify that the necessary resources are created:

    +
    +
    +
    $ oc get issuers -n openstack
    +
    +
    +
    +
    +
    $ oc get secret rootca-internal -n openstack -o yaml
    +
    +
    +
  • +
+
+
+ + + + + +
+ + +After the adoption is complete, the cert-manager operator issues new certificates and updates the secrets with the new certificates. As a result, the pods on the control plane automatically restart in order to obtain the new certificates. On the data plane, you must manually initiate a new deployment and restart certain processes to use the new certificates. The old certificates remain active until both the control plane and data plane obtain the new certificates. +
+
+
+
+
+

Migrating databases to the control plane

+
+
+

To begin creating the control plane, enable back-end services and import the databases from your original Red Hat OpenStack Platform 17.1 deployment.

+
+
+

Retrieving topology-specific service configuration

+
+

Before you migrate your databases to the Red Hat OpenStack Services on OpenShift (RHOSO) control plane, retrieve the topology-specific service configuration from your Red Hat OpenStack Platform (RHOSP) environment. You need this configuration for the following reasons:

+
+
+
    +
  • +

    To check your current database for inaccuracies

    +
  • +
  • +

    To ensure that you have the data you need before the migration

    +
  • +
  • +

    To compare your RHOSP database with the adopted RHOSO database

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Define the following shell variables. Replace the example values with values that are correct for your environment:

    +
    +
    +
    CONTROLLER1_SSH="ssh -i *<path to SSH key>* root@*<node IP>*"
    +MARIADB_IMAGE=registry.redhat.io/rhosp-dev-preview/openstack-mariadb-rhel9:18.0
    +SOURCE_MARIADB_IP=172.17.0.2
    +SOURCE_DB_ROOT_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }')
    +MARIADB_CLIENT_ANNOTATIONS='--annotations=k8s.v1.cni.cncf.io/networks=internalapi'
    +
    +
    +
    +

    To get the value to set SOURCE_MARIADB_IP, query the puppet-generated configurations in a Controller node:

    +
    +
    +
    +
    $ grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bind
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Export the shell variables for the following outputs and test the connection to the RHOSP database:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_DATABASES=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "$SOURCE_MARIADB_IP" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" -e 'SHOW databases;')
    +echo "$PULL_OPENSTACK_CONFIGURATION_DATABASES"
    +
    +
    +
    + + + + + +
    + + +The nova, nova_api, and nova_cell0 databases are included in the same database host. +
    +
    +
  2. +
  3. +

    Run mysqlcheck on the RHOSP database to check for inaccuracies:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysqlcheck --all-databases -h $SOURCE_MARIADB_IP -u root -p"$SOURCE_DB_ROOT_PASSWORD" | grep -v OK)
    +echo "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK"
    +
    +
    +
  4. +
  5. +

    Get the Compute service (nova) cell mappings:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" nova_api -e \
    +    'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')
    +echo "$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS"
    +
    +
    +
  6. +
  7. +

    Get the hostnames of the registered Compute services:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "$SOURCE_MARIADB_IP" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" nova_api -e \
    +    "select host from nova.services where services.binary='nova-compute';")
    +echo "$PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES"
    +
    +
    +
  8. +
  9. +

    Get the list of the mapped Compute service cells:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS=$($CONTROLLER1_SSH sudo podman exec -it nova_api nova-manage cell_v2 list_cells)
    +echo "$PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS"
    +
    +
    +
    + + + + + +
    + + +After the RHOSP control plane services are shut down, if any of the exported values are lost, re-running the command fails because the control plane services are no longer running on the source cloud, and the data cannot be retrieved. To avoid data loss, preserve the exported values in an environment file before shutting down the control plane services. +
    +
    +
  10. +
  11. +

    If neutron-sriov-nic-agent agents are running in your RHOSP deployment, get the configuration to use for the data plane adoption:

    +
    +
    +
    SRIOV_AGENTS=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "$SOURCE_MARIADB_IP" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" ovs_neutron -e \
    +    "select host, configurations from agents where agents.binary='neutron-sriov-nic-agent';")
    +
    +
    +
  12. +
  13. +

    Store the exported variables for future use:

    +
    +
    +
    $ cat >~/.source_cloud_exported_variables <<EOF
    +PULL_OPENSTACK_CONFIGURATION_DATABASES="$PULL_OPENSTACK_CONFIGURATION_DATABASES"
    +PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK="$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK"
    +PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS="$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS"
    +PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES="$PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES"
    +PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS="$PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS"
    +SRIOV_AGENTS="$SRIOV_AGENTS"
    +EOF
    +
    +
    +
  14. +
+
+
+
+

Deploying back-end services

+
+

Create the OpenStackControlPlane custom resource (CR) with the basic back-end services deployed, and disable all the Red Hat OpenStack Platform (RHOSP) services. This CR is the foundation of the control plane.

+
+
+
Prerequisites
+
    +
  • +

    The cloud that you want to adopt is running, and it is on the RHOSP 17.1 release.

    +
  • +
  • +

    All control plane and data plane hosts of the source cloud are running, and continue to run throughout the adoption procedure.

    +
  • +
  • +

    The openstack-operator is deployed, but OpenStackControlPlane is not deployed.

    +
  • +
  • +

    Install the OpenStack Operators. For more information, see Installing and preparing the Operators in Deploying Red Hat OpenStack Services on OpenShift.

    +
  • +
  • +

    If you enabled TLS everywhere (TLS-e) on the RHOSP environment, you must copy the tls root CA from the RHOSP environment to the rootca-internal issuer.

    +
  • +
  • +

    There are free PVs available for MariaDB and RabbitMQ.

    +
  • +
  • +

    Set the desired admin password for the control plane deployment. This can be the admin password from your original deployment or a different password:

    +
    +
    +
    ADMIN_PASSWORD=SomePassword
    +
    +
    +
    +

    To use the existing RHOSP deployment password:

    +
    +
    +
    +
    ADMIN_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' AdminPassword:' | awk -F ': ' '{ print $2; }')
    +
    +
    +
  • +
  • +

    Set the service password variables to match the original deployment. Database passwords can differ in the control plane environment, but you must synchronize the service account passwords.

    +
    +

    For example, in developer environments with director Standalone, the passwords can be extracted:

    +
    +
    +
    +
    AODH_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' AodhPassword:' | awk -F ': ' '{ print $2; }')
    +BARBICAN_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' BarbicanPassword:' | awk -F ': ' '{ print $2; }')
    +CEILOMETER_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' CeilometerPassword:' | awk -F ': ' '{ print $2; }')
    +CINDER_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' CinderPassword:' | awk -F ': ' '{ print $2; }')
    +GLANCE_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' GlancePassword:' | awk -F ': ' '{ print $2; }')
    +HEAT_AUTH_ENCRYPTION_KEY=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' HeatAuthEncryptionKey:' | awk -F ': ' '{ print $2; }')
    +HEAT_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' HeatPassword:' | awk -F ': ' '{ print $2; }')
    +IRONIC_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' IronicPassword:' | awk -F ': ' '{ print $2; }')
    +MANILA_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' ManilaPassword:' | awk -F ': ' '{ print $2; }')
    +NEUTRON_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' NeutronPassword:' | awk -F ': ' '{ print $2; }')
    +NOVA_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' NovaPassword:' | awk -F ': ' '{ print $2; }')
    +OCTAVIA_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' OctaviaPassword:' | awk -F ': ' '{ print $2; }')
    +PLACEMENT_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' PlacementPassword:' | awk -F ': ' '{ print $2; }')
    +SWIFT_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' SwiftPassword:' | awk -F ': ' '{ print $2; }')
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Ensure that you are using the Red Hat OpenShift Container Platform (RHOCP) namespace where you want the control plane to be deployed:

    +
    +
    +
    $ oc project openstack
    +
    +
    +
  2. +
  3. +

    Create the RHOSP secret. For more information, see Providing secure access to the Red Hat OpenStack Services on OpenShift services in Deploying Red Hat OpenStack Services on OpenShift.

    +
  4. +
  5. +

    If the $ADMIN_PASSWORD is different from the password you set in osp-secret, amend the AdminPassword key in the osp-secret:

    +
    +
    +
    $ oc set data secret/osp-secret "AdminPassword=$ADMIN_PASSWORD"
    +
    +
    +
  6. +
  7. +

    Set service account passwords in osp-secret to match the service account passwords from the original deployment:

    +
    +
    +
    $ oc set data secret/osp-secret "AodhPassword=$AODH_PASSWORD"
    +$ oc set data secret/osp-secret "BarbicanPassword=$BARBICAN_PASSWORD"
    +$ oc set data secret/osp-secret "CeilometerPassword=$CEILOMETER_PASSWORD"
    +$ oc set data secret/osp-secret "CinderPassword=$CINDER_PASSWORD"
    +$ oc set data secret/osp-secret "GlancePassword=$GLANCE_PASSWORD"
    +$ oc set data secret/osp-secret "HeatAuthEncryptionKey=$HEAT_AUTH_ENCRYPTION_KEY"
    +$ oc set data secret/osp-secret "HeatPassword=$HEAT_PASSWORD"
    +$ oc set data secret/osp-secret "IronicPassword=$IRONIC_PASSWORD"
    +$ oc set data secret/osp-secret "IronicInspectorPassword=$IRONIC_PASSWORD"
    +$ oc set data secret/osp-secret "ManilaPassword=$MANILA_PASSWORD"
    +$ oc set data secret/osp-secret "MetadataSecret=$METADATA_SECRET"
    +$ oc set data secret/osp-secret "NeutronPassword=$NEUTRON_PASSWORD"
    +$ oc set data secret/osp-secret "NovaPassword=$NOVA_PASSWORD"
    +$ oc set data secret/osp-secret "OctaviaPassword=$OCTAVIA_PASSWORD"
    +$ oc set data secret/osp-secret "PlacementPassword=$PLACEMENT_PASSWORD"
    +$ oc set data secret/osp-secret "SwiftPassword=$SWIFT_PASSWORD"
    +
    +
    +
  8. +
  9. +

    If you enabled TLS-e in your RHOSP environment, in the spec:tls section, set the enabled parameter to true:

    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  tls:
    +    podLevel:
    +      enabled: true
    +      internal:
    +        ca:
    +          customIssuer: rootca-internal
    +      libvirt:
    +        ca:
    +          customIssuer: rootca-internal
    +      ovn:
    +        ca:
    +          customIssuer: rootca-internal
    +    ingress:
    +      ca:
    +        customIssuer: rootca-internal
    +      enabled: true
    +
    +
    +
  10. +
  11. +

    If you did not enable TLS-e, in the spec:tls section, set the enabled parameter to false:

    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  tls:
    +    podLevel:
    +      enabled: false
    +    ingress:
    +      enabled: false
    +
    +
    +
  12. +
  13. +

    Deploy the OpenStackControlPlane CR. Ensure that you only enable the DNS, MariaDB, Memcached, and RabbitMQ services. All other services must be disabled:

    +
    +
    +
    oc apply -f - <<EOF
    +apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  secret: osp-secret
    +  storageClass: local-storage (1)
    +
    +  barbican:
    +    enabled: false
    +    template:
    +      barbicanAPI: {}
    +      barbicanWorker: {}
    +      barbicanKeystoneListener: {}
    +
    +  cinder:
    +    enabled: false
    +    template:
    +      cinderAPI: {}
    +      cinderScheduler: {}
    +      cinderBackup: {}
    +      cinderVolumes: {}
    +
    +  dns:
    +    template:
    +      override:
    +        service:
    +          metadata:
    +            annotations:
    +              metallb.universe.tf/address-pool: ctlplane
    +              metallb.universe.tf/allow-shared-ip: ctlplane
    +              metallb.universe.tf/loadBalancerIPs: 192.168.122.80
    +          spec:
    +            type: LoadBalancer
    +      options:
    +      - key: server
    +        values:
    +        - 192.168.122.1
    +      replicas: 1
    +
    +  glance:
    +    enabled: false
    +    template:
    +      glanceAPIs: {}
    +
    +  heat:
    +    enabled: false
    +    template: {}
    +
    +  horizon:
    +    enabled: false
    +    template: {}
    +
    +  ironic:
    +    enabled: false
    +    template:
    +      ironicConductors: []
    +
    +  keystone:
    +    enabled: false
    +    template: {}
    +
    +  manila:
    +    enabled: false
    +    template:
    +      manilaAPI: {}
    +      manilaScheduler: {}
    +      manilaShares: {}
    +
    +  mariadb:
    +    enabled: false
    +    templates: {}
    +
    +  galera:
    +    enabled: true
    +    templates:
    +      openstack:
    +        secret: osp-secret
    +        replicas: 3
    +        storageRequest: 500M
    +      openstack-cell1:
    +        secret: osp-secret
    +        replicas: 3
    +        storageRequest: 500M
    +
    +  memcached:
    +    enabled: true
    +    templates:
    +      memcached:
    +        replicas: 3
    +
    +  neutron:
    +    enabled: false
    +    template: {}
    +
    +  nova:
    +    enabled: false
    +    template: {}
    +
    +  ovn:
    +    enabled: false
    +    template:
    +      ovnController:
    +        networkAttachment: tenant
    +        nodeSelector:
    +          node: non-existing-node-name
    +      ovnNorthd:
    +        replicas: 0
    +      ovnDBCluster:
    +        ovndbcluster-nb:
    +          dbType: NB
    +          networkAttachment: internalapi
    +        ovndbcluster-sb:
    +          dbType: SB
    +          networkAttachment: internalapi
    +
    +  placement:
    +    enabled: false
    +    template: {}
    +
    +  rabbitmq:
    +    templates:
    +      rabbitmq:
    +        override:
    +          service:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.85
    +            spec:
    +              type: LoadBalancer
    +      rabbitmq-cell1:
    +        override:
    +          service:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.86
    +            spec:
    +              type: LoadBalancer
    +
    +  telemetry:
    +    enabled: false
    +
    +  swift:
    +    enabled: false
    +    template:
    +      swiftRing:
    +        ringReplicas: 1
    +      swiftStorage:
    +        replicas: 0
    +      swiftProxy:
    +        replicas: 1
    +EOF
    +
    +
    +
    + + + + + +
    1Select an existing storage class in your RHOCP cluster.
    +
    +
  14. +
+
+
+
Verification
+
    +
  • +

    Verify that MariaDB is running:

    +
    +
    +
    $ oc get pod openstack-galera-0 -o jsonpath='{.status.phase}{"\n"}'
    +$ oc get pod openstack-cell1-galera-0 -o jsonpath='{.status.phase}{"\n"}'
    +
    +
    +
  • +
+
+
+
+

Configuring a Red Hat Ceph Storage back end

+
+

If your Red Hat OpenStack Platform (RHOSP) 17.1 deployment uses a Red Hat Ceph Storage back end for any service, such as Image Service (glance), Block Storage service (cinder), Compute service (nova), or Shared File Systems service (manila), you must configure the custom resources (CRs) to use the same back end in the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 deployment.

+
+
+ + + + + +
+ + +To run ceph commands, you must use SSH to connect to a Red Hat Ceph Storage node and run sudo cephadm shell. This generates a Ceph orchestrator container that enables you to run administrative commands against the Red Hat Ceph Storage cluster. If you deployed the Red Hat Ceph Storage cluster by using director, you can launch the cephadm shell from an RHOSP Controller node. +
+
+
+
Prerequisites
+
    +
  • +

    The OpenStackControlPlane CR is created.

    +
  • +
  • +

    If your RHOSP 17.1 deployment uses the Shared File Systems service, the openstack keyring is updated. Modify the openstack user so that you can use it across all RHOSP services:

    +
    +
    +
    ceph auth caps client.openstack \
    +  mgr 'allow *' \
    +  mon 'allow r, profile rbd' \
    +  osd 'profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data'
    +
    +
    +
    +

    Using the same user across all services makes it simpler to create a common Red Hat Ceph Storage secret that includes the keyring and ceph.conf file and propagate the secret to all the services that need it.

    +
    +
  • +
  • +

    The following shell variables are defined. Replace the following example values with values that are correct for your environment:

    +
    +
    +
    CEPH_SSH="ssh -i <path to SSH key> root@<node IP>"
    +CEPH_KEY=$($CEPH_SSH "cat /etc/ceph/ceph.client.openstack.keyring | base64 -w 0")
    +CEPH_CONF=$($CEPH_SSH "cat /etc/ceph/ceph.conf | base64 -w 0")
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create the ceph-conf-files secret that includes the Red Hat Ceph Storage configuration:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +data:
    +  ceph.client.openstack.keyring: $CEPH_KEY
    +  ceph.conf: $CEPH_CONF
    +kind: Secret
    +metadata:
    +  name: ceph-conf-files
    +  namespace: openstack
    +type: Opaque
    +EOF
    +
    +
    +
    +

    The content of the secret should be similar to the following example:

    +
    +
    +
    +
    apiVersion: v1
    +kind: Secret
    +metadata:
    +  name: ceph-conf-files
    +  namespace: openstack
    +stringData:
    +  ceph.client.openstack.keyring: |
    +    [client.openstack]
    +        key = <secret key>
    +        caps mgr = "allow *"
    +        caps mon = "allow r, profile rbd"
    +        caps osd = "profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data"
    +  ceph.conf: |
    +    [global]
    +    fsid = 7a1719e8-9c59-49e2-ae2b-d7eb08c695d4
    +    mon_host = 10.1.1.2,10.1.1.3,10.1.1.4
    +
    +
    +
  2. +
  3. +

    In your OpenStackControlPlane CR, inject ceph.conf and ceph.client.openstack.keyring to the RHOSP services that are defined in the propagation list. For example:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  extraMounts:
    +    - name: v1
    +      region: r1
    +      extraVol:
    +        - propagation:
    +          - CinderVolume
    +          - CinderBackup
    +          - GlanceAPI
    +          - ManilaShare
    +          extraVolType: Ceph
    +          volumes:
    +          - name: ceph
    +            projected:
    +              sources:
    +              - secret:
    +                  name: ceph-conf-files
    +          mounts:
    +          - name: ceph
    +            mountPath: "/etc/ceph"
    +            readOnly: true
    +'
    +
    +
    +
  4. +
+
+
+
+

Creating an NFS Ganesha cluster

+
+

If you use CephFS through NFS with the Shared File Systems service (manila), you must create a new clustered NFS service on the Red Hat Ceph Storage cluster. This service replaces the standalone, Pacemaker-controlled ceph-nfs service that you use in Red Hat OpenStack Platform (RHOSP) 17.1.

+
+
+
Procedure
+
    +
  1. +

    Identify the Red Hat Ceph Storage nodes to deploy the new clustered NFS service, for example, cephstorage-0, cephstorage-1, cephstorage-2.

    +
    + + + + + +
    + + +You must deploy this service on the StorageNFS isolated network so that you can mount your existing shares through the new NFS export locations. You can deploy the new clustered NFS service on your existing CephStorage nodes or HCI nodes, or on new hardware that you enrolled in the Red Hat Ceph Storage cluster. +
    +
    +
  2. +
  3. +

    If you deployed your Red Hat Ceph Storage nodes with director, propagate the StorageNFS network to the target nodes where the ceph-nfs service is deployed.

    +
    +
      +
    1. +

      Identify the node definition file, overcloud-baremetal-deploy.yaml, that is used in the RHOSP environment. For more information about identifying the overcloud-baremetal-deploy.yaml file, see Customizing overcloud networks in Customizing the Red Hat OpenStack Services on OpenShift deployment.

      +
    2. +
    3. +

      Edit the networks that are associated with the Red Hat Ceph Storage nodes to include the StorageNFS network:

      +
      +
      +
      - name: CephStorage
      +  count: 3
      +  hostname_format: cephstorage-%index%
      +  instances:
      +  - hostname: cephstorage-0
      +    name: ceph-0
      +  - hostname: cephstorage-1
      +    name: ceph-1
      +  - hostname: cephstorage-2
      +    name: ceph-2
      +  defaults:
      +    profile: ceph-storage
      +    network_config:
      +      template: /home/stack/network/nic-configs/ceph-storage.j2
      +      network_config_update: true
      +    networks:
      +    - network: ctlplane
      +      vif: true
      +    - network: storage
      +    - network: storage_mgmt
      +    - network: storage_nfs
      +
      +
      +
    4. +
    5. +

      Edit the network configuration template file, for example, /home/stack/network/nic-configs/ceph-storage.j2, for the Red Hat Ceph Storage nodes to include an interface that connects to the StorageNFS network:

      +
      +
      +
      - type: vlan
      +  device: nic2
      +  vlan_id: {{ storage_nfs_vlan_id }}
      +  addresses:
      +  - ip_netmask: {{ storage_nfs_ip }}/{{ storage_nfs_cidr }}
      +  routes: {{ storage_nfs_host_routes }}
      +
      +
      +
    6. +
    7. +

      Update the Red Hat Ceph Storage nodes:

      +
      +
      +
      $ openstack overcloud node provision \
      +    --stack overcloud   \
      +    --network-config -y  \
      +    -o overcloud-baremetal-deployed-storage_nfs.yaml \
      +    --concurrency 2 \
      +    /home/stack/network/baremetal_deployment.yaml
      +
      +
      +
      +

      When the update is complete, ensure that a new interface is created on the Red Hat Ceph Storage nodes and that they are tagged with the VLAN that is associated with StorageNFS.

      +
      +
    8. +
    +
    +
  4. +
  5. +

    Identify the IP address from the StorageNFS network to use as the Virtual IP address (VIP) for the Ceph NFS service:

    +
    +
    +
    $ openstack port list -c "Fixed IP Addresses" --network storage_nfs
    +
    +
    +
  6. +
  7. +

    In a running cephadm shell, identify the hosts for the NFS service:

    +
    +
    +
    $ ceph orch host ls
    +
    +
    +
  8. +
  9. +

    Label each host that you identified. Repeat this command for each host that you want to label:

    +
    +
    +
    $ ceph orch host label add <hostname> nfs
    +
    +
    +
    +
      +
    • +

      Replace <hostname> with the name of the host that you identified.

      +
    • +
    +
    +
  10. +
  11. +

    Create the NFS cluster:

    +
    +
    +
    $ ceph nfs cluster create cephfs \
    +    "label:nfs" \
    +    --ingress \
    +    --virtual-ip=<VIP> \
    +    --ingress-mode=haproxy-protocol
    +
    +
    +
    +
      +
    • +

      Replace <VIP> with the VIP for the Ceph NFS service.

      +
      + + + + + +
      + + +You must set the ingress-mode argument to haproxy-protocol. No other ingress-mode is supported. This ingress mode allows you to enforce client restrictions through the Shared File Systems service. For more information about deploying the clustered Ceph NFS service, see Management of NFS-Ganesha gateway using the Ceph Orchestrator (Limited Availability) in the Red Hat Ceph Storage 7 Operations Guide. +
      +
      +
    • +
    +
    +
  12. +
  13. +

    Check the status of the NFS cluster:

    +
    +
    +
    $ ceph nfs cluster ls
    +$ ceph nfs cluster info cephfs
    +
    +
    +
  14. +
+
+
+
+

Stopping Red Hat OpenStack Platform services

+
+

Before you start the Red Hat OpenStack Services on OpenShift (RHOSO) adoption, you must stop the Red Hat OpenStack Platform (RHOSP) services to avoid inconsistencies in the data that you migrate for the data plane adoption. Inconsistencies are caused by resource changes after the database is copied to the new deployment.

+
+
+

You should not stop the infrastructure management services yet, such as:

+
+
+
    +
  • +

    Database

    +
  • +
  • +

    RabbitMQ

    +
  • +
  • +

    HAProxy Load Balancer

    +
  • +
  • +

    Ceph-nfs

    +
  • +
  • +

    Compute service

    +
  • +
  • +

    Containerized modular libvirt daemons

    +
  • +
  • +

    Object Storage service (swift) back-end services

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Ensure that there are no long-running tasks that require the services that you plan to stop, such as instance live migrations, volume migrations, volume creation, backup and restore, attaching, detaching, and other similar operations:

    +
    +
    +
    openstack server list --all-projects -c ID -c Status |grep -E '\| .+ing \|'
    +openstack volume list --all-projects -c ID -c Status |grep -E '\| .+ing \|'| grep -vi error
    +openstack volume backup list --all-projects -c ID -c Status |grep -E '\| .+ing \|' | grep -vi error
    +openstack share list --all-projects -c ID -c Status |grep -E '\| .+ing \|'| grep -vi error
    +openstack image list -c ID -c Status |grep -E '\| .+ing \|'
    +
    +
    +
  • +
  • +

    Collect the services topology-specific configuration. For more information, see Retrieving topology-specific service configuration.

    +
  • +
  • +

    Define the following shell variables. The values are examples and refer to a single node standalone director deployment. Replace these example values with values that are correct for your environment:

    +
    +
    +
    CONTROLLER1_SSH="ssh -i <path to SSH key> root@<controller-1 IP>"
    +CONTROLLER2_SSH="ssh -i <path to SSH key> root@<controller-2 IP>"
    +CONTROLLER3_SSH="ssh -i <path to SSH key> root@<controller-3 IP>"
    +
    +
    +
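    A minimal sketch to confirm that each defined variable works, reusing the same pattern as the scripts that follow:

    +
    for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Checking SSH access to controller $i"
    +        ${!SSH_CMD} hostname
    +    fi
    +done
    +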
  • +
+
+
+
Procedure
+
    +
  1. +

    If your deployment enables CephFS through NFS as a back end for Shared File Systems service (manila), remove the following Pacemaker ordering and co-location constraints that govern the Virtual IP address of the ceph-nfs service and the manila-share service:

    +
    +
    +
    # check the co-location and ordering constraints concerning "manila-share"
    +sudo pcs constraint list --full
    +
    +# remove these constraints
    +sudo pcs constraint remove colocation-openstack-manila-share-ceph-nfs-INFINITY
    +sudo pcs constraint remove order-ceph-nfs-openstack-manila-share-Optional
    +
    +
    +
  2. +
  3. +

    Disable RHOSP control plane services:

    +
    +
    +
    # Update the services list to be stopped
    +ServicesToStop=("tripleo_aodh_api.service"
    +                "tripleo_aodh_api_cron.service"
    +                "tripleo_aodh_evaluator.service"
    +                "tripleo_aodh_listener.service"
    +                "tripleo_aodh_notifier.service"
    +                "tripleo_ceilometer_agent_central.service"
    +                "tripleo_ceilometer_agent_notification.service"
    +                "tripleo_horizon.service"
    +                "tripleo_keystone.service"
    +                "tripleo_barbican_api.service"
    +                "tripleo_barbican_worker.service"
    +                "tripleo_barbican_keystone_listener.service"
    +                "tripleo_cinder_api.service"
    +                "tripleo_cinder_api_cron.service"
    +                "tripleo_cinder_scheduler.service"
    +                "tripleo_cinder_volume.service"
    +                "tripleo_cinder_backup.service"
    +                "tripleo_collectd.service"
    +                "tripleo_glance_api.service"
    +                "tripleo_gnocchi_api.service"
    +                "tripleo_gnocchi_metricd.service"
    +                "tripleo_gnocchi_statsd.service"
    +                "tripleo_manila_api.service"
    +                "tripleo_manila_api_cron.service"
    +                "tripleo_manila_scheduler.service"
    +                "tripleo_neutron_api.service"
    +                "tripleo_placement_api.service"
    +                "tripleo_nova_api_cron.service"
    +                "tripleo_nova_api.service"
    +                "tripleo_nova_conductor.service"
    +                "tripleo_nova_metadata.service"
    +                "tripleo_nova_scheduler.service"
    +                "tripleo_nova_vnc_proxy.service"
    +                "tripleo_aodh_api.service"
    +                "tripleo_aodh_api_cron.service"
    +                "tripleo_aodh_evaluator.service"
    +                "tripleo_aodh_listener.service"
    +                "tripleo_aodh_notifier.service"
    +                "tripleo_ceilometer_agent_central.service"
    +                "tripleo_ceilometer_agent_compute.service"
    +                "tripleo_ceilometer_agent_ipmi.service"
    +                "tripleo_ceilometer_agent_notification.service"
    +                "tripleo_ovn_cluster_northd.service"
    +                "tripleo_ironic_neutron_agent.service"
    +                "tripleo_ironic_api.service"
    +                "tripleo_ironic_inspector.service"
    +                "tripleo_ironic_conductor.service")
    +
    +PacemakerResourcesToStop=("openstack-cinder-volume"
    +                          "openstack-cinder-backup"
    +                          "openstack-manila-share")
    +
    +echo "Stopping systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            echo "Stopping the $service in controller $i"
    +            if ${!SSH_CMD} sudo systemctl is-active $service; then
    +                ${!SSH_CMD} sudo systemctl stop $service
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Checking systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then
    +                echo "ERROR: Service $service still running on controller $i"
    +            else
    +                echo "OK: Service $service is not running on controller $i"
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Stopping pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStop[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                echo "Stopping $resource"
    +                ${!SSH_CMD} sudo pcs resource disable $resource
    +            else
    +                echo "Service $resource not present"
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +echo "Checking pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStop[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                if ! ${!SSH_CMD} sudo pcs resource status $resource | grep Started; then
    +                    echo "OK: Service $resource is stopped"
    +                else
    +                    echo "ERROR: Service $resource is started"
    +                fi
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +
    +
    +

    If the status of each service is OK, then the services stopped successfully.

    +
    +
  4. +
+
+
+
+

Migrating databases to MariaDB instances

+
+

Migrate your databases from the original Red Hat OpenStack Platform (RHOSP) deployment to the MariaDB instances in the Red Hat OpenShift Container Platform (RHOCP) cluster.

+
+
+
Prerequisites
+
    +
  • +

    Ensure that the control plane MariaDB and RabbitMQ are running, and that no other control plane services are running.

    +
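    A quick way to eyeball this is to list the control plane pods and filter for the database and messaging services. This sketch assumes the default openstack namespace; the pod names can differ in your environment:

    +
    $ oc get pods -n openstack | grep -E 'galera|rabbitmq'
    +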
  • +
  • +

    Retrieve the topology-specific service configuration. For more information, see Retrieving topology-specific service configuration.

    +
  • +
  • +

    Stop the RHOSP services. For more information, see Stopping Red Hat OpenStack Platform services.

    +
  • +
  • +

    Ensure that there is network routability between the original MariaDB and the MariaDB for the control plane.

    +
  • +
  • +

    Define the following shell variables. Replace the following example values with values that are correct for your environment:

    +
    +
    +
    PODIFIED_MARIADB_IP=$(oc get svc --selector "mariadb/name=openstack" -ojsonpath='{.items[0].spec.clusterIP}')
    +PODIFIED_CELL1_MARIADB_IP=$(oc get svc --selector "mariadb/name=openstack-cell1" -ojsonpath='{.items[0].spec.clusterIP}')
    +PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
    +
    +# The CHARACTER_SET and collation should match the source DB
    +# if they do not, it will break foreign key relationships
    +# for any tables that are created in the future as part of db sync
    +CHARACTER_SET=utf8
    +COLLATION=utf8_general_ci
    +
    +STORAGE_CLASS=local-storage
    +MARIADB_IMAGE=registry.redhat.io/rhosp-dev-preview/openstack-mariadb-rhel9:18.0
    +# Replace with your environment's MariaDB Galera cluster VIP and backend IPs:
    +SOURCE_MARIADB_IP=172.17.0.2
    +declare -A SOURCE_GALERA_MEMBERS
    +SOURCE_GALERA_MEMBERS=(
    +  ["standalone.localdomain"]=172.17.0.100
    +  # ...
    +)
    +SOURCE_DB_ROOT_PASSWORD=$(cat ~/overcloud-deploy/overcloud/overcloud-passwords.yaml | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }')
    +
    +
    +
    +

    To get the value to set SOURCE_MARIADB_IP, query the puppet-generated configurations in a Controller node:

    +
    +
    +
    +
    $ grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bind
    +
    +
    +
  • +
  • +

    Prepare the MariaDB adoption helper pod:

    +
    +
      +
    1. +

      Create a temporary volume claim and a pod for the database data copy. Edit the volume claim storage request if necessary, to give it enough space for the overcloud databases:

      +
      +
      +
      oc apply -f - <<EOF
      +---
      +apiVersion: v1
      +kind: PersistentVolumeClaim
      +metadata:
      +  name: mariadb-data
      +spec:
      +  storageClassName: $STORAGE_CLASS
      +  accessModes:
      +    - ReadWriteOnce
      +  resources:
      +    requests:
      +      storage: 10Gi
      +---
      +apiVersion: v1
      +kind: Pod
      +metadata:
      +  name: mariadb-copy-data
      +  annotations:
      +    openshift.io/scc: anyuid
      +    k8s.v1.cni.cncf.io/networks: internalapi
      +  labels:
      +    app: adoption
      +spec:
      +  containers:
      +  - image: $MARIADB_IMAGE
      +    command: [ "sh", "-c", "sleep infinity"]
      +    name: adoption
      +    volumeMounts:
      +    - mountPath: /backup
      +      name: mariadb-data
      +  securityContext:
      +    allowPrivilegeEscalation: false
      +    capabilities:
      +      drop: ALL
      +    runAsNonRoot: true
      +    seccompProfile:
      +      type: RuntimeDefault
      +  volumes:
      +  - name: mariadb-data
      +    persistentVolumeClaim:
      +      claimName: mariadb-data
      +EOF
      +
      +
      +
    2. +
    3. +

      Wait for the pod to be ready:

      +
      +
      +
      $ oc wait --for condition=Ready pod/mariadb-copy-data --timeout=30s
      +
      +
      +
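      Optionally, confirm network routability from the helper pod to the source database before you continue. This is a minimal sketch that reuses the variables defined above and assumes the mysqladmin client is available in the helper image:

      +
      $ oc rsh mariadb-copy-data mysqladmin -h "$SOURCE_MARIADB_IP" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" ping
      +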
    4. +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Check that the source Galera database cluster members are online and synced:

    +
    +
    +
    for i in "${!SOURCE_GALERA_MEMBERS[@]}"; do
    +  echo "Checking for the database node $i WSREP status Synced"
    +  oc rsh mariadb-copy-data mysql \
    +    -h "${SOURCE_GALERA_MEMBERS[$i]}" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" \
    +    -e "show global status like 'wsrep_local_state_comment'" | \
    +    grep -qE "\bSynced\b"
    +done
    +
    +
    +
  2. +
  3. +

    Get the count of source databases with the NOK (not-OK) status:

    +
    +
    +
    $ oc rsh mariadb-copy-data mysql -h "${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" -e "SHOW databases;"
    +
    +
    +
  4. +
  5. +

    Check that mysqlcheck had no errors:

    +
    +
    +
    . ~/.source_cloud_exported_variables
    +test -z "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK"  || [ "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK" = " " ] && echo "OK" || echo "CHECK FAILED"
    +
    +
    +
  6. +
  7. +

    Test the connection to the control plane databases:

    +
    +
    +
    $ oc run mariadb-client --image $MARIADB_IMAGE -i --rm --restart=Never -- \
    +    mysql -rsh "$PODIFIED_MARIADB_IP" -uroot -p"$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;'
    +$ oc run mariadb-client --image $MARIADB_IMAGE -i --rm --restart=Never -- \
    +    mysql -rsh "$PODIFIED_CELL1_MARIADB_IP" -uroot -p"$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;'
    +
    +
    +
    + + + + + +
    + + +You must transition Compute service (nova) services that are imported later into a superconductor architecture by deleting the old service records in the cell databases, starting with cell1. New records are registered with different hostnames provided by the Compute service operator. All Compute services, except the Compute agent, have no internal state, and their service records can be safely deleted. You also need to rename the former default cell to cell1. +
    +
    +
  8. +
  9. +

    Create a dump of the original databases:

    +
    +
    +
    $ oc rsh mariadb-copy-data << EOF
    +  mysql -h"${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" \
    +  -N -e "show databases" | grep -E -v "schema|mysql|gnocchi|aodh" | \
    +  while read dbname; do
    +    echo "Dumping \${dbname}";
    +    mysqldump -h"${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" \
    +      --single-transaction --complete-insert --skip-lock-tables --lock-tables=0 \
    +      "\${dbname}" > /backup/"\${dbname}".sql;
    +   done
    +EOF
    +
    +
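    To verify that a dump file was written for each database, list the backup directory from the helper pod, for example:

    +
    $ oc exec mariadb-copy-data -- ls -lh /backup/
    +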
    +
  10. +
  11. +

    Restore the databases from .sql files into the control plane MariaDB:

    +
    +
    +
    $ oc rsh mariadb-copy-data << EOF
    +  # db schemas to rename on import
    +  declare -A db_name_map
    +  db_name_map['nova']='nova_cell1'
    +  db_name_map['ovs_neutron']='neutron'
    +  db_name_map['ironic-inspector']='ironic_inspector'
    +
    +  # db servers to import into
    +  declare -A db_server_map
    +  db_server_map['default']=${PODIFIED_MARIADB_IP}
    +  db_server_map['nova_cell1']=${PODIFIED_CELL1_MARIADB_IP}
    +
    +  # db server root password map
    +  declare -A db_server_password_map
    +  db_server_password_map['default']=${PODIFIED_DB_ROOT_PASSWORD}
    +  db_server_password_map['nova_cell1']=${PODIFIED_DB_ROOT_PASSWORD}
    +
    +  cd /backup
    +  for db_file in \$(ls *.sql); do
    +    db_name=\$(echo \${db_file} | awk -F'.' '{ print \$1; }')
    +    if [[ -v "db_name_map[\${db_name}]" ]]; then
    +      echo "renaming \${db_name} to \${db_name_map[\${db_name}]}"
    +      db_name=\${db_name_map[\${db_name}]}
    +    fi
    +    db_server=\${db_server_map["default"]}
    +    if [[ -v "db_server_map[\${db_name}]" ]]; then
    +      db_server=\${db_server_map[\${db_name}]}
    +    fi
    +    db_password=\${db_server_password_map['default']}
    +    if [[ -v "db_server_password_map[\${db_name}]" ]]; then
    +      db_password=\${db_server_password_map[\${db_name}]}
    +    fi
    +    echo "creating \${db_name} in \${db_server}"
    +    mysql -h"\${db_server}" -uroot "-p\${db_password}" -e \
    +      "CREATE DATABASE IF NOT EXISTS \${db_name} DEFAULT \
    +      CHARACTER SET ${CHARACTER_SET} DEFAULT COLLATE ${COLLATION};"
    +    echo "importing \${db_name} into \${db_server}"
    +    mysql -h "\${db_server}" -uroot "-p\${db_password}" "\${db_name}" < "\${db_file}"
    +  done
    +
    +  mysql -h "\${db_server_map['default']}" -uroot -p"\${db_server_password_map['default']}" -e \
    +    "update nova_api.cell_mappings set name='cell1' where name='default';"
    +  mysql -h "\${db_server_map['nova_cell1']}" -uroot -p"\${db_server_password_map['nova_cell1']}" -e \
    +    "delete from nova_cell1.services where host not like '%nova-cell1-%' and services.binary != 'nova-compute';"
    +EOF
    +
    +
    +
  12. +
+
+
+
Verification
+

Compare the following outputs with the topology-specific service configuration. +For more information, see Retrieving topology-specific service configuration.

+
+
+
    +
  1. +

    Check that the databases are imported correctly:

    +
    +
    +
    . ~/.source_cloud_exported_variables
    +
    +# use 'oc exec' and 'mysql -rs' to maintain formatting
    +dbs=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;')
    +echo $dbs | grep -Eq '\bkeystone\b' && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure neutron db is renamed from ovs_neutron
    +echo $dbs | grep -Eq '\bneutron\b'
    +echo $PULL_OPENSTACK_CONFIGURATION_DATABASES | grep -Eq '\bovs_neutron\b' && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure nova cell1 db is extracted to a separate db server and renamed from nova to nova_cell1
    +c1dbs=$(oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;')
    +echo $c1dbs | grep -Eq '\bnova_cell1\b' && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure default cell renamed to cell1, and the cell UUIDs retained intact
    +novadb_mapped_cells=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" \
    +  nova_api -e 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')
    +uuidf='\S{8,}-\S{4,}-\S{4,}-\S{4,}-\S{12,}'
    +left_behind=$(comm -23 \
    +  <(echo $PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS | grep -oE " $uuidf \S+") \
    +  <(echo $novadb_mapped_cells | tr -s "| " " " | grep -oE " $uuidf \S+"))
    +changed=$(comm -13 \
    +  <(echo $PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS | grep -oE " $uuidf \S+") \
    +  <(echo $novadb_mapped_cells | tr -s "| " " " | grep -oE " $uuidf \S+"))
    +test $(grep -Ec ' \S+$' <<<$left_behind) -eq 1 && echo "OK" || echo "CHECK FAILED"
    +default=$(grep -E ' default$' <<<$left_behind)
    +test $(grep -Ec ' \S+$' <<<$changed) -eq 1 && echo "OK" || echo "CHECK FAILED"
    +grep -qE " $(awk '{print $1}' <<<$default) cell1$" <<<$changed && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure the registered Compute service name has not changed
    +novadb_svc_records=$(oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" \
    +  nova_cell1 -e "select host from services where services.binary='nova-compute' order by host asc;")
    +diff -Z <(echo $novadb_svc_records) <(echo $PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES) && echo "OK" || echo "CHECK FAILED"
    +
    +
    +
  2. +
  3. +

    Delete the mariadb-data pod and the mariadb-copy-data persistent volume claim that contains the database backup:

    +
    + + + + + +
    + + +Consider taking a snapshot of them before deleting. +
    +
    +
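    If you decide to take a snapshot first, the following sketch creates one for the mariadb-data claim; the VolumeSnapshotClass name is an assumption and must match a class that exists in your cluster:

    +
    $ oc apply -f - <<EOF
    +apiVersion: snapshot.storage.k8s.io/v1
    +kind: VolumeSnapshot
    +metadata:
    +  name: mariadb-data-snapshot
    +spec:
    +  volumeSnapshotClassName: <volume_snapshot_class>
    +  source:
    +    persistentVolumeClaimName: mariadb-data
    +EOF
    +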
    +
    +
    $ oc delete pod mariadb-copy-data
    +$ oc delete pvc mariadb-data
    +
    +
    +
  4. +
+
+
+ + + + + +
+ + +During the pre-checks and post-checks, the mariadb-client pod might return a pod security warning related to the restricted:latest security context constraint. This warning is due to default security context constraints and does not prevent the admission controller from creating a pod. You see a warning for the short-lived pod, but it does not interfere with functionality. +For more information, see About pod security standards and warnings. +
+
+
+
+

Migrating OVN data

+
+

Migrate the data in the OVN databases from the original Red Hat OpenStack Platform deployment to ovsdb-server instances that are running in the Red Hat OpenShift Container Platform (RHOCP) cluster.

+
+
+
Prerequisites
+
    +
  • +

    The OpenStackControlPlane resource is created.

    +
  • +
  • +

    NetworkAttachmentDefinition custom resources (CRs) for the original cluster are defined. Specifically, the internalapi network is defined.

    +
  • +
  • +

    The original Networking service (neutron) and OVN northd are not running.

    +
  • +
  • +

    There is network routability between the control plane services and the adopted cluster.

    +
  • +
  • +

    The cloud is migrated to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver.

    +
  • +
  • +

    Define the following shell variables. Replace the example values with values that are correct for your environment:

    +
    +
    +
    STORAGE_CLASS=local-storage
    +OVSDB_IMAGE=registry.redhat.io/rhosp-dev-preview/openstack-ovn-base-rhel9:18.0
    +SOURCE_OVSDB_IP=172.17.0.100
    +
    +
    +
    +

    To get the value to set SOURCE_OVSDB_IP, query the puppet-generated configurations in a Controller node:

    +
    +
    +
    +
    $ grep -rI 'ovn_[ns]b_conn' /var/lib/config-data/puppet-generated/
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Prepare a temporary PersistentVolume claim and the helper pod for the OVN backup. Adjust the storage requests for a large database, if needed:

    +
    +
    +
    $ oc apply -f - <<EOF
    +---
    +apiVersion: cert-manager.io/v1
    +kind: Certificate
    +metadata:
    +  name: ovn-data-cert
    +  namespace: openstack
    +spec:
    +  commonName: ovn-data-cert
    +  secretName: ovn-data-cert
    +  issuerRef:
    +    name: rootca-internal
    +---
    +apiVersion: v1
    +kind: PersistentVolumeClaim
    +metadata:
    +  name: ovn-data
    +spec:
    +  storageClassName: $STORAGE_CLASS
    +  accessModes:
    +    - ReadWriteOnce
    +  resources:
    +    requests:
    +      storage: 10Gi
    +---
    +apiVersion: v1
    +kind: Pod
    +metadata:
    +  name: ovn-copy-data
    +  annotations:
    +    openshift.io/scc: anyuid
    +    k8s.v1.cni.cncf.io/networks: internalapi
    +  labels:
    +    app: adoption
    +spec:
    +  containers:
    +  - image: $OVSDB_IMAGE
    +    command: [ "sh", "-c", "sleep infinity"]
    +    name: adoption
    +    volumeMounts:
    +    - mountPath: /backup
    +      name: ovn-data
    +    - mountPath: /etc/pki/tls/misc
    +      name: ovn-data-cert
    +      readOnly: true
    +  securityContext:
    +    allowPrivilegeEscalation: false
    +    capabilities:
    +      drop: ALL
    +    runAsNonRoot: true
    +    seccompProfile:
    +      type: RuntimeDefault
    +  volumes:
    +  - name: ovn-data
    +    persistentVolumeClaim:
    +      claimName: ovn-data
    +  - name: ovn-data-cert
    +    secret:
    +      secretName: ovn-data-cert
    +EOF
    +
    +
    +
  2. +
  3. +

    Wait for the pod to be ready:

    +
    +
    +
    $ oc wait --for=condition=Ready pod/ovn-copy-data --timeout=30s
    +
    +
    +
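    Optionally, confirm that the helper pod can reach the source OVN databases before you take the backups. This sketch covers the no-TLS case; if you enabled TLS everywhere, use the ssl: scheme and the same certificate options as in the backup commands:

    +
    $ oc exec ovn-copy-data -- ovsdb-client list-dbs tcp:$SOURCE_OVSDB_IP:6641
    +$ oc exec ovn-copy-data -- ovsdb-client list-dbs tcp:$SOURCE_OVSDB_IP:6642
    +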
  4. +
  5. +

    Back up your OVN databases:

    +
    +
      +
    • +

      If you did not enable TLS everywhere, run the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"
      +
      +
      +
    • +
    • +

      If you enabled TLS everywhere, run the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"
      +
      +
      +
    • +
    +
    +
  6. +
  7. +

    Start the control plane OVN database services prior to import, with northd and ovn-controller disabled:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack-galera-network-isolation --type=merge --patch '
    +spec:
    +  ovn:
    +    enabled: true
    +    template:
    +      ovnDBCluster:
    +        ovndbcluster-nb:
    +          dbType: NB
    +          storageRequest: 10G
    +          networkAttachment: internalapi
    +        ovndbcluster-sb:
    +          dbType: SB
    +          storageRequest: 10G
    +          networkAttachment: internalapi
    +      ovnNorthd:
    +        replicas: 0
    +      ovnController:
    +        networkAttachment: tenant
    +        nodeSelector:
    +          node: non-existing-node-name
    +'
    +
    +
    +
  8. +
  9. +

    Wait for the OVN database services to reach the Running phase:

    +
    +
    +
    $ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-nb
    +$ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-sb
    +
    +
    +
  10. +
  11. +

    Fetch the OVN database IP addresses on the clusterIP service network:

    +
    +
    +
    PODIFIED_OVSDB_NB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-nb-0" -ojsonpath='{.items[0].spec.clusterIP}')
    +PODIFIED_OVSDB_SB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-sb-0" -ojsonpath='{.items[0].spec.clusterIP}')
    +
    +
    +
  12. +
  13. +

    Upgrade the database schema for the backup files:

    +
    +
      +
    1. +

      If you did not enable TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"
      +
      +
      +
    2. +
    3. +

      If you enabled TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"
      +
      +
      +
    4. +
    +
    +
  14. +
  15. +

    Restore the database backup to the new OVN database servers:

    +
    +
      +
    1. +

      If you did not enable TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"
      +
      +
      +
    2. +
    3. +

      If you enabled TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"
      +
      +
      +
    4. +
    +
    +
  16. +
  17. +

    Check that the data was successfully migrated by running the following commands against the new database servers, for example:

    +
    +
    +
    $ oc exec -it ovsdbserver-nb-0 -- ovn-nbctl show
    +$ oc exec -it ovsdbserver-sb-0 -- ovn-sbctl list Chassis
    +
    +
    +
  18. +
  19. +

    Start the control plane ovn-northd service to keep both OVN databases in sync:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack-galera-network-isolation --type=merge --patch '
    +spec:
    +  ovn:
    +    enabled: true
    +    template:
    +      ovnNorthd:
    +        replicas: 1
    +'
    +
    +
    +
  20. +
  21. +

    If you are running OVN gateway services on RHOCP nodes, enable the control plane ovn-controller service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack-galera-network-isolation --type=json -p="[{'op': 'remove', 'path': '/spec/ovn/template/ovnController/nodeSelector'}]"
    +
    +
    +
    + + + + + +
    + + +Running OVN gateways on RHOCP nodes might be prone to data plane downtime during Open vSwitch upgrades. Consider running OVN gateways on dedicated Networker data plane nodes for production deployments instead. +
    +
    +
  22. +
  23. +

    Delete the ovn-data helper pod and the temporary PersistentVolumeClaim that is used to store OVN database backup files:

    +
    +
    +
    $ oc delete pod ovn-copy-data
    +$ oc delete pvc ovn-data
    +
    +
    +
    + + + + + +
    + + +Consider taking a snapshot of the ovn-data helper pod and the temporary PersistentVolumeClaim before deleting them. For more information, see About volume snapshots in OpenShift Container Platform storage overview. +
    +
    +
  24. +
  25. +

    Stop the adopted OVN database servers:

    +
    +
    +
    ServicesToStop=("tripleo_ovn_cluster_north_db_server.service"
    +                "tripleo_ovn_cluster_south_db_server.service")
    +
    +echo "Stopping systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            echo "Stopping the $service in controller $i"
    +            if ${!SSH_CMD} sudo systemctl is-active $service; then
    +                ${!SSH_CMD} sudo systemctl stop $service
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Checking systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then
    +                echo "ERROR: Service $service still running on controller $i"
    +            else
    +                echo "OK: Service $service is not running on controller $i"
    +            fi
    +        fi
    +    done
    +done
    +
    +
    +
  26. +
+
+
+
+
+
+

Adopting Red Hat OpenStack Platform control plane services

+
+
+

Adopt your Red Hat OpenStack Platform 17.1 control plane services to deploy them in the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 control plane.

+
+
+

Adopting the Identity service

+
+

To adopt the Identity service (keystone), you patch an existing OpenStackControlPlane custom resource (CR) where the Identity service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.

+
+
+
Prerequisites
+
    +
  • +

    Create the keystone secret that includes the Fernet keys that were copied from the RHOSP environment:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +data:
    +  CredentialKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/0 | base64 -w 0)
    +  CredentialKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/1 | base64 -w 0)
    +  FernetKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/0 | base64 -w 0)
    +  FernetKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/1 | base64 -w 0)
    +kind: Secret
    +metadata:
    +  name: keystone
    +  namespace: openstack
    +type: Opaque
    +EOF
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane CR to deploy the Identity service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  keystone:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      override:
    +        service:
    +          internal:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +      databaseInstance: openstack
    +      secret: osp-secret
    +'
    +
    +
    +
  2. +
  3. +

    Create an alias to use the openstack command in the Red Hat OpenStack Services on OpenShift (RHOSO) deployment:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +
    +
  4. +
  5. +

    Remove services and endpoints that still point to the RHOSP +control plane, excluding the Identity service and its endpoints:

    +
    +
    +
    $ openstack endpoint list | grep keystone | awk '/admin/{ print $2; }' | xargs ${BASH_ALIASES[openstack]} endpoint delete || true
    +
    +for service in aodh heat heat-cfn barbican cinderv3 glance gnocchi manila manilav2 neutron nova placement swift ironic-inspector ironic; do
    +  openstack service list | awk "/ $service /{ print \$2; }" | xargs -r ${BASH_ALIASES[openstack]} service delete || true
    +done
    +
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Confirm that the Identity service endpoints are defined and are pointing to the control plane FQDNs:

    +
    +
    +
    $ openstack endpoint list | grep keystone
    +
    +
    +
  • +
+
+
+
+

Adopting the Key Manager service

+
+

To adopt the Key Manager service (barbican), you patch an existing OpenStackControlPlane custom resource (CR) where Key Manager service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.

+
+
+

The Key Manager service adoption is complete if you see the following results:

+
+
+
    +
  • +

    The BarbicanAPI, BarbicanWorker, and BarbicanKeystoneListener services are up and running.

    +
  • +
  • +

    Keystone endpoints are updated, and the same crypto plugin of the source cloud is available.

    +
  • +
+
+
+ + + + + +
+ + +This procedure configures the Key Manager service to use the simple_crypto back end. Additional back ends, such as PKCS11 and DogTag, are currently not supported in Red Hat OpenStack Services on OpenShift (RHOSO). +
+
+
+
Procedure
+
    +
  1. +

    Add the kek secret:

    +
    +
    +
    $ oc set data secret/osp-secret "BarbicanSimpleCryptoKEK=$($CONTROLLER1_SSH "python3 -c \"import configparser; c = configparser.ConfigParser(); c.read('/var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf'); print(c['simple_crypto_plugin']['kek'])\"")"
    +
    +
    +
  2. +
  3. +

    Patch the OpenStackControlPlane CR to deploy the Key Manager service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  barbican:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: barbican
    +      rabbitMqClusterName: rabbitmq
    +      secret: osp-secret
    +      simpleCryptoBackendSecret: osp-secret
    +      serviceAccount: barbican
    +      serviceUser: barbican
    +      passwordSelectors:
    +        service: BarbicanPassword
    +        simplecryptokek: BarbicanSimpleCryptoKEK
    +      barbicanAPI:
    +        replicas: 1
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      barbicanWorker:
    +        replicas: 1
    +      barbicanKeystoneListener:
    +        replicas: 1
    +'
    +
    +
    +
  4. +
+
+
+
Verification
+
    +
  • +

    Ensure that the Identity service (keystone) endpoints are defined and are pointing to the control plane FQDNs:

    +
    +
    +
    $ openstack endpoint list | grep key-manager
    +
    +
    +
  • +
  • +

    Ensure that Barbican API service is registered in the Identity service:

    +
    +
    +
    $ openstack service list | grep key-manager
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep key-manager
    +
    +
    +
  • +
  • +

    List the secrets:

    +
    +
    +
    $ openstack secret list
    +
    +
    +
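    To further confirm that the simple_crypto back end can store new payloads, create and list a test secret; the secret name and payload in this sketch are arbitrary:

    +
    $ openstack secret store --name adoption-test --payload 'test-payload'
    +$ openstack secret list | grep adoption-test
    +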
  • +
+
+
+
+

Adopting the Networking service

+
+

To adopt the Networking service (neutron), you patch an existing OpenStackControlPlane custom resource (CR) that has the Networking service disabled. The patch starts the service with the +configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.

+
+
+

The Networking service adoption is complete if you see the following results:

+
+
+
    +
  • +

    The NeutronAPI service is running.

    +
  • +
  • +

    The Identity service (keystone) endpoints are updated, and the same back end of the source cloud is available.

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Ensure that Single Node OpenShift or OpenShift Local is running in the Red Hat OpenShift Container Platform (RHOCP) cluster.

    +
  • +
  • +

    Adopt the Identity service. For more information, see Adopting the Identity service.

    +
  • +
  • +

    Migrate your OVN databases to ovsdb-server instances that run in the Red Hat OpenShift Container Platform (RHOCP) cluster. For more information, see Migrating OVN data.

    +
  • +
+
+
+
Procedure
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Networking service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  neutron:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      override:
    +        service:
    +          internal:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +      databaseInstance: openstack
    +      databaseAccount: neutron
    +      secret: osp-secret
    +      networkAttachments:
    +      - internalapi
    +'
    +
    +
    +
  • +
+
+
+
Verification
+
    +
  • +

    Inspect the resulting Networking service pods:

    +
    +
    +
    NEUTRON_API_POD=`oc get pods -l service=neutron | tail -n 1 | cut -f 1 -d' '`
    +oc exec -t $NEUTRON_API_POD -c neutron-api -- cat /etc/neutron/neutron.conf
    +
    +
    +
  • +
  • +

    Ensure that the Neutron API service is registered in the Identity service:

    +
    +
    +
    $ openstack service list | grep network
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep network
    +
    +| 6a805bd6c9f54658ad2f24e5a0ae0ab6 | regionOne | neutron      | network      | True    | public    | http://neutron-public-openstack.apps-crc.testing  |
    +| b943243e596847a9a317c8ce1800fa98 | regionOne | neutron      | network      | True    | internal  | http://neutron-internal.openstack.svc:9696        |
    +
    +
    +
  • +
  • +

    Create sample resources so that you can test whether the user can create networks, subnets, ports, or routers:

    +
    +
    +
    $ openstack network create net
    +$ openstack subnet create --network net --subnet-range 10.0.0.0/24 subnet
    +$ openstack router create router
    +
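    The commands above cover networks, subnets, and routers; to also exercise port creation, a minimal sketch:

    +
    $ openstack port create --network net port
    +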
    +
    +
  • +
+
+
+
+

Adopting the Object Storage service

+
+

If you are using Object Storage as a service, adopt the Object Storage service (swift) to the Red Hat OpenStack Services on OpenShift (RHOSO) environment. If you are using the Object Storage API of the Ceph Object Gateway (RGW), skip the following procedure.

+
+
+
Prerequisites
+
    +
  • +

    The Object Storage service storage back-end services are running in the Red Hat OpenStack Platform (RHOSP) deployment.

    +
  • +
  • +

    The storage network is properly configured on the Red Hat OpenShift Container Platform (RHOCP) cluster. For more information, see Configuring the data plane network in Deploying Red Hat OpenStack Services on OpenShift.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create the swift-conf secret that includes the Object Storage service hash path suffix and prefix:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: Secret
    +metadata:
    +  name: swift-conf
    +  namespace: openstack
    +type: Opaque
    +data:
    +  swift.conf: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/swift/etc/swift/swift.conf | base64 -w0)
    +EOF
    +
    +
    +
  2. +
  3. +

    Create the swift-ring-files ConfigMap that includes the Object Storage service ring files:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: swift-ring-files
    +binaryData:
    +  swiftrings.tar.gz: $($CONTROLLER1_SSH "cd /var/lib/config-data/puppet-generated/swift/etc/swift && tar cz *.builder *.ring.gz backups/ | base64 -w0")
    +  account.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/account.ring.gz")
    +  container.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/container.ring.gz")
    +  object.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/object.ring.gz")
    +EOF
    +
    +
    +
  4. +
  5. +

    Patch the OpenStackControlPlane custom resource to deploy the Object Storage service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  swift:
    +    enabled: true
    +    template:
    +      memcachedInstance: memcached
    +      swiftRing:
    +        ringReplicas: 1
    +      swiftStorage:
    +        replicas: 0
    +        networkAttachments:
    +        - storage
    +        storageClass: local-storage (1)
    +        storageRequest: 10Gi
    +      swiftProxy:
    +        secret: osp-secret
    +        replicas: 1
    +        passwordSelectors:
    +          service: SwiftPassword
    +        serviceUser: swift
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +        networkAttachments: (2)
    +        - storage
    +'
    +
    +
    +
    + + + + + + + + + +
    1Must match the RHOSO deployment storage class.
    2Must match the network attachment for the previous Object Storage service configuration from the RHOSP deployment.
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Inspect the resulting Object Storage service pods:

    +
    +
    +
    $ oc get pods -l component=swift-proxy
    +
    +
    +
  • +
  • +

    Verify that the Object Storage proxy service is registered in the Identity service (keystone):

    +
    +
    +
    $ openstack service list | grep swift
    +| b5b9b1d3c79241aa867fa2d05f2bbd52 | swift    | object-store |
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep swift
    +| 32ee4bd555414ab48f2dc90a19e1bcd5 | regionOne | swift        | object-store | True    | public    | https://swift-public-openstack.apps-crc.testing/v1/AUTH_%(tenant_id)s |
    +| db4b8547d3ae4e7999154b203c6a5bed | regionOne | swift        | object-store | True    | internal  | http://swift-internal.openstack.svc:8080/v1/AUTH_%(tenant_id)s        |
    +
    +
    +
  • +
  • +

    Verify that you are able to upload and download objects:

    +
    +
    +
    openstack container create test
    ++---------------------------------------+-----------+------------------------------------+
    +| account                               | container | x-trans-id                         |
    ++---------------------------------------+-----------+------------------------------------+
    +| AUTH_4d9be0a9193e4577820d187acdd2714a | test      | txe5f9a10ce21e4cddad473-0065ce41b9 |
    ++---------------------------------------+-----------+------------------------------------+
    +
    +openstack object create test --name obj <(echo "Hello World!")
    ++--------+-----------+----------------------------------+
    +| object | container | etag                             |
    ++--------+-----------+----------------------------------+
    +| obj    | test      | d41d8cd98f00b204e9800998ecf8427e |
    ++--------+-----------+----------------------------------+
    +
    +openstack object save test obj --file -
    +Hello World!
    +
    +
    +
  • +
+
+
+ + + + + +
+ + +The Object Storage data is still stored on the existing RHOSP nodes. For more information about migrating the actual data from the RHOSP deployment to the RHOSO deployment, see Migrating the Object Storage service (swift) data from RHOSP to Red Hat OpenStack Services on OpenShift (RHOSO) nodes. +
+
+
+
+

Adopting the Image service

+
+

To adopt the Image Service (glance) you patch an existing OpenStackControlPlane custom resource (CR) that has the Image service disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.

+
+
+

The Image service adoption is complete if you see the following results:

+
+
+
    +
  • +

    The GlanceAPI service is up and running.

    +
  • +
  • +

    The Identity service endpoints are updated, and the same back end of the source cloud is available.

    +
  • +
+
+
+

To complete the Image service adoption, ensure that your environment meets the following criteria:

+
+
+
    +
  • +

    You have a running director environment (the source cloud).

    +
  • +
  • +

    You have a Single Node OpenShift or OpenShift Local that is running in the Red Hat OpenShift Container Platform (RHOCP) cluster.

    +
  • +
  • +

    Optional: An internal or external Ceph cluster is reachable from both crc and director.

    +
  • +
+
+
+

Adopting the Image service that is deployed with an Object Storage service back end

+
+

Adopt the Image Service (glance) that you deployed with an Object Storage service (swift) back end in the Red Hat OpenStack Platform (RHOSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the object storage back end:

+
+
+
+
..
+spec:
+  glance:
+   ...
+      customServiceConfig: |
+          [DEFAULT]
+          enabled_backends = default_backend:swift
+          [glance_store]
+          default_backend = default_backend
+          [default_backend]
+          swift_store_create_container_on_put = True
+          swift_store_auth_version = 3
+          swift_store_auth_address = {{ .KeystoneInternalURL }}
+          swift_store_endpoint_type = internalURL
+          swift_store_user = service:glance
+          swift_store_key = {{ .ServicePassword }}
+
+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a new file, for example, glance_swift.patch, and include the following content:

    +
    +
    +
    spec:
    +  glance:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      secret: osp-secret
    +      databaseInstance: openstack
    +      storage:
    +        storageRequest: 10G
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends = default_backend:swift
    +        [glance_store]
    +        default_backend = default_backend
    +        [default_backend]
    +        swift_store_create_container_on_put = True
    +        swift_store_auth_version = 3
    +        swift_store_auth_address = {{ .KeystoneInternalURL }}
    +        swift_store_endpoint_type = internalURL
    +        swift_store_user = service:glance
    +        swift_store_key = {{ .ServicePassword }}
    +      glanceAPIs:
    +        default:
    +          replicas: 1
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +            - storage
    +
    +
    +
    + + + + + +
    + + +The Object Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured with the Object Storage service that is not available in the OpenStackControlPlane custom resource. +After the Object Storage service, and in particular SwiftProxy, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Object Storage service. +
    +
    +
  2. +
  3. +

    Verify that SwiftProxy is available:

    +
    +
    +
    $ oc get pod -l component=swift-proxy | grep Running
    +swift-proxy-75cb47f65-92rxq   3/3     Running   0
    +
    +
    +
  4. +
  5. +

    Patch the GlanceAPI service that is deployed in the control plane:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_swift.patch
    +
    +
    +
  6. +
+
+
+
+

Adopting the Image service that is deployed with a Block Storage service back end

+
+

Adopt the Image Service (glance) that you deployed with a Block Storage service (cinder) back end in the Red Hat OpenStack Platform (RHOSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the block storage back end:

+
+
+
+
..
+spec:
+  glance:
+   ...
+      customServiceConfig: |
+          [DEFAULT]
+          enabled_backends = default_backend:cinder
+          [glance_store]
+          default_backend = default_backend
+          [default_backend]
+          rootwrap_config = /etc/glance/rootwrap.conf
+          description = Default cinder backend
+          cinder_store_auth_address = {{ .KeystoneInternalURL }}
+          cinder_store_user_name = {{ .ServiceUser }}
+          cinder_store_password = {{ .ServicePassword }}
+          cinder_store_project_name = service
+          cinder_catalog_info = volumev3::internalURL
+          cinder_use_multipath = true
+
+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a new file, for example, glance_cinder.patch, and include the following content:

    +
    +
    +
    spec:
    +  glance:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      secret: osp-secret
    +      databaseInstance: openstack
    +      storage:
    +        storageRequest: 10G
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends = default_backend:cinder
    +        [glance_store]
    +        default_backend = default_backend
    +        [default_backend]
    +        rootwrap_config = /etc/glance/rootwrap.conf
    +        description = Default cinder backend
    +        cinder_store_auth_address = {{ .KeystoneInternalURL }}
    +        cinder_store_user_name = {{ .ServiceUser }}
    +        cinder_store_password = {{ .ServicePassword }}
    +        cinder_store_project_name = service
    +        cinder_catalog_info = volumev3::internalURL
    +        cinder_use_multipath = true
    +      glanceAPIs:
    +        default:
    +          replicas: 1
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +            - storage
    +
    +
    +
    + + + + + +
    + + +The Block Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured with the Block Storage service that is not available in the OpenStackControlPlane custom resource. +After the Block Storage service, and in particular CinderVolume, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Block Storage service. +
    +
    +
  2. +
  3. +

    Verify that CinderVolume is available:

    +
    +
    +
    $ oc get pod -l component=cinder-volume | grep Running
    +cinder-volume-75cb47f65-92rxq   3/3     Running   0
    +
    +
    +
  4. +
  5. +

    Patch the GlanceAPI service that is deployed in the control plane:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_cinder.patch
    +
    +
    +
  6. +
+
+
+
+

Adopting the Image service that is deployed with an NFS back end

+
+

Adopt the Image Service (glance) that you deployed with an NFS back end. To complete the following procedure, ensure that your environment meets the following criteria:

+
+
+
    +
  • +

    The Storage network is propagated to the Red Hat OpenStack Platform (RHOSP) control plane.

    +
  • +
  • +

    The Image service can reach the Storage network and connect to the nfs-server through port 2049.

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
  • +

    In the source cloud, verify the NFS parameters that the overcloud uses to configure the Image service back end. Specifically, in your director heat templates, find the following variables that override the default content that is provided by the glance-nfs.yaml file in the +/usr/share/openstack-tripleo-heat-templates/environments/storage directory:

    +
    +
    +
    GlanceBackend: file
    +
    +GlanceNfsEnabled: true
    +
    +GlanceNfsShare: 192.168.24.1:/var/nfs
    +
    +
    +
    + + + + + +
    + + +
    +

    In this example, the GlanceBackend variable shows that the Image service itself has no notion of an NFS back end: it uses the File driver, and, in the background, the filesystem_store_datadir option is mapped to the export value that the GlanceNfsShare variable provides instead of the default /var/lib/glance/images/ path. +If you do not export the GlanceNfsShare through a network that is propagated to the adopted Red Hat OpenStack Services on OpenShift (RHOSO) control plane, you must stop the nfs-server and remap the export to the storage network. Before doing so, ensure that the Image service is stopped on the source Controller nodes, as shown in the sketch below.

    +
    +
    +

    In the control plane, the Image service is attached to the Storage network, then propagated through the associated NetworkAttachmentDefinition custom resource (CR), and the resulting pods already have the right permissions to handle the Image service traffic through this network.
+In a deployed RHOSP control plane, you can verify that the network mapping matches with what has been deployed in the director-based environment by checking both the NodeNetworkConfigurationPolicy (nncp) and the NetworkAttachmentDefinition (net-attach-def). The following is an example of the output that you should check in the Red Hat OpenShift Container Platform (RHOCP) environment to make sure that there are no issues with the propagated networks:

    +
    +
    +
    +
    $ oc get nncp
    +NAME                        STATUS      REASON
    +enp6s0-crc-8cf2w-master-0   Available   SuccessfullyConfigured
    +
    +$ oc get net-attach-def
    +NAME
    +ctlplane
    +internalapi
    +storage
    +tenant
    +
    +$ oc get ipaddresspool -n metallb-system
    +NAME          AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
    +ctlplane      true          false             ["192.168.122.80-192.168.122.90"]
    +internalapi   true          false             ["172.17.0.80-172.17.0.90"]
    +storage       true          false             ["172.18.0.80-172.18.0.90"]
    +tenant        true          false             ["172.19.0.80-172.19.0.90"]
    +
    +
    +
    +
    +
  • +
+
+
+
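A minimal sketch of remapping the export, assuming the export is managed through /etc/exports on the nfs-server, the storage network is 172.18.0.0/24, and the Image service on the source Controller nodes runs as the tripleo_glance_api systemd unit; adapt all of these assumptions to your environment:

$ sudo systemctl stop tripleo_glance_api   # run on each source Controller node
+
+# On the nfs-server: restrict the export to the storage network and reload it
+$ sudo sed -i 's|^/var/nfs .*|/var/nfs 172.18.0.0/24(rw,sync,no_root_squash)|' /etc/exports
+$ sudo exportfs -ra
+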
Procedure
+
    +
  1. +

    Adopt the Image service and create a new default GlanceAPI instance that is connected with the existing NFS share:

    +
    +
    +
    $ cat << EOF > glance_nfs_patch.yaml
    +
    +spec:
    +  extraMounts:
    +  - extraVol:
    +    - extraVolType: Nfs
    +      mounts:
    +      - mountPath: /var/lib/glance/images
    +        name: nfs
    +      propagation:
    +      - Glance
    +      volumes:
    +      - name: nfs
    +        nfs:
    +          path: <exported_path>
    +          server: <ip_address>
    +    name: r1
    +    region: r1
    +  glance:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends = default_backend:file
    +        [glance_store]
    +        default_backend = default_backend
    +        [default_backend]
    +        filesystem_store_datadir = /var/lib/glance/images/
    +      storage:
    +        storageRequest: 10G
    +      glanceAPIs:
    +        default:
    +          replicas: 0
    +          type: single
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +          - storage
    +EOF
    +
    +
    +
    +
      +
    • +

      Replace <ip_address> with the IP address that you use to reach the nfs-server.

      +
    • +
    • +

      Replace <exported_path> with the exported path in the nfs-server.

      +
    • +
    +
    +
  2. +
  3. +

    Patch the OpenStackControlPlane CR to deploy the Image service with an NFS back end:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_nfs_patch.yaml
    +
    +
    +
  4. +
+
+
+
Verification
+
    +
  • +

    When GlanceAPI is active, confirm that you can see a single API instance:

    +
    +
    +
    $ oc get pods -l service=glance
    +NAME                      READY   STATUS    RESTARTS
    +glance-default-single-0   3/3     Running   0
    +
    +
    +
    +
  • +
  • +

    Ensure that the description of the pod reports the following output:

    +
    +
    +
    Mounts:
    +...
    +  nfs:
    +    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    +    Server:    {{ server ip address }}
    +    Path:      {{ nfs export path }}
    +    ReadOnly:  false
    +...
    +
    +
    +
  • +
  • +

    Check that the mount point that points to /var/lib/glance/images is mapped to the expected NFS server IP address and export path that you defined in the new default GlanceAPI instance:

    +
    +
    +
    $ oc rsh -c glance-api glance-default-single-0
    +
    +sh-5.1# mount
    +...
    +...
    +{{ ip address }}:/var/nfs on /var/lib/glance/images type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.18.0.5,local_lock=none,addr=172.18.0.5)
    +...
    +...
    +
    +
    +
  • +
  • +

    Confirm that the image UUID is created in the exported directory on the NFS node. For example:

    +
    +
    +
    $ oc rsh openstackclient
    +$ openstack image list
    +
    +sh-5.1$  curl -L -o /tmp/cirros-0.5.2-x86_64-disk.img http://download.cirros-cloud.net/0.5.2/cirros-0.5.2-x86_64-disk.img
    +...
    +...
    +
    +sh-5.1$ openstack image create --container-format bare --disk-format raw --file /tmp/cirros-0.5.2-x86_64-disk.img cirros
    +...
    +...
    +
    +sh-5.1$ openstack image list
    ++--------------------------------------+--------+--------+
    +| ID                                   | Name   | Status |
    ++--------------------------------------+--------+--------+
    +| 634482ca-4002-4a6d-b1d5-64502ad02630 | cirros | active |
    ++--------------------------------------+--------+--------+
    +
    +
    +
  • +
  • +

    On the nfs-server node, verify that the same UUID is present in the exported /var/nfs directory:

    +
    +
    +
    $ ls /var/nfs/
    +634482ca-4002-4a6d-b1d5-64502ad02630
    +
    +
    +
  • +
+
+
+
+

Adopting the Image service that is deployed with a Red Hat Ceph Storage back end

+
+

Adopt the Image Service (glance) that you deployed with a Red Hat Ceph Storage back end. Use the customServiceConfig parameter to inject the right configuration to the GlanceAPI instance.

+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
  • +

    Ensure that the Ceph-related secret (ceph-conf-files) is created in +the openstack namespace and that the extraMounts property of the +OpenStackControlPlane custom resource (CR) is configured properly. For more information, see Configuring a Ceph back end.

    +
    +
    +
    $ cat << EOF > glance_patch.yaml
    +spec:
    +  glance:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends=default_backend:rbd
    +        [glance_store]
    +        default_backend=default_backend
    +        [default_backend]
    +        rbd_store_ceph_conf=/etc/ceph/ceph.conf
    +        rbd_store_user=openstack
    +        rbd_store_pool=images
    +        store_description=Ceph glance store backend.
    +      storage:
    +        storageRequest: 10G
    +      glanceAPIs:
    +        default:
    +          replicas: 0
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +          - storage
    +EOF
    +
    +
    +
  • +
+
+
+ + + + + +
+ + +
+

If you backed up your Red Hat OpenStack Platform (RHOSP) services configuration file from the original environment, you can compare it with the configuration file that you adopted and ensure that the configuration is correct.
+For more information, see Pulling the configuration from a director deployment.

+
+
+
+
os-diff diff /tmp/collect_tripleo_configs/glance/etc/glance/glance-api.conf glance_patch.yaml --crd
+
+
+
+

This command produces the difference between both ini configuration files.

+
+
+
+
+
Procedure
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Image service with a Red Hat Ceph Storage back end:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_patch.yaml
    +
    +
    +
  • +
+
+
+
+

Verifying the Image service adoption

+
+

Verify that you adopted the Image Service (glance) to the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 deployment.

+
+
+
Procedure
+
    +
  1. +

    Test the Image service from the Red Hat OpenStack Platform CLI. You can compare and ensure that the configuration is applied to the Image service pods:

    +
    +
    +
    $ os-diff diff /etc/glance/glance.conf.d/02-config.conf glance_patch.yaml --frompod -p glance-api
    +
    +
    +
    +

    If no line appears, then the configuration is correct.

    +
    +
  2. +
  3. +

    Inspect the resulting Image service pods:

    +
    +
    +
    GLANCE_POD=`oc get pod |grep glance-default | cut -f 1 -d' ' | head -n 1`
    +oc exec -t $GLANCE_POD -c glance-api -- cat /etc/glance/glance.conf.d/02-config.conf
    +
    +[DEFAULT]
    +enabled_backends=default_backend:rbd
    +[glance_store]
    +default_backend=default_backend
    +[default_backend]
    +rbd_store_ceph_conf=/etc/ceph/ceph.conf
    +rbd_store_user=openstack
    +rbd_store_pool=images
    +store_description=Ceph glance store backend.
    +
    +
    +
  4. +
  5. +

    If you use a Red Hat Ceph Storage back end, ensure that the Red Hat Ceph Storage secrets are mounted:

    +
    +
    +
    $ oc exec -t $GLANCE_POD -c glance-api -- ls /etc/ceph
    +ceph.client.openstack.keyring
    +ceph.conf
    +
    +
    +
  6. +
  7. +

    Check that the service is active, and that the endpoints are updated in the RHOSP CLI:

    +
    +
    +
    $ oc rsh openstackclient -n openstackclient
    +$ openstack service list | grep image
    +
    +| fc52dbffef36434d906eeb99adfc6186 | glance    | image        |
    +
    +$ openstack endpoint list | grep image
    +
    +| 569ed81064f84d4a91e0d2d807e4c1f1 | regionOne | glance       | image        | True    | internal  | http://glance-internal-openstack.apps-crc.testing   |
    +| 5843fae70cba4e73b29d4aff3e8b616c | regionOne | glance       | image        | True    | public    | http://glance-public-openstack.apps-crc.testing     |
    +
    +
    +
  8. +
  9. +

    Check that the images that you previously listed in the source cloud are available in the adopted service:

    +
    +
    +
    $ openstack image list
    ++--------------------------------------+--------+--------+
    +| ID                                   | Name   | Status |
    ++--------------------------------------+--------+--------+
    +| c3158cad-d50b-452f-bec1-f250562f5c1f | cirros | active |
    ++--------------------------------------+--------+--------+
    +
    +
    +
  10. +
+
+
+
+
+

Adopting the Placement service

+
+

To adopt the Placement service, you patch an existing OpenStackControlPlane custom resource (CR) that has the Placement service disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.

+
+
+
Prerequisites
+ +
+
+
Procedure
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Placement service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  placement:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: placement
    +      secret: osp-secret
    +      override:
    +        service:
    +          internal:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +'
    +
    +
    +
  • +
+
+
+
Verification
+
    +
  • +

    Check that the Placement service endpoints are defined and pointing to the +control plane FQDNs, and that the Placement API responds:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +$ openstack endpoint list | grep placement
    +
    +
    +# Without OpenStack CLI placement plugin installed:
    +PLACEMENT_PUBLIC_URL=$(openstack endpoint list -c 'Service Name' -c 'Service Type' -c URL | grep placement | grep public | awk '{ print $6; }')
    +oc exec -t openstackclient -- curl "$PLACEMENT_PUBLIC_URL"
    +
    +# With OpenStack CLI placement plugin installed:
    +openstack resource class list
    +
    +
    +
  • +
+
+
+
+

Adopting the Compute service

+
+

To adopt the Compute service (nova), you patch an existing OpenStackControlPlane custom resource (CR) where the Compute service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment. The following procedure describes a single-cell setup.

+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
  • +

    You have defined the following shell variables. Replace the following example values with the values that are correct for your environment:

    +
  • +
+
+
+
+
$ alias openstack="oc exec -t openstackclient -- openstack"
+
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane CR to deploy the Compute service:

    +
    + + + + + +
    + + +This procedure assumes that Compute service metadata is deployed on the top level and not on each cell level. If the RHOSP deployment has a per-cell metadata deployment, adjust the following patch as needed. You cannot run the metadata service in cell0. +
    +
    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  nova:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      secret: osp-secret
    +      apiServiceTemplate:
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=true
    +      metadataServiceTemplate:
    +        enabled: true # deploy single nova metadata on the top level
    +        override:
    +          service:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=true
    +      schedulerServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=true
    +      cellTemplates:
    +        cell0:
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=true
    +        cell1:
    +          metadataServiceTemplate:
    +            enabled: false # enable here to run it in a cell instead
    +            override:
    +                service:
    +                  metadata:
    +                    annotations:
    +                      metallb.universe.tf/address-pool: internalapi
    +                      metallb.universe.tf/allow-shared-ip: internalapi
    +                      metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                  spec:
    +                    type: LoadBalancer
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=true
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=true
    +'
    +
    +
    +
  2. +
  3. +

    If you are adopting the Compute service with the Bare Metal Provisioning service (ironic), append the following novaComputeTemplates in the cell1 section of the Compute service CR patch:

    +
    +
    +
            cell1:
    +          novaComputeTemplates:
    +            standalone:
    +              customServiceConfig: |
    +                [DEFAULT]
    +                host = <hostname>
    +                [workarounds]
    +                disable_compute_service_check_for_ffu=true
    +
    +
    +
    +
      +
    • +

      Replace <hostname> with the hostname of the node that is running the ironic Compute driver in the source cloud.

      +
    • +
    +
    +
  4. +
  5. +

    Wait for the CRs for the Compute control plane services to be ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s Nova/nova
    +
    +
    +
    + + + + + +
    + + +The local Conductor services are started for each cell, while the superconductor runs in cell0. +Note that disable_compute_service_check_for_ffu is mandatory for all imported Compute services until the external data plane is imported, and until Compute services are fast-forward upgraded. For more information, see Adopting Compute services to the RHOSO data plane and Performing a fast-forward upgrade on Compute services. +
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Check that Compute service endpoints are defined and pointing to the +control plane FQDNs, and that the Nova API responds:

    +
    +
    +
    $ openstack endpoint list | grep nova
    +$ openstack server list
    +
    +
    +
    + +
    +
  • +
  • +

    Query the superconductor to check that cell1 exists, and compare it to pre-adoption values:

    +
    +
    +
    . ~/.source_cloud_exported_variables
    +echo $PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS
    +oc rsh nova-cell0-conductor-0 nova-manage cell_v2 list_cells | grep -F '| cell1 |'
    +
    +
    +
    +

    The following changes are expected; a quick comparison sketch follows this list:

    +
    +
    +
      +
    • +

      The cell1 nova database and username become nova_cell1.

      +
    • +
    • +

      The default cell is renamed to cell1.

      +
    • +
    • +

      RabbitMQ transport URL no longer uses guest.

      +
    • +
    +
    +
  • +
+
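A hedged sketch for spotting these differences, reusing the saved pre-adoption output and the commands shown above (diff exits non-zero when the outputs differ, which is expected here):

$ . ~/.source_cloud_exported_variables
+$ diff <(echo "$PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS") \
+       <(oc rsh nova-cell0-conductor-0 nova-manage cell_v2 list_cells) || true
+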
+
+ + + + + +
+ + +At this point, the Compute service control plane services do not control the existing Compute service workloads. The control plane manages the data plane only after the data adoption process is completed. For more information, see Adopting Compute services to the RHOSO data plane. +
+
+
+
+

Adopting the Block Storage service

+
+

To adopt a director-deployed Block Storage service (cinder), create the manifest based on the existing cinder.conf file, deploy the Block Storage service, and validate the new deployment.

+
+
+
Prerequisites
+
    +
  • +

    You have reviewed the Block Storage service limitations. For more information, see Limitations for adopting the Block Storage service.

    +
  • +
  • +

    You have planned the placement of the Block Storage services.

    +
  • +
  • +

    You have prepared the Red Hat OpenShift Container Platform (RHOCP) nodes where the volume and backup services run. For more information, see RHOCP preparation for Block Storage service adoption.

    +
  • +
  • +

    The Block Storage service (cinder) is stopped.

    +
  • +
  • +

    The service databases are imported into the control plane MariaDB.

    +
  • +
  • +

    The Identity service (keystone) and Key Manager service (barbican) are adopted.

    +
  • +
  • +

    The Storage network is correctly configured on the RHOCP cluster.

    +
  • +
  • +

    You have the contents of the cinder.conf file. Download the file so that you can access it locally:

    +
    +
    +
    $CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/cinder/etc/cinder/cinder.conf > cinder.conf
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a new file, for example, cinder.patch, and apply the configuration:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=<patch_name>
    +
    +
    +
    +
      +
    • +

      Replace <patch_name> with the name of your patch file.

      +
      +

      The following example shows a cinder.patch file for an RBD deployment:

      +
      +
      +
      +
      spec:
      +  extraMounts:
      +  - extraVol:
      +    - extraVolType: Ceph
      +      mounts:
      +      - mountPath: /etc/ceph
      +        name: ceph
      +        readOnly: true
      +      propagation:
      +      - CinderVolume
      +      - CinderBackup
      +      - Glance
      +      volumes:
      +      - name: ceph
      +        projected:
      +          sources:
      +          - secret:
      +              name: ceph-conf-files
      +  cinder:
      +    enabled: true
      +    apiOverride:
      +      route: {}
      +    template:
      +      databaseInstance: openstack
      +      databaseAccount: cinder
      +      secret: osp-secret
      +      cinderAPI:
      +        override:
      +          service:
      +            internal:
      +              metadata:
      +                annotations:
      +                  metallb.universe.tf/address-pool: internalapi
      +                  metallb.universe.tf/allow-shared-ip: internalapi
      +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
      +              spec:
      +                type: LoadBalancer
      +        replicas: 1
      +        customServiceConfig: |
      +          [DEFAULT]
      +          default_volume_type=tripleo
      +      cinderScheduler:
      +        replicas: 1
      +      cinderBackup:
      +        networkAttachments:
      +        - storage
      +        replicas: 1
      +        customServiceConfig: |
      +          [DEFAULT]
      +          backup_driver=cinder.backup.drivers.ceph.CephBackupDriver
      +          backup_ceph_conf=/etc/ceph/ceph.conf
      +          backup_ceph_user=openstack
      +          backup_ceph_pool=backups
      +      cinderVolumes:
      +        ceph:
      +          networkAttachments:
      +          - storage
      +          replicas: 1
      +          customServiceConfig: |
      +            [tripleo_ceph]
      +            backend_host=hostgroup
      +            volume_backend_name=tripleo_ceph
      +            volume_driver=cinder.volume.drivers.rbd.RBDDriver
      +            rbd_ceph_conf=/etc/ceph/ceph.conf
      +            rbd_user=openstack
      +            rbd_pool=volumes
      +            rbd_flatten_volume_from_snapshot=False
      +            report_discard_supported=True
      +
      +
      +
    • +
    +
    +
  2. +
  3. +

    Retrieve the list of the previous scheduler and backup services:

    +
    +
    +
    $ openstack volume service list
    +
    ++------------------+------------------------+------+---------+-------+----------------------------+
    +| Binary           | Host                   | Zone | Status  | State | Updated At                 |
    ++------------------+------------------------+------+---------+-------+----------------------------+
    +| cinder-backup    | standalone.localdomain | nova | enabled | down  | 2023-06-28T11:00:59.000000 |
    +| cinder-scheduler | standalone.localdomain | nova | enabled | down  | 2023-06-28T11:00:29.000000 |
    +| cinder-volume    | hostgroup@tripleo_ceph | nova | enabled | up    | 2023-06-28T17:00:03.000000 |
    +| cinder-scheduler | cinder-scheduler-0     | nova | enabled | up    | 2023-06-28T17:00:02.000000 |
    +| cinder-backup    | cinder-backup-0        | nova | enabled | up    | 2023-06-28T17:00:01.000000 |
    ++------------------+------------------------+------+---------+-------+----------------------------+
    +
    +
    +
  4. +
  5. +

    Remove services for hosts that are in the down state:

    +
    +
    +
    $ oc exec -it cinder-scheduler-0 -- cinder-manage service remove <service_binary> <service_host>
    +
    +
    +
    +
      +
    • +

      Replace <service_binary> with the name of the binary, for example, cinder-backup.

      +
    • +
    • +

      Replace <service_host> with the host name, for example, cinder-backup-0.

      +
    • +
    +
    +
  6. +
  7. +

    Apply the DB data migrations:

    +
    + + + + + +
    + + +
    +

    You are not required to run the data migrations at this step, but you must run them before the next upgrade. However, for adoption, it is recommended to run the migrations now to ensure that there are no issues before you run production workloads on the deployment.

    +
    +
    +
    +
    +
    +
    $ oc exec -it cinder-scheduler-0 -- cinder-manage db online_data_migrations
    +
    +
    +
  8. +
+
+
+
Verification
+
    +
  1. +

    Ensure that the openstack alias is defined:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +
    +
  2. +
  3. +

    Confirm that Block Storage service endpoints are defined and pointing to the control plane FQDNs:

    +
    +
    +
    $ openstack endpoint list --service <endpoint>
    +
    +
    +
    +
      +
    • +

      Replace <endpoint> with the name of the endpoint that you want to confirm.

      +
    • +
    +
    +
  4. +
  5. +

    Confirm that the Block Storage services are running:

    +
    +
    +
    $ openstack volume service list
    +
    +
    +
    + + + + + +
    + + +Cinder API services do not appear in the list. However, if you get a response from the openstack volume service list command, that means at least one of the cinder API services is running. +
    +
    +
  6. +
  7. +

    Confirm that you have your previous volume types, volumes, snapshots, and backups:

    +
    +
    +
    $ openstack volume type list
    +$ openstack volume list
    +$ openstack volume snapshot list
    +$ openstack volume backup list
    +
    +
    +
  8. +
  9. +

    To confirm that the configuration is working, perform the following steps:

    +
    +
      +
    1. +

      Create a volume from an image to check that the connection to Image Service (glance) is working:

      +
      +
      +
      $ openstack volume create --image cirros --bootable --size 1 disk_new
      +
      +
      +
    2. +
    3. +

      Back up the previous attached volume:

      +
      +
      +
      $ openstack --os-volume-api-version 3.47 volume create --backup <backup_name>
      +
      +
      +
      +
        +
      • +

        Replace <backup_name> with the name of your new backup location.

        +
        + + + + + +
        + + +Do not boot a Compute service (nova) instance by using the new image-based volume, and do not try to detach the previous volume, because the Compute service and the Block Storage service are not connected yet. +
        +
        +
      • +
      +
      +
    4. +
    +
    +
  10. +
+
+
+
+

Adopting the Dashboard service

+
+

To adopt the Dashboard service (horizon), you patch an existing OpenStackControlPlane custom resource (CR) that has the Dashboard service disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform environment.

+
+
+
Prerequisites
+ +
+
+
Procedure
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Dashboard service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  horizon:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      memcachedInstance: memcached
    +      secret: osp-secret
    +'
    +
    +
    +
  • +
+
+
+
Verification
+
    +
  1. +

    Verify that the Dashboard service instance is successfully deployed and ready:

    +
    +
    +
    $ oc get horizon
    +
    +
    +
  2. +
  3. +

    Confirm that the Dashboard service is reachable and returns a 200 status code:

    +
    +
    +
    PUBLIC_URL=$(oc get horizon horizon -o jsonpath='{.status.endpoint}')
    +curl --silent --output /dev/stderr --head --write-out "%{http_code}" "$PUBLIC_URL/dashboard/auth/login/?next=/dashboard/" -k | grep 200
    +
    +
    +
  4. +
+
+
+
+

Adopting the Shared File Systems service

+
+

The Shared File Systems service (manila) in Red Hat OpenStack Services on OpenShift (RHOSO) provides a self-service API to create and manage file shares. File shares, or "shares", are built for concurrent read/write access from multiple clients. This makes the Shared File Systems service essential in cloud environments that require ReadWriteMany persistent storage.

+
+
+

File shares in RHOSO require network access. Ensure that the networking in the Red Hat OpenStack Platform (RHOSP) 17.1 environment matches the network plans for your new cloud after adoption. This ensures that tenant workloads remain connected to storage during the adoption process. The Shared File Systems service control plane services are not in the data path. Shutting down the API, scheduler, and share manager services does not impact access to existing shared file systems.

+
+
+

Typically, storage and storage device management are separate networks. The Shared File Systems service only needs access to the storage device management network.
+For example, if you used a Red Hat Ceph Storage cluster in the deployment, the "storage"
+network refers to the Red Hat Ceph Storage cluster’s public network, and the Shared File Systems service’s share manager service needs to be able to reach it.

+
+
+

The Shared File Systems service supports the following storage networking scenarios:

+
+
+
    +
  • +

    You can directly control the networking for your respective file shares.

    +
  • +
  • +

    The RHOSO administrator configures the storage networking.

    +
  • +
+
+
+

Guidelines for preparing the Shared File Systems service configuration

+
+

To deploy Shared File Systems service (manila) on the control plane, you must copy the original configuration file from the Red Hat OpenStack Platform 17.1 deployment. You must review the content in the file to make sure you are adopting the correct configuration for Red Hat OpenStack Services on OpenShift (RHOSO) 18.0. Not all of the content needs to be brought into the new cloud environment.

+
+
+

Review the following guidelines for preparing your Shared File Systems service configuration file for adoption:

+
+
+
    +
  • +

    The Shared File Systems service operator sets up the following configurations and can be ignored:

    +
    +
      +
    • +

      Database-related configuration ([database])

      +
    • +
    • +

      Service authentication (auth_strategy, [keystone_authtoken])

      +
    • +
    • +

      Message bus configuration (transport_url, control_exchange)

      +
    • +
    • +

      The default paste config (api_paste_config)

      +
    • +
    • +

      Inter-service communication configuration ([neutron], [nova], [cinder], [glance] [oslo_messaging_*])

      +
    • +
    +
    +
  • +
  • +

    Ignore the osapi_share_listen configuration. In Red Hat OpenStack Services on OpenShift (RHOSO) 18.0, you rely on Red Hat OpenShift Container Platform (RHOCP) routes and ingress.

    +
  • +
  • +

    Check for policy overrides. In RHOSO 18.0, the Shared File Systems service ships with a secure default Role-based access control (RBAC), and overrides might not be necessary.

    +
  • +
  • +

    If a custom policy is necessary, you must provide it as a ConfigMap. The following example spec illustrates how you can set up a ConfigMap called manila-policy with the contents of a file called policy.yaml:

    +
    +
    +
      spec:
    +    manila:
    +      enabled: true
    +      template:
    +        manilaAPI:
    +          customServiceConfig: |
    +             [oslo_policy]
    +             policy_file=/etc/manila/policy.yaml
    +        extraMounts:
    +        - extraVol:
    +          - extraVolType: Undefined
    +            mounts:
    +            - mountPath: /etc/manila/
    +              name: policy
    +              readOnly: true
    +            propagation:
    +            - ManilaAPI
    +            volumes:
    +            - name: policy
    +              projected:
    +                sources:
    +                - configMap:
    +                    name: manila-policy
    +                    items:
    +                      - key: policy
    +                        path: policy.yaml
    +
    +
    +
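A minimal sketch of creating that ConfigMap from a local policy.yaml file so that the key name matches the items entry in the spec above (the local file path is an assumption):

$ oc create configmap manila-policy -n openstack --from-file=policy=policy.yaml
+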
  • +
  • +

    The value of the host option under the [DEFAULT] section must be hostgroup.

    +
  • +
  • +

    To run the Shared File Systems service API service, you must add the enabled_share_protocols option to the customServiceConfig section in manila: template: manilaAPI.

    +
  • +
  • +

    If you have scheduler overrides, add them to the customServiceConfig +section in manila: template: manilaScheduler.

    +
  • +
  • +

    If you have multiple storage back-end drivers configured with RHOSP 17.1, you need to split them up when deploying RHOSO 18.0. Each storage back-end driver needs to use its own instance of the manila-share service.

    +
  • +
  • +

    If a storage back-end driver needs a custom container image, find it in the +Red Hat Ecosystem Catalog, and create or modify an OpenStackVersion custom resource (CR) to specify the custom image using the same custom name.

    +
    +

    The following example shows a manila spec from the OpenStackControlPlane CR that includes multiple storage back-end drivers, where only one is using a custom container image:

    +
    +
    +
    +
      spec:
    +    manila:
    +      enabled: true
    +      template:
    +        manilaAPI:
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_protocols = nfs
    +          replicas: 3
    +        manilaScheduler:
    +          replicas: 3
    +        manilaShares:
    +         netapp:
    +           customServiceConfig: |
    +             [DEFAULT]
    +             debug = true
    +             enabled_share_backends = netapp
    +             host = hostgroup
    +             [netapp]
    +             driver_handles_share_servers = False
    +             share_backend_name = netapp
    +             share_driver = manila.share.drivers.netapp.common.NetAppDriver
    +             netapp_storage_family = ontap_cluster
    +             netapp_transport_type = http
    +           replicas: 1
    +         pure:
    +            customServiceConfig: |
    +             [DEFAULT]
    +             debug = true
    +             enabled_share_backends=pure-1
    +             host = hostgroup
    +             [pure-1]
    +             driver_handles_share_servers = False
    +             share_backend_name = pure-1
    +             share_driver = manila.share.drivers.purestorage.flashblade.FlashBladeShareDriver
    +             flashblade_mgmt_vip = 203.0.113.15
    +             flashblade_data_vip = 203.0.10.14
    +            replicas: 1
    +
    +
    +
    +

    The following example shows the OpenStackVersion CR that defines the custom container image:

    +
    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackVersion
    +metadata:
    +  name: openstack
    +spec:
    +  customContainerImages:
    +    manilaShareImages:
    +      pure: registry.connect.redhat.com/purestorage/openstack-manila-share-pure-rhosp-18-0
    +
    +
    +
    +

    The name of the OpenStackVersion CR must match the name of your OpenStackControlPlane CR.

    +
    +
  • +
  • +

    If you are providing sensitive information, such as passwords, hostnames, and usernames, it is recommended to use RHOCP secrets and the customServiceConfigSecrets key. You can use customServiceConfigSecrets in any service. If you use third-party storage that requires credentials, create a secret that is referenced in the manila CR/patch file by using the customServiceConfigSecrets key. For example:

    +
    +
      +
    1. +

      Create a file that includes the secrets, for example, netapp_secrets.conf:

      +
      +
      +
      $ cat << __EOF__ > ~/netapp_secrets.conf
      +
      +[netapp]
      +netapp_server_hostname = 203.0.113.10
      +netapp_login = fancy_netapp_user
      +netapp_password = secret_netapp_password
      +netapp_vserver = mydatavserver
      +__EOF__
      +
      +
      +
      +
      +
      $ oc create secret generic osp-secret-manila-netapp --from-file=~/<secret> -n openstack
      +
      +
      +
      +
        +
      • +

        Replace <secret> with the name of the file that includes your secrets, for example, netapp_secrets.conf.

        +
      • +
      +
      +
    2. +
    3. +

      Add the secret to any Shared File Systems service file in the customServiceConfigSecrets section. The following example adds the osp-secret-manila-netapp secret to the manilaShares service:

      +
      +
      +
        spec:
      +    manila:
      +      enabled: true
      +      template:
      +        < . . . >
      +        manilaShares:
      +         netapp:
      +           customServiceConfig: |
      +             [DEFAULT]
      +             debug = true
      +             enabled_share_backends = netapp
      +             host = hostgroup
      +             [netapp]
      +             driver_handles_share_servers = False
      +             share_backend_name = netapp
      +             share_driver = manila.share.drivers.netapp.common.NetAppDriver
      +             netapp_storage_family = ontap_cluster
      +             netapp_transport_type = http
      +           customServiceConfigSecrets:
      +             - osp-secret-manila-netapp
      +           replicas: 1
      +    < . . . >
      +
      +
      +
    4. +
    +
    +
  • +
+
+
+
+

Deploying the Shared File Systems service on the control plane

+
+

Copy the Shared File Systems service (manila) configuration from the Red Hat OpenStack Platform (RHOSP) 17.1 deployment, and then deploy the Shared File Systems service on the control plane.

+
+
+
Prerequisites
+
    +
  • +

    The Shared File Systems service systemd services such as api, cron, and scheduler are stopped. For more information, see Stopping Red Hat OpenStack Platform services.

    +
  • +
  • +

    If the deployment uses CephFS through NFS as a storage back end, the Pacemaker ordering and collocation constraints are adjusted. For more information, see Stopping Red Hat OpenStack Platform services.

    +
  • +
  • +

    The Shared File Systems service Pacemaker service (openstack-manila-share) is stopped. For more information, see Stopping Red Hat OpenStack Platform services.

    +
  • +
  • +

    The database migration is complete. For more information, see Migrating databases to MariaDB instances.

    +
  • +
  • +

    The Red Hat OpenShift Container Platform (RHOCP) nodes where the manila-share service is to be deployed can reach the management network that the storage system is in.

    +
  • +
  • +

    If the deployment uses CephFS through NFS as a storage back end, a new clustered Ceph NFS service is deployed on the Red Hat Ceph Storage cluster with the help +of Ceph orchestrator. For more information, see Creating a Ceph NFS cluster.

    +
  • +
  • +

    Services such as the Identity service (keystone) and memcached are available prior to adopting the Shared File Systems services.

    +
  • +
  • +

    If you enabled tenant-driven networking by setting driver_handles_share_servers=True, the Networking service (neutron) is deployed.

    +
  • +
  • +

    The CONTROLLER1_SSH environment variable is defined and points to the RHOSP Controller node. Replace the following example values with values that are correct for your environment:

    +
    +
    +
    CONTROLLER1_SSH="ssh -i <path to SSH key> root@<node IP>"
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Copy the configuration file from RHOSP 17.1 for reference:

    +
    +
    +
    $ CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/manila/etc/manila/manila.conf | awk '!/^ *#/ && NF' > ~/manila.conf
    +
    +
    +
  2. +
  3. +

    Review the configuration file for configuration changes that were made since RHOSP 17.1. For more information on preparing this file for Red Hat OpenStack Services on OpenShift (RHOSO), see Guidelines for preparing the Shared File Systems service configuration.

    +
  4. +
  5. +

    Create a patch file for the OpenStackControlPlane CR to deploy the Shared File Systems service. The following example manila.patch file uses native CephFS:

    +
    +
    +
    $ cat << __EOF__ > ~/manila.patch
    +spec:
    +  manila:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: manila
    +      secret: osp-secret
    +      manilaAPI:
    +        replicas: 3 (1)
    +        customServiceConfig: |
    +          [DEFAULT]
    +          enabled_share_protocols = cephfs
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      manilaScheduler:
    +        replicas: 3 (2)
    +      manilaShares:
    +        cephfs:
    +          replicas: 1 (3)
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_backends = tripleo_ceph
    +            host = hostgroup
    +            [cephfs]
    +            driver_handles_share_servers=False
    +            share_backend_name=cephfs (4)
    +            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
    +            cephfs_conf_path=/etc/ceph/ceph.conf
    +            cephfs_auth_id=openstack
    +            cephfs_cluster_name=ceph
    +            cephfs_volume_mode=0755
    +            cephfs_protocol_helper_type=CEPHFS
    +          networkAttachments: (5)
    +              - storage
    +      extraMounts: (6)
    +      - name: v1
    +        region: r1
    +        extraVol:
    +          - propagation:
    +            - ManilaShare
    +          extraVolType: Ceph
    +          volumes:
    +          - name: ceph
    +            secret:
    +              secretName: ceph-conf-files
    +          mounts:
    +          - name: ceph
    +            mountPath: "/etc/ceph"
    +            readOnly: true
    +__EOF__
    +
    +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + +
    1Set the replica count of the manilaAPI service to 3.
    2Set the replica count of the manilaScheduler service to 3.
    3Set the replica count of the manilaShares service to 1.
    4Ensure that the names of the back ends (share_backend_name) are the same as they were in RHOSP 17.1.
    5Ensure that the appropriate storage management network is specified in the networkAttachments section. For example, the manilaShares instance with the CephFS back-end driver is connected to the storage network.
    6If you need to add extra files to any of the services, you can use extraMounts. For example, when using Red Hat Ceph Storage, you can add the Shared File Systems service Ceph user’s keyring file as well as the ceph.conf configuration file. +
    +

    The following example patch file uses CephFS through NFS:

    +
    +
    +
    +
    $ cat << __EOF__ > ~/manila.patch
    +spec:
    +  manila:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      secret: osp-secret
    +      manilaAPI:
    +        replicas: 3
    +        customServiceConfig: |
    +          [DEFAULT]
    +          enabled_share_protocols = cephfs
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      manilaScheduler:
    +        replicas: 3
    +      manilaShares:
    +        cephfs:
    +          replicas: 1
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_backends = cephfs
    +            host = hostgroup
    +            [cephfs]
    +            driver_handles_share_servers=False
    +            share_backend_name=tripleo_ceph
    +            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
    +            cephfs_conf_path=/etc/ceph/ceph.conf
    +            cephfs_auth_id=openstack
    +            cephfs_cluster_name=ceph
    +            cephfs_protocol_helper_type=NFS
    +            cephfs_nfs_cluster_id=cephfs
    +            cephfs_ganesha_server_ip=172.17.5.47
    +          networkAttachments:
    +              - storage
    +__EOF__
    +
    +
    +
    +
      +
    • +

      Prior to adopting the manilaShares service for CephFS through NFS, ensure that you create a clustered Ceph NFS service; a minimal sketch follows this list. The name of the service must match the value of the cephfs_nfs_cluster_id option, which is set to the name of the NFS cluster that is created on Red Hat Ceph Storage.

      +
    • +
    • +

      The cephfs_ganesha_server_ip option is preserved from the configuration on the RHOSP 17.1 environment.

      +
    • +
    +
    +
    +
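A minimal sketch of creating such a clustered Ceph NFS service with the Ceph orchestrator; the placement label is an assumption, and ingress and virtual IP settings are omitted here, so follow Creating a Ceph NFS cluster for the complete procedure:

$ ceph nfs cluster create cephfs "label:nfs"   # "cephfs" matches cephfs_nfs_cluster_id above
+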
  6. +
  7. +

    Patch the OpenStackControlPlane CR:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=~/<manila.patch>
    +
    +
    +
    +
      +
    • +

      Replace <manila.patch> with the name of your patch file.

      +
    • +
    +
    +
  8. +
+
+
+
Verification
+
    +
  1. +

    Inspect the resulting Shared File Systems service pods:

    +
    +
    +
    $ oc get pods -l service=manila
    +
    +
    +
  2. +
  3. +

    Check that the Shared File Systems API service is registered in the Identity service (keystone):

    +
    +
    +
    $ openstack service list | grep manila
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep manila
    +
    +| 1164c70045d34b959e889846f9959c0e | regionOne | manila       | share        | True    | internal  | http://manila-internal.openstack.svc:8786/v1/%(project_id)s        |
    +| 63e89296522d4b28a9af56586641590c | regionOne | manilav2     | sharev2      | True    | public    | https://manila-public-openstack.apps-crc.testing/v2                |
    +| af36c57adcdf4d50b10f484b616764cc | regionOne | manila       | share        | True    | public    | https://manila-public-openstack.apps-crc.testing/v1/%(project_id)s |
    +| d655b4390d7544a29ce4ea356cc2b547 | regionOne | manilav2     | sharev2      | True    | internal  | http://manila-internal.openstack.svc:8786/v2                       |
    +
    +
    +
  4. +
  5. +

    Test the health of the service:

    +
    +
    +
    $ openstack share service list
    +$ openstack share pool list --detail
    +
    +
    +
  6. +
  7. +

    Check existing workloads:

    +
    +
    +
    $ openstack share list
    +$ openstack share snapshot list
    +
    +
    +
  8. +
+
+
+
+

Decommissioning the Red Hat OpenStack Platform standalone Ceph NFS service

+
+

If your deployment uses CephFS through NFS, you must decommission the Red Hat OpenStack Platform (RHOSP) standalone NFS service. Because future software upgrades do not support the previous NFS service, it is recommended that the decommissioning period be short.

+
+
+
Prerequisites
+
    +
  • +

    You identified the new export locations for your existing shares by querying the Shared File Systems API.

    +
  • +
  • +

    You unmounted and remounted the shared file systems on each client to stop using the previous NFS server.

    +
  • +
  • +

    If you are consuming the Shared File Systems service shares with the Shared File Systems service CSI plugin for Red Hat OpenShift Container Platform (RHOCP), you migrated the shares by scaling down the application pods and scaling them back up.

    +
  • +
+
+
+ + + + + +
+ + +Clients that are creating new workloads cannot use share exports through the previous NFS service. The Shared File Systems service no longer communicates with the previous NFS service, and cannot apply or alter export rules on the previous NFS service. +
+
+
+
Procedure
+
    +
  1. +

    Remove the cephfs_ganesha_server_ip option from the manila-share service configuration:

    +
    + + + + + +
    + + +This restarts the manila-share process and removes the export locations that applied to the previous NFS service from all the shares. +
    +
    +
    +
    +
    $ cat << __EOF__ > ~/manila.patch
    +spec:
    +  manila:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      manilaShares:
    +        cephfs:
    +          replicas: 1
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_backends = cephfs
    +            host = hostgroup
    +            [cephfs]
    +            driver_handles_share_servers=False
    +            share_backend_name=cephfs
    +            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
    +            cephfs_conf_path=/etc/ceph/ceph.conf
    +            cephfs_auth_id=openstack
    +            cephfs_cluster_name=ceph
    +            cephfs_protocol_helper_type=NFS
    +            cephfs_nfs_cluster_id=cephfs
    +          networkAttachments:
    +              - storage
    +__EOF__
    +
    +
    +
  2. +
  3. +

    Patch the OpenStackControlPlane custom resource:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=~/<manila.patch>
    +
    +
    +
    +
      +
    • +

      Replace <manila.patch> with the name of your patch file.

      +
    • +
    +
    +
  4. +
  5. +

    Clean up the standalone ceph-nfs service from the RHOSP control plane nodes by disabling and deleting the Pacemaker resources associated with the service:

    +
    + + + + + +
    + + +You can defer this step until after RHOSO 18.0 is operational. During this time, you cannot decommission the Controller nodes. +
    +
    +
    +
    +
    $ sudo pcs resource disable ceph-nfs
    +$ sudo pcs resource disable ip-<VIP>
    +$ sudo pcs resource unmanage ceph-nfs
    +$ sudo pcs resource unmanage ip-<VIP>
    +
    +
    +
    +
      +
    • +

      Replace <VIP> with the IP address assigned to the ceph-nfs service in your environment.

      +
    • +
    +
    +
  6. +
+
+
+
+
+

Adopting the Bare Metal Provisioning service

+
+

Review information about your Bare Metal Provisioning service (ironic) configuration and then adopt the Bare Metal Provisioning service to the Red Hat OpenStack Services on OpenShift control plane.

+
+
+

Bare Metal Provisioning service configurations

+
+

You configure the Bare Metal Provisioning service (ironic) by using configuration snippets. For more information about configuring the control plane with the Bare Metal Provisioning service, see Customizing the Red Hat OpenStack Services on OpenShift deployment.

+
+
+

Some Bare Metal Provisioning service configuration is overridden in director; for example, PXE Loader file names are often overridden at intermediate layers. You must pay attention to the settings that you apply in your Red Hat OpenStack Services on OpenShift (RHOSO) deployment. The ironic-operator applies a reasonable working default configuration, but if you override it with your prior configuration, your experience might not be ideal or your new Bare Metal Provisioning service might fail to operate. Similarly, additional configuration might be necessary, for example, if you enable and use additional hardware types in your ironic.conf file.

+
+
+

The model of reasonable defaults includes commonly used hardware-types and driver interfaces. For example, the redfish-virtual-media boot interface and the ramdisk deploy interface are enabled by default. If you add new bare metal nodes after the adoption is complete, the driver interface selection occurs based on the order of precedence in the configuration if you do not explicitly set it on the node creation request or as an established default in the ironic.conf file.

+
+
+

Some configuration parameters do not need to be set on an individual node level, for example, network UUID values, or they are centrally configured in the ironic.conf file, as the setting controls security behavior.

+
+
+

It is critical that you carry over the following parameters, formatted as [section] and parameter name, from the prior deployment to the new deployment. These parameters govern underlying behavior, and if they were set in the previous configuration they used deployment-specific values. A configuration sketch follows this list.

+
+
+
    +
  • +

    [neutron]cleaning_network

    +
  • +
  • +

    [neutron]provisioning_network

    +
  • +
  • +

    [neutron]rescuing_network

    +
  • +
  • +

    [neutron]inspection_network

    +
  • +
  • +

    [conductor]automated_clean

    +
  • +
  • +

    [deploy]erase_devices_priority

    +
  • +
  • +

    [deploy]erase_devices_metadata_priority

    +
  • +
  • +

    [conductor]force_power_state_during_sync

    +
  • +
+
+
+
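If your prior ironic.conf set any of these parameters, the following is a hedged illustration of carrying them into the ironicConductors customServiceConfig; the values shown are placeholders for illustration, not recommendations:

customServiceConfig: |
+  [conductor]
+  automated_clean=true
+  force_power_state_during_sync=false
+  [deploy]
+  erase_devices_priority=0
+  erase_devices_metadata_priority=10
+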

You can set the following parameters individually on a node; a per-node sketch follows this list. However, you might choose to use embedded configuration options to avoid the need to set the parameters individually when creating or managing bare metal nodes. Check your prior ironic.conf file for these parameters, and if they are set, apply a specific override configuration.

+
+
+
    +
  • +

    [conductor]bootloader

    +
  • +
  • +

    [conductor]rescue_ramdisk

    +
  • +
  • +

    [conductor]rescue_kernel

    +
  • +
  • +

    [conductor]deploy_kernel

    +
  • +
  • +

    [conductor]deploy_ramdisk

    +
  • +
+
+
+
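If you prefer to set the equivalent values per node instead of centrally, the following is a hedged sketch that uses driver_info fields; the node name and image references are placeholders:

$ openstack baremetal node set <node> \
+    --driver-info deploy_kernel=<deploy_kernel_image_or_url> \
+    --driver-info deploy_ramdisk=<deploy_ramdisk_image_or_url>
+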

The instances of kernel_append_params, formerly pxe_append_params in the [pxe] and [redfish] configuration sections, are used to apply boot time options like "console" for the deployment ramdisk and as such often must be changed.

+
+
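A hedged illustration of overriding these options in a conductor customServiceConfig; the console device and the remaining kernel arguments are assumptions for illustration only:

[pxe]
+kernel_append_params = nofb nomodeset vga=normal console=ttyS0,115200n8
+[redfish]
+kernel_append_params = nofb nomodeset vga=normal console=ttyS0,115200n8
+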
+ + + + + +
+ + +You cannot migrate hardware types that are set with the enabled_hardware_types parameter in the ironic.conf file, or hardware type driver interfaces starting with staging-, into the adopted configuration. +
+
+
+
+

Deploying the Bare Metal Provisioning service

+
+

To deploy the Bare Metal Provisioning service (ironic), you patch an existing OpenStackControlPlane custom resource (CR) that has the Bare Metal Provisioning service disabled. The ironic-operator applies the configuration and starts the Bare Metal Provisioning services. After the services are running, the Bare Metal Provisioning service automatically begins polling the power state of the bare metal nodes that it manages.

+
+
+ + + + + +
+ + +By default, newer versions of the Bare Metal Provisioning service contain a more restrictive access control model while also becoming multi-tenant aware. As a result, bare metal nodes might be missing from an openstack baremetal node list command after you adopt the Bare Metal Provisioning service. Your nodes are not deleted. You must set the owner field on each bare metal node due to the increased access restrictions in the role-based access control (RBAC) model. Because this involves access controls and the model of use can be site specific, you should identify which project owns the bare metal nodes. +
+
+
+
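A hedged sketch of setting the owner field after you identify the owning project; the project name and node identifier are placeholders:

$ PROJECT_ID=$(openstack project show -f value -c id <project_name>)
+$ openstack baremetal node set <node_uuid> --owner "$PROJECT_ID"
+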
Prerequisites
+
    +
  • +

    You have imported the service databases into the control plane MariaDB.

    +
  • +
  • +

    The Identity service (keystone), Networking service (neutron), Image Service (glance), and Block Storage service (cinder) are operational.

    +
    + + + + + +
    + + +If you use the Bare Metal Provisioning service in a Bare Metal as a Service configuration, you have not yet adopted the Compute service (nova). +
    +
    +
  • +
  • +

    The Bare Metal Provisioning service conductor services must be able to reach the Baseboard Management Controllers of the hardware that is configured to be managed by the Bare Metal Provisioning service. If this hardware is unreachable, the nodes might enter "maintenance" state and remain unavailable until connectivity is restored.

    +
  • +
  • +

    You have downloaded the ironic.conf file locally:

    +
    +
    +
    $CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/ironic/etc/ironic/ironic.conf > ironic.conf
    +
    +
    +
    + + + + + +
    + + +This configuration file must come from one of the Controller nodes, not from a director undercloud node. The director undercloud node operates with a different configuration that does not apply when you adopt the Overcloud Ironic deployment. +
    +
    +
  • +
  • +

    If you are adopting the Ironic Inspector service, you need the value of the IronicInspectorSubnets director parameter. Use the same values to populate the dhcpRanges parameter in the RHOSO environment.

    +
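    As a purely hypothetical illustration of the mapping (the director parameter format and values below are assumptions, not taken from this environment), a director setting and its RHOSO equivalent might look like this:

    # Hypothetical director parameter:
    +#   IronicInspectorSubnets:
    +#     - ip_range: 172.20.1.190,172.20.1.199
    +#       gateway: 172.20.1.1
    +# Corresponding dhcpRanges entry in the ironicInspector template:
    +dhcpRanges:
    +  - name: inspector-0
    +    cidr: 172.20.1.0/24
    +    start: 172.20.1.190
    +    end: 172.20.1.199
    +    gateway: 172.20.1.1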
  • +
  • +

    You have defined the following shell variables. Replace the following example values with values that apply to your environment:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane custom resource (CR) to deploy the Bare Metal Provisioning service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  ironic:
    +    enabled: true
    +    template:
    +      rpcTransport: oslo
    +      databaseInstance: openstack
    +      ironicAPI:
    +        replicas: 1
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      ironicConductors:
    +      - replicas: 1
    +        networkAttachments:
    +          - baremetal
    +        provisionNetwork: baremetal
    +        storageRequest: 10G
    +        customServiceConfig: |
    +          [neutron]
    +          cleaning_network=<cleaning network uuid>
    +          provisioning_network=<provisioning network uuid>
    +          rescuing_network=<rescuing network uuid>
    +          inspection_network=<introspection network uuid>
    +          [conductor]
    +          automated_clean=true
    +      ironicInspector:
    +        replicas: 1
    +        inspectionNetwork: baremetal
    +        networkAttachments:
    +          - baremetal
    +        dhcpRanges:
    +          - name: inspector-0
    +            cidr: 172.20.1.0/24
    +            start: 172.20.1.190
    +            end: 172.20.1.199
    +            gateway: 172.20.1.1
    +        serviceUser: ironic-inspector
    +        databaseAccount: ironic-inspector
    +        passwordSelectors:
    +          database: IronicInspectorDatabasePassword
    +          service: IronicInspectorPassword
    +      ironicNeutronAgent:
    +        replicas: 1
    +        rabbitMqClusterName: rabbitmq
    +      secret: osp-secret
    +'
    +
    +
    +
  2. +
  3. +

    Wait for the Bare Metal Provisioning service control plane services CRs to become ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s ironics.ironic.openstack.org ironic
    +
    +
    +
  4. +
  5. +

    Verify that the individual services are ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s ironicapis.ironic.openstack.org ironic-api
    +$ oc wait --for condition=Ready --timeout=300s ironicconductors.ironic.openstack.org ironic-conductor
    +$ oc wait --for condition=Ready --timeout=300s ironicinspectors.ironic.openstack.org ironic-inspector
    +$ oc wait --for condition=Ready --timeout=300s ironicneutronagents.ironic.openstack.org ironic-ironic-neutron-agent
    +
    +
    +
  6. +
  7. +

    Update the DNS nameservers on the provisioning, cleaning, and rescue networks:

    +
    + + + + + +
    + + +For name resolution to work for Bare Metal Provisioning service operations, you must set the DNS nameserver to use the internal DNS servers in the RHOSO control plane: +
    +
    +
    +
    +
    $ openstack subnet set --dns-nameserver 192.168.122.80 provisioning-subnet
    +
    +
    +
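    If your environment uses separate cleaning and rescue subnets, repeat the command for each of them; the subnet names below are assumptions and must match the subnets in your deployment:

    $ openstack subnet set --dns-nameserver 192.168.122.80 cleaning-subnet
    +$ openstack subnet set --dns-nameserver 192.168.122.80 rescuing-subnet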
  8. +
  9. +

    Verify that no Bare Metal Provisioning service nodes are missing from the node list:

    +
    +
    +
    $ openstack baremetal node list
    +
    +
    +
    + + + + + +
    + + +If the openstack baremetal node list command output reports an incorrect power status, wait a few minutes and re-run the command to see if the output syncs with the actual state of the managed hardware. The time required for the Bare Metal Provisioning service to review and reconcile the power state of the bare metal nodes depends on the number of operating conductors, which is set through the replicas parameter, and on the number of nodes present in the Bare Metal Provisioning service deployment being adopted. +
    +
    +
  10. +
  11. +

    If any Bare Metal Provisioning service nodes are missing from the openstack baremetal node list command, temporarily disable the new RBAC policy to see the nodes again:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  ironic:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      ironicAPI:
    +        replicas: 1
    +        customServiceConfig: |
    +          [oslo_policy]
    +          enforce_scope=false
    +          enforce_new_defaults=false
    +'
    +
    +
    +
  12. +
  13. +

    After you set the owner field on the bare metal nodes, you can re-enable RBAC by removing the customServiceConfig section or by setting the following values to true:

    +
    +
    +
    customServiceConfig: |
    +  [oslo_policy]
    +  enforce_scope=true
    +  enforce_new_defaults=true
    +
    +
    +
  14. +
  15. +

    After this configuration is applied, the operator restarts the Ironic API service and disables the new RBAC policy that is enabled by default. After the RBAC policy is disabled, you can view bare metal nodes without an owner field:

    +
    +
    +
    $ openstack baremetal node list -f uuid,provision_state,owner
    +
    +
    +
  16. +
  17. +

    Assign all bare metal nodes with no owner to a new project, for example, the admin project:

    +
    +
    +
    ADMIN_PROJECT_ID=$(openstack project show -c id -f value --domain default admin)
    +for node in $(openstack baremetal node list -f json -c UUID -c Owner | jq -r '.[] | select(.Owner == null) | .UUID'); do openstack baremetal node set --owner $ADMIN_PROJECT_ID $node; done
    +
    +
    +
  18. +
  19. +

    Re-apply the default RBAC:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  ironic:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      ironicAPI:
    +        replicas: 1
    +        customServiceConfig: |
    +          [oslo_policy]
    +          enforce_scope=true
    +          enforce_new_defaults=true
    +'
    +
    +
    +
  20. +
+
+
+
Verification
+
    +
  1. +

    Verify the list of endpoints:

    +
    +
    +
    $ openstack endpoint list |grep ironic
    +
    +
    +
  2. +
  3. +

    Verify the list of bare metal nodes:

    +
    +
    +
    $ openstack baremetal node list
    +
    +
    +
  4. +
+
+
+
+
+

Adopting the Orchestration service

+
+

To adopt the Orchestration service (heat), you patch an existing OpenStackControlPlane custom resource (CR), where the Orchestration service +is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.

+
+
+

After you complete the adoption process, you have CRs for Heat, HeatAPI, HeatEngine, and HeatCFNAPI, and endpoints within the Identity service (keystone) to facilitate these services.

+
+
+
Prerequisites
+
    +
  • +

    The source director environment is running.

    +
  • +
  • +

    The target Red Hat OpenShift Container Platform (RHOCP) environment is running.

    +
  • +
  • +

    You adopted MariaDB and the Identity service.

    +
  • +
  • +

    If your existing Orchestration service stacks contain resources from other services, such as the Networking service (neutron), Compute service (nova), or Object Storage service (swift), adopt those services before you adopt the Orchestration service.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Retrieve the existing auth_encryption_key and service passwords. You use these passwords to patch the osp-secret. In the following example, the auth_encryption_key is used as HeatAuthEncryptionKey and the service password is used as HeatPassword:

    +
    +
    +
    [stack@rhosp17 ~]$ grep -E 'HeatPassword|HeatAuth' ~/overcloud-deploy/overcloud/overcloud-passwords.yaml
    +  HeatAuthEncryptionKey: Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2
    +  HeatPassword: dU2N0Vr2bdelYH7eQonAwPfI3
    +
    +
    +
  2. +
  3. +

    Log in to a Controller node and verify the auth_encryption_key value in use:

    +
    +
    +
    [stack@rhosp17 ~]$ ansible -i overcloud-deploy/overcloud/config-download/overcloud/tripleo-ansible-inventory.yaml overcloud-controller-0 -m shell -a "grep auth_encryption_key /var/lib/config-data/puppet-generated/heat/etc/heat/heat.conf | grep -Ev '^#|^$'" -b
    +overcloud-controller-0 | CHANGED | rc=0 >>
    +auth_encryption_key=Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2
    +
    +
    +
  4. +
  5. +

    Encode the password to Base64 format:

    +
    +
    +
    $ echo Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2 | base64
    +UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIK
    +
    +
    +
  6. +
  7. +

    Patch the osp-secret to update the HeatAuthEncryptionKey and HeatPassword parameters. These values must match the values in the director Orchestration service configuration:

    +
    +
    +
    $ oc patch secret osp-secret --type='json' -p='[{"op" : "replace" ,"path" : "/data/HeatAuthEncryptionKey" ,"value" : "UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIK"}]'
    +secret/osp-secret patched
    +
    +
    +
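    The command above patches only HeatAuthEncryptionKey. The following is a hedged sketch of the equivalent patch for HeatPassword, reusing the password retrieved earlier and encoding it the same way:

    $ oc patch secret osp-secret --type='json' -p='[{"op" : "replace" ,"path" : "/data/HeatPassword" ,"value" : "'"$(echo dU2N0Vr2bdelYH7eQonAwPfI3 | base64)"'"}]'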
  8. +
  9. +

    Patch the OpenStackControlPlane CR to deploy the Orchestration service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  heat:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: heat
    +      secret: osp-secret
    +      memcachedInstance: memcached
    +      passwordSelectors:
    +        authEncryptionKey: HeatAuthEncryptionKey
    +        service: HeatPassword
    +'
    +
    +
    +
  10. +
+
+
+
Verification
+
    +
  1. +

    Ensure that the statuses of all the CRs are Setup complete:

    +
    +
    +
    $ oc get Heat,HeatAPI,HeatEngine,HeatCFNAPI
    +NAME                           STATUS   MESSAGE
    +heat.heat.openstack.org/heat   True     Setup complete
    +
    +NAME                                  STATUS   MESSAGE
    +heatapi.heat.openstack.org/heat-api   True     Setup complete
    +
    +NAME                                        STATUS   MESSAGE
    +heatengine.heat.openstack.org/heat-engine   True     Setup complete
    +
    +NAME                                        STATUS   MESSAGE
    +heatcfnapi.heat.openstack.org/heat-cfnapi   True     Setup complete
    +
    +
    +
  2. +
  3. +

    Check that the Orchestration service is registered in the Identity service:

    +
    +
    +
    $ oc exec -it openstackclient -- openstack service list -c Name -c Type
    ++------------+----------------+
    +| Name       | Type           |
    ++------------+----------------+
    +| heat       | orchestration  |
    +| glance     | image          |
    +| heat-cfn   | cloudformation |
    +| ceilometer | Ceilometer     |
    +| keystone   | identity       |
    +| placement  | placement      |
    +| cinderv3   | volumev3       |
    +| nova       | compute        |
    +| neutron    | network        |
    ++------------+----------------+
    +
    +
    +
    +
    +
    $ oc exec -it openstackclient -- openstack endpoint list --service=heat -f yaml
    +- Enabled: true
    +  ID: 1da7df5b25b94d1cae85e3ad736b25a5
    +  Interface: public
    +  Region: regionOne
    +  Service Name: heat
    +  Service Type: orchestration
    +  URL: http://heat-api-public-openstack-operators.apps.okd.bne-shift.net/v1/%(tenant_id)s
    +- Enabled: true
    +  ID: 414dd03d8e9d462988113ea0e3a330b0
    +  Interface: internal
    +  Region: regionOne
    +  Service Name: heat
    +  Service Type: orchestration
    +  URL: http://heat-api-internal.openstack-operators.svc:8004/v1/%(tenant_id)s
    +
    +
    +
  4. +
  5. +

    Check that the Orchestration service engine services are running:

    +
    +
    +
    $ oc exec -it openstackclient -- openstack orchestration service list -f yaml
    +- Binary: heat-engine
    +  Engine ID: b16ad899-815a-4b0c-9f2e-e6d9c74aa200
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:01.000000'
    +- Binary: heat-engine
    +  Engine ID: 887ed392-0799-4310-b95c-ac2d3e6f965f
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:00.000000'
    +- Binary: heat-engine
    +  Engine ID: 26ed9668-b3f2-48aa-92e8-2862252485ea
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:00.000000'
    +- Binary: heat-engine
    +  Engine ID: 1011943b-9fea-4f53-b543-d841297245fd
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:01.000000'
    +
    +
    +
  6. +
  7. +

    Verify that you can see your Orchestration service stacks:

    +
    +
    +
    $ openstack stack list -f yaml
    +- Creation Time: '2023-10-11T22:03:20Z'
    +  ID: 20f95925-7443-49cb-9561-a1ab736749ba
    +  Project: 4eacd0d1cab04427bc315805c28e66c9
    +  Stack Name: test-networks
    +  Stack Status: CREATE_COMPLETE
    +  Updated Time: null
    +
    +
    +
  8. +
+
+
+
+

Adopting the Loadbalancer service

+
+

During the adoption process, the Loadbalancer service (octavia) must stay disabled in the new control plane.

+
+
+

Certificates

+
+

Before you run the script below, set the shell variables CONTROLLER1_SSH and CONTROLLER1_SCP to the commands for logging in to one of the controllers with ssh and scp, respectively, as the root user, as shown below.

+
+
+
+
$ CONTROLLER1_SSH="ssh -i <path to the ssh key> root@192.168.122.100"
+$ CONTROLLER1_SCP="scp -i <path to the ssh key> root@192.168.122.100"
+
+
+
+

Make sure to replace <path to the ssh key> with the correct path to the ssh +key for connecting to the controller.

+
+
+
+
SERVER_CA_PASSPHRASE=$($CONTROLLER1_SSH grep ^ca_private_key_passphrase /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf)
+export SERVER_CA_PASSPHRASE=$(echo "${SERVER_CA_PASSPHRASE}"  | cut -d '=' -f 2 | xargs)
+export CLIENT_PASSPHRASE="ThisIsOnlyAppliedTemporarily"
+CERT_SUBJECT="/C=US/ST=Denial/L=Springfield/O=Dis/CN=www.example.com"
+CERT_MIGRATE_PATH="$HOME/octavia_cert_migration"
+
+mkdir -p ${CERT_MIGRATE_PATH}
+cd ${CERT_MIGRATE_PATH}
+# Set up the server CA
+mkdir -p server_ca
+cd server_ca
+mkdir -p certs crl newcerts private csr
+chmod 700 private
+${CONTROLLER1_SCP}:/var/lib/config-data/puppet-generated/octavia/etc/octavia/certs/private/cakey.pem private/server_ca.key.pem
+chmod 400 private/server_ca.key.pem
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/client-.pem certs/old_client_cert.pem
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/index.txt* ./
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/serial* ./
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/openssl.cnf ../
+openssl req -config ../openssl.cnf -key private/server_ca.key.pem -new -passin env:SERVER_CA_PASSPHRASE -x509 -days 18250 -sha256 -extensions v3_ca -out certs/server_ca.cert.pem -subj "/C=US/ST=Denial/L=Springfield/O=Dis/CN=www.example.com"
+
+# Set up the new client CA
+sed -i "s|^dir\s\+=\s\+\"/tmp/octavia-ssl\"|dir = \"$CERT_MIGRATE_PATH/client_ca\"|" ../openssl.cnf
+cd ${CERT_MIGRATE_PATH}
+mkdir -p client_ca
+cd client_ca
+mkdir -p certs crl csr newcerts private
+chmod 700 private
+touch index.txt
+echo 1000 > serial
+openssl genrsa -aes256 -out private/ca.key.pem -passout env:SERVER_CA_PASSPHRASE 4096
+chmod 400 private/ca.key.pem
+openssl req -config ../openssl.cnf -key private/ca.key.pem -new -passin env:SERVER_CA_PASSPHRASE -x509 -days 18250 -sha256 -extensions v3_ca -out certs/client_ca.cert.pem -subj "${CERT_SUBJECT}"
+
+# Create client certificates
+cd ${CERT_MIGRATE_PATH}/client_ca
+openssl genrsa -aes256 -out private/client.key.pem -passout env:CLIENT_PASSPHRASE 4096
+openssl req -config ../openssl.cnf -new -passin env:CLIENT_PASSPHRASE -sha256 -key private/client.key.pem -out csr/client.csr.pem -subj "${CERT_SUBJECT}"
+mkdir -p ${CERT_MIGRATE_PATH}/client_ca/private ${CERT_MIGRATE_PATH}/client_ca/newcerts ${CERT_MIGRATE_PATH}/private
+chmod 700 ${CERT_MIGRATE_PATH}/client_ca/private ${CERT_MIGRATE_PATH}/private
+
+cp ${CERT_MIGRATE_PATH}/client_ca/private/ca.key.pem ${CERT_MIGRATE_PATH}/client_ca/private/cakey.pem
+cp ${CERT_MIGRATE_PATH}/client_ca/certs/client_ca.cert.pem $CERT_MIGRATE_PATH/client_ca/ca_01.pem
+openssl ca -config ../openssl.cnf -extensions usr_cert -passin env:SERVER_CA_PASSPHRASE -days 1825 -notext -batch -md sha256 -in csr/client.csr.pem -out certs/client.cert.pem
+openssl rsa -passin env:CLIENT_PASSPHRASE -in private/client.key.pem -out private/client.cert-and-key.pem
+cat certs/client.cert.pem >> private/client.cert-and-key.pem
+
+# Install new data in k8s
+oc apply -f - <<EOF
+apiVersion: v1
+kind: Secret
+metadata:
+  name: octavia-certs-secret
+  namespace: openstack
+type: Opaque
+data:
+  server_ca.key.pem:  $(cat ${CERT_MIGRATE_PATH}/server_ca/private/server_ca.key.pem | base64 -w0)
+  server_ca.cert.pem: $(cat ${CERT_MIGRATE_PATH}/server_ca/certs/server_ca.cert.pem | base64 -w0)
+  client_ca.cert.pem: $(cat ${CERT_MIGRATE_PATH}/client_ca/certs/client_ca.cert.pem | base64 -w0)
+  client.cert-and-key.pem: $(cat ${CERT_MIGRATE_PATH}/client_ca/private/client.cert-and-key.pem | base64 -w0)
+EOF
+
+oc apply -f - <<EOF
+apiVersion: v1
+kind: Secret
+metadata:
+  name: octavia-ca-passphrase
+  namespace: openstack
+type: Opaque
+data:
+  server-ca-passphrase: $(echo $SERVER_CA_PASSPHRASE | base64 -w0)
+EOF
+
+rm -rf ${CERT_MIGRATE_PATH}
+
+
+
+

These commands convert the existing single CA configuration into a dual CA configuration.

+
+
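Optionally, as a quick sanity check, confirm that both secrets now exist before you enable the service:

$ oc get secret octavia-certs-secret octavia-ca-passphrase -n openstack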
+
+

Enabling the Loadbalancer service in OpenShift

+
+

Run the following command to enable the Loadbalancer service CR:

+
+
+
+
$ oc patch openstackcontrolplane openstack --type=merge --patch '
+spec:
+  octavia:
+    enabled: true
+    template: {}
+'
+
+
+
+
+
+

Adopting Telemetry services

+
+

To adopt Telemetry services, you patch an existing OpenStackControlPlane custom resource (CR) that has Telemetry services disabled to start the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) 17.1 environment.

+
+
+

If you adopt Telemetry services, the observability solution that is used in the RHOSP 17.1 environment, Service Telemetry Framework, is removed from the cluster. The new solution is deployed in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, allowing for metrics, and optionally logs, to be retrieved and stored in the new back ends.

+
+
+

You cannot automatically migrate old data because different back ends are used. Metrics and logs are considered short-lived data and are not intended to be migrated to the RHOSO environment. For information about adopting legacy autoscaling stack templates to the RHOSO environment, see Adopting Autoscaling services.

+
+
+
Prerequisites
+
    +
  • +

    The director environment is running (the source cloud).

    +
  • +
  • +

    The Single Node OpenShift or OpenShift Local is running in the Red Hat OpenShift Container Platform (RHOCP) cluster.

    +
  • +
  • +

    Previous adoption steps are completed.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane CR to deploy cluster-observability-operator:

    +
    +
    +
    $ oc create -f - <<EOF
    +apiVersion: operators.coreos.com/v1alpha1
    +kind: Subscription
    +metadata:
    +  name: cluster-observability-operator
    +  namespace: openshift-operators
    +spec:
    +  channel: development
    +  installPlanApproval: Automatic
    +  name: cluster-observability-operator
    +  source: redhat-operators
    +  sourceNamespace: openshift-marketplace
    +EOF
    +
    +
    +
  2. +
  3. +

    Wait for the installation to succeed:

    +
    +
    +
    $ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=openshift-operators -l operators.coreos.com/cluster-observability-operator.openshift-operators
    +
    +
    +
  4. +
  5. +

    Patch the OpenStackControlPlane CR to deploy Ceilometer services:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    enabled: true
    +    template:
    +      ceilometer:
    +        passwordSelector:
    +          ceilometerService: CeilometerPassword
    +        enabled: true
    +        secret: osp-secret
    +        serviceUser: ceilometer
    +'
    +
    +
    +
  6. +
  7. +

    Enable the metrics storage back end:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    template:
    +      metricStorage:
    +        enabled: true
    +        monitoringStack:
    +          alertingEnabled: true
    +          scrapeInterval: 30s
    +          storage:
    +            strategy: persistent
    +            retention: 24h
    +            persistent:
    +              pvcStorageRequest: 20G
    +'
    +
    +
    +
  8. +
+
+
+
Verification
+
    +
  1. +

    Verify that the alertmanager and prometheus pods are available:

    +
    +
    +
    $ oc get pods -l alertmanager=metric-storage -n openstack
    +NAME                            READY   STATUS    RESTARTS   AGE
    +alertmanager-metric-storage-0   2/2     Running   0          46s
    +alertmanager-metric-storage-1   2/2     Running   0          46s
    +
    +$ oc get pods -l prometheus=metric-storage -n openstack
    +NAME                          READY   STATUS    RESTARTS   AGE
    +prometheus-metric-storage-0   3/3     Running   0          46s
    +
    +
    +
  2. +
  3. +

    Inspect the resulting Ceilometer pods:

    +
    +
    +
    CEILOMETER_POD=`oc get pods -l service=ceilometer -n openstack | tail -n 1 | cut -f 1 -d' '`
    +oc exec -t $CEILOMETER_POD -c ceilometer-central-agent -- cat /etc/ceilometer/ceilometer.conf
    +
    +
    +
  4. +
  5. +

    Inspect enabled pollsters:

    +
    +
    +
    $ oc get secret ceilometer-config-data -o jsonpath="{.data['polling\.yaml\.j2']}"  | base64 -d
    +
    +
    +
  6. +
  7. +

    Optional: Override default pollsters according to the requirements of your environment:

    +
    +
    +
    $ oc patch openstackcontrolplane controlplane --type=merge --patch '
    +spec:
    +  telemetry:
    +    template:
    +      ceilometer:
    +          defaultConfigOverwrite:
    +            polling.yaml.j2: |
    +              ---
    +              sources:
    +                - name: pollsters
    +                  interval: 100
    +                  meters:
    +                    - volume.*
    +                    - image.size
    +          enabled: true
    +          secret: osp-secret
    +'
    +
    +
    +
  8. +
+
+
+
Next steps
+
    +
  1. +

    Optional: Patch the OpenStackControlPlane CR to include logging:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    template:
    +      logging:
    +        enabled: false
    +        ipaddr: 172.17.0.80
    +        port: 10514
    +        cloNamespace: openshift-logging
    +'
    +
    +
    +
  2. +
+
+
+
+

Adopting autoscaling services

+
+

To adopt services that enable autoscaling, you patch an existing OpenStackControlPlane custom resource (CR) where the Alarming services (aodh) are disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform environment.

+
+
+
Prerequisites
+
    +
  • +

    The source director environment is running.

    +
  • +
  • +

    A Single Node OpenShift or OpenShift Local is running in the Red Hat OpenShift Container Platform (RHOCP) cluster.

    +
  • +
  • +

    You have adopted the following services:

    +
    +
      +
    • +

      MariaDB

      +
    • +
    • +

      Identity service (keystone)

      +
    • +
    • +

      Orchestration service (heat)

      +
    • +
    • +

      Telemetry service

      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane CR to deploy the autoscaling services:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    enabled: true
    +    template:
    +      autoscaling:
    +        enabled: true
    +        aodh:
    +          passwordSelector:
    +            aodhService: AodhPassword
    +          databaseAccount: aodh
    +          databaseInstance: openstack
    +          secret: osp-secret
    +          serviceUser: aodh
    +        heatInstance: heat
    +'
    +
    +
    +
  2. +
  3. +

    Inspect the aodh pods:

    +
    +
    +
    $ AODH_POD=`oc get pods -l service=aodh -n openstack | tail -n 1 | cut -f 1 -d' '`
    +$ oc exec -t $AODH_POD -c aodh-api -- cat /etc/aodh/aodh.conf
    +
    +
    +
  4. +
  5. +

    Check whether the aodh API service is registered in the Identity service:

    +
    +
    +
    $ openstack endpoint list | grep aodh
    +| d05d120153cd4f9b8310ac396b572926 | regionOne | aodh  | alarming  | True    | internal  | http://aodh-internal.openstack.svc:8042  |
    +| d6daee0183494d7a9a5faee681c79046 | regionOne | aodh  | alarming  | True    | public    | http://aodh-public.openstack.svc:8042    |
    +
    +
    +
  6. +
  7. +

    Optional: Create aodh alarms with the PrometheusAlarm alarm type:

    +
    + + + + + +
    + + +You must use the PrometheusAlarm alarm type instead of GnocchiAggregationByResourcesAlarm. +
    +
    +
    +
    +
    $ openstack alarm create --name high_cpu_alarm \
    +--type prometheus \
    +--query "(rate(ceilometer_cpu{resource_name=~'cirros'})) * 100" \
    +--alarm-action 'log://' \
    +--granularity 15 \
    +--evaluation-periods 3 \
    +--comparison-operator gt \
    +--threshold 7000000000
    +
    +
    +
    +
      +
    1. +

      Verify that the alarm is enabled:

      +
      +
      +
      $ openstack alarm list
      ++--------------------------------------+------------+------------------+-------+----------+---------+
      +| alarm_id                             | type       | name             | state | severity | enabled |
      ++--------------------------------------+------------+------------------+-------+----------+---------+
      +| 209dc2e9-f9d6-40e5-aecc-e767ce50e9c0 | prometheus | prometheus_alarm | ok    | low      | True    |
      ++--------------------------------------+------------+------------------+-------+----------+---------+
      +
      +
      +
    2. +
    +
    +
  8. +
+
+
+
+

Pulling the configuration from a director deployment

+
+

Before you start the data plane adoption workflow, back up the configuration from the Red Hat OpenStack Platform (RHOSP) services and director. You can then use the files during the configuration of the adopted services to ensure that nothing is missed or misconfigured.

+
+
+
Prerequisites
+ +
+
+
Procedure
+
    +
  1. +

    Update the ssh parameters in the os-diff.cfg file according to your environment. The os-diff tool uses the ssh parameters to connect to your director node and to query and download the configuration files:

    +
    +
    +
    ssh_cmd=ssh -F ssh.config standalone
    +container_engine=podman
    +connection=ssh
    +remote_config_path=/tmp/tripleo
    +
    +
    +
    +

    Ensure that the ssh command you provide in the ssh_cmd parameter is correct and that it uses key authentication.

    +
    +
  2. +
  3. +

    Enable the services that you want to include in the /etc/os-diff/config.yaml file, and disable the services that you want to exclude from the file. Ensure that you have the correct permissions to edit the file:

    +
    +
    +
    $ chown ospng:ospng /etc/os-diff/config.yaml
    +
    +
    +
    +

    The following example enables the default Identity service (keystone) to be included in the /etc/os-diff/config.yaml file:

    +
    +
    +
    +
    # service name and file location
    +services:
    +  # Service name
    +  keystone:
    +    # Bool to enable/disable a service (not implemented yet)
    +    enable: true
    +    # Pod name, in both OCP and podman context.
    +    # It could be strict match or will only just grep the podman_name
    +    # and work with all the pods which matched with pod_name.
    +    # To enable/disable use strict_pod_name_match: true/false
    +    podman_name: keystone
    +    pod_name: keystone
    +    container_name: keystone-api
    +    # pod options
    +    # strict match for getting pod id in TripleO and podman context
    +    strict_pod_name_match: false
    +    # Path of the config files you want to analyze.
    +    # It could be whatever path you want:
    +    # /etc/<service_name> or /etc or /usr/share/<something> or even /
    +    # @TODO: need to implement loop over path to support multiple paths such as:
    +    # - /etc
    +    # - /usr/share
    +    path:
    +      - /etc/
    +      - /etc/keystone
    +      - /etc/keystone/keystone.conf
    +      - /etc/keystone/logging.conf
    +
    +
    +
    +

    Repeat this step for each RHOSP service that you want to disable or enable.

    +
    +
  4. +
  5. +

    If you use non-containerized services, such as the ovs-external-ids, pull the configuration or the command output. For example:

    +
    +
    +
    services:
    +  ovs_external_ids:
    +    hosts: (1)
    +      - standalone
    +    service_command: "ovs-vsctl list Open_vSwitch . | grep external_ids | awk -F ': ' '{ print $2; }'" (2)
    +    cat_output: true (3)
    +    path:
    +      - ovs_external_ids.json
    +    config_mapping: (4)
    +      ovn-bridge-mappings: edpm_ovn_bridge_mappings (5)
    +      ovn-bridge: edpm_ovn_bridge
    +      ovn-encap-type: edpm_ovn_encap_type
    +      ovn-monitor-all: ovn_monitor_all
    +      ovn-remote-probe-interval: edpm_ovn_remote_probe_interval
    +      ovn-ofctrl-wait-before-clear: edpm_ovn_ofctrl_wait_before_clear
    +
    +
    +
    + + + + + +
    + + +You must correctly configure an SSH configuration file or equivalent for non-standard services, such as OVS. The ovs_external_ids service does not run in a container, and the OVS data is stored on each host of your cloud, for example, controller_1/controller_2/, and so on. +
    +
    +
    + + + + + + + + + + + + + + + + + + + + + +
    1The list of hosts, for example, compute-1, compute-2.
    2The command that runs against the hosts.
    3Os-diff gets the output of the command and stores the output in a file that is specified by the key path.
    4Provides a mapping between, in this example, the data plane custom resource definition and the ovs-vsctl output.
    5The edpm_ovn_bridge_mappings variable must be a list of strings, for example, ["datacentre:br-ex"]. +
    +
      +
    1. +

      Compare the values:

      +
      +
      +
      $ os-diff diff ovs_external_ids.json edpm.crd --crd --service ovs_external_ids
      +
      +
      +
      +

      For example, to check the /etc/yum.conf file on every host, put the following statement in the config.yaml file. The following example uses a service entry called yum_config:

      +
      +
      +
      +
      services:
      +  yum_config:
      +    hosts:
      +      - undercloud
      +      - controller_1
      +      - compute_1
      +      - compute_2
      +    service_command: "cat /etc/yum.conf"
      +    cat_output: true
      +    path:
      +      - yum.conf
      +
      +
      +
    2. +
    +
    +
    +
  6. +
  7. +

    Pull the configuration:

    +
    + + + + + +
    + + +
    +

    The following command pulls all the configuration files that are included in the /etc/os-diff/config.yaml file. You can configure os-diff to update this file automatically according to your running environment by using the --update or --update-only option. These options set the podman information into the config.yaml for all running containers. The podman information can be useful later, when all the Red Hat OpenStack Platform services are turned off.

    +
    +
    +

    Note that when the config.yaml file is populated automatically you must provide the configuration paths manually for each service.

    +
    +
    +
    +
    +
    +
    # will only update the /etc/os-diff/config.yaml
    +os-diff pull --update-only
    +
    +
    +
    +
    +
    # will update the /etc/os-diff/config.yaml and pull configuration
    +os-diff pull --update
    +
    +
    +
    +
    +
    # will update the /etc/os-diff/config.yaml and pull configuration
    +os-diff pull
    +
    +
    +
    +

    The configuration is pulled and stored by default in the following directory:

    +
    +
    +
    +
    /tmp/tripleo/
    +
    +
    +
  8. +
+
+
+
Verification
+
    +
  • +

    Verify that you have a directory for each service configuration in your local path:

    +
    +
    +
      ▾ tmp/
    +    ▾ tripleo/
    +      ▾ glance/
    +      ▾ keystone/
    +
    +
    +
  • +
+
+
+
+

Rolling back the control plane adoption

+
+

If you encountered a problem and are unable to complete the adoption of the Red Hat OpenStack Platform (RHOSP) control plane services, you can roll back the control plane adoption.

+
+
+ + + + + +
+ + +Do not attempt the rollback if you altered the data plane nodes in any way. +You can only roll back the control plane adoption if you altered the control plane. +
+
+
+

During the control plane adoption, services on the RHOSP control plane are stopped but not removed. The databases on the RHOSP control plane are not edited during the adoption procedure. The Red Hat OpenStack Services on OpenShift (RHOSO) control plane receives a copy of the original control plane databases. The rollback procedure assumes that the data plane has not yet been modified by the adoption procedure, and it is still connected to the RHOSP control plane.

+
+
+

The rollback procedure consists of the following steps:

+
+
+
    +
  • +

    Restoring the functionality of the RHOSP control plane.

    +
  • +
  • +

    Removing the partially or fully deployed RHOSO control plane.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    To restore the source cloud to a working state, start the RHOSP +control plane services that you previously stopped during the adoption +procedure:

    +
    +
    +
    ServicesToStart=("tripleo_horizon.service"
    +                 "tripleo_keystone.service"
    +                 "tripleo_barbican_api.service"
    +                 "tripleo_barbican_worker.service"
    +                 "tripleo_barbican_keystone_listener.service"
    +                 "tripleo_cinder_api.service"
    +                 "tripleo_cinder_api_cron.service"
    +                 "tripleo_cinder_scheduler.service"
    +                 "tripleo_cinder_volume.service"
    +                 "tripleo_cinder_backup.service"
    +                 "tripleo_glance_api.service"
    +                 "tripleo_manila_api.service"
    +                 "tripleo_manila_api_cron.service"
    +                 "tripleo_manila_scheduler.service"
    +                 "tripleo_neutron_api.service"
    +                 "tripleo_placement_api.service"
    +                 "tripleo_nova_api_cron.service"
    +                 "tripleo_nova_api.service"
    +                 "tripleo_nova_conductor.service"
    +                 "tripleo_nova_metadata.service"
    +                 "tripleo_nova_scheduler.service"
    +                 "tripleo_nova_vnc_proxy.service"
    +                 "tripleo_aodh_api.service"
    +                 "tripleo_aodh_api_cron.service"
    +                 "tripleo_aodh_evaluator.service"
    +                 "tripleo_aodh_listener.service"
    +                 "tripleo_aodh_notifier.service"
    +                 "tripleo_ceilometer_agent_central.service"
    +                 "tripleo_ceilometer_agent_compute.service"
    +                 "tripleo_ceilometer_agent_ipmi.service"
    +                 "tripleo_ceilometer_agent_notification.service"
    +                 "tripleo_ovn_cluster_north_db_server.service"
    +                 "tripleo_ovn_cluster_south_db_server.service"
    +                 "tripleo_ovn_cluster_northd.service")
    +
    +PacemakerResourcesToStart=("galera-bundle"
    +                           "haproxy-bundle"
    +                           "rabbitmq-bundle"
    +                           "openstack-cinder-volume"
    +                           "openstack-cinder-backup"
    +                           "openstack-manila-share")
    +
    +echo "Starting systemd OpenStack services"
    +for service in ${ServicesToStart[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then
    +                echo "Starting the $service in controller $i"
    +                ${!SSH_CMD} sudo systemctl start $service
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Checking systemd OpenStack services"
    +for service in ${ServicesToStart[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then
    +                if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=active >/dev/null; then
    +                    echo "ERROR: Service $service is not running on controller $i"
    +                else
    +                    echo "OK: Service $service is running in controller $i"
    +                fi
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Starting pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStart[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                echo "Starting $resource"
    +                ${!SSH_CMD} sudo pcs resource enable $resource
    +            else
    +                echo "Service $resource not present"
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +echo "Checking pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStart[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                if ${!SSH_CMD} sudo pcs resource status $resource | grep Started >/dev/null; then
    +                    echo "OK: Service $resource is started"
    +                else
    +                    echo "ERROR: Service $resource is stopped"
    +                fi
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +
    +
  2. +
  3. +

    If the Ceph NFS service is running on the deployment as a Shared File Systems service (manila) back end, you must restore the Pacemaker order and colocation constraints for the openstack-manila-share service:

    +
    +
    +
    $ sudo pcs constraint order start ceph-nfs then openstack-manila-share kind=Optional id=order-ceph-nfs-openstack-manila-share-Optional
    +$ sudo pcs constraint colocation add openstack-manila-share with ceph-nfs score=INFINITY id=colocation-openstack-manila-share-ceph-nfs-INFINITY
    +
    +
    +
  4. +
  5. +

    Verify that the source cloud is operational again. For example, run openstack CLI commands such as openstack server list, or check that you can access the Dashboard service (horizon).

    +
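    For example, a minimal hedged check, assuming admin credentials are available in a credentials file such as an overcloudrc (the path is an assumption):

    $ source ~/overcloudrc  # path is an assumption; use your own credentials file
    +$ openstack server list
    +$ openstack network agent list
    +$ openstack volume service list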
  6. +
  7. +

    Remove the partially or fully deployed control plane so that you can attempt the adoption again later:

    +
    +
    +
    $ oc delete --ignore-not-found=true --wait=false openstackcontrolplane/openstack
    +$ oc patch openstackcontrolplane openstack --type=merge --patch '
    +metadata:
    +  finalizers: []
    +' || true
    +
    +while oc get pod | grep rabbitmq-server-0; do
    +    sleep 2
    +done
    +while oc get pod | grep openstack-galera-0; do
    +    sleep 2
    +done
    +
    +$ oc delete --ignore-not-found=true --wait=false pod mariadb-copy-data
    +$ oc delete --ignore-not-found=true --wait=false pvc mariadb-data
    +$ oc delete --ignore-not-found=true --wait=false pod ovn-copy-data
    +$ oc delete --ignore-not-found=true secret osp-secret
    +
    +
    +
  8. +
+
+
+ + + + + +
+ + +After you restore the RHOSP control plane services, their internal +state might have changed. Before you retry the adoption procedure, verify that all the control plane resources are removed and that there are no leftovers which could affect the following adoption procedure attempt. You must not use previously created copies of the database contents in another adoption attempt. You must make a new copy of the latest state of the original source database contents. For more information about making new copies of the database, see Migrating databases to the control plane. +
+
+
+
+
+
+

Adopting the data plane

+
+
+

Adopting the Red Hat OpenStack Services on OpenShift (RHOSO) data plane involves the following steps:

+
+
+
    +
  1. +

    Stop any remaining services on the Red Hat OpenStack Platform (RHOSP) 17.1 control plane.

    +
  2. +
  3. +

    Deploy the required custom resources.

    +
  4. +
  5. +

    Perform a fast-forward upgrade on Compute services from RHOSP 17.1 to RHOSO 18.0.

    +
  6. +
  7. +

    If applicable, adopt Networker nodes to the RHOSO data plane.

    +
  8. +
+
+
+ + + + + +
+ + +After the RHOSO control plane manages the newly deployed data plane, you must not re-enable services on the RHOSP 17.1 control plane and data plane. If you re-enable services, workloads are managed by two control planes or two data planes, resulting in data corruption, loss of control of existing workloads, inability to start new workloads, or other issues. +
+
+
+

Stopping infrastructure management and Compute services

+
+

You must stop cloud Controller nodes, database nodes, and messaging nodes on the Red Hat OpenStack Platform 17.1 control plane. Do not stop nodes that are running the Compute, Storage, or Networker roles on the control plane.

+
+
+

The following procedure applies to a single node standalone director deployment. You must remove conflicting repositories and packages from your Compute hosts, so that you can install libvirt packages when these hosts are adopted as data plane nodes, where modular libvirt daemons are no longer running in podman containers.

+
+
+
Prerequisites
+
    +
  • +

    Define the shell variables. Replace the following example values with values that apply to your environment:

    +
    +
    +
    EDPM_PRIVATEKEY_PATH="<path_to_SSH_key>"
    +declare -A computes
    +computes=(
    +  ["standalone.localdomain"]="192.168.122.100"
    +  # ...
    +)
    +
    +
    +
    +
      +
    • +

      Replace ["standalone.localdomain"]="192.168.122.100" with the name and IP address of the Compute node.

      +
    • +
    • +

      Replace <path_to_SSH_key> with the path to your SSH key.

      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  • +

    Stop the Pacemaker-managed control plane services (Galera, HAProxy, and RabbitMQ) on the Controller nodes:

    +
    +
    +
    PacemakerResourcesToStop=(
    +                "galera-bundle"
    +                "haproxy-bundle"
    +                "rabbitmq-bundle")
    +
    +echo "Stopping pacemaker services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStop[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource; then
    +                ${!SSH_CMD} sudo pcs resource disable $resource
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +
    +
  • +
+
+
+
+

Adopting Compute services to the RHOSO data plane

+
+

Adopt your Compute (nova) services to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane.

+
+
+
Prerequisites
+
    +
  • +

    You have stopped the remaining control plane nodes, repositories, and packages on the Compute service (nova) hosts. For more information, see Stopping infrastructure management and Compute services.

    +
  • +
  • +

    You have configured the Ceph back end for the NovaLibvirt service. For more information, see Configuring a Ceph back end.

    +
  • +
  • +

    You have configured IP Address Management (IPAM):

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: network.openstack.org/v1beta1
    +kind: NetConfig
    +metadata:
    +  name: netconfig
    +spec:
    +  networks:
    +  - name: ctlplane
    +    dnsDomain: ctlplane.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 192.168.122.120
    +        start: 192.168.122.100
    +      - end: 192.168.122.200
    +        start: 192.168.122.150
    +      cidr: 192.168.122.0/24
    +      gateway: 192.168.122.1
    +  - name: internalapi
    +    dnsDomain: internalapi.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.17.0.250
    +        start: 172.17.0.100
    +      cidr: 172.17.0.0/24
    +      vlan: 20
    +  - name: External
    +    dnsDomain: external.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 10.0.0.250
    +        start: 10.0.0.100
    +      cidr: 10.0.0.0/24
    +      gateway: 10.0.0.1
    +  - name: storage
    +    dnsDomain: storage.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.18.0.250
    +        start: 172.18.0.100
    +      cidr: 172.18.0.0/24
    +      vlan: 21
    +  - name: storagemgmt
    +    dnsDomain: storagemgmt.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.20.0.250
    +        start: 172.20.0.100
    +      cidr: 172.20.0.0/24
    +      vlan: 23
    +  - name: tenant
    +    dnsDomain: tenant.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.19.0.250
    +        start: 172.19.0.100
    +      cidr: 172.19.0.0/24
    +      vlan: 22
    +EOF
    +
    +
    +
  • +
  • +

    If neutron-sriov-nic-agent is running on your Compute service nodes, ensure that the physical device mappings match the values that are defined in the OpenStackDataPlaneNodeSet custom resource (CR). For more information, see Pulling the configuration from a director deployment.

    +
  • +
  • +

    You have defined the shell variables to run the script that runs the fast-forward upgrade:

    +
    +
    +
    PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
    +CEPH_FSID=$(oc get secret ceph-conf-files -o json | jq -r '.data."ceph.conf"' | base64 -d | grep fsid | sed -e 's/fsid = //')
    +
    +alias openstack="oc exec -t openstackclient -- openstack"
    +declare -A computes
    +export computes=(
    +  ["standalone.localdomain"]="192.168.122.100"
    +  # ...
    +)
    +
    +
    +
    +
      +
    • +

      Replace ["standalone.localdomain"]="192.168.122.100" with the name and IP address of the Compute service node.

      +
      + + + + + +
      + + +Do not set a value for the CEPH_FSID parameter if the local storage back end is configured by the Compute service for libvirt. The storage back end must match the source cloud storage back end. You cannot change the storage back end during adoption. +
      +
      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create an SSH authentication secret for the data plane nodes:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: Secret
    +metadata:
    +    name: dataplane-adoption-secret
    +    namespace: openstack
    +data:
    +    ssh-privatekey: |
    +$(cat <path_to_SSH_key> | base64 | sed 's/^/        /')
    +EOF
    +
    +
    +
    +
      +
    • +

      Replace <path_to_SSH_key> with the path to your SSH key.

      +
    • +
    +
    +
  2. +
  3. +

    Generate an ssh key-pair nova-migration-ssh-key secret:

    +
    +
    +
    $ cd "$(mktemp -d)"
    +ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
    +oc get secret nova-migration-ssh-key || oc create secret generic nova-migration-ssh-key \
    +  -n openstack \
    +  --from-file=ssh-privatekey=id \
    +  --from-file=ssh-publickey=id.pub \
    +  --type kubernetes.io/ssh-auth
    +rm -f id*
    +cd -
    +
    +
    +
  4. +
  5. +

    If you use a local storage back end for libvirt, create a nova-compute-extra-config service to remove pre-fast-forward workarounds and configure Compute services to use a local storage back end:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: nova-extra-config
    +  namespace: openstack
    +data:
    +  19-nova-compute-cell1-workarounds.conf: |
    +    [workarounds]
    +    disable_compute_service_check_for_ffu=true
    +EOF
    +
    +
    +
    + + + + + +
    + + +The secret nova-cell<X>-compute-config auto-generates for each +cell<X>. You must specify values for the nova-cell<X>-compute-config and nova-migration-ssh-key parameters for each custom OpenStackDataPlaneService CR that is related to the Compute service. +
    +
    +
  6. +
  7. +

    If TLS Everywhere is enabled, append the following content to the OpenStackDataPlaneService CR:

    +
    +
    +
      tlsCerts:
    +    contents:
    +      - dnsnames
    +      - ips
    +    networks:
    +      - ctlplane
    +    issuer: osp-rootca-issuer-internal
    +  caCerts: combined-ca-bundle
    +  edpmServiceType: nova
    +
    +
    +
  8. +
  9. +

    If you use a Ceph back end for libvirt, create a nova-compute-extra-config service to remove pre-fast-forward upgrade workarounds and configure Compute services to use a Ceph back end:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: nova-extra-config
    +  namespace: openstack
    +data:
    +  19-nova-compute-cell1-workarounds.conf: |
    +    [workarounds]
    +    disable_compute_service_check_for_ffu=true
    +  03-ceph-nova.conf: |
    +    [libvirt]
    +    images_type=rbd
    +    images_rbd_pool=vms
    +    images_rbd_ceph_conf=/etc/ceph/ceph.conf
    +    images_rbd_glance_store_name=default_backend
    +    images_rbd_glance_copy_poll_interval=15
    +    images_rbd_glance_copy_timeout=600
    +    rbd_user=openstack
    +    rbd_secret_uuid=$CEPH_FSID
    +EOF
    +
    +
    +
    +

    The resources in the ConfigMap contain cell-specific configurations.

    +
    +
  10. +
  11. +

    Create a secret for the subscription manager and a secret for the Red Hat registry:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: Secret
    +metadata:
    +  name: subscription-manager
    +data:
    +  username: <base64_encoded_username>
    +  password: <base64_encoded_password>
    +---
    +apiVersion: v1
    +kind: Secret
    +metadata:
    +  name: redhat-registry
    +data:
    +  username: <registry_username>
    +  password: <registry_password>
    +EOF
    +
    +
    +
  12. +
  13. +

    Deploy the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneNodeSet
    +metadata:
    +  name: openstack
    +spec:
    +  tlsEnabled: false (1)
    +  networkAttachments:
    +      - ctlplane
    +  preProvisioned: true
    +  services:
    +    - bootstrap
    +    - download-cache
    +    - configure-network
    +    - validate-network
    +    - install-os
    +    - configure-os
    +    - ssh-known-hosts
    +    - run-os
    +    - reboot-os
    +    - install-certs
    +    - libvirt
    +    - nova
    +    - ovn
    +    - neutron-metadata
    +    - telemetry
    +  env:
    +    - name: ANSIBLE_CALLBACKS_ENABLED
    +      value: "profile_tasks"
    +    - name: ANSIBLE_FORCE_COLOR
    +      value: "True"
    +  nodes:
    +    standalone:
    +      hostName: standalone (2)
    +      ansible:
    +        ansibleHost: ${computes[standalone.localdomain]}
    +      networks:
    +      - defaultRoute: true
    +        fixedIP: ${computes[standalone.localdomain]}
    +        name: ctlplane
    +        subnetName: subnet1
    +      - name: internalapi
    +        subnetName: subnet1
    +      - name: storage
    +        subnetName: subnet1
    +      - name: tenant
    +        subnetName: subnet1
    +  nodeTemplate:
    +    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    +    ansible:
    +      ansibleUser: root
    +      ansibleVarsFrom:
    +      - prefix: subscription_manager_
    +        secretRef:
    +          name: subscription-manager
    +      - prefix: registry_
    +        secretRef:
    +          name: redhat-registry
    +      ansibleVars:
    +        edpm_bootstrap_release_version_package: []
    +        # edpm_network_config
    +        # Default nic config template for a EDPM node
    +        # These vars are edpm_network_config role vars
    +        edpm_network_config_template: |
    +           ---
    +           {% set mtu_list = [ctlplane_mtu] %}
    +           {% for network in nodeset_networks %}
    +           {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
    +           {%- endfor %}
    +           {% set min_viable_mtu = mtu_list | max %}
    +           network_config:
    +           - type: ovs_bridge
    +             name: {{ neutron_physical_bridge_name }}
    +             mtu: {{ min_viable_mtu }}
    +             use_dhcp: false
    +             dns_servers: {{ ctlplane_dns_nameservers }}
    +             domain: {{ dns_search_domains }}
    +             addresses:
    +             - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
    +             routes: {{ ctlplane_host_routes }}
    +             members:
    +             - type: interface
    +               name: nic1
    +               mtu: {{ min_viable_mtu }}
    +               # force the MAC address of the bridge to this interface
    +               primary: true
    +           {% for network in nodeset_networks %}
    +             - type: vlan
    +               mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
    +               vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
    +               addresses:
    +               - ip_netmask:
    +                   {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
    +               routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
    +           {% endfor %}
    +
    +        edpm_network_config_hide_sensitive_logs: false
    +        #
    +        # These vars are for the network config templates themselves and are
    +        # considered EDPM network defaults.
    +        neutron_physical_bridge_name: br-ctlplane
    +        neutron_public_interface_name: eth0
    +
    +        # edpm_nodes_validation
    +        edpm_nodes_validation_validate_controllers_icmp: false
    +        edpm_nodes_validation_validate_gateway_icmp: false
    +
    +        # edpm ovn-controller configuration
    +        edpm_ovn_bridge_mappings: <bridge_mappings> (3)
    +        edpm_ovn_bridge: br-int
    +        edpm_ovn_encap_type: geneve
    +        ovn_monitor_all: true
    +        edpm_ovn_remote_probe_interval: 60000
    +        edpm_ovn_ofctrl_wait_before_clear: 8000
    +
    +        timesync_ntp_servers:
    +        - hostname: clock.redhat.com
    +        - hostname: clock2.redhat.com
    +
    +        edpm_bootstrap_command: |
    +          subscription-manager register --username {{ subscription_manager_username }} --password {{ subscription_manager_password }}
    +          subscription-manager release --set=9.2
    +          subscription-manager repos --disable=*
    +          subscription-manager repos --enable=rhel-9-for-x86_64-baseos-eus-rpms --enable=rhel-9-for-x86_64-appstream-eus-rpms --enable=rhel-9-for-x86_64-highavailability-eus-rpms --enable=openstack-17.1-for-rhel-9-x86_64-rpms --enable=fast-datapath-for-rhel-9-x86_64-rpms --enable=openstack-dev-preview-for-rhel-9-x86_64-rpms
    +          # FIXME: perform dnf upgrade for other packages in EDPM ansible
    +          # here we are only ensuring that the decontainerized libvirt can start
    +          dnf -y upgrade openstack-selinux
    +          rm -f /run/virtlogd.pid
    +          podman login -u {{ registry_username }} -p {{ registry_password }} registry.redhat.io
    +
    +        gather_facts: false
    +        # edpm firewall, change the allowed CIDR if needed
    +        edpm_sshd_configure_firewall: true
    +        edpm_sshd_allowed_ranges: ['192.168.122.0/24']
    +
    +        # Do not attempt OVS major upgrades here
    +        edpm_ovs_packages:
    +        - openvswitch3.1
    +EOF
    +
    +
    +
    + + + + + + + + + + + + + +
    1If TLS Everywhere is enabled, change spec:tlsEnabled to true.
    2If your deployment has a custom DNS domain, modify spec:nodes:[NODE NAME]:hostName to use the FQDN of the node.
    3Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".
    +
    +
  14. +
  15. +

    Ensure that you use the same ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used in the Compute service nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

    +
    +
    +
    ovs-vsctl list Open .
    +...
    +external_ids        : {hostname=standalone.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
    +...
    +
    +
    +
    +
      +
    • +

      Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".

      +
    • +
    +
    +
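    To compare individual settings rather than reading the whole row, you can query single keys from external_ids on the Compute node, for example:

    ovs-vsctl get Open_vSwitch . external_ids:ovn-bridge-mappings
    ovs-vsctl get Open_vSwitch . external_ids:ovn-monitor-all
    ovs-vsctl get Open_vSwitch . external_ids:ovn-remote-probe-interval
    ovs-vsctl get Open_vSwitch . external_ids:ovn-ofctrl-wait-before-clear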
  16. +
  17. +

    If you use a Ceph back end for Block Storage service (cinder), prepare the adopted data plane workloads:

    +
    +
    +
    $ oc patch osdpns/openstack --type=merge --patch "
    +spec:
    +  services:
    +    - bootstrap
    +    - download-cache
    +    - configure-network
    +    - validate-network
    +    - install-os
    +    - configure-os
    +    - ssh-known-hosts
    +    - run-os
    +    - reboot-os
    +    - ceph-client
    +    - install-certs
    +    - ovn
    +    - neutron-metadata
    +    - libvirt
    +    - nova
    +    - telemetry
    +  nodeTemplate:
    +    extraMounts:
    +    - extraVolType: Ceph
    +      volumes:
    +      - name: ceph
    +        secret:
    +          secretName: ceph-conf-files
    +      mounts:
    +      - name: ceph
    +        mountPath: "/etc/ceph"
    +        readOnly: true
    +"
    +
    +
    +
    + + + + + +
    + + +Ensure that you use the same list of services from the original OpenStackDataPlaneNodeSet CR, except for the inserted ceph-client service. +
    +
    +
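    To confirm that the patched services list matches the original list plus the ceph-client service, you can inspect the node set, for example:

    $ oc get osdpns openstack -o jsonpath='{.spec.services}'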
  18. +
  19. +

    Optional: Enable neutron-sriov-nic-agent in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset openstack --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-sriov"
    +  }, {
    +    "op": "add",
    +    "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_physical_device_mappings",
    +    "value": "dummy_sriov_net:dummy-dev"
    +  }, {
    +    "op": "add",
    +    "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_bandwidths",
    +    "value": "dummy-dev:40000000:40000000"
    +  }, {
    +    "op": "add",
    +    "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_hypervisors",
    +    "value": "dummy-dev:standalone.localdomain"
    +  }
    +]'
    +
    +
    +
  20. +
  21. +

    Optional: Enable neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset openstack --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-dhcp"
    +  }]'
    +
    +
    +
    + + + + + +
    + + +
    +

    To use neutron-dhcp with OVN for the Bare Metal Provisioning service (ironic), you must set the disable_ovn_dhcp_for_baremetal_ports configuration option for the Networking service (neutron) to true. You can set this configuration in the NeutronAPI spec:

    +
    +
    +
    +
    ..
    +spec:
    +  serviceUser: neutron
    +   ...
    +      customServiceConfig: |
    +          [ovn]
    +          disable_ovn_dhcp_for_baremetal_ports = true
    +
    +
    +
    +
    +
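    For example, you can apply this option by patching the control plane CR. This is a minimal sketch that assumes the OpenStackControlPlane CR is named openstack and that you have no other Networking service customServiceConfig entries to preserve:

    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    spec:
      neutron:
        template:
          customServiceConfig: |
            [ovn]
            disable_ovn_dhcp_for_baremetal_ports = true
    '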
  22. +
  23. +

    Run the pre-adoption validation:

    +
    +
      +
    1. +

      Create the validation service:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneService
      +metadata:
      +  name: pre-adoption-validation
      +spec:
      +  playbook: osp.edpm.pre_adoption_validation
      +EOF
      +
      +
      +
    2. +
    3. +

      Create a OpenStackDataPlaneDeployment CR that runs only the validation:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneDeployment
      +metadata:
      +  name: openstack-pre-adoption
      +spec:
      +  nodeSets:
      +  - openstack
      +  servicesOverride:
      +  - pre-adoption-validation
      +EOF
      +
      +
      +
    4. +
    5. +

      When the validation is finished, confirm that the status of the Ansible EE pods is Completed:

      +
      +
      +
      $ watch oc get pod -l app=openstackansibleee
      +
      +
      +
      +
      +
      $ oc logs -l app=openstackansibleee -f --max-log-requests 20
      +
      +
      +
    6. +
    7. +

      Wait for the deployment to reach the Ready status:

      +
      +
      +
      $ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption --timeout=10m
      +
      +
      +
      + + + + + +
      + + +
      +

      If any openstack-pre-adoption validations fail, review the Ansible logs to determine which ones were unsuccessful, and then try the following troubleshooting options (a cross-check example follows the list):

      +
      +
      +
        +
      • +

        If the hostname validation failed, check that the hostname of the data plane node is correctly listed in the OpenStackDataPlaneNodeSet CR.

        +
      • +
      • +

        If the kernel argument check failed, ensure that the kernel argument configuration in the edpm_kernel_args and edpm_kernel_hugepages variables in the OpenStackDataPlaneNodeSet CR is the same as the kernel argument configuration that you used in the Red Hat OpenStack Platform (RHOSP) 17.1 node.

        +
      • +
      • +

        If the tuned profile check failed, ensure that the edpm_tuned_profile variable in the OpenStackDataPlaneNodeSet CR is configured to use the same profile as the one set on the RHOSP 17.1 node.

        +
      • +
      +
      +
      +
      +
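        To cross-check these values against the node set, you can dump the relevant Ansible variables from the CR (a quick check; the variables appear only if you set them):

        $ oc get openstackdataplanenodeset openstack -o yaml | grep -E 'edpm_kernel_args|edpm_kernel_hugepages|edpm_tuned_profile'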
    8. +
    +
    +
  24. +
  25. +

    Remove the remaining director services:

    +
    +
      +
    1. +

      Create an OpenStackDataPlaneService CR to clean up the data plane services you are adopting:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneService
      +metadata:
      +  name: tripleo-cleanup
      +spec:
      +  playbook: osp.edpm.tripleo_cleanup
      +EOF
      +
      +
      +
    2. +
    3. +

      Create the OpenStackDataPlaneDeployment CR to run the clean-up:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneDeployment
      +metadata:
      +  name: tripleo-cleanup
      +spec:
      +  nodeSets:
      +  - openstack
      +  servicesOverride:
      +  - tripleo-cleanup
      +EOF
      +
      +
      +
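      You can wait for the clean-up deployment to finish before you continue (a minimal check; adjust the timeout to your environment):

      $ oc wait --for condition=Ready openstackdataplanedeployment/tripleo-cleanup --timeout=10m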
    4. +
    +
    +
  26. +
  27. +

    When the clean-up is finished, deploy the OpenStackDataPlaneDeployment CR:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneDeployment
    +metadata:
    +  name: openstack
    +spec:
    +  nodeSets:
    +  - openstack
    +EOF
    +
    +
    +
    + + + + + +
+ + +If you have other node sets to deploy, such as Networker nodes, you can add them in the nodeSets list in this step, or create separate OpenStackDataPlaneDeployment CRs later. You cannot add new node sets to an OpenStackDataPlaneDeployment CR after deployment. +
    +
    +
  28. +
+
+
+
Verification
+
    +
  1. +

    Confirm that all the Ansible EE pods reach a Completed status:

    +
    +
    +
    $ watch oc get pod -l app=openstackansibleee
    +
    +
    +
    +
    +
    $ oc logs -l app=openstackansibleee -f --max-log-requests 20
    +
    +
    +
  2. +
  3. +

    Wait for the data plane node set to reach the Ready status:

    +
    +
    +
    $ oc wait --for condition=Ready osdpns/openstack --timeout=30m
    +
    +
    +
  4. +
  5. +

    Verify that the Networking service (neutron) agents are running:

    +
    +
    +
    $ oc exec openstackclient -- openstack network agent list
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| ID                                   | Agent Type                   | Host                   | Availability Zone | Alive | State | Binary                     |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent                   | standalone.localdomain | nova              | :-)   | UP    | neutron-dhcp-agent         |
    +| 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent           | standalone.localdomain |                   | :-)   | UP    | neutron-ovn-metadata-agent |
    +| a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller agent         | standalone.localdomain |                   | :-)   | UP    | ovn-controller             |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +
    +
    +
  6. +
+
+
+
Next steps
+ +
+
+
+

Performing a fast-forward upgrade on Compute services

+
+

You must upgrade the Compute services from Red Hat OpenStack Platform 17.1 to Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 on the control plane and data plane by completing the following tasks:

+
+
+
    +
  • +

    Update the cell1 Compute data plane services version.

    +
  • +
  • +

    Remove pre-fast-forward upgrade workarounds from the Compute control plane services and Compute data plane services.

    +
  • +
  • +

    Run Compute database online migrations to update live data.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Wait for cell1 Compute data plane services version to update:

    +
    +
    +
    $ oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot -p$PODIFIED_DB_ROOT_PASSWORD \
    +    -e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute';"
    +
    +
    +
    + + + + + +
    + + +
    +

    The query returns an empty result when the update is completed. No downtime is expected for virtual machine workloads.

    +
    +
    +

    Review any errors in the nova Compute agent logs on the data plane, and the nova-conductor journal records on the control plane.

    +
    +
    +
    +
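    For example, you can check for recent errors as follows. This is a sketch: the log path on the data plane node and the conductor pod name are typical for this kind of deployment and might differ in your environment.

    # On the adopted data plane node:
    sudo tail -n 100 /var/log/containers/nova/nova-compute.log | grep -iE 'error|traceback'
    # On the control plane:
    oc logs nova-cell1-conductor-0 | grep -iE 'error|traceback'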
  2. +
  3. +

    Patch the OpenStackControlPlane CR to remove the pre-fast-forward upgrade workarounds from the Compute control plane services:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  nova:
    +    template:
    +      cellTemplates:
    +        cell0:
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=false
    +        cell1:
    +          metadataServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=false
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=false
    +      apiServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=false
    +      metadataServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=false
    +      schedulerServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=false
    +'
    +
    +
    +
  4. +
  5. +

    Wait until the Compute control plane services CRs are ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s Nova/nova
    +
    +
    +
  6. +
  7. +

    Remove the pre-fast-forward upgrade workarounds from the Compute data plane services:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: nova-extra-config
    +  namespace: openstack
    +data:
    +  20-nova-compute-cell1-workarounds.conf: |
    +    [workarounds]
    +    disable_compute_service_check_for_ffu=false
    +---
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneDeployment
    +metadata:
    +  name: openstack-nova-compute-ffu
    +  namespace: openstack
    +spec:
    +  nodeSets:
    +    - openstack
    +  servicesOverride:
    +    - nova
    +EOF
    +
    +
    +
    + + + + + +
    + + +The service included in the servicesOverride key must match the name of the service that you included in the OpenStackDataPlaneNodeSet CR. For example, if you use a custom service called nova-custom, ensure that you add it to the servicesOverride key. +
    +
    +
  8. +
  9. +

    Wait for the Compute data plane services to be ready:

    +
    +
    +
    $ oc wait --for condition=Ready openstackdataplanedeployment/openstack-nova-compute-ffu --timeout=5m
    +
    +
    +
  10. +
  11. +

    Run Compute database online migrations to complete the fast-forward upgrade:

    +
    +
    +
    $ oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
    +$ oc exec -it nova-cell1-conductor-0 -- nova-manage db online_data_migrations
    +
    +
    +
  12. +
+
+
+
Verification
+
    +
  1. +

    Discover the Compute hosts in the cell:

    +
    +
    +
    $ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose
    +
    +
    +
  2. +
  3. +

    Verify if the Compute services can stop the existing test VM instance:

    +
    +
    +
    ${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && ${BASH_ALIASES[openstack]} server stop test || echo PASS
    +${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" || echo FAIL
    +${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo PASS
    +
    +
    +
  4. +
  5. +

    Verify if the Compute services can start the existing test VM instance:

    +
    +
    +
    ${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" && ${BASH_ALIASES[openstack]} server start test || echo PASS
    +${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && \
    +  ${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test --fit-width -f json | jq -r '.state' | grep running || echo FAIL
    +
    +
    +
  6. +
+
+
+ + + + + +
+ + +After the data plane adoption, the Compute hosts continue to run Red Hat Enterprise Linux (RHEL) 9.2. To take advantage of RHEL 9.4, perform a minor update procedure after finishing the adoption procedure. +
+
+
+
+

Adopting Networker services to the RHOSO data plane

+
+

Adopt the Networker nodes in your existing Red Hat OpenStack Platform deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane. You decide which services you want to run on the Networker nodes, and create a separate OpenStackDataPlaneNodeSet custom resource (CR) for the Networker nodes. You might also decide to implement the following options if they apply to your environment:

+
+
+
    +
  • +

    Depending on your topology, you might need to run the neutron-metadata service on the nodes, specifically when you want to serve metadata to SR-IOV ports that are hosted on Compute nodes.

    +
  • +
  • +

    If you want to continue running OVN gateway services on Networker nodes, keep the ovn service in the list of services to deploy.

    +
  • +
  • +

    Optional: You can run the neutron-dhcp service on your Networker nodes instead of your Compute nodes. You might not need to use neutron-dhcp with OVN, unless your deployment uses DHCP relays, or advanced DHCP options that are supported by dnsmasq but not by the OVN DHCP implementation.

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Define the shell variable. The following value is an example from a single node standalone director deployment:

    +
    +
    +
    declare -A networkers
    +networkers=(
    +  ["standalone.localdomain"]="192.168.122.100"
    +  # ...
    +)
    +
    +
    +
    +
      +
    • +

      Replace ["standalone.localdomain"]="192.168.122.100" with the name and IP address of the Networker node.

      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Deploy the OpenStackDataPlaneNodeSet CR for your Networker nodes:

    +
    + + + + + +
    + + +You can reuse most of the nodeTemplate section from the OpenStackDataPlaneNodeSet CR that is designated for your Compute nodes. You can omit some of the variables because of the limited set of services that are running on Networker nodes. +
    +
    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneNodeSet
    +metadata:
    +  name: openstack-networker
    +spec:
    +  tlsEnabled: false (1)
    +  networkAttachments:
    +      - ctlplane
    +  preProvisioned: true
    +  services:
    +    - bootstrap
    +    - download-cache
    +    - configure-network
    +    - validate-network
    +    - install-os
    +    - configure-os
    +    - ssh-known-hosts
    +    - run-os
    +    - install-certs
    +    - ovn
    +  env:
    +    - name: ANSIBLE_CALLBACKS_ENABLED
    +      value: "profile_tasks"
    +    - name: ANSIBLE_FORCE_COLOR
    +      value: "True"
    +  nodes:
    +    standalone:
    +      hostName: standalone
    +      ansible:
    +        ansibleHost: ${networkers[standalone.localdomain]}
    +      networks:
    +      - defaultRoute: true
    +        fixedIP: ${networkers[standalone.localdomain]}
    +        name: ctlplane
    +        subnetName: subnet1
    +      - name: internalapi
    +        subnetName: subnet1
    +      - name: storage
    +        subnetName: subnet1
    +      - name: tenant
    +        subnetName: subnet1
    +  nodeTemplate:
    +    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    +    ansible:
    +      ansibleUser: root
    +      ansibleVarsFrom:
    +      - prefix: subscription_manager_
    +        secretRef:
    +          name: subscription-manager
    +      - prefix: registry_
    +        secretRef:
    +          name: redhat-registry
    +      ansibleVars:
    +        edpm_bootstrap_release_version_package: []
    +        # edpm_network_config
    +        # Default nic config template for a EDPM node
    +        # These vars are edpm_network_config role vars
    +        edpm_network_config_template: |
    +           ---
    +           {% set mtu_list = [ctlplane_mtu] %}
    +           {% for network in nodeset_networks %}
    +           {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
    +           {%- endfor %}
    +           {% set min_viable_mtu = mtu_list | max %}
    +           network_config:
    +           - type: ovs_bridge
    +             name: {{ neutron_physical_bridge_name }}
    +             mtu: {{ min_viable_mtu }}
    +             use_dhcp: false
    +             dns_servers: {{ ctlplane_dns_nameservers }}
    +             domain: {{ dns_search_domains }}
    +             addresses:
    +             - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
    +             routes: {{ ctlplane_host_routes }}
    +             members:
    +             - type: interface
    +               name: nic1
    +               mtu: {{ min_viable_mtu }}
    +               # force the MAC address of the bridge to this interface
    +               primary: true
    +           {% for network in nodeset_networks %}
    +             - type: vlan
    +               mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
    +               vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
    +               addresses:
    +               - ip_netmask:
    +                   {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
    +               routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
    +           {% endfor %}
    +
    +        edpm_network_config_hide_sensitive_logs: false
    +        #
    +        # These vars are for the network config templates themselves and are
    +        # considered EDPM network defaults.
    +        neutron_physical_bridge_name: br-ctlplane
    +        neutron_public_interface_name: eth0
    +
    +        # edpm_nodes_validation
    +        edpm_nodes_validation_validate_controllers_icmp: false
    +        edpm_nodes_validation_validate_gateway_icmp: false
    +
    +        # edpm ovn-controller configuration
    +        edpm_ovn_bridge_mappings: <bridge_mappings> (2)
    +        edpm_ovn_bridge: br-int
    +        edpm_ovn_encap_type: geneve
    +        ovn_monitor_all: true
    +        edpm_ovn_remote_probe_interval: 60000
    +        edpm_ovn_ofctrl_wait_before_clear: 8000
    +
    +        # serve as a OVN gateway
    +        edpm_enable_chassis_gw: true (3)
    +
    +        timesync_ntp_servers:
    +        - hostname: clock.redhat.com
    +        - hostname: clock2.redhat.com
    +
    +        edpm_bootstrap_command: |
    +          subscription-manager register --username {{ subscription_manager_username }} --password {{ subscription_manager_password }}
    +          subscription-manager release --set=9.2
    +          subscription-manager repos --disable=*
    +          subscription-manager repos --enable=rhel-9-for-x86_64-baseos-eus-rpms --enable=rhel-9-for-x86_64-appstream-eus-rpms --enable=rhel-9-for-x86_64-highavailability-eus-rpms --enable=openstack-17.1-for-rhel-9-x86_64-rpms --enable=fast-datapath-for-rhel-9-x86_64-rpms --enable=openstack-dev-preview-for-rhel-9-x86_64-rpms
    +          podman login -u {{ registry_username }} -p {{ registry_password }} registry.redhat.io
    +
    +        gather_facts: false
    +        enable_debug: false
    +        # edpm firewall, change the allowed CIDR if needed
    +        edpm_sshd_configure_firewall: true
    +        edpm_sshd_allowed_ranges: ['192.168.122.0/24']
    +        # SELinux module
    +        edpm_selinux_mode: enforcing
    +
    +        # Do not attempt OVS major upgrades here
    +        edpm_ovs_packages:
    +        - openvswitch3.1
    +EOF
    +
    +
    +
    + + + + + + + + + + + + + +
    1If TLS Everywhere is enabled, change spec:tlsEnabled to true.
    2Set to the same values that you used in your Red Hat OpenStack Platform 17.1 deployment.
    3Set to true to run ovn-controller in gateway mode.
    +
    +
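    To confirm that the node set was created and to monitor its status while you continue, you can run:

    $ oc get osdpns openstack-networker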
  2. +
  3. +

    Ensure that you use the same ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used in the Networker nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

    +
    +
    +
    ovs-vsctl list Open .
    +...
    +external_ids        : {hostname=standalone.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
    +...
    +
    +
    +
    +
      +
    • +

      Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".

      +
    • +
    +
    +
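    As on the Compute nodes, you can query individual keys on the Networker node to compare against the CR values, including the gateway option:

    ovs-vsctl get Open_vSwitch . external_ids:ovn-bridge-mappings
    ovs-vsctl get Open_vSwitch . external_ids:ovn-cms-options
    ovs-vsctl get Open_vSwitch . external_ids:ovn-monitor-all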
  4. +
  5. +

    Optional: Enable neutron-metadata in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-metadata"
    +  }]'
    +
    +
    +
    +
      +
    • +

      Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.

      +
    • +
    +
    +
  6. +
  7. +

    Optional: Enable neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-dhcp"
    +  }]'
    +
    +
    +
  8. +
  9. +

    Run the pre-adoption-validation service for Networker nodes:

    +
    +
      +
    1. +

      Create a OpenStackDataPlaneDeployment CR that runs only the validation:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneDeployment
      +metadata:
      +  name: openstack-pre-adoption-networker
      +spec:
      +  nodeSets:
      +  - openstack-networker
      +  servicesOverride:
      +  - pre-adoption-validation
      +EOF
      +
      +
      +
    2. +
    3. +

      When the validation is finished, confirm that the status of the Ansible EE pods is Completed:

      +
      +
      +
      $ watch oc get pod -l app=openstackansibleee
      +
      +
      +
      +
      +
      $ oc logs -l app=openstackansibleee -f --max-log-requests 20
      +
      +
      +
    4. +
    5. +

      Wait for the deployment to reach the Ready status:

      +
      +
      +
      $ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption-networker --timeout=10m
      +
      +
      +
    6. +
    +
    +
  10. +
  11. +

    Deploy the OpenStackDataPlaneDeployment CR for Networker nodes:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneDeployment
    +metadata:
    +  name: openstack-networker
    +spec:
    +  nodeSets:
    +  - openstack-networker
    +EOF
    +
    +
    +
    + + + + + +
    + + +Alternatively, you can include the Networker node set in the nodeSets list before you deploy the main OpenStackDataPlaneDeployment CR. You cannot add new node sets to the OpenStackDataPlaneDeployment CR after deployment. +
    +
    +
  12. +
+
+
+
Verification
+
    +
  1. +

    Confirm that all the Ansible EE pods reach a Completed status:

    +
    +
    +
    $ watch oc get pod -l app=openstackansibleee
    +
    +
    +
    +
    +
    $ oc logs -l app=openstackansibleee -f --max-log-requests 20
    +
    +
    +
  2. +
  3. +

    Wait for the data plane node set to reach the Ready status:

    +
    +
    +
    $ oc wait --for condition=Ready osdpns/<networker_CR_name> --timeout=30m
    +
    +
    +
    +
      +
    • +

      Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.

      +
    • +
    +
    +
  4. +
  5. +

    Verify that the Networking service (neutron) agents are running. The list of agents varies depending on the services you enabled:

    +
    +
    +
    $ oc exec openstackclient -- openstack network agent list
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| ID                                   | Agent Type                   | Host                   | Availability Zone | Alive | State | Binary                     |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent                   | standalone.localdomain | nova              | :-)   | UP    | neutron-dhcp-agent         |
    +| 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent           | standalone.localdomain |                   | :-)   | UP    | neutron-ovn-metadata-agent |
    +| a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller Gateway agent | standalone.localdomain |                   | :-)   | UP    | ovn-controller             |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +
    +
    +
  6. +
+
+
+
+
+
+

Migrating the Object Storage service (swift) to Red Hat OpenStack Services on OpenShift (RHOSO) nodes

+
+
+

This section only applies if you are using Red Hat OpenStack Platform Object Storage service (swift) as the Object Storage service. If you are using the Object Storage API of Ceph Object Gateway (RGW), you can skip this section.

+
+
+

Data migration to the new deployment might be a long running process that runs mostly in the background. The Object Storage service replicators will take care of moving data from old to new nodes, but depending on the amount of used storage this might take a very long time. You can still use the old nodes as long as they are running and continue with adopting other services in the meantime, reducing the amount of downtime. Note that performance might decrease due to the replication traffic in the network.

+
+
+

Migration of the data happens replica by replica. Assuming you start with 3 replicas, only 1 of them is moved at any time, ensuring that the remaining 2 replicas are still available and the Object Storage service is usable during the migration.

+
+
+

Migrating the Object Storage service (swift) data from RHOSP to Red Hat OpenStack Services on OpenShift (RHOSO) nodes

+
+

To ensure availability during the Object Storage service (swift) migration, you perform the following steps:

+
+
+
    +
  1. +

    Add new nodes to the Object Storage service rings

    +
  2. +
  3. +

    Set weights of existing nodes to 0

    +
  4. +
  5. +

    Rebalance rings, moving one replica

    +
  6. +
  7. +

    Copy rings to old nodes and restart services

    +
  8. +
  9. +

    Check replication status and repeat previous two steps until old nodes are drained

    +
  10. +
  11. +

    Remove the old nodes from the rings

    +
  12. +
+
+
+
Prerequisites
+
    +
  • +

    Previous Object Storage service adoption steps are completed.

    +
  • +
  • +

    No new environment variables need to be defined, though you use the CONTROLLER1_SSH alias that was defined in a previous step.

    +
  • +
  • +

    For DNS servers, all existing nodes must be able to resolve host names of the Red Hat OpenShift Container Platform (RHOCP) pods, for example by using the external IP of the DNSMasq service as the name server in /etc/resolv.conf:

    +
    +
    +
    echo "nameserver $(oc get service dnsmasq-dns -o jsonpath="{.status.loadBalancer.ingress[0].ip}")" | CONTROLLER1_SSH tee /etc/resolv.conf
    +
    +
    +
  • +
  • +

    To track the current status of the replication, a tool called swift-dispersion is used. It consists of two parts: a population tool to run before changing the Object Storage service rings, and a report tool to run afterwards to gather the current status. Run the swift-dispersion-populate command:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-populate'
    +
    +
    +
    +

    The command might need a few minutes to complete. It creates 0-byte objects distributed across the Object Storage service deployment, and its counter-part swift-dispersion-report can be used afterwards to show the current replication status.

    +
    +
    +

    The output of the swift-dispersion-report command should look like the following:

    +
    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-report'
    +
    +
    +
    +
    +
    Queried 1024 containers for dispersion reporting, 5s, 0 retries
    +100.00% of container copies found (3072 of 3072)
    +Sample represents 100.00% of the container partition space
    +Queried 1024 objects for dispersion reporting, 4s, 0 retries
    +There were 1024 partitions missing 0 copies.
    +100.00% of object copies found (3072 of 3072)
    +Sample represents 100.00% of the object partition space
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Add new nodes by scaling up the SwiftStorage resource from 0 to 3. In that case, 3 storage instances using PVCs are created, running on the Red Hat OpenShift Container Platform (RHOCP) cluster.

    +
    +
    +
    oc patch openstackcontrolplane openstack --type=merge -p='{"spec":{"swift":{"template":{"swiftStorage":{"replicas": 3}}}}}'
    +
    +
    +
  2. +
  3. +

    Wait until all three pods are running:

    +
    +
    +
    oc wait pods --for condition=Ready -l component=swift-storage
    +
    +
    +
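    Optionally, confirm that the old nodes can resolve the new pod host names, because replication relies on it. A quick check that uses the CONTROLLER1_SSH alias:

    CONTROLLER1_SSH "getent hosts swift-storage-0.swift-storage.openstack.svc"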
  4. +
  5. +

    Drain the existing nodes. Get the storage management IP addresses of the nodes to drain from the current rings:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder' | tail -n +7 | awk '{print $4}' | sort -u
    +
    +
    +
    +

    The output will look similar to the following:

    +
    +
    +
    +
    172.20.0.100:6200
    +swift-storage-0.swift-storage.openstack.svc:6200
    +swift-storage-1.swift-storage.openstack.svc:6200
    +swift-storage-2.swift-storage.openstack.svc:6200
    +
    +
    +
    +

    In this case, the old node 172.20.0.100 is drained. Your nodes might be different, and depending on the deployment there are likely more nodes to be included in the following commands.

    +
    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
    +swift-ring-tool get
    +swift-ring-tool drain 172.20.0.100
    +swift-ring-tool rebalance
    +swift-ring-tool push'
    +
    +
    +
  6. +
  7. +

    Copy the updated rings to the original nodes and apply them. Run the following commands over SSH for each existing node that stores Object Storage service data.

    +
    +
    +
    oc extract --confirm cm/swift-ring-files
    +CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
    +CONTROLLER1_SSH "systemctl restart tripleo_swift_*"
    +
    +
    +
  8. +
  9. +

    Track the replication progress by using the swift-dispersion-report tool:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c "swift-ring-tool get && swift-dispersion-report"
    +
    +
    +
    +

    The output shows less than 100% of copies found. Repeat the above command until all container and object copies are found:

    +
    +
    +
    +
    Queried 1024 containers for dispersion reporting, 6s, 0 retries
    +There were 5 partitions missing 1 copy.
    +99.84% of container copies found (3067 of 3072)
    +Sample represents 100.00% of the container partition space
    +Queried 1024 objects for dispersion reporting, 7s, 0 retries
    +There were 739 partitions missing 1 copy.
    +There were 285 partitions missing 0 copies.
    +75.94% of object copies found (2333 of 3072)
    +Sample represents 100.00% of the object partition space
    +
    +
    +
  10. +
  11. +

    Move the next replica to the new nodes. To do so, rebalance and distribute the rings again:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
    +swift-ring-tool get
    +swift-ring-tool rebalance
    +swift-ring-tool push'
    +
    +oc extract --confirm cm/swift-ring-files
    +CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
    +CONTROLLER1_SSH "systemctl restart tripleo_swift_*"
    +
    +
    +
    +

    Monitor the swift-dispersion-report output again, wait until all copies are found, and repeat this step until all your replicas are moved to the new nodes.

    +
    +
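    If you prefer to poll instead of re-running the report manually, the following rough sketch waits until all object copies are found again; adjust the sleep interval to your environment:

    until oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-report' | grep -q '100.00% of object copies found'; do
        sleep 300
    done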
  12. +
  13. +

    After the nodes are drained, remove the nodes from the rings:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
    +swift-ring-tool get
    +swift-ring-tool remove 172.20.0.100
    +swift-ring-tool rebalance
    +swift-ring-tool push'
    +
    +
    +
  14. +
+
+
+
Verification
+
    +
  • +

    Even if all replicas are already on the new nodes and the swift-dispersion-report command reports 100% of the copies found, there might still be data on old nodes. This data is removed by the replicators, but it might take some more time.

    +
    +

    You can check the disk usage of all disks in the cluster:

    +
    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -d'
    +
    +
    +
  • +
  • +

    Confirm that there are no more *.db or *.data files in the directory /srv/node on these nodes:

    +
    +
    +
    CONTROLLER1_SSH "find /srv/node/ -type f -name '*.db' -o -name '*.data' | wc -l"
    +
    +
    +
  • +
+
+
+
+

Troubleshooting the Object Storage service (swift) migration

+
+

You can troubleshoot issues with the Object Storage service (swift) migration.

+
+
+
    +
  • +

    The following command might be helpful for debugging if replication is not working and the swift-dispersion-report output does not return to 100% of copies found.

    +
    +
    +
    CONTROLLER1_SSH tail /var/log/containers/swift/swift.log | grep object-server
    +
    +
    +
    +

    This should show the replicator progress, for example:

    +
    +
    +
    +
    Mar 14 06:05:30 standalone object-server[652216]: <f+++++++++ 4e2/9cbea55c47e243994b0b10d8957184e2/1710395823.58025.data
    +Mar 14 06:05:30 standalone object-server[652216]: Successful rsync of /srv/node/vdd/objects/626/4e2 to swift-storage-1.swift-storage.openstack.svc::object/d1/objects/626 (0.094)
    +Mar 14 06:05:30 standalone object-server[652216]: Removing partition: /srv/node/vdd/objects/626
    +Mar 14 06:05:31 standalone object-server[652216]: <f+++++++++ 85f/cf53b5a048e5b19049e05a548cde185f/1710395796.70868.data
    +Mar 14 06:05:31 standalone object-server[652216]: Successful rsync of /srv/node/vdb/objects/829/85f to swift-storage-2.swift-storage.openstack.svc::object/d1/objects/829 (0.095)
    +Mar 14 06:05:31 standalone object-server[652216]: Removing partition: /srv/node/vdb/objects/829
    +
    +
    +
  • +
  • +

    You can also check the ring consistency and replicator status:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -r --md5'
    +
    +
    +
    +

    Note that the output might show an md5 mismatch until approximately 2 minutes after pushing new rings. Eventually it looks similar to the following example:

    +
    +
    +
    +
    [...]
    +Oldest completion was 2024-03-14 16:53:27 (3 minutes ago) by 172.20.0.100:6000.
    +Most recent completion was 2024-03-14 16:56:38 (12 seconds ago) by swift-storage-0.swift-storage.openstack.svc:6200.
    +===============================================================================
    +[2024-03-14 16:56:50] Checking ring md5sums
    +4/4 hosts matched, 0 error[s] while checking hosts.
    +[...]
    +
    +
    +
  • +
+
+
+
+
+
+

Migrating the Red Hat Ceph Storage Cluster

+
+
+

In the context of data plane adoption, where the Red Hat OpenStack Platform (RHOSP) services are redeployed in Red Hat OpenShift Container Platform (RHOCP), you migrate a director-deployed Red Hat Ceph Storage cluster by using a process called “externalizing” the Red Hat Ceph Storage cluster.

+
+
+

There are two deployment topologies that include an internal Red Hat Ceph Storage cluster:

+
+
+
    +
  • +

    RHOSP includes dedicated Red Hat Ceph Storage nodes to host object storage daemons (OSDs)

    +
  • +
  • +

    Hyperconverged Infrastructure (HCI), where Compute and Storage services are colocated on hyperconverged nodes

    +
  • +
+
+
+

In either scenario, there are some Red Hat Ceph Storage processes that are deployed on RHOSP Controller nodes: Red Hat Ceph Storage monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha. To migrate your Red Hat Ceph Storage cluster, you must decommission the Controller nodes and move the Red Hat Ceph Storage daemons to a set of target nodes that are already part of the Red Hat Ceph Storage cluster.

+
+
+

Red Hat Ceph Storage daemon cardinality

+
+

Red Hat Ceph Storage 6 and later applies strict constraints in the way daemons can be colocated within the same node. For more information, see the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations. Your topology depends on the available hardware and the number of Red Hat Ceph Storage services on the Controller nodes that you retire. The number of services that you can migrate depends on the number of available nodes in the cluster. The following diagrams show the distribution of Red Hat Ceph Storage daemons on Red Hat Ceph Storage nodes, where at least 3 nodes are required.

+
+
+
    +
  • +

    The following scenario includes only RGW and RBD, without the Red Hat Ceph Storage dashboard:

    +
    +
    +
    |    |                     |             |
    +|----|---------------------|-------------|
    +| osd | mon/mgr/crash      | rgw/ingress |
    +| osd | mon/mgr/crash      | rgw/ingress |
    +| osd | mon/mgr/crash      | rgw/ingress |
    +
    +
    +
  • +
  • +

    With the Red Hat Ceph Storage dashboard, but without Shared File Systems service (manila), at least 4 nodes are required. The Red Hat Ceph Storage dashboard has no failover:

    +
    +
    +
    |     |                     |             |
    +|-----|---------------------|-------------|
    +| osd | mon/mgr/crash | rgw/ingress       |
    +| osd | mon/mgr/crash | rgw/ingress       |
    +| osd | mon/mgr/crash | dashboard/grafana |
    +| osd | rgw/ingress   | (free)            |
    +
    +
    +
  • +
  • +

    With the Red Hat Ceph Storage dashboard and the Shared File Systems service, a minimum of 5 nodes are required, and the Red Hat Ceph Storage dashboard has no failover:

    +
    +
    +
    |     |                     |                         |
    +|-----|---------------------|-------------------------|
    +| osd | mon/mgr/crash       | rgw/ingress             |
    +| osd | mon/mgr/crash       | rgw/ingress             |
    +| osd | mon/mgr/crash       | mds/ganesha/ingress     |
    +| osd | rgw/ingress         | mds/ganesha/ingress     |
    +| osd | mds/ganesha/ingress | dashboard/grafana       |
    +
    +
    +
  • +
+
+
+
+

Migrating the monitoring stack component to new nodes within an existing Red Hat Ceph Storage cluster

+
+

The Red Hat Ceph Storage Dashboard module adds web-based monitoring and administration to the Ceph Manager. With director-deployed Red Hat Ceph Storage, the Red Hat Ceph Storage Dashboard is enabled as part of the overcloud deployment and is composed of the following components:

+
+
+
    +
  • +

    Ceph Manager module

    +
  • +
  • +

    Grafana

    +
  • +
  • +

    Prometheus

    +
  • +
  • +

    Alertmanager

    +
  • +
  • +

    Node exporter

    +
  • +
+
+
+

The Red Hat Ceph Storage Dashboard containers are included through tripleo-container-image-prepare parameters, and high availability (HA) relies on HAProxy and Pacemaker being deployed in the Red Hat OpenStack Platform (RHOSP) environment. For an external Red Hat Ceph Storage cluster, HA is not supported.

+
+
+

In this procedure, you migrate and relocate the Ceph Monitoring components to free Controller nodes.

+
+
+
Prerequisites
+
    +
  • +

    You have an RHOSP 17.1 environment.

    +
  • +
  • +

    You have a Red Hat Ceph Storage 7 deployment that is managed by director.

    +
  • +
  • +

    Your Red Hat Ceph Storage 7 deployment is managed by cephadm.

    +
  • +
  • +

    Both the Red Hat Ceph Storage public and cluster networks are propagated through director to the target nodes.

    +
  • +
+
+
+

Completing prerequisites for a Red Hat Ceph Storage cluster with monitoring stack components

+
+

Complete the following prerequisites before you migrate a Red Hat Ceph Storage cluster with monitoring stack components.

+
+
+
Procedure
+
    +
  1. +

    Gather the current status of the monitoring stack. Verify that the hosts have no monitoring label (or grafana, prometheus, or alertmanager labels, in the case of a per-daemon placement evaluation):

    +
    + + + + + +
    + + +The entire relocation process is driven by cephadm and relies on labels to be assigned to the target nodes, where the daemons are scheduled. For more information about assigning labels to nodes, review the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations. +
    +
    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls
    +
    +HOST                    	ADDR       	LABELS                 	STATUS
    +cephstorage-0.redhat.local  192.168.24.11  osd mds
    +cephstorage-1.redhat.local  192.168.24.12  osd mds
    +cephstorage-2.redhat.local  192.168.24.47  osd mds
    +controller-0.redhat.local   192.168.24.35  _admin mon mgr
    +controller-1.redhat.local   192.168.24.53  mon _admin mgr
    +controller-2.redhat.local   192.168.24.10  mon _admin mgr
    +6 hosts in cluster
    +
    +
    +
    +

    Confirm that the cluster is healthy and that both ceph orch ls and ceph orch ps return the expected number of deployed daemons.

    +
    +
  2. +
  3. +

    Review and update the container image registry:

    +
    + + + + + +
    + + +If you run the Red Hat Ceph Storage externalization procedure after you migrate the Red Hat OpenStack Platform control plane, update the container images in the Red Hat Ceph Storage cluster configuration. The current container images point to the undercloud registry, which might not be available anymore. Because the undercloud is not available after adoption is complete, replace the undercloud-provided images with an alternative registry. +
    +
    +
    +
    +
    $ ceph config dump
    +...
    +...
    +mgr   advanced  mgr/cephadm/container_image_alertmanager    undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus-alertmanager:v4.10
    +mgr   advanced  mgr/cephadm/container_image_base            undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph
    +mgr   advanced  mgr/cephadm/container_image_grafana         undercloud-0.ctlplane.redhat.local:8787/rh-osbs/grafana:latest
    +mgr   advanced  mgr/cephadm/container_image_node_exporter   undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus-node-exporter:v4.10
    +mgr   advanced  mgr/cephadm/container_image_prometheus      undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus:v4.10
    +
    +
    +
  4. +
  5. +

    Remove the undercloud container images:

    +
    +
    +
    $ cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_base
    +for i in prometheus grafana alertmanager node_exporter; do
    +    cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_$i
    +done
    +
    +
    +
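    If the monitoring containers must come from a reachable registry rather than the cephadm defaults, you can set the image options explicitly. This is a sketch; the image references are placeholders that you must replace with the ones that apply to your environment:

    $ cephadm shell -- ceph config set mgr mgr/cephadm/container_image_prometheus <registry>/<prometheus_image>:<tag>
    $ cephadm shell -- ceph config set mgr mgr/cephadm/container_image_grafana <registry>/<grafana_image>:<tag>
    $ cephadm shell -- ceph config set mgr mgr/cephadm/container_image_alertmanager <registry>/<alertmanager_image>:<tag>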
  6. +
+
+
+
+

Migrating the monitoring stack to the target nodes

+
+

To migrate the monitoring stack to the target nodes, you add the monitoring label to your existing nodes and update the configuration of each daemon. You do not need to migrate node exporters. These daemons are deployed across the nodes that are part of the Red Hat Ceph Storage cluster (the placement is ‘*’).

+
+
+
Prerequisites
+
    +
  • +

    Confirm that the firewall rules are in place and the ports are open for a given monitoring stack service.

    +
  • +
+
+
+ + + + + +
+ + +Depending on the target nodes and the number of deployed or active daemons, you can either relocate the existing containers to the target nodes, or select a subset of nodes that host the monitoring stack daemons. High availability (HA) is not supported. Reducing the placement with count: 1 allows you to migrate the existing daemons in a Hyperconverged Infrastructure, or hardware-limited, scenario without impacting other services. +
+
+
+
Migrating the existing daemons to the target nodes
+
+

The following procedure is an example of an environment with 3 Red Hat Ceph Storage nodes or ComputeHCI nodes. This scenario extends the monitoring labels to all the Red Hat Ceph Storage or ComputeHCI nodes that are part of the cluster. This means that you keep 3 placements for the target nodes.

+
+
+
Procedure
+
    +
  1. +

    Add the monitoring label to all the Red Hat Ceph Storage or ComputeHCI nodes in the cluster:

    +
    +
    +
    for item in $(sudo cephadm shell --  ceph orch host ls --format json | jq -r '.[].hostname'); do
    +    sudo cephadm shell -- ceph orch host label add  $item monitoring;
    +done
    +
    +
    +
  2. +
  3. +

    Verify that all the hosts on the target nodes have the monitoring label:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls
    +
    +HOST                        ADDR           LABELS
    +cephstorage-0.redhat.local  192.168.24.11  osd monitoring
    +cephstorage-1.redhat.local  192.168.24.12  osd monitoring
    +cephstorage-2.redhat.local  192.168.24.47  osd monitoring
    +controller-0.redhat.local   192.168.24.35  _admin mon mgr monitoring
    +controller-1.redhat.local   192.168.24.53  mon _admin mgr monitoring
    +controller-2.redhat.local   192.168.24.10  mon _admin mgr monitoring
    +
    +
    +
  4. +
  5. +

    Remove the labels from the Controller nodes:

    +
    +
    +
    $ for i in 0 1 2; do ceph orch host label rm "controller-$i.redhat.local" monitoring; done
    +
    +Removed label monitoring from host controller-0.redhat.local
    +Removed label monitoring from host controller-1.redhat.local
    +Removed label monitoring from host controller-2.redhat.local
    +
    +
    +
  6. +
  7. +

    Dump the current monitoring stack spec:

    +
    +
    +
    function export_spec {
    +    local component="$1"
    +    local target_dir="$2"
    +    sudo cephadm shell -- ceph orch ls --export "$component" > "$target_dir/$component"
    +}
    +
    +SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
    +for m in grafana prometheus alertmanager; do
    +    export_spec "$m" "$SPEC_DIR"
    +done
    +
    +
    +
  8. +
  9. +

    For each daemon, edit the current spec and replace the placement:hosts: section with the placement:label: section, for example:

    +
    +
    +
    service_type: grafana
    +service_name: grafana
    +placement:
    +  label: monitoring
    +networks:
    +- 172.17.3.0/24
    +spec:
    +  port: 3100
    +
    +
    +
    +

    This step also applies to Prometheus and Alertmanager specs.

    +
    +
  10. +
  11. +

    Apply the new monitoring spec to relocate the monitoring stack daemons:

    +
    +
    +
    SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
    +function migrate_daemon {
    +    local component="$1"
    +    local target_dir="$2"
    +    sudo cephadm shell -m "$target_dir" -- ceph orch apply -i /mnt/ceph_specs/$component
    +}
    +for m in grafana prometheus alertmanager; do
    +    migrate_daemon  "$m" "$SPEC_DIR"
    +done
    +
    +
    +
  12. +
  13. +

    Verify that the daemons are deployed on the expected nodes:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
    +alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
    +grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
    +prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
    +
    +
    +
    + + + + + +
    + + +After you migrate the monitoring stack, you lose high availability. The monitoring stack daemons no longer have a Virtual IP address and HAProxy. Node exporters are still running on all the nodes. +
    +
    +
  14. +
  15. +

    Review the Red Hat Ceph Storage configuration to ensure that it aligns with the configuration on the target nodes. In particular, focus on the following configuration entries:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump
    +...
    +mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST  http://172.17.3.83:9093
    +mgr  advanced  mgr/dashboard/GRAFANA_API_URL        https://172.17.3.144:3100
    +mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST    http://172.17.3.83:9092
    +mgr  advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
    +mgr  advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
    +mgr  advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138
    +
    +
    +
  16. +
  17. +

    Verify that the API_HOST/URL of the grafana, alertmanager and prometheus services points to the IP addresses on the storage network of the node where each daemon is relocated:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
    +alertmanager.cephstorage-0  cephstorage-0.redhat.local  172.17.3.83:9093,9094
    +alertmanager.cephstorage-1  cephstorage-1.redhat.local  172.17.3.53:9093,9094
    +alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
    +grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
    +grafana.cephstorage-1       cephstorage-1.redhat.local  172.17.3.53:3100
    +grafana.cephstorage-2       cephstorage-2.redhat.local  172.17.3.144:3100
    +prometheus.cephstorage-0    cephstorage-0.redhat.local  172.17.3.83:9092
    +prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
    +prometheus.cephstorage-2    cephstorage-2.redhat.local  172.17.3.144:9092
    +
    +
    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump
    +...
    +...
    +mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST   http://172.17.3.83:9093
    +mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST     http://172.17.3.83:9092
    +mgr  advanced  mgr/dashboard/GRAFANA_API_URL         https://172.17.3.144:3100
    +
    +
    +
    + + + + + +
    + + +The Ceph Dashboard, as the service provided by the Ceph mgr, is not impacted by the relocation. You might experience an impact when the active mgr daemon is migrated or is force-failed. However, you can define 3 replicas in the Ceph Manager configuration to redirect requests to a different instance. +
    +
    +
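    If any of these entries still reference the old Controller addresses after the relocation, you can update them through the Ceph dashboard module. The following is a hedged sketch; the addresses are placeholders taken from the example output above and must match your environment:

    # Repoint the dashboard integrations at the relocated monitoring daemons.
    sudo cephadm shell -- ceph dashboard set-alertmanager-api-host http://172.17.3.83:9093
    sudo cephadm shell -- ceph dashboard set-prometheus-api-host http://172.17.3.83:9092
    sudo cephadm shell -- ceph dashboard set-grafana-api-url https://172.17.3.144:3100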
  18. +
+
+
+
+
+
+

Migrating Red Hat Ceph Storage MDS to new nodes within the existing cluster

+
+

You can migrate the MDS daemon when the Shared File Systems service (manila), deployed with either a cephfs-native or ceph-nfs back end, is part of the overcloud deployment. The MDS migration is performed by cephadm, and you move the daemon placement from a hosts-based approach to a label-based approach. This ensures that you can visualize the status of the cluster and where daemons are placed by using the ceph orch host command. You can also have a general view of how the daemons are co-located within a given host, as described in the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations.

+
+
+
Prerequisites
+
    +
  • +

    An RHOSP 17.1 environment and a Red Hat Ceph Storage 7 deployment that is managed by director.

    +
  • +
  • +

    Red Hat Ceph Storage is upgraded to Red Hat Ceph Storage 7 and is managed by cephadm.

    +
  • +
  • +

    Both the Red Hat Ceph Storage public and cluster networks are propagated through director to the target nodes.

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Verify that the Red Hat Ceph Storage cluster is healthy and check the MDS status:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph fs ls
    +name: cephfs, metadata pool: manila_metadata, data pools: [manila_data ]
    +
    +[ceph: root@controller-0 /]# ceph mds stat
    +cephfs:1 {0=mds.controller-2.oebubl=up:active} 2 up:standby
    +
    +[ceph: root@controller-0 /]# ceph fs status cephfs
    +
    +cephfs - 0 clients
    +======
    +RANK  STATE         	MDS           	ACTIVITY 	DNS	INOS   DIRS   CAPS
    + 0	active  mds.controller-2.oebubl  Reqs:	0 /s   696	196	173  	0
    +  	POOL     	TYPE 	USED  AVAIL
    +manila_metadata  metadata   152M   141G
    +  manila_data  	data	3072M   141G
    +  	STANDBY MDS
    +mds.controller-0.anwiwd
    +mds.controller-1.cwzhog
    +MDS version: ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)
    +
    +
    +
  • +
  • +

    Retrieve more detailed information on the Ceph File System (CephFS) MDS status:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph fs dump
    +
    +e8
    +enable_multiple, ever_enabled_multiple: 1,1
    +default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
    +legacy client fscid: 1
    +
    +Filesystem 'cephfs' (1)
    +fs_name cephfs
    +epoch   5
    +flags   12 joinable allow_snaps allow_multimds_snaps
    +created 2024-01-18T19:04:01.633820+0000
    +modified    	2024-01-18T19:04:05.393046+0000
    +tableserver 	0
    +root	0
    +session_timeout 60
    +session_autoclose   	300
    +max_file_size   1099511627776
    +required_client_features    	{}
    +last_failure	0
    +last_failure_osd_epoch  0
    +compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
    +max_mds 1
    +in  	0
    +up  	{0=24553}
    +failed
    +damaged
    +stopped
    +data_pools  	[7]
    +metadata_pool   9
    +inline_data 	disabled
    +balancer
    +standby_count_wanted	1
    +[mds.mds.controller-2.oebubl{0:24553} state up:active seq 2 addr [v2:172.17.3.114:6800/680266012,v1:172.17.3.114:6801/680266012] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +Standby daemons:
    +
    +[mds.mds.controller-0.anwiwd{-1:14715} state up:standby seq 1 addr [v2:172.17.3.20:6802/3969145800,v1:172.17.3.20:6803/3969145800] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.controller-1.cwzhog{-1:24566} state up:standby seq 1 addr [v2:172.17.3.43:6800/2227381308,v1:172.17.3.43:6801/2227381308] compat {c=[1],r=[1],i=[7ff]}]
    +dumped fsmap epoch 8
    +
    +
    +
  • +
  • +

    Check the OSD blocklist and clean up the client list:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph osd blocklist ls
    +..
    +..
    +for item in $(ceph osd blocklist ls | awk '{print $1}'); do
    +     ceph osd blocklist rm $item;
    +done
    +
    +
    +
    + + + + + +
    + + +
    +

    When a file system client is unresponsive or misbehaving, the access to the file system might be forcibly terminated. This process is called eviction. Evicting a CephFS client prevents it from communicating further with MDS daemons and OSD daemons.

    +
    +
    +

    Ordinarily, a blocklisted client cannot reconnect to the servers; you must unmount and then remount the client. However, permitting a client that was evicted to attempt to reconnect can be useful. Because CephFS uses the RADOS OSD blocklist to control client eviction, you can permit CephFS clients to reconnect by removing them from the blocklist.

    +
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Get the hosts that are currently part of the Red Hat Ceph Storage cluster:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch host ls
    +HOST                        ADDR           LABELS          STATUS
    +cephstorage-0.redhat.local  192.168.24.25  osd mds
    +cephstorage-1.redhat.local  192.168.24.50  osd mds
    +cephstorage-2.redhat.local  192.168.24.47  osd mds
    +controller-0.redhat.local   192.168.24.24  _admin mgr mon
    +controller-1.redhat.local   192.168.24.42  mgr _admin mon
    +controller-2.redhat.local   192.168.24.37  mgr _admin mon
    +6 hosts in cluster
    +
    +[ceph: root@controller-0 /]# ceph orch ls --export mds
    +service_type: mds
    +service_id: mds
    +service_name: mds.mds
    +placement:
    +  hosts:
    +  - controller-0.redhat.local
    +  - controller-1.redhat.local
    +  - controller-2.redhat.local
    +
    +
    +
  2. +
  3. +

    Apply the MDS labels to the target nodes:

    +
    +
    +
    for item in $(sudo cephadm shell --  ceph orch host ls --format json | jq -r '.[].hostname'); do
    +    sudo cephadm shell -- ceph orch host label add  $item mds;
    +done
    +
    +
    +
  4. +
  5. +

    Verify that all the hosts have the MDS label:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls
    +
    +HOST                    	ADDR       	   LABELS
    +cephstorage-0.redhat.local  192.168.24.11  osd mds
    +cephstorage-1.redhat.local  192.168.24.12  osd mds
    +cephstorage-2.redhat.local  192.168.24.47  osd mds
    +controller-0.redhat.local   192.168.24.35  _admin mon mgr mds
    +controller-1.redhat.local   192.168.24.53  mon _admin mgr mds
    +controller-2.redhat.local   192.168.24.10  mon _admin mgr mds
    +
    +
    +
  6. +
  7. +

    Dump the current MDS spec:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ls --export mds > mds.yaml
    +
    +
    +
  8. +
  9. +

    Edit the retrieved spec and replace the placement.hosts section with +placement.label:

    +
    +
    +
    service_type: mds
    +service_id: mds
    +service_name: mds.mds
    +placement:
    +  label: mds
    +
    +
    +
  10. +
  11. +

    Use the ceph orchestrator to apply the new MDS spec:

    +
    +
    +
    $ sudo cephadm shell -m mds.yaml -- ceph orch apply -i /mnt/mds.yaml
    +Scheduling new mds deployment …
    +
    +
    +
    +

    This results in an increased number of MDS daemons.

    +
    +
  12. +
  13. +

    Check the new standby daemons that are temporarily added to the CephFS:

    +
    +
    +
    $ ceph fs dump
    +
    +Active
    +
    +standby_count_wanted    1
    +[mds.mds.controller-0.awzplm{0:463158} state up:active seq 307 join_fscid=1 addr [v2:172.17.3.20:6802/51565420,v1:172.17.3.20:6803/51565420] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +Standby daemons:
    +
    +[mds.mds.cephstorage-1.jkvomp{-1:463800} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/2075903648,v1:172.17.3.135:6821/2075903648] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.controller-2.gfrqvc{-1:475945} state up:standby seq 1 addr [v2:172.17.3.114:6800/2452517189,v1:172.17.3.114:6801/2452517189] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.cephstorage-0.fqcshx{-1:476503} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.controller-1.tyiziq{-1:499136} state up:standby seq 1 addr [v2:172.17.3.43:6800/3615018301,v1:172.17.3.43:6801/3615018301] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +
  14. +
  15. +

    To migrate MDS to the target nodes, set the MDS affinity that manages the MDS failover:

    +
    + + + + + +
    + + +It is possible to elect a dedicated MDS as "active" for a particular file system. To configure this preference, CephFS provides a configuration option for MDS called mds_join_fs, which enforces this affinity. +When failing over MDS daemons, cluster monitors prefer standby daemons with mds_join_fs equal to the file system name with the failed rank. If no standby exists with mds_join_fs equal to the file system name, it chooses an unqualified standby as a replacement. +
    +
    +
    +
    +
    $ ceph config set mds.mds.cephstorage-0.fqcshx mds_join_fs cephfs
    +
    +
    +
  16. +
  17. +

    Remove the labels from the Controller nodes and force the MDS failover to the +target node:

    +
    +
    +
    $ for i in 0 1 2; do ceph orch host label rm "controller-$i.redhat.local" mds; done
    +
    +Removed label mds from host controller-0.redhat.local
    +Removed label mds from host controller-1.redhat.local
    +Removed label mds from host controller-2.redhat.local
    +
    +
    +
    +

    The switch to the target node happens in the background. The new active MDS is the daemon that you configured by using the mds_join_fs option.

    +
    +
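    Optionally, you can watch the failover progress from the node that hosts the cephadm client; a simple sketch:

    # Refresh the CephFS status every 5 seconds until the new active MDS appears.
    watch -n 5 "sudo cephadm shell -- ceph fs status cephfs"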
  18. +
  19. +

    Check the result of the failover and the new deployed daemons:

    +
    +
    +
    $ ceph fs dump
    +…
    +…
    +standby_count_wanted    1
    +[mds.mds.cephstorage-0.fqcshx{0:476503} state up:active seq 168 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +Standby daemons:
    +
    +[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.cephstorage-1.jkvomp{-1:499760} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/452139733,v1:172.17.3.135:6821/452139733] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +$ ceph orch ls
    +
    +NAME                     PORTS   RUNNING  REFRESHED  AGE  PLACEMENT
    +crash                                6/6  10m ago    10d  *
    +mds.mds                          3/3  10m ago    32m  label:mds
    +
    +
    +$ ceph orch ps | grep mds
    +
    +
    +mds.mds.cephstorage-0.fqcshx  cephstorage-0.redhat.local                     running (79m)     3m ago  79m    27.2M        -  17.2.6-100.el9cp  1af7b794f353  2a2dc5ba6d57
    +mds.mds.cephstorage-1.jkvomp  cephstorage-1.redhat.local                     running (79m)     3m ago  79m    21.5M        -  17.2.6-100.el9cp  1af7b794f353  7198b87104c8
    +mds.mds.cephstorage-2.gnfhfe  cephstorage-2.redhat.local                     running (79m)     3m ago  79m    24.2M        -  17.2.6-100.el9cp  1af7b794f353  f3cb859e2a15
    +
    +
    +
  20. +
+
+
+
+

Migrating Red Hat Ceph Storage RGW to external RHEL nodes

+
+

For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes, you must migrate the Ceph Object Gateway (RGW) daemons that are included in the Red Hat OpenStack Platform Controller nodes into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or Red Hat Ceph Storage nodes. Your environment must have Red Hat Ceph Storage version 6 or later and be managed by cephadm or Ceph Orchestrator.

+
+
+

Completing prerequisites for Red Hat Ceph Storage RGW migration

+
+

Complete the following prerequisites before you begin the Ceph Object Gateway (RGW) migration.

+
+
+
Procedure
+
    +
  1. +

    Check the current status of the Red Hat Ceph Storage nodes:

    +
    +
    +
    (undercloud) [stack@undercloud-0 ~]$ metalsmith list
    +
    +
    +    +------------------------+    +----------------+
    +    | IP Addresses           |    |  Hostname      |
    +    +------------------------+    +----------------+
    +    | ctlplane=192.168.24.25 |    | cephstorage-0  |
    +    | ctlplane=192.168.24.10 |    | cephstorage-1  |
    +    | ctlplane=192.168.24.32 |    | cephstorage-2  |
    +    | ctlplane=192.168.24.28 |    | compute-0      |
    +    | ctlplane=192.168.24.26 |    | compute-1      |
    +    | ctlplane=192.168.24.43 |    | controller-0   |
    +    | ctlplane=192.168.24.7  |    | controller-1   |
    +    | ctlplane=192.168.24.41 |    | controller-2   |
    +    +------------------------+    +----------------+
    +
    +
    +
  2. +
  3. +

    Log in to controller-0 and check the Pacemaker status by running sudo pcs status, to identify information that is important for the RGW migration:

    +
    +
    +
    Full List of Resources:
    +  * ip-192.168.24.46	(ocf:heartbeat:IPaddr2):     	Started controller-0
    +  * ip-10.0.0.103   	(ocf:heartbeat:IPaddr2):     	Started controller-1
    +  * ip-172.17.1.129 	(ocf:heartbeat:IPaddr2):     	Started controller-2
    +  * ip-172.17.3.68  	(ocf:heartbeat:IPaddr2):     	Started controller-0
    +  * ip-172.17.4.37  	(ocf:heartbeat:IPaddr2):     	Started controller-1
    +  * Container bundle set: haproxy-bundle
    +
    +[undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
    +    * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-2
    +    * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-0
    +    * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-1
    +
    +
    +
  4. +
  5. +

    Identify the ranges of the storage networks. The following is an example and the values might differ in your environment:

    +
    +
    +
    [heat-admin@controller-0 ~]$ ip -o -4 a
    +
    +1: lo	inet 127.0.0.1/8 scope host lo\   	valid_lft forever preferred_lft forever
    +2: enp1s0	inet 192.168.24.45/24 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
    +2: enp1s0	inet 192.168.24.46/32 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
    +7: br-ex	inet 10.0.0.122/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever (1)
    +8: vlan70	inet 172.17.5.22/24 brd 172.17.5.255 scope global vlan70\   	valid_lft forever preferred_lft forever (2)
    +8: vlan70	inet 172.17.5.94/32 brd 172.17.5.255 scope global vlan70\   	valid_lft forever preferred_lft forever
    +9: vlan50	inet 172.17.2.140/24 brd 172.17.2.255 scope global vlan50\   	valid_lft forever preferred_lft forever
    +10: vlan30	inet 172.17.3.73/24 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
    +10: vlan30	inet 172.17.3.68/32 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
    +11: vlan20	inet 172.17.1.88/24 brd 172.17.1.255 scope global vlan20\   	valid_lft forever preferred_lft forever
    +12: vlan40	inet 172.17.4.24/24 brd 172.17.4.255 scope global vlan40\   	valid_lft forever preferred_lft forever
    +
    +
    +
    + + + + + + + + + +
    (1) br-ex represents the External Network, where, in the current environment, HAProxy has the front-end Virtual IP (VIP) assigned.
    (2) vlan30 represents the Storage Network, where the new RGW instances should be started on the Red Hat Ceph Storage nodes.
    +
    +
  6. +
  7. +

    Identify the network that you previously had in HAProxy and propagate it through director to the Red Hat Ceph Storage nodes. Use this network to reserve a new VIP that is owned by Red Hat Ceph Storage as the entry point for the RGW service.

    +
    +
      +
    1. +

      Log in to controller-0 and find the ceph_rgw section in the current HAProxy configuration:

      +
      +
      +
      $ less /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg
      +
      +...
      +...
      +listen ceph_rgw
      +  bind 10.0.0.103:8080 transparent
      +  bind 172.17.3.68:8080 transparent
      +  mode http
      +  balance leastconn
      +  http-request set-header X-Forwarded-Proto https if { ssl_fc }
      +  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
      +  http-request set-header X-Forwarded-Port %[dst_port]
      +  option httpchk GET /swift/healthcheck
      +  option httplog
      +  option forwardfor
      +  server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2
      +  server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2
      +  server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2
      +
      +
      +
    2. +
    3. +

      Confirm that the network is used as an HAProxy front end. The following example shows that controller-0 exposes the services by using the external network, which is absent from the Red Hat Ceph Storage nodes. You must propagate the external network through director:

      +
      +
      +
      [controller-0]$ ip -o -4 a
      +
      +...
      +7: br-ex	inet 10.0.0.106/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever
      +...
      +
      +
      +
    4. +
    +
    +
  8. +
  9. +

    Propagate the HAProxy front-end network to Red Hat Ceph Storage nodes.

    +
    +
      +
    1. +

      Change the NIC template that you use to define the ceph-storage network interfaces and add the new config section:

      +
      +
      +
      ---
      +network_config:
      +- type: interface
      +  name: nic1
      +  use_dhcp: false
      +  dns_servers: {{ ctlplane_dns_nameservers }}
      +  addresses:
      +  - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
      +  routes: {{ ctlplane_host_routes }}
      +- type: vlan
      +  vlan_id: {{ storage_mgmt_vlan_id }}
      +  device: nic1
      +  addresses:
      +  - ip_netmask: {{ storage_mgmt_ip }}/{{ storage_mgmt_cidr }}
      +  routes: {{ storage_mgmt_host_routes }}
      +- type: interface
      +  name: nic2
      +  use_dhcp: false
      +  defroute: false
      +- type: vlan
      +  vlan_id: {{ storage_vlan_id }}
      +  device: nic2
      +  addresses:
      +  - ip_netmask: {{ storage_ip }}/{{ storage_cidr }}
      +  routes: {{ storage_host_routes }}
      +- type: ovs_bridge
      +  name: {{ neutron_physical_bridge_name }}
      +  dns_servers: {{ ctlplane_dns_nameservers }}
      +  domain: {{ dns_search_domains }}
      +  use_dhcp: false
      +  addresses:
      +  - ip_netmask: {{ external_ip }}/{{ external_cidr }}
      +  routes: {{ external_host_routes }}
      +  members:
      +  - type: interface
      +    name: nic3
      +    primary: true
      +
      +
      +
    2. +
    3. +

      Add the External Network to the baremetal.yaml file that is used by metalsmith:

      +
      +
      +
      - name: CephStorage
      +  count: 3
      +  hostname_format: cephstorage-%index%
      +  instances:
      +  - hostname: cephstorage-0
      +    name: ceph-0
      +  - hostname: cephstorage-1
      +    name: ceph-1
      +  - hostname: cephstorage-2
      +    name: ceph-2
      +  defaults:
      +    profile: ceph-storage
      +    network_config:
      +      template: /home/stack/composable_roles/network/nic-configs/ceph-storage.j2
      +    networks:
      +    - network: ctlplane
      +      vif: true
      +    - network: storage
      +    - network: storage_mgmt
      +    - network: external
      +
      +
      +
    4. +
    5. +

      Configure the new network on the bare metal nodes:

      +
      +
      +
      (undercloud) [stack@undercloud-0]$
      +
      +openstack overcloud node provision \
      +   -o overcloud-baremetal-deployed-0.yaml \
      +   --stack overcloud \
      +   --network-config -y \
      +   $PWD/network/baremetal_deployment.yaml
      +
      +
      +
    6. +
    7. +

      Verify that the new network is configured on the Red Hat Ceph Storage nodes:

      +
      +
      +
      [root@cephstorage-0 ~]# ip -o -4 a
      +
      +1: lo	inet 127.0.0.1/8 scope host lo\   	valid_lft forever preferred_lft forever
      +2: enp1s0	inet 192.168.24.54/24 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
      +11: vlan40	inet 172.17.4.43/24 brd 172.17.4.255 scope global vlan40\   	valid_lft forever preferred_lft forever
      +12: vlan30	inet 172.17.3.23/24 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
      +14: br-ex	inet 10.0.0.133/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever
      +
      +
      +
    8. +
    +
    +
  10. +
+
+
+
+

Migrating the Red Hat Ceph Storage RGW back ends

+
+

You must migrate your Ceph Object Gateway (RGW) back ends from your Controller nodes to your Red Hat Ceph Storage nodes. To ensure that you distribute the correct amount of services to your available nodes, you use cephadm labels to refer to a group of nodes where a given daemon type is deployed. For more information about the cardinality diagram, see Red Hat Ceph Storage daemon cardinality. +The following procedure assumes that you have three target nodes, cephstorage-0, cephstorage-1, cephstorage-2.

+
+
+
Procedure
+
    +
  1. +

    Add the RGW label to the Red Hat Ceph Storage nodes that you want to migrate your RGW back ends to:

    +
    +
    +
    $ ceph orch host label add cephstorage-0 rgw;
    +$ ceph orch host label add cephstorage-1 rgw;
    +$ ceph orch host label add cephstorage-2 rgw;
    +
    +Added label rgw to host cephstorage-0
    +Added label rgw to host cephstorage-1
    +Added label rgw to host cephstorage-2
    +
    +[ceph: root@controller-0 /]# ceph orch host ls
    +
    +HOST       	ADDR       	LABELS      	STATUS
    +cephstorage-0  192.168.24.54  osd rgw
    +cephstorage-1  192.168.24.44  osd rgw
    +cephstorage-2  192.168.24.30  osd rgw
    +controller-0   192.168.24.45  _admin mon mgr
    +controller-1   192.168.24.11  _admin mon mgr
    +controller-2   192.168.24.38  _admin mon mgr
    +
    +6 hosts in cluster
    +
    +
    +
  2. +
  3. +

    Locate the RGW spec:

    +
    +
    +
    [root@controller-0 heat-admin]# cat rgw
    +
    +networks:
    +- 172.17.3.0/24
    +placement:
    +  hosts:
    +  - controller-0
    +  - controller-1
    +  - controller-2
    +service_id: rgw
    +service_name: rgw.rgw
    +service_type: rgw
    +spec:
    +  rgw_frontend_port: 8080
    +  rgw_realm: default
    +  rgw_zone: default
    +
    +
    +
    +

    This example assumes that 172.17.3.0/24 is the storage network.

    +
    +
  4. +
  5. +

    Ensure that the label is set in the placement section and that the rgw_frontend_port value is set in the spec section:

    +
    +
    +
    ---
    +networks:
    +- 172.17.3.0/24 (1)
    +placement:
    +  label: rgw (2)
    +service_id: rgw
    +service_name: rgw.rgw
    +service_type: rgw
    +spec:
    +  rgw_frontend_port: 8090 (3)
    +  rgw_realm: default
    +  rgw_zone: default
    +  rgw_frontend_ssl_certificate: | (4)
    +     -----BEGIN PRIVATE KEY-----
    +     ...
    +     -----END PRIVATE KEY-----
    +     -----BEGIN CERTIFICATE-----
    +     ...
    +     -----END CERTIFICATE-----
    +  ssl: true
    +
    +
    +
    + + + + + + + + + + + + + + + + + +
    (1) Add the storage network where the RGW back ends are deployed.
    (2) Replace the Controller nodes with the label: rgw label.
    (3) Change the rgw_frontend_port value to 8090 to avoid conflicts with the Ceph ingress daemon.
    (4) Optional: if TLS is enabled, add the SSL certificate and key concatenation as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.
    +
    +
  6. +
  7. +

    Apply the new RGW spec by using the orchestrator CLI:

    +
    +
    +
    $ cephadm shell -m /home/ceph-admin/specs/rgw
    +$ cephadm shell -- ceph orch apply -i /mnt/rgw
    +
    +
    +
    +

    This command triggers the redeploy, for example:

    +
    +
    +
    +
    ...
    +osd.9                     	cephstorage-2
    +rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0  172.17.3.23:8090   starting
    +rgw.rgw.cephstorage-1.qynkan  cephstorage-1  172.17.3.26:8090   starting
    +rgw.rgw.cephstorage-2.krycit  cephstorage-2  172.17.3.81:8090   starting
    +rgw.rgw.controller-1.eyvrzw   controller-1   172.17.3.146:8080  running (5h)
    +rgw.rgw.controller-2.navbxa   controller-2   172.17.3.66:8080   running (5h)
    +
    +...
    +osd.9                     	cephstorage-2
    +rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0  172.17.3.23:8090  running (19s)
    +rgw.rgw.cephstorage-1.qynkan  cephstorage-1  172.17.3.26:8090  running (16s)
    +rgw.rgw.cephstorage-2.krycit  cephstorage-2  172.17.3.81:8090  running (13s)
    +
    +
    +
  8. +
  9. +

    Ensure that the new RGW back ends are reachable on the new ports so that you can enable an ingress daemon on port 8080 later. Log in to each Red Hat Ceph Storage node that includes RGW and add iptables rules to allow connections to ports 8080 and 8090 on those nodes:

    +
    +
    +
    $ iptables -I INPUT -p tcp -m tcp --dport 8080 -m conntrack --ctstate NEW -m comment --comment "ceph rgw ingress" -j ACCEPT
    +$ iptables -I INPUT -p tcp -m tcp --dport 8090 -m conntrack --ctstate NEW -m comment --comment "ceph rgw backends" -j ACCEPT
    +
    +
    +
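    These rules are not persistent across reboots by default. A minimal sketch of persisting them, assuming the nodes use the iptables-services package (the same assumption the later Ceph Manager and Ceph Monitor steps make when they restart the iptables service):

    # Save the rules added above to the iptables-services persistence file.
    sudo iptables-save | sudo tee /etc/sysconfig/iptables
    sudo systemctl restart iptables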
  10. +
  11. +

    From a Controller node, such as controller-0, try to reach the RGW back ends:

    +
    +
    +
    $ curl http://cephstorage-0.storage:8090;
    +
    +
    +
    +

    You should observe the following output:

    +
    +
    +
    +
    <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
    +
    +
    +
    +

    Repeat the verification for each node where a RGW daemon is deployed.

    +
    +
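    A short sketch of that verification loop, assuming the <node>.storage hostnames resolve as in the example above:

    # Each RGW back end should answer with the anonymous ListAllMyBucketsResult XML.
    for node in cephstorage-0 cephstorage-1 cephstorage-2; do
        echo "== ${node} =="
        curl -s "http://${node}.storage:8090"
        echo
    done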
  12. +
  13. +

    If you migrated RGW back ends to the Red Hat Ceph Storage nodes, there is no internal API network, except in the case of HCI nodes. You must reconfigure the RGW keystone endpoint to point to the external network that you propagated:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump | grep keystone
    +global   basic rgw_keystone_url  http://172.16.1.111:5000
    +
    +[ceph: root@controller-0 /]# ceph config set global rgw_keystone_url http://<keystone_endpoint>:5000
    +
    +
    +
    +
      +
    • +

      Replace <keystone_endpoint> with the internal endpoint of the Identity service (keystone) that is deployed in the OpenStackControlPlane CR when you adopt the Identity service, as sketched below. For more information, see Adopting the Identity service.

      +
    • +
    +
    +
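    A hedged sketch of one way to look up and set that endpoint; the openstack namespace and the openstackclient pod name are assumptions and depend on how you deployed the control plane:

    # Query the adopted control plane for the internal Identity endpoint, then
    # point RGW at it. Adjust the namespace and pod names to your deployment.
    KEYSTONE_URL=$(oc -n openstack rsh openstackclient \
        openstack endpoint list --service identity --interface internal -f value -c URL)
    sudo cephadm shell -- ceph config set global rgw_keystone_url "$KEYSTONE_URL"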
  14. +
+
+
+
+

Deploying a Red Hat Ceph Storage ingress daemon

+
+

To deploy the Ceph ingress daemon, you perform the following actions:

+
+
+
    +
  1. +

    Remove the existing ceph_rgw configuration.

    +
  2. +
  3. +

    Clean up the configuration created by director.

    +
  4. +
  5. +

    Redeploy the Object Storage service (swift).

    +
  6. +
+
+
+

When you deploy the ingress daemon, two new containers are created:

+
+
+
    +
  • +

    HAProxy, which you use to reach the back ends.

    +
  • +
  • +

    Keepalived, which you use to own the virtual IP address.

    +
  • +
+
+
+

You use the rgw label to distribute the ingress daemon to only the number of nodes that host Ceph Object Gateway (RGW) daemons. For more information about distributing daemons among your nodes, see Red Hat Ceph Storage daemon cardinality.

+
+
+

After you complete this procedure, you can reach the RGW back end from the ingress daemon and use RGW through the Object Storage service CLI.

+
+
+
Procedure
+
    +
  1. +

    Log in to each Controller node and remove the following configuration from the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file:

    +
    +
    +
    listen ceph_rgw
    +  bind 10.0.0.103:8080 transparent
    +  mode http
    +  balance leastconn
    +  http-request set-header X-Forwarded-Proto https if { ssl_fc }
    +  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
    +  http-request set-header X-Forwarded-Port %[dst_port]
    +  option httpchk GET /swift/healthcheck
    +  option httplog
    +  option forwardfor
    +   server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2
    +  server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2
    +  server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2
    +
    +
    +
  2. +
  3. +

    Restart haproxy-bundle and confirm that it is started:

    +
    +
    +
    [root@controller-0 ~]# sudo pcs resource restart haproxy-bundle
    +haproxy-bundle successfully restarted
    +
    +
    +[root@controller-0 ~]# sudo pcs status | grep haproxy
    +
    +  * Container bundle set: haproxy-bundle [undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
    +    * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-0
    +    * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-1
    +    * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-2
    +
    +
    +
  4. +
  5. +

    Confirm that no process is connected to port 8080:

    +
    +
    +
    [root@controller-0 ~]# ss -antop | grep 8080
    +[root@controller-0 ~]#
    +
    +
    +
    +

    You can expect the Object Storage service (swift) CLI to fail to establish the connection:

    +
    +
    +
    +
    (overcloud) [root@cephstorage-0 ~]# swift list
    +
    +HTTPConnectionPool(host='10.0.0.103', port=8080): Max retries exceeded with url: /swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc41beb0430>: Failed to establish a new connection: [Errno 111] Connection refused'))
    +
    +
    +
  6. +
  7. +

    Set the required images for both HAProxy and Keepalived:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_haproxy registry.redhat.io/rhceph/rhceph-haproxy-rhel9:latest
    +[ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_keepalived registry.redhat.io/rhceph/keepalived-rhel9:latest
    +
    +
    +
  8. +
  9. +

    Create a file called rgw_ingress in the /home/ceph-admin/specs/ directory in controller-0:

    +
    +
    +
    $ sudo vim /home/ceph-admin/specs/rgw_ingress
    +
    +
    +
  10. +
  11. +

    Paste the following content into the rgw_ingress file:

    +
    +
    +
    ---
    +service_type: ingress
    +service_id: rgw.rgw
    +placement:
    +  label: rgw
    +spec:
    +  backend_service: rgw.rgw
    +  virtual_ip: 10.0.0.89/24
    +  frontend_port: 8080
    +  monitor_port: 8898
    +  virtual_interface_networks:
    +    - <external_network>
    +  ssl_cert: |
    +     -----BEGIN CERTIFICATE-----
    +     ...
    +     -----END CERTIFICATE-----
    +     -----BEGIN PRIVATE KEY-----
    +     ...
    +     -----END PRIVATE KEY-----
    +
    +
    +
    + +
    +
  12. +
  13. +

    Apply the rgw_ingress spec by using the Ceph orchestrator CLI:

    +
    +
    +
    $ cephadm shell -m /home/ceph-admin/specs/rgw_ingress
    +$ cephadm shell -- ceph orch apply -i /mnt/rgw_ingress
    +
    +
    +
  14. +
  15. +

    Wait until the ingress is deployed and query the resulting endpoint:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ls
    +
    +NAME                 	PORTS            	RUNNING  REFRESHED  AGE  PLACEMENT
    +crash                                         	6/6  6m ago 	3d   *
    +ingress.rgw.rgw      	10.0.0.89:8080,8898  	6/6  37s ago	60s  label:rgw
    +mds.mds                   3/3  6m ago 	3d   controller-0;controller-1;controller-2
    +mgr                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
    +mon                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
    +osd.default_drive_group   15  37s ago	3d   cephstorage-0;cephstorage-1;cephstorage-2
    +rgw.rgw   ?:8090          3/3  37s ago	4m   label:rgw
    +
    +
    +
    +
    +
    [ceph: root@controller-0 /]# curl  10.0.0.89:8080
    +
    +---
    +<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>[ceph: root@controller-0 /]#
    +—
    +
    +
    +
  16. +
+
+
+
+

Updating the Object Storage service endpoints

+
+

You must update the Object Storage service (swift) endpoints to point to the new virtual IP address (VIP) that you reserved on the same network that you used to deploy RGW ingress.

+
+
+
Procedure
+
    +
  1. +

    List the current endpoints:

    +
    +
    +
    (overcloud) [stack@undercloud-0 ~]$ openstack endpoint list | grep object
    +
    +| 1326241fb6b6494282a86768311f48d1 | regionOne | swift    	| object-store   | True	| internal  | http://172.17.3.68:8080/swift/v1/AUTH_%(project_id)s |
    +| 8a34817a9d3443e2af55e108d63bb02b | regionOne | swift    	| object-store   | True	| public	| http://10.0.0.103:8080/swift/v1/AUTH_%(project_id)s  |
    +| fa72f8b8b24e448a8d4d1caaeaa7ac58 | regionOne | swift    	| object-store   | True	| admin 	| http://172.17.3.68:8080/swift/v1/AUTH_%(project_id)s |
    +
    +
    +
  2. +
  3. +

    Update the endpoints to point to the new Ingress VIP:

    +
    +
    +
    (overcloud) [stack@undercloud-0 ~]$ openstack endpoint set --url "http://10.0.0.89:8080/swift/v1/AUTH_%(project_id)s" 95596a2d92c74c15b83325a11a4f07a3
    +
    +(overcloud) [stack@undercloud-0 ~]$ openstack endpoint list | grep object-store
    +| 6c7244cc8928448d88ebfad864fdd5ca | regionOne | swift    	| object-store   | True	| internal  | http://172.17.3.79:8080/swift/v1/AUTH_%(project_id)s |
    +| 95596a2d92c74c15b83325a11a4f07a3 | regionOne | swift    	| object-store   | True	| public	| http://10.0.0.89:8080/swift/v1/AUTH_%(project_id)s   |
    +| e6d0599c5bf24a0fb1ddf6ecac00de2d | regionOne | swift    	| object-store   | True	| admin 	| http://172.17.3.79:8080/swift/v1/AUTH_%(project_id)s |
    +
    +
    +
    +

    Repeat this step for both the internal and admin endpoints; a sketch follows.

    +
    +
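    A sketch of that repetition; the endpoint IDs are placeholders that you must take from openstack endpoint list, and the URLs must match the addresses you want the internal and admin interfaces to use (the storage-network address of the relocated RGW service in the example output above):

    # Update the internal and admin object-store endpoints as well.
    openstack endpoint set --url "http://172.17.3.79:8080/swift/v1/AUTH_%(project_id)s" <internal_endpoint_id>
    openstack endpoint set --url "http://172.17.3.79:8080/swift/v1/AUTH_%(project_id)s" <admin_endpoint_id>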
  4. +
  5. +

    Test the migrated service:

    +
    +
    +
    (overcloud) [stack@undercloud-0 ~]$ swift list --debug
    +
    +DEBUG:swiftclient:Versionless auth_url - using http://10.0.0.115:5000/v3 as endpoint
    +DEBUG:keystoneclient.auth.identity.v3.base:Making authentication request to http://10.0.0.115:5000/v3/auth/tokens
    +DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 10.0.0.115:5000
    +DEBUG:urllib3.connectionpool:http://10.0.0.115:5000 "POST /v3/auth/tokens HTTP/1.1" 201 7795
    +DEBUG:keystoneclient.auth.identity.v3.base:{"token": {"methods": ["password"], "user": {"domain": {"id": "default", "name": "Default"}, "id": "6f87c7ffdddf463bbc633980cfd02bb3", "name": "admin", "password_expires_at": null},
    +
    +
    +...
    +...
    +...
    +
    +DEBUG:swiftclient:REQ: curl -i http://10.0.0.89:8080/swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json -X GET -H "X-Auth-Token: gAAAAABj7KHdjZ95syP4c8v5a2zfXckPwxFQZYg0pgWR42JnUs83CcKhYGY6PFNF5Cg5g2WuiYwMIXHm8xftyWf08zwTycJLLMeEwoxLkcByXPZr7kT92ApT-36wTfpi-zbYXd1tI5R00xtAzDjO3RH1kmeLXDgIQEVp0jMRAxoVH4zb-DVHUos" -H "Accept-Encoding: gzip"
    +DEBUG:swiftclient:RESP STATUS: 200 OK
    +DEBUG:swiftclient:RESP HEADERS: {'content-length': '2', 'x-timestamp': '1676452317.72866', 'x-account-container-count': '0', 'x-account-object-count': '0', 'x-account-bytes-used': '0', 'x-account-bytes-used-actual': '0', 'x-account-storage-policy-default-placement-container-count': '0', 'x-account-storage-policy-default-placement-object-count': '0', 'x-account-storage-policy-default-placement-bytes-used': '0', 'x-account-storage-policy-default-placement-bytes-used-actual': '0', 'x-trans-id': 'tx00000765c4b04f1130018-0063eca1dd-1dcba-default', 'x-openstack-request-id': 'tx00000765c4b04f1130018-0063eca1dd-1dcba-default', 'accept-ranges': 'bytes', 'content-type': 'application/json; charset=utf-8', 'date': 'Wed, 15 Feb 2023 09:11:57 GMT'}
    +DEBUG:swiftclient:RESP BODY: b'[]'
    +
    +
    +
  6. +
+
+
+
+
+

Migrating Red Hat Ceph Storage RBD to external RHEL nodes

+
+

For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are +running Red Hat Ceph Storage version 6 or later, you must migrate the daemons that are +included in the Red Hat OpenStack Platform control plane into the existing external Red +Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include +the Compute nodes for an HCI environment or dedicated storage nodes.

+
+
+

To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must +meet the following requirements:

+
+
+
    +
  • +

    Red Hat Ceph Storage is running version 6 or later and is managed by cephadm.

    +
  • +
  • +

    NFS Ganesha is migrated from a director deployment to cephadm. For more information, see Creating a NFS Ganesha +cluster.

    +
  • +
  • +

    Both the Red Hat Ceph Storage public and cluster networks are propagated with +director to the target nodes.

    +
  • +
  • +

    Ceph Metadata Server, monitoring stack, Ceph Object Gateway, and any other daemon that is deployed on Controller nodes.

    +
  • +
  • +

    The daemons distribution follows the cardinality constraints that are +described in Red Hat Ceph +Storage: Supported configurations.

    +
  • +
  • +

    The Red Hat Ceph Storage cluster is healthy, and the ceph -s command returns HEALTH_OK.

    +
  • +
+
+
+

Migrating Ceph Manager daemons to Red Hat Ceph Storage nodes

+
+

You must migrate your Ceph Manager daemons from the Red Hat OpenStack Platform (RHOSP) Controller nodes to a set of target nodes. Target nodes are either existing Red Hat Ceph Storage nodes, or RHOSP Compute nodes if Red Hat Ceph Storage is deployed by director with a Hyperconverged Infrastructure (HCI) topology.

+
+
+ + + + + +
+ + +The following procedure uses cephadm and the Ceph Orchestrator to drive the Ceph Manager migration, and the Ceph spec to modify the placement and reschedule the Ceph Manager daemons. Ceph Manager is run in an active/passive state. It also provides many modules, including the Ceph Orchestrator. Every potential module, such as the Ceph Dashboard, that is provided by ceph-mgr is implicitly migrated with Ceph Manager. +
+
+
+
Prerequisites
+
    +
  • +

    The target nodes, CephStorage or ComputeHCI, are configured to have both storage and storage_mgmt networks. This ensures that you can use both Red Hat Ceph Storage public and cluster networks from the same node.

    +
    + + + + + +
    + + +This step requires you to interact with director. In RHOSP 17.1 and later, you do not have to run a stack update. +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    SSH into the target node and enable the firewall rules that are required to reach a Ceph Manager service:

    +
    +
    +
    dports="6800:7300"
    +ssh heat-admin@<target_node> sudo iptables -I INPUT \
    +    -p tcp --match multiport --dports $dports -j ACCEPT;
    +
    +
    +
    +
      +
    • +

      Replace <target_node> with the hostname of the hosts that are listed in the Red Hat Ceph Storage environment. Run ceph orch host ls to see the list of the hosts.

      +
      +

      Repeat this step for each target node; a loop sketch follows this list.

      +
      +
    • +
    +
    +
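    A loop sketch that applies the rule to every host known to the cluster, following the jq pattern used earlier in this guide; narrow the host list if only some nodes will receive the mgr label:

    dports="6800:7300"
    for host in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do
        ssh heat-admin@"$host" sudo iptables -I INPUT \
            -p tcp --match multiport --dports "$dports" -j ACCEPT
    done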
  2. +
  3. +

    Check that the rules are properly applied to the target node and persist them:

    +
    +
    +
    $ sudo iptables-save
    +$ sudo systemctl restart iptables
    +
    +
    +
  4. +
  5. +

    Prepare the target node to host the new Ceph Manager daemon, and add the mgr +label to the target node:

    +
    +
    +
    $ ceph orch host label add <target_node> mgr
    +
    +
    +
  6. +
  7. +

    Repeat steps 1-3 for each target node that hosts a Ceph Manager daemon.

    +
  8. +
  9. +

    Get the Ceph Manager spec:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch ls --export mgr > mgr.yaml
    +
    +
    +
  10. +
  11. +

    Edit the retrieved spec and add the label: mgr section to the placement +section:

    +
    +
    +
    service_type: mgr
    +service_id: mgr
    +placement:
    +  label: mgr
    +
    +
    +
  12. +
  13. +

    Save the spec in the /tmp/mgr.yaml file.

    +
  14. +
  15. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mgr.yaml -- ceph orch apply -i /mnt/mgr.yaml
    +
    +
    +
  16. +
+
+
+
Verification
+
    +
  1. +

    Verify that the new Ceph Manager daemons are created in the target nodes:

    +
    +
    +
    $ ceph orch ps | grep -i mgr
    +$ ceph -s
    +
    +
    +
    +

    The Ceph Manager daemon count should match the number of hosts where the mgr label is added.

    +
    +
    + + + + + +
    + + +The migration does not shrink the Ceph Manager daemons. The count grows by +the number of target nodes, and migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes +decommissions the stand-by Ceph Manager instances. For more information, see +Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes. +
    +
    +
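    A quick, hedged cross-check of that count, assuming the default table output format of the orchestrator commands:

    # Number of running Ceph Manager daemons vs. number of hosts carrying the mgr label.
    sudo cephadm shell -- ceph orch ps | grep -c '^mgr\.'
    sudo cephadm shell -- ceph orch host ls | grep -c ' mgr'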
  2. +
+
+
+
+

Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes

+
+

You must move Ceph Monitor daemons from the Red Hat OpenStack Platform (RHOSP) Controller nodes to a set of target nodes. Target nodes are either existing Red Hat Ceph Storage nodes, or RHOSP Compute nodes if Red Hat Ceph Storage is +deployed by director with a Hyperconverged Infrastructure (HCI) topology. Additional Ceph Monitors are deployed to the target nodes, and they are promoted as _admin nodes that you can use to manage the Red Hat Ceph Storage cluster and perform day 2 operations.

+
+
+

To migrate the Ceph Monitor daemons, you must perform the following high-level steps:

+
+
+  1. Configure the target nodes for the Ceph Monitor migration.
+  2. Drain the source node.
+  3. Migrate the Ceph Monitor IP address to a target node.
+  4. Redeploy the Ceph Monitor on the target node.
+
+

Repeat these steps for any additional Controller node that hosts a Ceph Monitor until you migrate all the Ceph Monitor daemons to the target nodes.

+
+
+
Configuring target nodes for Ceph Monitor migration
+
+

Prepare the target Red Hat Ceph Storage nodes for the Ceph Monitor migration by performing the following actions:

+
+
+
    +
  1. +

    Enable firewall rules in a target node and persist them.

    +
  2. +
  3. +

    Create a spec that is based on labels and apply it by using cephadm.

    +
  4. +
  5. +

    Ensure that the Ceph Monitor quorum is maintained during the migration process.

    +
  6. +
+
+
+
Procedure
+
    +
  1. +

    SSH into the target node and enable the firewall rules that are required to +reach a Ceph Monitor service:

    +
    +
    +
    $ for port in 3300 6789; {
    +    ssh heat-admin@<target_node> sudo iptables -I INPUT \
    +    -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
    +    -j ACCEPT;
    +}
    +
    +
    +
    +
      +
    • +

      Replace <target_node> with the hostname of the node that hosts the new Ceph Monitor.

      +
    • +
    +
    +
  2. +
  3. +

    Check that the rules are properly applied to the target node and persist them:

    +
    +
    +
    $ sudo iptables-save
    +$ sudo systemctl restart iptables
    +
    +
    +
  4. +
  5. +

    To migrate the existing Ceph Monitors to the target Red Hat Ceph Storage nodes, create the following Red Hat Ceph Storage spec from the first Ceph Monitor, or the first Controller node, and add the label:mon section to the placement section:

    +
    +
    +
    service_type: mon
    +service_id: mon
    +placement:
    +  label: mon
    +
    +
    +
  6. +
  7. +

    Save the spec in the /tmp/mon.yaml file.

    +
  8. +
  9. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mon.yaml
    +$ ceph orch apply -i /mnt/mon.yaml
    +
    +
    +
  10. +
  11. +

    Apply the mon label to the remaining Red Hat Ceph Storage target nodes to ensure that +quorum is maintained during the migration process:

    +
    +
    +
    declare -A target_nodes
    +
    +target_nodes[mon]="oc0-ceph-0 oc0-ceph-1 oc0-ceph-2"
    +
    +mon_nodes="${target_nodes[mon]}"
    +IFS=' ' read -r -a mons <<< "$mon_nodes"
    +
    +for node in "${mons[@]}"; do
    +    ceph orch host label add $node mon
    +    ceph orch host label add $node _admin
    +done
    +
    +
    +
    + + + + + +
    + + +Applying the mon.yaml spec allows the existing strategy to use labels +instead of hosts. As a result, any node with the mon label can host a Ceph +Monitor daemon. Perform this step only once to avoid multiple iterations when multiple Ceph Monitors are migrated. +
    +
    +
  12. +
  13. +

    Check the status of the Red Hat Ceph Storage and the Ceph Orchestrator daemons list. +Ensure that Ceph Monitors are in a quorum and listed by the ceph orch command:

    +
    +
    +
    # ceph -s
    +  cluster:
    +    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    +    health: HEALTH_OK
    +
    +  services:
    +    mon: 6 daemons, quorum oc0-controller-0,oc0-controller-1,oc0-controller-2,oc0-ceph-0,oc0-ceph-1,oc0-ceph-2 (age 19m)
    +    mgr: oc0-controller-0.xzgtvo(active, since 32m), standbys: oc0-controller-1.mtxohd, oc0-controller-2.ahrgsk
    +    osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs
    +
    +  data:
    +    pools:   1 pools, 1 pgs
    +    objects: 0 objects, 0 B
    +    usage:   43 MiB used, 400 GiB / 400 GiB avail
    +    pgs:     1 active+clean
    +
    +
    +
    +
    +
    [ceph: root@oc0-controller-0 /]# ceph orch host ls
    +HOST              ADDR           LABELS          STATUS
    +oc0-ceph-0        192.168.24.14  osd mon _admin
    +oc0-ceph-1        192.168.24.7   osd mon _admin
    +oc0-ceph-2        192.168.24.8   osd mon _admin
    +oc0-controller-0  192.168.24.15  _admin mgr mon
    +oc0-controller-1  192.168.24.23  _admin mgr mon
    +oc0-controller-2  192.168.24.13  _admin mgr mon
    +
    +
    +
  14. +
+
+
+
Next steps
+

Proceed to the next step Draining the source node.

+
+
+
+
Draining the source node
+
+

Drain the existing Controller nodes and remove the source node host from the Red Hat Ceph Storage cluster.

+
+
+
Procedure
+
    +
  1. +

    On the source node, back up the /etc/ceph/ directory to run cephadm and get a shell for the Red Hat Ceph Storage cluster from the source node:

    +
    +
    +
    $ mkdir -p $HOME/ceph_client_backup
    +$ sudo cp -R /etc/ceph $HOME/ceph_client_backup
    +
    +
    +
  2. +
  3. +

    Identify the active ceph-mgr instance:

    +
    +
    +
    $ cephadm shell -- ceph mgr stat
    +
    +
    +
  4. +
  5. +

    Fail the ceph-mgr if it is active on the source node or target node:

    +
    +
    +
    $ cephadm shell -- ceph mgr fail <mgr_instance>
    +
    +
    +
    +
      +
    • +

      Replace <mgr_instance> with the Ceph Manager daemon to fail.

      +
    • +
    +
    +
  6. +
  7. +

    From the cephadm shell, remove the labels on the source node:

    +
    +
    +
    $ for label in mon mgr _admin; do
    +    cephadm shell -- ceph orch host label rm <source_node> $label;
    +done
    +
    +
    +
    +
      +
    • +

      Replace <source_node> with the hostname of the source node.

      +
    • +
    +
    +
  8. +
  9. +

    Remove the running Ceph Monitor daemon from the source node:

    +
    +
    +
    $ cephadm shell -- ceph orch daemon rm mon.<source_node> --force
    +
    +
    +
  10. +
  11. +

    Drain the source node:

    +
    +
    +
    $ cephadm shell -- ceph orch host drain <source_node>
    +
    +
    +
  12. +
  13. +

    Remove the source node host from the Red Hat Ceph Storage cluster:

    +
    +
    +
    $ cephadm shell -- ceph orch host rm <source_node> --force
    +
    +
    +
    + + + + + +
    + + +
    +

    The source node is no longer part of the cluster and should not appear in the Red Hat Ceph Storage host list when cephadm shell -- ceph orch host ls is run. However, if you run sudo podman ps on the source node, the list might show that both Ceph Monitors and Ceph Managers are still up and running.

    +
    +
    +
    +
    [root@oc0-controller-1 ~]# sudo podman ps
    +CONTAINER ID  IMAGE                                                                                        COMMAND               CREATED         STATUS             PORTS       NAMES
    +5c1ad36472bc  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mon.oc0-contro...  35 minutes ago  Up 35 minutes ago              ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1
    +3b14cc7bf4dd  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mgr.oc0-contro...  35 minutes ago  Up 35 minutes ago              ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd
    +
    +
    +
    +

    To clean up the existing containers and remove the cephadm data from the source node, contact Red Hat Support.

    +
    +
    +
    +
  14. +
  15. +

    Confirm that mons are still in quorum:

    +
    +
    +
    $ cephadm shell -- ceph -s
    +$ cephadm shell -- ceph orch ps | grep -i mon
    +
    +
    +
  16. +
+
+
+
Next steps
+

Proceed to the next step Migrating the Ceph Monitor IP address.

+
+
+
+
Migrating the Ceph Monitor IP address
+
+

You must migrate your Ceph Monitor IP addresses to the target Red Hat Ceph Storage nodes. The +IP address migration assumes that the target nodes are originally deployed by +director and that the network configuration is managed by +os-net-config.

+
+
+
Procedure
+
    +
  1. +

    Get the original Ceph Monitor IP address from the existing /etc/ceph/ceph.conf file on the mon_host line, for example:

    +
    +
    +
    mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
    +
    +
    +
  2. +
  3. +

    Confirm that the Ceph Monitor IP address is present in the os-net-config configuration that is located in the /etc/os-net-config directory on the source node:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ grep "172.17.3.60" /etc/os-net-config/config.yaml
    +    - ip_netmask: 172.17.3.60/24
    +
    +
    +
  4. +
  5. +

    Edit the /etc/os-net-config/config.yaml file and remove the ip_netmask line.

    +
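    A minimal sketch of that edit, using the example monitor address from this procedure; back up the file first and adjust the address to your environment:

    sudo cp /etc/os-net-config/config.yaml /etc/os-net-config/config.yaml.bak
    # Delete the line that carries the monitor address (172.17.3.60/24 in this example).
    sudo sed -i '/ip_netmask: 172.17.3.60\/24/d' /etc/os-net-config/config.yaml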
  6. +
  7. +

    Save the file and refresh the node network configuration:

    +
    +
    +
    $ sudo os-net-config -c /etc/os-net-config/config.yaml
    +
    +
    +
  8. +
  9. +

    Verify that the IP address is not present in the source node anymore, for example:

    +
    +
    +
    [controller-0]$ ip -o a | grep 172.17.3.60
    +
    +
    +
  10. +
  11. +

    SSH into the target node, for example cephstorage-0, and add the IP address +for the new Ceph Monitor.

    +
  12. +
  13. +

    On the target node, edit /etc/os-net-config/config.yaml and add the - ip_netmask: 172.17.3.60/24 line that you removed from the source node, as sketched below.

    +
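    A hedged sketch of what the storage VLAN stanza can look like on the target node after the edit; the interface name, VLAN ID, and existing address are examples from this guide and will differ in your environment:

    - type: vlan
      vlan_id: 30
      device: nic2
      addresses:
      - ip_netmask: 172.17.3.23/24   # existing storage address of the target node
      - ip_netmask: 172.17.3.60/24   # migrated Ceph Monitor address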
  14. +
  15. +

    Save the file and refresh the node network configuration:

    +
    +
    +
    $ sudo os-net-config -c /etc/os-net-config/config.yaml
    +
    +
    +
  16. +
  17. +

    Verify that the IP address is present in the target node.

    +
    +
    +
    $ ip -o a | grep 172.17.3.60
    +
    +
    +
  18. +
  19. +

    From the source node, ping the IP address that you migrated to the target node and confirm that it is reachable:

    +
    +
    +
    [controller-0]$ ping -c 3 172.17.3.60
    +
    +
    +
  20. +
+
+
+
Next steps
+

Proceed to the next step Redeploying the Ceph Monitor on the target node.

+
+
+
+
Redeploying a Ceph Monitor on the target node
+
+

You use the IP address that you migrated to the target node to redeploy the +Ceph Monitor on the target node.

+
+
+
Procedure
+
    +
  1. +

    Get the Ceph mon spec:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch ls --export mon > mon.yaml
    +
    +
    +
  2. +
  3. +

    Edit the retrieved spec and add the unmanaged: true keyword:

    +
    +
    +
    service_type: mon
    +service_id: mon
    +placement:
    +  label: mon
    +unmanaged: true
    +
    +
    +
  4. +
  5. +

    Save the spec in the /tmp/mon.yaml file.

    +
  6. +
  7. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mon.yaml
    +$ ceph orch apply -i /mnt/mon.yaml
    +
    +
    +
    +

    The Ceph Monitor daemons are marked as unmanaged, and you can now redeploy the existing daemon and bind it to the migrated IP address.

    +
    +
  8. +
  9. +

    Delete the existing Ceph Monitor on the target node:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch daemon rm mon.<target_node> --force
    +
    +
    +
    +
      +
    • +

      Replace <target_node> with the hostname of the target node that is included in the Red Hat Ceph Storage cluster.

      +
    • +
    +
    +
  10. +
  11. +

    Redeploy the new Ceph Monitor on the target node by using the migrated IP address:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch daemon add mon <target_node>:<ip_address>
    +
    +
    +
    +
      +
    • +

      Replace <ip_address> with the migrated IP address.

      +
    • +
    +
    +
  12. +
  13. +

    Get the Ceph Monitor spec:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch ls --export mon > mon.yaml
    +
    +
    +
  14. +
  15. +

    Edit the retrieved spec and set the unmanaged keyword to false:

    +
    +
    +
    service_type: mon
    +service_id: mon
    +placement:
    +  label: mon
    +unmanaged: false
    +
    +
    +
  16. +
  17. +

    Save the spec in the /tmp/mon.yaml file.

    +
  18. +
  19. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mon.yaml
    +$ ceph orch apply -i /mnt/mon.yaml
    +
    +
    +
    +

    The new Ceph Monitor runs on the target node with the original IP address.

    +
    +
  20. +
  21. +

    Identify the running mgr:

    +
    +
    +
    $ sudo cephadm shell --  mgr stat
    +
    +
    +
  22. +
  23. +

    Refresh the Ceph Manager information by force-failing it:

    +
    +
    +
    $ sudo cephadm shell -- ceph mgr fail
    +
    +
    +
  24. +
  25. +

    Refresh the OSD information:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch reconfig osd.default_drive_group
    +
    +
    +
  26. +
+
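
Optionally, before you verify the overall cluster health, you can confirm that the redeployed Ceph Monitor is bound to the migrated IP address. A minimal sketch, assuming the 172.17.3.60 address from the previous examples:

+
+
+
$ sudo cephadm shell -- ceph mon dump | grep 172.17.3.60
+
+
+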
+ +
+
+
Verifying the Red Hat Ceph Storage cluster after Ceph Monitor migration
+
+

After you finish migrating your Ceph Monitor daemons to the target nodes, verify that the Red Hat Ceph Storage cluster is healthy.

+
+
+
Procedure
+
    +
  • +

    Verify that the Red Hat Ceph Storage cluster is healthy:

    +
    +
    +
    [ceph: root@oc0-controller-0 specs]# ceph -s
    +  cluster:
    +    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    +    health: HEALTH_OK
    +...
    +...
    +
    +
    +
  • +
+
+
+
+
+
+
+
+ + + + + + + \ No newline at end of file diff --git a/user/images/.gitkeep b/user/images/.gitkeep new file mode 100644 index 000000000..e69de29bb diff --git a/user/index.html b/user/index.html new file mode 100644 index 000000000..3c672b3e4 --- /dev/null +++ b/user/index.html @@ -0,0 +1,12732 @@ + + + + + + + +Adopting a Red Hat OpenStack Platform 17.1 deployment + + + + + + + +
+
+

Red Hat OpenStack Services on OpenShift Antelope adoption overview

+
+
+

Adoption is the process of migrating an OpenStack (OSP) Wallaby overcloud to a Red Hat OpenStack Services on OpenShift Antelope data plane. To ensure that you understand the entire adoption process and how to sufficiently prepare your OSP environment, review the prerequisites, adoption process, and post-adoption tasks.

+
+
+ + + + + +
+ + +It is important to read the whole adoption guide before you start +the adoption. You should form an understanding of the procedure, +prepare the necessary configuration snippets for each service ahead of +time, and test the procedure in a representative test environment +before you adopt your main environment. +
+
+
+

Adoption limitations

+
+

The adoption process does not support the following features:

+
+
+
    +
  • +

    OpenStack (OSP) Wallaby multi-cell deployments

    +
  • +
  • +

    Fast Data path

    +
  • +
  • +

    instanceHA

    +
  • +
  • +

    Auto-scaling

    +
  • +
  • +

    DCN

    +
  • +
  • +

    Designate

    +
  • +
  • +

    Octavia

    +
  • +
+
+
+

If you plan to adopt the Key Manager service (barbican) or a FIPS environment, review the following limitations:

+
+
+
    +
  • +

    The Key Manager service does not yet support all of the crypto plug-ins available in TripleO.

    +
  • +
  • +

    When you adopt an OSP Wallaby FIPS environment to RHOSO Antelope, your adopted cluster remains a FIPS cluster. There is no option to change the FIPS status during adoption. If your cluster is FIPS-enabled, you must deploy a FIPS OpenShift cluster to adopt your OSP Wallaby FIPS control plane. For more information about enabling FIPS in OCP, see Support for FIPS cryptography in the OCP Installing guide.

    +
  • +
+
+
+
+

Adoption prerequisites

+
+

Before you begin the adoption procedure, complete the following prerequisites:

+
+
+
+
Planning information
+
+
+ +
+
+
Back-up information
+
+
+ +
+
+
Compute
+
+
+ +
+
+
ML2/OVS
+
+
+
    +
  • +

    If you use the Modular Layer 2 plug-in with Open vSwitch mechanism driver (ML2/OVS), migrate it to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver. For more information, see Migrating to the OVN mechanism driver.

    +
  • +
+
+
+
Tools
+
+
+
    +
  • +

    Install the oc command line tool on your workstation.

    +
  • +
  • +

    Install the podman command line tool on your workstation.

    +
  • +
+
+
+
OSP Wallaby hosts
+
+
+
    +
  • +

    All control plane and data plane hosts of the OSP Wallaby cloud are up and running, and continue to run throughout the adoption procedure.

    +
  • +
+
+
+
+
+
+
+

Guidelines for planning the adoption

+
+

When planning to adopt a Red Hat OpenStack Services on OpenShift (RHOSO) Antelope environment, consider the scope of the change. An adoption is similar in scope to a data center upgrade. Different firmware levels, hardware vendors, hardware profiles, networking interfaces, storage interfaces, and so on affect the adoption process and can cause changes in behavior during the adoption.

+
+
+

Review the following guidelines to adequately plan for the adoption and increase the chance that you complete the adoption successfully:

+
+
+ + + + + +
+ + +All commands in the adoption documentation are examples. Do not copy and paste the commands verbatim. +
+
+
+
    +
  • +

    To minimize the risk of an adoption failure, reduce the number of environmental differences between the staging environment and the production sites.

    +
  • +
  • +

    If the staging environment is not representative of the production sites, or a staging environment is not available, then you must plan to include contingency time in case the adoption fails.

    +
  • +
  • +

    Review your custom OpenStack (OSP) service configuration at every major release.

    +
    +
      +
    • +

      Every major RHOSO release upgrades through multiple OSP versions.

      +
    • +
    • +

      Each new OSP version might deprecate configuration options or change the format of the configuration.

      +
    • +
    +
    +
  • +
  • +

    Prepare a Method of Procedure (MOP) that is specific to your environment to reduce the risk of variance or omitted steps when running the adoption process.

    +
  • +
  • +

    You can use representative hardware in a staging environment to prepare a MOP and validate any content changes.

    +
    +
      +
    • +

      Include a cross-section of firmware versions, additional interface or device hardware, and any additional software in the representative staging environment to ensure that it is broadly representative of the variety that is present in the production environments.

      +
    • +
    • +

      Ensure that you validate any Red Hat Enterprise Linux update or upgrade in the representative staging environment.

      +
    • +
    +
    +
  • +
  • +

    Use Satellite for localized and version-pinned RPM content where your data plane nodes are located.

    +
  • +
  • +

    In the production environment, use the content that you tested in the staging environment.

    +
  • +
+
+
+
+

Adoption process overview

+
+

Familiarize yourself with the steps of the adoption process and the optional post-adoption tasks.

+
+ +
+
Post-adoption tasks
+ +
+
+
+

Identity service authentication

+
+

If you have custom policies enabled, contact Red Hat Support before adopting a TripleO OpenStack deployment. You must complete the following steps for adoption:

+
+
+
    +
  1. +

    Remove custom policies.

    +
  2. +
  3. +

    Run the adoption.

    +
  4. +
  5. +

    Re-add custom policies by using the new SRBAC syntax.

    +
  6. +
+
+
+

After you adopt a TripleO-based OpenStack deployment to a Red Hat OpenStack Services on OpenShift deployment, the Identity service performs user authentication and authorization by using Secure RBAC (SRBAC). If SRBAC is already enabled, then there is no change to how you perform operations. If SRBAC is disabled, then adopting a TripleO-based OpenStack deployment might change how you perform operations due to changes in API access policies.

+
+
+

For more information on SRBAC, see Secure role based access control in Red Hat OpenStack Services on OpenShift in Performing security operations.

+
+
+
+

Configuring the network for the RHOSO deployment

+
+

With OpenShift, the network is a critical aspect of the deployment, and it is important to plan it carefully. The general network requirements for the OpenStack (OSP) services are not much different from the ones in a TripleO deployment, but the way you handle them is.

+
+
+ + + + + +
+ + +For more information about the network architecture and configuration, see +Deploying Red Hat OpenStack Platform 18.0 Development Preview 3 on Red Hat OpenShift Container Platform and About +networking in OpenShift Container Platform 4.15 Documentation. This document will address concerns specific to adoption. +
+
+
+

When adopting a new OSP deployment, it is important to align the network +configuration with the adopted cluster to maintain connectivity for existing +workloads.

+
+
+

The following logical configuration steps will incorporate the existing network +configuration:

+
+
+
    +
  • +

    configure OCP worker nodes to align VLAN tags and IPAM +configuration with the existing deployment.

    +
  • +
  • +

    configure Control Plane services to use compatible IP ranges for +service and load balancing IPs.

    +
  • +
  • +

    configure Data Plane nodes to use corresponding compatible configuration for +VLAN tags and IPAM.

    +
  • +
+
+
+

Specifically,

+
+
+
    +
  • +

    IPAM configuration will either be reused from the +existing deployment or, depending on IP address availability in the +existing allocation pools, new ranges will be defined to be used for the +new control plane services. If so, IP routing will be configured between +the old and new ranges. For more information, see Planning your IPAM configuration.

    +
  • +
  • +

    VLAN tags will be reused from the existing deployment.

    +
  • +
+
+
+

Retrieving the network configuration from your existing deployment

+
+

Let’s first determine which isolated networks are defined in the existing +deployment. You can find the network configuration in the network_data.yaml +file. For example,

+
+
+
+
- name: InternalApi
+  mtu: 1500
+  vip: true
+  vlan: 20
+  name_lower: internal_api
+  dns_domain: internal.mydomain.tld.
+  service_net_map_replace: internal
+  subnets:
+    internal_api_subnet:
+      ip_subnet: '172.17.0.0/24'
+      allocation_pools: [{'start': '172.17.0.4', 'end': '172.17.0.250'}]
+
+
+
+

You should make a note of the VLAN tag used (vlan key) and the IP range +(ip_subnet key) for each isolated network. The IP range will later be split +into separate pools for control plane services and load balancer IP addresses.

+
+
+
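
For example, a minimal sketch of listing the relevant keys, assuming that network_data.yaml is in the current directory:

+
+
+
$ grep -E 'name:|vlan:|ip_subnet:|allocation_pools:' network_data.yaml
+
+
+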

You should also determine the list of IP addresses already consumed in the +adopted environment. Consult tripleo-ansible-inventory.yaml file to find this +information. In the file, for each listed host, note IP and VIP addresses +consumed by the node.

+
+
+

For example,

+
+
+
+
Standalone:
+  hosts:
+    standalone:
+      ...
+      internal_api_ip: 172.17.0.100
+    ...
+  ...
+standalone:
+  children:
+    Standalone: {}
+  vars:
+    ...
+    internal_api_vip: 172.17.0.2
+    ...
+
+
+
+

In the example above, note that the 172.17.0.2 and 172.17.0.100 are +consumed and won’t be available for the new control plane services, at least +until the adoption is complete.

+
+
+

Repeat the process for each isolated network and each host in the +configuration.

+
+
+
+

At the end of this process, you should have the following information:

+
+
+
    +
  • +

    A list of isolated networks used in the existing deployment.

    +
  • +
  • +

    For each of the isolated networks, the VLAN tag and IP ranges used for +dynamic address allocation.

    +
  • +
  • +

    A list of existing IP address allocations used in the environment. You will +later exclude these addresses from allocation pools available for the new +control plane services.

    +
  • +
+
+
+
+

Planning your IPAM configuration

+
+

The new deployment model puts additional burden on the size of IP allocation +pools available for OpenStack (OSP) services. This is because each service deployed +on OpenShift worker nodes will now require an IP address from the IPAM pool (in +the previous deployment model, all services hosted on a controller node shared +the same IP address.)

+
+
+

Since the new control plane deployment has different requirements as to the +number of IP addresses available for services, it may even be impossible to +reuse the existing IP ranges used in adopted environment, depending on its +size. Prudent planning is required to determine which options are available in +your particular case.

+
+
+

The total number of IP addresses required for the new control plane services, +in each isolated network, is calculated as a sum of the following:

+
+
+
    +
  • +

    The number of OCP worker nodes. (Each node will require 1 IP address in +NodeNetworkConfigurationPolicy custom resources (CRs).)

    +
  • +
  • +

    The number of IP addresses required for the data plane nodes. (Each node will require +an IP address from NetConfig CRs.)

    +
  • +
  • +

    The number of IP addresses required for control plane services. (Each service +will require an IP address from NetworkAttachmentDefinition CRs.) This +number depends on the number of replicas for each service.

    +
  • +
  • +

    The number of IP addresses required for load balancer IP addresses. (Each +service will require a VIP address from IPAddressPool CRs.)

    +
  • +
+
+
+

As of the time of writing, the simplest single worker node OCP deployment +(CRC) has the following IP ranges defined (for the internalapi network):

+
+
+
    +
  • +

    1 IP address for the single worker node;

    +
  • +
  • +

    1 IP address for the data plane node;

    +
  • +
  • +

    NetworkAttachmentDefinition CRs for control plane services: +X.X.X.30-X.X.X.70 (41 addresses);

    +
  • +
  • +

    IPAddressPool CRs for load balancer IPs: X.X.X.80-X.X.X.90 (11 addresses).

    +
  • +
+
+
+

This comes to a total of 54 IP addresses allocated to the internalapi allocation pools.

+
+
+

The exact requirements may differ depending on the list of OSP services +to be deployed, their replica numbers, as well as the number of OCP +worker nodes and data plane nodes.

+
+
+

Additional IP addresses may be required in future OSP releases, so it is +advised to plan for some extra capacity, for each of the allocation pools used +in the new environment.

+
+
+

Once you know the required IP pool size for the new deployment, you can choose +one of the following scenarios to handle IPAM allocation in the new +environment.

+
+
+

The first listed scenario is more general and implies using new IP ranges, +while the second scenario implies reusing the existing ranges. The end state of +the former scenario is using the new subnet ranges for control plane services, +but keeping the old ranges, with their node IP address allocations intact, for +data plane nodes.

+
+
+

Regardless of the IPAM scenario, the VLAN tags used in the existing deployment will be reused in the new deployment. Depending on the scenario, the IP address ranges to be used for control plane services will be either reused from the old deployment or defined anew. Adjust the configuration as described in Configuring isolated networks.

+
+
+
Scenario 1: Using new subnet ranges
+
+

This scenario is compatible with any existing subnet configuration, and can be +used even when the existing cluster subnet ranges don’t have enough free IP +addresses for the new control plane services.

+
+
+

The general idea here is to define new IP ranges for control plane services that belong to a different subnet that was not used in the existing cluster. Then, configure link-local IP routing between the old and new subnets to allow old and new service deployments to communicate. This involves using the TripleO mechanism on the pre-adopted cluster to configure additional link-local routes there, which allows the EDPM deployment to reach the adopted nodes by using their old subnet addresses.

+
+
+

The new subnet should be sized appropriately to accommodate the new control plane services, but otherwise it has no specific requirements with regard to the allocation pools that are already consumed in the existing deployment. In fact, the size requirements for the new subnet are lower than in the second scenario, because the old subnet ranges are kept for the adopted nodes, which means the adopted nodes do not consume any IP addresses from the new range.

+
+
+

In this scenario, you will configure NetworkAttachmentDefinition custom resources (CRs) to use a +different subnet from what will be configured in NetConfig CR for the same +networks. The former range will be used for control plane services, +while the latter will be used to manage IPAM for data plane nodes.

+
+
+

During the process, you must make sure that the adopted node IP addresses do not change. This is achieved by listing the addresses in the fixedIP fields in the per-node section of the OpenStackDataPlaneNodeSet CR.

+
+
+
+

Before proceeding, configure host routes on the adopted nodes for the +control plane subnets.

+
+
+

To achieve this, you will need to re-run openstack overcloud node provision with additional routes entries added to network_config. (This change should be applied for every adopted node configuration.) For example, you may add the following to net_config.yaml:

+
+
+
+
network_config:
+  - type: ovs_bridge
+    name: br-ctlplane
+    routes:
+    - ip_netmask: 0.0.0.0/0
+      next_hop: 192.168.1.1
+    - ip_netmask: 172.31.0.0/24 (1)
+      next_hop: 192.168.1.100 (2)
+
+
+
+ + + + + + + + + +
1The new control plane subnet.
2The control plane IP address of the adopted data plane node.
+
+
+

Do the same for other networks that will need to use different subnets for the +new and old parts of the deployment.

+
+
+

Once done, run openstack overcloud node provision to apply the new configuration.

+
+
+

Note that network configuration changes are not applied by default to avoid the risk of network disruption. You must enforce the changes by setting StandaloneNetworkConfigUpdate: true in the TripleO configuration files.

+
+
+
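
A minimal sketch of such a setting; the environment file name below is an assumption, and you include it with the TripleO configuration files that you already pass to your deployment command:

+
+
+
# network_config_update.yaml (hypothetical environment file name)
+parameter_defaults:
+  StandaloneNetworkConfigUpdate: true
+
+
+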

Once openstack overcloud node provision is complete, you should see new link local routes to the +new subnet on each node. For example,

+
+
+
+
# ip route | grep 172
+172.31.0.0/24 via 192.168.122.100 dev br-ctlplane
+
+
+
+
+

The next step is to configure similar routes for the old subnet for control plane services attached to the networks. This is done by adding routes entries to +NodeNetworkConfigurationPolicy CRs for each network. For example,

+
+
+
+
      - destination: 192.168.122.0/24 (1)
+        next-hop-interface: ospbr (2)
+
+
+
+ + + + + + + + + +
1The isolated network’s original subnet on the data plane.
2The OCP worker network interface that corresponds to the isolated network on the data plane.
+
+
+

Once applied, you should eventually see the following route added to your OpenShift nodes.

+
+
+
+
# ip route | grep 192
+192.168.122.0/24 dev ospbr proto static scope link
+
+
+
+
+

At this point, you should be able to ping the adopted nodes from OCP nodes +using their old subnet addresses; and vice versa.

+
+
+
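
For example, you might check reachability in both directions; both addresses below come from the earlier examples and are assumptions for your environment:

+
+
+
# From an OCP worker node, ping the adopted node's old subnet address (example value)
+$ ping -c 3 192.168.122.100
+# From the adopted node, ping an address in the new control plane subnet (hypothetical value)
+$ ping -c 3 172.31.0.10
+
+
+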
+

Finally, during the data plane adoption, you will have to take care of several aspects:

+
+
+
    +
  • +

    in network_config, add link local routes to the new subnets, for example:

    +
  • +
+
+
+
+
  nodeTemplate:
+    ansible:
+      ansibleUser: root
+      ansibleVars:
+        additional_ctlplane_host_routes:
+        - ip_netmask: 172.31.0.0/24
+          next_hop: '{{ ctlplane_ip }}'
+        edpm_network_config_template: |
+          network_config:
+          - type: ovs_bridge
+            routes: {{ ctlplane_host_routes + additional_ctlplane_host_routes }}
+            ...
+
+
+
+
    +
  • +

    list the old IP addresses as ansibleHost and fixedIP, for example:

    +
  • +
+
+
+
+
  nodes:
+    standalone:
+      ansible:
+        ansibleHost: 192.168.122.100
+        ansibleUser: ""
+      hostName: standalone
+      networks:
+      - defaultRoute: true
+        fixedIP: 192.168.122.100
+        name: ctlplane
+        subnetName: subnet1
+
+
+
+
    +
  • +

    expand SSH range for the firewall configuration to include both subnets:

    +
  • +
+
+
+
+
        edpm_sshd_allowed_ranges:
+        - 192.168.122.0/24
+        - 172.31.0.0/24
+
+
+
+

This is to allow SSH access from the new subnet to the adopted nodes as well as +the old one.

+
+
+
+

Since you are applying new network configuration to the nodes, consider also +setting edpm_network_config_update: true to enforce the changes.

+
+
+
+

Note that the examples above are incomplete and should be incorporated into +your general configuration.

+
+
+
+
Scenario 2: Reusing existing subnet ranges
+
+

This scenario is only applicable when the existing subnet ranges have enough IP addresses for the new control plane services. In return, it avoids the additional routing configuration between the old and new subnets that is required in Scenario 1: Using new subnet ranges.

+
+
+

The general idea here is to instruct the new control plane services to use the +same subnet as in the adopted environment, but define allocation pools used by +the new services in a way that would exclude IP addresses that were already +allocated to existing cluster nodes.

+
+
+

This scenario implies that the remaining IP addresses in the existing subnet are enough for the new control plane services. If not, use Scenario 1: Using new subnet ranges instead. For more information, see Planning your IPAM configuration.

+
+
+

No special routing configuration is required in this scenario; the only thing +to pay attention to is to make sure that already consumed IP addresses don’t +overlap with the new allocation pools configured for OpenStack control plane services.

+
+
+

If you are especially constrained by the size of the existing subnet, you may have to apply elaborate exclusion rules when defining allocation pools for the new control plane services.

+
+
+
+
+

Configuring isolated networks

+
+

Before you begin replicating your existing VLAN and IPAM configuration in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must have the following IP address allocations for the new control plane services:

+
+
+
    +
  • +

    1 IP address for each isolated network on each OpenShift worker node. You configure these IP addresses in the NodeNetworkConfigurationPolicy custom resources (CRs) for the OCP worker nodes. For more information, see Configuring OCP worker nodes.

    +
  • +
  • +

    1 IP range for each isolated network for the data plane nodes. You configure these ranges in the NetConfig CRs for the data plane nodes. For more information, see Configuring data plane nodes.

    +
  • +
  • +

    1 IP range for each isolated network for control plane services. These ranges +enable pod connectivity for isolated networks in the NetworkAttachmentDefinition CRs. For more information, see Configuring the networking for control plane services.

    +
  • +
  • +

    1 IP range for each isolated network for load balancer IP addresses. These IP ranges define load balancer IP addresses for MetalLB in the IPAddressPool CRs. For more information, see Configuring the networking for control plane services.

    +
  • +
+
+
+ + + + + +
+ + +The exact list and configuration of isolated networks in the following procedures should reflect the actual OpenStack environment. The number of isolated networks might differ from the examples used in the procedures. The IPAM scheme might also differ. Only the parts of the configuration that are relevant to configuring networks are shown. The values that are used in the following procedures are examples. Use values that are specific to your configuration. +
+
+
+
Configuring isolated networks on OCP worker nodes
+
+

To connect service pods to isolated networks on OpenShift worker nodes that run OpenStack services, physical network configuration on the hypervisor is required.

+
+
+

This configuration is managed by the NMState operator, which uses NodeNetworkConfigurationPolicy custom resources (CRs) to define the desired network configuration for the nodes.

+
+
+
Procedure
+
    +
  • +

    For each OCP worker node, define a NodeNetworkConfigurationPolicy CR that describes the desired network configuration. For example:

    +
    +
    +
    apiVersion: v1
    +items:
    +- apiVersion: nmstate.io/v1
    +  kind: NodeNetworkConfigurationPolicy
    +  spec:
    +    desiredState:
    +      interfaces:
    +      - description: internalapi vlan interface
    +        ipv4:
    +          address:
    +          - ip: 172.17.0.10
    +            prefix-length: 24
    +          dhcp: false
    +          enabled: true
    +        ipv6:
    +          enabled: false
    +        name: enp6s0.20
    +        state: up
    +        type: vlan
    +        vlan:
    +          base-iface: enp6s0
    +          id: 20
    +          reorder-headers: true
    +      - description: storage vlan interface
    +        ipv4:
    +          address:
    +          - ip: 172.18.0.10
    +            prefix-length: 24
    +          dhcp: false
    +          enabled: true
    +        ipv6:
    +          enabled: false
    +        name: enp6s0.21
    +        state: up
    +        type: vlan
    +        vlan:
    +          base-iface: enp6s0
    +          id: 21
    +          reorder-headers: true
    +      - description: tenant vlan interface
    +        ipv4:
    +          address:
    +          - ip: 172.19.0.10
    +            prefix-length: 24
    +          dhcp: false
    +          enabled: true
    +        ipv6:
    +          enabled: false
    +        name: enp6s0.22
    +        state: up
    +        type: vlan
    +        vlan:
    +          base-iface: enp6s0
    +          id: 22
    +          reorder-headers: true
    +    nodeSelector:
    +      kubernetes.io/hostname: ocp-worker-0
    +      node-role.kubernetes.io/worker: ""
    +
    +
    +
  • +
+
+
+ + + + + +
+ + +In IPv6, OpenShift worker nodes need a /64 prefix allocation due to OVN +limitations (RFC 4291). For dynamic IPv6 configuration, you need to change the +prefix allocation on the Router Advertisement settings. If you want to use +manual configuration for IPv6, define a similar CR to the +NodeNetworkConfigurationPolicy CR example in this procedure, and define an +IPv6 address and disable IPv4. Because the constraint for the /64 prefix did +not exist in TripleO, your OSP +control plane network might not have enough capacity to allocate these +networks. If that is the case, allocate a prefix that fits a large enough number +of addresses, for example, /60. The prefix depends on the number of worker nodes you have. +
+
+
+
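
After you apply the NodeNetworkConfigurationPolicy CRs, you can check that the NMState operator has reconciled them. A minimal sketch; nncp and nnce are the short names for the policy and enactment resources:

+
+
+
$ oc get nncp
+$ oc get nnce
+
+
+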
+
Configuring isolated networks on control plane services
+
+

After the NMState operator creates the desired hypervisor network configuration for isolated networks, you must configure the OpenStack (OSP) services to use the configured interfaces. You define a NetworkAttachmentDefinition custom resource (CR) for each isolated network. In some clusters, these CRs are managed by the Cluster Network Operator, in which case you use Network CRs instead. For more information, see +Cluster Network Operator in Networking.

+
+
+
Procedure
+
    +
  1. +

    Define a NetworkAttachmentDefinition CR for each isolated network. +For example:

    +
    +
    +
    apiVersion: k8s.cni.cncf.io/v1
    +kind: NetworkAttachmentDefinition
    +metadata:
    +  name: internalapi
    +  namespace: openstack
    +spec:
    +  config: |
    +    {
    +      "cniVersion": "0.3.1",
    +      "name": "internalapi",
    +      "type": "macvlan",
    +      "master": "enp6s0.20",
    +      "ipam": {
    +        "type": "whereabouts",
    +        "range": "172.17.0.0/24",
    +        "range_start": "172.17.0.20",
    +        "range_end": "172.17.0.50"
    +      }
    +    }
    +
    +
    +
    + + + + + +
    + + +Ensure that the interface name and IPAM range match the configuration that you used in the NodeNetworkConfigurationPolicy CRs. +
    +
    +
  2. +
  3. +

    Optional: When reusing existing IP ranges, you can exclude part of the range that is used in the existing deployment by using the exclude parameter in the NetworkAttachmentDefinition pool. For example:

    +
    +
    +
    apiVersion: k8s.cni.cncf.io/v1
    +kind: NetworkAttachmentDefinition
    +metadata:
    +  name: internalapi
    +  namespace: openstack
    +spec:
    +  config: |
    +    {
    +      "cniVersion": "0.3.1",
    +      "name": "internalapi",
    +      "type": "macvlan",
    +      "master": "enp6s0.20",
    +      "ipam": {
    +        "type": "whereabouts",
    +        "range": "172.17.0.0/24",
    +        "range_start": "172.17.0.20", (1)
    +        "range_end": "172.17.0.50", (2)
    +        "exclude": [ (3)
    +          "172.17.0.24/32",
    +          "172.17.0.44/31"
    +        ]
    +      }
    +    }
    +
    +
    +
    + + + + + + + + + + + + + +
    1Defines the start of the IP range.
    2Defines the end of the IP range.
    3Excludes part of the IP range. This example excludes IP addresses 172.17.0.24/32 and 172.17.0.44/31 from the allocation pool.
    +
    +
  4. +
  5. +

    If your OSP services require load balancer IP addresses, define the pools for these services in an IPAddressPool CR. For example:

    +
    + + + + + +
    + + +The load balancer IP addresses belong to the same IP range as the control plane services, and are managed by MetalLB. This pool should also be aligned with the OSP configuration. +
    +
    +
    +
    +
    - apiVersion: metallb.io/v1beta1
    +  kind: IPAddressPool
    +  spec:
    +    addresses:
    +    - 172.17.0.60-172.17.0.70
    +
    +
    +
    +

    Define IPAddressPool CRs for each isolated network that requires load +balancer IP addresses.

    +
    +
  6. +
  7. +

    Optional: When reusing existing IP ranges, you can exclude part of the range by listing multiple entries in the addresses section of the IPAddressPool. For example:

    +
    +
    +
    - apiVersion: metallb.io/v1beta1
    +  kind: IPAddressPool
    +  spec:
    +    addresses:
    +    - 172.17.0.60-172.17.0.64
    +    - 172.17.0.66-172.17.0.70
    +
    +
    +
    +

    The example above would exclude the 172.17.0.65 address from the allocation +pool.

    +
    +
  8. +
+
+
+
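
You can confirm that the resources exist before you continue. A minimal sketch, assuming the openstack namespace for the NetworkAttachmentDefinition CRs and the metallb-system namespace for the IPAddressPool CRs:

+
+
+
$ oc get network-attachment-definitions -n openstack
+$ oc get ipaddresspools -n metallb-system
+
+
+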
+
Configuring isolated networks on data plane nodes
+
+

Data plane nodes are configured by the OpenStack Operator and your OpenStackDataPlaneNodeSet custom resources (CRs). The OpenStackDataPlaneNodeSet CRs define your desired network configuration for the nodes.

+
+
+

Your Red Hat OpenStack Services on OpenShift (RHOSO) network configuration should reflect the existing OpenStack (OSP) network setup. You must pull the net_config.yaml files from each OSP node and reuse them when you define the OpenStackDataPlaneNodeSet CRs. The format of the configuration does not change, so you can put network templates under edpm_network_config_template variables, either for all nodes or for each node.

+
+
+

To ensure that the latest network configuration is used during the data plane adoption, you should also set edpm_network_config_update: true in the nodeTemplate field of the OpenStackDataPlaneNodeSet CR.

+
+
+
Procedure
+
    +
  1. +

    Configure a NetConfig CR with your desired VLAN tags and IPAM configuration. For example:

    +
    +
    +
    apiVersion: network.openstack.org/v1beta1
    +kind: NetConfig
    +metadata:
    +  name: netconfig
    +spec:
    +  networks:
    +  - name: internalapi
    +    dnsDomain: internalapi.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.17.0.250
    +        start: 172.17.0.100
    +      cidr: 172.17.0.0/24
    +      vlan: 20
    +  - name: storage
    +    dnsDomain: storage.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.18.0.250
    +        start: 172.18.0.100
    +      cidr: 172.18.0.0/24
    +      vlan: 21
    +  - name: tenant
    +    dnsDomain: tenant.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.19.0.250
    +        start: 172.19.0.100
    +      cidr: 172.19.0.0/24
    +      vlan: 22
    +
    +
    +
  2. +
  3. +

    Optional: In the NetConfig CR, list multiple ranges for the allocationRanges field to exclude some of the IP addresses, for example, to accommodate IP addresses that are already consumed by the adopted environment:

    +
    +
    +
    apiVersion: network.openstack.org/v1beta1
    +kind: NetConfig
    +metadata:
    +  name: netconfig
    +spec:
    +  networks:
    +  - name: internalapi
    +    dnsDomain: internalapi.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.17.0.199
    +        start: 172.17.0.100
    +      - end: 172.17.0.250
    +        start: 172.17.0.201
    +      cidr: 172.17.0.0/24
    +      vlan: 20
    +
    +
    +
    +

    This example excludes the 172.17.0.200 address from the pool.

    +
    +
  4. +
+
+
+
+
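
After you create the NetConfig CR, you can inspect it to confirm the subnets, VLAN tags, and any excluded ranges. A minimal sketch, assuming the openstack namespace and the netconfig name used in the examples:

+
+
+
$ oc get netconfig netconfig -n openstack -o yaml
+
+
+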
+
+

Storage requirements

+
+

Storage in an OpenStack (OSP) deployment refers to the following types:

+
+
+
    +
  • +

    The storage that is needed for the service to run

    +
  • +
  • +

    The storage that the service manages

    +
  • +
+
+
+

Before you can deploy the services in Red Hat OpenStack Services on OpenShift (RHOSO), you must review the storage requirements, plan your OpenShift node selection, prepare your OCP nodes, and so on.

+
+
+

Storage driver certification

+
+

Before you adopt your OpenStack Wallaby deployment to a Red Hat OpenStack Services on OpenShift (RHOSO) Antelope deployment, confirm that your deployed storage drivers are certified for use with RHOSO Antelope.

+
+
+

For information on software certified for use with RHOSO Antelope, see the Red Hat Ecosystem Catalog.

+
+
+
+

Block Storage service guidelines

+
+

Prepare to adopt your Block Storage service (cinder):

+
+
+
    +
  • +

    Take note of the Block Storage service back ends that you use.

    +
  • +
  • +

    Determine all the transport protocols that the Block Storage service back ends use, such as +RBD, iSCSI, FC, NFS, NVMe-TCP, and so on. You must consider them when you place the Block Storage services and when the right storage transport-related binaries are running on the OpenShift nodes. For more information about each storage transport protocol, see OCP preparation for Block Storage service adoption.

    +
  • +
  • +

    Use a Block Storage service volume service to deploy each Block Storage service volume back end.

    +
    +

    For example, if you have an LVM back end and a Ceph back end, you have two entries in cinderVolumes, and you cannot set global defaults for all volume services. You must define a service for each of them:

    +
    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  cinder:
    +    enabled: true
    +    template:
    +      cinderVolumes:
    +        lvm:
    +          customServiceConfig: |
    +            [DEFAULT]
    +            debug = True
    +            [lvm]
    +< . . . >
    +        ceph:
    +          customServiceConfig: |
    +            [DEFAULT]
    +            debug = True
    +            [ceph]
    +< . . . >
    +
    +
    +
    + + + + + +
    + + +Check that all configuration options are still valid for RHOSO Antelope version. Configuration options might be deprecated, removed, or added. This applies to both back-end driver-specific configuration options and other generic options. +
    +
    +
  • +
+
+
+

There are two ways to prepare a Block Storage service configuration for adoption: you can customize the configuration or prepare a quick configuration. There is no difference in how the Block Storage service operates with either method, but customization is recommended whenever possible.

+
+
+
Preparing the Block Storage service by using an agnostic configuration file
+
+

The quick and dirty process is more straightforward:

+
+
+
Procedure
+
    +
  1. +

    Create an agnostic configuration file by removing any specifics from the old deployment's cinder.conf file, such as the connection in the [database] section, the transport_url and log_dir in [DEFAULT], and the whole [coordination] and [barbican] sections.

    +
  2. +
  3. +

    Assuming the configuration has sensitive information, drop the modified +contents of the whole file into a Secret.

    +
  4. +
  5. +

    Reference this secret in all the services, create a Block Storage service (cinder) volumes section for each back end, and add the respective enabled_backends option.

    +
  6. +
  7. +

    Add external files as mentioned in the last bullet of the tailor-made +configuration explanation.

    +
  8. +
+
+
+

Example of what the quick and dirty configuration patch would look like:

+
+
+
+
   spec:
+     cinder:
+       enabled: true
+       template:
+         cinderAPI:
+           customServiceConfigSecrets:
+             - cinder-conf
+         cinderScheduler:
+           customServiceConfigSecrets:
+             - cinder-conf
+         cinderBackup:
+           customServiceConfigSecrets:
+             - cinder-conf
+         cinderVolume:
+           lvm1:
+             customServiceConfig: |
+               [DEFAULT]
+               enabled_backends = lvm1
+             customServiceConfigSecrets:
+               - cinder-conf
+           lvm2:
+             customServiceConfig: |
+               [DEFAULT]
+               enabled_backends = lvm2
+             customServiceConfigSecrets:
+               - cinder-conf
+
+
+
+
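
The cinder-conf Secret that the patch references must exist before the services are deployed. A minimal sketch of creating it from the sanitized configuration file; the file name, key name, and openstack namespace are assumptions for your environment:

+
+
+
$ oc create secret generic cinder-conf -n openstack --from-file=cinder.conf=./cinder.conf
+
+
+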
+
About the Block Storage service configuration generation helper tool
+
+

Creating the right Block Storage service (cinder) configuration files to deploy by using operators can sometimes be complicated, especially the first time, so a helper tool is available that can create a draft of the files from a cinder.conf file.

+
+
+

This tool is not meant to be an automation tool. It is mostly intended to help you understand the process, and to point out some potential pitfalls and reminders.

+
+
+ + + + + +
The tool requires the PyYAML Python package to be installed (pip install PyYAML).
+
+
+

This cinder-cfg.py script defaults to reading the +cinder.conf file from the current directory (unless --config option is used) +and outputs files to the current directory (unless --out-dir option is used).

+
+
+

In the output directory, you always get a cinder.patch file with the Cinder-specific configuration patch to apply to the OpenStackControlPlane custom resource. You might also get an additional file called cinder-prereq.yaml with some Secrets and MachineConfigs, and an openstackversion.yaml file with the OpenStackVersion sample.

+
+
+

Example of an invocation setting input and output explicitly to the defaults for +a Ceph backend:

+
+
+
+
$ python cinder-cfg.py --config cinder.conf --out-dir ./
+WARNING:root:The {block_storage} is configured to use ['/etc/cinder/policy.yaml'] as policy file, please ensure this file is available for the control plane {block_storage} services using "extraMounts" or remove the option.
+
+WARNING:root:Deployment uses Ceph, so make sure the Ceph credentials and configuration are present in OpenShift as a asecret and then use the extra volumes to make them available in all the services that would need them.
+
+WARNING:root:You were using user ['nova'] to talk to Nova, but in podified using the service keystone username is preferred in this case ['cinder']. Dropping that configuration.
+
+WARNING:root:ALWAYS REVIEW RESULTS, OUTPUT IS JUST A ROUGH DRAFT!!
+
+Output written at ./: cinder.patch
+
+
+
+

The script outputs some warnings to let you know about things that you might need to do manually, such as adding the custom policy or providing the Ceph configuration files, and it also lets you know that the service_user configuration has been dropped.

+
+
+

A different example when using multiple backends, one of them being a 3PAR FC +could be:

+
+
+
+
$ python cinder-cfg.py --config cinder.conf --out-dir ./
+WARNING:root:The {block_storage} is configured to use ['/etc/cinder/policy.yaml'] as policy file, please ensure this file is available for the control plane Block Storage services using "extraMounts" or remove the option.
+
+ERROR:root:Backend hpe_fc requires a vendor container image, but there is no certified image available yet. Patch will use the last known image for reference, but IT WILL NOT WORK
+
+WARNING:root:Deployment uses Ceph, so make sure the Ceph credentials and configuration are present in OpenShift as a asecret and then use the extra volumes to make them available in all the services that would need them.
+
+WARNING:root:You were using user ['nova'] to talk to Nova, but in podified using the service keystone username is preferred, in this case ['cinder']. Dropping that configuration.
+
+WARNING:root:Configuration is using FC, please ensure all your OpenShift nodes have HBAs or use labels to ensure that Volume and Backup services are scheduled on nodes with HBAs.
+
+WARNING:root:ALWAYS REVIEW RESULTS, OUTPUT IS JUST A ROUGH DRAFT!!
+
+Output written at ./: cinder.patch, cinder-prereq.yaml
+
+
+
+

In this case there are additional messages. The following list provides an explanation of each one:

+
+
+
    +
  • +

    There is one message mentioning that this back-end driver needs external vendor dependencies, so the standard container image will not work. Unfortunately, this image is not available yet, so an older image is used in the output patch file for reference. You can replace this image with one that you build, or with a Red Hat official image once it is available. In this case, you can see that your cinder.patch file contains an OpenStackVersion object:

    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackVersion
    +metadata:
    +  name: openstack
    +spec:
    +  customContainerImages:
    +    cinderVolumeImages:
    +      hpe-fc:
    +        containerImage: registry.connect.redhat.com/hpe3parcinder/openstack-cinder-volume-hpe3parcinder17-0
    +
    +
    +
    +

    The name of the OpenStackVersion must match the name of your OpenStackControlPlane, so in your case it may be other than openstack.

    +
    +
  • +
  • +

    The FC message reminds you that this transport protocol requires specific HBA +cards to be present on the nodes where Block Storage services are running.

    +
  • +
  • +

    In this case it has created the cinder-prereq.yaml file and within the file +there is one MachineConfig and one Secret. The MachineConfig is called 99-master-cinder-enable-multipathd and like the name suggests enables multipathing on all the OCP worker nodes. The Secret is +called openstackcinder-volumes-hpe_fc and contains the 3PAR backend +configuration because it has sensitive information (credentials). The +cinder.patch file uses the following configuration:

    +
    +
    +
       cinderVolumes:
    +      hpe-fc:
    +        customServiceConfigSecrets:
    +        - openstackcinder-volumes-hpe_fc
    +
    +
    +
  • +
+
+
+
+
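
Once you have reviewed the generated files, you can apply them to your control plane. A minimal sketch, assuming that the OpenStackControlPlane CR is named openstack and is in the openstack namespace; apply cinder-prereq.yaml only if the tool generated it:

+
+
+
$ oc apply -f cinder-prereq.yaml
+$ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch-file=cinder.patch
+
+
+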
+

Limitations for adopting the Block Storage service

+
+

Before you begin the Block Storage service (cinder) adoption, review the following limitations:

+
+
+
    +
  • +

    There is no global nodeSelector option for all Block Storage service volumes. You must specify the nodeSelector for each back end.

    +
  • +
  • +

    There are no global customServiceConfig or customServiceConfigSecrets options for all Block Storage service volumes. You must specify these options for each back end.

    +
  • +
  • +

    Support for Block Storage service back ends that require kernel modules that are not included in Red Hat Enterprise Linux is not tested in Red Hat OpenStack Services on OpenShift (RHOSO).

    +
  • +
+
+
+
+

OCP preparation for Block Storage service adoption

+
+

Before you deploy OpenStack (OSP) on OpenShift nodes, ensure that the networks are ready, decide which OCP nodes to restrict, and make any necessary changes to the OCP nodes.

+
+
+
+
Node selection
+
+

You might need to restrict the OCP nodes where the Block Storage service volume and backup services run.

+
+

An example of when you need to restrict nodes for a specific Block Storage service is when you deploy the Block Storage service with the LVM driver. In that scenario, the LVM data where the volumes are stored only exists in a specific host, so you need to pin the Block Storage-volume service to that specific OCP node. Running the service on any other OCP node does not work. You cannot use the OCP host node name to restrict the LVM back end. You need to identify the LVM back end by using a unique label, an existing label, or a new label:

+
+
+
+
$ oc label nodes worker0 lvm=cinder-volumes
+
+
+
+
+
apiVersion: core.openstack.org/v1beta1
+kind: OpenStackControlPlane
+metadata:
+  name: openstack
+spec:
+  secret: osp-secret
+  storageClass: local-storage
+  cinder:
+    enabled: true
+    template:
+      cinderVolumes:
+        lvm-iscsi:
+          nodeSelector:
+            lvm: cinder-volumes
+< . . . >
+
+
+
+

For more information about node selection, see About node selectors.

+
+
+ + + + + +
+ + +
+

If your nodes do not have enough local disk space for temporary images, you can use a remote NFS location by setting the extra volumes feature, extraMounts.

+
+
+
+
+
Transport protocols
+
+

Some changes to the storage transport protocols might be required for OCP:

+
+
    +
  • +

    If you use a MachineConfig to make changes to OCP nodes, the nodes reboot.

    +
  • +
  • +

    Check the back-end sections that are listed in the enabled_backends configuration option in your cinder.conf file to determine the enabled storage back-end sections.

    +
  • +
  • +

    Depending on the back end, you can find the transport protocol by viewing the volume_driver or target_protocol configuration options.

    +
  • +
  • +

    The iscsid service, multipathd service, and NVMe-TCP kernel modules start automatically on data plane nodes.

    +
    +
    +
    NFS
    +
    +
    +
      +
    • +

      OCP connects to NFS back ends without additional changes.

      +
    • +
    +
    +
    +
    Rados Block Device and Ceph
    +
    +
    +
      +
    • +

      OCP connects to Ceph back ends without additional changes. You must provide credentials and configuration files to the services.

      +
    • +
    +
    +
    +
    iSCSI
    +
    +
    +
      +
    • +

      To connect to iSCSI volumes, the iSCSI initiator must run on the OCP hosts where the volume and backup services run. The Linux Open-iSCSI initiator does not support network namespaces, so you must run only one instance of the service, which is shared by normal OCP usage, the OCP CSI plugins, and the OSP services.

      +
    • +
    • +

      If you are not already running iscsid on the OCP nodes, then you must apply a MachineConfig. For example:

      +
      +
      +
      apiVersion: machineconfiguration.openshift.io/v1
      +kind: MachineConfig
      +metadata:
      +  labels:
      +    machineconfiguration.openshift.io/role: worker
      +    service: cinder
      +  name: 99-master-cinder-enable-iscsid
      +spec:
      +  config:
      +    ignition:
      +      version: 3.2.0
      +    systemd:
      +      units:
      +      - enabled: true
      +        name: iscsid.service
      +
      +
      +
    • +
    • +

      If you use labels to restrict the nodes where the Block Storage services run, you must use a MachineConfigPool to limit the effects of the +MachineConfig to the nodes where your services might run. For more information, see About node selectors.

      +
    • +
    • +

      If you are using a single node deployment to test the process, replace worker with master in the MachineConfig.

      +
    • +
    • +

      For production deployments that use iSCSI volumes, configure multipathing for better I/O.

      +
    • +
    +
    +
    +
    FC
    +
    +
    +
      +
    • +

      The Block Storage service volume and Block Storage service backup services must run in an OCP host that has host bus adapters (HBAs). If some nodes do not have HBAs, then use labels to restrict where these services run. For more information, see About node selectors.

      +
    • +
    • +

      If you have virtualized OCP clusters that use FC, you need to expose the host HBAs inside the virtual machine.

      +
    • +
    • +

      For production deployments that use FC volumes, configure multipathing for better I/O.

      +
    • +
    +
    +
    +
    NVMe-TCP
    +
    +
    +
      +
    • +

      To connect to NVMe-TCP volumes, load NVMe-TCP kernel modules on the OCP hosts.

      +
    • +
    • +

      If you do not already load the nvme-fabrics module on the OCP nodes where the volume and backup services are going to run, then you must apply a MachineConfig. For example:

      +
      +
      +
      apiVersion: machineconfiguration.openshift.io/v1
      +kind: MachineConfig
      +metadata:
      +  labels:
      +    machineconfiguration.openshift.io/role: worker
      +    service: cinder
      +  name: 99-master-cinder-load-nvme-fabrics
      +spec:
      +  config:
      +    ignition:
      +      version: 3.2.0
      +    storage:
      +      files:
      +        - path: /etc/modules-load.d/nvme_fabrics.conf
      +          overwrite: false
      +          # Mode must be decimal, this is 0644
      +          mode: 420
      +          user:
      +            name: root
      +          group:
      +            name: root
      +          contents:
      +            # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397.
      +            # This is the rfc2397 text/plain string format
      +            source: data:,nvme-fabrics
      +
      +
      +
    • +
    • +

      If you use labels to restrict the nodes where Block Storage +services run, use a MachineConfigPool to limit the effects of the MachineConfig to the nodes where your services run. For more information, see About node selectors.

      +
    • +
    • +

      If you use a single node deployment to test the process, replace worker with master in the MachineConfig.

      +
    • +
    • +

      Only load the nvme-fabrics module because it loads the transport-specific modules, such as TCP, RDMA, or FC, as needed.

      +
      +

      For production deployments that use NVMe-TCP volumes, it is recommended that you use multipathing. For NVMe-TCP volumes OCP uses native multipathing, called +ANA.

      +
      +
    • +
    • +

      After the OCP nodes reboot and load the nvme-fabrics module, you can confirm that the operating system is configured and that it supports ANA by checking the host:

      +
      +
      +
      $ cat /sys/module/nvme_core/parameters/multipath
      +
      +
      +
      + + + + + +
      + + +ANA does not use the Linux Multipathing Device Mapper, but OCP requires multipathd to run on Compute nodes for the Compute service (nova) to be able to use multipathing. Multipathing is automatically configured on data plane nodes when they are provisioned. +
      +
      +
    • +
    +
    +
    +
    Multipathing
    +
    +
    +
      +
    • +

      Multipathing is recommended for iSCSI and FC protocols. To configure multipathing on these protocols, you perform the following tasks:

      +
      +
        +
      • +

        Prepare the OCP hosts

        +
      • +
      • +

        Configure the Block Storage services

        +
      • +
      • +

        Prepare the Compute service nodes

        +
      • +
      • +

        Configure the Compute service

        +
      • +
      +
      +
    • +
    • +

      To prepare the OCP hosts, ensure that the Linux Multipath Device Mapper is configured and running on the OCP hosts by using MachineConfig. For example:

      +
      +
      +
      # Includes the /etc/multipathd.conf contents and the systemd unit changes
      +apiVersion: machineconfiguration.openshift.io/v1
      +kind: MachineConfig
      +metadata:
      +  labels:
      +    machineconfiguration.openshift.io/role: worker
      +    service: cinder
      +  name: 99-master-cinder-enable-multipathd
      +spec:
      +  config:
      +    ignition:
      +      version: 3.2.0
      +    storage:
      +      files:
      +        - path: /etc/multipath.conf
      +          overwrite: false
      +          # Mode must be decimal, this is 0600
      +          mode: 384
      +          user:
      +            name: root
      +          group:
      +            name: root
      +          contents:
      +            # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397.
      +            # This is the rfc2397 text/plain string format
      +            source: data:,defaults%20%7B%0A%20%20user_friendly_names%20no%0A%20%20recheck_wwid%20yes%0A%20%20skip_kpartx%20yes%0A%20%20find_multipaths%20yes%0A%7D%0A%0Ablacklist%20%7B%0A%7D
      +    systemd:
      +      units:
      +      - enabled: true
      +        name: multipathd.service
      +
      +
      +
    • +
    • +

      If you use labels to restrict the nodes where Block Storage services run, you need to use a MachineConfigPool to limit the effects of the MachineConfig to only the nodes where your services run. For more information, see About node selectors.

      +
    • +
    • +

      If you are using a single node deployment to test the process, replace worker with master in the MachineConfig.

      +
    • +
    • +

      Cinder volume and backup are configured by default to use multipathing.

      +
    • +
    +
    +
    +
    +
    +
  • +
+
+
+
+
+
+
+

Preparing the Block Storage service by customizing the configuration

+
+

The high level explanation of the tailor-made approach is:

+
+
+
    +
  • +

    Determine what part of the configuration is generic for all the Block Storage services and remove anything that would change when deployed in OpenShift, such as the connection in the [database] section, the transport_url and log_dir in the [DEFAULT] sections, the whole [coordination] and [barbican] sections. The remaining generic configuration goes into the customServiceConfig option, or a Secret custom resource (CR) and is then used in the customServiceConfigSecrets section, at the cinder: template: level.

    +
  • +
  • +

    Determine if there is a scheduler-specific configuration and add it to the customServiceConfig option in cinder: template: cinderScheduler.

    +
  • +
  • +

    Determine if there is an API-specific configuration and add it to the customServiceConfig option in cinder: template: cinderAPI.

    +
  • +
  • +

    If the Block Storage service backup is deployed, add the Block Storage service backup configuration options to customServiceConfig option, or to a Secret CR that you can add to customServiceConfigSecrets section at the cinder: template: +cinderBackup: level. Remove the host configuration in the [DEFAULT] section to support multiple replicas later.

    +
  • +
  • +

    Determine the individual volume back-end configuration for each of the drivers. The configuration is in the specific driver section, and it includes the [backend_defaults] section and FC zoning sections if you use them. The Block Storage service operator does not support a global customServiceConfig option for all volume services. Each back end has its own section under cinder: template: cinderVolumes, and the configuration goes in the customServiceConfig option or in a Secret CR and is then used in the customServiceConfigSecrets section.

    +
  • +
  • +

    If any of the Block Storage service volume drivers require a custom vendor image, find the location of the image in the Red Hat Ecosystem Catalog, and create or modify an OpenStackVersion CR to specify the custom image by using the key from the cinderVolumes section.

    +
    +

    For example, if you have the following configuration:

    +
    +
    +
    +
    spec:
    +  cinder:
    +    enabled: true
    +    template:
    +      cinderVolume:
    +        pure:
    +          customServiceConfigSecrets:
    +            - openstack-cinder-pure-cfg
    +< . . . >
    +
    +
    +
    +

    Then the OpenStackVersion CR that describes the container image for that back end looks like the following example:

    +
    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackVersion
    +metadata:
    +  name: openstack
    +spec:
    +  customContainerImages:
    +    cinderVolumeImages:
    +      pure: registry.connect.redhat.com/purestorage/openstack-cinder-volume-pure-rhosp-18-0
    +
    +
    +
    + + + + + +
    + + +The name of the OpenStackVersion must match the name of your OpenStackControlPlane CR. +
    +
    +
  • +
  • +

    If your Block Storage services use external files, for example, for a custom policy, or to store credentials or SSL certificate authority bundles to connect to a storage array, make those files available to the right containers. Use Secrets or ConfigMaps to store the information in OCP, and then reference them in the extraMounts key. For example, for Ceph credentials that are stored in a Secret called ceph-conf-files, you patch the top-level extraMounts key in the OpenstackControlPlane CR:

    +
    +
    +
    spec:
    +  extraMounts:
    +  - extraVol:
    +    - extraVolType: Ceph
    +      mounts:
    +      - mountPath: /etc/ceph
    +        name: ceph
    +        readOnly: true
    +      propagation:
    +      - CinderVolume
    +      - CinderBackup
    +      - Glance
    +      volumes:
    +      - name: ceph
    +        projected:
    +          sources:
    +          - secret:
    +              name: ceph-conf-files
    +
    +
    +
  • +
  • +

    For a service-specific file, such as the API policy, you add the configuration on the service itself. In the following example, you include the CinderAPI configuration that references the policy you are adding from a ConfigMap called my-cinder-conf that has a policy key with the contents of the policy:

    +
    +
    +
    spec:
    +  cinder:
    +    enabled: true
    +    template:
    +      cinderAPI:
    +        customServiceConfig: |
    +           [oslo_policy]
    +           policy_file=/etc/cinder/api/policy.yaml
    +      extraMounts:
    +      - extraVol:
    +        - extraVolType: Ceph
    +          mounts:
    +          - mountPath: /etc/cinder/api
    +            name: policy
    +            readOnly: true
    +          propagation:
    +          - CinderAPI
    +          volumes:
    +          - name: policy
    +            projected:
    +              sources:
    +              - configMap:
    +                  name: my-cinder-conf
    +                  items:
    +                    - key: policy
    +                      path: policy.yaml
    +
    +
    +
  • +
+
+
+
+
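
    The following consolidated sketch shows how the split described above might look at the cinder: template: level. The option values and the iscsi-backend name are illustrative only; replace them with the generic, API, scheduler, backup, and per-back-end settings extracted from your own cinder.conf:

    spec:
    +  cinder:
    +    enabled: true
    +    template:
    +      # generic configuration shared by all Block Storage services
    +      customServiceConfig: |
    +        [DEFAULT]
    +        debug = true
    +      cinderAPI:
    +        customServiceConfig: |
    +          [oslo_policy]
    +          policy_file = /etc/cinder/api/policy.yaml
    +      cinderScheduler:
    +        customServiceConfig: |
    +          [DEFAULT]
    +          scheduler_default_filters = AvailabilityZoneFilter,CapacityFilter,CapabilitiesFilter
    +      cinderBackup:
    +        customServiceConfig: |
    +          [DEFAULT]
    +          backup_driver = cinder.backup.drivers.swift.SwiftBackupDriver
    +      cinderVolumes:
    +        iscsi-backend:
    +          # back-end specific options, typically stored in a Secret CR
    +          customServiceConfigSecrets:
    +            - openstack-cinder-iscsi-cfg
    +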

Changes to CephFS through NFS

+
+

Before you begin the adoption, review the following information to understand the changes to CephFS through NFS between OpenStack (OSP) Wallaby and Red Hat OpenStack Services on OpenShift (RHOSO) Antelope:

+
+
+
    +
  • +

    If the OSP Wallaby deployment uses CephFS through NFS as a back end for Shared File Systems service (manila), you cannot directly import the ceph-nfs service on the OSP Controller nodes into RHOSO Antelope. In RHOSO Antelope, the Shared File Systems service only supports using a clustered NFS service that is directly managed on the Ceph cluster. Adoption with the ceph-nfs service involves a data path disruption to existing NFS clients.

    +
  • +
  • +

    On OSP Wallaby, Pacemaker controls the high availability of the ceph-nfs service. This service is assigned a Virtual IP (VIP) address that is also managed by Pacemaker. The VIP is typically created on an isolated StorageNFS network. The Controller nodes have ordering and collocation constraints established between this VIP, ceph-nfs, and the Shared File Systems service (manila) share manager service. Prior to adopting Shared File Systems service, you must adjust the Pacemaker ordering and collocation constraints to separate the share manager service. This establishes ceph-nfs with its VIP as an isolated, standalone NFS service that you can decommission after completing the RHOSO adoption.

    +
  • +
  • +

    In Ceph Reef, a native clustered Ceph NFS service has to be deployed on the Ceph cluster by using the Ceph Orchestrator prior to adopting the Shared File Systems service. This NFS service eventually replaces the standalone NFS service from OSP Wallaby in your deployment. When the Shared File Systems service is adopted into the RHOSO Antelope environment, it establishes all the existing exports and client restrictions on the new clustered Ceph NFS service. Clients can continue to read and write data on existing NFS shares, and are not affected until the old standalone NFS service is decommissioned. After the service is decommissioned, you can re-mount the same share from the new clustered Ceph NFS service during a scheduled downtime.

    +
  • +
  • +

    To ensure that NFS users are not required to make any networking changes to their existing workloads, assign an IP address from the same isolated StorageNFS network to the clustered Ceph NFS service. NFS users only need to discover and re-mount their shares by using new export paths. When the adoption is complete, RHOSO users can query the Shared File Systems service API to list the export locations on existing shares to identify the preferred paths to mount these shares. These preferred paths correspond to the new clustered Ceph NFS service in contrast to other non-preferred export paths that continue to be displayed until the old isolated, standalone NFS service is decommissioned.

    +
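
    For example, after the adoption a user can list the export locations of an existing share to find the preferred path on the new clustered Ceph NFS service. The share name demo-share is illustrative; the equivalent manila share-export-location-list command can be used if the openstack client plugin for the Shared File Systems service is not installed:

    $ openstack share export location list demo-share
    +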
  • +
+
+
+

For more information on setting up a clustered NFS service, see Creating an NFS Ganesha cluster.

+
+
+
+
+

Comparing configuration files between deployments

+
+

To help you manage the configuration for your TripleO and OpenStack (OSP) services, you can compare the configuration files between your TripleO deployment and the Red Hat OpenStack Services on OpenShift (RHOSO) cloud by using the os-diff tool.

+
+
+
Prerequisites
+
    +
  • +

    Golang is installed and configured on your environment:

    +
    +
    +
    dnf install -y golang-github-openstack-k8s-operators-os-diff
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Configure the /etc/os-diff/os-diff.cfg file and the /etc/os-diff/ssh.config file according to your environment. To allow os-diff to connect to your clouds and pull files from the services that you describe in the config.yaml file, you must set the following options in the os-diff.cfg file:

    +
    +
    +
    [Default]
    +
    +local_config_dir=/tmp/
    +service_config_file=config.yaml
    +
    +[Tripleo]
    +
    +ssh_cmd=ssh -F ssh.config (1)
    +director_host=standalone (2)
    +container_engine=podman
    +connection=ssh
    +remote_config_path=/tmp/tripleo
    +local_config_path=/tmp/
    +
    +[Openshift]
    +
    +ocp_local_config_path=/tmp/ocp
    +connection=local
    +ssh_cmd=""
    +
    +
    +
    + + + + + + + + + +
    1Instructs os-diff to access your TripleO host through SSH. The default value is ssh -F ssh.config. However, you can set the value without an ssh.config file, for example, ssh -i /home/user/.ssh/id_rsa stack@my.undercloud.local.
    2The host to use to access your cloud, where the podman/docker binary is installed and allowed to interact with the running containers. You can leave this key blank.
    +
    +
  2. +
  3. +

    If you use a host file to connect to your cloud, configure the ssh.config file to allow os-diff to access your OSP environment, for example:

    +
    +
    +
    Host *
    +    IdentitiesOnly yes
    +
    +Host virthost
    +    Hostname virthost
    +    IdentityFile ~/.ssh/id_rsa
    +    User root
    +    StrictHostKeyChecking no
    +    UserKnownHostsFile=/dev/null
    +
    +
    +Host standalone
    +    Hostname standalone
    +    IdentityFile ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa
    +    User root
    +    StrictHostKeyChecking no
    +    UserKnownHostsFile=/dev/null
    +
    +Host crc
    +    Hostname crc
    +    IdentityFile ~/.ssh/id_rsa
    +    User stack
    +    StrictHostKeyChecking no
    +    UserKnownHostsFile=/dev/null
    +
    +
    +
    +
      +
    • +

      Set IdentityFile to the path of your SSH key. You must provide a value for IdentityFile to get full working access to your OSP environment.

      +
    • +
    +
    +
  4. +
  5. +

    If you use an inventory file to connect to your cloud, generate the ssh.config file from your Ansible inventory, for example, tripleo-ansible-inventory.yaml file:

    +
    +
    +
    $ os-diff configure -i tripleo-ansible-inventory.yaml -o ssh.config --yaml
    +
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Test your connection:

    +
    +
    +
    $ ssh -F ssh.config standalone
    +
    +
    +
  • +
+
+
+
+
+
+

Migrating TLS-e to the RHOSO deployment

+
+
+

If you enabled TLS everywhere (TLS-e) in your OpenStack (OSP) Wallaby deployment, you must migrate TLS-e to the Red Hat OpenStack Services on OpenShift (RHOSO) deployment.

+
+
+

The RHOSO deployment uses the cert-manager operator to issue, track, and renew the certificates. In the following procedure, you extract the CA signing certificate from the FreeIPA instance that you use to provide the certificates in the OSP environment, and then import them into cert-manager in the RHOSO environment. As a result, you minimize the disruption on the Compute nodes because you do not need to install a new chain of trust.

+
+
+

You then decommission the previous FreeIPA node and no longer use it to issue certificates. This might not be possible if you use the IPA server to issue certificates for non-OSP systems.

+
+
+ + + + + +
+ + +
+
    +
  • +

    The following procedure was reproduced on a FreeIPA 4.10.1 server. The location of the files and directories might change depending on the version.

    +
  • +
  • +

    If the signing keys are stored in a hardware security module (HSM) instead of an NSS shared database (NSSDB), and the keys are retrievable, special HSM utilities might be required.

    +
  • +
+
+
+
+
+
Prerequisites
+
    +
  • +

    Your OSP deployment is using TLS-e.

    +
  • +
  • +

    Ensure that the back-end services on the new deployment are not started yet.

    +
  • +
  • +

    Define the following shell variables. The values are examples and refer to a single-node standalone TripleO deployment. Replace these example values with values that are correct for your environment:

    +
    +
    +
    IPA_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100 podman exec -ti freeipa-server-container"
    +
    +
    +
    +

    In this example the FreeIPA instance is running on a separate host, in a container.

    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    To locate the CA certificate and key, list all the certificates inside your NSSDB:

    +
    +
    +
    $IPA_SSH certutil -L -d /etc/pki/pki-tomcat/alias
    +
    +
    +
    +
      +
    • +

      The -L option lists all certificates.

      +
    • +
    • +

      The -d option specifies where the certificates are stored.

      +
      +

      The command produces an output similar to the following example:

      +
      +
      +
      +
      Certificate Nickname                                         Trust Attributes
      +                                                             SSL,S/MIME,JAR/XPI
      +
      +caSigningCert cert-pki-ca                                    CTu,Cu,Cu
      +ocspSigningCert cert-pki-ca                                  u,u,u
      +Server-Cert cert-pki-ca                                      u,u,u
      +subsystemCert cert-pki-ca                                    u,u,u
      +auditSigningCert cert-pki-ca                                 u,u,Pu
      +
      +
      +
    • +
    +
    +
  2. +
  3. +

    Export the certificate and key from the /etc/pki/pki-tomcat/alias directory. The following example uses the caSigningCert cert-pki-ca certificate:

    +
    +
    +
    $IPA_SSH pk12util -o /tmp/freeipa.p12 -n 'caSigningCert\ cert-pki-ca' -d /etc/pki/pki-tomcat/alias -k /etc/pki/pki-tomcat/alias/pwdfile.txt -w /etc/pki/pki-tomcat/alias/pwdfile.txt
    +
    +
    +
    + + + + + +
    + + +
    +

    The command generates a P12 file with both the certificate and the key. The /etc/pki/pki-tomcat/alias/pwdfile.txt file contains the password that protects the key. You can use the password to both extract the key and generate the new file, /tmp/freeipa.p12. You can also choose another password. If you choose a different password for the new file, replace the parameter of the -w option, or use the -W option followed by the password, in clear text.

    +
    +
    +

    With that file, you can also get the certificate and the key by using the openssl pkcs12 command.

    +
    +
    +
    +
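
    For example, a minimal sketch that extracts both items locally by reusing the IPA_SSH helper defined in the prerequisites; the /tmp output paths are illustrative:

    $IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 > /tmp/freeipa-ca.crt
    +$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nocerts -noenc | openssl rsa > /tmp/freeipa-ca.key
    +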
  4. +
  5. +

    Create the secret that contains the root CA:

    +
    +
    +
    $ oc create secret generic rootca-internal -n openstack
    +
    +
    +
  6. +
  7. +

    Import the certificate and the key from FreeIPA:

    +
    +
    +
    $ oc patch secret rootca-internal -n openstack -p="{\"data\":{\"ca.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}"
    +
    +$ oc patch secret rootca-internal -n openstack -p="{\"data\":{\"tls.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}"
    +
    +$ oc patch secret rootca-internal -n openstack -p="{\"data\":{\"tls.key\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nocerts -noenc | openssl rsa | base64 -w 0`\"}}"
    +
    +
    +
  8. +
  9. +

    Create the cert-manager issuer and reference the secret:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: cert-manager.io/v1
    +kind: Issuer
    +metadata:
    +  name: rootca-internal
    +  namespace: openstack
    +  labels:
    +    osp-rootca-issuer-public: ""
    +    osp-rootca-issuer-internal: ""
    +    osp-rootca-issuer-libvirt: ""
    +    osp-rootca-issuer-ovn: ""
    +spec:
    +  ca:
    +    secretName: rootca-internal
    +EOF
    +
    +
    +
  10. +
  11. +

    Delete the previously created p12 files:

    +
    +
    +
    $IPA_SSH rm /tmp/freeipa.p12
    +
    +
    +
  12. +
+
+
+
Verification
+
    +
  • +

    Verify that the necessary resources are created:

    +
    +
    +
    $ oc get issuers -n openstack
    +
    +
    +
    +
    +
    $ oc get secret rootca-internal -n openstack -o yaml
    +
    +
    +
  • +
+
+
+ + + + + +
+ + +After the adoption is complete, the cert-manager operator issues new certificates and updates the secrets with the new certificates. As a result, the pods on the control plane automatically restart in order to obtain the new certificates. On the data plane, you must manually initiate a new deployment and restart certain processes to use the new certificates. The old certificates remain active until both the control plane and data plane obtain the new certificates. +
+
+
+
+
+

Migrating databases to the control plane

+
+
+

To begin creating the control plane, enable back-end services and import the databases from your original OpenStack Wallaby deployment.

+
+
+

Retrieving topology-specific service configuration

+
+

Before you migrate your databases to the Red Hat OpenStack Services on OpenShift (RHOSO) control plane, retrieve the topology-specific service configuration from your OpenStack (OSP) environment. You need this configuration for the following reasons:

+
+
+
    +
  • +

    To check your current database for inaccuracies

    +
  • +
  • +

    To ensure that you have the data you need before the migration

    +
  • +
  • +

    To compare your OSP database with the adopted RHOSO database

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Define the following shell variables. Replace the example values with values that are correct for your environment:

    +
    +
    +
    CONTROLLER1_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100"
    +MARIADB_IMAGE=quay.io/podified-antelope-centos9/openstack-mariadb:current-podified
    +SOURCE_MARIADB_IP=172.17.0.2
    +SOURCE_DB_ROOT_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }')
    +MARIADB_CLIENT_ANNOTATIONS='--annotations=k8s.v1.cni.cncf.io/networks=internalapi'
    +
    +
    +
    +

    To get the value to set SOURCE_MARIADB_IP, query the puppet-generated configurations in a Controller node:

    +
    +
    +
    +
    $ grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bind
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Export the shell variables for the following outputs and test the connection to the OSP database:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_DATABASES=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "$SOURCE_MARIADB_IP" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" -e 'SHOW databases;')
    +echo "$PULL_OPENSTACK_CONFIGURATION_DATABASES"
    +
    +
    +
    + + + + + +
    + + +The nova, nova_api, and nova_cell0 databases are included in the same database host. +
    +
    +
  2. +
  3. +

    Run mysqlcheck on the OSP database to check for inaccuracies:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysqlcheck --all-databases -h $SOURCE_MARIADB_IP -u root -p"$SOURCE_DB_ROOT_PASSWORD" | grep -v OK)
    +echo "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK"
    +
    +
    +
  4. +
  5. +

    Get the Compute service (nova) cell mappings:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" nova_api -e \
    +    'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')
    +echo "$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS"
    +
    +
    +
  6. +
  7. +

    Get the hostnames of the registered Compute services:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "$SOURCE_MARIADB_IP" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" nova_api -e \
    +    "select host from nova.services where services.binary='nova-compute';")
    +echo "$PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES"
    +
    +
    +
  8. +
  9. +

    Get the list of the mapped Compute service cells:

    +
    +
    +
    export PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS=$($CONTROLLER1_SSH sudo podman exec -it nova_api nova-manage cell_v2 list_cells)
    +echo "$PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS"
    +
    +
    +
    + + + + + +
    + + +After the OSP control plane services are shut down, if any of the exported values are lost, re-running the command fails because the control plane services are no longer running on the source cloud, and the data cannot be retrieved. To avoid data loss, preserve the exported values in an environment file before shutting down the control plane services. +
    +
    +
  10. +
  11. +

    If neutron-sriov-nic-agent agents are running in your OSP deployment, get the configuration to use for the data plane adoption:

    +
    +
    +
    SRIOV_AGENTS=$(oc run mariadb-client ${MARIADB_CLIENT_ANNOTATIONS} -q --image ${MARIADB_IMAGE} -i --rm --restart=Never -- \
    +    mysql -rsh "$SOURCE_MARIADB_IP" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" ovs_neutron -e \
    +    "select host, configurations from agents where agents.binary='neutron-sriov-nic-agent';")
    +
    +
    +
  12. +
  13. +

    Store the exported variables for future use:

    +
    +
    +
    $ cat >~/.source_cloud_exported_variables <<EOF
    +PULL_OPENSTACK_CONFIGURATION_DATABASES="$PULL_OPENSTACK_CONFIGURATION_DATABASES"
    +PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK="$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK"
    +PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS="$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS"
    +PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES="$PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES"
    +PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS="$PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS"
    +SRIOV_AGENTS="$SRIOV_AGENTS"
    +EOF
    +
    +
    +
  14. +
+
+
+
+

Deploying back-end services

+
+

Create the OpenStackControlPlane custom resource (CR) with the basic back-end services deployed, and disable all the OpenStack (OSP) services. This CR is the foundation of the control plane.

+
+
+
Prerequisites
+
    +
  • +

    The cloud that you want to adopt is running, and it is on the OSP Wallaby release.

    +
  • +
  • +

    All control plane and data plane hosts of the source cloud are running, and continue to run throughout the adoption procedure.

    +
  • +
  • +

    The openstack-operator is deployed, but OpenStackControlPlane is not deployed.

    +
    +

    For developer/CI environments, the OSP operator can be deployed by running make openstack inside the install_yamls repo, for example:

    +
    +
    +
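
    # in install_yamls
    +make openstack
    +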

    For production environments, the deployment method will likely be different.

    +
    +
  • +
  • +

    If you enabled TLS everywhere (TLS-e) on the OSP environment, you must copy the tls root CA from the OSP environment to the rootca-internal issuer.

    +
  • +
  • +

    There are free PVs available for MariaDB and RabbitMQ.

    +
    +

    For developer/CI environments driven by install_yamls, make sure you’ve run make crc_storage.

    +
    +
  • +
  • +

    Set the desired admin password for the control plane deployment. This can be the admin password from your original deployment or a different password:

    +
    +
    +
    ADMIN_PASSWORD=SomePassword
    +
    +
    +
    +

    To use the existing OSP deployment password:

    +
    +
    +
    +
    ADMIN_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' AdminPassword:' | awk -F ': ' '{ print $2; }')
    +
    +
    +
  • +
  • +

    Set the service password variables to match the original deployment. Database passwords can differ in the control plane environment, but you must synchronize the service account passwords.

    +
    +

    For example, in developer environments with TripleO Standalone, the passwords can be extracted:

    +
    +
    +
    +
    AODH_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' AodhPassword:' | awk -F ': ' '{ print $2; }')
    +BARBICAN_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' BarbicanPassword:' | awk -F ': ' '{ print $2; }')
    +CEILOMETER_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' CeilometerPassword:' | awk -F ': ' '{ print $2; }')
    +CINDER_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' CinderPassword:' | awk -F ': ' '{ print $2; }')
    +GLANCE_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' GlancePassword:' | awk -F ': ' '{ print $2; }')
    +HEAT_AUTH_ENCRYPTION_KEY=$(cat ~/tripleo-standalone-passwords.yaml | grep ' HeatAuthEncryptionKey:' | awk -F ': ' '{ print $2; }')
    +HEAT_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' HeatPassword:' | awk -F ': ' '{ print $2; }')
    +IRONIC_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' IronicPassword:' | awk -F ': ' '{ print $2; }')
    +MANILA_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' ManilaPassword:' | awk -F ': ' '{ print $2; }')
    +NEUTRON_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' NeutronPassword:' | awk -F ': ' '{ print $2; }')
    +NOVA_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' NovaPassword:' | awk -F ': ' '{ print $2; }')
    +OCTAVIA_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' OctaviaPassword:' | awk -F ': ' '{ print $2; }')
    +PLACEMENT_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' PlacementPassword:' | awk -F ': ' '{ print $2; }')
    +SWIFT_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' SwiftPassword:' | awk -F ': ' '{ print $2; }')
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Ensure that you are using the OpenShift namespace where you want the control plane to be deployed:

    +
    +
    +
    $ oc project openstack
    +
    +
    +
  2. +
  3. +

    Create the OSP secret.

    +
    +

    The procedure for this will vary, but in developer/CI environments you use install_yamls:

    +
    +
    +
    +
    # in install_yamls
    +make input
    +
    +
    +
  4. +
  5. +

    If the $ADMIN_PASSWORD is different than the password you set in osp-secret, amend the AdminPassword key in the osp-secret:

    +
    +
    +
    $ oc set data secret/osp-secret "AdminPassword=$ADMIN_PASSWORD"
    +
    +
    +
  6. +
  7. +

    Set service account passwords in osp-secret to match the service account passwords from the original deployment:

    +
    +
    +
    $ oc set data secret/osp-secret "AodhPassword=$AODH_PASSWORD"
    +$ oc set data secret/osp-secret "BarbicanPassword=$BARBICAN_PASSWORD"
    +$ oc set data secret/osp-secret "CeilometerPassword=$CEILOMETER_PASSWORD"
    +$ oc set data secret/osp-secret "CinderPassword=$CINDER_PASSWORD"
    +$ oc set data secret/osp-secret "GlancePassword=$GLANCE_PASSWORD"
    +$ oc set data secret/osp-secret "HeatAuthEncryptionKey=$HEAT_AUTH_ENCRYPTION_KEY"
    +$ oc set data secret/osp-secret "HeatPassword=$HEAT_PASSWORD"
    +$ oc set data secret/osp-secret "IronicPassword=$IRONIC_PASSWORD"
    +$ oc set data secret/osp-secret "IronicInspectorPassword=$IRONIC_PASSWORD"
    +$ oc set data secret/osp-secret "ManilaPassword=$MANILA_PASSWORD"
    +$ oc set data secret/osp-secret "MetadataSecret=$METADATA_SECRET"
    +$ oc set data secret/osp-secret "NeutronPassword=$NEUTRON_PASSWORD"
    +$ oc set data secret/osp-secret "NovaPassword=$NOVA_PASSWORD"
    +$ oc set data secret/osp-secret "OctaviaPassword=$OCTAVIA_PASSWORD"
    +$ oc set data secret/osp-secret "PlacementPassword=$PLACEMENT_PASSWORD"
    +$ oc set data secret/osp-secret "SwiftPassword=$SWIFT_PASSWORD"
    +
    +
    +
  8. +
  9. +

    If you enabled TLS-e in your OSP environment, in the spec:tls section, set the enabled parameter to true:

    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  tls:
    +    podLevel:
    +      enabled: true
    +      internal:
    +        ca:
    +          customIssuer: rootca-internal
    +      libvirt:
    +        ca:
    +          customIssuer: rootca-internal
    +      ovn:
    +        ca:
    +          customIssuer: rootca-internal
    +    ingress:
    +      ca:
    +        customIssuer: rootca-internal
    +      enabled: true
    +
    +
    +
  10. +
  11. +

    If you did not enable TLS-e, in the spec:tls section, set the enabled parameter to false:

    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  tls:
    +    podLevel:
    +      enabled: false
    +    ingress:
    +      enabled: false
    +
    +
    +
  12. +
  13. +

    Deploy the OpenStackControlPlane CR. Ensure that you only enable the DNS, MariaDB, Memcached, and RabbitMQ services. All other services must be disabled:

    +
    +
    +
    oc apply -f - <<EOF
    +apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackControlPlane
    +metadata:
    +  name: openstack
    +spec:
    +  secret: osp-secret
    +  storageClass: local-storage (1)
    +
    +  barbican:
    +    enabled: false
    +    template:
    +      barbicanAPI: {}
    +      barbicanWorker: {}
    +      barbicanKeystoneListener: {}
    +
    +  cinder:
    +    enabled: false
    +    template:
    +      cinderAPI: {}
    +      cinderScheduler: {}
    +      cinderBackup: {}
    +      cinderVolumes: {}
    +
    +  dns:
    +    template:
    +      override:
    +        service:
    +          metadata:
    +            annotations:
    +              metallb.universe.tf/address-pool: ctlplane
    +              metallb.universe.tf/allow-shared-ip: ctlplane
    +              metallb.universe.tf/loadBalancerIPs: 192.168.122.80
    +          spec:
    +            type: LoadBalancer
    +      options:
    +      - key: server
    +        values:
    +        - 192.168.122.1
    +      replicas: 1
    +
    +  glance:
    +    enabled: false
    +    template:
    +      glanceAPIs: {}
    +
    +  heat:
    +    enabled: false
    +    template: {}
    +
    +  horizon:
    +    enabled: false
    +    template: {}
    +
    +  ironic:
    +    enabled: false
    +    template:
    +      ironicConductors: []
    +
    +  keystone:
    +    enabled: false
    +    template: {}
    +
    +  manila:
    +    enabled: false
    +    template:
    +      manilaAPI: {}
    +      manilaScheduler: {}
    +      manilaShares: {}
    +
    +  mariadb:
    +    enabled: false
    +    templates: {}
    +
    +  galera:
    +    enabled: true
    +    templates:
    +      openstack:
    +        secret: osp-secret
    +        replicas: 3
    +        storageRequest: 500M
    +      openstack-cell1:
    +        secret: osp-secret
    +        replicas: 3
    +        storageRequest: 500M
    +
    +  memcached:
    +    enabled: true
    +    templates:
    +      memcached:
    +        replicas: 3
    +
    +  neutron:
    +    enabled: false
    +    template: {}
    +
    +  nova:
    +    enabled: false
    +    template: {}
    +
    +  ovn:
    +    enabled: false
    +    template:
    +      ovnController:
    +        networkAttachment: tenant
    +        nodeSelector:
    +          node: non-existing-node-name
    +      ovnNorthd:
    +        replicas: 0
    +      ovnDBCluster:
    +        ovndbcluster-nb:
    +          dbType: NB
    +          networkAttachment: internalapi
    +        ovndbcluster-sb:
    +          dbType: SB
    +          networkAttachment: internalapi
    +
    +  placement:
    +    enabled: false
    +    template: {}
    +
    +  rabbitmq:
    +    templates:
    +      rabbitmq:
    +        override:
    +          service:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.85
    +            spec:
    +              type: LoadBalancer
    +      rabbitmq-cell1:
    +        override:
    +          service:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.86
    +            spec:
    +              type: LoadBalancer
    +
    +  telemetry:
    +    enabled: false
    +
    +  swift:
    +    enabled: false
    +    template:
    +      swiftRing:
    +        ringReplicas: 1
    +      swiftStorage:
    +        replicas: 0
    +      swiftProxy:
    +        replicas: 1
    +EOF
    +
    +
    +
    + + + + + +
    1Select an existing storage class in your OCP cluster.
    +
    +
  14. +
+
+
+
Verification
+
    +
  • +

    Verify that MariaDB is running:

    +
    +
    +
    $ oc get pod openstack-galera-0 -o jsonpath='{.status.phase}{"\n"}'
    +$ oc get pod openstack-cell1-galera-0 -o jsonpath='{.status.phase}{"\n"}'
    +
    +
    +
  • +
+
+
+
+

Configuring a Ceph back end

+
+

If your OpenStack (OSP) Wallaby deployment uses a Ceph back end for any service, such as Image Service (glance), Block Storage service (cinder), Compute service (nova), or Shared File Systems service (manila), you must configure the custom resources (CRs) to use the same back end in the Red Hat OpenStack Services on OpenShift (RHOSO) Antelope deployment.

+
+
+ + + + + +
+ + +To run ceph commands, you must use SSH to connect to a Ceph node and run sudo cephadm shell. This generates a Ceph orchestrator container that enables you to run administrative commands against the Ceph Storage cluster. If you deployed the Ceph Storage cluster by using TripleO, you can launch the cephadm shell from an OSP Controller node. +
+
+
+
Prerequisites
+
    +
  • +

    The OpenStackControlPlane CR is created.

    +
  • +
  • +

    If your OSP Wallaby deployment uses the Shared File Systems service, the openstack keyring is updated. Modify the openstack user so that you can use it across all OSP services:

    +
    +
    +
    ceph auth caps client.openstack \
    +  mgr 'allow *' \
    +  mon 'allow r, profile rbd' \
    +  osd 'profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data'
    +
    +
    +
    +

    Using the same user across all services makes it simpler to create a common Ceph secret that includes the keyring and ceph.conf file and propagate the secret to all the services that need it.

    +
    +
  • +
  • +

    The following shell variables are defined. Replace the following example values with values that are correct for your environment:

    +
    +
    +
    CEPH_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100"
    +CEPH_KEY=$($CEPH_SSH "cat /etc/ceph/ceph.client.openstack.keyring | base64 -w 0")
    +CEPH_CONF=$($CEPH_SSH "cat /etc/ceph/ceph.conf | base64 -w 0")
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create the ceph-conf-files secret that includes the Ceph configuration:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +data:
    +  ceph.client.openstack.keyring: $CEPH_KEY
    +  ceph.conf: $CEPH_CONF
    +kind: Secret
    +metadata:
    +  name: ceph-conf-files
    +  namespace: openstack
    +type: Opaque
    +EOF
    +
    +
    +
    +

    The content of the file should be similar to the following example:

    +
    +
    +
    +
    apiVersion: v1
    +kind: Secret
    +metadata:
    +  name: ceph-conf-files
    +  namespace: openstack
    +stringData:
    +  ceph.client.openstack.keyring: |
    +    [client.openstack]
    +        key = <secret key>
    +        caps mgr = "allow *"
    +        caps mon = "allow r, profile rbd"
    +        caps osd = "profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data"
    +  ceph.conf: |
    +    [global]
    +    fsid = 7a1719e8-9c59-49e2-ae2b-d7eb08c695d4
    +    mon_host = 10.1.1.2,10.1.1.3,10.1.1.4
    +
    +
    +
  2. +
  3. +

    In your OpenStackControlPlane CR, inject ceph.conf and ceph.client.openstack.keyring to the OSP services that are defined in the propagation list. For example:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  extraMounts:
    +    - name: v1
    +      region: r1
    +      extraVol:
    +        - propagation:
    +          - CinderVolume
    +          - CinderBackup
    +          - GlanceAPI
    +          - ManilaShare
    +          extraVolType: Ceph
    +          volumes:
    +          - name: ceph
    +            projected:
    +              sources:
    +              - secret:
    +                  name: ceph-conf-files
    +          mounts:
    +          - name: ceph
    +            mountPath: "/etc/ceph"
    +            readOnly: true
    +'
    +
    +
    +
  4. +
+
+
+
+

Creating an NFS Ganesha cluster

+
+

If you use CephFS through NFS with the Shared File Systems service (manila), you must create a new clustered NFS service on the Ceph cluster. This service replaces the standalone, Pacemaker-controlled ceph-nfs service that you use in OpenStack (OSP) Wallaby.

+
+
+
Procedure
+
    +
  1. +

    Identify the Ceph nodes to deploy the new clustered NFS service, for example, cephstorage-0, cephstorage-1, cephstorage-2.

    +
    + + + + + +
    + + +You must deploy this service on the StorageNFS isolated network so that you can mount your existing shares through the new NFS export locations. You can deploy the new clustered NFS service on your existing CephStorage nodes or HCI nodes, or on new hardware that you enrolled in the Ceph cluster.
    +
    +
  2. +
  3. +

    If you deployed your Ceph nodes with TripleO, propagate the StorageNFS network to the target nodes where the ceph-nfs service will be deployed.

    +
    +
      +
    1. +

      Identify the node definition file, overcloud-baremetal-deploy.yaml, that is used in the OSP environment. See Deploying an Overcloud with Network Isolation with TripleO and Applying network configuration changes after deployment for background on these tasks.

      +
    2. +
    3. +

      Edit the networks that are associated with the Ceph Storage nodes to include the StorageNFS network:

      +
      +
      +
      - name: CephStorage
      +  count: 3
      +  hostname_format: cephstorage-%index%
      +  instances:
      +  - hostname: cephstorage-0
      +    name: ceph-0
      +  - hostname: cephstorage-1
      +    name: ceph-1
      +  - hostname: cephstorage-2
      +    name: ceph-2
      +  defaults:
      +    profile: ceph-storage
      +    network_config:
      +      template: /home/stack/network/nic-configs/ceph-storage.j2
      +      network_config_update: true
      +    networks:
      +    - network: ctlplane
      +      vif: true
      +    - network: storage
      +    - network: storage_mgmt
      +    - network: storage_nfs
      +
      +
      +
    4. +
    5. +

      Edit the network configuration template file, for example, /home/stack/network/nic-configs/ceph-storage.j2, for the Ceph Storage nodes to include an interface that connects to the StorageNFS network:

      +
      +
      +
      - type: vlan
      +  device: nic2
      +  vlan_id: {{ storage_nfs_vlan_id }}
      +  addresses:
      +  - ip_netmask: {{ storage_nfs_ip }}/{{ storage_nfs_cidr }}
      +  routes: {{ storage_nfs_host_routes }}
      +
      +
      +
    6. +
    7. +

      Update the Ceph Storage nodes:

      +
      +
      +
      $ openstack overcloud node provision \
      +    --stack overcloud   \
      +    --network-config -y  \
      +    -o overcloud-baremetal-deployed-storage_nfs.yaml \
      +    --concurrency 2 \
      +    /home/stack/network/baremetal_deployment.yaml
      +
      +
      +
      +

      When the update is complete, ensure that a new interface is created on the Ceph Storage nodes and that it is tagged with the VLAN that is associated with StorageNFS, for example:

      +
      +
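
      A quick way to verify is to inspect the interfaces on one of the updated nodes; <user> and <cephstorage-node> are placeholders for your environment, and the interface naming depends on your NIC template:

      $ ssh <user>@<cephstorage-node> ip -d addr show | grep -B2 'vlan protocol 802.1Q'
      +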
    8. +
    +
    +
  4. +
  5. +

    Identify the IP address from the StorageNFS network to use as the Virtual IP address (VIP) for the Ceph NFS service:

    +
    +
    +
    $ openstack port list -c "Fixed IP Addresses" --network storage_nfs
    +
    +
    +
  6. +
  7. +

    In a running cephadm shell, identify the hosts for the NFS service:

    +
    +
    +
    $ ceph orch host ls
    +
    +
    +
  8. +
  9. +

    Label each host that you identified. Repeat this command for each host that you want to label:

    +
    +
    +
    $ ceph orch host label add <hostname> nfs
    +
    +
    +
    +
      +
    • +

      Replace <hostname> with the name of the host that you identified.

      +
    • +
    +
    +
  10. +
  11. +

    Create the NFS cluster:

    +
    +
    +
    $ ceph nfs cluster create cephfs \
    +    "label:nfs" \
    +    --ingress \
    +    --virtual-ip=<VIP> \
    +    --ingress-mode=haproxy-protocol
    +
    +
    +
    +
      +
    • +

      Replace <VIP> with the VIP for the Ceph NFS service.

      +
      + + + + + +
      + + +You must set the ingress-mode argument to haproxy-protocol. No other ingress-mode is supported. This ingress mode allows you to enforce client restrictions through the Shared File Systems service.
      +
      +
    • +
    • +

      For more information on deploying the clustered Ceph NFS service, see the ceph orchestrator documentation.

      +
    • +
    +
    +
  12. +
  13. +

    Check the status of the NFS cluster:

    +
    +
    +
    $ ceph nfs cluster ls
    +$ ceph nfs cluster info cephfs
    +
    +
    +
  14. +
+
+
+
+

Stopping OpenStack services

+
+

Before you start the Red Hat OpenStack Services on OpenShift (RHOSO) adoption, you must stop the OpenStack (OSP) services to avoid inconsistencies in the data that you migrate for the data plane adoption. Inconsistencies are caused by resource changes after the database is copied to the new deployment.

+
+
+

You should not stop the infrastructure management services yet, such as:

+
+
+
    +
  • +

    Database

    +
  • +
  • +

    RabbitMQ

    +
  • +
  • +

    HAProxy Load Balancer

    +
  • +
  • +

    Ceph-nfs

    +
  • +
  • +

    Compute service

    +
  • +
  • +

    Containerized modular libvirt daemons

    +
  • +
  • +

    Object Storage service (swift) back-end services

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Ensure that there are no long-running tasks that require the services that you plan to stop, such as instance live migrations, volume migrations, volume creation, backup and restore, attaching, detaching, and other similar operations:

    +
    +
    +
    openstack server list --all-projects -c ID -c Status |grep -E '\| .+ing \|'
    +openstack volume list --all-projects -c ID -c Status |grep -E '\| .+ing \|'| grep -vi error
    +openstack volume backup list --all-projects -c ID -c Status |grep -E '\| .+ing \|' | grep -vi error
    +openstack share list --all-projects -c ID -c Status |grep -E '\| .+ing \|'| grep -vi error
    +openstack image list -c ID -c Status |grep -E '\| .+ing \|'
    +
    +
    +
  • +
  • +

    Collect the services topology-specific configuration. For more information, see Retrieving topology-specific service configuration.

    +
  • +
  • +

    Define the following shell variables. The values are examples and refer to a single node standalone TripleO deployment. Replace these example values with values that are correct for your environment:

    +
    +
    +
    CONTROLLER1_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100"
    +CONTROLLER2_SSH="ssh -i <path to SSH key> root@<controller-2 IP>"
    +CONTROLLER3_SSH="ssh -i <path to SSH key> root@<controller-3 IP>"
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    If your deployment enables CephFS through NFS as a back end for Shared File Systems service (manila), remove the following Pacemaker ordering and co-location constraints that govern the Virtual IP address of the ceph-nfs service and the manila-share service:

    +
    +
    +
    # check the co-location and ordering constraints concerning "manila-share"
    +sudo pcs constraint list --full
    +
    +# remove these constraints
    +sudo pcs constraint remove colocation-openstack-manila-share-ceph-nfs-INFINITY
    +sudo pcs constraint remove order-ceph-nfs-openstack-manila-share-Optional
    +
    +
    +
  2. +
  3. +

    Disable OSP control plane services:

    +
    +
    +
    # Update the services list to be stopped
    +ServicesToStop=("tripleo_aodh_api.service"
    +                "tripleo_aodh_api_cron.service"
    +                "tripleo_aodh_evaluator.service"
    +                "tripleo_aodh_listener.service"
    +                "tripleo_aodh_notifier.service"
    +                "tripleo_ceilometer_agent_central.service"
    +                "tripleo_ceilometer_agent_notification.service"
    +                "tripleo_horizon.service"
    +                "tripleo_keystone.service"
    +                "tripleo_barbican_api.service"
    +                "tripleo_barbican_worker.service"
    +                "tripleo_barbican_keystone_listener.service"
    +                "tripleo_cinder_api.service"
    +                "tripleo_cinder_api_cron.service"
    +                "tripleo_cinder_scheduler.service"
    +                "tripleo_cinder_volume.service"
    +                "tripleo_cinder_backup.service"
    +                "tripleo_collectd.service"
    +                "tripleo_glance_api.service"
    +                "tripleo_gnocchi_api.service"
    +                "tripleo_gnocchi_metricd.service"
    +                "tripleo_gnocchi_statsd.service"
    +                "tripleo_manila_api.service"
    +                "tripleo_manila_api_cron.service"
    +                "tripleo_manila_scheduler.service"
    +                "tripleo_neutron_api.service"
    +                "tripleo_placement_api.service"
    +                "tripleo_nova_api_cron.service"
    +                "tripleo_nova_api.service"
    +                "tripleo_nova_conductor.service"
    +                "tripleo_nova_metadata.service"
    +                "tripleo_nova_scheduler.service"
    +                "tripleo_nova_vnc_proxy.service"
    +                "tripleo_aodh_api.service"
    +                "tripleo_aodh_api_cron.service"
    +                "tripleo_aodh_evaluator.service"
    +                "tripleo_aodh_listener.service"
    +                "tripleo_aodh_notifier.service"
    +                "tripleo_ceilometer_agent_central.service"
    +                "tripleo_ceilometer_agent_compute.service"
    +                "tripleo_ceilometer_agent_ipmi.service"
    +                "tripleo_ceilometer_agent_notification.service"
    +                "tripleo_ovn_cluster_northd.service"
    +                "tripleo_ironic_neutron_agent.service"
    +                "tripleo_ironic_api.service"
    +                "tripleo_ironic_inspector.service"
    +                "tripleo_ironic_conductor.service")
    +
    +PacemakerResourcesToStop=("openstack-cinder-volume"
    +                          "openstack-cinder-backup"
    +                          "openstack-manila-share")
    +
    +echo "Stopping systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            echo "Stopping the $service in controller $i"
    +            if ${!SSH_CMD} sudo systemctl is-active $service; then
    +                ${!SSH_CMD} sudo systemctl stop $service
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Checking systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then
    +                echo "ERROR: Service $service still running on controller $i"
    +            else
    +                echo "OK: Service $service is not running on controller $i"
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Stopping pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStop[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                echo "Stopping $resource"
    +                ${!SSH_CMD} sudo pcs resource disable $resource
    +            else
    +                echo "Service $resource not present"
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +echo "Checking pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStop[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                if ! ${!SSH_CMD} sudo pcs resource status $resource | grep Started; then
    +                    echo "OK: Service $resource is stopped"
    +                else
    +                    echo "ERROR: Service $resource is started"
    +                fi
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +
    +
    +

    If the status of each service is OK, then the services stopped successfully.

    +
    +
  4. +
+
+
+
+

Migrating databases to MariaDB instances

+
+

Migrate your databases from the original OpenStack (OSP) deployment to the MariaDB instances in the OpenShift cluster.

+
+
+
Prerequisites
+
    +
  • +

    Ensure that the control plane MariaDB and RabbitMQ are running, and that no other control plane services are running.

    +
  • +
  • +

    Retrieve the topology-specific service configuration. For more information, see Retrieving topology-specific service configuration.

    +
  • +
  • +

    Stop the OSP services. For more information, see Stopping OpenStack services.

    +
  • +
  • +

    Ensure that there is network routability between the original MariaDB and the MariaDB for the control plane.

    +
  • +
  • +

    Define the following shell variables. Replace the following example values with values that are correct for your environment:

    +
    +
    +
    PODIFIED_MARIADB_IP=$(oc get svc --selector "mariadb/name=openstack" -ojsonpath='{.items[0].spec.clusterIP}')
    +PODIFIED_CELL1_MARIADB_IP=$(oc get svc --selector "mariadb/name=openstack-cell1" -ojsonpath='{.items[0].spec.clusterIP}')
    +PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
    +
    +# The CHARACTER_SET and collation should match the source DB
    +# if they do not, it will break foreign key relationships
    +# for any tables that are created in the future as part of db sync
    +CHARACTER_SET=utf8
    +COLLATION=utf8_general_ci
    +
    +STORAGE_CLASS=crc-csi-hostpath-provisioner
    +MARIADB_IMAGE=quay.io/podified-antelope-centos9/openstack-mariadb:current-podified
    +# Replace with your environment's MariaDB Galera cluster VIP and backend IPs:
    +SOURCE_MARIADB_IP=172.17.0.2
    +declare -A SOURCE_GALERA_MEMBERS
    +SOURCE_GALERA_MEMBERS=(
    +  ["standalone.localdomain"]=172.17.0.100
    +  # ...
    +)
    +SOURCE_DB_ROOT_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }')
    +
    +
    +
    +

    To get the value to set SOURCE_MARIADB_IP, query the puppet-generated configurations in a Controller node:

    +
    +
    +
    +
    $ grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bind
    +
    +
    +
  • +
  • +

    Prepare the MariaDB adoption helper pod:

    +
    +
      +
    1. +

      Create a temporary volume claim and a pod for the database data copy. Edit the volume claim storage request if necessary, to give it enough space for the overcloud databases:

      +
      +
      +
      oc apply -f - <<EOF
      +---
      +apiVersion: v1
      +kind: PersistentVolumeClaim
      +metadata:
      +  name: mariadb-data
      +spec:
      +  storageClassName: $STORAGE_CLASS
      +  accessModes:
      +    - ReadWriteOnce
      +  resources:
      +    requests:
      +      storage: 10Gi
      +---
      +apiVersion: v1
      +kind: Pod
      +metadata:
      +  name: mariadb-copy-data
      +  annotations:
      +    openshift.io/scc: anyuid
      +    k8s.v1.cni.cncf.io/networks: internalapi
      +  labels:
      +    app: adoption
      +spec:
      +  containers:
      +  - image: $MARIADB_IMAGE
      +    command: [ "sh", "-c", "sleep infinity"]
      +    name: adoption
      +    volumeMounts:
      +    - mountPath: /backup
      +      name: mariadb-data
      +    securityContext:
      +      allowPrivilegeEscalation: false
      +      capabilities:
      +        drop:
      +        - ALL
      +      runAsNonRoot: true
      +      seccompProfile:
      +        type: RuntimeDefault
      +  volumes:
      +  - name: mariadb-data
      +    persistentVolumeClaim:
      +      claimName: mariadb-data
      +EOF
      +
      +
      +
    2. +
    3. +

      Wait for the pod to be ready:

      +
      +
      +
      $ oc wait --for condition=Ready pod/mariadb-copy-data --timeout=30s
      +
      +
      +
    4. +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Check that the source Galera database cluster members are online and synced:

    +
    +
    +
    for i in "${!SOURCE_GALERA_MEMBERS[@]}"; do
    +  echo "Checking for the database node $i WSREP status Synced"
    +  oc rsh mariadb-copy-data mysql \
    +    -h "${SOURCE_GALERA_MEMBERS[$i]}" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" \
    +    -e "show global status like 'wsrep_local_state_comment'" | \
    +    grep -qE "\bSynced\b"
    +done
    +
    +
    +
  2. +
  3. +

    Verify that you can connect to the source database and list its databases:

    +
    +
    +
    $ oc rsh mariadb-copy-data mysql -h "${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" -e "SHOW databases;"
    +
    +
    +
  4. +
  5. +

    Check that mysqlcheck had no errors:

    +
    +
    +
    . ~/.source_cloud_exported_variables
    +test -z "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK"  || [ "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK" = " " ] && echo "OK" || echo "CHECK FAILED"
    +
    +
    +
  6. +
  7. +

    Test the connection to the control plane databases:

    +
    +
    +
    $ oc run mariadb-client --image $MARIADB_IMAGE -i --rm --restart=Never -- \
    +    mysql -rsh "$PODIFIED_MARIADB_IP" -uroot -p"$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;'
    +$ oc run mariadb-client --image $MARIADB_IMAGE -i --rm --restart=Never -- \
    +    mysql -rsh "$PODIFIED_CELL1_MARIADB_IP" -uroot -p"$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;'
    +
    +
    +
    + + + + + +
    + + +You must transition Compute service (nova) services that are imported later into a superconductor architecture by deleting the old service records in the cell databases, starting with cell1. New records are registered with different hostnames provided by the Compute service operator. All Compute services, except the Compute agent, have no internal state, and their service records can be safely deleted. You also need to rename the former default cell to cell1. +
    +
    +
  8. +
  9. +

    Create a dump of the original databases:

    +
    +
    +
    $ oc rsh mariadb-copy-data << EOF
    +  mysql -h"${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" \
    +  -N -e "show databases" | grep -E -v "schema|mysql|gnocchi|aodh" | \
    +  while read dbname; do
    +    echo "Dumping \${dbname}";
    +    mysqldump -h"${SOURCE_MARIADB_IP}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD}" \
    +      --single-transaction --complete-insert --skip-lock-tables --lock-tables=0 \
    +      "\${dbname}" > /backup/"\${dbname}".sql;
    +   done
    +EOF
    +
    +
    +
  10. +
  11. +

    Restore the databases from .sql files into the control plane MariaDB:

    +
    +
    +
    $ oc rsh mariadb-copy-data << EOF
    +  # db schemas to rename on import
    +  declare -A db_name_map
    +  db_name_map['nova']='nova_cell1'
    +  db_name_map['ovs_neutron']='neutron'
    +  db_name_map['ironic-inspector']='ironic_inspector'
    +
    +  # db servers to import into
    +  declare -A db_server_map
    +  db_server_map['default']=${PODIFIED_MARIADB_IP}
    +  db_server_map['nova_cell1']=${PODIFIED_CELL1_MARIADB_IP}
    +
    +  # db server root password map
    +  declare -A db_server_password_map
    +  db_server_password_map['default']=${PODIFIED_DB_ROOT_PASSWORD}
    +  db_server_password_map['nova_cell1']=${PODIFIED_DB_ROOT_PASSWORD}
    +
    +  cd /backup
    +  for db_file in \$(ls *.sql); do
    +    db_name=\$(echo \${db_file} | awk -F'.' '{ print \$1; }')
    +    if [[ -v "db_name_map[\${db_name}]" ]]; then
    +      echo "renaming \${db_name} to \${db_name_map[\${db_name}]}"
    +      db_name=\${db_name_map[\${db_name}]}
    +    fi
    +    db_server=\${db_server_map["default"]}
    +    if [[ -v "db_server_map[\${db_name}]" ]]; then
    +      db_server=\${db_server_map[\${db_name}]}
    +    fi
    +    db_password=\${db_server_password_map['default']}
    +    if [[ -v "db_server_password_map[\${db_name}]" ]]; then
    +      db_password=\${db_server_password_map[\${db_name}]}
    +    fi
    +    echo "creating \${db_name} in \${db_server}"
    +    mysql -h"\${db_server}" -uroot "-p\${db_password}" -e \
    +      "CREATE DATABASE IF NOT EXISTS \${db_name} DEFAULT \
    +      CHARACTER SET ${CHARACTER_SET} DEFAULT COLLATE ${COLLATION};"
    +    echo "importing \${db_name} into \${db_server}"
    +    mysql -h "\${db_server}" -uroot "-p\${db_password}" "\${db_name}" < "\${db_file}"
    +  done
    +
    +  mysql -h "\${db_server_map['default']}" -uroot -p"\${db_server_password_map['default']}" -e \
    +    "update nova_api.cell_mappings set name='cell1' where name='default';"
    +  mysql -h "\${db_server_map['nova_cell1']}" -uroot -p"\${db_server_password_map['nova_cell1']}" -e \
    +    "delete from nova_cell1.services where host not like '%nova-cell1-%' and services.binary != 'nova-compute';"
    +EOF
    +
    +
    +
  12. +
+
+
+
Verification
+

Compare the following outputs with the topology-specific service configuration. +For more information, see Retrieving topology-specific service configuration.

+
+
+
    +
  1. +

    Check that the databases are imported correctly:

    +
    +
    +
    . ~/.source_cloud_exported_variables
    +
    +# use 'oc exec' and 'mysql -rs' to maintain formatting
    +dbs=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;')
    +echo $dbs | grep -Eq '\bkeystone\b' && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure neutron db is renamed from ovs_neutron
    +echo $dbs | grep -Eq '\bneutron\b' && echo "OK" || echo "CHECK FAILED"
    +echo $PULL_OPENSTACK_CONFIGURATION_DATABASES | grep -Eq '\bovs_neutron\b' && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure nova cell1 db is extracted to a separate db server and renamed from nova to nova_cell1
    +c1dbs=$(oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" -e 'SHOW databases;')
    +echo $c1dbs | grep -Eq '\bnova_cell1\b' && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure default cell renamed to cell1, and the cell UUIDs retained intact
    +novadb_mapped_cells=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" \
    +  nova_api -e 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')
    +uuidf='\S{8,}-\S{4,}-\S{4,}-\S{4,}-\S{12,}'
    +left_behind=$(comm -23 \
    +  <(echo $PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS | grep -oE " $uuidf \S+") \
    +  <(echo $novadb_mapped_cells | tr -s "| " " " | grep -oE " $uuidf \S+"))
    +changed=$(comm -13 \
    +  <(echo $PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS | grep -oE " $uuidf \S+") \
    +  <(echo $novadb_mapped_cells | tr -s "| " " " | grep -oE " $uuidf \S+"))
    +test $(grep -Ec ' \S+$' <<<$left_behind) -eq 1 && echo "OK" || echo "CHECK FAILED"
    +default=$(grep -E ' default$' <<<$left_behind)
    +test $(grep -Ec ' \S+$' <<<$changed) -eq 1 && echo "OK" || echo "CHECK FAILED"
    +grep -qE " $(awk '{print $1}' <<<$default) cell1$" <<<$changed && echo "OK" || echo "CHECK FAILED"
    +
    +# ensure the registered Compute service name has not changed
    +novadb_svc_records=$(oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot "-p$PODIFIED_DB_ROOT_PASSWORD" \
    +  nova_cell1 -e "select host from services where services.binary='nova-compute' order by host asc;")
    +diff -Z <(echo $novadb_svc_records) <(echo $PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES) && echo "OK" || echo "CHECK FAILED"
    +
    +
    +
  2. +
  3. +

    Delete the mariadb-data pod and the mariadb-copy-data persistent volume claim that contains the database backup:

    +
    + + + + + +
    + + +Consider taking a snapshot of them before deleting. +
    +
    +
    +
    +
    $ oc delete pod mariadb-copy-data
    +$ oc delete pvc mariadb-data
    +
    +
    +
  4. +
+
+
+ + + + + +
+ + +During the pre-checks and post-checks, the mariadb-client pod might return a pod security warning related to the restricted:latest security context constraint. This warning is due to default security context constraints and does not prevent the admission controller from creating a pod. You see a warning for the short-lived pod, but it does not interfere with functionality. +For more information, see About pod security standards and warnings. +
+
+
+
+

Migrating OVN data

+
+

Migrate the data in the OVN databases from the original OpenStack deployment to ovsdb-server instances that are running in the OpenShift cluster.

+
+
+
Prerequisites
+
    +
  • +

    The OpenStackControlPlane resource is created.

    +
  • +
  • +

    NetworkAttachmentDefinition custom resources (CRs) for the original cluster are defined. Specifically, the internalapi network is defined.

    +
  • +
  • +

    The original Networking service (neutron) and OVN northd are not running (a quick status check is sketched after this list).

    +
  • +
  • +

    There is network routability between the control plane services and the adopted cluster.

    +
  • +
  • +

    The cloud is migrated to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver.

    +
  • +
  • +

    Define the following shell variables. Replace the example values with values that are correct for your environment:

    +
    +
    +
    STORAGE_CLASS_NAME=crc-csi-hostpath-provisioner
    +OVSDB_IMAGE=quay.io/podified-antelope-centos9/openstack-ovn-base:current-podified
    +SOURCE_OVSDB_IP=172.17.0.100
    +
    +
    +
    +

    To get the value to set SOURCE_OVSDB_IP, query the puppet-generated configurations in a Controller node:

    +
    +
    +
    +
    $ grep -rI 'ovn_[ns]b_conn' /var/lib/config-data/puppet-generated/
    +
    +
    +
  • +
+
+
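A quick way to confirm that these services are stopped on the source Controller nodes is the following sketch. It reuses the CONTROLLERn_SSH variables that are used elsewhere in this document, and it assumes the usual TripleO-managed systemd unit names, which might differ in your deployment:
+
for i in {1..3}; do
+  SSH_CMD=CONTROLLER${i}_SSH
+  if [ -n "${!SSH_CMD}" ]; then
+    echo "Controller $i:"
+    # "inactive" (or "unknown") is the expected result for both units
+    ${!SSH_CMD} sudo systemctl is-active tripleo_neutron_api.service tripleo_ovn_cluster_northd.service || true
+  fi
+done
+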
+
Procedure
+
    +
  1. +

    Prepare a temporary PersistentVolumeClaim and the helper pod for the OVN backup. Adjust the storage request for a large database, if needed:

    +
    +
    +
    $ oc apply -f - <<EOF
    +---
    +apiVersion: cert-manager.io/v1
    +kind: Certificate
    +metadata:
    +  name: ovn-data-cert
    +  namespace: openstack
    +spec:
    +  commonName: ovn-data-cert
    +  secretName: ovn-data-cert
    +  issuerRef:
    +    name: rootca-internal
    +---
    +apiVersion: v1
    +kind: PersistentVolumeClaim
    +metadata:
    +  name: ovn-data
    +spec:
    +  storageClassName: $STORAGE_CLASS_NAME
    +  accessModes:
    +    - ReadWriteOnce
    +  resources:
    +    requests:
    +      storage: 10Gi
    +---
    +apiVersion: v1
    +kind: Pod
    +metadata:
    +  name: ovn-copy-data
    +  annotations:
    +    openshift.io/scc: anyuid
    +    k8s.v1.cni.cncf.io/networks: internalapi
    +  labels:
    +    app: adoption
    +spec:
    +  containers:
    +  - image: $OVSDB_IMAGE
    +    command: [ "sh", "-c", "sleep infinity"]
    +    name: adoption
    +    volumeMounts:
    +    - mountPath: /backup
    +      name: ovn-data
    +    - mountPath: /etc/pki/tls/misc
    +      name: ovn-data-cert
    +      readOnly: true
    +  securityContext:
    +    allowPrivilegeEscalation: false
    +    capabilities:
    +      drop: ALL
    +    runAsNonRoot: true
    +    seccompProfile:
    +      type: RuntimeDefault
    +  volumes:
    +  - name: ovn-data
    +    persistentVolumeClaim:
    +      claimName: ovn-data
    +  - name: ovn-data-cert
    +    secret:
    +      secretName: ovn-data-cert
    +EOF
    +
    +
    +
  2. +
  3. +

    Wait for the pod to be ready:

    +
    +
    +
    $ oc wait --for=condition=Ready pod/ovn-copy-data --timeout=30s
    +
    +
    +
  4. +
  5. +

    Back up your OVN databases:

    +
    +
      +
    • +

      If you did not enable TLS everywhere, run the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"
      +
      +
      +
    • +
    • +

      If you enabled TLS everywhere, run the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"
      +
      +
      +
    • +
    +
    +
  6. +
  7. +

    Start the control plane OVN database services prior to import, with northd and ovn-controller disabled:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack-galera-network-isolation --type=merge --patch '
    +spec:
    +  ovn:
    +    enabled: true
    +    template:
    +      ovnDBCluster:
    +        ovndbcluster-nb:
    +          dbType: NB
    +          storageRequest: 10G
    +          networkAttachment: internalapi
    +        ovndbcluster-sb:
    +          dbType: SB
    +          storageRequest: 10G
    +          networkAttachment: internalapi
    +      ovnNorthd:
    +        replicas: 0
    +      ovnController:
    +        networkAttachment: tenant
    +        nodeSelector:
    +          node: non-existing-node-name
    +'
    +
    +
    +
  8. +
  9. +

    Wait for the OVN database services to reach the Running phase:

    +
    +
    +
    $ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-nb
    +$ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-sb
    +
    +
    +
  10. +
  11. +

    Fetch the OVN database IP addresses on the clusterIP service network:

    +
    +
    +
    PODIFIED_OVSDB_NB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-nb-0" -ojsonpath='{.items[0].spec.clusterIP}')
    +PODIFIED_OVSDB_SB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-sb-0" -ojsonpath='{.items[0].spec.clusterIP}')
    +
    +
    +
  12. +
  13. +

    Upgrade the database schema for the backup files:

    +
    +
      +
    1. +

      If you did not enable TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"
      +
      +
      +
    2. +
    3. +

      If you enabled TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"
      +
      +
      +
    4. +
    +
    +
  14. +
  15. +

    Restore the database backup to the new OVN database servers:

    +
    +
      +
    1. +

      If you did not enable TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"
      +
      +
      +
    2. +
    3. +

      If you enabled TLS everywhere, use the following command:

      +
      +
      +
      $ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db"
      +$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"
      +
      +
      +
    4. +
    +
    +
  16. +
  17. +

    Check that the data was successfully migrated by running the following commands against the new database servers, for example:

    +
    +
    +
    $ oc exec -it ovsdbserver-nb-0 -- ovn-nbctl show
    +$ oc exec -it ovsdbserver-sb-0 -- ovn-sbctl list Chassis
    +
    +
    +
  18. +
  19. +

    Start the control plane ovn-northd service to keep both OVN databases in sync:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack-galera-network-isolation --type=merge --patch '
    +spec:
    +  ovn:
    +    enabled: true
    +    template:
    +      ovnNorthd:
    +        replicas: 1
    +'
    +
    +
    +
  20. +
  21. +

    If you are running OVN gateway services on OCP nodes, enable the control plane ovn-controller service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack-galera-network-isolation --type=json -p="[{'op': 'remove', 'path': '/spec/ovn/template/ovnController/nodeSelector'}]"
    +
    +
    +
    + + + + + +
    + + +Running OVN gateways on OCP nodes might be prone to data plane downtime during Open vSwitch upgrades. Consider running OVN gateways on dedicated Networker data plane nodes for production deployments instead. +
    +
    +
  22. +
  23. +

    Delete the ovn-data helper pod and the temporary PersistentVolumeClaim that is used to store OVN database backup files:

    +
    +
    +
    $ oc delete pod ovn-copy-data
    +$ oc delete pvc ovn-data
    +
    +
    +
    + + + + + +
    + + +Consider taking a snapshot of the ovn-data helper pod and the temporary PersistentVolumeClaim before deleting them. For more information, see About volume snapshots in OpenShift Container Platform storage overview. +
    +
    +
  24. +
  25. +

    Stop the adopted OVN database servers:

    +
    +
    +
    ServicesToStop=("tripleo_ovn_cluster_north_db_server.service"
    +                "tripleo_ovn_cluster_south_db_server.service")
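    +# CONTROLLER1_SSH, CONTROLLER2_SSH, and CONTROLLER3_SSH hold the SSH commands for the source Controller nodes, as defined earlier in this document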
    +
    +echo "Stopping systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            echo "Stopping the $service in controller $i"
    +            if ${!SSH_CMD} sudo systemctl is-active $service; then
    +                ${!SSH_CMD} sudo systemctl stop $service
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Checking systemd OpenStack services"
    +for service in ${ServicesToStop[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then
    +                echo "ERROR: Service $service still running on controller $i"
    +            else
    +                echo "OK: Service $service is not running on controller $i"
    +            fi
    +        fi
    +    done
    +done
    +
    +
    +
  26. +
+
+
+
+
+
+

Adopting OpenStack control plane services

+
+
+

Adopt your OpenStack Wallaby control plane services to deploy them in the Red Hat OpenStack Services on OpenShift (RHOSO) Antelope control plane.

+
+
+

Adopting the Identity service

+
+

To adopt the Identity service (keystone), you patch an existing OpenStackControlPlane custom resource (CR) where the Identity service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

+
+
+
Prerequisites
+
    +
  • +

    Create the keystone secret that includes the Fernet keys that were copied from the OSP environment:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +data:
    +  CredentialKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/0 | base64 -w 0)
    +  CredentialKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/1 | base64 -w 0)
    +  FernetKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/0 | base64 -w 0)
    +  FernetKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/1 | base64 -w 0)
    +kind: Secret
    +metadata:
    +  name: keystone
    +  namespace: openstack
    +type: Opaque
    +EOF
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane CR to deploy the Identity service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  keystone:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      override:
    +        service:
    +          internal:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +      databaseInstance: openstack
    +      secret: osp-secret
    +'
    +
    +
    +
  2. +
  3. +

    Create an alias to use the openstack command in the Red Hat OpenStack Services on OpenShift (RHOSO) deployment:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +
    +
  4. +
  5. +

    Remove services and endpoints that still point to the OSP +control plane, excluding the Identity service and its endpoints:

    +
    +
    +
    $ openstack endpoint list | grep keystone | awk '/admin/{ print $2; }' | xargs ${BASH_ALIASES[openstack]} endpoint delete || true
    +
    +for service in aodh heat heat-cfn barbican cinderv3 glance gnocchi manila manilav2 neutron nova placement swift ironic-inspector ironic; do
    +  openstack service list | awk "/ $service /{ print \$2; }" | xargs -r ${BASH_ALIASES[openstack]} service delete || true
    +done
    +
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Confirm that the Identity service endpoints are defined and are pointing to the control plane FQDNs:

    +
    +
    +
    $ openstack endpoint list | grep keystone
    +
    +
    +
  • +
+
+
+
+

Adopting the Key Manager service

+
+

To adopt the Key Manager service (barbican), you patch an existing OpenStackControlPlane custom resource (CR) where the Key Manager service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

+
+
+

The Key Manager service adoption is complete if you see the following results:

+
+
+
    +
  • +

    The BarbicanAPI, BarbicanWorker, and BarbicanKeystoneListener services are up and running (a quick pod check is sketched just before the procedure below).

    +
  • +
  • +

    Keystone endpoints are updated, and the same crypto plugin of the source cloud is available.

    +
  • +
+
+
+ + + + + +
+ + +This procedure configures the Key Manager service to use the simple_crypto back end. Additional back ends, such as PKCS11 and DogTag, are currently not supported in Red Hat OpenStack Services on OpenShift (RHOSO). +
+
+
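After you apply the patch in the following procedure, a quick way to confirm that the three components are up is a label-based pod listing. This is a sketch that assumes the same service=<service-name> label convention that is used for other services in this document:
+
$ oc get pods -l service=barbican
+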
+
Procedure
+
    +
  1. +

    Add the kek secret:

    +
    +
    +
    $ oc set data secret/osp-secret "BarbicanSimpleCryptoKEK=$($CONTROLLER1_SSH "python3 -c \"import configparser; c = configparser.ConfigParser(); c.read('/var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf'); print(c['simple_crypto_plugin']['kek'])\"")"
    +
    +
    +
  2. +
  3. +

    Patch the OpenStackControlPlane CR to deploy the Key Manager service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  barbican:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: barbican
    +      rabbitMqClusterName: rabbitmq
    +      secret: osp-secret
    +      simpleCryptoBackendSecret: osp-secret
    +      serviceAccount: barbican
    +      serviceUser: barbican
    +      passwordSelectors:
    +        service: BarbicanPassword
    +        simplecryptokek: BarbicanSimpleCryptoKEK
    +      barbicanAPI:
    +        replicas: 1
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      barbicanWorker:
    +        replicas: 1
    +      barbicanKeystoneListener:
    +        replicas: 1
    +'
    +
    +
    +
  4. +
+
+
+
Verification
+
    +
  • +

    Ensure that the Identity service (keystone) endpoints are defined and are pointing to the control plane FQDNs:

    +
    +
    +
    $ openstack endpoint list | grep key-manager
    +
    +
    +
  • +
  • +

    Ensure that Barbican API service is registered in the Identity service:

    +
    +
    +
    $ openstack service list | grep key-manager
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep key-manager
    +
    +
    +
  • +
  • +

    List the secrets:

    +
    +
    +
    $ openstack secret list
    +
    +
    +
  • +
+
+
+
+

Adopting the Networking service

+
+

To adopt the Networking service (neutron), you patch an existing OpenStackControlPlane custom resource (CR) that has the Networking service disabled. The patch starts the service with the +configuration parameters that are provided by the OpenStack (OSP) environment.

+
+
+

The Networking service adoption is complete if you see the following results:

+
+
+
    +
  • +

    The NeutronAPI service is running.

    +
  • +
  • +

    The Identity service (keystone) endpoints are updated, and the same back end of the source cloud is available.

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Ensure that a Single Node OpenShift or OpenShift Local cluster is running.

    +
  • +
  • +

    Adopt the Identity service. For more information, see Adopting the Identity service.

    +
  • +
  • +

    Migrate your OVN databases to ovsdb-server instances that run in the OpenShift cluster. For more information, see Migrating OVN data.

    +
  • +
+
+
+
Procedure
+

The Networking service adoption follows a similar pattern to the Identity service (keystone) adoption.

+
+
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Networking service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  neutron:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      override:
    +        service:
    +          internal:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +      databaseInstance: openstack
    +      databaseAccount: neutron
    +      secret: osp-secret
    +      networkAttachments:
    +      - internalapi
    +'
    +
    +
    +
  • +
+
+
+
Verification
+
    +
  • +

    Inspect the resulting Networking service pods:

    +
    +
    +
    NEUTRON_API_POD=`oc get pods -l service=neutron | tail -n 1 | cut -f 1 -d' '`
    +oc exec -t $NEUTRON_API_POD -c neutron-api -- cat /etc/neutron/neutron.conf
    +
    +
    +
  • +
  • +

    Ensure that the Neutron API service is registered in the Identity service:

    +
    +
    +
    $ openstack service list | grep network
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep network
    +
    +| 6a805bd6c9f54658ad2f24e5a0ae0ab6 | regionOne | neutron      | network      | True    | public    | http://neutron-public-openstack.apps-crc.testing  |
    +| b943243e596847a9a317c8ce1800fa98 | regionOne | neutron      | network      | True    | internal  | http://neutron-internal.openstack.svc:9696        |
    +
    +
    +
  • +
  • +

    Create sample resources so that you can test whether the user can create networks, subnets, ports, or routers (a port creation example follows this list):

    +
    +
    +
    $ openstack network create net
    +$ openstack subnet create --network net --subnet-range 10.0.0.0/24 subnet
    +$ openstack router create router
    +
    +
    +
  • +
+
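To also exercise port creation, which is mentioned in the step above but not shown, a minimal sketch (the network name matches the sample network created above; the port name is illustrative):
+
$ openstack port create --network net test-port
+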
+
+
+

Adopting the Object Storage service

+
+

If you are using Object Storage as a service, adopt the Object Storage service (swift) to the Red Hat OpenStack Services on OpenShift (RHOSO) environment. If you are using the Object Storage API of the Ceph Object Gateway (RGW), skip the following procedure.

+
+
+
Prerequisites
+
    +
  • +

    The Object Storage service storage back-end services are running in the OpenStack (OSP) deployment.

    +
  • +
  • +

    The storage network is properly configured on the OpenShift cluster. For more information, see Configuring the data plane network in Deploying Red Hat OpenStack Services on OpenShift.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create the swift-conf secret that includes the Object Storage service hash path suffix and prefix:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: Secret
    +metadata:
    +  name: swift-conf
    +  namespace: openstack
    +type: Opaque
    +data:
    +  swift.conf: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/swift/etc/swift/swift.conf | base64 -w0)
    +EOF
    +
    +
    +
  2. +
  3. +

    Create the swift-ring-files ConfigMap that includes the Object Storage service ring files:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: swift-ring-files
    +binaryData:
    +  swiftrings.tar.gz: $($CONTROLLER1_SSH "cd /var/lib/config-data/puppet-generated/swift/etc/swift && tar cz *.builder *.ring.gz backups/ | base64 -w0")
    +  account.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/account.ring.gz")
    +  container.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/container.ring.gz")
    +  object.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/object.ring.gz")
    +EOF
    +
    +
    +
  4. +
  5. +

    Patch the OpenStackControlPlane custom resource to deploy the Object Storage service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  swift:
    +    enabled: true
    +    template:
    +      memcachedInstance: memcached
    +      swiftRing:
    +        ringReplicas: 1
    +      swiftStorage:
    +        replicas: 0
    +        networkAttachments:
    +        - storage
    +        storageClass: local-storage (1)
    +        storageRequest: 10Gi
    +      swiftProxy:
    +        secret: osp-secret
    +        replicas: 1
    +        passwordSelectors:
    +          service: SwiftPassword
    +        serviceUser: swift
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +        networkAttachments: (2)
    +        - storage
    +'
    +
    +
    +
    + + + + + + + + + +
    1. Must match the RHOSO deployment storage class.
    2. Must match the network attachment for the previous Object Storage service configuration from the OSP deployment.
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Inspect the resulting Object Storage service pods:

    +
    +
    +
    $ oc get pods -l component=swift-proxy
    +
    +
    +
  • +
  • +

    Verify that the Object Storage proxy service is registered in the Identity service (keystone):

    +
    +
    +
    $ openstack service list | grep swift
    +| b5b9b1d3c79241aa867fa2d05f2bbd52 | swift    | object-store |
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep swift
    +| 32ee4bd555414ab48f2dc90a19e1bcd5 | regionOne | swift        | object-store | True    | public    | https://swift-public-openstack.apps-crc.testing/v1/AUTH_%(tenant_id)s |
    +| db4b8547d3ae4e7999154b203c6a5bed | regionOne | swift        | object-store | True    | internal  | http://swift-internal.openstack.svc:8080/v1/AUTH_%(tenant_id)s        |
    +
    +
    +
  • +
  • +

    Verify that you are able to upload and download objects:

    +
    +
    +
    openstack container create test
    ++---------------------------------------+-----------+------------------------------------+
    +| account                               | container | x-trans-id                         |
    ++---------------------------------------+-----------+------------------------------------+
    +| AUTH_4d9be0a9193e4577820d187acdd2714a | test      | txe5f9a10ce21e4cddad473-0065ce41b9 |
    ++---------------------------------------+-----------+------------------------------------+
    +
    +openstack object create test --name obj <(echo "Hello World!")
    ++--------+-----------+----------------------------------+
    +| object | container | etag                             |
    ++--------+-----------+----------------------------------+
    +| obj    | test      | d41d8cd98f00b204e9800998ecf8427e |
    ++--------+-----------+----------------------------------+
    +
    +openstack object save test obj --file -
    +Hello World!
    +
    +
    +
  • +
+
+
+ + + + + +
+ + +The Object Storage data is still stored on the existing OSP nodes. For more information about migrating the actual data from the OSP deployment to the RHOSO deployment, see Migrating the Object Storage service (swift) data from OSP to Red Hat OpenStack Services on OpenShift (RHOSO) nodes. +
+
+
+
+

Adopting the Image service

+
+

To adopt the Image Service (glance), you patch an existing OpenStackControlPlane custom resource (CR) that has the Image service disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

+
+
+

The Image service adoption is complete if you see the following results:

+
+
+
    +
  • +

    The GlanceAPI service is up and running.

    +
  • +
  • +

    The Identity service endpoints are updated, and the same back end of the source cloud is available.

    +
  • +
+
+
+

To complete the Image service adoption, ensure that your environment meets the following criteria:

+
+
+
    +
  • +

    You have a running TripleO environment (the source cloud).

    +
  • +
  • +

    You have a Single Node OpenShift or OpenShift Local cluster that is running.

    +
  • +
  • +

    Optional: An internal or external Ceph cluster is reachable from both the CRC and TripleO environments.

    +
  • +
+
+
+

Adopting the Image service that is deployed with an Object Storage service back end

+
+

Adopt the Image Service (glance) that you deployed with an Object Storage service (swift) back end in the OpenStack (OSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the object storage back end:

+
+
+
+
..
+spec
+  glance:
+   ...
+      customServiceConfig: |
+          [DEFAULT]
+          enabled_backends = default_backend:swift
+          [glance_store]
+          default_backend = default_backend
+          [default_backend]
+          swift_store_create_container_on_put = True
+          swift_store_auth_version = 3
+          swift_store_auth_address = {{ .KeystoneInternalURL }}
+          swift_store_endpoint_type = internalURL
+          swift_store_user = service:glance
+          swift_store_key = {{ .ServicePassword }}
+
+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a new file, for example, glance_swift.patch, and include the following content:

    +
    +
    +
    spec:
    +  glance:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      secret: osp-secret
    +      databaseInstance: openstack
    +      storage:
    +        storageRequest: 10G
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends = default_backend:swift
    +        [glance_store]
    +        default_backend = default_backend
    +        [default_backend]
    +        swift_store_create_container_on_put = True
    +        swift_store_auth_version = 3
    +        swift_store_auth_address = {{ .KeystoneInternalURL }}
    +        swift_store_endpoint_type = internalURL
    +        swift_store_user = service:glance
    +        swift_store_key = {{ .ServicePassword }}
    +      glanceAPIs:
    +        default:
    +          replicas: 1
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +            - storage
    +
    +
    +
    + + + + + +
    + + +The Object Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured with the Object Storage service that is not available in the OpenStackControlPlane custom resource. +After the Object Storage service, and in particular SwiftProxy, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Object Storage service. +
    +
    +
  2. +
  3. +

    Verify that SwiftProxy is available:

    +
    +
    +
    $ oc get pod -l component=swift-proxy | grep Running
    +swift-proxy-75cb47f65-92rxq   3/3     Running   0
    +
    +
    +
  4. +
  5. +

    Patch the GlanceAPI service that is deployed in the control plane:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_swift.patch
    +
    +
    +
  6. +
+
+
+
+

Adopting the Image service that is deployed with a Block Storage service back end

+
+

Adopt the Image Service (glance) that you deployed with a Block Storage service (cinder) back end in the OpenStack (OSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the block storage back end:

+
+
+
+
..
+spec
+  glance:
+   ...
+      customServiceConfig: |
+          [DEFAULT]
+          enabled_backends = default_backend:cinder
+          [glance_store]
+          default_backend = default_backend
+          [default_backend]
+          rootwrap_config = /etc/glance/rootwrap.conf
+          description = Default cinder backend
+          cinder_store_auth_address = {{ .KeystoneInternalURL }}
+          cinder_store_user_name = {{ .ServiceUser }}
+          cinder_store_password = {{ .ServicePassword }}
+          cinder_store_project_name = service
+          cinder_catalog_info = volumev3::internalURL
+          cinder_use_multipath = true
+
+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a new file, for example glance_cinder.patch, and include the following content:

    +
    +
    +
    spec:
    +  glance:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      secret: osp-secret
    +      databaseInstance: openstack
    +      storage:
    +        storageRequest: 10G
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends = default_backend:cinder
    +        [glance_store]
    +        default_backend = default_backend
    +        [default_backend]
    +        rootwrap_config = /etc/glance/rootwrap.conf
    +        description = Default cinder backend
    +        cinder_store_auth_address = {{ .KeystoneInternalURL }}
    +        cinder_store_user_name = {{ .ServiceUser }}
    +        cinder_store_password = {{ .ServicePassword }}
    +        cinder_store_project_name = service
    +        cinder_catalog_info = volumev3::internalURL
    +        cinder_use_multipath = true
    +      glanceAPIs:
    +        default:
    +          replicas: 1
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +            - storage
    +
    +
    +
    + + + + + +
    + + +The Block Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured with the Block Storage service that is not available in the OpenStackControlPlane custom resource. +After the Block Storage service, and in particular CinderVolume, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Block Storage service. +
    +
    +
  2. +
  3. +

    Verify that CinderVolume is available:

    +
    +
    +
    $ oc get pod -l component=cinder-volume | grep Running
    +cinder-volume-75cb47f65-92rxq   3/3     Running   0
    +
    +
    +
  4. +
  5. +

    Patch the GlanceAPI service that is deployed in the control plane:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_cinder.patch
    +
    +
    +
  6. +
+
+
+
+

Adopting the Image service that is deployed with an NFS back end

+
+

Adopt the Image Service (glance) that you deployed with an NFS back end. To complete the following procedure, ensure that your environment meets the following criteria:

+
+
+
    +
  • +

    The Storage network is propagated to the OpenStack (OSP) control plane.

    +
  • +
  • +

    The Image service can reach the Storage network and connect to the nfs-server through port 2049 (a connectivity check is sketched after this list).

    +
  • +
+
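One way to confirm this reachability before the adoption is to run a short-lived pod that is attached to the storage network and test the TCP port from bash. The following is only a sketch: the pod name is arbitrary, <nfs_server_ip> is a placeholder, and the $OVSDB_IMAGE variable defined earlier in this guide is reused only because it provides a convenient image with bash:
+
$ oc apply -f - <<EOF
+apiVersion: v1
+kind: Pod
+metadata:
+  name: nfs-port-check
+  annotations:
+    k8s.v1.cni.cncf.io/networks: storage
+spec:
+  restartPolicy: Never
+  containers:
+  - name: check
+    image: $OVSDB_IMAGE
+    command:
+    - bash
+    - -c
+    - 'timeout 5 bash -c "</dev/tcp/<nfs_server_ip>/2049" && echo "port 2049 reachable" || echo "port 2049 NOT reachable"'
+EOF
+
+$ oc logs -f nfs-port-check
+$ oc delete pod nfs-port-check
+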
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
  • +

    In the source cloud, verify the NFS parameters that the overcloud uses to configure the Image service back end. Specifically, in your TripleO heat templates, find the following variables that override the default content that is provided by the glance-nfs.yaml file in the +/usr/share/openstack-tripleo-heat-templates/environments/storage directory:

    +
    +
    +
    GlanceBackend: file
    +
    +GlanceNfsEnabled: true
    +
    +GlanceNfsShare: 192.168.24.1:/var/nfs
    +
    +
    +
    + + + + + +
    + + +
    +

    In this example, the GlanceBackend variable shows that the Image service has no notion of an NFS back end: it uses the File driver and, behind the scenes, the filesystem_store_datadir option, which is mapped to the export provided by the GlanceNfsShare variable instead of /var/lib/glance/images/. +If you do not export the GlanceNfsShare through a network that is propagated to the adopted Red Hat OpenStack Services on OpenShift (RHOSO) control plane, you must stop the nfs-server and remap the export to the storage network (a minimal remap sketch follows this prerequisites list). Before doing so, ensure that the Image service is stopped on the source Controller nodes. +In the control plane, as shown in the network isolation diagram, +the Image service is attached to the Storage network, which is propagated through the associated NetworkAttachmentDefinition custom resource, and the resulting pods already have the right permissions to handle the Image service traffic through this network.

    +
    +
    +

    In a deployed OSP control plane, you can verify that the network mapping matches with what has been deployed in the TripleO-based environment by checking both the NodeNetworkConfigPolicy (nncp) and the NetworkAttachmentDefinition (net-attach-def). The following is an example of the output that you should check in the OpenShift environment to make sure that there are no issues with the propagated networks:

    +
    +
    +
    +
    $ oc get nncp
    +NAME                        STATUS      REASON
    +enp6s0-crc-8cf2w-master-0   Available   SuccessfullyConfigured
    +
    +$ oc get net-attach-def
    +NAME
    +ctlplane
    +internalapi
    +storage
    +tenant
    +
    +$ oc get ipaddresspool -n metallb-system
    +NAME          AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
    +ctlplane      true          false             ["192.168.122.80-192.168.122.90"]
    +internalapi   true          false             ["172.17.0.80-172.17.0.90"]
    +storage       true          false             ["172.18.0.80-172.18.0.90"]
    +tenant        true          false             ["172.19.0.80-172.19.0.90"]
    +
    +
    +
    +
    +
  • +
+
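The following is a minimal sketch of the export remap that is described earlier in this prerequisites list, not a definitive procedure. It assumes the usual TripleO-managed unit name for the source Image service, an /etc/exports-based nfs-server, and placeholder values for the export path and the storage network:
+
# Stop the Image service on the source Controller nodes first
+$CONTROLLER1_SSH sudo systemctl stop tripleo_glance_api.service
+
+# On the host that runs the nfs-server, expose the export on the storage network,
+# for example with an /etc/exports entry such as:
+#   /var/nfs <storage_network_cidr>(rw,sync,no_root_squash)
+# then reload the exports and restart the server
+sudo exportfs -ra
+sudo systemctl restart nfs-server
+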
+
+
Procedure
+
    +
  1. +

    Adopt the Image service and create a new default GlanceAPI instance that is connected with the existing NFS share:

    +
    +
    +
    $ cat << EOF > glance_nfs_patch.yaml
    +
    +spec:
    +  extraMounts:
    +  - extraVol:
    +    - extraVolType: Nfs
    +      mounts:
    +      - mountPath: /var/lib/glance/images
    +        name: nfs
    +      propagation:
    +      - Glance
    +      volumes:
    +      - name: nfs
    +        nfs:
    +          path: <exported_path>
    +          server: <ip_address>
    +    name: r1
    +    region: r1
    +  glance:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends = default_backend:file
    +        [glance_store]
    +        default_backend = default_backend
    +        [default_backend]
    +        filesystem_store_datadir = /var/lib/glance/images/
    +      storage:
    +        storageRequest: 10G
    +      glanceAPIs:
    +        default:
    +          replicas: 0
    +          type: single
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +          - storage
    +EOF
    +
    +
    +
    +
      +
    • +

      Replace <ip_address> with the IP address that you use to reach the nfs-server.

      +
    • +
    • +

      Replace <exported_path> with the exported path in the nfs-server.

      +
    • +
    +
    +
  2. +
  3. +

    Patch the OpenStackControlPlane CR to deploy the Image service with an NFS back end:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_nfs_patch.yaml
    +
    +
    +
  4. +
+
+
+
Verification
+
    +
  • +

    When GlanceAPI is active, confirm that you can see a single API instance:

    +
    +
    +
    $ oc get pods -l service=glance
    +NAME                      READY   STATUS    RESTARTS
    +glance-default-single-0   3/3     Running   0
    +
    +
    +
    +
  • +
  • +

    Ensure that the description of the pod reports the following output:

    +
    +
    +
    Mounts:
    +...
    +  nfs:
    +    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    +    Server:    {{ server ip address }}
    +    Path:      {{ nfs export path }}
    +    ReadOnly:  false
    +...
    +
    +
    +
  • +
  • +

    Check that the mountpoint that points to /var/lib/glance/images is mapped to the expected nfs server ip and nfs path that you defined in the new default GlanceAPI instance:

    +
    +
    +
    $ oc rsh -c glance-api glance-default-single-0
    +
    +sh-5.1# mount
    +...
    +...
    +{{ ip address }}:/var/nfs on /var/lib/glance/images type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.18.0.5,local_lock=none,addr=172.18.0.5)
    +...
    +...
    +
    +
    +
  • +
  • +

    Confirm that the UUID is created in the exported directory on the NFS node. For example:

    +
    +
    +
    $ oc rsh openstackclient
    +$ openstack image list
    +
    +sh-5.1$  curl -L -o /tmp/cirros-0.5.2-x86_64-disk.img http://download.cirros-cloud.net/0.5.2/cirros-0.5.2-x86_64-disk.img
    +...
    +...
    +
    +sh-5.1$ openstack image create --container-format bare --disk-format raw --file /tmp/cirros-0.5.2-x86_64-disk.img cirros
    +...
    +...
    +
    +sh-5.1$ openstack image list
    ++--------------------------------------+--------+--------+
    +| ID                                   | Name   | Status |
    ++--------------------------------------+--------+--------+
    +| 634482ca-4002-4a6d-b1d5-64502ad02630 | cirros | active |
    ++--------------------------------------+--------+--------+
    +
    +
    +
  • +
  • +

    On the nfs-server node, the same uuid is in the exported /var/nfs:

    +
    +
    +
    $ ls /var/nfs/
    +634482ca-4002-4a6d-b1d5-64502ad02630
    +
    +
    +
  • +
+
+
+
+

Adopting the Image service that is deployed with a Ceph back end

+
+

Adopt the Image Service (glance) that you deployed with a Ceph back end. Use the customServiceConfig parameter to inject the right configuration to the GlanceAPI instance.

+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
  • +

    Ensure that the Ceph-related secret (ceph-conf-files) is created in +the openstack namespace and that the extraMounts property of the +OpenStackControlPlane custom resource (CR) is configured properly (a quick check is sketched after this list). For more information, see Configuring a Ceph back end.

    +
    +
    +
    $ cat << EOF > glance_patch.yaml
    +spec:
    +  glance:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      customServiceConfig: |
    +        [DEFAULT]
    +        enabled_backends=default_backend:rbd
    +        [glance_store]
    +        default_backend=default_backend
    +        [default_backend]
    +        rbd_store_ceph_conf=/etc/ceph/ceph.conf
    +        rbd_store_user=openstack
    +        rbd_store_pool=images
    +        store_description=Ceph glance store backend.
    +      storage:
    +        storageRequest: 10G
    +      glanceAPIs:
    +        default:
    +          replicas: 0
    +          override:
    +            service:
    +              internal:
    +                metadata:
    +                  annotations:
    +                    metallb.universe.tf/address-pool: internalapi
    +                    metallb.universe.tf/allow-shared-ip: internalapi
    +                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                spec:
    +                  type: LoadBalancer
    +          networkAttachments:
    +          - storage
    +EOF
    +
    +
    +
  • +
+
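A quick way to confirm that these prerequisites are in place, using standard oc queries and the secret and CR names that are used throughout this guide:
+
$ oc get secret ceph-conf-files -n openstack
+$ oc get openstackcontrolplane openstack -o jsonpath='{.spec.extraMounts}'
+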
+
+ + + + + +
+ + +
+

If you backed up your OpenStack (OSP) services configuration file from the original environment, you can compare it with the configuration file that you adopted and ensure that the configuration is correct. +For more information, see Pulling the configuration from a TripleO deployment.

+
+
+
+
os-diff diff /tmp/collect_tripleo_configs/glance/etc/glance/glance-api.conf glance_patch.yaml --crd
+
+
+
+

This command produces the difference between both ini configuration files.

+
+
+
+
+
Procedure
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Image service with a Ceph back end:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_patch.yaml
    +
    +
    +
  • +
+
+
+
+

Verifying the Image service adoption

+
+

Verify that you adopted the Image Service (glance) to the Red Hat OpenStack Services on OpenShift (RHOSO) Antelope deployment.

+
+
+
Procedure
+
    +
  1. +

    Compare the configuration files to ensure that the expected configuration is applied to the Image service pods:

    +
    +
    +
    $ os-diff diff /etc/glance/glance.conf.d/02-config.conf glance_patch.yaml --frompod -p glance-api
    +
    +
    +
    +

    If no line appears, then the configuration is correct.

    +
    +
  2. +
  3. +

    Inspect the resulting Image service pods:

    +
    +
    +
    GLANCE_POD=`oc get pod |grep glance-default | cut -f 1 -d' ' | head -n 1`
    +oc exec -t $GLANCE_POD -c glance-api -- cat /etc/glance/glance.conf.d/02-config.conf
    +
    +[DEFAULT]
    +enabled_backends=default_backend:rbd
    +[glance_store]
    +default_backend=default_backend
    +[default_backend]
    +rbd_store_ceph_conf=/etc/ceph/ceph.conf
    +rbd_store_user=openstack
    +rbd_store_pool=images
    +store_description=Ceph glance store backend.
    +
    +
    +
  4. +
  5. +

    If you use a Ceph back end, ensure that the Ceph secrets are mounted:

    +
    +
    +
    $ oc exec -t $GLANCE_POD -c glance-api -- ls /etc/ceph
    +ceph.client.openstack.keyring
    +ceph.conf
    +
    +
    +
  6. +
  7. +

    Check that the service is active, and that the endpoints are updated in the OSP CLI:

    +
    +
    +
    $ oc rsh openstackclient -n openstackclient
    +$ openstack service list | grep image
    +
    +| fc52dbffef36434d906eeb99adfc6186 | glance    | image        |
    +
    +$ openstack endpoint list | grep image
    +
    +| 569ed81064f84d4a91e0d2d807e4c1f1 | regionOne | glance       | image        | True    | internal  | http://glance-internal-openstack.apps-crc.testing   |
    +| 5843fae70cba4e73b29d4aff3e8b616c | regionOne | glance       | image        | True    | public    | http://glance-public-openstack.apps-crc.testing     |
    +
    +
    +
  8. +
  9. +

    Check that the images that you previously listed in the source cloud are available in the adopted service:

    +
    +
    +
    $ openstack image list
    ++--------------------------------------+--------+--------+
    +| ID                                   | Name   | Status |
    ++--------------------------------------+--------+--------+
    +| c3158cad-d50b-452f-bec1-f250562f5c1f | cirros | active |
    ++--------------------------------------+--------+--------+
    +
    +
    +
  10. +
  11. +

    Test that you can create an image on the adopted service:

    +
    +
    +
    (openstack)$ alias openstack="oc exec -t openstackclient -- openstack"
    +(openstack)$ curl -L -o /tmp/cirros-0.5.2-x86_64-disk.img http://download.cirros-cloud.net/0.5.2/cirros-0.5.2-x86_64-disk.img
    +    qemu-img convert -O raw /tmp/cirros-0.5.2-x86_64-disk.img /tmp/cirros-0.5.2-x86_64-disk.img.raw
    +    openstack image create --container-format bare --disk-format raw --file /tmp/cirros-0.5.2-x86_64-disk.img.raw cirros2
    +    openstack image list
    +  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
    +                                 Dload  Upload   Total   Spent    Left  Speed
    +100   273  100   273    0     0   1525      0 --:--:-- --:--:-- --:--:--  1533
    +  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    +100 15.5M  100 15.5M    0     0  17.4M      0 --:--:-- --:--:-- --:--:-- 17.4M
    +
    ++------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
    +| Field            | Value                                                                                                                                      |
    ++------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
    +| container_format | bare                                                                                                                                       |
    +| created_at       | 2023-01-31T21:12:56Z                                                                                                                       |
    +| disk_format      | raw                                                                                                                                        |
    +| file             | /v2/images/46a3eac1-7224-40bc-9083-f2f0cd122ba4/file                                                                                       |
    +| id               | 46a3eac1-7224-40bc-9083-f2f0cd122ba4                                                                                                       |
    +| min_disk         | 0                                                                                                                                          |
    +| min_ram          | 0                                                                                                                                          |
    +| name             | cirros                                                                                                                                     |
    +| owner            | 9f7e8fdc50f34b658cfaee9c48e5e12d                                                                                                           |
    +| properties       | os_hidden='False', owner_specified.openstack.md5='', owner_specified.openstack.object='images/cirros', owner_specified.openstack.sha256='' |
    +| protected        | False                                                                                                                                      |
    +| schema           | /v2/schemas/image                                                                                                                          |
    +| status           | queued                                                                                                                                     |
    +| tags             |                                                                                                                                            |
    +| updated_at       | 2023-01-31T21:12:56Z                                                                                                                       |
    +| visibility       | shared                                                                                                                                     |
    ++------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
    +
    ++--------------------------------------+--------+--------+
    +| ID                                   | Name   | Status |
    ++--------------------------------------+--------+--------+
    +| 46a3eac1-7224-40bc-9083-f2f0cd122ba4 | cirros2| active |
    +| c3158cad-d50b-452f-bec1-f250562f5c1f | cirros | active |
    ++--------------------------------------+--------+--------+
    +
    +
    +(openstack)$ oc rsh ceph
    +sh-4.4$ ceph -s
    +  cluster:
    +    id:     432d9a34-9cee-4109-b705-0c59e8973983
    +    health: HEALTH_OK
    +
    +  services:
    +    mon: 1 daemons, quorum a (age 4h)
    +    mgr: a(active, since 4h)
    +    osd: 1 osds: 1 up (since 4h), 1 in (since 4h)
    +
    +  data:
    +    pools:   5 pools, 160 pgs
    +    objects: 46 objects, 224 MiB
    +    usage:   247 MiB used, 6.8 GiB / 7.0 GiB avail
    +    pgs:     160 active+clean
    +
    +sh-4.4$ rbd -p images ls
    +46a3eac1-7224-40bc-9083-f2f0cd122ba4
    +c3158cad-d50b-452f-bec1-f250562f5c1f
    +
    +
    +
  12. +
+
+
+
+
+

Adopting the Placement service

+
+

To adopt the Placement service, you patch an existing OpenStackControlPlane custom resource (CR) that has the Placement service disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

+
+
+
Prerequisites
+ +
+
+
Procedure
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Placement service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  placement:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: placement
    +      secret: osp-secret
    +      override:
    +        service:
    +          internal:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +'
    +
    +
    +
  • +
+
+
+
Verification
+
    +
  • +

    Check that the Placement service endpoints are defined and pointing to the +control plane FQDNs, and that the Placement API responds:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +$ openstack endpoint list | grep placement
    +
    +
    +# Without OpenStack CLI placement plugin installed:
    +PLACEMENT_PUBLIC_URL=$(openstack endpoint list -c 'Service Name' -c 'Service Type' -c URL | grep placement | grep public | awk '{ print $6; }')
    +oc exec -t openstackclient -- curl "$PLACEMENT_PUBLIC_URL"
    +
    +# With OpenStack CLI placement plugin installed:
    +openstack resource class list
    +
    +
    +
  • +
+
+
+
+

Adopting the Compute service

+
+

To adopt the Compute service (nova), you patch an existing OpenStackControlPlane custom resource (CR) where the Compute service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment. The following procedure describes a single-cell setup.

+
+
+
Prerequisites
+
    +
  • +

    You have completed the previous adoption steps.

    +
  • +
  • +

    You have defined the following shell variables. Replace the following example values with the values that are correct for your environment:

    +
  • +
+
+
+
+
$ alias openstack="oc exec -t openstackclient -- openstack"
+
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane CR to deploy the Compute service:

    +
    + + + + + +
    + + +This procedure assumes that Compute service metadata is deployed on the top level and not on each cell level. If the OSP deployment has a per-cell metadata deployment, adjust the following patch as needed. You cannot run the metadata service in cell0. +
    +
    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  nova:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      secret: osp-secret
    +      apiServiceTemplate:
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=true
    +      metadataServiceTemplate:
    +        enabled: true # deploy single nova metadata on the top level
    +        override:
    +          service:
    +            metadata:
    +              annotations:
    +                metallb.universe.tf/address-pool: internalapi
    +                metallb.universe.tf/allow-shared-ip: internalapi
    +                metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +            spec:
    +              type: LoadBalancer
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=true
    +      schedulerServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=true
    +      cellTemplates:
    +        cell0:
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=true
    +        cell1:
    +          metadataServiceTemplate:
    +            enabled: false # enable here to run it in a cell instead
    +            override:
    +                service:
    +                  metadata:
    +                    annotations:
    +                      metallb.universe.tf/address-pool: internalapi
    +                      metallb.universe.tf/allow-shared-ip: internalapi
    +                      metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +                  spec:
    +                    type: LoadBalancer
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=true
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=true
    +'
    +
    +
    +
  2. +
  3. +

    If you are adopting the Compute service with the Bare Metal Provisioning service (ironic), append the following novaComputeTemplates in the cell1 section of the Compute service CR patch:

    +
    +
    +
            cell1:
    +          novaComputeTemplates:
    +            standalone:
    +              customServiceConfig: |
    +                [DEFAULT]
    +                host = <hostname>
    +                [workarounds]
    +                disable_compute_service_check_for_ffu=true
    +
    +
    +
    +
      +
    • +

      Replace <hostname> with the hostname of the node that is running the ironic Compute driver in the source cloud.

      +
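      If you are not sure of this value, you can usually recover it from the source cloud Compute service listing before the source services are stopped. The following is a minimal sketch, run against the source cloud rather than the new control plane:

      +
      +
      +
      # The "Host" column of the nova-compute service that uses the ironic driver
      +# is the value to use for <hostname>.
      +openstack compute service list --service nova-compute -c Host -c State
      +
      +
      +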
    • +
    +
    +
  4. +
  5. +

    Wait for the CRs for the Compute control plane services to be ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s Nova/nova
    +
    +
    +
    + + + + + +
    + + +The local Conductor services are started for each cell, while the superconductor runs in cell0. +Note that disable_compute_service_check_for_ffu is mandatory for all imported Compute services until the external data plane is imported, and until Compute services are fast-forward upgraded. For more information, see Adopting Compute services to the RHOSO data plane and Performing a fast-forward upgrade on Compute services. +
    +
    +
  6. +
+
+
+
Verification
+
    +
  • +

    Check that Compute service endpoints are defined and pointing to the +control plane FQDNs, and that the Nova API responds:

    +
    +
    +
    $ openstack endpoint list | grep nova
    +$ openstack server list
    +
    +
    +
    + +
    +
  • +
  • +

    Query the superconductor to check that cell1 exists, and compare it to pre-adoption values:

    +
    +
    +
    . ~/.source_cloud_exported_variables
    +echo $PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS
    +oc rsh nova-cell0-conductor-0 nova-manage cell_v2 list_cells | grep -F '| cell1 |'
    +
    +
    +
    +

    The following changes are expected:

    +
    +
    +
      +
    • +

      The cell1 nova database and username become nova_cell1.

      +
    • +
    • +

      The default cell is renamed to cell1.

      +
    • +
    • +

      RabbitMQ transport URL no longer uses guest.

      +
    • +
    +
    +
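    To check these values explicitly, you can print the verbose cell listing, which includes the database connection and transport URL columns, and compare it with the saved pre-adoption output:

    +
    +
    +
    # Expect nova_cell1 as the cell1 database name and a non-guest RabbitMQ user
    +# in the cell1 transport URL.
    +oc rsh nova-cell0-conductor-0 nova-manage cell_v2 list_cells --verbose
    +
    +
    +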
  • +
+
+
+ + + + + +
+ + +At this point, the Compute service control plane services do not control the existing Compute service workloads. The control plane manages the data plane only after the data adoption process is completed. For more information, see Adopting Compute services to the RHOSO data plane. +
+
+
+
+

Adopting the Block Storage service

+
+

To adopt a TripleO-deployed Block Storage service (cinder), create the manifest based on the existing cinder.conf file, deploy the Block Storage service, and validate the new deployment.

+
+
+
Prerequisites
+
    +
  • +

    You have reviewed the Block Storage service limitations. For more information, see Limitations for adopting the Block Storage service.

    +
  • +
  • +

    You have planned the placement of the Block Storage services.

    +
  • +
  • +

    You have prepared the OpenShift nodes where the volume and backup services run. For more information, see OCP preparation for Block Storage service adoption.

    +
  • +
  • +

    The Block Storage service (cinder) is stopped.

    +
  • +
  • +

    The service databases are imported into the control plane MariaDB.

    +
  • +
  • +

    The Identity service (keystone) and Key Manager service (barbican) are adopted.

    +
  • +
  • +

    The Storage network is correctly configured on the OCP cluster.

    +
  • +
  • +

    You have the contents of the cinder.conf file. Download the file so that you can access it locally:

    +
    +
    +
    $CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/cinder/etc/cinder/cinder.conf > cinder.conf
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a new file, for example, cinder.patch, and apply the configuration:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=<patch_name>
    +
    +
    +
    +
      +
    • +

      Replace <patch_name> with the name of your patch file.

      +
      +

      The following example shows a cinder.patch file for an RBD deployment:

      +
      +
      +
      +
      spec:
      +  extraMounts:
      +  - extraVol:
      +    - extraVolType: Ceph
      +      mounts:
      +      - mountPath: /etc/ceph
      +        name: ceph
      +        readOnly: true
      +      propagation:
      +      - CinderVolume
      +      - CinderBackup
      +      - Glance
      +      volumes:
      +      - name: ceph
      +        projected:
      +          sources:
      +          - secret:
      +              name: ceph-conf-files
      +  cinder:
      +    enabled: true
      +    apiOverride:
      +      route: {}
      +    template:
      +      databaseInstance: openstack
      +      databaseAccount: cinder
      +      secret: osp-secret
      +      cinderAPI:
      +        override:
      +          service:
      +            internal:
      +              metadata:
      +                annotations:
      +                  metallb.universe.tf/address-pool: internalapi
      +                  metallb.universe.tf/allow-shared-ip: internalapi
      +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
      +              spec:
      +                type: LoadBalancer
      +        replicas: 1
      +        customServiceConfig: |
      +          [DEFAULT]
      +          default_volume_type=tripleo
      +      cinderScheduler:
      +        replicas: 1
      +      cinderBackup:
      +        networkAttachments:
      +        - storage
      +        replicas: 1
      +        customServiceConfig: |
      +          [DEFAULT]
      +          backup_driver=cinder.backup.drivers.ceph.CephBackupDriver
      +          backup_ceph_conf=/etc/ceph/ceph.conf
      +          backup_ceph_user=openstack
      +          backup_ceph_pool=backups
      +      cinderVolumes:
      +        ceph:
      +          networkAttachments:
      +          - storage
      +          replicas: 1
      +          customServiceConfig: |
      +            [tripleo_ceph]
      +            backend_host=hostgroup
      +            volume_backend_name=tripleo_ceph
      +            volume_driver=cinder.volume.drivers.rbd.RBDDriver
      +            rbd_ceph_conf=/etc/ceph/ceph.conf
      +            rbd_user=openstack
      +            rbd_pool=volumes
      +            rbd_flatten_volume_from_snapshot=False
      +            report_discard_supported=True
      +
      +
      +
    • +
    +
    +
  2. +
  3. +

    Retrieve the list of the previous scheduler and backup services:

    +
    +
    +
    $ openstack volume service list
    +
    ++------------------+------------------------+------+---------+-------+----------------------------+
    +| Binary           | Host                   | Zone | Status  | State | Updated At                 |
    ++------------------+------------------------+------+---------+-------+----------------------------+
    +| cinder-backup    | standalone.localdomain | nova | enabled | down  | 2023-06-28T11:00:59.000000 |
    +| cinder-scheduler | standalone.localdomain | nova | enabled | down  | 2023-06-28T11:00:29.000000 |
    +| cinder-volume    | hostgroup@tripleo_ceph | nova | enabled | up    | 2023-06-28T17:00:03.000000 |
    +| cinder-scheduler | cinder-scheduler-0     | nova | enabled | up    | 2023-06-28T17:00:02.000000 |
    +| cinder-backup    | cinder-backup-0        | nova | enabled | up    | 2023-06-28T17:00:01.000000 |
    ++------------------+------------------------+------+---------+-------+----------------------------+
    +
    +
    +
  4. +
  5. +

    Remove services for hosts that are in the down state:

    +
    +
    +
    $ oc exec -it cinder-scheduler-0 -- cinder-manage service remove <service_binary> <service_host>
    +
    +
    +
    +
      +
    • +

      Replace <service_binary> with the name of the binary, for example, cinder-backup.

      +
    • +
    • +

      Replace <service_host> with the host name of the service that is in the down state, for example, standalone.localdomain.

      +
    • +
    +
    +
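    For example, for the down rows in the earlier listing, the removals look like the following. Adjust the binary and host values to match your own output:

    +
    +
    +
    $ oc exec -it cinder-scheduler-0 -- cinder-manage service remove cinder-backup standalone.localdomain
    +$ oc exec -it cinder-scheduler-0 -- cinder-manage service remove cinder-scheduler standalone.localdomain
    +
    +
    +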
  6. +
  7. +

    Apply the DB data migrations:

    +
    + + + + + +
    + + +
    +

    You are not required to run the data migrations at this step, but you must run them before the next upgrade. However, for adoption, it is recommended to run the migrations now to ensure that there are no issues before you run production workloads on the deployment.

    +
    +
    +
    +
    +
    +
    $ oc exec -it cinder-scheduler-0 -- cinder-manage db online_data_migrations
    +
    +
    +
  8. +
+
+
+
Verification
+
    +
  1. +

    Ensure that the openstack alias is defined:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +
    +
  2. +
  3. +

    Confirm that Block Storage service endpoints are defined and pointing to the control plane FQDNs:

    +
    +
    +
    $ openstack endpoint list --service <endpoint>
    +
    +
    +
    +
      +
    • +

      Replace <endpoint> with the name or type of the service whose endpoints you want to confirm.

      +
    • +
    +
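      For example, to confirm the volume v3 endpoints, assuming that the service is registered as cinderv3 as in the service listings later in this guide:

      +
      +
      +
      $ openstack endpoint list --service cinderv3
      +
      +
      +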
    +
  4. +
  5. +

    Confirm that the Block Storage services are running:

    +
    +
    +
    $ openstack volume service list
    +
    +
    +
    + + + + + +
    + + +Cinder API services do not appear in the list. However, if you get a response from the openstack volume service list command, that means at least one of the cinder API services is running. +
    +
    +
  6. +
  7. +

    Confirm that you have your previous volume types, volumes, snapshots, and backups:

    +
    +
    +
    $ openstack volume type list
    +$ openstack volume list
    +$ openstack volume snapshot list
    +$ openstack volume backup list
    +
    +
    +
  8. +
  9. +

    To confirm that the configuration is working, perform the following steps:

    +
    +
      +
    1. +

      Create a volume from an image to check that the connection to Image Service (glance) is working:

      +
      +
      +
      $ openstack volume create --image cirros --bootable --size 1 disk_new
      +
      +
      +
    2. +
    3. +

      Back up the previous attached volume:

      +
      +
      +
      $ openstack --os-volume-api-version 3.47 volume create --backup <backup_name>
      +
      +
      +
      +
        +
      • +

        Replace <backup_name> with the name of your new backup location.

        +
        + + + + + +
        + + +Do not boot a Compute service (nova) instance by using the new volume that you created from the image, and do not try to detach the previous volume, because the Compute service and the Block Storage service are not yet connected. +
        +
        +
      • +
      +
      +
    4. +
    +
    +
  10. +
+
+
+
+

Adopting the Dashboard service

+
+

To adopt the Dashboard service (horizon), you patch an existing OpenStackControlPlane custom resource (CR) that has the Dashboard service disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack environment.

+
+
+
Prerequisites
+ +
+
+
Procedure
+
    +
  • +

    Patch the OpenStackControlPlane CR to deploy the Dashboard service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  horizon:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      memcachedInstance: memcached
    +      secret: osp-secret
    +'
    +
    +
    +
  • +
+
+
+
Verification
+
    +
  1. +

    Verify that the Dashboard service instance is successfully deployed and ready:

    +
    +
    +
    $ oc get horizon
    +
    +
    +
  2. +
  3. +

    Confirm that the Dashboard service is reachable and returns a 200 status code:

    +
    +
    +
    PUBLIC_URL=$(oc get horizon horizon -o jsonpath='{.status.endpoint}')
    +curl --silent --output /dev/stderr --head --write-out "%{http_code}" "$PUBLIC_URL/dashboard/auth/login/?next=/dashboard/" -k | grep 200
    +
    +
    +
  4. +
+
+
+
+

Adopting the Shared File Systems service

+
+

The Shared File Systems service (manila) in Red Hat OpenStack Services on OpenShift (RHOSO) provides a self-service API to create and manage file shares. File shares, or "shares", are built for concurrent read/write access from multiple clients. This makes the Shared File Systems service essential in cloud environments that require ReadWriteMany persistent storage.

+
+
+

File shares in RHOSO require network access. Ensure that the networking in the OpenStack (OSP) Wallaby environment matches the network plans for your new cloud after adoption, so that tenant workloads remain connected to storage during the adoption process. The Shared File Systems service control plane services are not in the data path: shutting down the API, scheduler, and share manager services does not impact access to existing shared file systems.

+
+
+

Typically, storage and storage device management are separate networks. Shared File Systems services only need access to the storage device management network. +For example, if you used a Ceph Storage cluster in the deployment, the "storage" +network refers to the Ceph Storage cluster’s public network, and the Shared File Systems service’s share manager service needs to be able to reach it.

+
+
+

The Shared File Systems service supports the following storage networking scenarios:

+
+
+
    +
  • +

    You can directly control the networking for your respective file shares.

    +
  • +
  • +

    The RHOSO administrator configures the storage networking.

    +
  • +
+
+
+

Guidelines for preparing the Shared File Systems service configuration

+
+

To deploy Shared File Systems service (manila) on the control plane, you must copy the original configuration file from the OpenStack Wallaby deployment. You must review the content in the file to make sure you are adopting the correct configuration for Red Hat OpenStack Services on OpenShift (RHOSO) Antelope. Not all of the content needs to be brought into the new cloud environment.

+
+
+

Review the following guidelines for preparing your Shared File Systems service configuration file for adoption:

+
+
+
    +
  • +

    The Shared File Systems service operator sets up the following configurations and can be ignored:

    +
    +
      +
    • +

      Database-related configuration ([database])

      +
    • +
    • +

      Service authentication (auth_strategy, [keystone_authtoken])

      +
    • +
    • +

      Message bus configuration (transport_url, control_exchange)

      +
    • +
    • +

      The default paste config (api_paste_config)

      +
    • +
    • +

      Inter-service communication configuration ([neutron], [nova], [cinder], [glance], [oslo_messaging_*])

      +
    • +
    +
    +
  • +
  • +

    Ignore the osapi_share_listen configuration. In Red Hat OpenStack Services on OpenShift (RHOSO) Antelope, you rely on OpenShift routes and ingress.

    +
  • +
  • +

    Check for policy overrides. In RHOSO Antelope, the Shared File Systems service ships with a secure default role-based access control (RBAC), and overrides might not be necessary. Review the RBAC defaults by using the Oslo policy generator tool.

    +
  • +
  • +

    If a custom policy is necessary, you must provide it as a ConfigMap. The following example spec illustrates how you can set up a ConfigMap called manila-policy with the contents of a file called policy.yaml:

    +
    +
    +
      spec:
    +    manila:
    +      enabled: true
    +      template:
    +        manilaAPI:
    +          customServiceConfig: |
    +             [oslo_policy]
    +             policy_file=/etc/manila/policy.yaml
    +        extraMounts:
    +        - extraVol:
    +          - extraVolType: Undefined
    +            mounts:
    +            - mountPath: /etc/manila/
    +              name: policy
    +              readOnly: true
    +            propagation:
    +            - ManilaAPI
    +            volumes:
    +            - name: policy
    +              projected:
    +                sources:
    +                - configMap:
    +                    name: manila-policy
    +                    items:
    +                      - key: policy
    +                        path: policy.yaml
    +
    +
    +
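    A minimal sketch of creating that ConfigMap from a local policy.yaml file, assuming the openstack namespace, is:

    +
    +
    +
    # The key name "policy" must match the items key that the projected volume references.
    +$ oc create configmap manila-policy --from-file=policy=policy.yaml -n openstack
    +
    +
    +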
  • +
  • +

    The value of the host option under the [DEFAULT] section must be hostgroup.

    +
  • +
  • +

    To run the Shared File Systems service API service, you must add the enabled_share_protocols option to the customServiceConfig section in manila: template: manilaAPI.

    +
  • +
  • +

    If you have scheduler overrides, add them to the customServiceConfig +section in manila: template: manilaScheduler.

    +
  • +
  • +

    If you have multiple storage back-end drivers configured with OSP Wallaby, you need to split them up when deploying RHOSO Antelope. Each storage back-end driver needs to use its own instance of the manila-share service.

    +
  • +
  • +

    If a storage back-end driver needs a custom container image, find it in the Red Hat Ecosystem Catalog, and create or modify an OpenStackVersion custom resource (CR) to specify the custom image, keyed by the same back-end name that you use in the manilaShares definition.

    +
    +

    The following example shows a manila spec from the OpenStackControlPlane CR that includes multiple storage back-end drivers, where only one is using a custom container image:

    +
    +
    +
    +
      spec:
    +    manila:
    +      enabled: true
    +      template:
    +        manilaAPI:
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_protocols = nfs
    +          replicas: 3
    +        manilaScheduler:
    +          replicas: 3
    +        manilaShares:
    +         netapp:
    +           customServiceConfig: |
    +             [DEFAULT]
    +             debug = true
    +             enabled_share_backends = netapp
    +             host = hostgroup
    +             [netapp]
    +             driver_handles_share_servers = False
    +             share_backend_name = netapp
    +             share_driver = manila.share.drivers.netapp.common.NetAppDriver
    +             netapp_storage_family = ontap_cluster
    +             netapp_transport_type = http
    +           replicas: 1
    +         pure:
    +            customServiceConfig: |
    +             [DEFAULT]
    +             debug = true
    +             enabled_share_backends=pure-1
    +             host = hostgroup
    +             [pure-1]
    +             driver_handles_share_servers = False
    +             share_backend_name = pure-1
    +             share_driver = manila.share.drivers.purestorage.flashblade.FlashBladeShareDriver
    +             flashblade_mgmt_vip = 203.0.113.15
    +             flashblade_data_vip = 203.0.10.14
    +            replicas: 1
    +
    +
    +
    +

    The following example shows the OpenStackVersion CR that defines the custom container image:

    +
    +
    +
    +
    apiVersion: core.openstack.org/v1beta1
    +kind: OpenStackVersion
    +metadata:
    +  name: openstack
    +spec:
    +  customContainerImages:
    +    manilaShareImages:
    +      pure: registry.connect.redhat.com/purestorage/openstack-manila-share-pure-rhosp-18-0
    +
    +
    +
    +

    The name of the OpenStackVersion CR must match the name of your OpenStackControlPlane CR.

    +
    +
  • +
  • +

    If you are providing sensitive information, such as passwords, hostnames, and usernames, it is recommended to use OCP secrets and the customServiceConfigSecrets key. You can use customServiceConfigSecrets in any service. If you use third-party storage that requires credentials, create a secret that is referenced in the manila CR/patch file by using the customServiceConfigSecrets key. For example:

    +
    +
      +
    1. +

      Create a file that includes the secrets, for example, netapp_secrets.conf:

      +
      +
      +
      $ cat << __EOF__ > ~/netapp_secrets.conf
      +
      +[netapp]
      +netapp_server_hostname = 203.0.113.10
      +netapp_login = fancy_netapp_user
      +netapp_password = secret_netapp_password
      +netapp_vserver = mydatavserver
      +__EOF__
      +
      +
      +
      +
      +
      $ oc create secret generic osp-secret-manila-netapp --from-file=~/<secret> -n openstack
      +
      +
      +
      +
        +
      • +

        Replace <secret> with the name of the file that includes your secrets, for example, netapp_secrets.conf.

        +
      • +
      +
      +
    2. +
    3. +

      Add the secret to any Shared File Systems service file in the customServiceConfigSecrets section. The following example adds the osp-secret-manila-netapp secret to the manilaShares service:

      +
      +
      +
        spec:
      +    manila:
      +      enabled: true
      +      template:
      +        < . . . >
      +        manilaShares:
      +         netapp:
      +           customServiceConfig: |
      +             [DEFAULT]
      +             debug = true
      +             enabled_share_backends = netapp
      +             host = hostgroup
      +             [netapp]
      +             driver_handles_share_servers = False
      +             share_backend_name = netapp
      +             share_driver = manila.share.drivers.netapp.common.NetAppDriver
      +             netapp_storage_family = ontap_cluster
      +             netapp_transport_type = http
      +           customServiceConfigSecrets:
      +             - osp-secret-manila-netapp
      +           replicas: 1
      +    < . . . >
      +
      +
      +
    4. +
    +
    +
  • +
+
+
+
+

Deploying the Shared File Systems service on the control plane

+
+

Copy the Shared File Systems service (manila) configuration from the OpenStack (OSP) Wallaby deployment, and then deploy the Shared File Systems service on the control plane.

+
+
+
Prerequisites
+
    +
  • +

    The Shared File Systems service systemd services such as api, cron, and scheduler are stopped. For more information, see Stopping OpenStack services.

    +
  • +
  • +

    If the deployment uses CephFS through NFS as a storage back end, the Pacemaker ordering and collocation constraints are adjusted. For more information, see Stopping OpenStack services.

    +
  • +
  • +

    The Shared File Systems service Pacemaker service (openstack-manila-share) is stopped. For more information, see Stopping OpenStack services.

    +
  • +
  • +

    The database migration is complete. For more information, see Migrating databases to MariaDB instances.

    +
  • +
  • +

    The OpenShift nodes where the manila-share service is to be deployed can reach the management network that the storage system is in.

    +
  • +
  • +

    If the deployment uses CephFS through NFS as a storage back end, a new clustered Ceph NFS service is deployed on the Ceph Storage cluster with the help +of Ceph orchestrator. For more information, see Creating a Ceph NFS cluster.

    +
  • +
  • +

    Services such as the Identity service (keystone) and memcached are available prior to adopting the Shared File Systems services.

    +
  • +
  • +

    If you enabled tenant-driven networking by setting driver_handles_share_servers=True, the Networking service (neutron) is deployed.

    +
  • +
  • +

    Define the CONTROLLER1_SSH environment variable if it has not been defined already. Replace the following example values with values that are correct for your environment:

    +
    +
    +
    CONTROLLER1_SSH="ssh -i <path to SSH key> root@<node IP>"
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Copy the configuration file from OSP Wallaby for reference:

    +
    +
    +
    $ CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/manila/etc/manila/manila.conf | awk '!/^ *#/ && NF' > ~/manila.conf
    +
    +
    +
  2. +
  3. +

    Review the configuration file for configuration changes that were made since OSP Wallaby. For more information on preparing this file for Red Hat OpenStack Services on OpenShift (RHOSO), see Guidelines for preparing the Shared File Systems service configuration.

    +
  4. +
  5. +

    Create a patch file for the OpenStackControlPlane CR to deploy the Shared File Systems service. The following example manila.patch file uses native CephFS:

    +
    +
    +
    $ cat << __EOF__ > ~/manila.patch
    +spec:
    +  manila:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: manila
    +      secret: osp-secret
    +      manilaAPI:
    +        replicas: 3 (1)
    +        customServiceConfig: |
    +          [DEFAULT]
    +          enabled_share_protocols = cephfs
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      manilaScheduler:
    +        replicas: 3 (2)
    +      manilaShares:
    +        cephfs:
    +          replicas: 1 (3)
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_backends = tripleo_ceph
    +            host = hostgroup
    +            [cephfs]
    +            driver_handles_share_servers=False
    +            share_backend_name=cephfs (4)
    +            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
    +            cephfs_conf_path=/etc/ceph/ceph.conf
    +            cephfs_auth_id=openstack
    +            cephfs_cluster_name=ceph
    +            cephfs_volume_mode=0755
    +            cephfs_protocol_helper_type=CEPHFS
    +          networkAttachments: (5)
    +              - storage
    +      extraMounts: (6)
    +      - name: v1
    +        region: r1
    +        extraVol:
    +          - propagation:
    +            - ManilaShare
    +          extraVolType: Ceph
    +          volumes:
    +          - name: ceph
    +            secret:
    +              secretName: ceph-conf-files
    +          mounts:
    +          - name: ceph
    +            mountPath: "/etc/ceph"
    +            readOnly: true
    +__EOF__
    +
    +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + +
    1Set the replica count of the manilaAPI service to 3.
    2Set the replica count of the manilaScheduler service to 3.
    3Set the replica count of the manilaShares service to 1.
    4Ensure that the names of the back ends (share_backend_name) are the same as they were in OSP Wallaby.
    5Ensure that the appropriate storage management network is specified in the networkAttachments section. For example, the manilaShares instance with the CephFS back-end driver is connected to the storage network.
    6If you need to add extra files to any of the services, you can use extraMounts. For example, when using Ceph, you can add the Shared File Systems service Ceph user’s keyring file as well as the ceph.conf configuration file. +
    +

    The following example patch file uses CephFS through NFS:

    +
    +
    +
    +
    $ cat << __EOF__ > ~/manila.patch
    +spec:
    +  manila:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      secret: osp-secret
    +      manilaAPI:
    +        replicas: 3
    +        customServiceConfig: |
    +          [DEFAULT]
    +          enabled_share_protocols = cephfs
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      manilaScheduler:
    +        replicas: 3
    +      manilaShares:
    +        cephfs:
    +          replicas: 1
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_backends = cephfs
    +            host = hostgroup
    +            [cephfs]
    +            driver_handles_share_servers=False
    +            share_backend_name=tripleo_ceph
    +            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
    +            cephfs_conf_path=/etc/ceph/ceph.conf
    +            cephfs_auth_id=openstack
    +            cephfs_cluster_name=ceph
    +            cephfs_protocol_helper_type=NFS
    +            cephfs_nfs_cluster_id=cephfs
    +            cephfs_ganesha_server_ip=172.17.5.47
    +          networkAttachments:
    +              - storage
    +__EOF__
    +
    +
    +
    +
      +
    • +

      Prior to adopting the manilaShares service for CephFS through NFS, ensure that you create a clustered Ceph NFS service. Set the cephfs_nfs_cluster_id option to the name of the NFS cluster that you created on Ceph.

      +
    • +
    • +

      The cephfs_ganesha_server_ip option is preserved from the configuration on the OSP Wallaby environment.

      +
    • +
    +
    +
    +
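      If you need to confirm the NFS cluster name from the Ceph side, the following sketch, run from a host or container that has Ceph admin credentials, lists the clustered NFS services and their details:

      +
      +
      +
      # The cluster name returned here is the value that cephfs_nfs_cluster_id must match.
      +ceph nfs cluster ls
      +ceph nfs cluster info cephfs
      +
      +
      +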
  6. +
  7. +

    Patch the OpenStackControlPlane CR:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=~/<manila.patch>
    +
    +
    +
    +
      +
    • +

      Replace <manila.patch> with the name of your patch file.

      +
    • +
    +
    +
  8. +
+
+
+
Verification
+
    +
  1. +

    Inspect the resulting Shared File Systems service pods:

    +
    +
    +
    $ oc get pods -l service=manila
    +
    +
    +
  2. +
  3. +

    Check that the Shared File Systems API service is registered in the Identity service (keystone):

    +
    +
    +
    $ openstack service list | grep manila
    +
    +
    +
    +
    +
    $ openstack endpoint list | grep manila
    +
    +| 1164c70045d34b959e889846f9959c0e | regionOne | manila       | share        | True    | internal  | http://manila-internal.openstack.svc:8786/v1/%(project_id)s        |
    +| 63e89296522d4b28a9af56586641590c | regionOne | manilav2     | sharev2      | True    | public    | https://manila-public-openstack.apps-crc.testing/v2                |
    +| af36c57adcdf4d50b10f484b616764cc | regionOne | manila       | share        | True    | public    | https://manila-public-openstack.apps-crc.testing/v1/%(project_id)s |
    +| d655b4390d7544a29ce4ea356cc2b547 | regionOne | manilav2     | sharev2      | True    | internal  | http://manila-internal.openstack.svc:8786/v2                       |
    +
    +
    +
  4. +
  5. +

    Test the health of the service:

    +
    +
    +
    $ openstack share service list
    +$ openstack share pool list --detail
    +
    +
    +
  6. +
  7. +

    Check existing workloads:

    +
    +
    +
    $ openstack share list
    +$ openstack share snapshot list
    +
    +
    +
  8. +
  9. +

    You can create further resources:

    +
    +
    +
    $ openstack share create cephfs 10 --snapshot mysharesnap --name myshareclone
    +$ openstack share create nfs 10 --name mynfsshare
    +$ openstack share export location list mynfsshare
    +
    +
    +
  10. +
+
+
+
+

Decommissioning the OpenStack standalone Ceph NFS service

+
+

If your deployment uses CephFS through NFS, you must decommission the OpenStack (OSP) standalone NFS service. Because future software upgrades do not support the previous NFS service, it is recommended that you keep the decommissioning period short.

+
+
+
Prerequisites
+
    +
  • +

    You identified the new export locations for your existing shares by querying the Shared File Systems API.

    +
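      For example, you can list the export locations of each share and record the paths that are served by the new clustered Ceph NFS service:

      +
      +
      +
      $ openstack share export location list <share_name_or_id>
      +
      +
      +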
  • +
  • +

    You unmounted and remounted the shared file systems on each client to stop using the previous NFS server.

    +
  • +
  • +

    If you are consuming the Shared File Systems service shares with the Shared File Systems service CSI plugin for OpenShift, you migrated the shares by scaling down the application pods and scaling them back up.

    +
  • +
+
+
+ + + + + +
+ + +Clients that are creating new workloads cannot use share exports through the previous NFS service. The Shared File Systems service no longer communicates with the previous NFS service, and cannot apply or alter export rules on the previous NFS service. +
+
+
+
Procedure
+
    +
  1. +

    Remove the cephfs_ganesha_server_ip option from the manila-share service configuration:

    +
    + + + + + +
    + + +This restarts the manila-share process and removes the export locations that applied to the previous NFS service from all the shares. +
    +
    +
    +
    +
    $ cat << __EOF__ > ~/manila.patch
    +spec:
    +  manila:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      manilaShares:
    +        cephfs:
    +          replicas: 1
    +          customServiceConfig: |
    +            [DEFAULT]
    +            enabled_share_backends = cephfs
    +            host = hostgroup
    +            [cephfs]
    +            driver_handles_share_servers=False
    +            share_backend_name=cephfs
    +            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
    +            cephfs_conf_path=/etc/ceph/ceph.conf
    +            cephfs_auth_id=openstack
    +            cephfs_cluster_name=ceph
    +            cephfs_protocol_helper_type=NFS
    +            cephfs_nfs_cluster_id=cephfs
    +          networkAttachments:
    +              - storage
    +__EOF__
    +
    +
    +
  2. +
  3. +

    Patch the OpenStackControlPlane custom resource:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch-file=~/<manila.patch>
    +
    +
    +
    +
      +
    • +

      Replace <manila.patch> with the name of your patch file.

      +
    • +
    +
    +
  4. +
  5. +

    Clean up the standalone ceph-nfs service from the OSP control plane nodes by disabling and deleting the Pacemaker resources associated with the service:

    +
    + + + + + +
    + + +You can defer this step until after RHOSO Antelope is operational. During this time, you cannot decommission the Controller nodes. +
    +
    +
    +
    +
    $ sudo pcs resource disable ceph-nfs
    +$ sudo pcs resource disable ip-<VIP>
    +$ sudo pcs resource unmanage ceph-nfs
    +$ sudo pcs resource unmanage ip-<VIP>
    +
    +
    +
    +
      +
    • +

      Replace <VIP> with the IP address assigned to the ceph-nfs service in your environment.

      +
    • +
    +
    +
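      If you are not sure which VIP resource is associated with the ceph-nfs service, you can usually identify it in the Pacemaker status output on a source Controller node, for example:

      +
      +
      +
      $ sudo pcs status | grep -E 'ceph-nfs|ip-'
      +
      +
      +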
  6. +
+
+
+
+
+

Adopting the Bare Metal Provisioning service

+
+

Review information about your Bare Metal Provisioning service (ironic) configuration and then adopt the Bare Metal Provisioning service to the Red Hat OpenStack Services on OpenShift control plane.

+
+
+

Bare Metal Provisioning service configurations

+
+

You configure the Bare Metal Provisioning service (ironic) by using configuration snippets. For more information about configuring the control plane with the Bare Metal Provisioning service, see Customizing the Red Hat OpenStack Services on OpenShift deployment.

+
+
+

Some Bare Metal Provisioning service configuration is overridden in TripleO. For example, PXE Loader file names are often overridden at intermediate layers. You must pay attention to the settings you apply in your Red Hat OpenStack Services on OpenShift (RHOSO) deployment. The ironic-operator applies a reasonable working default configuration, but if you override these defaults with your prior configuration, your experience might not be ideal, or your new Bare Metal Provisioning service might fail to operate. Similarly, additional configuration might be necessary, for example, if you enabled and used additional hardware types in your ironic.conf file.

+
+
+

The model of reasonable defaults includes commonly used hardware-types and driver interfaces. For example, the redfish-virtual-media boot interface and the ramdisk deploy interface are enabled by default. If you add new bare metal nodes after the adoption is complete, the driver interface selection occurs based on the order of precedence in the configuration if you do not explicitly set it on the node creation request or as an established default in the ironic.conf file.

+
+
+

Some configuration parameters, such as network UUID values, do not need to be set at the individual node level, or they are configured centrally in the ironic.conf file because the setting controls security behavior.

+
+
+

It is critical that you maintain the following parameters, listed as [section] and parameter name, from the prior deployment to the new deployment. These parameters govern underlying behavior and, if they were set in the previous configuration, hold values that are specific to your deployment.

+
+
+
    +
  • +

    [neutron]cleaning_network

    +
  • +
  • +

    [neutron]provisioning_network

    +
  • +
  • +

    [neutron]rescuing_network

    +
  • +
  • +

    [neutron]inspection_network

    +
  • +
  • +

    [conductor]automated_clean

    +
  • +
  • +

    [deploy]erase_devices_priority

    +
  • +
  • +

    [deploy]erase_devices_metadata_priority

    +
  • +
  • +

    [conductor]force_power_state_during_sync

    +
  • +
+
+
+

You can set the following parameters individually on a node. However, you might choose to use embedded configuration options to avoid the need to set the parameters individually when creating or managing bare metal nodes. Check your prior ironic.conf file for these parameters, and if set, apply a specific override configuration.

+
+
+
    +
  • +

    [conductor]bootloader

    +
  • +
  • +

    [conductor]rescue_ramdisk

    +
  • +
  • +

    [conductor]rescue_kernel

    +
  • +
  • +

    [conductor]deploy_kernel

    +
  • +
  • +

    [conductor]deploy_ramdisk

    +
  • +
+
+
+
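A quick way to see which of these parameters were set in the prior deployment is to search the copied ironic.conf file. The following sketch assumes that you saved the file locally as ironic.conf, as described in the deployment prerequisites, and it ignores section boundaries:

+
+
+
# Lists any of the parameters above that are explicitly set in the prior configuration.
+grep -E '^(cleaning_network|provisioning_network|rescuing_network|inspection_network|automated_clean|erase_devices_priority|erase_devices_metadata_priority|force_power_state_during_sync|bootloader|rescue_ramdisk|rescue_kernel|deploy_kernel|deploy_ramdisk)[[:space:]]*=' ironic.conf
+
+
+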

The kernel_append_params setting, formerly pxe_append_params, in the [pxe] and [redfish] configuration sections is used to apply boot-time options such as "console" to the deployment ramdisk, and as such it often must be changed. A sketch of carrying such an override follows.

+
+
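The following fragment is a sketch of how such an override could be carried into the conductor customServiceConfig of the ironic template; the option values shown are examples, not defaults:

+
+
+
      ironicConductors:
+      - customServiceConfig: |
+          [pxe]
+          kernel_append_params = nofb nomodeset vga=normal console=ttyS0
+          [redfish]
+          kernel_append_params = nofb nomodeset vga=normal console=ttyS0
+
+
+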
+ + + + + +
+ + +You cannot migrate hardware types that are set with the ironic.conf file enabled_hardware_types parameter, and hardware type driver interfaces starting with staging- into the adopted configuration. +
+
+
+
+

Deploying the Bare Metal Provisioning service

+
+

To deploy the Bare Metal Provisioning service (ironic), you patch an existing OpenStackControlPlane custom resource (CR) that has the Bare Metal Provisioning service disabled. The ironic-operator applies the configuration and starts the Bare Metal Provisioning services. After the services are running, the Bare Metal Provisioning service automatically begins polling the power state of the bare metal nodes that it manages.

+
+
+ + + + + +
+ + +By default, newer versions of the Bare Metal Provisioning service use a more restrictive access control model and are multi-tenant aware. As a result, bare metal nodes might be missing from the output of the openstack baremetal node list command after you adopt the Bare Metal Provisioning service. Your nodes are not deleted. You must set the owner field on each bare metal node because of the increased access restrictions in the role-based access control (RBAC) model. Because this involves access controls and the model of use can be site specific, you should identify which project owns the bare metal nodes. +
+
+
+
Prerequisites
+
    +
  • +

    You have imported the service databases into the control plane MariaDB.

    +
  • +
  • +

    The Identity service (keystone), Networking service (neutron), Image Service (glance), and Block Storage service (cinder) are operational.

    +
    + + + + + +
    + + +If you use the Bare Metal Provisioning service in a Bare Metal as a Service configuration, you have not yet adopted the Compute service (nova). +
    +
    +
  • +
  • +

    The Bare Metal Provisioning service conductor services must be able to reach the Baseboard Management Controllers of the hardware that is configured to be managed by the Bare Metal Provisioning service. If this hardware is unreachable, the nodes might enter the "maintenance" state and remain unavailable until connectivity is restored.

    +
  • +
  • +

    You have downloaded the ironic.conf file locally:

    +
    +
    +
    $CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/ironic/etc/ironic/ironic.conf > ironic.conf
    +
    +
    +
    + + + + + +
    + + +This configuration file must come from one of the Controller nodes and not a TripleO undercloud node. The TripleO undercloud node operates with a different configuration that does not apply when you adopt the Overcloud Ironic deployment. +
    +
    +
  • +
  • +

    If you are adopting the Ironic Inspector service, you need the value of the IronicInspectorSubnets TripleO parameter. Use the same values to populate the dhcpRanges parameter in the RHOSO environment.

    +
  • +
  • +

    You have defined the following shell variables. Replace the following example values with values that apply to your environment:

    +
    +
    +
    $ alias openstack="oc exec -t openstackclient -- openstack"
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane custom resource (CR) to deploy the Bare Metal Provisioning service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  ironic:
    +    enabled: true
    +    template:
    +      rpcTransport: oslo
    +      databaseInstance: openstack
    +      ironicAPI:
    +        replicas: 1
    +        override:
    +          service:
    +            internal:
    +              metadata:
    +                annotations:
    +                  metallb.universe.tf/address-pool: internalapi
    +                  metallb.universe.tf/allow-shared-ip: internalapi
    +                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
    +              spec:
    +                type: LoadBalancer
    +      ironicConductors:
    +      - replicas: 1
    +        networkAttachments:
    +          - baremetal
    +        provisionNetwork: baremetal
    +        storageRequest: 10G
    +        customServiceConfig: |
    +          [neutron]
    +          cleaning_network=<cleaning network uuid>
    +          provisioning_network=<provisioning network uuid>
    +          rescuing_network=<rescuing network uuid>
    +          inspection_network=<introspection network uuid>
    +          [conductor]
    +          automated_clean=true
    +      ironicInspector:
    +        replicas: 1
    +        inspectionNetwork: baremetal
    +        networkAttachments:
    +          - baremetal
    +        dhcpRanges:
    +          - name: inspector-0
    +            cidr: 172.20.1.0/24
    +            start: 172.20.1.190
    +            end: 172.20.1.199
    +            gateway: 172.20.1.1
    +        serviceUser: ironic-inspector
    +        databaseAccount: ironic-inspector
    +        passwordSelectors:
    +          database: IronicInspectorDatabasePassword
    +          service: IronicInspectorPassword
    +      ironicNeutronAgent:
    +        replicas: 1
    +        rabbitMqClusterName: rabbitmq
    +      secret: osp-secret
    +'
    +
    +
    +
  2. +
  3. +

    Wait for the Bare Metal Provisioning service control plane services CRs to become ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s ironics.ironic.openstack.org ironic
    +
    +
    +
  4. +
  5. +

    Verify that the individual services are ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s ironicapis.ironic.openstack.org ironic-api
    +$ oc wait --for condition=Ready --timeout=300s ironicconductors.ironic.openstack.org ironic-conductor
    +$ oc wait --for condition=Ready --timeout=300s ironicinspectors.ironic.openstack.org ironic-inspector
    +$ oc wait --for condition=Ready --timeout=300s ironicneutronagents.ironic.openstack.org ironic-ironic-neutron-agent
    +
    +
    +
  6. +
  7. +

    Update the DNS nameservers on the provisioning, cleaning, and rescue networks:

    +
    + + + + + +
    + + +For name resolution to work for Bare Metal Provisioning service operations, you must set the DNS nameserver to use the internal DNS servers in the RHOSO control plane: +
    +
    +
    +
    +
    $ openstack subnet set --dns-nameserver 192.168.122.80 provisioning-subnet
    +
    +
    +
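    If your cleaning and rescue networks use separate subnets, repeat the command for each of them. The subnet names in the following sketch are examples and must match the subnet names in your environment:

    +
    +
    +
    $ openstack subnet set --dns-nameserver 192.168.122.80 cleaning-subnet
    +$ openstack subnet set --dns-nameserver 192.168.122.80 rescuing-subnet
    +
    +
    +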
  8. +
  9. +

    Verify that no Bare Metal Provisioning service nodes are missing from the node list:

    +
    +
    +
    $ openstack baremetal node list
    +
    +
    +
    + + + + + +
    + + +If the openstack baremetal node list command output reports an incorrect power status, wait a few minutes and re-run the command to see if the output syncs with the actual state of the hardware being managed. The time that the Bare Metal Provisioning service requires to review and reconcile the power state of the bare metal nodes depends on the number of operating conductors, which is set through the replicas parameter, and on the number of nodes that are present in the deployment being adopted. +
    +
    +
  10. +
  11. +

    If any Bare Metal Provisioning service nodes are missing from the openstack baremetal node list command, temporarily disable the new RBAC policy to see the nodes again:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  ironic:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      ironicAPI:
    +        replicas: 1
    +        customServiceConfig: |
    +          [oslo_policy]
    +          enforce_scope=false
    +          enforce_new_defaults=false
    +'
    +
    +
    +
  12. +
  13. +

    After you set the owner field on the bare metal nodes, you can re-enable RBAC by removing the customServiceConfig section or by setting the following values to true:

    +
    +
    +
    customServiceConfig: |
    +  [oslo_policy]
    +  enforce_scope=true
    +  enforce_new_defaults=true
    +
    +
    +
  14. +
  15. +

    After this configuration is applied, the operator restarts the Ironic API service and disables the new RBAC policy that is enabled by default. After the RBAC policy is disabled, you can view bare metal nodes without an owner field:

    +
    +
    +
    $ openstack baremetal node list -f uuid,provision_state,owner
    +
    +
    +
  16. +
  17. +

    Assign all bare metal nodes with no owner to a new project, for example, the admin project:

    +
    +
    +
    ADMIN_PROJECT_ID=$(openstack project show -c id -f value --domain default admin)
    +for node in $(openstack baremetal node list -f json -c UUID -c Owner | jq -r '.[] | select(.Owner == null) | .UUID'); do openstack baremetal node set --owner $ADMIN_PROJECT_ID $node; done
    +
    +
    +
  18. +
  19. +

    Re-apply the default RBAC:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  ironic:
    +    enabled: true
    +    template:
    +      databaseInstance: openstack
    +      ironicAPI:
    +        replicas: 1
    +        customServiceConfig: |
    +          [oslo_policy]
    +          enforce_scope=true
    +          enforce_new_defaults=true
    +'
    +
    +
    +
  20. +
+
+
+
Verification
+
    +
  1. +

    Verify the list of endpoints:

    +
    +
    +
    $ openstack endpoint list |grep ironic
    +
    +
    +
  2. +
  3. +

    Verify the list of bare metal nodes:

    +
    +
    +
    $ openstack baremetal node list
    +
    +
    +
  4. +
+
+
+
+
+

Adopting the Orchestration service

+
+

To adopt the Orchestration service (heat), you patch an existing OpenStackControlPlane custom resource (CR), where the Orchestration service +is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

+
+
+

After you complete the adoption process, you have CRs for Heat, HeatAPI, HeatEngine, and HeatCFNAPI, and endpoints within the Identity service (keystone) to facilitate these services.

+
+
+
Prerequisites
+
    +
  • +

    The source TripleO environment is running.

    +
  • +
  • +

    The target OpenShift environment is running.

    +
  • +
  • +

    You adopted MariaDB and the Identity service.

    +
  • +
  • +

    If your existing Orchestration service stacks contain resources from other services, such as the Networking service (neutron), the Compute service (nova), or the Object Storage service (swift), adopt those services before adopting the Orchestration service.

    +
  • +
+
+
+
Procedure
+

The Heat Adoption follows a similar workflow to Keystone.

+
+
+
    +
  1. +

    Retrieve the existing auth_encryption_key and service passwords. You use these passwords to patch the osp-secret. In the following example, the auth_encryption_key is used as HeatAuthEncryptionKey and the service password is used as HeatPassword:

    +
    +
    +
    [stack@rhosp17 ~]$ grep -E 'HeatPassword|HeatAuth' ~/overcloud-deploy/overcloud/overcloud-passwords.yaml
    +  HeatAuthEncryptionKey: Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2
    +  HeatPassword: dU2N0Vr2bdelYH7eQonAwPfI3
    +
    +
    +
  2. +
  3. +

    Log in to a Controller node and verify the auth_encryption_key value in use:

    +
    +
    +
    [stack@rhosp17 ~]$ ansible -i overcloud-deploy/overcloud/config-download/overcloud/tripleo-ansible-inventory.yaml overcloud-controller-0 -m shell -a "grep auth_encryption_key /var/lib/config-data/puppet-generated/heat/etc/heat/heat.conf | grep -Ev '^#|^$'" -b
    +overcloud-controller-0 | CHANGED | rc=0 >>
    +auth_encryption_key=Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2
    +
    +
    +
  4. +
  5. +

    Encode the password to Base64 format:

    +
    +
    +
    $ echo Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2 | base64
    +UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIK
    +
    +
    +
  6. +
  7. +

    Patch the osp-secret to update the HeatAuthEncryptionKey and HeatPassword parameters. These values must match the values in the TripleO Orchestration service configuration:

    +
    +
    +
    $ oc patch secret osp-secret --type='json' -p='[{"op" : "replace" ,"path" : "/data/HeatAuthEncryptionKey" ,"value" : "UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIK"}]'
    +secret/osp-secret patched
    +
    +
    +
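    The HeatPassword parameter must be updated in the same way. A sketch, assuming you Base64-encode the service password retrieved earlier and substitute the result for the placeholder:

    +
    $ echo dU2N0Vr2bdelYH7eQonAwPfI3 | base64
    +$ oc patch secret osp-secret --type='json' -p='[{"op" : "replace" ,"path" : "/data/HeatPassword" ,"value" : "<base64-encoded HeatPassword>"}]'
    +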
  8. +
  9. +

    Patch the OpenStackControlPlane CR to deploy the Orchestration service:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  heat:
    +    enabled: true
    +    apiOverride:
    +      route: {}
    +    template:
    +      databaseInstance: openstack
    +      databaseAccount: heat
    +      secret: osp-secret
    +      memcachedInstance: memcached
    +      passwordSelectors:
    +        authEncryptionKey: HeatAuthEncryptionKey
    +        service: HeatPassword
    +'
    +
    +
    +
  10. +
+
+
+
Verification
+
    +
  1. +

    Ensure that the statuses of all the CRs are Setup complete:

    +
    +
    +
    $ oc get Heat,HeatAPI,HeatEngine,HeatCFNAPI
    +NAME                           STATUS   MESSAGE
    +heat.heat.openstack.org/heat   True     Setup complete
    +
    +NAME                                  STATUS   MESSAGE
    +heatapi.heat.openstack.org/heat-api   True     Setup complete
    +
    +NAME                                        STATUS   MESSAGE
    +heatengine.heat.openstack.org/heat-engine   True     Setup complete
    +
    +NAME                                        STATUS   MESSAGE
    +heatcfnapi.heat.openstack.org/heat-cfnapi   True     Setup complete
    +
    +
    +
  2. +
  3. +

    Check that the Orchestration service is registered in the Identity service:

    +
    +
    +
    $ oc exec -it openstackclient -- openstack service list -c Name -c Type
    ++------------+----------------+
    +| Name       | Type           |
    ++------------+----------------+
    +| heat       | orchestration  |
    +| glance     | image          |
    +| heat-cfn   | cloudformation |
    +| ceilometer | Ceilometer     |
    +| keystone   | identity       |
    +| placement  | placement      |
    +| cinderv3   | volumev3       |
    +| nova       | compute        |
    +| neutron    | network        |
    ++------------+----------------+
    +
    +
    +
    +
    +
    $ oc exec -it openstackclient -- openstack endpoint list --service=heat -f yaml
    +- Enabled: true
    +  ID: 1da7df5b25b94d1cae85e3ad736b25a5
    +  Interface: public
    +  Region: regionOne
    +  Service Name: heat
    +  Service Type: orchestration
    +  URL: http://heat-api-public-openstack-operators.apps.okd.bne-shift.net/v1/%(tenant_id)s
    +- Enabled: true
    +  ID: 414dd03d8e9d462988113ea0e3a330b0
    +  Interface: internal
    +  Region: regionOne
    +  Service Name: heat
    +  Service Type: orchestration
    +  URL: http://heat-api-internal.openstack-operators.svc:8004/v1/%(tenant_id)s
    +
    +
    +
  4. +
  5. +

    Check that the Orchestration service engine services are running:

    +
    +
    +
    $ oc exec -it openstackclient -- openstack orchestration service list -f yaml
    +- Binary: heat-engine
    +  Engine ID: b16ad899-815a-4b0c-9f2e-e6d9c74aa200
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:01.000000'
    +- Binary: heat-engine
    +  Engine ID: 887ed392-0799-4310-b95c-ac2d3e6f965f
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:00.000000'
    +- Binary: heat-engine
    +  Engine ID: 26ed9668-b3f2-48aa-92e8-2862252485ea
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:00.000000'
    +- Binary: heat-engine
    +  Engine ID: 1011943b-9fea-4f53-b543-d841297245fd
    +  Host: heat-engine-6d47856868-p7pzz
    +  Hostname: heat-engine-6d47856868-p7pzz
    +  Status: up
    +  Topic: engine
    +  Updated At: '2023-10-11T21:48:01.000000'
    +
    +
    +
  6. +
  7. +

    Verify that you can see your Orchestration service stacks:

    +
    +
    +
    $ openstack stack list -f yaml
    +- Creation Time: '2023-10-11T22:03:20Z'
    +  ID: 20f95925-7443-49cb-9561-a1ab736749ba
    +  Project: 4eacd0d1cab04427bc315805c28e66c9
    +  Stack Name: test-networks
    +  Stack Status: CREATE_COMPLETE
    +  Updated Time: null
    +
    +
    +
  8. +
+
+
+
+

Adopting the Loadbalancer service

+
+

During the adoption process, the Loadbalancer service (octavia) must stay disabled in the new control plane.

+
+
+

Certificates

+
+

Before you run the script below, set the shell variables CONTROLLER1_SSH and CONTROLLER1_SCP to the commands that log in to one of the controllers as the root user with ssh and scp respectively, as shown below.

+
+
+
+
$ CONTROLLER1_SSH="ssh -i <path to the ssh key> root@192.168.122.100"
+$ CONTROLLER1_SCP="scp -i <path to the ssh key> root@192.168.122.100"
+
+
+
+

Make sure to replace <path to the ssh key> with the correct path to the ssh key for connecting to the controller.

+
+
+
+
SERVER_CA_PASSPHRASE=$($CONTROLLER1_SSH grep ^ca_private_key_passphrase /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf)
+export SERVER_CA_PASSPHRASE=$(echo "${SERVER_CA_PASSPHRASE}"  | cut -d '=' -f 2 | xargs)
+export CLIENT_PASSPHRASE="ThisIsOnlyAppliedTemporarily"
+CERT_SUBJECT="/C=US/ST=Denial/L=Springfield/O=Dis/CN=www.example.com"
+CERT_MIGRATE_PATH="$HOME/octavia_cert_migration"
+
+mkdir -p ${CERT_MIGRATE_PATH}
+cd ${CERT_MIGRATE_PATH}
+# Set up the server CA
+mkdir -p server_ca
+cd server_ca
+mkdir -p certs crl newcerts private csr
+chmod 700 private
+${CONTROLLER1_SCP}:/var/lib/config-data/puppet-generated/octavia/etc/octavia/certs/private/cakey.pem private/server_ca.key.pem
+chmod 400 private/server_ca.key.pem
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/client-.pem certs/old_client_cert.pem
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/index.txt* ./
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/serial* ./
+${CONTROLLER1_SCP}:/tmp/octavia-ssl/openssl.cnf ../
+openssl req -config ../openssl.cnf -key private/server_ca.key.pem -new -passin env:SERVER_CA_PASSPHRASE -x509 -days 18250 -sha256 -extensions v3_ca -out certs/server_ca.cert.pem -subj "/C=US/ST=Denial/L=Springfield/O=Dis/CN=www.example.com"
+
+# Set up the new client CA
+sed -i "s|^dir\s\+=\s\+\"/tmp/octavia-ssl\"|dir = \"$CERT_MIGRATE_PATH/client_ca\"|" ../openssl.cnf
+cd ${CERT_MIGRATE_PATH}
+mkdir -p client_ca
+cd client_ca
+mkdir -p certs crl csr newcerts private
+chmod 700 private
+touch index.txt
+echo 1000 > serial
+openssl genrsa -aes256 -out private/ca.key.pem -passout env:SERVER_CA_PASSPHRASE 4096
+chmod 400 private/ca.key.pem
+openssl req -config ../openssl.cnf -key private/ca.key.pem -new -passin env:SERVER_CA_PASSPHRASE -x509 -days 18250 -sha256 -extensions v3_ca -out certs/client_ca.cert.pem -subj "${CERT_SUBJECT}"
+
+# Create client certificates
+cd ${CERT_MIGRATE_PATH}/client_ca
+openssl genrsa -aes256 -out private/client.key.pem -passout env:CLIENT_PASSPHRASE 4096
+openssl req -config ../openssl.cnf -new -passin env:CLIENT_PASSPHRASE -sha256 -key private/client.key.pem -out csr/client.csr.pem -subj "${CERT_SUBJECT}"
+mkdir -p ${CERT_MIGRATE_PATH}/client_ca/private ${CERT_MIGRATE_PATH}/client_ca/newcerts ${CERT_MIGRATE_PATH}/private
+chmod 700 ${CERT_MIGRATE_PATH}/client_ca/private ${CERT_MIGRATE_PATH}/private
+
+cp ${CERT_MIGRATE_PATH}/client_ca/private/ca.key.pem ${CERT_MIGRATE_PATH}/client_ca/private/cakey.pem
+cp ${CERT_MIGRATE_PATH}/client_ca/certs/client_ca.cert.pem $CERT_MIGRATE_PATH/client_ca/ca_01.pem
+openssl ca -config ../openssl.cnf -extensions usr_cert -passin env:SERVER_CA_PASSPHRASE -days 1825 -notext -batch -md sha256 -in csr/client.csr.pem -out certs/client.cert.pem
+openssl rsa -passin env:CLIENT_PASSPHRASE -in private/client.key.pem -out private/client.cert-and-key.pem
+cat certs/client.cert.pem >> private/client.cert-and-key.pem
+
+# Install new data in k8s
+oc apply -f - <<EOF
+apiVersion: v1
+kind: Secret
+metadata:
+  name: octavia-certs-secret
+  namespace: openstack
+type: Opaque
+data:
+  server_ca.key.pem:  $(cat ${CERT_MIGRATE_PATH}/server_ca/private/server_ca.key.pem | base64 -w0)
+  server_ca.cert.pem: $(cat ${CERT_MIGRATE_PATH}/server_ca/certs/server_ca.cert.pem | base64 -w0)
+  client_ca.cert.pem: $(cat ${CERT_MIGRATE_PATH}/client_ca/certs/client_ca.cert.pem | base64 -w0)
+  client.cert-and-key.pem: $(cat ${CERT_MIGRATE_PATH}/client_ca/private/client.cert-and-key.pem | base64 -w0)
+EOF
+
+oc apply -f - <<EOF
+apiVersion: v1
+kind: Secret
+metadata:
+  name: octavia-ca-passphrase
+  namespace: openstack
+type: Opaque
+data:
+  server-ca-passphrase: $(echo $SERVER_CA_PASSPHRASE | base64 -w0)
+EOF
+
+rm -rf ${CERT_MIGRATE_PATH}
+
+
+
+

These commands convert the existing single CA configuration into a dual CA configuration.
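
You can run a quick sanity check that the new secrets exist and contain the expected keys; this only inspects the objects created above:

+
$ oc get secret octavia-certs-secret octavia-ca-passphrase -n openstack
+$ oc describe secret octavia-certs-secret -n openstack | grep -E 'pem|Type'
+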

+
+
+
+

Enabling the Loadbalancer service in OpenShift

+
+

Run the following command to enable the Loadbalancer service in the OpenStackControlPlane CR.

+
+
+
+
$ oc patch openstackcontrolplane openstack --type=merge --patch '
+spec:
+  octavia:
+    enabled: true
+    template: {}
+'
+
+
+
+
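To confirm that the patch took effect, you can read back the field that was just set:

+
$ oc get openstackcontrolplane openstack -o jsonpath='{.spec.octavia.enabled}{"\n"}'
+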
+
+

Adopting Telemetry services

+
+

To adopt Telemetry services, you patch an existing OpenStackControlPlane custom resource (CR) that has Telemetry services disabled to start the service with the configuration parameters that are provided by the OpenStack (OSP) Wallaby environment.

+
+
+

If you adopt Telemetry services, the observability solution that is used in the OSP Wallaby environment, Service Telemetry Framework, is removed from the cluster. The new solution is deployed in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, allowing for metrics, and optionally logs, to be retrieved and stored in the new back ends.

+
+
+

You cannot automatically migrate old data because different back ends are used. Metrics and logs are considered short-lived data and are not intended to be migrated to the RHOSO environment. For information about adopting legacy autoscaling stack templates to the RHOSO environment, see Adopting Autoscaling services.

+
+
+
Prerequisites
+
    +
  • +

    The TripleO environment is running (the source cloud).

    +
  • +
  • +

    The Single Node OpenShift or OpenShift Local is running in the OpenShift cluster.

    +
  • +
  • +

    Previous adoption steps are completed.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a Subscription to deploy the cluster-observability-operator:

    +
    +
    +
    $ oc create -f - <<EOF
    +apiVersion: operators.coreos.com/v1alpha1
    +kind: Subscription
    +metadata:
    +  name: cluster-observability-operator
    +  namespace: openshift-operators
    +spec:
    +  channel: development
    +  installPlanApproval: Automatic
    +  name: cluster-observability-operator
    +  source: redhat-operators
    +  sourceNamespace: openshift-marketplace
    +EOF
    +
    +
    +
  2. +
  3. +

    Wait for the installation to succeed:

    +
    +
    +
    $ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=openshift-operators -l operators.coreos.com/cluster-observability-operator.openshift-operators
    +
    +
    +
  4. +
  5. +

    Patch the OpenStackControlPlane CR to deploy Ceilometer services:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    enabled: true
    +    template:
    +      ceilometer:
    +        passwordSelector:
    +          ceilometerService: CeilometerPassword
    +        enabled: true
    +        secret: osp-secret
    +        serviceUser: ceilometer
    +'
    +
    +
    +
  6. +
  7. +

    Enable the metrics storage back end:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    template:
    +      metricStorage:
    +        enabled: true
    +        monitoringStack:
    +          alertingEnabled: true
    +          scrapeInterval: 30s
    +          storage:
    +            strategy: persistent
    +            retention: 24h
    +            persistent:
    +              pvcStorageRequest: 20G
    +'
    +
    +
    +
  8. +
+
+
+
Verification
+
    +
  1. +

    Verify that the alertmanager and prometheus pods are available:

    +
    +
    +
    $ oc get pods -l alertmanager=metric-storage -n openstack
    +NAME                            READY   STATUS    RESTARTS   AGE
    +alertmanager-metric-storage-0   2/2     Running   0          46s
    +alertmanager-metric-storage-1   2/2     Running   0          46s
    +
    +$ oc get pods -l prometheus=metric-storage -n openstack
    +NAME                          READY   STATUS    RESTARTS   AGE
    +prometheus-metric-storage-0   3/3     Running   0          46s
    +
    +
    +
  2. +
  3. +

    Inspect the resulting Ceilometer pods:

    +
    +
    +
    CEILOMETER_POD=`oc get pods -l service=ceilometer -n openstack | tail -n 1 | cut -f 1 -d' '`
    +oc exec -t $CEILOMETER_POD -c ceilometer-central-agent -- cat /etc/ceilometer/ceilometer.conf
    +
    +
    +
  4. +
  5. +

    Inspect enabled pollsters:

    +
    +
    +
    $ oc get secret ceilometer-config-data -o jsonpath="{.data['polling\.yaml\.j2']}"  | base64 -d
    +
    +
    +
  6. +
  7. +

    Optional: Override default pollsters according to the requirements of your environment:

    +
    +
    +
    $ oc patch openstackcontrolplane controlplane --type=merge --patch '
    +spec:
    +  telemetry:
    +    template:
    +      ceilometer:
    +          defaultConfigOverwrite:
    +            polling.yaml.j2: |
    +              ---
    +              sources:
    +                - name: pollsters
    +                  interval: 100
    +                  meters:
    +                    - volume.*
    +                    - image.size
    +          enabled: true
    +          secret: osp-secret
    +'
    +
    +
    +
  8. +
+
+
+
Next steps
+
    +
  1. +

    Optional: Patch the OpenStackControlPlane CR to include logging:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    template:
    +      logging:
    +        enabled: false
    +        ipaddr: 172.17.0.80
    +        port: 10514
    +        cloNamespace: openshift-logging
    +'
    +
    +
    +
  2. +
+
+
+
+

Adopting autoscaling services

+
+

To adopt services that enable autoscaling, you patch an existing OpenStackControlPlane custom resource (CR) where the Alarming services (aodh) are disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack environment.

+
+
+
Prerequisites
+
    +
  • +

    The source TripleO environment is running.

    +
  • +
  • +

    A Single Node OpenShift or OpenShift Local is running in the OpenShift cluster.

    +
  • +
  • +

    You have adopted the following services:

    +
    +
      +
    • +

      MariaDB

      +
    • +
    • +

      Identity service (keystone)

      +
    • +
    • +

      Orchestration service (heat)

      +
    • +
    • +

      Telemetry service

      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Patch the OpenStackControlPlane CR to deploy the autoscaling services:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack --type=merge --patch '
    +spec:
    +  telemetry:
    +    enabled: true
    +    template:
    +      autoscaling:
    +        enabled: true
    +        aodh:
    +          passwordSelector:
    +            aodhService: AodhPassword
    +          databaseAccount: aodh
    +          databaseInstance: openstack
    +          secret: osp-secret
    +          serviceUser: aodh
    +        heatInstance: heat
    +'
    +
    +
    +
  2. +
  3. +

    Inspect the aodh pods:

    +
    +
    +
    $ AODH_POD=`oc get pods -l service=aodh -n openstack | tail -n 1 | cut -f 1 -d' '`
    +$ oc exec -t $AODH_POD -c aodh-api -- cat /etc/aodh/aodh.conf
    +
    +
    +
  4. +
  5. +

    Check whether the aodh API service is registered in the Identity service:

    +
    +
    +
    $ openstack endpoint list | grep aodh
    +| d05d120153cd4f9b8310ac396b572926 | regionOne | aodh  | alarming  | True    | internal  | http://aodh-internal.openstack.svc:8042  |
    +| d6daee0183494d7a9a5faee681c79046 | regionOne | aodh  | alarming  | True    | public    | http://aodh-public.openstack.svc:8042    |
    +
    +
    +
  6. +
  7. +

    Optional: Create aodh alarms with the PrometheusAlarm alarm type:

    +
    + + + + + +
    + + +You must use the PrometheusAlarm alarm type instead of GnocchiAggregationByResourcesAlarm. +
    +
    +
    +
    +
    $ openstack alarm create --name high_cpu_alarm \
    +--type prometheus \
    +--query "(rate(ceilometer_cpu{resource_name=~'cirros'})) * 100" \
    +--alarm-action 'log://' \
    +--granularity 15 \
    +--evaluation-periods 3 \
    +--comparison-operator gt \
    +--threshold 7000000000
    +
    +
    +
    +
      +
    1. +

      Verify that the alarm is enabled:

      +
      +
      +
      $ openstack alarm list
      ++--------------------------------------+------------+------------------+-------------------+----------+
      +| alarm_id                             | type       | name             | state  | severity | enabled  |
      ++--------------------------------------+------------+------------------+-------------------+----------+
      +| 209dc2e9-f9d6-40e5-aecc-e767ce50e9c0 | prometheus | prometheus_alarm |   ok   |    low   |   True   |
      ++--------------------------------------+------------+------------------+-------------------+----------+
      +
      +
      +
    2. +
    +
    +
  8. +
+
+
+
+

Pulling the configuration from a TripleO deployment

+
+

Before you start the data plane adoption workflow, back up the configuration from the OpenStack (OSP) services and TripleO. You can then use the files during the configuration of the adopted services to ensure that nothing is missed or misconfigured.

+
+
+
Prerequisites
+ +
+
+

All the services are described in a YAML file:

+
+ +
+
Procedure
+
    +
  1. +

    Update the ssh parameters in the os-diff.cfg file according to your environment. Os-diff uses the ssh parameters to connect to your TripleO node, and then to query and download the configuration files:

    +
    +
    +
    ssh_cmd=ssh -F ssh.config standalone
    +container_engine=podman
    +connection=ssh
    +remote_config_path=/tmp/tripleo
    +
    +
    +
    +

    Ensure that the ssh command you provide in the ssh_cmd parameter is correct and includes key authentication.

    +
    +
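    For example, you can run a non-interactive test of the connection defined above, using the ssh.config file and host name from the sample os-diff.cfg:

    +
    $ ssh -F ssh.config standalone "echo connection OK"
    +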
  2. +
  3. +

    Enable the services that you want to include in the /etc/os-diff/config.yaml file, and disable the services that you want to exclude from the file. Ensure that you have the correct permissions to edit the file:

    +
    +
    +
    $ chown ospng:ospng /etc/os-diff/config.yaml
    +
    +
    +
    +

    The following example enables the default Identity service (keystone) to be included in the /etc/os-diff/config.yaml file:

    +
    +
    +
    +
    # service name and file location
    +services:
    +  # Service name
    +  keystone:
    +    # Bool to enable/disable a service (not implemented yet)
    +    enable: true
    +    # Pod name, in both OCP and podman context.
    +    # It could be strict match or will only just grep the podman_name
    +    # and work with all the pods which matched with pod_name.
    +    # To enable/disable use strict_pod_name_match: true/false
    +    podman_name: keystone
    +    pod_name: keystone
    +    container_name: keystone-api
    +    # pod options
    +    # strict match for getting pod id in TripleO and podman context
    +    strict_pod_name_match: false
    +    # Path of the config files you want to analyze.
    +    # It could be whatever path you want:
    +    # /etc/<service_name> or /etc or /usr/share/<something> or even /
    +    # @TODO: need to implement loop over path to support multiple paths such as:
    +    # - /etc
    +    # - /usr/share
    +    path:
    +      - /etc/
    +      - /etc/keystone
    +      - /etc/keystone/keystone.conf
    +      - /etc/keystone/logging.conf
    +
    +
    +
    +

    Repeat this step for each OSP service that you want to disable or enable.

    +
    +
  4. +
  5. +

    If you use non-containerized services, such as the ovs-external-ids, pull the configuration or the command output. For example:

    +
    +
    +
    services:
    +  ovs_external_ids:
    +    hosts: (1)
    +      - standalone
    +    service_command: "ovs-vsctl list Open_vSwitch . | grep external_ids | awk -F ': ' '{ print $2; }'" (2)
    +    cat_output: true (3)
    +    path:
    +      - ovs_external_ids.json
    +    config_mapping: (4)
    +      ovn-bridge-mappings: edpm_ovn_bridge_mappings (5)
    +      ovn-bridge: edpm_ovn_bridge
    +      ovn-encap-type: edpm_ovn_encap_type
    +      ovn-monitor-all: ovn_monitor_all
    +      ovn-remote-probe-interval: edpm_ovn_remote_probe_interval
    +      ovn-ofctrl-wait-before-clear: edpm_ovn_ofctrl_wait_before_clear
    +
    +
    +
    + + + + + +
    + + +You must correctly configure an SSH configuration file or equivalent for non-standard services, such as OVS. The ovs_external_ids service does not run in a container, and the OVS data is stored on each host of your cloud, for example, controller_1/controller_2/, and so on. +
    +
    +
    + + + + + + + + + + + + + + + + + + + + + +
    1The list of hosts, for example, compute-1, compute-2.
    2The command that runs against the hosts.
    3Os-diff gets the output of the command and stores the output in a file that is specified by the key path.
    4Provides a mapping between, in this example, the data plane custom resource definition and the ovs-vsctl output.
    5The edpm_ovn_bridge_mappings variable must be a list of strings, for example, ["datacentre:br-ex"]. +
    +
      +
    1. +

      Compare the values:

      +
      +
      +
      $ os-diff diff ovs_external_ids.json edpm.crd --crd --service ovs_external_ids
      +
      +
      +
      +

      For example, to check the /etc/yum.conf file on every host, add the following statement to the config.yaml file. The following example defines a service entry named yum_config:

      +
      +
      +
      +
      services:
      +  yum_config:
      +    hosts:
      +      - undercloud
      +      - controller_1
      +      - compute_1
      +      - compute_2
      +    service_command: "cat /etc/yum.conf"
      +    cat_output: true
      +    path:
      +      - yum.conf
      +
      +
      +
    2. +
    +
    +
    +
  6. +
  7. +

    Pull the configuration:

    +
    + + + + + +
    + + +
    +

    The following command pulls all the configuration files that are included in the /etc/os-diff/config.yaml file. You can configure os-diff to update this file automatically according to your running environment by using the --update or --update-only option. These options set the podman information into the config.yaml for all running containers. The podman information can be useful later, when all the OpenStack services are turned off.

    +
    +
    +

    Note that when the config.yaml file is populated automatically you must provide the configuration paths manually for each service.

    +
    +
    +
    +
    +
    +
    # will only update the /etc/os-diff/config.yaml
    +os-diff pull --update-only
    +
    +
    +
    +
    +
    # will update the /etc/os-diff/config.yaml and pull configuration
    +os-diff pull --update
    +
    +
    +
    +
    +
    # will update the /etc/os-diff/config.yaml and pull configuration
    +os-diff pull
    +
    +
    +
    +

    The configuration is pulled and stored by default in the following directory:

    +
    +
    +
    +
    /tmp/tripleo/
    +
    +
    +
  8. +
+
+
+
Verification
+
    +
  • +

    Verify that you have a directory for each service configuration in your local path:

    +
    +
    +
      ▾ tmp/
    +    ▾ tripleo/
    +      ▾ glance/
    +      ▾ keystone/
    +
    +
    +
  • +
+
+
+
+

Rolling back the control plane adoption

+
+

If you encountered a problem and are unable to complete the adoption of the OpenStack (OSP) control plane services, you can roll back the control plane adoption.

+
+
+ + + + + +
+ + +Do not attempt the rollback if you altered the data plane nodes in any way. You can roll back the control plane adoption only if your changes were limited to the control plane. +
+
+
+

During the control plane adoption, services on the OSP control plane are stopped but not removed. The databases on the OSP control plane are not edited during the adoption procedure. The Red Hat OpenStack Services on OpenShift (RHOSO) control plane receives a copy of the original control plane databases. The rollback procedure assumes that the data plane has not yet been modified by the adoption procedure, and it is still connected to the OSP control plane.

+
+
+

The rollback procedure consists of the following steps:

+
+
+
    +
  • +

    Restoring the functionality of the OSP control plane.

    +
  • +
  • +

    Removing the partially or fully deployed RHOSO control plane.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    To restore the source cloud to a working state, start the OSP control plane services that you previously stopped during the adoption procedure:

    +
    +
    +
    ServicesToStart=("tripleo_horizon.service"
    +                 "tripleo_keystone.service"
    +                 "tripleo_barbican_api.service"
    +                 "tripleo_barbican_worker.service"
    +                 "tripleo_barbican_keystone_listener.service"
    +                 "tripleo_cinder_api.service"
    +                 "tripleo_cinder_api_cron.service"
    +                 "tripleo_cinder_scheduler.service"
    +                 "tripleo_cinder_volume.service"
    +                 "tripleo_cinder_backup.service"
    +                 "tripleo_glance_api.service"
    +                 "tripleo_manila_api.service"
    +                 "tripleo_manila_api_cron.service"
    +                 "tripleo_manila_scheduler.service"
    +                 "tripleo_neutron_api.service"
    +                 "tripleo_placement_api.service"
    +                 "tripleo_nova_api_cron.service"
    +                 "tripleo_nova_api.service"
    +                 "tripleo_nova_conductor.service"
    +                 "tripleo_nova_metadata.service"
    +                 "tripleo_nova_scheduler.service"
    +                 "tripleo_nova_vnc_proxy.service"
    +                 "tripleo_aodh_api.service"
    +                 "tripleo_aodh_api_cron.service"
    +                 "tripleo_aodh_evaluator.service"
    +                 "tripleo_aodh_listener.service"
    +                 "tripleo_aodh_notifier.service"
    +                 "tripleo_ceilometer_agent_central.service"
    +                 "tripleo_ceilometer_agent_compute.service"
    +                 "tripleo_ceilometer_agent_ipmi.service"
    +                 "tripleo_ceilometer_agent_notification.service"
    +                 "tripleo_ovn_cluster_north_db_server.service"
    +                 "tripleo_ovn_cluster_south_db_server.service"
    +                 "tripleo_ovn_cluster_northd.service")
    +
    +PacemakerResourcesToStart=("galera-bundle"
    +                           "haproxy-bundle"
    +                           "rabbitmq-bundle"
    +                           "openstack-cinder-volume"
    +                           "openstack-cinder-backup"
    +                           "openstack-manila-share")
    +
    +echo "Starting systemd OpenStack services"
    +for service in ${ServicesToStart[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then
    +                echo "Starting the $service in controller $i"
    +                ${!SSH_CMD} sudo systemctl start $service
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Checking systemd OpenStack services"
    +for service in ${ServicesToStart[*]}; do
    +    for i in {1..3}; do
    +        SSH_CMD=CONTROLLER${i}_SSH
    +        if [ ! -z "${!SSH_CMD}" ]; then
    +            if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then
    +                if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=active >/dev/null; then
    +                    echo "ERROR: Service $service is not running on controller $i"
    +                else
    +                    echo "OK: Service $service is running in controller $i"
    +                fi
    +            fi
    +        fi
    +    done
    +done
    +
    +echo "Starting pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStart[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                echo "Starting $resource"
    +                ${!SSH_CMD} sudo pcs resource enable $resource
    +            else
    +                echo "Service $resource not present"
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +echo "Checking pacemaker OpenStack services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStart[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
    +                if ${!SSH_CMD} sudo pcs resource status $resource | grep Started >/dev/null; then
    +                    echo "OK: Service $resource is started"
    +                else
    +                    echo "ERROR: Service $resource is stopped"
    +                fi
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +
    +
  2. +
  3. +

    If the Ceph NFS service is running on the deployment as a Shared File Systems service (manila) back end, you must restore the Pacemaker order and colocation constraints for the openstack-manila-share service:

    +
    +
    +
    $ sudo pcs constraint order start ceph-nfs then openstack-manila-share kind=Optional id=order-ceph-nfs-openstack-manila-share-Optional
    +$ sudo pcs constraint colocation add openstack-manila-share with ceph-nfs score=INFINITY id=colocation-openstack-manila-share-ceph-nfs-INFINITY
    +
    +
    +
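    You can confirm that both constraints are in place by listing the Pacemaker constraints on the same controller:

    +
    $ sudo pcs constraint | grep -i manila
    +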
  4. +
  5. +

    Verify that the source cloud is operational again. For example, run openstack CLI commands such as openstack server list, or check that you can access the Dashboard service (horizon).

    +
  6. +
  7. +

    Remove the partially or fully deployed control plane so that you can attempt the adoption again later:

    +
    +
    +
    $ oc delete --ignore-not-found=true --wait=false openstackcontrolplane/openstack
    +$ oc patch openstackcontrolplane openstack --type=merge --patch '
    +metadata:
    +  finalizers: []
    +' || true
    +
    +while oc get pod | grep rabbitmq-server-0; do
    +    sleep 2
    +done
    +while oc get pod | grep openstack-galera-0; do
    +    sleep 2
    +done
    +
    +$ oc delete --ignore-not-found=true --wait=false pod mariadb-copy-data
    +$ oc delete --ignore-not-found=true --wait=false pvc mariadb-data
    +$ oc delete --ignore-not-found=true --wait=false pod ovn-copy-data
    +$ oc delete --ignore-not-found=true secret osp-secret
    +
    +
    +
  8. +
+
+
+ + + + + +
+ + +After you restore the OSP control plane services, their internal state might have changed. Before you retry the adoption procedure, verify that all the control plane resources are removed and that there are no leftovers that could affect the next adoption attempt. You must not use previously created copies of the database contents in another adoption attempt. You must make a new copy of the latest state of the original source database contents. For more information about making new copies of the database, see Migrating databases to the control plane. +
+
+
+
+
+
+

Adopting the data plane

+
+
+

Adopting the Red Hat OpenStack Services on OpenShift (RHOSO) data plane involves the following steps:

+
+
+
    +
  1. +

    Stop any remaining services on the OpenStack (OSP) Wallaby control plane.

    +
  2. +
  3. +

    Deploy the required custom resources.

    +
  4. +
  5. +

    Perform a fast-forward upgrade on Compute services from OSP Wallaby to RHOSO Antelope.

    +
  6. +
  7. +

    If applicable, adopt Networker nodes to the RHOSO data plane.

    +
  8. +
+
+
+ + + + + +
+ + +After the RHOSO control plane manages the newly deployed data plane, you must not re-enable services on the OSP Wallaby control plane and data plane. If you re-enable services, workloads are managed by two control planes or two data planes, resulting in data corruption, loss of control of existing workloads, inability to start new workloads, or other issues. +
+
+
+

Stopping infrastructure management and Compute services

+
+

You must stop cloud Controller nodes, database nodes, and messaging nodes on the OpenStack Wallaby control plane. Do not stop nodes that are running the Compute, Storage, or Networker roles on the control plane.

+
+
+

The following procedure applies to a single node standalone TripleO deployment. You must remove conflicting repositories and packages from your Compute hosts, so that you can install libvirt packages when these hosts are adopted as data plane nodes, where modular libvirt daemons are no longer running in podman containers.

+
+
+
Prerequisites
+
    +
  • +

    Define the shell variables. Replace the following example values with values that apply to your environment:

    +
    +
    +
    EDPM_PRIVATEKEY_PATH="~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa"
    +declare -A computes
    +computes=(
    +  ["standalone.localdomain"]="192.168.122.100"
    +  # ...
    +)
    +
    +
    +
    +
      +
    • +

      Replace ["standalone.localdomain"]="192.168.122.100" with the name and IP address of the Compute node.

      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  • +

    Stop the Pacemaker-managed infrastructure services (Galera, RabbitMQ, HAProxy) on the control plane:

    +
    +
    +
    PacemakerResourcesToStop=(
    +                "galera-bundle"
    +                "haproxy-bundle"
    +                "rabbitmq-bundle")
    +
    +echo "Stopping pacemaker services"
    +for i in {1..3}; do
    +    SSH_CMD=CONTROLLER${i}_SSH
    +    if [ ! -z "${!SSH_CMD}" ]; then
    +        echo "Using controller $i to run pacemaker commands"
    +        for resource in ${PacemakerResourcesToStop[*]}; do
    +            if ${!SSH_CMD} sudo pcs resource config $resource; then
    +                ${!SSH_CMD} sudo pcs resource disable $resource
    +            fi
    +        done
    +        break
    +    fi
    +done
    +
    +
    +
  • +
+
+
+
+

Adopting Compute services to the RHOSO data plane

+
+

Adopt your Compute (nova) services to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane.

+
+
+
Prerequisites
+
    +
  • +

    You have stopped the remaining control plane nodes, repositories, and packages on the Compute service (nova) hosts. For more information, see Stopping infrastructure management and Compute services.

    +
  • +
  • +

    You have configured the Ceph back end for the NovaLibvirt service. For more information, see Configuring a Ceph back end.

    +
  • +
  • +

    You have configured IP Address Management (IPAM):

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: network.openstack.org/v1beta1
    +kind: NetConfig
    +metadata:
    +  name: netconfig
    +spec:
    +  networks:
    +  - name: ctlplane
    +    dnsDomain: ctlplane.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 192.168.122.120
    +        start: 192.168.122.100
    +      - end: 192.168.122.200
    +        start: 192.168.122.150
    +      cidr: 192.168.122.0/24
    +      gateway: 192.168.122.1
    +  - name: internalapi
    +    dnsDomain: internalapi.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.17.0.250
    +        start: 172.17.0.100
    +      cidr: 172.17.0.0/24
    +      vlan: 20
    +  - name: External
    +    dnsDomain: external.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 10.0.0.250
    +        start: 10.0.0.100
    +      cidr: 10.0.0.0/24
    +      gateway: 10.0.0.1
    +  - name: storage
    +    dnsDomain: storage.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.18.0.250
    +        start: 172.18.0.100
    +      cidr: 172.18.0.0/24
    +      vlan: 21
    +  - name: storagemgmt
    +    dnsDomain: storagemgmt.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.20.0.250
    +        start: 172.20.0.100
    +      cidr: 172.20.0.0/24
    +      vlan: 23
    +  - name: tenant
    +    dnsDomain: tenant.example.com
    +    subnets:
    +    - name: subnet1
    +      allocationRanges:
    +      - end: 172.19.0.250
    +        start: 172.19.0.100
    +      cidr: 172.19.0.0/24
    +      vlan: 22
    +EOF
    +
    +
    +
  • +
  • +

    If neutron-sriov-nic-agent is running on your Compute service nodes, ensure that the physical device mappings match the values that are defined in the OpenStackDataPlaneNodeSet custom resource (CR). For more information, see Pulling the configuration from a TripleO deployment.

    +
  • +
  • +

    You have defined the shell variables to run the script that runs the fast-forward upgrade:

    +
    +
    +
    PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
    +CEPH_FSID=$(oc get secret ceph-conf-files -o json | jq -r '.data."ceph.conf"' | base64 -d | grep fsid | sed -e 's/fsid = //')
    +
    +alias openstack="oc exec -t openstackclient -- openstack"
    +declare -A computes
    +export computes=(
    +  ["standalone.localdomain"]="192.168.122.100"
    +  # ...
    +)
    +
    +
    +
    +
      +
    • +

      Replace ["standalone.localdomain"]="192.168.122.100" with the name and IP address of the Compute service node.

      +
      + + + + + +
      + + +Do not set a value for the CEPH_FSID parameter if the local storage back end is configured by the Compute service for libvirt. The storage back end must match the source cloud storage back end. You cannot change the storage back end during adoption. +
      +
      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Create a ssh authentication secret for the data plane nodes:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: Secret
    +metadata:
    +    name: dataplane-adoption-secret
    +    namespace: openstack
    +data:
    +    ssh-privatekey: |
    +$(cat ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa | base64 | sed 's/^/        /')
    +EOF
    +
    +
    +
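    Optionally, confirm that the key was stored intact; this only decodes the field created above and prints its first line:

    +
    $ oc get secret dataplane-adoption-secret -n openstack -o jsonpath='{.data.ssh-privatekey}' | base64 -d | head -n 1
    +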
  2. +
  3. +

    Generate an ssh key-pair nova-migration-ssh-key secret:

    +
    +
    +
    $ cd "$(mktemp -d)"
    +ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
    +oc get secret nova-migration-ssh-key || oc create secret generic nova-migration-ssh-key \
    +  -n openstack \
    +  --from-file=ssh-privatekey=id \
    +  --from-file=ssh-publickey=id.pub \
    +  --type kubernetes.io/ssh-auth
    +rm -f id*
    +cd -
    +
    +
    +
  4. +
  5. +

    If you use a local storage back end for libvirt, create a nova-compute-extra-config service to remove pre-fast-forward workarounds and configure Compute services to use a local storage back end:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: nova-extra-config
    +  namespace: openstack
    +data:
    +  19-nova-compute-cell1-workarounds.conf: |
    +    [workarounds]
    +    disable_compute_service_check_for_ffu=true
    +EOF
    +
    +
    +
    + + + + + +
    + + +The secret nova-cell<X>-compute-config is auto-generated for each cell<X>. You must specify values for the nova-cell<X>-compute-config and nova-migration-ssh-key parameters for each custom OpenStackDataPlaneService CR that is related to the Compute service. +
    +
    +
  6. +
  7. +

    If TLS Everywhere is enabled, append the following content to the OpenStackDataPlaneService CR:

    +
    +
    +
      tlsCerts:
    +    contents:
    +      - dnsnames
    +      - ips
    +    networks:
    +      - ctlplane
    +    issuer: osp-rootca-issuer-internal
    +  caCerts: combined-ca-bundle
    +  edpmServiceType: nova
    +
    +
    +
  8. +
  9. +

    If you use a Ceph back end for libvirt, create a nova-compute-extra-config service to remove pre-fast-forward upgrade workarounds and configure Compute services to use a Ceph back end:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: nova-extra-config
    +  namespace: openstack
    +data:
    +  19-nova-compute-cell1-workarounds.conf: |
    +    [workarounds]
    +    disable_compute_service_check_for_ffu=true
    +  03-ceph-nova.conf: |
    +    [libvirt]
    +    images_type=rbd
    +    images_rbd_pool=vms
    +    images_rbd_ceph_conf=/etc/ceph/ceph.conf
    +    images_rbd_glance_store_name=default_backend
    +    images_rbd_glance_copy_poll_interval=15
    +    images_rbd_glance_copy_timeout=600
    +    rbd_user=openstack
    +    rbd_secret_uuid=$CEPH_FSID
    +EOF
    +
    +
    +
    +

    The resources in the ConfigMap contain cell-specific configurations.

    +
    +
  10. +
  11. +

    Deploy the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneNodeSet
    +metadata:
    +  name: openstack
    +spec:
    +  tlsEnabled: false (1)
    +  networkAttachments:
    +      - ctlplane
    +  preProvisioned: true
    +  services:
    +    - bootstrap
    +    - download-cache
    +    - configure-network
    +    - validate-network
    +    - install-os
    +    - configure-os
    +    - ssh-known-hosts
    +    - run-os
    +    - reboot-os
    +    - install-certs
    +    - libvirt
    +    - nova
    +    - ovn
    +    - neutron-metadata
    +    - telemetry
    +  env:
    +    - name: ANSIBLE_CALLBACKS_ENABLED
    +      value: "profile_tasks"
    +    - name: ANSIBLE_FORCE_COLOR
    +      value: "True"
    +  nodes:
    +    standalone:
    +      hostName: standalone (2)
    +      ansible:
    +        ansibleHost: ${computes[standalone.localdomain]}
    +      networks:
    +      - defaultRoute: true
    +        fixedIP: ${computes[standalone.localdomain]}
    +        name: ctlplane
    +        subnetName: subnet1
    +      - name: internalapi
    +        subnetName: subnet1
    +      - name: storage
    +        subnetName: subnet1
    +      - name: tenant
    +        subnetName: subnet1
    +  nodeTemplate:
    +    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    +    ansible:
    +      ansibleUser: root
    +      ansibleVars:
    +        edpm_bootstrap_release_version_package: []
    +        # edpm_network_config
    +        # Default nic config template for a EDPM node
    +        # These vars are edpm_network_config role vars
    +        edpm_network_config_template: |
    +           ---
    +           {% set mtu_list = [ctlplane_mtu] %}
    +           {% for network in nodeset_networks %}
    +           {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
    +           {%- endfor %}
    +           {% set min_viable_mtu = mtu_list | max %}
    +           network_config:
    +           - type: ovs_bridge
    +             name: {{ neutron_physical_bridge_name }}
    +             mtu: {{ min_viable_mtu }}
    +             use_dhcp: false
    +             dns_servers: {{ ctlplane_dns_nameservers }}
    +             domain: {{ dns_search_domains }}
    +             addresses:
    +             - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
    +             routes: {{ ctlplane_host_routes }}
    +             members:
    +             - type: interface
    +               name: nic1
    +               mtu: {{ min_viable_mtu }}
    +               # force the MAC address of the bridge to this interface
    +               primary: true
    +           {% for network in nodeset_networks %}
    +             - type: vlan
    +               mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
    +               vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
    +               addresses:
    +               - ip_netmask:
    +                   {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
    +               routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
    +           {% endfor %}
    +
    +        edpm_network_config_hide_sensitive_logs: false
    +        #
    +        # These vars are for the network config templates themselves and are
    +        # considered EDPM network defaults.
    +        neutron_physical_bridge_name: br-ctlplane
    +        neutron_public_interface_name: eth0
    +
    +        # edpm_nodes_validation
    +        edpm_nodes_validation_validate_controllers_icmp: false
    +        edpm_nodes_validation_validate_gateway_icmp: false
    +
    +        # edpm ovn-controller configuration
    +        edpm_ovn_bridge_mappings: <bridge_mappings> (3)
    +        edpm_ovn_bridge: br-int
    +        edpm_ovn_encap_type: geneve
    +        ovn_monitor_all: true
    +        edpm_ovn_remote_probe_interval: 60000
    +        edpm_ovn_ofctrl_wait_before_clear: 8000
    +
    +        timesync_ntp_servers:
    +        - hostname: pool.ntp.org
    +
    +        edpm_bootstrap_command: |
    +          # This is a hack to deploy RDO Delorean repos to RHEL as if it were Centos 9 Stream
    +          set -euxo pipefail
    +          curl -sL https://github.com/openstack-k8s-operators/repo-setup/archive/refs/heads/main.tar.gz | tar -xz
    +          python3 -m venv ./venv
    +          PBR_VERSION=0.0.0 ./venv/bin/pip install ./repo-setup-main
    +          # This is required for FIPS enabled until trunk.rdoproject.org
    +          # is not being served from a centos7 host, tracked by
    +          # https://issues.redhat.com/browse/RHOSZUUL-1517
    +          dnf -y install crypto-policies
    +          update-crypto-policies --set FIPS:NO-ENFORCE-EMS
    +          # FIXME: perform dnf upgrade for other packages in EDPM ansible
    +          # here we only ensuring that decontainerized libvirt can start
    +          ./venv/bin/repo-setup current-podified -b antelope -d centos9 --stream
    +          dnf -y upgrade openstack-selinux
    +          rm -f /run/virtlogd.pid
    +          rm -rf repo-setup-main
    +
    +        gather_facts: false
    +        # edpm firewall, change the allowed CIDR if needed
    +        edpm_sshd_configure_firewall: true
    +        edpm_sshd_allowed_ranges: ['192.168.122.0/24']
    +
    +        # Do not attempt OVS major upgrades here
    +        edpm_ovs_packages:
    +        - openvswitch3.1
    +EOF
    +
    +
    +
    + + + + + + + + + + + + + +
    1If TLS Everywhere is enabled, change spec:tlsEnabled to true.
    2If your deployment has a custom DNS Domain, modify the spec:nodes:[NODE NAME]:hostName to use fqdn for the node.
    3Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".
    +
    +
  12. +
  13. +

    Ensure that you use the same ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used in the Compute service nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

    +
    +
    +
    ovs-vsctl list Open .
    +...
    +external_ids        : {hostname=standalone.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
    +...
    +
    +
    +
    +
      +
    • +

      Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".

      +
    • +
    +
    +
  14. +
  15. +

    If you use a Ceph back end for Block Storage service (cinder), prepare the adopted data plane workloads:

    +
    +
    +
    $ oc patch osdpns/openstack --type=merge --patch "
    +spec:
    +  services:
    +    - bootstrap
    +    - download-cache
    +    - configure-network
    +    - validate-network
    +    - install-os
    +    - configure-os
    +    - ssh-known-hosts
    +    - run-os
    +    - reboot-os
    +    - ceph-client
    +    - install-certs
    +    - ovn
    +    - neutron-metadata
    +    - libvirt
    +    - nova
    +    - telemetry
    +  nodeTemplate:
    +    extraMounts:
    +    - extraVolType: Ceph
    +      volumes:
    +      - name: ceph
    +        secret:
    +          secretName: ceph-conf-files
    +      mounts:
    +      - name: ceph
    +        mountPath: "/etc/ceph"
    +        readOnly: true
    +"
    +
    +
    +
    + + + + + +
    + + +Ensure that you use the same list of services from the original OpenStackDataPlaneNodeSet CR, except for the inserted ceph-client service. +
    +
    +
  16. +
  17. +

    Optional: Enable neutron-sriov-nic-agent in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset openstack --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-sriov"
    +  }, {
    +    "op": "add",
    +    "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_physical_device_mappings",
    +    "value": "dummy_sriov_net:dummy-dev"
    +  }, {
    +    "op": "add",
    +    "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_bandwidths",
    +    "value": "dummy-dev:40000000:40000000"
    +  }, {
    +    "op": "add",
    +    "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_hypervisors",
    +    "value": "dummy-dev:standalone.localdomain"
    +  }
    +]'
    +
    +
    +
  18. +
  19. +

    Optional: Enable neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset openstack --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-dhcp"
    +  }]'
    +
    +
    +
    + + + + + +
    + + +
    +

    To use neutron-dhcp with OVN for the Bare Metal Provisioning service (ironic), you must set the disable_ovn_dhcp_for_baremetal_ports configuration option for the Networking service (neutron) to true. You can set this configuration in the NeutronAPI spec:

    +
    +
    +
    +
    ..
    +spec:
    +  serviceUser: neutron
    +   ...
    +      customServiceConfig: |
    +          [ovn]
    +          disable_ovn_dhcp_for_baremetal_ports = true
    +
    +
    +
    +
    +
  20. +
  21. +

    Run the pre-adoption validation:

    +
    +
      +
    1. +

      Create the validation service:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneService
      +metadata:
      +  name: pre-adoption-validation
      +spec:
      +  playbook: osp.edpm.pre_adoption_validation
      +EOF
      +
      +
      +
    2. +
    3. +

      Create a OpenStackDataPlaneDeployment CR that runs only the validation:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneDeployment
      +metadata:
      +  name: openstack-pre-adoption
      +spec:
      +  nodeSets:
      +  - openstack
      +  servicesOverride:
      +  - pre-adoption-validation
      +EOF
      +
      +
      +
    4. +
    5. +

      When the validation is finished, confirm that the status of the Ansible EE pods is Completed:

      +
      +
      +
      $ watch oc get pod -l app=openstackansibleee
      +
      +
      +
      +
      +
      $ oc logs -l app=openstackansibleee -f --max-log-requests 20
      +
      +
      +
    6. +
    7. +

      Wait for the deployment to reach the Ready status:

      +
      +
      +
      $ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption --timeout=10m
      +
      +
      +
      + + + + + +
      + + +
      +

      If any openstack-pre-adoption validations fail, you must reference the Ansible logs to determine which ones were unsuccessful, and then try the following troubleshooting options:

      +
      +
      +
        +
      • +

        If the hostname validation failed, check that the hostname of the data plane node is correctly listed in the OpenStackDataPlaneNodeSet CR.

        +
      • +
      • +

        If the kernel argument check failed, ensure that the kernel argument configuration in the edpm_kernel_args and edpm_kernel_hugepages variables in the OpenStackDataPlaneNodeSet CR is the same as the kernel argument configuration that you used in the OpenStack (OSP) Wallaby node.

        +
      • +
      • +

        If the tuned profile check failed, ensure that the edpm_tuned_profile variable in the OpenStackDataPlaneNodeSet CR is configured to use the same profile as the one set on the OSP Wallaby node (see the sketch after this list).

        +
      • +
      +
      +
      +
      +
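      The following sketch shows where these variables live in the OpenStackDataPlaneNodeSet CR, under nodeTemplate.ansible.ansibleVars; the values are illustrative only and must be copied from the source node:

      +
      nodeTemplate:
      +  ansible:
      +    ansibleVars:
      +      # Illustrative values only; copy the exact settings from the source Wallaby node
      +      edpm_kernel_args: "default_hugepagesz=1GB hugepagesz=1G hugepages=64"
      +      edpm_kernel_hugepages: {}
      +      edpm_tuned_profile: throughput-performance
      +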
    8. +
    +
    +
  22. +
  23. +

    Remove the remaining TripleO services:

    +
    +
      +
    1. +

      Create an OpenStackDataPlaneService CR to clean up the data plane services you are adopting:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneService
      +metadata:
      +  name: tripleo-cleanup
      +spec:
      +  playbook: osp.edpm.tripleo_cleanup
      +EOF
      +
      +
      +
    2. +
    3. +

      Create the OpenStackDataPlaneDeployment CR to run the clean-up:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneDeployment
      +metadata:
      +  name: tripleo-cleanup
      +spec:
      +  nodeSets:
      +  - openstack
      +  servicesOverride:
      +  - tripleo-cleanup
      +EOF
      +
      +
      +
    4. +
    +
    +
  24. +
  25. +

    When the clean-up is finished, deploy the OpenStackDataPlaneDeployment CR:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneDeployment
    +metadata:
    +  name: openstack
    +spec:
    +  nodeSets:
    +  - openstack
    +EOF
    +
    +
    +
    + + + + + +
    + + +If you have other node sets to deploy, such as Networker nodes, you can add them in the nodeSets list in this step, or create separate OpenStackDataPlaneDeployment CRs later. You cannot add new node sets to an OpenStackDataPlaneDeployment CR after deployment. +
    +
    +
  26. +
+
+
+
Verification
+
    +
  1. +

    Confirm that all the Ansible EE pods reach a Completed status:

    +
    +
    +
    $ watch oc get pod -l app=openstackansibleee
    +
    +
    +
    +
    +
    $ oc logs -l app=openstackansibleee -f --max-log-requests 20
    +
    +
    +
  2. +
  3. +

    Wait for the data plane node set to reach the Ready status:

    +
    +
    +
    $ oc wait --for condition=Ready osdpns/openstack --timeout=30m
    +
    +
    +
  4. +
  5. +

    Verify that the Networking service (neutron) agents are running:

    +
    +
    +
    $ oc exec openstackclient -- openstack network agent list
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| ID                                   | Agent Type                   | Host                   | Availability Zone | Alive | State | Binary                     |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent                   | standalone.localdomain | nova              | :-)   | UP    | neutron-dhcp-agent         |
    +| 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent           | standalone.localdomain |                   | :-)   | UP    | neutron-ovn-metadata-agent |
    +| a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller agent         | standalone.localdomain |                   | :-)   | UP    | ovn-controller             |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +
    +
    +
  6. +
+
+
+
Next steps
+ +
+
+
+

Performing a fast-forward upgrade on Compute services

+
+

You must upgrade the Compute services from OpenStack Wallaby to Red Hat OpenStack Services on OpenShift (RHOSO) Antelope on the control plane and data plane by completing the following tasks:

+
+
+
    +
  • +

    Update the cell1 Compute data plane services version.

    +
  • +
  • +

    Remove pre-fast-forward upgrade workarounds from the Compute control plane services and Compute data plane services.

    +
  • +
  • +

    Run Compute database online migrations to update live data.

    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Wait for cell1 Compute data plane services version to update:

    +
    +
    +
    $ oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot -p$PODIFIED_DB_ROOT_PASSWORD \
    +    -e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute';"
    +
    +
    +
    + + + + + +
    + + +
    +

    The query returns an empty result when the update is completed. No downtime is expected for virtual machine workloads.

    +
    +
    +

    Review any errors in the nova Compute agent logs on the data plane, and the nova-conductor journal records on the control plane.

    +
    +
    +
    +
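    A minimal sketch for polling that query until it returns an empty result (the pod name and password variable are taken from the command above):

    until [ -z "$(oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot -p$PODIFIED_DB_ROOT_PASSWORD \
    +    -e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute';")" ]; do
    +    sleep 30
    +done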
  2. +
  3. +

    Patch the OpenStackControlPlane CR to remove the pre-fast-forward upgrade workarounds from the Compute control plane services:

    +
    +
    +
    $ oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
    +spec:
    +  nova:
    +    template:
    +      cellTemplates:
    +        cell0:
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=false
    +        cell1:
    +          metadataServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=false
    +          conductorServiceTemplate:
    +            customServiceConfig: |
    +              [workarounds]
    +              disable_compute_service_check_for_ffu=false
    +      apiServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=false
    +      metadataServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=false
    +      schedulerServiceTemplate:
    +        customServiceConfig: |
    +          [workarounds]
    +          disable_compute_service_check_for_ffu=false
    +'
    +
    +
    +
  4. +
  5. +

    Wait until the Compute control plane services CRs are ready:

    +
    +
    +
    $ oc wait --for condition=Ready --timeout=300s Nova/nova
    +
    +
    +
  6. +
  7. +

    Remove the pre-fast-forward upgrade workarounds from the Compute data plane services:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: v1
    +kind: ConfigMap
    +metadata:
    +  name: nova-extra-config
    +  namespace: openstack
    +data:
    +  20-nova-compute-cell1-workarounds.conf: |
    +    [workarounds]
    +    disable_compute_service_check_for_ffu=false
    +---
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneDeployment
    +metadata:
    +  name: openstack-nova-compute-ffu
    +  namespace: openstack
    +spec:
    +  nodeSets:
    +    - openstack
    +  servicesOverride:
    +    - nova
    +EOF
    +
    +
    +
    + + + + + +
    + + +The service included in the servicesOverride key must match the name of the service that you included in the OpenStackDataPlaneNodeSet CR. For example, if you use a custom service called nova-custom, ensure that you add it to the servicesOverride key. +
    +
    +
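    For example, a minimal sketch of the servicesOverride section when a hypothetical custom service named nova-custom is used instead of nova:

    spec:
    +  nodeSets:
    +    - openstack
    +  servicesOverride:
    +    - nova-custom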
  8. +
  9. +

    Wait for the Compute data plane services to be ready:

    +
    +
    +
    $ oc wait --for condition=Ready openstackdataplanedeployment/openstack-nova-compute-ffu --timeout=5m
    +
    +
    +
  10. +
  11. +

    Run Compute database online migrations to complete the fast-forward upgrade:

    +
    +
    +
    $ oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
    +$ oc exec -it nova-cell1-conductor-0 -- nova-manage db online_data_migrations
    +
    +
    +
  12. +
+
+
+
Verification
+
    +
  1. +

    Discover the Compute hosts in the cell:

    +
    +
    +
    $ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose
    +
    +
    +
  2. +
  3. +

    Verify if the Compute services can stop the existing test VM instance:

    +
    +
    +
    ${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && ${BASH_ALIASES[openstack]} server stop test || echo PASS
    +${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" || echo FAIL
    +${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo PASS
    +
    +
    +
  4. +
  5. +

    Verify if the Compute services can start the existing test VM instance:

    +
    +
    +
    ${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" && ${BASH_ALIASES[openstack]} server start test || echo PASS
    +${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && \
    +  ${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test --fit-width -f json | jq -r '.state' | grep running || echo FAIL
    +
    +
    +
  6. +
+
+
+ + + + + +
+ + +After the data plane adoption, the Compute hosts continue to run Red Hat Enterprise Linux (RHEL) 9.2. To take advantage of RHEL 9.4, perform a minor update procedure after finishing the adoption procedure. +
+
+
+
+

Adopting Networker services to the RHOSO data plane

+
+

Adopt the Networker nodes in your existing OpenStack deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane. You decide which services you want to run on the Networker nodes, and create a separate OpenStackDataPlaneNodeSet custom resource (CR) for the Networker nodes. You might also decide to implement the following options if they apply to your environment:

+
+
+
    +
  • +

    Depending on your topology, you might need to run the neutron-metadata service on the nodes, specifically when you want to serve metadata to SR-IOV ports that are hosted on Compute nodes.

    +
  • +
  • +

    If you want to continue running OVN gateway services on Networker nodes, keep the ovn service in the list of services to deploy.

    +
  • +
  • +

    Optional: You can run the neutron-dhcp service on your Networker nodes instead of your Compute nodes. You might not need to use neutron-dhcp with OVN, unless your deployment uses DHCP relays, or advanced DHCP options that are supported by dnsmasq but not by the OVN DHCP implementation.

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Define the shell variable. The following value is an example from a single-node standalone TripleO deployment:

    +
    +
    +
    declare -A networkers
    +networkers=(
    +  ["standalone.localdomain"]="192.168.122.100"
    +  # ...
    +)
    +
    +
    +
    +
      +
    • +

      Replace ["standalone.localdomain"]="192.168.122.100" with the name and IP address of the Networker node.

      +
    • +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Deploy the OpenStackDataPlaneNodeSet CR for your Networker nodes:

    +
    + + + + + +
    + + +You can reuse most of the nodeTemplate section from the OpenStackDataPlaneNodeSet CR that is designated for your Compute nodes. You can omit some of the variables because of the limited set of services that are running on Networker nodes. +
    +
    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneNodeSet
    +metadata:
    +  name: openstack-networker
    +spec:
    +  tlsEnabled: false (1)
    +  networkAttachments:
    +      - ctlplane
    +  preProvisioned: true
    +  services:
    +    - bootstrap
    +    - download-cache
    +    - configure-network
    +    - validate-network
    +    - install-os
    +    - configure-os
    +    - ssh-known-hosts
    +    - run-os
    +    - install-certs
    +    - ovn
    +  env:
    +    - name: ANSIBLE_CALLBACKS_ENABLED
    +      value: "profile_tasks"
    +    - name: ANSIBLE_FORCE_COLOR
    +      value: "True"
    +  nodes:
    +    standalone:
    +      hostName: standalone
    +      ansible:
    +        ansibleHost: ${networkers[standalone.localdomain]}
    +      networks:
    +      - defaultRoute: true
    +        fixedIP: ${networkers[standalone.localdomain]}
    +        name: ctlplane
    +        subnetName: subnet1
    +      - name: internalapi
    +        subnetName: subnet1
    +      - name: storage
    +        subnetName: subnet1
    +      - name: tenant
    +        subnetName: subnet1
    +  nodeTemplate:
    +    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    +    ansible:
    +      ansibleUser: root
    +      ansibleVars:
    +        edpm_bootstrap_release_version_package: []
    +        # edpm_network_config
    +        # Default nic config template for a EDPM node
    +        # These vars are edpm_network_config role vars
    +        edpm_network_config_template: |
    +           ---
    +           {% set mtu_list = [ctlplane_mtu] %}
    +           {% for network in nodeset_networks %}
    +           {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
    +           {%- endfor %}
    +           {% set min_viable_mtu = mtu_list | max %}
    +           network_config:
    +           - type: ovs_bridge
    +             name: {{ neutron_physical_bridge_name }}
    +             mtu: {{ min_viable_mtu }}
    +             use_dhcp: false
    +             dns_servers: {{ ctlplane_dns_nameservers }}
    +             domain: {{ dns_search_domains }}
    +             addresses:
    +             - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
    +             routes: {{ ctlplane_host_routes }}
    +             members:
    +             - type: interface
    +               name: nic1
    +               mtu: {{ min_viable_mtu }}
    +               # force the MAC address of the bridge to this interface
    +               primary: true
    +           {% for network in nodeset_networks %}
    +             - type: vlan
    +               mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
    +               vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
    +               addresses:
    +               - ip_netmask:
    +                   {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
    +               routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
    +           {% endfor %}
    +
    +        edpm_network_config_hide_sensitive_logs: false
    +        #
    +        # These vars are for the network config templates themselves and are
    +        # considered EDPM network defaults.
    +        neutron_physical_bridge_name: br-ctlplane
    +        neutron_public_interface_name: eth0
    +
    +        # edpm_nodes_validation
    +        edpm_nodes_validation_validate_controllers_icmp: false
    +        edpm_nodes_validation_validate_gateway_icmp: false
    +
    +        # edpm ovn-controller configuration
    +        edpm_ovn_bridge_mappings: <bridge_mappings> (2)
    +        edpm_ovn_bridge: br-int
    +        edpm_ovn_encap_type: geneve
    +        ovn_monitor_all: true
    +        edpm_ovn_remote_probe_interval: 60000
    +        edpm_ovn_ofctrl_wait_before_clear: 8000
    +
    +        # serve as a OVN gateway
    +        edpm_enable_chassis_gw: true (3)
    +
    +        timesync_ntp_servers:
    +        - hostname: pool.ntp.org
    +
    +        edpm_bootstrap_command: |
    +          # This is a hack to deploy RDO Delorean repos to RHEL as if it were Centos 9 Stream
    +          set -euxo pipefail
    +          curl -sL https://github.com/openstack-k8s-operators/repo-setup/archive/refs/heads/main.tar.gz | tar -xz
    +          python3 -m venv ./venv
    +          PBR_VERSION=0.0.0 ./venv/bin/pip install ./repo-setup-main
    +          # This is required for FIPS enabled until trunk.rdoproject.org
    +          # is not being served from a centos7 host, tracked by
    +          # https://issues.redhat.com/browse/RHOSZUUL-1517
    +          dnf -y install crypto-policies
    +          update-crypto-policies --set FIPS:NO-ENFORCE-EMS
    +          ./venv/bin/repo-setup current-podified -b antelope -d centos9 --stream
    +          rm -rf repo-setup-main
    +
    +        gather_facts: false
    +        enable_debug: false
    +        # edpm firewall, change the allowed CIDR if needed
    +        edpm_sshd_configure_firewall: true
    +        edpm_sshd_allowed_ranges: ['192.168.122.0/24']
    +        # SELinux module
    +        edpm_selinux_mode: enforcing
    +
    +        # Do not attempt OVS major upgrades here
    +        edpm_ovs_packages:
    +        - openvswitch3.1
    +EOF
    +
    +
    +
    + + + + + + + + + + + + + +
    1If TLS Everywhere is enabled, change spec:tlsEnabled to true.
    2Set to the same values that you used in your OpenStack Wallaby deployment.
    3Set to true to run ovn-controller in gateway mode.
    +
    +
  2. +
  3. +

    Ensure that you use the same ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used in the Networker nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

    +
    +
    +
    ovs-vsctl list Open .
    +...
    +external_ids        : {hostname=standalone.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
    +...
    +
    +
    +
    +
      +
    • +

      Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".

      +
    • +
    +
    +
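    If you prefer to query individual keys rather than listing the whole record, a minimal sketch using ovs-vsctl get on the Networker node:

    ovs-vsctl get Open_vSwitch . external_ids:ovn-bridge-mappings
    +ovs-vsctl get Open_vSwitch . external_ids:ovn-cms-options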
  4. +
  5. +

    Optional: Enable neutron-metadata in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-metadata"
    +  }]'
    +
    +
    +
    +
      +
    • +

      Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.

      +
    • +
    +
    +
  6. +
  7. +

    Optional: Enable neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

    +
    +
    +
    $ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
    +  {
    +    "op": "add",
    +    "path": "/spec/services/-",
    +    "value": "neutron-dhcp"
    +  }]'
    +
    +
    +
  8. +
  9. +

    Run the pre-adoption-validation service for Networker nodes:

    +
    +
      +
    1. +

      Create an OpenStackDataPlaneDeployment CR that runs only the validation:

      +
      +
      +
      $ oc apply -f - <<EOF
      +apiVersion: dataplane.openstack.org/v1beta1
      +kind: OpenStackDataPlaneDeployment
      +metadata:
      +  name: openstack-pre-adoption-networker
      +spec:
      +  nodeSets:
      +  - openstack-networker
      +  servicesOverride:
      +  - pre-adoption-validation
      +EOF
      +
      +
      +
    2. +
    3. +

      When the validation is finished, confirm that the status of the Ansible EE pods is Completed:

      +
      +
      +
      $ watch oc get pod -l app=openstackansibleee
      +
      +
      +
      +
      +
      $ oc logs -l app=openstackansibleee -f --max-log-requests 20
      +
      +
      +
    4. +
    5. +

      Wait for the deployment to reach the Ready status:

      +
      +
      +
      $ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption-networker --timeout=10m
      +
      +
      +
    6. +
    +
    +
  10. +
  11. +

    Deploy the OpenStackDataPlaneDeployment CR for Networker nodes:

    +
    +
    +
    $ oc apply -f - <<EOF
    +apiVersion: dataplane.openstack.org/v1beta1
    +kind: OpenStackDataPlaneDeployment
    +metadata:
    +  name: openstack-networker
    +spec:
    +  nodeSets:
    +  - openstack-networker
    +EOF
    +
    +
    +
    + + + + + +
    + + +Alternatively, you can include the Networker node set in the nodeSets list before you deploy the main OpenStackDataPlaneDeployment CR. You cannot add new node sets to the OpenStackDataPlaneDeployment CR after deployment. +
    +
    +
  12. +
+
+
+
Verification
+
    +
  1. +

    Confirm that all the Ansible EE pods reach a Completed status:

    +
    +
    +
    $ watch oc get pod -l app=openstackansibleee
    +
    +
    +
    +
    +
    $ oc logs -l app=openstackansibleee -f --max-log-requests 20
    +
    +
    +
  2. +
  3. +

    Wait for the data plane node set to reach the Ready status:

    +
    +
    +
    $ oc wait --for condition=Ready osdpns/<networker_CR_name> --timeout=30m
    +
    +
    +
    +
      +
    • +

      Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.

      +
    • +
    +
    +
  4. +
  5. +

    Verify that the Networking service (neutron) agents are running. The list of agents varies depending on the services you enabled:

    +
    +
    +
    $ oc exec openstackclient -- openstack network agent list
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| ID                                   | Agent Type                   | Host                   | Availability Zone | Alive | State | Binary                     |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +| 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent                   | standalone.localdomain | nova              | :-)   | UP    | neutron-dhcp-agent         |
    +| 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent           | standalone.localdomain |                   | :-)   | UP    | neutron-ovn-metadata-agent |
    +| a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller Gateway agent | standalone.localdomain |                   | :-)   | UP    | ovn-controller             |
    ++--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
    +
    +
    +
  6. +
+
+
+
+
+
+

Migrating the Object Storage service (swift) to Red Hat OpenStack Services on OpenShift (RHOSO) nodes

+
+
+

This section only applies if you are using the OpenStack Object Storage service (swift) as your object storage service. If you are using the Object Storage API of the Ceph Object Gateway (RGW), you can skip this section.

+
+
+

Data migration to the new deployment might be a long-running process that runs mostly in the background. The Object Storage service replicators move the data from the old to the new nodes, but depending on the amount of used storage this might take a very long time. You can still use the old nodes as long as they are running, and continue adopting other services in the meantime, which reduces the amount of downtime. Note that performance might be degraded because of the replication traffic in the network.

+
+
+

Migration of the data happens replica by replica. Assuming you start with 3 replicas, only one of them is moved at any time, ensuring that the remaining 2 replicas are still available and the Object Storage service remains usable during the migration.

+
+
+

Migrating the Object Storage service (swift) data from OSP to Red Hat OpenStack Services on OpenShift (RHOSO) nodes

+
+

To ensure availability during the Object Storage service (swift) migration, you perform the following steps:

+
+
+
    +
  1. +

    Add new nodes to the Object Storage service rings

    +
  2. +
  3. +

    Set weights of existing nodes to 0

    +
  4. +
  5. +

    Rebalance rings, moving one replica

    +
  6. +
  7. +

    Copy rings to old nodes and restart services

    +
  8. +
  9. +

    Check replication status and repeat previous two steps until old nodes are drained

    +
  10. +
  11. +

    Remove the old nodes from the rings

    +
  12. +
+
+
+
Prerequisites
+
    +
  • +

    Previous Object Storage service adoption steps are completed.

    +
  • +
  • +

    No new environment variables need to be defined, though you use the CONTROLLER1_SSH alias that was defined in a previous step.

    +
  • +
  • +

    For DNS servers, all existing nodes must be able to resolve the host names of the OpenShift pods, for example by using the external IP of the DNSMasq service as the name server in /etc/resolv.conf:

    +
    +
    +
    oc get service dnsmasq-dns -o jsonpath="{.status.loadBalancer.ingress[0].ip}" | CONTROLLER1_SSH tee /etc/resolv.conf
    +
    +
    +
  • +
  • +

    To track the current status of the replication, a tool called swift-dispersion is used. It consists of two parts: a population tool that you run before changing the Object Storage service rings, and a report tool that you run afterwards to gather the current status. Run the swift-dispersion-populate command:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-populate'
    +
    +
    +
    +

    The command might need a few minutes to complete. It creates 0-byte objects distributed across the Object Storage service deployment, and its counterpart swift-dispersion-report can be used afterwards to show the current replication status.

    +
    +
    +

    The output of the swift-dispersion-report command should look like the following:

    +
    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-report'
    +
    +
    +
    +
    +
    Queried 1024 containers for dispersion reporting, 5s, 0 retries
    +100.00% of container copies found (3072 of 3072)
    +Sample represents 100.00% of the container partition space
    +Queried 1024 objects for dispersion reporting, 4s, 0 retries
    +There were 1024 partitions missing 0 copies.
    +100.00% of object copies found (3072 of 3072)
    +Sample represents 100.00% of the object partition space
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Add new nodes by scaling up the SwiftStorage resource from 0 to 3. In that case, 3 storage instances that use PVCs are created, running on the OpenShift cluster.

    +
    +
    +
    oc patch openstackcontrolplane openstack --type=merge -p='{"spec":{"swift":{"template":{"swiftStorage":{"replicas": 3}}}}}'
    +
    +
    +
  2. +
  3. +

    Wait until all three pods are running:

    +
    +
    +
    oc wait pods --for condition=Ready -l component=swift-storage
    +
    +
    +
  4. +
  5. +

    Drain the existing nodes. Get the storage management IP addresses of the nodes to drain from the current rings:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder' | tail -n +7 | awk '{print $4}' | sort -u
    +
    +
    +
    +

    The output will look similar to the following:

    +
    +
    +
    +
    172.20.0.100:6200
    +swift-storage-0.swift-storage.openstack.svc:6200
    +swift-storage-1.swift-storage.openstack.svc:6200
    +swift-storage-2.swift-storage.openstack.svc:6200
    +
    +
    +
    +

    In this case, the old node 172.20.0.100 is drained. Your nodes might be different, and depending on the deployment, there are likely more nodes to include in the following commands.

    +
    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
    +swift-ring-tool get
    +swift-ring-tool drain 172.20.0.100
    +swift-ring-tool rebalance
    +swift-ring-tool push'
    +
    +
    +
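    If several old nodes must be drained, you can drain them in one pass. A minimal sketch, assuming a second hypothetical node at 172.20.0.101:

    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
    +swift-ring-tool get
    +for ip in 172.20.0.100 172.20.0.101; do swift-ring-tool drain $ip; done
    +swift-ring-tool rebalance
    +swift-ring-tool push'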
  6. +
  7. +

    Copy the updated rings to the original nodes and apply them. Run the ssh commands for your existing nodes that store Object Storage service data.

    +
    +
    +
    oc extract --confirm cm/swift-ring-files
    +CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
    +CONTROLLER1_SSH "systemctl restart tripleo_swift_*"
    +
    +
    +
  8. +
  9. +

    Track the replication progress by using the swift-dispersion-report tool:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c "swift-ring-tool get && swift-dispersion-report"
    +
    +
    +
    +

    The output shows less than 100% of copies found. Repeat the above command until all container and object copies are found:

    +
    +
    +
    +
    Queried 1024 containers for dispersion reporting, 6s, 0 retries
    +There were 5 partitions missing 1 copy.
    +99.84% of container copies found (3067 of 3072)
    +Sample represents 100.00% of the container partition space
    +Queried 1024 objects for dispersion reporting, 7s, 0 retries
    +There were 739 partitions missing 1 copy.
    +There were 285 partitions missing 0 copies.
    +75.94% of object copies found (2333 of 3072)
    +Sample represents 100.00% of the object partition space
    +
    +
    +
  10. +
  11. +

    Move the next replica to the new nodes. To do so, rebalance and distribute the rings again:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
    +swift-ring-tool get
    +swift-ring-tool rebalance
    +swift-ring-tool push'
    +
    +oc extract --confirm cm/swift-ring-files
    +CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
    +CONTROLLER1_SSH "systemctl restart tripleo_swift_*"
    +
    +
    +
    +

    Monitor the swift-dispersion-report output again, wait until all copies are found again and repeat this step until all your replicas are moved to the new nodes.

    +
    +
  12. +
  13. +

    After the nodes are drained, remove the nodes from the rings:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
    +swift-ring-tool get
    +swift-ring-tool remove 172.20.0.100
    +swift-ring-tool rebalance
    +swift-ring-tool push'
    +
    +
    +
  14. +
+
+
+
Verification
+
    +
  • +

    Even if all replicas are already on the new nodes and the swift-dispersion-report command reports 100% of the copies found, there might still be data on the old nodes. This data is removed by the replicators, but it might take some more time.

    +
    +

    You can check the disk usage of all disks in the cluster:

    +
    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -d'
    +
    +
    +
  • +
  • +

    Confirm that there are no more *.db or *.data files in the /srv/node directory on these nodes:

    +
    +
    +
    CONTROLLER1_SSH "find /srv/node/ -type f -name '*.db' -o -name '*.data' | wc -l"
    +
    +
    +
  • +
+
+
+
+

Troubleshooting the Object Storage service (swift) migration

+
+

You can troubleshoot issues with the Object Storage service (swift) migration.

+
+
+
    +
  • +

    The following command might be helpful for debugging if the replication is not working and the swift-dispersion-report output does not return to 100%.

    +
    +
    +
    CONTROLLER1_SSH tail /var/log/containers/swift/swift.log | grep object-server
    +
    +
    +
    +

    This should show the replicator progress, for example:

    +
    +
    +
    +
    Mar 14 06:05:30 standalone object-server[652216]: <f+++++++++ 4e2/9cbea55c47e243994b0b10d8957184e2/1710395823.58025.data
    +Mar 14 06:05:30 standalone object-server[652216]: Successful rsync of /srv/node/vdd/objects/626/4e2 to swift-storage-1.swift-storage.openstack.svc::object/d1/objects/626 (0.094)
    +Mar 14 06:05:30 standalone object-server[652216]: Removing partition: /srv/node/vdd/objects/626
    +Mar 14 06:05:31 standalone object-server[652216]: <f+++++++++ 85f/cf53b5a048e5b19049e05a548cde185f/1710395796.70868.data
    +Mar 14 06:05:31 standalone object-server[652216]: Successful rsync of /srv/node/vdb/objects/829/85f to swift-storage-2.swift-storage.openstack.svc::object/d1/objects/829 (0.095)
    +Mar 14 06:05:31 standalone object-server[652216]: Removing partition: /srv/node/vdb/objects/829
    +
    +
    +
  • +
  • +

    You can also check the ring consistency and replicator status:

    +
    +
    +
    oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -r --md5'
    +
    +
    +
    +

    Note that the output might show an md5 mismatch for up to approximately 2 minutes after pushing new rings. Eventually, it looks similar to the following example:

    +
    +
    +
    +
    [...]
    +Oldest completion was 2024-03-14 16:53:27 (3 minutes ago) by 172.20.0.100:6000.
    +Most recent completion was 2024-03-14 16:56:38 (12 seconds ago) by swift-storage-0.swift-storage.openstack.svc:6200.
    +===============================================================================
    +[2024-03-14 16:56:50] Checking ring md5sums
    +4/4 hosts matched, 0 error[s] while checking hosts.
    +[...]
    +
    +
    +
  • +
+
+
+
+
+
+

Migrating the Ceph Storage Cluster

+
+
+

In the context of data plane adoption, where the OpenStack (OSP) services are redeployed in OpenShift, you migrate a TripleO-deployed Ceph Storage cluster by using a process called “externalizing” the Ceph Storage cluster.

+
+
+

There are two deployment topologies that include an internal Ceph Storage cluster:

+
+
+
    +
  • +

    OSP includes dedicated Ceph Storage nodes to host object storage daemons (OSDs)

    +
  • +
  • +

    Hyperconverged Infrastructure (HCI), where Compute and Storage services are colocated on hyperconverged nodes

    +
  • +
+
+
+

In either scenario, there are some Ceph processes that are deployed on OSP Controller nodes: Ceph monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha. To migrate your Ceph Storage cluster, you must decommission the Controller nodes and move the Ceph daemons to a set of target nodes that are already part of the Ceph Storage cluster.

+
+
+

Ceph daemon cardinality

+
+

Ceph 6 and later applies strict constraints in the way daemons can be colocated within the same node. Your topology depends on the available hardware and the number of Ceph services in the Controller nodes that you retire. The number of services that you can migrate depends on the number of available nodes in the cluster. The following diagrams show the distribution of Ceph daemons on Ceph nodes, where at least 3 nodes are required.

+
+
+
    +
  • +

    The following scenario includes only RGW and RBD, without the Ceph dashboard:

    +
    +
    +
    |    |                     |             |
    +|----|---------------------|-------------|
    +| osd | mon/mgr/crash      | rgw/ingress |
    +| osd | mon/mgr/crash      | rgw/ingress |
    +| osd | mon/mgr/crash      | rgw/ingress |
    +
    +
    +
  • +
  • +

    With the Ceph dashboard, but without the Shared File Systems service (manila), at least 4 nodes are required. The Ceph dashboard has no failover:

    +
    +
    +
    |     |                     |             |
    +|-----|---------------------|-------------|
    +| osd | mon/mgr/crash | rgw/ingress       |
    +| osd | mon/mgr/crash | rgw/ingress       |
    +| osd | mon/mgr/crash | dashboard/grafana |
    +| osd | rgw/ingress   | (free)            |
    +
    +
    +
  • +
  • +

    With the Ceph dashboard and the Shared File Systems service, a minimum of 5 nodes is required, and the Ceph dashboard has no failover:

    +
    +
    +
    |     |                     |                         |
    +|-----|---------------------|-------------------------|
    +| osd | mon/mgr/crash       | rgw/ingress             |
    +| osd | mon/mgr/crash       | rgw/ingress             |
    +| osd | mon/mgr/crash       | mds/ganesha/ingress     |
    +| osd | rgw/ingress         | mds/ganesha/ingress     |
    +| osd | mds/ganesha/ingress | dashboard/grafana       |
    +
    +
    +
  • +
+
+
+
+

Migrating the monitoring stack component to new nodes within an existing Ceph cluster

+
+

The Ceph Dashboard module adds web-based monitoring and administration to the Ceph Manager. With TripleO-deployed Ceph, the Ceph Dashboard is enabled as part of the overcloud deployment and is composed of the following components:

+
+
+
    +
  • +

    Ceph Manager module

    +
  • +
  • +

    Grafana

    +
  • +
  • +

    Prometheus

    +
  • +
  • +

    Alertmanager

    +
  • +
  • +

    Node exporter

    +
  • +
+
+
+

The Ceph Dashboard containers are included through tripleo-container-image-prepare parameters, and high availability (HA) relies on HAProxy and Pacemaker being deployed in the OpenStack (OSP) environment. For an external Ceph Storage cluster, HA is not supported.

+
+
+

In this procedure, you migrate and relocate the Ceph monitoring components in order to free up the Controller nodes.

+
+
+
Prerequisites
+
    +
  • +

    You have an OSP Wallaby environment.

    +
  • +
  • +

    You have a Ceph Reef deployment that is managed by TripleO.

    +
  • +
  • +

    Your Ceph Reef deployment is managed by cephadm.

    +
  • +
  • +

    Both the Ceph public and cluster networks are propagated through TripleO to the target nodes.

    +
  • +
+
+
+

Completing prerequisites for a Ceph cluster with monitoring stack components

+
+

Complete the following prerequisites before you migrate a Ceph cluster with monitoring stack components.

+
+
+
Procedure
+
    +
  1. +

    Gather the current status of the monitoring stack. Verify that the hosts have no monitoring label, or no grafana, prometheus, or alertmanager labels in the case of a per-daemon placement evaluation:

    +
    + + + + + +
    + + +The entire relocation process is driven by cephadm and relies on labels being assigned to the target nodes where the daemons are scheduled. +
    +
    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls
    +
    +HOST                    	ADDR       	LABELS                 	STATUS
    +cephstorage-0.redhat.local  192.168.24.11  osd mds
    +cephstorage-1.redhat.local  192.168.24.12  osd mds
    +cephstorage-2.redhat.local  192.168.24.47  osd mds
    +controller-0.redhat.local   192.168.24.35  _admin mon mgr
    +controller-1.redhat.local   192.168.24.53  mon _admin mgr
    +controller-2.redhat.local   192.168.24.10  mon _admin mgr
    +6 hosts in cluster
    +
    +
    +
    +

    Confirm that the cluster is healthy and that both ceph orch ls and ceph orch ps return the expected number of deployed daemons.

    +
    +
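    For example, a minimal sketch of those health checks, run from a node that has the cephadm admin keyring:

    sudo cephadm shell -- ceph -s
    +sudo cephadm shell -- ceph orch ls
    +sudo cephadm shell -- ceph orch ps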
  2. +
  3. +

    Review and update the container image registry:

    +
    + + + + + +
    + + +If you run the Ceph externalization procedure after you migrate the OpenStack control plane, update the container images in the Ceph Storage cluster configuration. The current container images point to the undercloud registry, which might not be available anymore. Because the undercloud is not available after adoption is complete, replace the undercloud-provided images with an alternative registry. If you want to rely on the default images shipped by cephadm, remove the following configuration options from the Ceph Storage cluster. +
    +
    +
    +
    +
    $ ceph config dump
    +...
    +...
    +mgr   advanced  mgr/cephadm/container_image_alertmanager    undercloud-0.ctlplane.redhat.local:8787/ceph/alertmanager:v0.25.0
    +mgr   advanced  mgr/cephadm/container_image_base            undercloud-0.ctlplane.redhat.local:8787/ceph/ceph:v18
    +mgr   advanced  mgr/cephadm/container_image_grafana         undercloud-0.ctlplane.redhat.local:8787/ceph/ceph-grafana:9.4.7
    +mgr   advanced  mgr/cephadm/container_image_node_exporter   undercloud-0.ctlplane.redhat.local:8787/ceph/node-exporter:v1.5.0
    +mgr   advanced  mgr/cephadm/container_image_prometheus      undercloud-0.ctlplane.redhat.local:8787/ceph/prometheus:v2.43.0
    +
    +
    +
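    If you prefer to point the Ceph Storage cluster at an alternative registry instead of removing the options, a minimal sketch that overrides one of these keys (the registry and tag are assumptions; use images that match your Ceph release):

    sudo cephadm shell -- ceph config set mgr mgr/cephadm/container_image_prometheus quay.io/prometheus/prometheus:v2.43.0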
  4. +
  5. +

    Remove the undercloud container images:

    +
    +
    +
    $ cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_base
    +for i in prometheus grafana alertmanager node_exporter; do
    +    cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_$i
    +done
    +
    +
    +
  6. +
+
+
+
+

Migrating the monitoring stack to the target nodes

+
+

To migrate the monitoring stack to the target nodes, you add the monitoring label to your existing nodes and update the configuration of each daemon. You do not need to migrate node exporters. These daemons are deployed across the nodes that are part of the Ceph cluster (the placement is ‘*’).

+
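You can confirm the node exporter placement with a minimal sketch like the following, which is expected to show a ‘*’ host pattern:

sudo cephadm shell -- ceph orch ls node-exporter --export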
+
+
Prerequisites
+
    +
  • +

    Confirm that the firewall rules are in place and the ports are open for a given monitoring stack service.

    +
  • +
+
+
+ + + + + +
+ + +Depending on the target nodes and the number of deployed or active daemons, you can either relocate the existing containers to the target nodes, or select a subset of nodes that host the monitoring stack daemons. High availability (HA) is not supported. Reducing the placement with count: 1 allows you to migrate the existing daemons in a Hyperconverged Infrastructure, or hardware-limited, scenario without impacting other services. +
+
+
+
Migrating the existing daemons to the target nodes
+
+

The following procedure is an example of an environment with 3 Ceph nodes or ComputeHCI nodes. This scenario extends the monitoring labels to all the Ceph or ComputeHCI nodes that are part of the cluster. This means that you keep 3 placements for the target nodes.

+
+
+
Procedure
+
    +
  1. +

    Add the monitoring label to all the Ceph Storage or ComputeHCI nodes in the cluster:

    +
    +
    +
    for item in $(sudo cephadm shell --  ceph orch host ls --format json | jq -r '.[].hostname'); do
    +    sudo cephadm shell -- ceph orch host label add  $item monitoring;
    +done
    +
    +
    +
  2. +
  3. +

    Verify that all the hosts have the monitoring label:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls
    +
    +HOST                        ADDR           LABELS
    +cephstorage-0.redhat.local  192.168.24.11  osd monitoring
    +cephstorage-1.redhat.local  192.168.24.12  osd monitoring
    +cephstorage-2.redhat.local  192.168.24.47  osd monitoring
    +controller-0.redhat.local   192.168.24.35  _admin mon mgr monitoring
    +controller-1.redhat.local   192.168.24.53  mon _admin mgr monitoring
    +controller-2.redhat.local   192.168.24.10  mon _admin mgr monitoring
    +
    +
    +
  4. +
  5. +

    Remove the labels from the Controller nodes:

    +
    +
    +
    $ for i in 0 1 2; do ceph orch host label rm "controller-$i.redhat.local" monitoring; done
    +
    +Removed label monitoring from host controller-0.redhat.local
    +Removed label monitoring from host controller-1.redhat.local
    +Removed label monitoring from host controller-2.redhat.local
    +
    +
    +
  6. +
  7. +

    Dump the current monitoring stack spec:

    +
    +
    +
    function export_spec {
    +    local component="$1"
    +    local target_dir="$2"
    +    sudo cephadm shell -- ceph orch ls --export "$component" > "$target_dir/$component"
    +}
    +
    +SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
    +for m in grafana prometheus alertmanager; do
    +    export_spec "$m" "$SPEC_DIR"
    +done
    +
    +
    +
  8. +
  9. +

    For each daemon, edit the current spec and replace the placement:hosts: section with the placement:label: section, for example:

    +
    +
    +
    service_type: grafana
    +service_name: grafana
    +placement:
    +  label: monitoring
    +networks:
    +- 172.17.3.0/24
    +spec:
    +  port: 3100
    +
    +
    +
    +

    This step also applies to Prometheus and Alertmanager specs.

    +
    +
  10. +
  11. +

    Apply the new monitoring spec to relocate the monitoring stack daemons:

    +
    +
    +
    SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
    +function migrate_daemon {
    +    local component="$1"
    +    local target_dir="$2"
    +    sudo cephadm shell -m "$target_dir" -- ceph orch apply -i /mnt/ceph_specs/$component
    +}
    +for m in grafana prometheus alertmanager; do
    +    migrate_daemon  "$m" "$SPEC_DIR"
    +done
    +
    +
    +
  12. +
  13. +

    Verify that the daemons are deployed on the expected nodes:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
    +alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
    +grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
    +prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
    +
    +
    +
    + + + + + +
    + + +After you migrate the monitoring stack, you lose high availability. The monitoring stack daemons no longer have a virtual IP address or HAProxy. Node exporters are still running on all the nodes. +
    +
    +
  14. +
  15. +

    Review the Ceph configuration to ensure that it aligns with the configuration on the target nodes. In particular, focus on the following configuration entries:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump
    +...
    +mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST  http://172.17.3.83:9093
    +mgr  advanced  mgr/dashboard/GRAFANA_API_URL        https://172.17.3.144:3100
    +mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST    http://172.17.3.83:9092
    +mgr  advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
    +mgr  advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
    +mgr  advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138
    +
    +
    +
  16. +
  17. +

    Verify that the API_HOST/URL of the grafana, alertmanager and prometheus services points to the IP addresses on the storage network of the node where each daemon is relocated:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
    +alertmanager.cephstorage-0  cephstorage-0.redhat.local  172.17.3.83:9093,9094
    +alertmanager.cephstorage-1  cephstorage-1.redhat.local  172.17.3.53:9093,9094
    +alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
    +grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
    +grafana.cephstorage-1       cephstorage-1.redhat.local  172.17.3.53:3100
    +grafana.cephstorage-2       cephstorage-2.redhat.local  172.17.3.144:3100
    +prometheus.cephstorage-0    cephstorage-0.redhat.local  172.17.3.83:9092
    +prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
    +prometheus.cephstorage-2    cephstorage-2.redhat.local  172.17.3.144:9092
    +
    +
    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump
    +...
    +...
    +mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST   http://172.17.3.83:9093
    +mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST     http://172.17.3.83:9092
    +mgr  advanced  mgr/dashboard/GRAFANA_API_URL         https://172.17.3.144:3100
    +
    +
    +
    + + + + + +
    + + +The Ceph Dashboard, as the service provided by the Ceph mgr, is not impacted by the relocation. You might experience an impact when the active mgr daemon is migrated or is force-failed. However, you can define 3 replicas in the Ceph Manager configuration to redirect requests to a different instance. +
    +
    +
  18. +
+
+
+
+
Scenario 2: Relocating one instance of a monitoring stack to migrate daemons to target nodes
+
+

Instead of adding a single monitoring label to all the target nodes, it is possible to relocate one instance of each monitoring stack daemon to a particular node.

+
+
+
Procedure
+
    +
  1. +

    Set each of your nodes to host a particular daemon instance, for example, if you have three target nodes:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls | grep -i cephstorage
    +
    +HOST                        ADDR           LABELS
    +cephstorage-0.redhat.local  192.168.24.11  osd ---> grafana
    +cephstorage-1.redhat.local  192.168.24.12  osd ---> prometheus
    +cephstorage-2.redhat.local  192.168.24.47  osd ---> alertmanager
    +
    +
    +
  2. +
  3. +

    Add the appropriate labels to the target nodes:

    +
    +
    +
    declare -A target_nodes
    +
    +target_nodes[grafana]=cephstorage-0
    +target_nodes[prometheus]=cephstorage-1
    +target_nodes[alertmanager]=cephstorage-2
    +
    +for label in "${!target_nodes[@]}"; do
    +    ceph orch host label add ${target_nodes[$label]} $label
    +done
    +
    +
    +
  4. +
  5. +

    Verify that the labels are properly applied to the target nodes:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls | grep -i cephstorage
    +
    +HOST                    	ADDR       	LABELS          	STATUS
    +cephstorage-0.redhat.local  192.168.24.11  osd grafana
    +cephstorage-1.redhat.local  192.168.24.12  osd prometheus
    +cephstorage-2.redhat.local  192.168.24.47  osd alertmanager
    +
    +
    +
  6. +
  7. +

    Dump the current monitoring stack spec:

    +
    +
    +
    function export_spec {
    +    local component="$1"
    +    local target_dir="$2"
    +    sudo cephadm shell -- ceph orch ls --export "$component" > "$target_dir/$component"
    +}
    +
    +SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
    +for m in grafana prometheus alertmanager; do
    +    export_spec "$m" "$SPEC_DIR"
    +done
    +
    +
    +
  8. +
  9. +

    For each daemon, edit the current spec and replace the placement/hosts section with the placement/label section, for example:

    +
    +
    +
    service_type: grafana
    +service_name: grafana
    +placement:
    +  label: grafana
    +networks:
    +- 172.17.3.0/24
    +spec:
    +  port: 3100
    +
    +
    +
    +

    The same procedure applies to Prometheus and Alertmanager specs.

    +
    +
  10. +
  11. +

    Apply the new monitoring spec to relocate the monitoring stack daemons:

    +
    +
    +
    SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
    +function migrate_daemon {
    +    local component="$1"
    +    local target_dir="$2"
    +    sudo cephadm shell -m "$target_dir" -- ceph orch apply -i /mnt/ceph_specs/$component
    +}
    +for m in grafana prometheus alertmanager; do
    +    migrate_daemon  "$m" "$SPEC_DIR"
    +done
    +
    +
    +
  12. +
  13. +

    Verify that the daemons are deployed on the expected nodes:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
    +alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
    +grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
    +prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
    +
    +
    +
    + + + + + +
    + + +With this procedure, high availability is lost: the monitoring stack daemons no longer have a VIP or HAProxy. Node exporters keep running on all the nodes with their existing placement, rather than labels, so that the monitoring coverage is not reduced. +
    +
    +
  14. +
  15. +

    Update the Ceph Dashboard Manager configuration. At this point, verify that the Ceph configuration is aligned with the relocation that you just made. Run the ceph config dump command and review the current configuration. In particular, focus on the following configuration entries:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump
    +...
    +mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST  http://172.17.3.83:9093
    +mgr  advanced  mgr/dashboard/GRAFANA_API_URL        https://172.17.3.144:3100
    +mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST    http://172.17.3.83:9092
    +mgr  advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
    +mgr  advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
    +mgr  advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138
    +
    +
    +
  16. +
  17. +

    Verify that the grafana, alertmanager, and prometheus API_HOST/URL settings point to the IP addresses (on the storage network) of the node where each daemon has been relocated. This should be addressed automatically by cephadm and should not require any manual action.

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
    +alertmanager.cephstorage-0  cephstorage-0.redhat.local  172.17.3.83:9093,9094
    +alertmanager.cephstorage-1  cephstorage-1.redhat.local  172.17.3.53:9093,9094
    +alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
    +grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
    +grafana.cephstorage-1       cephstorage-1.redhat.local  172.17.3.53:3100
    +grafana.cephstorage-2       cephstorage-2.redhat.local  172.17.3.144:3100
    +prometheus.cephstorage-0    cephstorage-0.redhat.local  172.17.3.83:9092
    +prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
    +prometheus.cephstorage-2    cephstorage-2.redhat.local  172.17.3.144:9092
    +
    +
    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump
    +...
    +...
    +mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST   http://172.17.3.83:9093
    +mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST     http://172.17.3.83:9092
    +mgr  advanced  mgr/dashboard/GRAFANA_API_URL         https://172.17.3.144:3100
    +
    +
    +
  18. +
  19. +

    The Ceph Dashboard (an mgr module plugin) is not impacted by this relocation. Because the service is provided by the Ceph Manager daemon, you might experience an impact when the active mgr is migrated or is force-failed. However, defining three replicas allows requests to be redirected to a different instance (it is still an active/passive model), so the impact should be limited.

    +
    +
      +
    1. +

      When the RBD migration is complete, the following Ceph configuration keys must be regenerated to point to the correct mgr container:

      +
      +
      +
      mgr    advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
      +mgr    advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
      +mgr    advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138
      +
      +
      +
      +
      +
      $ sudo cephadm shell
      +$ ceph orch ps | awk '/mgr./ {print $1}'
      +
      +
      +
    2. +
    3. +

      For each retrieved mgr, update the entry in the Ceph configuration:

      +
      +
      +
      $ ceph config set mgr mgr/dashboard/<>/server_addr <ip addr>
      +
      +
      +
    4. +
    +
    +
  20. +
+
+
+
+
+
+

Migrating Ceph MDS to new nodes within the existing cluster

+
+

You can migrate the MDS daemon when the Shared File Systems service (manila), deployed with either a cephfs-native or ceph-nfs back end, is part of the overcloud deployment. The MDS migration is performed by cephadm, and you move the daemon placement from a hosts-based approach to a label-based approach. This ensures that you can visualize the status of the cluster and where the daemons are placed by using the ceph orch host command, and have a general view of how the daemons are colocated within a given host.

+
+
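For reference, a minimal sketch of the label-based MDS placement that the migration converges on, following the same pattern as the monitoring specs above (the exact spec fields depend on your deployment):

service_type: mds
+service_id: mds
+service_name: mds.mds
+placement:
+  label: mds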
+
Prerequisites
+
    +
  • +

    An OSP Wallaby environment and a Ceph Reef deployment that is managed by TripleO.

    +
  • +
  • +

    Ceph is upgraded to Ceph Reef and is managed by cephadm.

    +
  • +
  • +

    Both the Ceph public and cluster networks are propagated through TripleO to the target nodes.

    +
  • +
+
+
+
Prerequisites
+
    +
  • +

    Verify that the Ceph Storage cluster is healthy and check the MDS status:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph fs ls
    +name: cephfs, metadata pool: manila_metadata, data pools: [manila_data ]
    +
    +[ceph: root@controller-0 /]# ceph mds stat
    +cephfs:1 {0=mds.controller-2.oebubl=up:active} 2 up:standby
    +
    +[ceph: root@controller-0 /]# ceph fs status cephfs
    +
    +cephfs - 0 clients
    +======
    +RANK  STATE         	MDS           	ACTIVITY 	DNS	INOS   DIRS   CAPS
    + 0	active  mds.controller-2.oebubl  Reqs:	0 /s   696	196	173  	0
    +  	POOL     	TYPE 	USED  AVAIL
    +manila_metadata  metadata   152M   141G
    +  manila_data  	data	3072M   141G
    +  	STANDBY MDS
    +mds.controller-0.anwiwd
    +mds.controller-1.cwzhog
    +MDS version: ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)
    +
    +
    +
  • +
  • +

    Retrieve more detailed information on the Ceph File System (CephFS) MDS status:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph fs dump
    +
    +e8
    +enable_multiple, ever_enabled_multiple: 1,1
    +default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
    +legacy client fscid: 1
    +
    +Filesystem 'cephfs' (1)
    +fs_name cephfs
    +epoch   5
    +flags   12 joinable allow_snaps allow_multimds_snaps
    +created 2024-01-18T19:04:01.633820+0000
    +modified    	2024-01-18T19:04:05.393046+0000
    +tableserver 	0
    +root	0
    +session_timeout 60
    +session_autoclose   	300
    +max_file_size   1099511627776
    +required_client_features    	{}
    +last_failure	0
    +last_failure_osd_epoch  0
    +compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
    +max_mds 1
    +in  	0
    +up  	{0=24553}
    +failed
    +damaged
    +stopped
    +data_pools  	[7]
    +metadata_pool   9
    +inline_data 	disabled
    +balancer
    +standby_count_wanted	1
    +[mds.mds.controller-2.oebubl{0:24553} state up:active seq 2 addr [v2:172.17.3.114:6800/680266012,v1:172.17.3.114:6801/680266012] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +Standby daemons:
    +
    +[mds.mds.controller-0.anwiwd{-1:14715} state up:standby seq 1 addr [v2:172.17.3.20:6802/3969145800,v1:172.17.3.20:6803/3969145800] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.controller-1.cwzhog{-1:24566} state up:standby seq 1 addr [v2:172.17.3.43:6800/2227381308,v1:172.17.3.43:6801/2227381308] compat {c=[1],r=[1],i=[7ff]}]
    +dumped fsmap epoch 8
    +
    +
    +
  • +
  • +

    Check the OSD blocklist and clean up the client list:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph osd blocklist ls
    +..
    +..
    +for item in $(ceph osd blocklist ls | awk '{print $1}'); do
    +     ceph osd blocklist rm $item;
    +done
    +
    +
    +
    + + + + + +
    + + +
    +

    When a file system client is unresponsive or misbehaving, the access to the file system might be forcibly terminated. This process is called eviction. Evicting a CephFS client prevents it from communicating further with MDS daemons and OSD daemons.

    +
    +
    +

    Ordinarily, a blocklisted client cannot reconnect to the servers; you must unmount and then remount the client. However, permitting a client that was evicted to attempt to reconnect can be useful. Because CephFS uses the RADOS OSD blocklist to control client eviction, you can permit CephFS clients to reconnect by removing them from the blocklist.

    +
    +
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    Get the hosts that are currently part of the Ceph cluster:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch host ls
    +HOST                        ADDR           LABELS          STATUS
    +cephstorage-0.redhat.local  192.168.24.25  osd mds
    +cephstorage-1.redhat.local  192.168.24.50  osd mds
    +cephstorage-2.redhat.local  192.168.24.47  osd mds
    +controller-0.redhat.local   192.168.24.24  _admin mgr mon
    +controller-1.redhat.local   192.168.24.42  mgr _admin mon
    +controller-2.redhat.local   192.168.24.37  mgr _admin mon
    +6 hosts in cluster
    +
    +[ceph: root@controller-0 /]# ceph orch ls --export mds
    +service_type: mds
    +service_id: mds
    +service_name: mds.mds
    +placement:
    +  hosts:
    +  - controller-0.redhat.local
    +  - controller-1.redhat.local
    +  - controller-2.redhat.local
    +
    +
    +
  2. +
  3. +

    Apply the MDS labels to the target nodes:

    +
    +
    +
    for item in $(sudo cephadm shell --  ceph orch host ls --format json | jq -r '.[].hostname'); do
    +    sudo cephadm shell -- ceph orch host label add  $item mds;
    +done
    +
    +
    +
  4. +
  5. +

    Verify that all the hosts have the MDS label:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls
    +
    +HOST                    	ADDR       	   LABELS
    +cephstorage-0.redhat.local  192.168.24.11  osd mds
    +cephstorage-1.redhat.local  192.168.24.12  osd mds
    +cephstorage-2.redhat.local  192.168.24.47  osd mds
    +controller-0.redhat.local   192.168.24.35  _admin mon mgr mds
    +controller-1.redhat.local   192.168.24.53  mon _admin mgr mds
    +controller-2.redhat.local   192.168.24.10  mon _admin mgr mds
    +
    +
    +
  6. +
  7. +

    Dump the current MDS spec:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ls --export mds > mds.yaml
    +
    +
    +
  8. +
  9. +

    Edit the retrieved spec and replace the placement.hosts section with +placement.label:

    +
    +
    +
    service_type: mds
    +service_id: mds
    +service_name: mds.mds
    +placement:
    +  label: mds
    +
    +
    +
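    As an alternative to editing the exported file by hand, you can write the label-based spec directly. This is a minimal sketch that produces the same content as the edited spec above:

    $ printf '%s\n' 'service_type: mds' 'service_id: mds' 'service_name: mds.mds' 'placement:' '  label: mds' > mds.yaml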
  10. +
  11. +

    Use the ceph orchestrator to apply the new MDS spec:

    +
    +
    +
    $ sudo cephadm shell -m mds.yaml -- ceph orch apply -i /mnt/mds.yaml
    +Scheduling new mds deployment …
    +
    +
    +
    +

    This results in an increased number of MDS daemons.

    +
    +
  12. +
  13. +

    Check the new standby daemons that are temporarily added to the CephFS:

    +
    +
    +
    $ ceph fs dump
    +
    +Active
    +
    +standby_count_wanted    1
    +[mds.mds.controller-0.awzplm{0:463158} state up:active seq 307 join_fscid=1 addr [v2:172.17.3.20:6802/51565420,v1:172.17.3.20:6803/51565420] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +Standby daemons:
    +
    +[mds.mds.cephstorage-1.jkvomp{-1:463800} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/2075903648,v1:172.17.3.135:6821/2075903648] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.controller-2.gfrqvc{-1:475945} state up:standby seq 1 addr [v2:172.17.3.114:6800/2452517189,v1:172.17.3.114:6801/2452517189] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.cephstorage-0.fqcshx{-1:476503} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.controller-1.tyiziq{-1:499136} state up:standby seq 1 addr [v2:172.17.3.43:6800/3615018301,v1:172.17.3.43:6801/3615018301] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +
  14. +
  15. +

    To migrate MDS to the target nodes, set the MDS affinity that manages the MDS failover:

    +
    Note: It is possible to elect a dedicated MDS as "active" for a particular file system. To configure this preference, CephFS provides a configuration option for MDS called mds_join_fs, which enforces this affinity. When failing over MDS daemons, cluster monitors prefer standby daemons with mds_join_fs equal to the file system name with the failed rank. If no standby exists with mds_join_fs equal to the file system name, it chooses an unqualified standby as a replacement.
    +
    +
    +
    +
    $ ceph config set mds.mds.cephstorage-0.fqcshx mds_join_fs cephfs
    +
    +
    +
  16. +
  17. +

    Remove the labels from the Controller nodes and force the MDS failover to the +target node:

    +
    +
    +
    $ for i in 0 1 2; do ceph orch host label rm "controller-$i.redhat.local" mds; done
    +
    +Removed label mds from host controller-0.redhat.local
    +Removed label mds from host controller-1.redhat.local
    +Removed label mds from host controller-2.redhat.local
    +
    +
    +
    +

    The switch to the target node happens in the background. The new active MDS is the one that you set by using the mds_join_fs command.

    +
    +
  18. +
  19. +

    Check the result of the failover and the new deployed daemons:

    +
    +
    +
    $ ceph fs dump
    +…
    +…
    +standby_count_wanted    1
    +[mds.mds.cephstorage-0.fqcshx{0:476503} state up:active seq 168 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +Standby daemons:
    +
    +[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
    +[mds.mds.cephstorage-1.jkvomp{-1:499760} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/452139733,v1:172.17.3.135:6821/452139733] compat {c=[1],r=[1],i=[7ff]}]
    +
    +
    +$ ceph orch ls
    +
    +NAME                     PORTS   RUNNING  REFRESHED  AGE  PLACEMENT
    +crash                                6/6  10m ago    10d  *
    +mds.mds                          3/3  10m ago    32m  label:mds
    +
    +
    +$ ceph orch ps | grep mds
    +
    +
    +mds.mds.cephstorage-0.fqcshx  cephstorage-0.redhat.local                     running (79m)     3m ago  79m    27.2M        -  17.2.6-100.el9cp  1af7b794f353  2a2dc5ba6d57
    +mds.mds.cephstorage-1.jkvomp  cephstorage-1.redhat.local                     running (79m)     3m ago  79m    21.5M        -  17.2.6-100.el9cp  1af7b794f353  7198b87104c8
    +mds.mds.cephstorage-2.gnfhfe  cephstorage-2.redhat.local                     running (79m)     3m ago  79m    24.2M        -  17.2.6-100.el9cp  1af7b794f353  f3cb859e2a15
    +
    +
    +
  20. +
+
+
+
Useful resources
+ +
+
+
+

Migrating Ceph RGW to external RHEL nodes

+
+

For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes, you must migrate the Ceph Object Gateway (RGW) daemons that are included in the OpenStack Controller nodes into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or Ceph nodes. Your environment must have Ceph version 6 or later and be managed by cephadm or Ceph Orchestrator.

+
+
+

Completing prerequisites for Ceph RGW migration

+
+

Complete the following prerequisites before you begin the Ceph Object Gateway (RGW) migration.

+
+
+
Procedure
+
    +
  1. +

    Check the current status of the Ceph nodes:

    +
    +
    +
    (undercloud) [stack@undercloud-0 ~]$ metalsmith list
    +
    +
    +    +------------------------+    +----------------+
    +    | IP Addresses           |    |  Hostname      |
    +    +------------------------+    +----------------+
    +    | ctlplane=192.168.24.25 |    | cephstorage-0  |
    +    | ctlplane=192.168.24.10 |    | cephstorage-1  |
    +    | ctlplane=192.168.24.32 |    | cephstorage-2  |
    +    | ctlplane=192.168.24.28 |    | compute-0      |
    +    | ctlplane=192.168.24.26 |    | compute-1      |
    +    | ctlplane=192.168.24.43 |    | controller-0   |
    +    | ctlplane=192.168.24.7  |    | controller-1   |
    +    | ctlplane=192.168.24.41 |    | controller-2   |
    +    +------------------------+    +----------------+
    +
    +
    +
  2. +
  3. +

    Log in to controller-0 and check the Pacemaker status to identify the virtual IP addresses (VIPs) and HAProxy resources that are relevant to the RGW migration:

    +
    +
    +
    Full List of Resources:
    +  * ip-192.168.24.46	(ocf:heartbeat:IPaddr2):     	Started controller-0
    +  * ip-10.0.0.103   	(ocf:heartbeat:IPaddr2):     	Started controller-1
    +  * ip-172.17.1.129 	(ocf:heartbeat:IPaddr2):     	Started controller-2
    +  * ip-172.17.3.68  	(ocf:heartbeat:IPaddr2):     	Started controller-0
    +  * ip-172.17.4.37  	(ocf:heartbeat:IPaddr2):     	Started controller-1
    +  * Container bundle set: haproxy-bundle
    +
    +[undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
    +    * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-2
    +    * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-0
    +    * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-1
    +
    +
    +
  4. +
  5. +

    Identify the ranges of the storage networks. The following is an example and the values might differ in your environment:

    +
    +
    +
    [heat-admin@controller-0 ~]$ ip -o -4 a
    +
    +1: lo	inet 127.0.0.1/8 scope host lo\   	valid_lft forever preferred_lft forever
    +2: enp1s0	inet 192.168.24.45/24 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
    +2: enp1s0	inet 192.168.24.46/32 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
    +7: br-ex	inet 10.0.0.122/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever (1)
    +8: vlan70	inet 172.17.5.22/24 brd 172.17.5.255 scope global vlan70\   	valid_lft forever preferred_lft forever (2)
    +8: vlan70	inet 172.17.5.94/32 brd 172.17.5.255 scope global vlan70\   	valid_lft forever preferred_lft forever
    +9: vlan50	inet 172.17.2.140/24 brd 172.17.2.255 scope global vlan50\   	valid_lft forever preferred_lft forever
    +10: vlan30	inet 172.17.3.73/24 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
    +10: vlan30	inet 172.17.3.68/32 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
    +11: vlan20	inet 172.17.1.88/24 brd 172.17.1.255 scope global vlan20\   	valid_lft forever preferred_lft forever
    +12: vlan40	inet 172.17.4.24/24 brd 172.17.4.255 scope global vlan40\   	valid_lft forever preferred_lft forever
    +
    +
    +
    (1) br-ex represents the External Network, where in the current environment, HAProxy has the front-end Virtual IP (VIP) assigned.
    (2) vlan30 represents the Storage Network, where the new RGW instances should be started on the Ceph Storage nodes.
    +
    +
  6. +
  7. +

    Identify the network that you previously had in HAProxy and propagate it through TripleO to the Ceph Storage nodes. Use this network to reserve a new VIP that is owned by Ceph as the entry point for the RGW service.

    +
    +
      +
    1. +

      Log in to controller-0 and find the ceph_rgw section in the current HAProxy configuration:

      +
      +
      +
      $ less /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg
      +
      +...
      +...
      +listen ceph_rgw
      +  bind 10.0.0.103:8080 transparent
      +  bind 172.17.3.68:8080 transparent
      +  mode http
      +  balance leastconn
      +  http-request set-header X-Forwarded-Proto https if { ssl_fc }
      +  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
      +  http-request set-header X-Forwarded-Port %[dst_port]
      +  option httpchk GET /swift/healthcheck
      +  option httplog
      +  option forwardfor
      +  server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2
      +  server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2
      +  server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2
      +
      +
      +
    2. +
    3. +

      Confirm that the network is used as an HAProxy front end. The following example shows that controller-0 exposes the services by using the external network, which is absent from the Ceph nodes. You must propagate the external network through TripleO:

      +
      +
      +
      [controller-0]$ ip -o -4 a
      +
      +...
      +7: br-ex	inet 10.0.0.106/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever
      +...
      +
      +
      +
    4. +
    +
    +
  8. +
  9. +

    Propagate the HAProxy front-end network to Ceph Storage nodes.

    +
    +
      +
    1. +

      Change the NIC template that you use to define the ceph-storage network interfaces and add the new config section:

      +
      +
      +
      ---
      +network_config:
      +- type: interface
      +  name: nic1
      +  use_dhcp: false
      +  dns_servers: {{ ctlplane_dns_nameservers }}
      +  addresses:
      +  - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
      +  routes: {{ ctlplane_host_routes }}
      +- type: vlan
      +  vlan_id: {{ storage_mgmt_vlan_id }}
      +  device: nic1
      +  addresses:
      +  - ip_netmask: {{ storage_mgmt_ip }}/{{ storage_mgmt_cidr }}
      +  routes: {{ storage_mgmt_host_routes }}
      +- type: interface
      +  name: nic2
      +  use_dhcp: false
      +  defroute: false
      +- type: vlan
      +  vlan_id: {{ storage_vlan_id }}
      +  device: nic2
      +  addresses:
      +  - ip_netmask: {{ storage_ip }}/{{ storage_cidr }}
      +  routes: {{ storage_host_routes }}
      +- type: ovs_bridge
      +  name: {{ neutron_physical_bridge_name }}
      +  dns_servers: {{ ctlplane_dns_nameservers }}
      +  domain: {{ dns_search_domains }}
      +  use_dhcp: false
      +  addresses:
      +  - ip_netmask: {{ external_ip }}/{{ external_cidr }}
      +  routes: {{ external_host_routes }}
      +  members:
      +  - type: interface
      +    name: nic3
      +    primary: true
      +
      +
      +
    2. +
    3. +

      Add the External Network to the baremetal.yaml file that is used by metalsmith:

      +
      +
      +
      - name: CephStorage
      +  count: 3
      +  hostname_format: cephstorage-%index%
      +  instances:
      +  - hostname: cephstorage-0
      +  name: ceph-0
      +  - hostname: cephstorage-1
      +  name: ceph-1
      +  - hostname: cephstorage-2
      +  name: ceph-2
      +  defaults:
      +  profile: ceph-storage
      +  network_config:
      +      template: /home/stack/composable_roles/network/nic-configs/ceph-storage.j2
      +  networks:
      +  - network: ctlplane
      +      vif: true
      +  - network: storage
      +  - network: storage_mgmt
      +  - network: external
      +
      +
      +
    4. +
    5. +

      Configure the new network on the bare metal nodes:

      +
      +
      +
      (undercloud) [stack@undercloud-0]$
      +
      +openstack overcloud node provision
      +   -o overcloud-baremetal-deployed-0.yaml
      +   --stack overcloud
      +   --network-config -y
      +  $PWD/network/baremetal_deployment.yaml
      +
      +
      +
    6. +
    7. +

      Verify that the new network is configured on the Ceph Storage nodes:

      +
      +
      +
      [root@cephstorage-0 ~]# ip -o -4 a
      +
      +1: lo	inet 127.0.0.1/8 scope host lo\   	valid_lft forever preferred_lft forever
      +2: enp1s0	inet 192.168.24.54/24 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
      +11: vlan40	inet 172.17.4.43/24 brd 172.17.4.255 scope global vlan40\   	valid_lft forever preferred_lft forever
      +12: vlan30	inet 172.17.3.23/24 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
      +14: br-ex	inet 10.0.0.133/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever
      +
      +
      +
    8. +
    +
    +
  10. +
+
+
+
+

Migrating the Ceph RGW back ends

+
+

You must migrate your Ceph Object Gateway (RGW) back ends from your Controller nodes to your Ceph nodes. To ensure that you distribute the correct amount of services to your available nodes, you use cephadm labels to refer to a group of nodes where a given daemon type is deployed. For more information about the cardinality diagram, see Ceph daemon cardinality. +The following procedure assumes that you have three target nodes, cephstorage-0, cephstorage-1, cephstorage-2.

+
+
+
Procedure
+
    +
  1. +

    Add the RGW label to the Ceph nodes that you want to migrate your RGW back ends to:

    +
    +
    +
    $ ceph orch host label add cephstorage-0 rgw;
    +$ ceph orch host label add cephstorage-1 rgw;
    +$ ceph orch host label add cephstorage-2 rgw;
    +
    +Added label rgw to host cephstorage-0
    +Added label rgw to host cephstorage-1
    +Added label rgw to host cephstorage-2
    +
    +[ceph: root@controller-0 /]# ceph orch host ls
    +
    +HOST       	ADDR       	LABELS      	STATUS
    +cephstorage-0  192.168.24.54  osd rgw
    +cephstorage-1  192.168.24.44  osd rgw
    +cephstorage-2  192.168.24.30  osd rgw
    +controller-0   192.168.24.45  _admin mon mgr
    +controller-1   192.168.24.11  _admin mon mgr
    +controller-2   192.168.24.38  _admin mon mgr
    +
    +6 hosts in cluster
    +
    +
    +
  2. +
  3. +

    During the overcloud deployment, a cephadm-compatible spec is generated in +/home/ceph-admin/specs/rgw. Find and patch the RGW spec, specify the right placement by using labels, +and change the RGW back-end port to 8090 to avoid conflicts with the Ceph ingress daemon front-end port.

    +
    +
    +
    [root@controller-0 heat-admin]# cat rgw
    +
    +networks:
    +- 172.17.3.0/24
    +placement:
    +  hosts:
    +  - controller-0
    +  - controller-1
    +  - controller-2
    +service_id: rgw
    +service_name: rgw.rgw
    +service_type: rgw
    +spec:
    +  rgw_frontend_port: 8080
    +  rgw_realm: default
    +  rgw_zone: default
    +
    +
    +
    +

    This example assumes that 172.17.3.0/24 is the storage network.

    +
    +
  4. +
  5. +

    In the placement section, ensure that the label and rgw_frontend_port values are set:

    +
    +
    +
    ---
    +networks:
    +- 172.17.3.0/24(1)
    +placement:
    +  label: rgw (2)
    +service_id: rgw
    +service_name: rgw.rgw
    +service_type: rgw
    +spec:
    +  rgw_frontend_port: 8090 (3)
    +  rgw_realm: default
    +  rgw_zone: default
    +  rgw_frontend_ssl_certificate: | (4)
    +     -----BEGIN PRIVATE KEY-----
    +     ...
    +     -----END PRIVATE KEY-----
    +     -----BEGIN CERTIFICATE-----
    +     ...
    +     -----END CERTIFICATE-----
    +  ssl: true
    +
    +
    +
    (1) Add the storage network where the RGW back ends are deployed.
    (2) Replace the Controller nodes with the label: rgw label.
    (3) Change the rgw_frontend_port value to 8090 to avoid conflicts with the Ceph ingress daemon.
    (4) Optional: if TLS is enabled, add the SSL certificate and key concatenation as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.
    +
    +
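    Optionally, before you apply the spec in the next step, you can preview what the orchestrator would schedule without changing anything. This sketch relies on the --dry-run option of ceph orch apply and on the spec path that is used later in this procedure:

    $ cephadm shell -m /home/ceph-admin/specs/rgw -- ceph orch apply -i /mnt/rgw --dry-run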
  6. +
  7. +

    Apply the new RGW spec by using the orchestrator CLI:

    +
    +
    +
    $ cephadm shell -m /home/ceph-admin/specs/rgw
    +$ cephadm shell -- ceph orch apply -i /mnt/rgw
    +
    +
    +
    +

    This command triggers the redeploy, for example:

    +
    +
    +
    +
    ...
    +osd.9                     	cephstorage-2
    +rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0  172.17.3.23:8090   starting
    +rgw.rgw.cephstorage-1.qynkan  cephstorage-1  172.17.3.26:8090   starting
    +rgw.rgw.cephstorage-2.krycit  cephstorage-2  172.17.3.81:8090   starting
    +rgw.rgw.controller-1.eyvrzw   controller-1   172.17.3.146:8080  running (5h)
    +rgw.rgw.controller-2.navbxa   controller-2   172.17.3.66:8080   running (5h)
    +
    +...
    +osd.9                     	cephstorage-2
    +rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0  172.17.3.23:8090  running (19s)
    +rgw.rgw.cephstorage-1.qynkan  cephstorage-1  172.17.3.26:8090  running (16s)
    +rgw.rgw.cephstorage-2.krycit  cephstorage-2  172.17.3.81:8090  running (13s)
    +
    +
    +
  8. +
  9. +

    Ensure that the new RGW back ends are reachable on the new ports, so you can enable an ingress daemon on port 8080 later. Log in to each Ceph Storage node that includes RGW and add the iptables rule to allow connections to both 8080 and 8090 ports in the Ceph Storage nodes:

    +
    +
    +
    $ iptables -I INPUT -p tcp -m tcp --dport 8080 -m conntrack --ctstate NEW -m comment --comment "ceph rgw ingress" -j ACCEPT
    +$ iptables -I INPUT -p tcp -m tcp --dport 8090 -m conntrack --ctstate NEW -m comment --comment "ceph rgw backends" -j ACCEPT
    +
    +
    +
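    To avoid logging in to every node one by one, you can push the same rules over SSH. This is a minimal sketch; the cephstorage-* hostnames and the heat-admin user are assumptions based on the examples in this document:

    for node in cephstorage-0 cephstorage-1 cephstorage-2; do
        ssh heat-admin@"$node" 'sudo iptables -I INPUT -p tcp -m tcp --dport 8080 -m conntrack --ctstate NEW -m comment --comment "ceph rgw ingress" -j ACCEPT'
        ssh heat-admin@"$node" 'sudo iptables -I INPUT -p tcp -m tcp --dport 8090 -m conntrack --ctstate NEW -m comment --comment "ceph rgw backends" -j ACCEPT'
    done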
  10. +
  11. +

    From a Controller node, such as controller-0, try to reach the RGW back ends:

    +
    +
    +
    $ curl http://cephstorage-0.storage:8090;
    +
    +
    +
    +

    You should observe the following output:

    +
    +
    +
    +
    <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
    +
    +
    +
    +

    Repeat the verification for each node where a RGW daemon is deployed.

    +
    +
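    A short loop can run the same check against every node that hosts an RGW daemon. The hostnames and the .storage domain are assumptions taken from the examples above:

    for node in cephstorage-0 cephstorage-1 cephstorage-2; do
        echo "== ${node} =="
        curl -s "http://${node}.storage:8090" | head -c 200; echo
    done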
  12. +
  13. +

    If you migrated RGW back ends to the Ceph nodes, there is no internalAPI network, except in the case of HCI nodes. You must reconfigure the RGW keystone endpoint to point to the external network that you propagated:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph config dump | grep keystone
    +global   basic rgw_keystone_url  http://172.16.1.111:5000
    +
    +[ceph: root@controller-0 /]# ceph config set global rgw_keystone_url http://<keystone_endpoint>:5000
    +
    +
    +
    +
      +
    • +

      Replace <keystone_endpoint> with the Identity service (keystone) internal endpoint of the service that is deployed in the OpenStackControlPlane CR when you adopt the Identity service. For more information, see Adopting the Identity service.

      +
    • +
    +
    +
  14. +
+
+
+
+

Deploying a Ceph ingress daemon

+
+

To deploy the Ceph ingress daemon, you perform the following actions:

+
+
+
    +
  1. +

    Remove the existing ceph_rgw configuration.

    +
  2. +
  3. +

    Clean up the configuration created by TripleO.

    +
  4. +
  5. +

    Redeploy the Object Storage service (swift).

    +
  6. +
+
+
+

When you deploy the ingress daemon, two new containers are created:

+
+
+
    +
  • +

    HAProxy, which you use to reach the back ends.

    +
  • +
  • +

    Keepalived, which you use to own the virtual IP address.

    +
  • +
+
+
+

You use the rgw label to distribute the ingress daemon to only the number of nodes that host Ceph Object Gateway (RGW) daemons. For more information about distributing daemons among your nodes, see Ceph daemon cardinality.

+
+
+

After you complete this procedure, you can reach the RGW back end from the ingress daemon and use RGW through the Object Storage service CLI.

+
+
+
Procedure
+
    +
  1. +

    Log in to each Controller node and remove the following configuration from the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file:

    +
    +
    +
    listen ceph_rgw
    +  bind 10.0.0.103:8080 transparent
    +  mode http
    +  balance leastconn
    +  http-request set-header X-Forwarded-Proto https if { ssl_fc }
    +  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
    +  http-request set-header X-Forwarded-Port %[dst_port]
    +  option httpchk GET /swift/healthcheck
    +  option httplog
    +  option forwardfor
    +   server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2
    +  server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2
    +  server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2
    +
    +
    +
  2. +
  3. +

    Restart haproxy-bundle and confirm that it is started:

    +
    +
    +
    [root@controller-0 ~]# sudo pcs resource restart haproxy-bundle
    +haproxy-bundle successfully restarted
    +
    +
    +[root@controller-0 ~]# sudo pcs status | grep haproxy
    +
    +  * Container bundle set: haproxy-bundle [undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
    +    * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-0
    +    * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-1
    +    * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-2
    +
    +
    +
  4. +
  5. +

    Confirm that no process is connected to port 8080:

    +
    +
    +
    [root@controller-0 ~]# ss -antop | grep 8080
    +[root@controller-0 ~]#
    +
    +
    +
    +

    You can expect the Object Storage service (swift) CLI to fail to establish the connection:

    +
    +
    +
    +
    (overcloud) [root@cephstorage-0 ~]# swift list
    +
    +HTTPConnectionPool(host='10.0.0.103', port=8080): Max retries exceeded with url: /swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc41beb0430>: Failed to establish a new connection: [Errno 111] Connection refused'))
    +
    +
    +
  6. +
  7. +

    Set the required images for both HAProxy and Keepalived:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_haproxy quay.io/ceph/haproxy:2.3
    +[ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_keepalived quay.io/ceph/keepalived:2.1.5
    +
    +
    +
  8. +
  9. +

    Create a file called rgw_ingress in the /home/ceph-admin/specs/ directory in controller-0:

    +
    +
    +
    $ sudo vim /home/ceph-admin/specs/rgw_ingress
    +
    +
    +
  10. +
  11. +

    Paste the following content into the rgw_ingress file:

    +
    +
    +
    ---
    +service_type: ingress
    +service_id: rgw.rgw
    +placement:
    +  label: rgw
    +spec:
    +  backend_service: rgw.rgw
    +  virtual_ip: 10.0.0.89/24
    +  frontend_port: 8080
    +  monitor_port: 8898
    +  virtual_interface_networks:
    +    - <external_network>
    +  ssl_cert: |
    +     -----BEGIN CERTIFICATE-----
    +     ...
    +     -----END CERTIFICATE-----
    +     -----BEGIN PRIVATE KEY-----
    +     ...
    +     -----END PRIVATE KEY-----
    +
    +
    +
    + +
    +
  12. +
  13. +

    Apply the rgw_ingress spec by using the Ceph orchestrator CLI:

    +
    +
    +
    $ cephadm shell -m /home/ceph-admin/specs/rgw_ingress
    +$ cephadm shell -- ceph orch apply -i /mnt/rgw_ingress
    +
    +
    +
  14. +
  15. +

    Wait until the ingress is deployed and query the resulting endpoint:

    +
    +
    +
    [ceph: root@controller-0 /]# ceph orch ls
    +
    +NAME                 	PORTS            	RUNNING  REFRESHED  AGE  PLACEMENT
    +crash                                         	6/6  6m ago 	3d   *
    +ingress.rgw.rgw      	10.0.0.89:8080,8898  	6/6  37s ago	60s  label:rgw
    +mds.mds                   3/3  6m ago 	3d   controller-0;controller-1;controller-2
    +mgr                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
    +mon                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
    +osd.default_drive_group   15  37s ago	3d   cephstorage-0;cephstorage-1;cephstorage-2
    +rgw.rgw   ?:8090          3/3  37s ago	4m   label:rgw
    +
    +
    +
    +
    +
    [ceph: root@controller-0 /]# curl  10.0.0.89:8080
    +
    +---
    +<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>[ceph: root@controller-0 /]#
    +—
    +
    +
    +
  16. +
+
+
+
+

Updating the Object Storage service endpoints

+
+

You must update the Object Storage service (swift) endpoints to point to the new virtual IP address (VIP) that you reserved on the same network that you used to deploy RGW ingress.

+
+
+
Procedure
+
    +
  1. +

    List the current endpoints:

    +
    +
    +
    (overcloud) [stack@undercloud-0 ~]$ openstack endpoint list | grep object
    +
    +| 1326241fb6b6494282a86768311f48d1 | regionOne | swift    	| object-store   | True	| internal  | http://172.17.3.68:8080/swift/v1/AUTH_%(project_id)s |
    +| 8a34817a9d3443e2af55e108d63bb02b | regionOne | swift    	| object-store   | True	| public	| http://10.0.0.103:8080/swift/v1/AUTH_%(project_id)s  |
    +| fa72f8b8b24e448a8d4d1caaeaa7ac58 | regionOne | swift    	| object-store   | True	| admin 	| http://172.17.3.68:8080/swift/v1/AUTH_%(project_id)s |
    +
    +
    +
  2. +
  3. +

    Update the endpoints that are pointing to the Ingress VIP:

    +
    +
    +
    (overcloud) [stack@undercloud-0 ~]$ openstack endpoint set --url "http://10.0.0.89:8080/swift/v1/AUTH_%(project_id)s" 95596a2d92c74c15b83325a11a4f07a3
    +
    +(overcloud) [stack@undercloud-0 ~]$ openstack endpoint list | grep object-store
    +| 6c7244cc8928448d88ebfad864fdd5ca | regionOne | swift    	| object-store   | True	| internal  | http://172.17.3.79:8080/swift/v1/AUTH_%(project_id)s |
    +| 95596a2d92c74c15b83325a11a4f07a3 | regionOne | swift    	| object-store   | True	| public	| http://10.0.0.89:8080/swift/v1/AUTH_%(project_id)s   |
    +| e6d0599c5bf24a0fb1ddf6ecac00de2d | regionOne | swift    	| object-store   | True	| admin 	| http://172.17.3.79:8080/swift/v1/AUTH_%(project_id)s |
    +
    +
    +
    +

    Repeat this step for both internal and admin endpoints.

    +
    +
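    If you prefer to update all of the object-store endpoints in one pass, the following sketch loops over their IDs. It assumes that every interface (public, internal, and admin) should point to the same ingress VIP, as described in the previous step; adjust the URL per interface if your internal and admin endpoints use a different network:

    (overcloud) [stack@undercloud-0 ~]$ for id in $(openstack endpoint list --service object-store -f value -c ID); do
        openstack endpoint set --url "http://10.0.0.89:8080/swift/v1/AUTH_%(project_id)s" "$id"
    done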
  4. +
  5. +

    Test the migrated service:

    +
    +
    +
    (overcloud) [stack@undercloud-0 ~]$ swift list --debug
    +
    +DEBUG:swiftclient:Versionless auth_url - using http://10.0.0.115:5000/v3 as endpoint
    +DEBUG:keystoneclient.auth.identity.v3.base:Making authentication request to http://10.0.0.115:5000/v3/auth/tokens
    +DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 10.0.0.115:5000
    +DEBUG:urllib3.connectionpool:http://10.0.0.115:5000 "POST /v3/auth/tokens HTTP/1.1" 201 7795
    +DEBUG:keystoneclient.auth.identity.v3.base:{"token": {"methods": ["password"], "user": {"domain": {"id": "default", "name": "Default"}, "id": "6f87c7ffdddf463bbc633980cfd02bb3", "name": "admin", "password_expires_at": null},
    +
    +
    +...
    +...
    +...
    +
    +DEBUG:swiftclient:REQ: curl -i http://10.0.0.89:8080/swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json -X GET -H "X-Auth-Token: gAAAAABj7KHdjZ95syP4c8v5a2zfXckPwxFQZYg0pgWR42JnUs83CcKhYGY6PFNF5Cg5g2WuiYwMIXHm8xftyWf08zwTycJLLMeEwoxLkcByXPZr7kT92ApT-36wTfpi-zbYXd1tI5R00xtAzDjO3RH1kmeLXDgIQEVp0jMRAxoVH4zb-DVHUos" -H "Accept-Encoding: gzip"
    +DEBUG:swiftclient:RESP STATUS: 200 OK
    +DEBUG:swiftclient:RESP HEADERS: {'content-length': '2', 'x-timestamp': '1676452317.72866', 'x-account-container-count': '0', 'x-account-object-count': '0', 'x-account-bytes-used': '0', 'x-account-bytes-used-actual': '0', 'x-account-storage-policy-default-placement-container-count': '0', 'x-account-storage-policy-default-placement-object-count': '0', 'x-account-storage-policy-default-placement-bytes-used': '0', 'x-account-storage-policy-default-placement-bytes-used-actual': '0', 'x-trans-id': 'tx00000765c4b04f1130018-0063eca1dd-1dcba-default', 'x-openstack-request-id': 'tx00000765c4b04f1130018-0063eca1dd-1dcba-default', 'accept-ranges': 'bytes', 'content-type': 'application/json; charset=utf-8', 'date': 'Wed, 15 Feb 2023 09:11:57 GMT'}
    +DEBUG:swiftclient:RESP BODY: b'[]'
    +
    +
    +
  6. +
  7. +

    Run tempest tests against Object Storage service:

    +
    +
    +
    (overcloud) [stack@undercloud-0 tempest-dir]$ tempest run --regex tempest.api.object_storage
    +...
    +...
    +...
    +======
    +Totals
    +======
    +Ran: 141 tests in 606.5579 sec.
    + - Passed: 128
    + - Skipped: 13
    + - Expected Fail: 0
    + - Unexpected Success: 0
    + - Failed: 0
    +Sum of execute time for each test: 657.5183 sec.
    +
    +==============
    +Worker Balance
    +==============
    + - Worker 0 (1 tests) => 0:10:03.400561
    + - Worker 1 (2 tests) => 0:00:24.531916
    + - Worker 2 (4 tests) => 0:00:10.249889
    + - Worker 3 (30 tests) => 0:00:32.730095
    + - Worker 4 (51 tests) => 0:00:26.246044
    + - Worker 5 (6 tests) => 0:00:20.114803
    + - Worker 6 (20 tests) => 0:00:16.290323
    + - Worker 7 (27 tests) => 0:00:17.103827
    +
    +
    +
  8. +
+
+
+
+
+

Migrating Red Hat Ceph Storage RBD to external RHEL nodes

+
+

For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are +running Ceph version 6 or later, you must migrate the daemons that are +included in the OpenStack control plane into the existing external Red +Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include +the Compute nodes for an HCI environment or dedicated storage nodes.

+
+
+

To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must +meet the following requirements:

+
+
+
    +
  • +

    Ceph is running version 6 or later and is managed by cephadm.

    +
  • +
  • +

    NFS Ganesha is migrated from a TripleO deployment to cephadm. For more information, see Creating a NFS Ganesha +cluster.

    +
  • +
  • +

    Both the Ceph public and cluster networks are propagated with +TripleO to the target nodes.

    +
  • +
  • +

    The Ceph Metadata Server, monitoring stack, Ceph Object Gateway, and any other daemon that is deployed on Controller nodes are migrated to the target nodes.

    +
  • +
  • +

    The Ceph cluster is healthy, and the ceph -s command returns HEALTH_OK.

    +
  • +
+
+
+

Migrating Ceph Manager daemons to Ceph nodes

+
+

You must migrate your Ceph Manager daemons from the OpenStack (OSP) Controller nodes to a set of target nodes. Target nodes are either existing Ceph nodes, or OSP Compute nodes if Ceph is deployed by TripleO with a Hyperconverged Infrastructure (HCI) topology.

+
+
Note: The following procedure uses cephadm and the Ceph Orchestrator to drive the Ceph Manager migration, and the Ceph spec to modify the placement and reschedule the Ceph Manager daemons. Ceph Manager is run in an active/passive state. It also provides many modules, including the Ceph Orchestrator. Every potential module, such as the Ceph Dashboard, that is provided by ceph-mgr is implicitly migrated with Ceph Manager.
+
+
+
Prerequisites
+
    +
  • +

    The target nodes, CephStorage or ComputeHCI, are configured to have both storage and storage_mgmt networks. This ensures that you can use both Ceph public and cluster networks from the same node.

    +
    Note: This step requires you to interact with TripleO. From OSP Wallaby onward, you do not have to run a stack update.
    +
    +
  • +
+
+
+
Procedure
+
    +
  1. +

    SSH into the target node and enable the firewall rules that are required to reach a Ceph Manager service:

    +
    +
    +
    dports="6800:7300"
    +ssh heat-admin@<target_node> sudo iptables -I INPUT \
    +    -p tcp --match multiport --dports $dports -j ACCEPT;
    +
    +
    +
    +
      +
    • +

      Replace <target_node> with the hostname of one of the hosts that are listed in the Ceph environment. Run ceph orch host ls to see the list of the hosts.

      +
      +

      Repeat this step for each target node.

      +
      +
    • +
    +
    +
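    A loop over the target hostnames keeps this step repeatable. This is a sketch; the oc0-ceph-* hostnames are assumptions that match the examples later in this section:

    dports="6800:7300"
    for node in oc0-ceph-0 oc0-ceph-1 oc0-ceph-2; do
        ssh heat-admin@"$node" sudo iptables -I INPUT \
            -p tcp --match multiport --dports "$dports" -j ACCEPT
    done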
  2. +
  3. +

    Check that the rules are properly applied to the target node and persist them:

    +
    +
    +
    $ sudo iptables-save
    +$ sudo systemctl restart iptables
    +
    +
    +
  4. +
  5. +

    Prepare the target node to host the new Ceph Manager daemon, and add the mgr +label to the target node:

    +
    +
    +
    $ ceph orch host label add <target_node> mgr
    +
    +
    +
  6. +
  7. +

    Repeat steps 1-3 for each target node that hosts a Ceph Manager daemon.

    +
  8. +
  9. +

    Get the Ceph Manager spec:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch ls --export mgr > mgr.yaml
    +
    +
    +
  10. +
  11. +

    Edit the retrieved spec and add the label: mgr section to the placement +section:

    +
    +
    +
    service_type: mgr
    +service_id: mgr
    +placement:
    +  label: mgr
    +
    +
    +
  12. +
  13. +

    Save the spec in the /tmp/mgr.yaml file.

    +
  14. +
  15. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mgr.yaml -- ceph orch apply -i /mnt/mgr.yaml
    +
    +
    +
  16. +
+
+
+
Verification
+
    +
  1. +

    Verify that the new Ceph Manager daemons are created in the target nodes:

    +
    +
    +
    $ ceph orch ps | grep -i mgr
    +$ ceph -s
    +
    +
    +
    +

    The Ceph Manager daemon count should match the number of hosts where the mgr label is added.

    +
    +
    Note: The migration does not shrink the Ceph Manager daemons. The count grows by the number of target nodes, and migrating Ceph Monitor daemons to Ceph nodes decommissions the stand-by Ceph Manager instances. For more information, see Migrating Ceph Monitor daemons to Ceph nodes.
    +
    +
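    To compare the two counts directly, you can query the orchestrator in JSON format. This is a minimal sketch that assumes jq is available on the host:

    $ sudo cephadm shell -- ceph orch ps --daemon-type mgr --format json | jq length
    $ sudo cephadm shell -- ceph orch host ls --format json | jq '[.[] | select(.labels | index("mgr"))] | length'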
  2. +
+
+
+
+

Migrating Ceph Monitor daemons to Ceph nodes

+
+

You must move Ceph Monitor daemons from the OpenStack (OSP) Controller nodes to a set of target nodes. Target nodes are either existing Ceph nodes, or OSP Compute nodes if Ceph is +deployed by TripleO with a Hyperconverged Infrastructure (HCI) topology. Additional Ceph Monitors are deployed to the target nodes, and they are promoted as _admin nodes that you can use to manage the Ceph Storage cluster and perform day 2 operations.

+
+
+

To migrate the Ceph Monitor daemons, you must perform the following high-level steps:

+
+ +
+

Repeat these steps for any additional Controller node that hosts a Ceph Monitor until you migrate all the Ceph Monitor daemons to the target nodes.

+
+
+
Configuring target nodes for Ceph Monitor migration
+
+

Prepare the target Ceph nodes for the Ceph Monitor migration by performing the following actions:

+
+
+
    +
  1. +

    Enable firewall rules in a target node and persist them.

    +
  2. +
  3. +

    Create a spec that is based on labels and apply it by using cephadm.

    +
  4. +
  5. +

    Ensure that the Ceph Monitor quorum is maintained during the migration process.

    +
  6. +
+
+
+
Procedure
+
    +
  1. +

    SSH into the target node and enable the firewall rules that are required to +reach a Ceph Monitor service:

    +
    +
    +
    $ for port in 3300 6789; {
    +    ssh heat-admin@<target_node> sudo iptables -I INPUT \
    +    -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
    +    -j ACCEPT;
    +}
    +
    +
    +
    +
      +
    • +

      Replace <target_node> with the hostname of the node that hosts the new Ceph Monitor.

      +
    • +
    +
    +
  2. +
  3. +

    Check that the rules are properly applied to the target node and persist them:

    +
    +
    +
    $ sudo iptables-save
    +$ sudo systemctl restart iptables
    +
    +
    +
  4. +
  5. +

    To migrate the existing Ceph Monitors to the target Ceph nodes, create the following Ceph spec from the first Ceph Monitor, or the first Controller node, and add the label:mon section to the placement section:

    +
    +
    +
    service_type: mon
    +service_id: mon
    +placement:
    +  label: mon
    +
    +
    +
  6. +
  7. +

    Save the spec in the /tmp/mon.yaml file.

    +
  8. +
  9. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mon.yaml
    +$ ceph orch apply -i /mnt/mon.yaml
    +
    +
    +
  10. +
  11. +

    Apply the mon label to the remaining Ceph target nodes to ensure that +quorum is maintained during the migration process:

    +
    +
    +
    declare -A target_nodes
    +
    +target_nodes[mon]="oc0-ceph-0 oc0-ceph-1 oc0-ceph-2"
    +
    +mon_nodes="${target_nodes[mon]}"
    +IFS=' ' read -r -a mons <<< "$mon_nodes"
    +
    +for node in "${mons[@]}"; do
    +    ceph orch host add label $node mon
    +    ceph orch host add label $node _admin
    +done
    +
    +
    +
    Note: Applying the mon.yaml spec allows the existing strategy to use labels instead of hosts. As a result, any node with the mon label can host a Ceph Monitor daemon. Perform this step only once to avoid multiple iterations when multiple Ceph Monitors are migrated.
    +
    +
  12. +
  13. +

    Check the status of the Ceph Storage and the Ceph Orchestrator daemons list. +Ensure that Ceph Monitors are in a quorum and listed by the ceph orch command:

    +
    +
    +
    # ceph -s
    +  cluster:
    +    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    +    health: HEALTH_OK
    +
    +  services:
    +    mon: 6 daemons, quorum oc0-controller-0,oc0-controller-1,oc0-controller-2,oc0-ceph-0,oc0-ceph-1,oc0-ceph-2 (age 19m)
    +    mgr: oc0-controller-0.xzgtvo(active, since 32m), standbys: oc0-controller-1.mtxohd, oc0-controller-2.ahrgsk
    +    osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs
    +
    +  data:
    +    pools:   1 pools, 1 pgs
    +    objects: 0 objects, 0 B
    +    usage:   43 MiB used, 400 GiB / 400 GiB avail
    +    pgs:     1 active+clean
    +
    +
    +
    +
    +
    [ceph: root@oc0-controller-0 /]# ceph orch host ls
    +HOST              ADDR           LABELS          STATUS
    +oc0-ceph-0        192.168.24.14  osd mon _admin
    +oc0-ceph-1        192.168.24.7   osd mon _admin
    +oc0-ceph-2        192.168.24.8   osd mon _admin
    +oc0-controller-0  192.168.24.15  _admin mgr mon
    +oc0-controller-1  192.168.24.23  _admin mgr mon
    +oc0-controller-2  192.168.24.13  _admin mgr mon
    +
    +
    +
  14. +
+
+
+
Next steps
+

Proceed to the next step Draining the source node.

+
+
+
+
Draining the source node
+
+

Drain the existing Controller nodes and remove the source node host from the Ceph Storage cluster.

+
+
+
Procedure
+
    +
  1. +

    On the source node, back up the /etc/ceph/ directory to run cephadm and get a shell for the Ceph cluster from the source node:

    +
    +
    +
    $ mkdir -p $HOME/ceph_client_backup
    +$ sudo cp -R /etc/ceph $HOME/ceph_client_backup
    +
    +
    +
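    If you need to run commands against the cluster by using the backed-up client files, cephadm shell accepts explicit configuration and keyring paths. This is a sketch that assumes the admin keyring was copied along with ceph.conf in the previous step:

    $ sudo cephadm shell --config $HOME/ceph_client_backup/ceph/ceph.conf \
        --keyring $HOME/ceph_client_backup/ceph/ceph.client.admin.keyring -- ceph -s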
  2. +
  3. +

    Identify the active ceph-mgr instance:

    +
    +
    +
    $ cephadm shell -- ceph mgr stat
    +
    +
    +
  4. +
  5. +

    Fail the ceph-mgr if it is active on the source node or target node:

    +
    +
    +
    $ cephadm shell -- ceph mgr fail <mgr_instance>
    +
    +
    +
    +
      +
    • +

      Replace <mgr_instance> with the Ceph Manager daemon to fail.

      +
    • +
    +
    +
  6. +
  7. +

    From the cephadm shell, remove the labels on the source node:

    +
    +
    +
    $ for label in mon mgr _admin; do
    +    cephadm shell -- ceph orch host label rm <source_node> $label;
    +done
    +
    +
    +
    +
      +
    • +

      Replace <source_node> with the hostname of the source node.

      +
    • +
    +
    +
  8. +
  9. +

    Remove the running Ceph Monitor daemon from the source node:

    +
    +
    +
    $ cephadm shell -- ceph orch daemon rm mon.<source_node> --force
    +
    +
    +
  10. +
  11. +

    Drain the source node:

    +
    +
    +
    $ cephadm shell -- ceph orch host drain <source_node>
    +
    +
    +
  12. +
  13. +

    Remove the source node host from the Ceph Storage cluster:

    +
    +
    +
    $ cephadm shell -- ceph orch host rm <source_node> --force
    +
    +
    +
    + + + + + +
    + + +
    +

    The source node is not part of the cluster anymore, and should not appear in +the Ceph host list when cephadm shell -- ceph orch host ls is run. +However, if you run sudo podman ps in the source node, the list might show that both Ceph Monitors and Ceph Managers are still up and running.

    +
    +
    +
    +
    [root@oc0-controller-1 ~]# sudo podman ps
    +CONTAINER ID  IMAGE                                                                                        COMMAND               CREATED         STATUS             PORTS       NAMES
    +5c1ad36472bc  quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mon.oc0-contro...  35 minutes ago  Up 35 minutes ago              ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1
    +3b14cc7bf4dd  quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mgr.oc0-contro...  35 minutes ago  Up 35 minutes ago              ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd
    +
    +
    +
    +
    +
  14. +
  15. +

    Confirm that mons are still in quorum:

    +
    +
    +
    $ cephadm shell -- ceph -s
    +$ cephadm shell -- ceph orch ps | grep -i mon
    +
    +
    +
  16. +
+
+
+
Next steps
+

Proceed to the next step Migrating the Ceph Monitor IP address.

+
+
+
+
Migrating the Ceph Monitor IP address
+
+

You must migrate your Ceph Monitor IP addresses to the target Ceph nodes. The +IP address migration assumes that the target nodes are originally deployed by +TripleO and that the network configuration is managed by +os-net-config.

+
+
+
Procedure
+
    +
  1. +

    Get the original Ceph Monitor IP address from the existing /etc/ceph/ceph.conf file on the mon_host line, for example:

    +
    +
    +
    mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
    +
    +
    +
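    For example, you can pull the line directly from the configuration file, or from the backup that you created while draining the source node:

    $ grep -E '^mon_host' /etc/ceph/ceph.conf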
  2. +
  3. +

    Confirm that the Ceph Monitor IP address is present in the os-net-config configuration that is located in the /etc/os-net-config directory on the source node:

    +
    +
    +
    [tripleo-admin@controller-0 ~]$ grep "172.17.3.60" /etc/os-net-config/config.yaml
    +    - ip_netmask: 172.17.3.60/24
    +
    +
    +
  4. +
  5. +

    Edit the /etc/os-net-config/config.yaml file and remove the ip_netmask line.

    +
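    For example, a one-line sed edit can drop the address. This is a sketch that uses the example address from this procedure; back up the file first and substitute your own Ceph Monitor IP address:

    $ sudo cp /etc/os-net-config/config.yaml /etc/os-net-config/config.yaml.bak
    $ sudo sed -i '\|ip_netmask: 172.17.3.60/24|d' /etc/os-net-config/config.yaml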
  6. +
  7. +

    Save the file and refresh the node network configuration:

    +
    +
    +
    $ sudo os-net-config -c /etc/os-net-config/config.yaml
    +
    +
    +
  8. +
  9. +

    Verify that the IP address is not present in the source node anymore, for example:

    +
    +
    +
    [controller-0]$ ip -o a | grep 172.17.3.60
    +
    +
    +
  10. +
  11. +

    SSH into the target node, for example cephstorage-0, and add the IP address +for the new Ceph Monitor.

    +
  12. +
  13. +

    On the target node, edit /etc/os-net-config/config.yaml and +add the - ip_netmask: 172.17.3.60 line that you removed in the source node.

    +
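    For reference, after this edit the storage VLAN section on the target node contains the migrated address next to its existing one. The interface name, VLAN ID, and existing address in the following fragment are illustrative assumptions based on the earlier examples in this document:

    - type: vlan
      vlan_id: 30
      device: nic2
      addresses:
      - ip_netmask: 172.17.3.23/24
      - ip_netmask: 172.17.3.60/24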
  14. +
  15. +

    Save the file and refresh the node network configuration:

    +
    +
    +
    $ sudo os-net-config -c /etc/os-net-config/config.yaml
    +
    +
    +
  16. +
  17. +

    Verify that the IP address is present in the target node.

    +
    +
    +
    $ ip -o a | grep 172.17.3.60
    +
    +
    +
  18. +
  19. +

    From the source node, ping the ip address that is migrated to the target node +and confirm that it is still reachable:

    +
    +
    +
    [controller-0]$ ping -c 3 172.17.3.60
    +
    +
    +
  20. +
+
+
+
Next steps
+

Proceed to the next step Redeploying the Ceph Monitor on the target node.

+
+
+
+
Redeploying a Ceph Monitor on the target node
+
+

You use the IP address that you migrated to the target node to redeploy the +Ceph Monitor on the target node.

+
+
+
Procedure
+
    +
  1. +

    Get the Ceph mon spec:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch ls --export mon > mon.yaml
    +
    +
    +
  2. +
  3. +

    Edit the retrieved spec and add the unmanaged: true keyword:

    +
    +
    +
    service_type: mon
    +service_id: mon
    +placement:
    +  label: mon
    +unmanaged: true
    +
    +
    +
  4. +
  5. +

    Save the spec in the /tmp/mon.yaml file.

    +
  6. +
  7. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mon.yaml
    +$ ceph orch apply -i /mnt/mon.yaml
    +
    +
    +
    +

    The Ceph Monitor daemons are marked as unmanaged, and you can now redeploy the existing daemon and bind it to the migrated IP address.

    +
    +
  8. +
  9. +

    Delete the existing Ceph Monitor on the target node:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch daemon rm mon.<target_node> --force
    +
    +
    +
    +
      +
    • +

      Replace <target_node> with the hostname of the target node that is included in the Ceph cluster.

      +
    • +
    +
    +
  10. +
  11. +

    Redeploy the new Ceph Monitor on the target node by using the migrated IP address:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch daemon add mon <target_node>:<ip_address>
    +
    +
    +
    +
      +
    • +

      Replace <ip_address> with the IP address that you migrated to the target node.

      +
    • +
    +
    +
  12. +
  13. +

    Get the Ceph Monitor spec:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch ls --export mon > mon.yaml
    +
    +
    +
  14. +
  15. +

    Edit the retrieved spec and set the unmanaged keyword to false:

    +
    +
    +
    service_type: mon
    +service_id: mon
    +placement:
    +  label: mon
    +unmanaged: false
    +
    +
    +
  16. +
  17. +

    Save the spec in the /tmp/mon.yaml file.

    +
  18. +
  19. +

    Apply the spec with cephadm by using the Ceph Orchestrator:

    +
    +
    +
    $ sudo cephadm shell -m /tmp/mon.yaml
    +$ ceph orch apply -i /mnt/mon.yaml
    +
    +
    +
    +

    The new Ceph Monitor runs on the target node with the original IP address.

    +
    +
  20. +
  21. +

    Identify the running mgr:

    +
    +
    +
    $ sudo cephadm shell -- ceph mgr stat
    +
    +
    +
  22. +
  23. +

    Refresh the Ceph Manager information by force-failing it:

    +
    +
    +
    $ sudo cephadm shell -- ceph mgr fail
    +
    +
    +
  24. +
  25. +

    Refresh the OSD information:

    +
    +
    +
    $ sudo cephadm shell -- ceph orch reconfig osd.default_drive_group
    +
    +
    +
  26. +
+
+
+
Next steps
+

Proceed to the next step Verifying the Ceph Storage cluster after Ceph Monitor migration.

+
+
+
+
Verifying the Ceph Storage cluster after Ceph Monitor migration
+
+

After you finish migrating your Ceph Monitor daemons to the target nodes, verify that the Ceph Storage cluster is healthy.

+
+
+
Procedure
+
    +
  • +

    Verify that the Ceph Storage cluster is healthy:

    +
    +
    +
    [ceph: root@oc0-controller-0 specs]# ceph -s
    +  cluster:
    +    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    +    health: HEALTH_OK
    +...
    +...
    +
    +
    +
  • +
+
+
+
+
+
+
+
+ + + + + + + \ No newline at end of file