diff --git a/content/en/contribute/medic/product-development-process/data-migration-k8s.md b/content/en/contribute/medic/product-development-process/data-migration-3x-eks-to-4x-eks.md
similarity index 97%
rename from content/en/contribute/medic/product-development-process/data-migration-k8s.md
rename to content/en/contribute/medic/product-development-process/data-migration-3x-eks-to-4x-eks.md
index 143a3326a..93f79d82a 100644
--- a/content/en/contribute/medic/product-development-process/data-migration-k8s.md
+++ b/content/en/contribute/medic/product-development-process/data-migration-3x-eks-to-4x-eks.md
@@ -1,9 +1,9 @@
 ---
-title: "Migration from CHT 3.x to CHT 4.x in Kubernetes"
-linkTitle: "K8s Data Migration to 4.x"
+title: "Migration from CHT 3.x to CHT 4.x in EKS - Kubernetes"
+linkTitle: "Migration: 3.x EKS to 4.x EKS"
 weight: 1
 description: >
-  Guide to migrate existing data from CHT 3.x to CHT 4.x in Kubernetes environments
+  Guide to migrate existing data from CHT 3.x on EKS to CHT 4.x on EKS (Kubernetes environments)
 relatedContent: >
 ---
@@ -150,10 +150,10 @@ Create a `values.yaml` file using the volume ID from the previous step:
 For single node deployment, create a YAML file with these contents, being sure to update:
 
 * `<your-namespace>` (_two occurrences_)
-* `<cht-version>` - 4.x version you're upgrading too
+* `<cht-version>` - 4.x version you're upgrading to
 * `<couchdb-secret>` - retrieved from `get-env` call above
 * `<couchdb-uuid>` - retrieved from `get-env` call above
-* `<couchdb-user>` - needs to be the same as used in 3.x - likely `medic`
+* `<couchdb-user>` - needs to be the same as used in 3.x - likely `medic`
 * `<couchdb-password>` - retrieved from `get-env` call above
 * `<size-of-ebs-volume>` - Size of original 3.x EBS volume, e.g. `100Mi` for 100 Megabytes or `100Gi` for 100 Gigabytes (_two occurrences_)
 * `<environment>` - For production use `prod-couchdb-only`, for dev use `dev-couchdb-only`
diff --git a/content/en/hosting/4.x/app-developer.md b/content/en/hosting/4.x/app-developer.md
index aebbb577d..6b1f90d37 100644
--- a/content/en/hosting/4.x/app-developer.md
+++ b/content/en/hosting/4.x/app-developer.md
@@ -1,7 +1,7 @@
 ---
 title: "App Developer Hosting in CHT 4.x"
 linkTitle: "App Developer Hosting"
-weight: 30
+weight: 10
 aliases:
   - /apps/guides/hosting/4.x/app-developer
   - /apps/guides/hosting/app-developer
diff --git a/content/en/hosting/4.x/migration/_index.md b/content/en/hosting/4.x/migration/_index.md
new file mode 100644
index 000000000..5e020f59b
--- /dev/null
+++ b/content/en/hosting/4.x/migration/_index.md
@@ -0,0 +1,6 @@
+---
+title: Migration Guides
+weight: 20
+description: >
+  Guides for migrating existing CHT deployments between hosting architectures
+---
diff --git a/content/en/hosting/4.x/migration/_partial_migration_3x_docker_to_4x_k3s.md b/content/en/hosting/4.x/migration/_partial_migration_3x_docker_to_4x_k3s.md
new file mode 100644
index 000000000..37241c2f3
--- /dev/null
+++ b/content/en/hosting/4.x/migration/_partial_migration_3x_docker_to_4x_k3s.md
@@ -0,0 +1,64 @@
+---
+toc_hide: true
+hide_summary: true
+---
+
+The hosting architecture differs entirely between CHT-Core 3.x and CHT-Core 4.x. Migrating from Docker Compose to K3s requires specific steps using the [couchdb-migration](https://github.com/medic/couchdb-migration) tool, which interfaces with CouchDB to update shard maps and database metadata.
+
+{{% alert title="Note" %}}
+If you get the error `Cannot convert undefined or null to object` after upgrading, please see [issue #8040](https://github.com/medic/cht-core/issues/8040) for a workaround. This only affects CHT 4.0.0, 4.0.1, 4.1.0 and 4.1.1. It was fixed in CHT 4.2.0.
+{{% /alert %}}
+
+## Install Migration Tool
+
+```shell
+mkdir -p ~/couchdb-migration/
+cd ~/couchdb-migration/
+curl -s -o ./docker-compose.yml https://raw.githubusercontent.com/medic/couchdb-migration/main/docker-compose.yml
+docker compose up
+```
+
+## Set Up Environment Variables
+
+Be sure to replace both `<user>` and `<password>` with your actual username and password. Also update `<host>` to match the CouchDB host from the Docker Compose setup:
+
+```shell
+export COUCH_URL=http://<user>:<password>@<host>:5984
+```
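+
+To confirm the URL and credentials before migrating, you can optionally hit CouchDB's built-in `_up` endpoint:
+
+```shell
+# A healthy CouchDB responds with {"status":"ok"}
+curl -s "$COUCH_URL/_up"
+```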
+
+## Run Pre-Migration Commands
+
+```shell
+cd ~/couchdb-migration/
+docker compose run couch-migration pre-index-views
+```
+
+{{% alert title="Note" %}}
+If pre-indexing is omitted, the 4.x API will fail to respond to requests until all views are indexed. For large databases, this could take many hours or days.
+{{% /alert %}}
+
+## Save CouchDB Configuration
+
+```shell
+cd ~/couchdb-migration/
+docker compose run couch-migration get-env
+```
+
+Save the output, which contains:
+- CouchDB secret (used for encrypting passwords and session tokens)
+- CouchDB server UUID (used for replication checkpointing)
+- CouchDB admin credentials
+
+The next part of the guide assumes your K3s cluster is already prepared. If it isn't, run the [K3s quick-start commands](https://docs.k3s.io/quick-start) first.
+
+We are also going to use the `cht-deploy` script from the [cht-core](https://github.com/medic/cht-core) repo. If you don't already have it, clone the repo.
+
+## Prepare Node Storage
+
+```shell
+# Create directory on the node
+sudo mkdir -p /srv/couchdb1/data
+
+# Copy data from the Docker Compose installation to the k3s node
+sudo rsync -avz --progress --partial --partial-dir=/tmp/rsync-partial \
+  /srv/storage/medic-core/couchdb/data/ \
+  <user>@<k3s-node>:/srv/couchdb1/data/
+```
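+
+After the copy completes, it is worth spot-checking that the source and destination hold the same amount of data. This optional check simply compares total directory sizes with `du`:
+
+```shell
+# The two totals should match closely
+sudo du -sh /srv/storage/medic-core/couchdb/data/
+ssh <user>@<k3s-node> "sudo du -sh /srv/couchdb1/data/"
+```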
diff --git a/content/en/hosting/4.x/migration/_partial_values_explanation.md b/content/en/hosting/4.x/migration/_partial_values_explanation.md
new file mode 100644
index 000000000..12f47ff96
--- /dev/null
+++ b/content/en/hosting/4.x/migration/_partial_values_explanation.md
@@ -0,0 +1,23 @@
+---
+toc_hide: true
+hide_summary: true
+---
+
+Be sure to update the following values in your YAML file:
+
+* `<your-namespace>` (_two occurrences_)
+* `<cht-version>` - 4.x version you're upgrading to
+* `<couchdb-secret>` - retrieved from `get-env` call above
+* `<couchdb-uuid>` - retrieved from `get-env` call above
+* `<couchdb-user>` - needs to be the same as used in 3.x - likely `medic`
+* `<couchdb-password>` - retrieved from `get-env` call above
+* `<host>` - the URL of your production instance (e.g. `example.org`)
+* `<path-to-tls-files>` - path to your TLS files on disk
+
+Storage configuration notes:
+
+The storage-related values don't need to be changed, but here's an explanation:
+
+* `preExistingDataAvailable: "true"` - If this is false, the CHT gets launched with empty data.
+* `dataPathOnDiskForCouchDB: "data"` - Leave as `data` because that's the directory we created above when moving the existing data.
+* `partition: "0"` - Leave as `0` to use the whole disk. If you moved the data to a separate partition on a partitioned hard disk, put that partition number here.
diff --git a/content/en/hosting/4.x/migration/data-migration-3x-docker-to-4x-k3s-multi.md b/content/en/hosting/4.x/migration/data-migration-3x-docker-to-4x-k3s-multi.md
new file mode 100644
index 000000000..1b07e084b
--- /dev/null
+++ b/content/en/hosting/4.x/migration/data-migration-3x-docker-to-4x-k3s-multi.md
@@ -0,0 +1,169 @@
+---
+title: "Migration from Docker Compose CHT 3.x to 3-Node Clustered CHT 4.x on K3s"
+linkTitle: "To K3s Multi-node"
+weight: 10
+description: >
+  Guide to migrate existing data from a CHT 3.x Docker Compose deployment to a CHT 4.x clustered K3s deployment with 3 CouchDB nodes
+---
+
+{{< read-content file="hosting/4.x/migration/_partial_migration_3x_docker_to_4x_k3s.md" >}}
+
+## Create Directories on Secondary Nodes
+
+```shell
+ssh <user>@<node2-hostname> "sudo mkdir -p /srv/couchdb2/data/shards /srv/couchdb2/data/.shards"
+ssh <user>@<node3-hostname> "sudo mkdir -p /srv/couchdb3/data/shards /srv/couchdb3/data/.shards"
+```
+
+## Create values.yaml for K3s Deployment
+
+{{< read-content file="hosting/4.x/migration/_partial_values_explanation.md" >}}
+
+```yaml
+project_name: "<your-namespace>"
+namespace: "<your-namespace>"
+chtversion: <cht-version>
+
+upstream_servers:
+  docker_registry: "public.ecr.aws/medic"
+  builds_url: "https://staging.dev.medicmobile.org/_couch/builds_4"
+upgrade_service:
+  tag: 0.32
+
+couchdb:
+  password: "<couchdb-password>"
+  secret: "<couchdb-secret>"
+  user: "<couchdb-user>"
+  uuid: "<couchdb-uuid>"
+  clusteredCouch_enabled: true
+  couchdb_node_storage_size: 100Gi
+
+clusteredCouch:
+  noOfCouchDBNodes: 3
+
+ingress:
+  host: "<host>"
+
+environment: "remote"
+cluster_type: "k3s-k3d"
+cert_source: "specify-file-path"
+certificate_crt_file_path: "<path-to-tls-files>/fullchain.crt"
+certificate_key_file_path: "<path-to-tls-files>/privkey.key"
+
+nodes:
+  node-1: "couch01"
+  node-2: "couch02"
+  node-3: "couch03"
+
+couchdb_data:
+  preExistingDataAvailable: "true"
+  dataPathOnDiskForCouchDB: "data"
+  partition: "0"
+
+local_storage:
+  preExistingDiskPath-1: "/srv/couchdb1"
+  preExistingDiskPath-2: "/srv/couchdb2"
+  preExistingDiskPath-3: "/srv/couchdb3"
+```
+
+## Deploy to K3s
+
+We are going to use the `cht-deploy` script from the [cht-core](https://github.com/medic/cht-core) repo:
+
+```shell
+cd cht-core/scripts/deploy
+./cht-deploy -f /path/to/your/values.yaml
+```
+
+## Get Shard Distribution Instructions
+
+Access the primary CouchDB pod, being sure to replace `<your-namespace>` with the name of your actual namespace:
+
+```shell
+kubectl exec -it -n <your-namespace> $(kubectl get pod -n <your-namespace> -l cht.service=couchdb-1 -o name) -- bash
+```
+
+Set up the migration tool:
+
+```shell
+curl -fsSL https://deb.nodesource.com/setup_16.x | bash -
+apt install -y nodejs npm git
+git clone https://github.com/medic/couchdb-migration.git
+cd couchdb-migration
+npm ci --omit=dev
+
+# Create a global symlink to enable running commands directly
+# Note: This may require sudo if npm's global directories aren't writable
+npm link
+
+export ADMIN_USER=<couchdb-user>
+export ADMIN_PASSWORD=<couchdb-password>
+export COUCH_URL="http://${ADMIN_USER}:${ADMIN_PASSWORD}@localhost:5984"
+
+# Get shard distribution instructions
+shard_matrix=$(generate-shard-distribution-matrix)
+shard-move-instructions "$shard_matrix"
+```
+
+Example output:
+
+```
+Move <source-data-path>/shards/00000000-1fffffff to <destination-data-path>/shards/00000000-1fffffff
+Move <source-data-path>/.shards/00000000-1fffffff to <destination-data-path>/.shards/00000000-1fffffff
+Move <source-data-path>/shards/20000000-3fffffff to <destination-data-path>/shards/20000000-3fffffff
+...
+```
+
+{{% alert title="Note" %}}
+The actual shard ranges in your output may differ. Adjust the following rsync commands to match your specific shard distribution instructions.
+{{% /alert %}}
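+
+Before moving any shard files, you can optionally confirm that all three CouchDB nodes have joined the cluster, using CouchDB's standard `_membership` endpoint (the exact node names depend on your deployment):
+
+```shell
+# All three couchdb nodes should be listed under "all_nodes" and "cluster_nodes"
+curl -s "$COUCH_URL/_membership"
+```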
+
+## Distribute Shards
+
+Move shards to Node 2:
+
+```shell
+# Copy main shards first
+sudo rsync -avz --progress --partial --partial-dir=/tmp/rsync-partial \
+  /srv/couchdb1/data/shards/20000000-3fffffff \
+  /srv/couchdb1/data/shards/80000000-9fffffff \
+  /srv/couchdb1/data/shards/e0000000-ffffffff \
+  user@node2-hostname:/srv/couchdb2/data/shards/
+
+# Then copy hidden shards
+sudo rsync -avz --progress --partial --partial-dir=/tmp/rsync-partial \
+  /srv/couchdb1/data/.shards/20000000-3fffffff \
+  /srv/couchdb1/data/.shards/80000000-9fffffff \
+  /srv/couchdb1/data/.shards/e0000000-ffffffff \
+  user@node2-hostname:/srv/couchdb2/data/.shards/
+
+# Touch the .shards to ensure they're newer
+ssh user@node2-hostname "sudo find /srv/couchdb2/data/.shards -type f -exec touch {} +"
+```
+
+Move shards to Node 3:
+
+```shell
+# Copy main shards first
+sudo rsync -avz --progress --partial --partial-dir=/tmp/rsync-partial \
+  /srv/couchdb1/data/shards/40000000-5fffffff \
+  /srv/couchdb1/data/shards/a0000000-bfffffff \
+  user@node3-hostname:/srv/couchdb3/data/shards/
+
+# Then copy hidden shards
+sudo rsync -avz --progress --partial --partial-dir=/tmp/rsync-partial \
+  /srv/couchdb1/data/.shards/40000000-5fffffff \
+  /srv/couchdb1/data/.shards/a0000000-bfffffff \
+  user@node3-hostname:/srv/couchdb3/data/.shards/
+
+# Touch the .shards to ensure they're newer
+ssh user@node3-hostname "sudo find /srv/couchdb3/data/.shards -type f -exec touch {} +"
+```
+
+## Update Cluster Configuration
+
+In the primary CouchDB pod:
+
+```shell
+# Apply shard distribution
+move-shards "$shard_matrix"
+
+# Remove old node configuration
+remove-node couchdb@127.0.0.1
+
+# Verify migration
+verify
+```
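+
+To see how the shards of a given database ended up distributed across nodes, you can query CouchDB's standard `_shards` endpoint, shown here for the `medic` database as an illustration:
+
+```shell
+# Each shard range should map to the node that now holds its files
+curl -s "$COUCH_URL/medic/_shards"
+```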
diff --git a/content/en/hosting/4.x/migration/data-migration-3x-docker-to-4x-k3s-single.md b/content/en/hosting/4.x/migration/data-migration-3x-docker-to-4x-k3s-single.md
new file mode 100644
index 000000000..da038a416
--- /dev/null
+++ b/content/en/hosting/4.x/migration/data-migration-3x-docker-to-4x-k3s-single.md
@@ -0,0 +1,103 @@
+---
+title: "Migration from 3.x Docker Compose to 4.x K3s (Single Node)"
+linkTitle: "To K3s Single-Node"
+weight: 20
+description: >
+  Guide on how to migrate existing data from a CHT 3.x Docker Compose deployment to a CHT 4.x single-node K3s deployment
+relatedContent: >
+---
+
+{{< read-content file="hosting/4.x/migration/_partial_migration_3x_docker_to_4x_k3s.md" >}}
+
+## Create values.yaml for K3s Deployment
+
+{{< read-content file="hosting/4.x/migration/_partial_values_explanation.md" >}}
+
+```yaml
+project_name: "<your-namespace>"
+namespace: "<your-namespace>"
+chtversion: <cht-version>
+
+upstream_servers:
+  docker_registry: "public.ecr.aws/medic"
+  builds_url: "https://staging.dev.medicmobile.org/_couch/builds_4"
+upgrade_service:
+  tag: 0.32
+
+couchdb:
+  password: "<couchdb-password>"
+  secret: "<couchdb-secret>"
+  user: "<couchdb-user>"
+  uuid: "<couchdb-uuid>"
+  clusteredCouch_enabled: false
+  couchdb_node_storage_size: 100Gi
+
+ingress:
+  host: "<host>"
+
+environment: "remote"
+cluster_type: "k3s-k3d"
+cert_source: "specify-file-path"
+certificate_crt_file_path: "<path-to-tls-files>/fullchain.crt"
+certificate_key_file_path: "<path-to-tls-files>/privkey.key"
+
+nodes:
+  node-1: "couch01"
+
+couchdb_data:
+  preExistingDataAvailable: "true"
+  dataPathOnDiskForCouchDB: "data"
+  partition: "0"
+
+local_storage:
+  preExistingDiskPath-1: "/srv/couchdb1"
+```
+
+## Deploy to K3s
+
+We are going to use the `cht-deploy` script from the [cht-core](https://github.com/medic/cht-core) repo:
+
+```shell
+cd cht-core/scripts/deploy
+./cht-deploy -f /path/to/your/values.yaml
+```
+
+## Run Migration Commands
+
+First, verify CouchDB is running by getting the pod status and running `curl` inside the couchdb service to confirm `localhost` is accessible. Be sure to replace `<your-namespace>` with your actual namespace:
+
+```shell
+kubectl get pods -n <your-namespace>
+
+kubectl exec -it -n <your-namespace> $(kubectl get pod -n <your-namespace> -l cht.service=couchdb -o name) -- \
+  curl -s http://localhost:5984/_up
+```
+
+Access the CouchDB pod:
+
+```shell
+kubectl exec -it -n <your-namespace> $(kubectl get pod -n <your-namespace> -l cht.service=couchdb -o name) -- bash
+```
+
+Set up the migration tool in the pod:
+
+```shell
+curl -fsSL https://deb.nodesource.com/setup_16.x | bash -
+apt install -y nodejs npm git
+git clone https://github.com/medic/couchdb-migration.git
+cd couchdb-migration
+npm ci --omit=dev
+
+# Create a global symlink to enable running commands directly
+# Note: This may require sudo if npm's global directories aren't writable
+npm link
+
+export ADMIN_USER=<couchdb-user>
+export ADMIN_PASSWORD=<couchdb-password>
+export COUCH_URL="http://${ADMIN_USER}:${ADMIN_PASSWORD}@localhost:5984"
+
+# Verify CouchDB is up and responding
+check-couchdb-up
+
+# Update node configuration
+move-node
+
+# Verify migration
+verify
+```
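+
+As an optional final smoke test, confirm from your workstation that the CHT is reachable through the K3s ingress. This assumes DNS and TLS for `<host>` are already in place and uses the CHT API's monitoring endpoint:
+
+```shell
+# Returns a JSON document that includes the running CHT version
+curl -s https://<host>/api/v2/monitoring
+```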
diff --git a/content/en/hosting/4.x/data-migration.md b/content/en/hosting/4.x/migration/migration-to-4x-docker.md
similarity index 99%
rename from content/en/hosting/4.x/data-migration.md
rename to content/en/hosting/4.x/migration/migration-to-4x-docker.md
index 428f7016b..de08d3335 100644
--- a/content/en/hosting/4.x/data-migration.md
+++ b/content/en/hosting/4.x/migration/migration-to-4x-docker.md
@@ -1,7 +1,7 @@
 ---
 title: "Migration from CHT 3.x to CHT 4.x"
-linkTitle: "Data migration to 4.x"
-weight: 1
+linkTitle: "To Docker Single-Node"
+weight: 30
 aliases:
   - /apps/guides/hosting/4.x/data-migration
 description: >
diff --git a/content/en/hosting/4.x/upgrade-troubleshooting.md b/content/en/hosting/4.x/upgrade-troubleshooting.md
index c554bd545..d0a2dc8bd 100644
--- a/content/en/hosting/4.x/upgrade-troubleshooting.md
+++ b/content/en/hosting/4.x/upgrade-troubleshooting.md
@@ -7,7 +7,7 @@ aliases:
 description: >
   What to do when CHT 4.x upgrades don't work as planned
 relatedContent: >
-  hosting/4.x/data-migration
+  hosting/4.x/migration/migration-to-4x-docker
 ---
 
 4.0.0 was released in November of 2022, so 4.x is mature and users have learned a number of important lessons on how to fix failed 4.x upgrades. Below are some specific tips as well as general practices on upgrading 4.x.
diff --git a/content/en/hosting/vertical-vs-horizontal.md b/content/en/hosting/vertical-vs-horizontal.md
index 87b8d38bb..278f491ae 100644
--- a/content/en/hosting/vertical-vs-horizontal.md
+++ b/content/en/hosting/vertical-vs-horizontal.md
@@ -7,7 +7,7 @@ aliases:
 description: >
   The power of clustered CouchDB to horizontally scale the CHT
 relatedContent: >
-  hosting/4.x/data-migration
+  hosting/4.x/migration/migration-to-4x-docker
   core/overview/architecture/
 ---
@@ -21,7 +21,7 @@ CHT Core 4.0.0 introduces [a new architecture]({{< relref "core/overview/architecture" >}})
 Before getting into how the CHT horizontally scales, it is important to understand what vertical scaling is and why it matters. Vertical scaling is the ability of the CHT to support more users by adding more RAM and CPU to either the bare-metal or virtual machine host. This ensures key services like API, Sentinel and, most importantly, CouchDB can operate without performance degradation.
 
-When thousands of users are simultaneously trying to synchronize with the CHT, the load can overwhelm CouchDB. As discovered [through extensive research](https://forum.communityhealthtoolkit.org/t/how-we-tested-scalability-of-cht-infrastructure/1532) and [large production deployments](https://github.com/medic/cht-core/issues/8324#issuecomment-1691411542), administrators will start to see errors in their logs and end users will complain of slow sync times. Before moving to more CouchDB nodes, administrators should consider adding more RAM and CPU to the single server where the CHT is hosted. This applies to both CHT 3.x and CHT 4.x. Given the ease of allocating more resources, presumably in virtualized environment like [EC2](https://aws.amazon.com/ec2/), [Proxmox](https://www.proxmox.com/en/) or [ESXi](https://www.vmware.com/products/cloud-infrastructure/esxi-and-esx), this is much easier than moving [from a single to multi-node CouchDB instance]({{< relref "hosting/4.x/data-migration" >}}).
+When thousands of users are simultaneously trying to synchronize with the CHT, the load can overwhelm CouchDB. As discovered [through extensive research](https://forum.communityhealthtoolkit.org/t/how-we-tested-scalability-of-cht-infrastructure/1532) and [large production deployments](https://github.com/medic/cht-core/issues/8324#issuecomment-1691411542), administrators will start to see errors in their logs and end users will complain of slow sync times. Before moving to more CouchDB nodes, administrators should consider adding more RAM and CPU to the single server where the CHT is hosted. This applies to both CHT 3.x and CHT 4.x. Given the ease of allocating more resources, presumably in a virtualized environment like [EC2](https://aws.amazon.com/ec2/), [Proxmox](https://www.proxmox.com/en/) or [ESXi](https://www.vmware.com/products/cloud-infrastructure/esxi-and-esx), this is much easier than moving [from a single to multi-node CouchDB instance]({{< relref "hosting/4.x/migration/migration-to-4x-docker" >}}).
 
 Here we see a normal deployment following the bare minimum [hosting requirements]({{< relref "hosting/requirements" >}}) for the CHT. We'll call this a "short" deployment because it is not yet vertically scaled:
 
@@ -75,7 +75,7 @@ end
 API["API"] --> HAProxy --> couch4
 ```
 
-To read up on how to migrate your data from a single to multi-node, please see the [data migration guide]({{< relref "hosting/4.x/data-migration" >}}).
+To read up on how to migrate your data from a single to multi-node, please see the [data migration guide]({{< relref "hosting/4.x/migration/migration-to-4x-docker" >}}).
 
 It should be noted that, unlike vertical scaling, horizontal scaling of a large existing dataset can take a while to prepare the transfer (hours to days) and may involve a brief service outage. This should be taken into consideration when planning a move of a CHT instance with a lot of data.