Databases and Storage
To meet our storage needs, we use a 2-tier storage system with flash (SSD) and HDD storage. Flash storage is used for all applications that require high-speed storage, such as our distributed PostgreSQL and MongoDB databases. HDD storage is used for applications that require large storage capacity.
All nodes are equipped with SSDs used for the OS and pods.
In the K8s context, SSD storage backs the distributed databases and the distributed block storage system (currently Longhorn), which provides general-purpose, reliable distributed storage.
HDD storage is concentrated in the storage node.
This node holds large quantities of less frequently accessed data, like backups and long-term file storage (drive). This storage is exposed through an NFS server that is reachable from the other nodes in the cluster and can be used by pods with these storage requirements.
Data kept in HDD storage will not be backed up to offsite storage, as it is considered less critical. However, the storage node will have a RAID ? configuration to ensure some level of data integrity in case of disk failure.
The cluster hosts PostgreSQL and MongoDB instances. These instances are distributed and are backed up to the storage node and to offsite storage.
The distributed PostgreSQL database is deployed using CloudNativePG (referred to as CNPG from now on).
To configure PostgreSQL, after setting up the cluster, run the script deploy-cnpg-dev.sh (for development) or deploy-cnpg-prod.sh (for production) from the root of the repository. For production, you may want to set the port on which the service is exposed (at the beginning of the script). The scripts do the following:
- Install the CNPG operator manifest
- Wait for the operator to be available
- Create a CNPG cluster in a new namespace pg, with a database tts-db with users tts (owner) and ni (superuser)
- Wait a hard-coded amount of time for the first pod to be created (at the time of writing, kubectl does not allow waiting for a non-existent resource)
- Wait for the rest of the pods to be ready
- In the development script, port-forward the specified local port to the port of the service pod
From there, you can connect to the database using the following command:
psql -h localhost -p <port> -U <user> tts-db
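The hard-coded wait described in the steps above could be replaced by a small polling loop. The following is only a sketch, not what the deploy scripts currently do: the helper names retry and pod_exists are hypothetical, and the label selector in the usage comment is an assumption that has to match the labels of the deployed pods.

```shell
# retry: run a command once per second until it succeeds or the timeout expires.
# Usage: retry <timeout-seconds> <command> [args...]
retry() {
    timeout="$1"
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# pod_exists: succeed once at least one pod matches the selector.
# `kubectl get` exits 0 even when nothing matches, so the output is tested instead.
pod_exists() {
    namespace="$1"
    selector="$2"
    [ -n "$(kubectl get pods -n "$namespace" -l "$selector" -o name 2>/dev/null)" ]
}

# Example (selector is an assumption; replace <cluster-name> with the real one):
#   retry 120 pod_exists pg "cnpg.io/cluster=<cluster-name>" &&
#   kubectl wait pod -n pg -l "cnpg.io/cluster=<cluster-name>" \
#       --for=condition=Ready --timeout=300s
```

Polling first for existence and only then calling kubectl wait avoids guessing how long the operator needs to create the first pod, at the cost of a few extra API calls.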
To deploy MongoDB, we use the MongoDB Community Kubernetes Operator.
You can configure MongoDB very similarly to PostgreSQL: run deploy-mongodb-dev.sh for development or deploy-mongodb-prod.sh for production, again from the root of the repository. In development, the local port can be specified at the beginning of the script. The scripts do the following:
- Add the "MongoDB Helm Charts for Kubernetes" repository to Helm
- Install the "Custom Resource Definitions" and the "Community Operator" in a new namespace mongodb
- Deploy the replica set, with a user ni
- Wait a hard-coded amount of time for the first pod to be created (at the time of writing, kubectl does not allow waiting for a non-existent resource)
- Wait for the rest of the pods to be ready
- In the development script, port-forward the specified local port to the port of the service pod
From there, you can connect to the database using the following command:
mongosh --port <port> --username <user> --password <pass>
Longhorn is a distributed block storage system for Kubernetes that provides features like snapshots and backups. By replicating the data across multiple nodes, Longhorn ensures data availability and redundancy, even in case of node failure.
After making sure all prerequisites are met (the deploy Ansible playbook should ensure this) and that a secret with the backup credential is created (instructions here), Longhorn can be installed using the script available at services/storage/longhorn/deploy.sh. This script receives, as an argument, a values file with the desired configuration for the Longhorn installation. The values file should be based on services/storage/longhorn/dev-values.yaml.
New nodes are added without any tags and with default disk configuration. You should use the provided k8s node annotations to specify the desired tags and disk configuration.
Warning
Beware that some configurations require Longhorn to be launched with custom settings and that, after the initial setup, configurations are not synchronized with the k8s node annotations. This means that, to change tags and disk configurations later, you will need to use the Longhorn UI or directly change the config in the Longhorn node CRD (lhn, Longhorn node).
- Documentation here: https://longhorn.io/docs/1.6.1/nodes-and-volumes/nodes/default-disk-and-node-config/
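As an illustration of the annotation-based setup, the sketch below builds the JSON tag list and applies it to a node before Longhorn registers it. The annotation key is the one described in the linked Longhorn documentation; the helper names, node name, and tags are hypothetical, and the helper assumes tags contain no quotes or backslashes.

```shell
# json_string_array: print the arguments as a JSON array of strings.
# Assumes the items contain no quotes or backslashes.
json_string_array() {
    out=""
    for item in "$@"; do
        out="${out:+$out,}\"$item\""
    done
    printf '[%s]\n' "$out"
}

# tag_longhorn_node: annotate a node with its default Longhorn tags.
# Usage: tag_longhorn_node <node-name> <tag> [tag...]
tag_longhorn_node() {
    node="$1"
    shift
    kubectl annotate node "$node" \
        "node.longhorn.io/default-node-tags=$(json_string_array "$@")"
}

# Example (hypothetical node and tags):
#   tag_longhorn_node worker-1 fast ssd
```

As the warning above notes, this only has an effect during the initial setup of a node; later changes go through the Longhorn UI or the Longhorn node CRD.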
Volumes can be created according to configurations specified by a StorageClass or in the Longhorn UI. These volumes can be created on demand by a PVC or can be pre-created and then attached to a PV and PVC. Volumes can be created with kubectl or with the Longhorn UI.
- Documentation here: https://longhorn.io/docs/1.6.1/nodes-and-volumes/volumes/create-volumes/
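For the on-demand case, a PVC that names one of the Longhorn Storage Classes is enough for a volume to be provisioned. A minimal sketch follows; the helper name render_pvc and the claim name, namespace, and size in the example are hypothetical.

```shell
# render_pvc: print a PersistentVolumeClaim manifest on stdout.
# Usage: render_pvc <name> <namespace> <storage-class> <size>
render_pvc() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
  namespace: $2
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: $3
  resources:
    requests:
      storage: $4
EOF
}

# Example (hypothetical values):
#   render_pvc demo-data default longhorn-locality 2Gi | kubectl apply -f -
```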
The Storage Class parameter parameters.dataLocality controls whether Longhorn tries to keep a replica of the volume on the same node as the workload using it. This is useful for workloads that require high-speed storage, like databases and other high-throughput applications.
Available options are:
- dataLocality: "disabled" - This is the default option. There may or may not be a replica on the same node as the attached volume (workload).
- dataLocality: "best-effort" - This option instructs Longhorn to try to keep a replica on the same node as the attached volume (workload). Longhorn will not stop the volume, even if it cannot keep a replica local to the attached volume (workload) due to an environment limitation, e.g. not enough disk space, incompatible disk tags, etc.
- dataLocality: "strict-local" - This option forces Longhorn to keep only one replica, on the same node as the attached volume, and therefore offers higher IOPS and lower latency.
- Documentation here:
The Storage Class parameter replicaAutoBalance controls where Longhorn keeps the requested replicas of a volume. This helps ensure that the replicas do not all end up on the same node, which would make the volume unavailable if that node fails.
Available options are:
- replicaAutoBalance: "ignored" - This is the default option. Longhorn will follow the global setting.
- replicaAutoBalance: "disabled" - This option instructs Longhorn not to balance the replicas of the volume across all nodes.
- replicaAutoBalance: "least-effort" - This option instructs Longhorn to balance replicas for minimal redundancy. For example, after adding node-2, a volume with 4 off-balanced replicas will only rebalance 1 replica.
- replicaAutoBalance: "best-effort" - This option instructs Longhorn to try balancing replicas for even redundancy. For example, after adding node-2, a volume with 4 off-balanced replicas will rebalance 2 replicas.
- Documentation here:
To exclude a volume from backups, its Storage Class parameter parameters.recurringJobSelector should not include recurring jobs or recurring job groups with backup tasks.
Recurring jobs are a way to run tasks periodically in Longhorn. These tasks can run standalone or as part of a recurring job group. Groups that apply to a volume are defined by the Storage Class parameter parameters.recurringJobSelector, as previously mentioned.
Snapshots and backups are done by recurring jobs. Snapshots are taken by snapshot jobs, and backups are done by backup jobs. Other tasks are available.
- Documentation here: https://longhorn.io/docs/1.6.1/snapshots-and-backups/scheduling-backups-and-snapshots/
- Recurring jobs of the project: https://github.com/NIAEFEUP/niployments-revamp/blob/main/services/storage/longhorn/recurringJobs/
- Recurring jobs of the default group: https://github.com/NIAEFEUP/niployments-revamp/blob/main/services/storage/longhorn/recurringJobs/default
Some Storage Classes with different settings are available at services/storage/longhorn/storageClasses/. Snapshots are enabled for all of them; backups are only enabled when specified.
- longhorn-strict-local - Storage Class with dataLocality: "strict-local". Backups are enabled. strict-local has only one replica per volume, working as a high-speed storage solution but with no redundancy, like the common local path provisioner.
- longhorn-strict-local-retain - Storage Class with dataLocality: "strict-local" and reclaimPolicy: "Retain". Backups are enabled. strict-local has only one replica per volume, working as a high-speed storage solution but with no redundancy, like the common local path provisioner.
- longhorn-locality - Storage Class with dataLocality: "best-effort" and replicaAutoBalance: "least-effort". Backups are enabled.
- longhorn-retain.yaml - Storage Class with replicaAutoBalance: "least-effort" and reclaimPolicy: "Retain". Backups are enabled.
- longhorn-locality-retain - Storage Class with dataLocality: "best-effort", replicaAutoBalance: "least-effort" and reclaimPolicy: "Retain". Backups are enabled.
- longhorn-locality-no-backup - Storage Class with dataLocality: "best-effort" and replicaAutoBalance: "least-effort".
- longhorn-retain-no-backup-retain - Storage Class with dataLocality: "best-effort", replicaAutoBalance: "least-effort" and reclaimPolicy: "Retain".
All classes have fake storage classes with matching names available at services/storage/longhorn/storageClasses/fakeDevClasses/. These do not implement any Longhorn behavior and use local path provisioning. However, they keep the retention policies.
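To make the combinations above concrete, the sketch below prints a Longhorn Storage Class manifest from the settings discussed in this section. It is an illustration under assumptions, not a copy of the files in the repository: the helper name render_storage_class and the replica counts are hypothetical, and parameters.recurringJobSelector is deliberately left out.

```shell
# render_storage_class: print a Longhorn StorageClass manifest on stdout.
# Usage: render_storage_class <name> <dataLocality> <replicaAutoBalance> \
#                             <reclaimPolicy> <numberOfReplicas>
render_storage_class() {
    cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: $1
provisioner: driver.longhorn.io
reclaimPolicy: $4
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "$5"
  dataLocality: "$2"
  replicaAutoBalance: "$3"
EOF
}

# Examples (replica counts are assumptions; strict-local needs exactly 1 replica):
#   render_storage_class longhorn-locality-retain best-effort least-effort Retain 3
#   render_storage_class longhorn-strict-local strict-local ignored Delete 1
```

Piping the output to kubectl apply -f - would create the class; comparing it against the files in services/storage/longhorn/storageClasses/ first is advisable, since those remain the source of truth.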
Longhorn provides backup mechanisms for volumes using many different strategies.
A snapshot in Longhorn captures the state of a volume at the time the snapshot is created. Each snapshot only captures changes that overwrite data from earlier snapshots, so a sequence of snapshots is needed to represent the full state of the volume. Volumes can be restored from a snapshot.
The snapshot recurring job is defined to run every day at 00:00 and 12:00. It will keep 6 snapshots of the volume, corresponding to the last 3 days.
- Snapshot recurring job: https://github.com/NIAEFEUP/niployments-revamp/blob/main/services/storage/longhorn/recurringJobs/default/snapshot.yaml
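A recurring job with this schedule could look roughly like the manifest printed by the sketch below. The actual definition lives in the file linked above; here the helper name render_recurring_job, the namespace, and the concurrency value are assumptions.

```shell
# render_recurring_job: print a Longhorn RecurringJob manifest on stdout.
# Usage: render_recurring_job <name> <task> <cron> <retain>
render_recurring_job() {
    cat <<EOF
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: $1
  namespace: longhorn-system
spec:
  task: "$2"
  cron: "$3"
  retain: $4
  concurrency: 2
  groups:
    - default
EOF
}

# Snapshot twice a day (00:00 and 12:00), keeping the 6 most recent snapshots.
# The cron expression must be quoted so the shell does not expand the asterisks.
#   render_recurring_job snapshot snapshot "0 0,12 * * *" 6
```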
Offsite backups can be made using different storage providers. The most interesting for our use case are NFS and S3.
We are currently planning on using R2 (Cloudflare's S3-compatible object storage) as the backup storage for Longhorn. It's a cheap and reliable solution without egress fees.
Another option could be to use an NFS server to store backups in the HDD storage node, and then back up this data to offsite storage. This would enable faster recovery, but should be overkill for now.
The backup recurring job is defined to run every Sunday at 03:00. It will keep 12 backups of the volume, corresponding to the last 3 months.
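The retention windows quoted for both recurring jobs follow from the number of retained copies multiplied by the interval between runs. A quick check, using a hypothetical helper retention_days:

```shell
# retention_days: days covered by <retain> copies taken <interval-hours> apart.
retention_days() {
    retain="$1"
    interval_hours="$2"
    echo $(( retain * interval_hours / 24 ))
}

retention_days 6 12     # snapshots every 12 hours, 6 kept: prints 3
retention_days 12 168   # backups every 7 days, 12 kept: prints 84
```

The 84 days for backups are 12 weeks, which is what the text rounds to 3 months.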