This repository demonstrates some of the basic features of Red Hat’s latest Data Grid release and how to deploy a Data Grid cluster on OCP and RHEL.
- 1. Introduction
- 2. Deploying RHDG on RHEL
- 3. Deploying RHDG on OCP using the Operator
- 4. Monitoring RHDG with Grafana
- Annex: Configure the scope of your operator
- Annex: Stern - Tail logs from multiple pods
- Annex: Deploying the Infinispan operator
- Annex: Deploying RHDG 8 without operator
- Useful links
Red Hat Data Grid is an in-memory, distributed, NoSQL datastore solution. Your applications can access, process, and analyze data at in-memory speed to deliver a superior user experience.
Red Hat Data Grid provides value as a standard architectural component in application infrastructures for a variety of real-world scenarios and use cases:
- Data caching and transient data storage.
- Primary data store.
- Low latency compute grid.
To support modern data management requirements with rapid data processing, elastic scalability, and high availability, Red Hat Data Grid offers:
- NoSQL data store. Provides simple, flexible storage for a variety of data without the constraints of a fixed data model.
- Apache Spark and Hadoop integration. Offers full support as an in-memory data store for Apache Spark and Hadoop, with support for Spark resilient distributed datasets (RDDs) and Discretized Streams (DStreams), as well as the Hadoop I/O format.
- Rich querying. Provides easy search for objects using values and ranges, without the need for key-based lookups or an object’s exact location.
- Polyglot client and access protocol support. Offers read/write capabilities that let applications written in multiple programming languages easily access and share data. Applications can access the data grid remotely using REST or Hot Rod (for Java™, C++, and .NET).
- Distributed parallel execution. Quickly process large volumes of data and support long-running compute applications using simplified Map-Reduce parallel operations.
- Flexible persistence. Increase the lifespan of information in memory for improved durability through support for both shared-nothing and shared-database (RDBMS or NoSQL) architectures.
- Comprehensive security. Authentication, role-based authorization, and access control are integrated with existing security and identity structures to give only trusted users, services, and applications access to the data grid.
- Cross-datacenter replication. Replicate applications across datacenters and achieve high availability to meet service-level agreement (SLA) requirements for data within and across datacenters.
- Rolling upgrades. Upgrade your cluster without downtime for continuous, uninterrupted operations for remote users and applications.
The Data Grid Operator provides operational intelligence and reduces management complexity for deploying Data Grid on OpenShift, automatically upgrading clusters when new image versions become available.
In order to upgrade Data Grid clusters, the operator checks the version of the image of each Data Grid node. If the operator determines that a new version of the image is available, it gracefully shuts down all nodes, applies the new image, and restarts the nodes.
On OpenShift, the Operator Lifecycle Manager (OLM) enables upgrades for Data Grid Operator.
This section explains how to configure, run, and monitor Data Grid servers. The binaries are available on the RH downloads website.
This is the simplest installation option, suitable for exploration and development purposes. Follow these steps to run the server in this mode:
cd ~/Downloads
unzip redhat-datagrid-8.1.0-server.zip
./redhat-datagrid-8.1.0-server/bin/cli.sh user create developer -p developer
./redhat-datagrid-8.1.0-server/bin/server.sh
16:33:08,457 INFO (main) [BOOT] JVM OpenJDK 64-Bit Server VM Red Hat, Inc. 11.0.9+11
[...]
16:33:13,327 INFO (main) [org.infinispan.SERVER] ISPN080004: Protocol SINGLE_PORT listening on 127.0.0.1:11222
16:33:13,327 INFO (main) [org.infinispan.SERVER] ISPN080034: Server 'hat-42208' listening on http://127.0.0.1:11222
16:33:13,327 INFO (main) [org.infinispan.SERVER] ISPN080001: Red Hat Data Grid Server 8.1.0.GA started in 4806ms
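To verify that the server is up, you can list its caches through the REST endpoint shown in the logs (a quick sanity check that assumes the developer user created above and the default port):
curl -u developer:developer http://127.0.0.1:11222/rest/v2/caches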
Unzip the server as in the previous example, locate the folder named server, duplicate it, and then use one copy with each server instance.
cd ~/Downloads
unzip redhat-datagrid-8.1.0-server.zip
cp -r redhat-datagrid-8.1.0-server/server redhat-datagrid-8.1.0-server/server-01
cp -r redhat-datagrid-8.1.0-server/server redhat-datagrid-8.1.0-server/server-02
./redhat-datagrid-8.1.0-server/bin/cli.sh user create developer -p developer --server-root=server-01
./redhat-datagrid-8.1.0-server/bin/cli.sh user create developer -p developer --server-root=server-02
./redhat-datagrid-8.1.0-server/bin/server.sh --node-name=node-01 --server-root=redhat-datagrid-8.1.0-server/server-01 --port-offset=0
./redhat-datagrid-8.1.0-server/bin/server.sh --node-name=node-02 --server-root=redhat-datagrid-8.1.0-server/server-02 --port-offset=100
After running both commands, you will see logs similar to the ones shown below in both terminals:
[...]
20:19:29,614 INFO (main) [org.infinispan.SERVER] ISPN080034: Server 'node-01' listening on http://127.0.0.1:11222
20:19:29,614 INFO (main) [org.infinispan.SERVER] ISPN080001: Red Hat Data Grid Server 8.1.0.GA started in 5585ms
20:19:36,637 INFO (jgroups-8,node-01) [org.infinispan.CLUSTER] ISPN000094: Received new cluster view for channel cluster: [node-01|1] (2) [node-01, node-02]
20:19:36,647 INFO (jgroups-8,node-01) [org.infinispan.CLUSTER] ISPN100000: Node node-02 joined the cluster
20:19:37,365 INFO (jgroups-5,node-01) [org.infinispan.CLUSTER] [Context=org.infinispan.CLIENT_SERVER_TX_TABLE]ISPN100002: Starting rebalance with members [node-01, node-02], phase READ_OLD_WRITE_ALL, topology id 2
[...]
20:19:38,463 INFO (jgroups-5,node-01) [org.infinispan.CLUSTER] [Context=___hotRodTopologyCache_hotrod]ISPN100010: Finished rebalance with members [node-01, node-02], topology id 5
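Once both nodes are up, you can also confirm the cluster view through the REST API (a quick check; the exact JSON fields, such as cluster_members, may vary between versions):
curl -u developer:developer http://127.0.0.1:11222/rest/v2/cache-managers/default | jq '.cluster_members'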
An Operator is a method of packaging, deploying and managing a Kubernetes-native application. A Kubernetes-native application is an application that is both deployed on Kubernetes and managed using the Kubernetes APIs and kubectl tooling.
Install Data Grid Operator into an OpenShift namespace to create and manage Data Grid clusters.
Create subscriptions to Data Grid Operator on OpenShift so you can install different Data Grid versions and receive automatic updates.
To deploy the RHDG operator, you will need to create three different objects:
- Two OpenShift projects that will contain the operator and the objects of the RHDG cluster.
- An OperatorGroup, which provides multitenant configuration to OLM-installed Operators. An Operator group selects target namespaces in which to generate required RBAC access for its member Operators. As we are not deploying our operator in the default namespace (openshift-operators), we will need to create one to set the namespaces where the Data Grid operator will be able to create and monitor clusters.

ℹ️ The OperatorGroup resource allows you to configure four possible namespace scopes for the operator. Please check Annex: Configure the scope of your operator before executing the commands of this section.

- A Subscription, which represents an intention to install an Operator. It is the custom resource that relates an Operator to a CatalogSource. Subscriptions describe which channel of an Operator package to subscribe to, and whether to perform updates automatically or manually. See the sketch after this list.
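For reference, the Subscription created by the template looks roughly like the following. This is a sketch: the channel and catalog source values are assumptions based on the Red Hat catalog, and the repository template remains the source of truth.
- apiVersion: operators.coreos.com/v1alpha1
  kind: Subscription
  metadata:
    name: datagrid
    namespace: ${OPERATOR_NAMESPACE}
  spec:
    channel: 8.1.x # assumed channel for DG 8.1
    name: datagrid # package name in the catalog
    source: redhat-operators
    sourceNamespace: openshift-marketplace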
I have created an OCP template to quickly deploy this operator. Just execute the following command to have it up and running on your cluster.
❗ Bear in mind that you will need cluster-admin permissions to deploy an operator, as it is necessary to create cluster-wide CRDs (Custom Resource Definitions).
oc process -f templates/rhdg-01-operator.yaml | oc apply -f -
This template provides two parameters to modify the projects where the operator and the cluster are installed. It is possible to deploy both in the same project or in different projects. By default, the values are:
- OPERATOR_NAMESPACE = rhdg8-operator
- CLUSTER_NAMESPACE = rhdg8
Modify them by passing arguments to the template:
oc process -f templates/rhdg-01-operator.yaml -p OPERATOR_NAMESPACE="other-namespace" -p CLUSTER_NAMESPACE="another-namespace" | oc apply -f -
It is also possible to install the operator from the web console. For more information, please check the official documentation.
Data Grid Operator lets you create, configure, and manage Data Grid clusters. Data Grid Operator adds a new Custom Resource (CR) of type Infinispan that lets you handle Data Grid clusters as complex units on OpenShift.
Data Grid Operator watches for Infinispan Custom Resources (CR) that you use to instantiate and configure Data Grid clusters and manage OpenShift resources, such as StatefulSets and Services. In this way, the Infinispan CR is your primary interface to Data Grid on OpenShift.
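For reference, a minimal Infinispan CR looks roughly like the following. This is a sketch based on the infinispan.org/v1 API; the repository template used below is the source of truth for this deployment.
- apiVersion: infinispan.org/v1
  kind: Infinispan
  metadata:
    name: ${CLUSTER_NAME}
    namespace: ${CLUSTER_NAMESPACE}
  spec:
    replicas: 3
    expose:
      type: Route # exposes the cluster through an OCP route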
I have created an OCP template to quickly deploy a basic RHDG cluster with 3 replicas. Execute the following command to have it up and running on your cluster.
oc process -f templates/rhdg-02-cluster.yaml | oc apply -f -
This template provides two parameters to modify the project where the cluster is installed and the name of the cluster to deploy. The cluster namespace should be the same as in the previous step. By default, the values are:
- CLUSTER_NAMESPACE = rhdg8
- CLUSTER_NAME = rhdg
Modify them by passing arguments to the template:
oc process -f templates/rhdg-02-cluster.yaml -p CLUSTER_NAMESPACE="another-namespace" -p CLUSTER_NAME="my-cluster" | oc apply -f -
❗ Creating caches with Data Grid Operator is available as a technology preview. If you want to create caches in a production environment, I encourage you to use the REST or Java alternatives.
Data Grid stores entries into caches, which can be created using several methods: REST, CLI, programmatically using the Java client, or using the Cache CRD. In this section, we will explore how to create caches using the Operator.
ℹ️ For other ways of creating caches, please check this other Git repository with information about the Data Grid client.
To create caches with Data Grid Operator, you use Cache CRs to add caches from templates or XML configuration. Bear in mind the following constraints:
- You can create a single cache for each Cache CR.
- If you edit caches in the OpenShift Web Console, changes do not take effect on the Data Grid cluster. You must delete the CR and create it again with the new configuration.
- Deleting Cache CRs in the OpenShift Web Console does not remove caches from Data Grid clusters. You must delete caches through the console or CLI.
I have created an OCP template to quickly set up two caches on the RHDG cluster:
- operator-cache-01: Based on an XML configuration.
- operator-cache-02: Based on an already defined template.
In order to apply this template, just execute the following command:
oc process -f templates/rhdg-03-caches.yaml | oc apply -f -
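For reference, the Cache CRs that such a template generates look roughly like this (a sketch of the infinispan.org/v2alpha1 API; the field values here are assumptions, and the repository template defines the real objects):
- apiVersion: infinispan.org/v2alpha1
  kind: Cache
  metadata:
    name: operator-cache-02
    namespace: ${CLUSTER_NAMESPACE}
  spec:
    clusterName: ${CLUSTER_NAME}
    name: operator-cache-02
    templateName: org.infinispan.DIST_SYNC # predefined distributed-cache template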
This template provides two parameters to modify the project where the cluster is installed and the name of the cluster to deploy. The cluster namespace should be the same as in the previous step. By default, the values are:
- CLUSTER_NAMESPACE = rhdg8
- CLUSTER_NAME = rhdg
Modify them by passing arguments to the template:
oc process -f templates/rhdg-03-caches.yaml -p CLUSTER_NAMESPACE="another-namespace" -p CLUSTER_NAME="my-cluster" | oc apply -f -
Interact with the newly created caches using the following commands:
# Set your variables. These are the defaults:
CLUSTER_NAMESPACE="rhdg8"
CLUSTER_NAME="rhdg"
CACHE_NAME="operator-cache-01"
RHDG_URL=$(oc get route ${CLUSTER_NAME}-external -n ${CLUSTER_NAMESPACE} -o template='https://{{.spec.host}}')
# Check all the caches on your cluster
curl -X GET -k -u developer:developer -H "Content-Type: application/json" ${RHDG_URL}/rest/v2/caches | jq
# Check information about a specific cache
curl -X GET -k -u developer:developer -H "Content-Type: application/json" ${RHDG_URL}/rest/v2/caches/${CACHE_NAME} | jq
# Delete a cache
curl -X DELETE -k -u developer:developer ${RHDG_URL}/rest/v2/caches/${CACHE_NAME}
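To quickly test a cache end to end, you can also create and read an entry through the same REST API (the key and value below are purely illustrative):
# Create an entry with key 'hello' and value 'world'
curl -X POST -k -u developer:developer -H "Content-Type: text/plain" -d "world" ${RHDG_URL}/rest/v2/caches/${CACHE_NAME}/hello
# Read the entry back
curl -X GET -k -u developer:developer ${RHDG_URL}/rest/v2/caches/${CACHE_NAME}/hello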
For more information about how to create caches using the CRD, please check the official documentation.
Data Grid exposes a metrics endpoint that provides statistics and events to Prometheus.
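You can inspect that endpoint directly, reusing the RHDG_URL variable from the previous section (a quick check; whether it requires credentials depends on the server security configuration):
curl -k -u developer:developer ${RHDG_URL}/metrics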
After installing OpenShift Container Platform 4.6, cluster administrators can optionally enable monitoring for user-defined projects. By using this feature, cluster administrators, developers, and other users can specify how services and pods are monitored in their own projects. You can then query metrics, review dashboards, and manage alerting rules and silences for your own projects in the OpenShift Container Platform web console. We are going to take advantage of this feature.
ℹ️ Enabling monitoring for user-defined projects

Monitoring of user-defined projects is not enabled by default. To enable it, you need to modify a ConfigMap of the OpenShift monitoring stack:

oc apply -f templates/rhdg-02-ocp-user-workload-monitoring.yaml

After executing the command above, you will see some pods in the following namespace:

oc get pods -n openshift-user-workload-monitoring
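For reference, on OCP 4.6 that template essentially boils down to the following ConfigMap (my reading of the OpenShift monitoring docs; check templates/rhdg-02-ocp-user-workload-monitoring.yaml for the exact content):
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true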
I have created an OCP template to quickly configure metrics monitoring of an RHDG cluster. Execute the following command:
oc process -f templates/rhdg-04-monitoring.yaml | oc apply -f -
This template provides two parameters to modify the project where the cluster was installed and the name of the cluster itself. By default, the values are:
- CLUSTER_NAMESPACE = rhdg8
- CLUSTER_NAME = rhdg
Modify them by passing arguments to the template:
oc process -f templates/rhdg-04-monitoring.yaml -p CLUSTER_NAMESPACE="another-namespace" -p CLUSTER_NAME="my-cluster" | oc apply -f -
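The core of that template is a ServiceMonitor that points Prometheus at the Data Grid metrics endpoint. A rough sketch follows; the port name and labels are assumptions, and the template itself is authoritative:
- apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    name: ${CLUSTER_NAME}-monitor
    namespace: ${CLUSTER_NAMESPACE}
  spec:
    endpoints:
      - port: infinispan # assumed name of the 11222 service port
        path: /metrics
        scheme: https
    selector:
      matchLabels:
        clusterName: ${CLUSTER_NAME} # label assumed to be set by the operator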
For more information, access the OpenShift documentation for the monitoring stack and the RHDG documentation on configuring monitoring for RHDG 8 on OCP.
A typical OpenShift monitoring stack includes Prometheus for monitoring both systems and services, and Grafana for analyzing and visualizing metrics.
Administrators are often looking to write custom queries and create custom dashboards in Grafana. However, Grafana instances provided with the monitoring stack (and its dashboards) are read-only. To solve this problem, we can use the community-powered Grafana operator provided by OperatorHub.
To deploy the community-powered Grafana operator on OCP 4.6, just follow these steps:
Now, we will create a Grafana instance using the operator:
oc process -f templates/grafana-02-instance.yaml | oc apply -f -
Now, we will create a Grafana data source:
PROJECT=grafana
oc adm policy add-cluster-role-to-user cluster-monitoring-view -z grafana-serviceaccount -n ${PROJECT}
BEARER_TOKEN=$(oc serviceaccounts get-token grafana-serviceaccount -n ${PROJECT})
oc process -f templates/grafana-03-datasource.yaml -p BEARER_TOKEN=${BEARER_TOKEN} | oc apply -f -
Now, we will create a Grafana dashboard:
DASHBOARD_NAME="grafana-dashboard-rhdg8"
# Create a configMap containing the Dashboard
oc create configmap $DASHBOARD_NAME --from-file=dashboard=grafana/$DASHBOARD_NAME.json -n $PROJECT
# Create a Dashboard object that automatically updates Grafana
oc process -f templates/grafana-04-dashboard.yaml -p DASHBOARD_NAME=$DASHBOARD_NAME | oc apply -f -
After accessing Grafana using the OCP SSO, you may log in as admin. Retrieve the credentials from the secret using the following commands:
oc get secret grafana-admin-credentials -n $PROJECT -o jsonpath='{.data.GF_SECURITY_ADMIN_USER}' | base64 --decode
oc get secret grafana-admin-credentials -n $PROJECT -o jsonpath='{.data.GF_SECURITY_ADMIN_PASSWORD}' | base64 --decode
For more information, access the Grafana main documentation or the Grafana operator documentation.
An Operator group, defined by the OperatorGroup resource, provides multitenant configuration to OLM-installed Operators. An Operator group selects target namespaces in which to generate required RBAC access for its member Operators.
If you want to modify the default behavior of the template provided in this repository, modify lines 26 to 33 of this template.
1) AllNamespaces: The Operator can be a member of an Operator group that selects all namespaces (target namespace set is the empty string ""). This configuration allows us to create DG clusters in every namespace of the cluster:
- apiVersion: operators.coreos.com/v1
  kind: OperatorGroup
  metadata:
    name: datagrid
    namespace: ${OPERATOR_NAMESPACE}
  spec: {}
2) MultiNamespace: The Operator can be a member of an Operator group that selects more than one namespace. Choose this option if you want to have several operators that manage RHDG clusters. For example, if you want to have a different operator per business unit managing several OpenShift projects:
- apiVersion: operators.coreos.com/v1
  kind: OperatorGroup
  metadata:
    name: datagrid
    namespace: ${OPERATOR_NAMESPACE}
  spec:
    targetNamespaces:
      - ${CLUSTER_NAMESPACE-1}
      - ${CLUSTER_NAMESPACE-2}
3) SingleNamespace: The Operator can be a member of an Operator group that selects one namespace. This is useful if we want every application (each OCP namespace) to be able to configure and deploy its own DG clusters:
- apiVersion: operators.coreos.com/v1
  kind: OperatorGroup
  metadata:
    name: datagrid
    namespace: ${OPERATOR_NAMESPACE}
  spec:
    targetNamespaces:
      - ${CLUSTER_NAMESPACE}
For more information, check the OpenShift documentation about Operator Groups and the official documentation to install DG on OpenShift.
In some situations, you will need to monitor logs from several pods of the same application, and maybe you want to check which pod a request arrived at. Stern allows you to tail multiple pods on Kubernetes and multiple containers within a pod. Each result is color-coded for quicker debugging.
First, you will need to install it on your machine. After that, log in to your cluster; monitoring the previous deployment is as simple as executing the following command:
stern --namespace=$CLUSTER_NAMESPACE -l clusterName=$CLUSTER_NAME
The previous command will show all the logs from all the pods in a namespace that match a given label.
There are many filters and configuration options. Check the documentation for a full list of them.
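A couple of handy variants (the container name below is an assumption; adjust it to your pod spec):
# Only show logs from the last 15 minutes
stern --namespace=$CLUSTER_NAMESPACE -l clusterName=$CLUSTER_NAME --since 15m
# Only tail a specific container within the matched pods
stern --namespace=$CLUSTER_NAMESPACE -l clusterName=$CLUSTER_NAME --container infinispan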
The same configuration rules from the previous chapter apply.
oc process -f templates/infinispan-01-operator.yaml -p OPERATOR_NAMESPACE="infinispan-operator" -p CLUSTER_NAMESPACE="infinispan" | oc apply -f -
oc process -f templates/infinispan-02-cluster.yaml -p CLUSTER_NAMESPACE="infinispan" -p CLUSTER_NAME="infinispan" | oc apply -f -
It is also possible to install the operator from the web console. For more information, please check the official documentation.
❗ Bear in mind that this template is for testing purposes and that the installation made with it will not benefit from any kind of Red Hat support.
If you need to test any configuration that the operator does not provide, it is possible to deploy the RHDG cluster manually, creating all the OpenShift resources that the Operator manages automagically.
By default, the template deploys an RHDG cluster named rhdg in the project rhdg8-operatorless. Deploy your cluster with the following command:
oc process -f templates/rhdg-99-operatorless.yaml | oc apply -f -
Due to the limitations of OCP templates, if you want to change the cluster name and the namespace, you will have to edit lines #103 and #107 of the file templates/rhdg-99-operatorless.yaml. This is the pattern that you have to follow (substitute the ENV_VARS with the correct values):
data:
  infinispan.yaml: |
    clusterName: ${CLUSTER_NAME}
    jgroups:
      transport: tcp
      dnsPing:
        query: ${CLUSTER_NAME}-ping.${CLUSTER_NAMESPACE}.svc.cluster.local
      diagnostics: false
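For example, with CLUSTER_NAME=my-cluster and CLUSTER_NAMESPACE=another-namespace (matching the command below), the fragment would end up as:
data:
  infinispan.yaml: |
    clusterName: my-cluster
    jgroups:
      transport: tcp
      dnsPing:
        query: my-cluster-ping.another-namespace.svc.cluster.local
      diagnostics: false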
After modifying the YAML, process the template with the correct params:
oc process -f templates/rhdg-99-operatorless.yaml -p CLUSTER_NAMESPACE="another-namespace" -p CLUSTER_NAME="my-cluster" | oc apply -f -