
Deploy prometheus #1013

Open · wants to merge 19 commits into main
Conversation

@HaoYang0000 (Collaborator) commented Dec 24, 2024

This PR adds make targets to deploy/undeploy a Prometheus instance, along with its related service monitors and secrets.

@HaoYang0000 marked this pull request as ready for review December 26, 2024 08:28
@HaoYang0000 changed the title [WIP]Deploy prometheus → Deploy prometheus Dec 26, 2024
@roypaulin (Collaborator) left a comment:
Try it with 2 vdbs in separate namespaces and let me know if Prometheus can scrape their metrics.

@HaoYang0000 (Collaborator, Author) replied:
> Try it with 2 vdb in separate namespaces and let me know if prometheus can scrape their metrics.
Yes, I tried deploying two Vertica databases in different namespaces; once the two service monitors are deployed in the same namespaces as the databases, we can get the metrics from both. I will update the Confluence page with the steps.
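For reference, a hypothetical sketch of that two-namespace setup using the deploy-prometheus.sh script from this PR (the `-a` action value, credentials, and namespace/db names are assumptions based on the script's usage string, not confirmed by this PR):

```shell
# Deploy one service monitor per namespace, next to each VerticaDB, so
# Prometheus can scrape both databases. All flag values are illustrative.
for ns in ns1 ns2; do
  scripts/deploy-prometheus.sh -n "$ns" -a create \
    -u dbadmin -p superuser-password -d "vdb-$ns" -i 5s
done
```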

@roypaulin (Collaborator) left a comment:
I think we can already add an e2e test that we will extend later for the autoscaler. For now, you can deploy Prometheus, create a vdb, create a service monitor, and check that Prometheus can fetch the metrics from the db: curl http://<prometheus-instance>:9090/api/v1/label/__name__/values, then search for one Vertica metric.

A collaborator commented:
  • Given it is not a chart, let's move this file from helm-charts/prometheus to prometheus.
  • This file must also be generated by a script so we can parameterize a few of these fields.
  • The number of replicas must be configurable.
  • We also want to be able to control the resources (CPU, memory) of the Prometheus pods (server.resources.requests/server.resources.limits).

@HaoYang0000 (Collaborator, Author) replied:
The file has been moved to prometheus.
This file is meant to provide default values for the Helm chart; we can parameterize fields by providing PROMETHEUS_HELM_OVERRIDES=--set attribute.name=something
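For example, a hypothetical invocation overriding the defaults at deploy time (the make target name and the value keys under `server.` are assumptions, not confirmed field names from this chart):

```shell
# Override the default replica count and pod resources from
# helm-charts/prometheus/values.yaml without editing the file.
make deploy-prometheus PROMETHEUS_HELM_OVERRIDES="\
  --set server.replicas=2 \
  --set server.resources.requests.cpu=500m \
  --set server.resources.limits.memory=1Gi"
```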

A collaborator commented:
Add an assert to check the service monitor

A collaborator commented:
Check for the deployment instead (all replicas must be available), and also the service.

@roypaulin (Collaborator) left a comment:
When uninstalling a Prometheus release, we should first clean up all the ServiceMonitors and Secrets tied to that release. You can use kubectl delete servicemonitor,secret -l release=
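A minimal sketch of that cleanup (release name and namespace are placeholders):

```shell
# Delete every ServiceMonitor and Secret labeled with the release,
# before uninstalling the Prometheus Helm release itself.
RELEASE="vertica-prometheus"   # placeholder release name
NAMESPACE="monitoring"         # placeholder namespace
kubectl delete servicemonitor,secret -l "release=$RELEASE" -n "$NAMESPACE"
```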

@@ -623,7 +633,7 @@ uninstall: manifests kustomize ## Uninstall CRDs from the K8s cluster specified
# If this secret does not exist then it is simply ignored.
deploy-operator: manifests kustomize ## Using helm or olm, deploy the operator in the K8s cluster
ifeq ($(DEPLOY_WITH), helm)
helm install $(DEPLOY_WAIT) -n $(NAMESPACE) --create-namespace $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred --set controllers.scope=$(CONTROLLERS_SCOPE) --set controllers.vdbMaxBackoffDuration=$(VDB_MAX_BACKOFF_DURATION) $(HELM_OVERRIDES)
helm install $(DEPLOY_WAIT) -n $(NAMESPACE) --create-namespace $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred $(HELM_OVERRIDES) --set controllers.scope=$(CONTROLLERS_SCOPE) --set controllers.vdbMaxBackoffDuration=$(VDB_MAX_BACKOFF_DURATION)
A collaborator commented:
Suggested change
helm install $(DEPLOY_WAIT) -n $(NAMESPACE) --create-namespace $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred $(HELM_OVERRIDES) --set controllers.scope=$(CONTROLLERS_SCOPE) --set controllers.vdbMaxBackoffDuration=$(VDB_MAX_BACKOFF_DURATION)
helm install $(DEPLOY_WAIT) -n $(NAMESPACE) --create-namespace $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred --set controllers.scope=$(CONTROLLERS_SCOPE) --set controllers.vdbMaxBackoffDuration=$(VDB_MAX_BACKOFF_DURATION) $(HELM_OVERRIDES)

NAMESPACE=''
USERNAME=''
PASSWORD=''
DBNAME=''
A collaborator commented:
Suggested change
DBNAME=''
VDBNAME=''

INTERVAL='5s'

function usage() {
echo "usage: $(basename $0) [-n <namespace>] [-l <label>] [-a <action>] [-u <username>] [-p <password>] [-d <dbname>] [-i <interval>]"
A collaborator commented:
Replace dbname with vdbname; they are not the same.

Comment on lines +136 to +166
cat <<EOF | kubectl delete -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: k8s-vertica-prometheus-$DBNAME
  namespace: $NAMESPACE
  labels:
    release: $LABEL
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: $DBNAME
  namespaceSelector:
    matchNames:
      - $NAMESPACE
  endpoints:
    - basicAuth:
        password:
          key: password
          name: prometheus-$DBNAME
        username:
          key: username
          name: prometheus-$DBNAME
          optional: true
      interval: $INTERVAL
      path: /v1/metrics
      port: vertica-http
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
EOF
A collaborator commented:
Suggested change
cat <<EOF | kubectl delete -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: k8s-vertica-prometheus-$DBNAME
  namespace: $NAMESPACE
  labels:
    release: $LABEL
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: $DBNAME
  namespaceSelector:
    matchNames:
      - $NAMESPACE
  endpoints:
    - basicAuth:
        password:
          key: password
          name: prometheus-$DBNAME
        username:
          key: username
          name: prometheus-$DBNAME
          optional: true
      interval: $INTERVAL
      path: /v1/metrics
      port: vertica-http
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
EOF
kubectl delete servicemonitor k8s-vertica-prometheus-$DBNAME -n $NAMESPACE

Comment on lines +169 to +179
cat <<EOF | kubectl delete -f -
apiVersion: v1
kind: Secret
metadata:
  namespace: $NAMESPACE
  name: prometheus-$DBNAME
data:
  username: '$(echo -n $USERNAME | base64)'
  password: '$(echo -n $PASSWORD | base64)'
type: Opaque
EOF
A collaborator commented:
Suggested change
cat <<EOF | kubectl delete -f -
apiVersion: v1
kind: Secret
metadata:
  namespace: $NAMESPACE
  name: prometheus-$DBNAME
data:
  username: '$(echo -n $USERNAME | base64)'
  password: '$(echo -n $PASSWORD | base64)'
type: Opaque
EOF
kubectl delete secret prometheus-$DBNAME -n $NAMESPACE

Comment on lines +92 to +93
  namespace: $NAMESPACE
  name: prometheus-$DBNAME
A collaborator commented:
Suggested change
  namespace: $NAMESPACE
  name: prometheus-$DBNAME
  namespace: $NAMESPACE
  name: prometheus-$DBNAME
  labels:
    release: $LABEL
