feat: Allow to restrict the CRs watched according to their labels #1832

Draft
wants to merge 2 commits into base: master
1 change: 1 addition & 0 deletions deploy/helm/grafana-operator/README.md
@@ -104,5 +104,6 @@ It's easier to just manage this configuration outside of the operator.
| serviceMonitor.targetLabels | list | `[]` | Set of labels to transfer from the Kubernetes Service onto the target |
| serviceMonitor.telemetryPath | string | `"/metrics"` | Set path to metrics path |
| tolerations | list | `[]` | pod tolerations |
| watchLabelSelectors | string | `""` | Sets the `WATCH_LABEL_SELECTORS` environment variable, it defines which CRs are watched according to their labels. By default, the operator watches all CRs. To make it watch only a subset of CRs, define the variable as a *set-based requirement*. See also: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ |
| watchNamespaceSelector | string | `""` | Sets the `WATCH_NAMESPACE_SELECTOR` environment variable, it defines which namespaces the operator should be listening for based on a namespace label (e.g. `"environment: dev"`). By default, the operator watches all namespaces. To make it watch only its own namespace, check out `namespaceScope` option instead. |
| watchNamespaces | string | `""` | Sets the `WATCH_NAMESPACE` environment variable, it defines which namespaces the operator should be listening for (e.g. `"grafana, foo"`). By default, the operator watches all namespaces. To make it watch only its own namespace, check out `namespaceScope` option instead. |
6 changes: 6 additions & 0 deletions deploy/helm/grafana-operator/templates/deployment.yaml
@@ -51,6 +51,12 @@ spec:
{{ else }}
value: {{quote .Values.watchNamespaceSelector }}
{{- end }}
- name: WATCH_LABEL_SELECTORS
{{- if and .Values.watchLabelSelectors (eq .Values.watchLabelSelectors "") }}
value: ""
{{ else }}
value: {{quote .Values.watchLabelSelectors }}
{{- end }}
{{- with .Values.env }}
{{- toYaml . | nindent 12 }}
{{- end }}
7 changes: 7 additions & 0 deletions deploy/helm/grafana-operator/values.yaml
@@ -15,6 +15,13 @@ watchNamespaces: ""
# By default, the operator watches all namespaces. To make it watch only its own namespace, check out `namespaceScope` option instead.
watchNamespaceSelector: ""

# -- Sets the `WATCH_LABEL_SELECTORS` environment variable,
# it defines which CRs are watched according to their labels.
# By default, the operator watches all CRs. To make it watch only a subset of CRs, define the variable as a *stringified label selector*.
# See also: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
watchLabelSelectors: ""
# watchLabelSelectors: "partition in (customerA, customerB),environment!=qa"

# -- Determines if the target cluster is OpenShift. Additional rbac permissions for routes will be added on OpenShift
isOpenShift: false

23 changes: 21 additions & 2 deletions main.go
@@ -19,6 +19,7 @@
import (
"context"
"flag"
"fmt"
"os"
"os/signal"
"strings"
@@ -65,6 +66,10 @@
// eg: "environment: dev"
// If empty or undefined, the operator will run in cluster scope.
watchNamespaceEnvSelector = "WATCH_NAMESPACE_SELECTOR"
// watchLabelSelectorsEnvVar is the constant for env variable WATCH_LABEL_SELECTORS which specifies the resources to watch according to their labels.
// eg: 'partition in (customerA, customerB),environment!=qa'
// If empty or undefined, the operator will watch all CRs.
watchLabelSelectorsEnvVar = "WATCH_LABEL_SELECTORS"
)

var (
@@ -81,7 +86,7 @@
//+kubebuilder:scaffold:scheme
}

func main() {

Check failure on line 89 in main.go (GitHub Actions / go-lint, GitHub Actions / test): cyclomatic complexity 26 of func `main` is high (> 25) (gocyclo)
var metricsAddr string
var enableLeaderElection bool
var probeAddr string
@@ -105,6 +110,7 @@

watchNamespace, _ := os.LookupEnv(watchNamespaceEnvVar)
watchNamespaceSelector, _ := os.LookupEnv(watchNamespaceEnvSelector)
watchLabelSelectors, _ := os.LookupEnv(watchLabelSelectorsEnvVar)

controllerOptions := ctrl.Options{
Scheme: scheme,
@@ -120,6 +126,18 @@
PprofBindAddress: pprofAddr,
}

var labelSelectors labels.Selector
Author

This block triggers an error from the linter

main.go:89:1: cyclomatic complexity 26 of func main is high (> 25) (gocyclo)

I could move this block of code into a dedicated func outside of main.
WDYT?

Member

To be honest, I think it's more readable when contained in the main function. If the refactor is easy enough then go for it, otherwise I'm good with ignoring the linter here

Author

Let's keep it like that; I would have moved the code outside only to please the linter. I find the code readable enough.
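
For reference, a minimal sketch of what the extracted helper could look like (hypothetical name `parseWatchLabelSelectors`, not part of this PR):

```go
// parseWatchLabelSelectors turns the WATCH_LABEL_SELECTORS value into a selector,
// falling back to labels.Everything() when the variable is empty or unset.
func parseWatchLabelSelectors(raw string) (labels.Selector, error) {
	if raw == "" {
		return labels.Everything(), nil // match any labels, same as the current default
	}
	return labels.Parse(raw)
}
```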

var err error
if watchLabelSelectors != "" {
labelSelectors, err = labels.Parse(watchLabelSelectors)
Contributor
@Baarsgaard Baarsgaard Jan 21, 2025

Note: I like this quite a lot, and would prefer to swap the existing namespace selector parsing to this or the ValidatedSelectorFromSet to maintain current behaviour.
To properly support sharding at scale, changing to labels.Parse might be better for the namespace selector
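
For reference, a small self-contained sketch (not part of this PR) of how the two parsing approaches mentioned here differ, using `k8s.io/apimachinery/pkg/labels`:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// labels.Parse understands the full selector syntax, including set-based requirements.
	sel, err := labels.Parse("partition in (customerA, customerB),environment!=qa")
	if err != nil {
		panic(err)
	}
	fmt.Println(sel.Matches(labels.Set{"partition": "customerA", "environment": "dev"})) // true
	fmt.Println(sel.Matches(labels.Set{"partition": "customerC"}))                       // false

	// labels.ValidatedSelectorFromSet only builds equality requirements from a label map.
	eq, err := labels.ValidatedSelectorFromSet(labels.Set{"environment": "dev"})
	if err != nil {
		panic(err)
	}
	fmt.Println(eq.Matches(labels.Set{"environment": "dev", "partition": "customerA"})) // true
}
```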

if err != nil {
setupLog.Error(err, fmt.Sprintf("unable to parse %s", watchLabelSelectorsEnvVar))
os.Exit(1) //nolint
}
} else {
labelSelectors = labels.Everything() // Match any labels
}

getNamespaceConfig := func(namespaces string) map[string]cache.Config {
defaultNamespaces := map[string]cache.Config{}
for _, v := range strings.Split(namespaces, ",") {
@@ -128,7 +146,7 @@
// this is the default behavior of the operator on v5, if you require finer grained control over this
// please file an issue in the grafana-operator/grafana-operator GH project
defaultNamespaces[v] = cache.Config{
LabelSelector: labels.Everything(), // Match any labels
LabelSelector: labelSelectors,
FieldSelector: fields.Everything(), // Match any fields
Transform: nil,
UnsafeDisableDeepCopy: nil,
@@ -156,7 +174,7 @@
// this is the default behavior of the operator on v5, if you require finer grained control over this
// please file an issue in the grafana-operator/grafana-operator GH project
defaultNamespaces[v.Name] = cache.Config{
LabelSelector: labels.Everything(), // Match any labels
LabelSelector: labelSelectors,
FieldSelector: fields.Everything(), // Match any fields
Transform: nil,
UnsafeDisableDeepCopy: nil,
@@ -180,6 +198,7 @@

case watchNamespace == "" && watchNamespaceSelector == "":
// cluster scoped
controllerOptions.Cache.DefaultLabelSelector = labelSelectors
Contributor
@Baarsgaard Baarsgaard Jan 21, 2025

This will unfortunately break the operator in several ways:

  1. Setting a default LabelSelector would break valuesFrom referencing unlabelled ConfigMaps and Secrets, the workaround is the same cache ignore rule that is necessary in Fix: Do not cache native resources created without CommonLabels #1818
  2. Any resource created by the Operator itself, specifically related to internal Grafana CRs would be created and maybe be in the cache for a while, but any updates or a restart and they would be orphaned as labels are not inherited, see fix: add common labels to created resources #1661
  3. I'm not 100% sure on this, but even if you fix 2. If you create an Internal Grafana and migrate it to another shard, it would have to be with a full delete and re-apply to ensure the backing Deployment, PVC, Route, ConfigMap and such are not orphaned, losing all alert history and more if no remote database is configured.
    This could potentially be fixed by cascading label updates, but would require passing down the old and new labels to all the supporting reconcilers.
  4. I think Cache is nil at this point 😅

Member

With #1818 most of this should be fine if we either

  • only apply the watch label selector to grafana CRDs
  • or somehow merge it with "app.kubernetes.io/managed-by": "grafana-operator"

Contributor
@Baarsgaard Baarsgaard Jan 22, 2025

Only applying the watchLabels on Grafana CRDs will hide them due to the defaultLabelSelector unless overwritten with byObject, and you will end up with multiple reconciles of the same resources from different shards when omitted.
Merging is an option, but does not solve the shard migration/cascading label updates to ensure nothing is orphaned.
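
To make the ByObject alternative concrete, a hedged sketch of scoping the selector to the Grafana CRs only instead of setting a cache-wide default (the `v1beta1` import path and the exact list of kinds are assumptions, not taken from this PR):

```go
// Assumes: v1beta1 "github.com/grafana/grafana-operator/v5/api/v1beta1",
// cache "sigs.k8s.io/controller-runtime/pkg/cache",
// client "sigs.k8s.io/controller-runtime/pkg/client".
// Only the operator's own CRs are filtered by the watch label selector; native
// resources (ConfigMaps, Secrets, Deployments, ...) keep the default cache behaviour,
// so valuesFrom references and operator-created objects stay visible.
controllerOptions.Cache.ByObject = map[client.Object]cache.ByObject{
	&v1beta1.Grafana{}:           {Label: labelSelectors},
	&v1beta1.GrafanaDashboard{}:  {Label: labelSelectors},
	&v1beta1.GrafanaDatasource{}: {Label: labelSelectors},
	&v1beta1.GrafanaFolder{}:     {Label: labelSelectors},
}
```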

setupLog.Info("operator running in cluster scoped mode")
}
