
feat: TLS support for the Prometheus web endpoint #492

Merged: 5 commits merged into rhobs:main from the am_prom_tls branch on Sep 25, 2024

Conversation

vyzigold
Contributor

@vyzigold vyzigold commented May 28, 2024

This PR adds a new "WebTLSConfig" field to the PrometheusConfig section of the MonitoringStack CRD, which allows the user to specify secrets containing TLS certificates. The Prometheus object is configured to use these secrets. The feature is tested in a new "assertPrometheusScrapesItselfTLS" testcase.

I needed to add the secrets containing CA certificates to the "secrets" field in the Prometheus object in order to use them for the self scraping jobs. This means the "secrets" field is now managed by OBO, so I added a new "secrets" field to the MonitoringStack CRD, which gives the users the ability to specify additional secrets to mount into the prometheus container. Specifying additional secrets is one of the ways to configure TLS CAs for scraping.
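For context, a minimal Go sketch of the shape of this API change. The type and field names below are illustrative rather than the exact ones merged, and the extra "secrets" field mentioned in the description ends up being dropped later in the thread, so it is omitted here:

```go
// Sketch only: illustrative names, not necessarily the merged code.
package v1alpha1

import corev1 "k8s.io/api/core/v1"

// PrometheusConfig holds the Prometheus-specific part of a MonitoringStack.
type PrometheusConfig struct {
	// ... existing fields ...

	// WebTLSConfig points at Secrets holding the TLS certificate, private
	// key, and (for the self-scrape job) the CA used by the Prometheus web
	// server.
	// +optional
	WebTLSConfig *WebTLSConfig `json:"webTLSConfig,omitempty"`
}

// WebTLSConfig selects the TLS material from existing Secrets.
type WebTLSConfig struct {
	// PrivateKey references the key in a Secret containing the TLS private key.
	PrivateKey corev1.SecretKeySelector `json:"privateKey"`
	// Certificate references the key in a Secret containing the TLS certificate.
	Certificate corev1.SecretKeySelector `json:"certificate"`
	// CertificateAuthority references the key in a Secret containing the CA
	// certificate, which the operator also needs for the self-scrape job.
	CertificateAuthority corev1.SecretKeySelector `json:"certificateAuthority"`
}
```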


openshift-ci bot commented May 28, 2024

Hi @vyzigold. Thanks for your PR.

I'm waiting for a rhobs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Contributor

@simonpasquier simonpasquier left a comment


It'd be easier for reviewers if you split the PR: one for Alertmanager and one for Prometheus.

test/e2e/monitoring_stack_controller_test.go: review comment (outdated, resolved)
pkg/apis/monitoring/v1alpha1/types.go: review comment (outdated, resolved)
@vyzigold vyzigold changed the title from "feat: TLS support for Alertmanager and Prometheus web endpoints" to "feat: TLS support for the Prometheus web endpoint" on May 31, 2024
@vyzigold
Contributor Author

> It'd be easier for reviewers if you split the PR: one for Alertmanager and one for Prometheus.

I removed the alertmanager bits from this PR and I'll create another PR with them shortly. This PR now includes only the Prometheus TLS implementation. I renamed the PR and rewrote the description accordingly.

@vyzigold vyzigold marked this pull request as ready for review May 31, 2024 07:36
@vyzigold vyzigold requested a review from a team as a code owner May 31, 2024 07:36
@vyzigold vyzigold requested review from sthaha and JoaoBraveCoding and removed request for a team May 31, 2024 07:36
@openshift-ci openshift-ci bot requested review from jan--f and machine424 May 31, 2024 07:36
@jan--f
Collaborator

jan--f commented Jun 26, 2024

Thanks. I have only one comment, regarding the Secrets API addition. Since OBO uses server-side apply, users should still be able to add secrets to that list by also using server-side apply.
Though it's possible this requires an annotation in the Prometheus type. @simonpasquier I seem to remember we discussed this before.

If possible I'd like to avoid adding Secrets to the MonitoringStack API.

@simonpasquier
Contributor

Regarding the .spec.secrets field in the Prometheus CRD, it should be possible to modify upstream to validate the field as a set, because specifying [foo, foo] breaks the statefulset.
Having said that, OBO could also read the secret's value and use ca instead of ca_file (at least until upstream properly supports SSA).

@vyzigold
Contributor Author

I don't think using ca instead of ca_file would be entirely possible everywhere. There are two places where I think we need ca_file. The first is the "additional scrape configs": if I understand correctly, their content is basically copied into the Prometheus configuration without modification, and I don't think the prometheus-operator interprets it to mount CAs (I'd need to look into it). The second place is prometheus.Spec.Alerting.Alertmanagers in #495, where using ca didn't seem to work (I still need to try replicating the possible bug with the upstream prometheus-operator). See https://github.com/rhobs/observability-operator/compare/adc714f4792654978f02899429e05c4e26a404ef..9f1e010a60ecf4de6935889e4bcac3e462d59440#diff-1b87d7169ef5b2a3785bd99c951d995909da935035a924781c1f0ddd12369fd2L250
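For readers following along, here is a minimal Go sketch of the two styles being discussed, assuming the prometheus-operator monitoringv1 types and its convention of mounting secrets listed in .spec.secrets under /etc/prometheus/secrets/<secret-name>/; the secret name and key are illustrative, not the ones used in this PR.

```go
// Sketch only: names are illustrative.
package example

import (
	"fmt"

	monv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"
	corev1 "k8s.io/api/core/v1"
)

// Typed CA reference: the operator resolves the Secret itself and wires the
// CA into the configuration it generates. This works for CRD-level TLS
// settings, but not inside raw additional scrape configs, which are copied
// verbatim into prometheus.yml.
func typedCA() monv1.SafeTLSConfig {
	return monv1.SafeTLSConfig{
		CA: monv1.SecretOrConfigMap{
			Secret: &corev1.SecretKeySelector{
				LocalObjectReference: corev1.LocalObjectReference{Name: "self-scrape-ca"},
				Key:                  "ca.crt",
			},
		},
	}
}

// File path reference: a raw additional scrape config can only point at a
// file that already exists in the container, so the secret has to be listed
// in .spec.secrets and referenced by its mount path via ca_file.
func caFilePath() string {
	return fmt.Sprintf("/etc/prometheus/secrets/%s/ca.crt", "self-scrape-ca")
}
```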

Regarding the SSA: I don't understand the comment about what exactly is missing upstream. Does it mean that SSA won't work for .spec.secrets at the moment? I'll try and see if I can make it work, but I think I'll be pretty busy for the next two weeks unfortunately, so it might take me a little more time to get back to this.

@jan--f
Collaborator

jan--f commented Jul 9, 2024

Thanks for your work on this, we really appreciate it! No worries if other priorities come up.

The SSA comment refers to a missing kubebuilder annotation upstream. The secrets field needs to set the correct // +listType so that the API server validates it properly in the SSA context. Right now there can only be one owner of .spec.secrets. But with the right annotation each item of it can have a different owner, i.e. OBO can set the secret it needs while we leave it to users to add their own, in turn saving us from adding an API field that is merely an additional layer for injecting the secret we want in there. I.e. just less code for us to maintain.
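To make the missing annotation concrete, here is a minimal sketch of what such a marker looks like on the upstream field; placement and wording are illustrative, and the actual upstream change may differ in detail:

```go
// Sketch of the kind of kubebuilder marker being discussed.
type CommonPrometheusFields struct {
	// ... other fields ...

	// Secrets is a list of Secrets in the same namespace as the Prometheus
	// object, which shall be mounted into the Prometheus Pods.
	// With listType=set the API server tracks server-side-apply ownership
	// per item, so OBO can own the TLS secrets it injects while users keep
	// ownership of the entries they add themselves.
	// +listType=set
	Secrets []string `json:"secrets,omitempty"`
}
```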

Regarding your question about ca vs ca_file, I'm not sure. Maybe @slashpai can shed more light on whether prom-op can mount secrets referenced in additional scrape configs?

@vyzigold vyzigold force-pushed the am_prom_tls branch 2 times, most recently from 1405fa7 to bc11d9c, on July 17, 2024 08:15
@vyzigold
Contributor Author

I rebased on top of main and hopefully fixed all the CI complaints. I removed the commit which added the Secret to the API. At the moment the code works, but OBO takes ownership of the Prometheus secrets field and users aren't able to use SSA to add new secrets. As you mentioned in previous comments, we'll need some upstream changes for SSA to work for the secrets.

@vyzigold
Contributor Author

I went ahead and created an upstream PR to add the +listType. I tested it together with this PR and it seems to work quite nicely. OBO would add whatever it needed to the secrets field and I was able to use SSA to add other secrets. OBO was able to remove the TLS secrets when I disabled TLS while leaving the manually added secrets intact and so on.

PR: prometheus-operator/prometheus-operator#6762
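As an illustration of the workflow described above, a hedged sketch of how a user could add their own entry to .spec.secrets via server-side apply once per-item ownership is in place. The object name "sample", namespace "obo", secret "my-extra-ca", and field manager "my-team" are all made up for the example.

```go
// Sketch only: names and namespaces are hypothetical.
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
)

func addUserSecret(ctx context.Context, dc dynamic.Interface) error {
	gvr := schema.GroupVersionResource{
		Group:    "monitoring.coreos.com",
		Version:  "v1",
		Resource: "prometheuses",
	}

	// With +listType=set, the API server records this field manager as the
	// owner of only the "my-extra-ca" entry; OBO keeps owning the entries it
	// adds for the web TLS certificates and can remove them independently.
	patch := []byte(`{
	  "apiVersion": "monitoring.coreos.com/v1",
	  "kind": "Prometheus",
	  "metadata": {"name": "sample", "namespace": "obo"},
	  "spec": {"secrets": ["my-extra-ca"]}
	}`)

	_, err := dc.Resource(gvr).Namespace("obo").Patch(ctx, "sample",
		types.ApplyPatchType, patch, metav1.PatchOptions{FieldManager: "my-team"})
	return err
}
```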

@jan--f
Collaborator

jan--f commented Jul 30, 2024

Nice, I think this gets a
/lgtm
I suppose we'll have the prometheus-operator change in a release before our next point release, but I will confirm that before merging.
Apologies about the repeated conflicts; the generatedAt timestamp tracked in the CSV is rather annoying to generate and track.

pkg/apis/monitoring/v1alpha1/types.go: 8 review comments (outdated, resolved)
@vyzigold
Contributor Author

@simonpasquier thank you for your suggestions. I rebased on top of main, applied the suggestions, and regenerated the manifests/bundle.

I'll do the same for the other 3 PRs.

@vyzigold
Contributor Author

vyzigold commented Aug 1, 2024

Looks like the managed fields test was only half finished; for some reason I didn't fully complete it. I switched it to deploy with TLS and added the expected fields to the expected JSON. I still get a difference between the expected managed fields and the ones observed by the test when running it locally:

    monitoring_stack_controller_test.go:703: assertion failed: 
        --- have
        +++ expected
          map[string]any{
          	... // 2 identical entries
          	"f:alerting":                    map[string]any{"f:alertmanagers": map[string]any{}},
          	"f:arbitraryFSAccessThroughSMs": map[string]any{},
        - 	"f:enableFeatures":              map[string]any{},
        + 	"f:enableFeatures":              map[string]any{`v:"otlp-write-receiver"`: map[string]any{}},
          	"f:enableRemoteWriteReceiver":   map[string]any{},
          	"f:externalLabels":              map[string]any{"f:key": map[string]any{}},
        + 	"f:image":                       map[string]any{},
          	"f:logLevel":                    map[string]any{},
          	"f:podMetadata":                 map[string]any{"f:labels": map[string]any{"f:app.kubernetes.io/component": map[string]any{}, "f:app.kubernetes.io/part-of": map[string]any{}}},
          	... // 23 identical entries
          }

But I don't see why. AFAIK I didn't touch image or enableFeatures. Maybe it's because I use CRC for development instead of kind?

@vyzigold
Contributor Author

I deployed in kind and reran the tests. I can confirm that the CI was green except for post-e2e, which I think doesn't matter until #200 is fixed. https://paste.opendev.org/show/bYHXIta79uQXR4q0Y9pU/

@jan--f
Collaborator

jan--f commented Sep 24, 2024

/ok-to-test

Collaborator

@jan--f jan--f left a comment


Thanks, this
/lgtm

@openshift-ci openshift-ci bot added the lgtm label Sep 25, 2024

openshift-ci bot commented Sep 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jan--f, vyzigold

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 1b494d1 into rhobs:main Sep 25, 2024
11 checks passed