✨ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization #1475

Open. camilamacedo86 wants to merge 1 commit into main from replace-rbac-protect-metrics.


@camilamacedo86 camilamacedo86 commented Nov 17, 2024

This commit removes the use of the kube-rbac-proxy image and replaces it with metrics authentication/authorization provided by controller-runtime. The kube-rbac-proxy image is deprecated and will no longer be maintained, which introduces risks to production environments. For more details, see: kubernetes-sigs/kubebuilder#3907

PR for catalogd: operator-framework/catalogd#460
Motivation: #1509

Tests done locally

To ensure backwards compatibility

Create the clusterrolebinding

$ kubectl create clusterrolebinding operator-controller-metrics-binding \
   --clusterrole=operator-controller-metrics-reader \
   --serviceaccount=olmv1-system:operator-controller-controller-manager

Create the TOKEN

TOKEN=$(kubectl create token operator-controller-controller-manager -n olmv1-system)
echo $TOKEN

Use curl to validate

kubectl label namespace olmv1-system pod-security.kubernetes.io/enforce-
kubectl run curl-metrics --rm -it --restart=Never \
  --image=curlimages/curl:7.87.0 -n olmv1-system -- /bin/sh

Call the metrics:

$ curl -v -k -H "Authorization: Bearer $TOKEN" \
> https://operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443/metrics
*   Trying 10.96.16.165:8443...
* Connected to operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local (10.96.16.165) port 8443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Client hello (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Server hello (2):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Certificate (11):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, CERT verify (15):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Finished (20):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: [NONE]
*  start date: Dec 12 20:40:15 2024 GMT
*  expire date: Mar 12 20:40:15 2025 GMT
*  issuer: CN=olmv1-ca
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multiplexing
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* h2h3 [:method: GET]
* h2h3 [:path: /metrics]
* h2h3 [:scheme: https]
* h2h3 [:authority: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443]
* h2h3 [user-agent: curl/7.87.0-DEV]
* h2h3 [accept: */*]
* h2h3 [authorization: Bearer TOKEN]
* Using Stream ID: 1 (easy handle 0xffff82fa3aa0)
> GET /metrics HTTP/2
> Host: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443
> user-agent: curl/7.87.0-DEV
> accept: */*
> authorization: Bearer TOKEN 
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 200 
< content-type: text/plain; version=0.0.4; charset=utf-8; escaping=values
< date: Thu, 12 Dec 2024 20:43:10 GMT
< 
# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter
certwatcher_read_certificate_errors_total 0
# HELP certwatcher_read_certificate_total Total number of certificate reads
# TYPE certwatcher_read_certificate_total counter
certwatcher_read_certificate_total 1
# HELP controller_runtime_active_workers Number of currently used workers per controller
# TYPE controller_runtime_active_workers gauge
....
workqueue_work_duration_seconds_count{controller="clusterextension",name="clusterextension"} 0
* Connection #0 to host operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local left intact
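
For readers who prefer to script this check, the following is a minimal Go sketch of the same request: a GET on `/metrics` with an `Authorization: Bearer` header and TLS verification disabled, which mirrors `curl -k`. The names `fetchMetrics`, `insecureClient`, and `demo`, the token value `example-token`, and the local `httptest` server are all assumptions made so that the sketch runs without a cluster; they are not part of this PR.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchMetrics sends GET <url> with a bearer token and returns the
// HTTP status code and the response body.
func fetchMetrics(client *http.Client, url, token string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

// insecureClient skips server certificate verification, like curl -k.
// Suitable for a quick local check only.
func insecureClient() *http.Client {
	return &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	}}
}

// demo starts a local TLS server that stands in for the metrics service
// (hypothetical: it accepts exactly one hard-coded token) and queries it.
func demo(token string) (int, string, error) {
	const accepted = "example-token" // assumption for illustration
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+accepted {
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprintln(w, "certwatcher_read_certificate_total 1")
	}))
	defer srv.Close()
	return fetchMetrics(insecureClient(), srv.URL+"/metrics", token)
}

func main() {
	status, body, err := demo("example-token")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d %s", status, body)
}
```

Against a real cluster, the URL would be the metrics service address used in the curl call above and the token would come from `kubectl create token`.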

Now, to validate the call with the certificates

Create the Pod with the certificate secret mounted

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: curl-metrics
  namespace: olmv1-system
spec:
  serviceAccountName: operator-controller-controller-manager
  containers:
  - name: curl
    image: curlimages/curl:7.87.0
    command:
    - sh
    - -c
    - sleep 3600
    volumeMounts:
    - mountPath: /tmp/cert
      name: olm-cert
      readOnly: true
  volumes:
  - name: olm-cert
    secret:
      secretName: olmv1-cert
  restartPolicy: Never
EOF
$ kubectl exec -it curl-metrics -n olmv1-system -- sh

$ curl -v --cacert /tmp/cert/ca.crt --cert /tmp/cert/tls.crt --key /tmp/cert/tls.key \
> -H "Authorization: Bearer $TOKEN" \
> https://operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443/metrics
*   Trying 10.96.16.165:8443...
* Connected to operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local (10.96.16.165) port 8443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /tmp/cert/ca.crt
*  CApath: none
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Client hello (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Server hello (2):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Certificate (11):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, CERT verify (15):
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Finished (20):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* [CONN-0-0][CF-SSL] TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: [NONE]
*  start date: Dec 12 20:40:15 2024 GMT
*  expire date: Mar 12 20:40:15 2025 GMT
*  subjectAltName: host "operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local" matched cert's "operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local"
*  issuer: CN=olmv1-ca
*  SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* h2h3 [:method: GET]
* h2h3 [:path: /metrics]
* h2h3 [:scheme: https]
* h2h3 [:authority: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443]
* h2h3 [user-agent: curl/7.87.0-DEV]
* h2h3 [accept: */*]
* h2h3 [authorization: Bearer TOKEN]
* Using Stream ID: 1 (easy handle 0xffffb0cf9aa0)
> GET /metrics HTTP/2
> Host: operator-controller-controller-manager-metrics-service.olmv1-system.svc.cluster.local:8443
> user-agent: curl/7.87.0-DEV
> accept: */*
> authorization: Bearer TOKEN
> 
* [CONN-0-0][CF-SSL] TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 200 
< content-type: text/plain; version=0.0.4; charset=utf-8; escaping=values
< date: Thu, 12 Dec 2024 20:46:29 GMT
< 
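
The difference between the two curl runs is whether the client trusts the issuing CA: without `--cacert` the verification fails and `-k` is needed, while with it the log shows "SSL certificate verify ok". Below is a minimal Go sketch of that CA-trust step only (the `--cert`/`--key` client certificate is left out). The helper names `clientWithCA`, `getStatus`, and `demo`, the token value, and the local `httptest` server are assumptions for illustration; in the Pod above, the CA bundle would be the contents of `/tmp/cert/ca.crt`.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// clientWithCA returns an HTTP client that trusts only the given
// PEM-encoded CA bundle, similar to curl --cacert.
func clientWithCA(caPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no certificates found in CA bundle")
	}
	return &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: pool},
	}}, nil
}

// getStatus sends an authenticated GET and returns the status code.
func getStatus(client *http.Client, url, token string) (int, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// demo queries a local stand-in TLS server (hypothetical), either
// trusting its certificate or trusting nothing at all.
func demo(trustCA bool) (int, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics placeholder")
	}))
	defer srv.Close()

	// Default: an empty, non-nil pool, so no issuer is trusted.
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: x509.NewCertPool()},
	}}
	if trustCA {
		// Stand-in for reading ca.crt from the mounted secret.
		caPEM := pem.EncodeToMemory(&pem.Block{
			Type:  "CERTIFICATE",
			Bytes: srv.Certificate().Raw,
		})
		var err error
		if client, err = clientWithCA(caPEM); err != nil {
			return 0, err
		}
	}
	return getStatus(client, srv.URL+"/metrics", "example-token")
}

func main() {
	status, err := demo(true)
	fmt.Println("with CA:", status, err)
	_, err = demo(false)
	fmt.Println("without CA, request failed:", err != nil)
}
```

If the server also required a client certificate, `tls.LoadX509KeyPair` could supply one through the `Certificates` field of `tls.Config`; whether that is needed here is not stated in this PR.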

@camilamacedo86 camilamacedo86 requested a review from a team as a code owner November 17, 2024 10:19

netlify bot commented Nov 17, 2024

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit 71dd44e
🔍 Latest deploy log https://app.netlify.com/sites/olmv1/deploys/675cad787fce8b00086df8da
😎 Deploy Preview https://deploy-preview-1475--olmv1.netlify.app

@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 3 times, most recently from 4349f0e to 48dc64a Compare November 17, 2024 10:22

codecov bot commented Nov 17, 2024

Codecov Report

Attention: Patch coverage is 73.58491% with 14 lines in your changes missing coverage. Please review.

Project coverage is 74.62%. Comparing base (61fe142) to head (71dd44e).

Files with missing lines Patch % Lines
cmd/manager/main.go 73.58% 11 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1475      +/-   ##
==========================================
- Coverage   74.68%   74.62%   -0.07%     
==========================================
  Files          42       42              
  Lines        3271     3322      +51     
==========================================
+ Hits         2443     2479      +36     
- Misses        652      664      +12     
- Partials      176      179       +3     
Flag Coverage Δ
e2e 52.46% <73.58%> (+0.31%) ⬆️
unit 57.07% <0.00%> (-0.93%) ⬇️


cmd/manager/main.go (outdated review thread, resolved)
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 48dc64a to 3487ab3 Compare November 18, 2024 12:02
cmd/manager/main.go (outdated review thread, resolved)
@joelanford
Member

/hold

This doesn't seem like a change that should be made after we've released 1.0.0-rc1. Let's wait until after we cut 1.0.0.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 18, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 3487ab3 to a7ace73 Compare November 18, 2024 13:35
@camilamacedo86

This comment was marked as resolved.

@joelanford
Member

Let's prioritize a design for this. I do not want to add any risk to our 1.0.0 in the final week after we've already released an RC.

@camilamacedo86
Contributor Author

/hold

@camilamacedo86 camilamacedo86 changed the title ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization (WIP) - POC - RFC ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization Nov 19, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 19, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 938fb19 to 7e22db8 Compare November 19, 2024 19:06
@camilamacedo86 camilamacedo86 changed the title (WIP) - POC - RFC ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization (WIP) - ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization Nov 19, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 4 times, most recently from aaf8ce9 to e3b9df9 Compare November 19, 2024 19:38
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 3 times, most recently from f0dc04c to 954a65e Compare December 7, 2024 04:00
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 954a65e to 8c791f8 Compare December 7, 2024 04:29
cmd/manager/main.go (outdated review thread, resolved)
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 3 times, most recently from eb63ab7 to 930df24 Compare December 12, 2024 20:47
@camilamacedo86 camilamacedo86 changed the title WIP ⚠️ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization ✨ Replace kube-rbac-proxy with controller-runtime metrics authentication/authorization Dec 12, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 12, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch 2 times, most recently from d75a97d to 9a52766 Compare December 12, 2024 21:05
tmshort previously approved these changes Dec 13, 2024
Contributor

@tmshort tmshort left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 13, 2024
@camilamacedo86 camilamacedo86 force-pushed the replace-rbac-protect-metrics branch from 9a52766 to 5342759 Compare December 13, 2024 20:12
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 13, 2024

openshift-ci bot commented Dec 13, 2024

New changes are detected. LGTM label has been removed.

@camilamacedo86
Contributor Author

Rebased with the tests to ensure backwards compatibility: #1522

@camilamacedo86
Contributor Author

Hi @joelanford

Can we unhold this one now?

…ontroller-runtime feature

Utilise Controller-Runtime's WithAuthenticationAndAuthorization feature to protect the metrics endpoint. This approach provides access control similar to that of kube-rbac-proxy. The kube-rbac-proxy image from gcr.io/kubebuilder/kube-rbac-proxy is deprecated and should no longer be used.

More info: kubernetes-sigs/kubebuilder#3907
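
To make the idea of an in-process filter concrete, here is a conceptual Go sketch using only the standard library. It is not controller-runtime's implementation and not the code in this PR: it only shows the general shape of a handler wrapper that rejects requests before they reach `/metrics`. As far as I understand, the real filter authenticates via TokenReview and authorizes via SubjectAccessReview against the kube-apiserver; the `checker` type, the `exampleChecker` function, and both token values below are hypothetical stand-ins for those calls.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// decision is the outcome of checking a bearer token.
type decision int

const (
	allowed         decision = iota // known identity, may read metrics
	unauthenticated                 // token not recognised
	forbidden                       // known identity, lacks permission
)

// checker is a hypothetical stand-in for the authentication and
// authorization calls a real filter would make to the API server.
type checker func(token string) decision

// protect wraps next so that only requests carrying an allowed
// bearer token reach it.
func protect(check checker, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		if !strings.HasPrefix(header, "Bearer ") {
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}
		switch check(strings.TrimPrefix(header, "Bearer ")) {
		case allowed:
			next.ServeHTTP(w, r)
		case forbidden:
			http.Error(w, "Forbidden", http.StatusForbidden)
		default:
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
		}
	})
}

// exampleChecker hard-codes two tokens purely for illustration.
func exampleChecker(token string) decision {
	switch token {
	case "reader-token":
		return allowed
	case "other-token":
		return forbidden
	default:
		return unauthenticated
	}
}

// statusFor runs one in-memory request through the protected handler
// and returns the resulting status code.
func statusFor(authHeader string) int {
	metrics := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics placeholder")
	})
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	protect(exampleChecker, metrics).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	for _, h := range []string{"", "Bearer reader-token", "Bearer other-token"} {
		fmt.Printf("%q -> %d\n", h, statusFor(h))
	}
}
```

The design point this illustrates is that the check runs inside the manager process itself, so no separate proxy container sits in front of the metrics port.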
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

6 participants