OCPBUGS-29729: Refactor security context configuration in pod reconciler #3185
Conversation
```go
} else {
	// If GrpcPodConfig is nil, apply 'restricted' security settings by default
	addSecurityContext(pod, runAsUser) // Default to restricted settings
```
I mentioned this elsewhere, but the implication here is that we will break existing catalog sources that reference catalog images that are not compatible with the securityContext required by the PSA restricted mode.
In general, are these implications well understood and gamed out? Any chance we have a design doc about this change of the default behavior?
I think IBM had suggested a change where if the securityContextConfig is unset, we look at the PSA configuration of the namespace. If the namespace has enforce: restricted, then we default to restricted. Otherwise, we keep the default as legacy.
That seems like it would resolve the bug without breaking existing CatalogSources that are running correctly in non-restricted namespaces.
OK, that approach makes sense. I will rework this fix to check the namespace PSA config first.
So if we want to check the namespace for its PSA config, we'd make a change something like this, with new calling code in the `Pod()` function and the new PSA-checking function below. But we need the namespace info, and we don't currently have the wherewithal in `Pod()`, specifically an operatorClient. `Pod()` is called in many places in the codebase, so I am trying not to modify its signature. We could add the needed building blocks globally, or is there some better way with controller-runtime that I am missing?
```go
// Define default security context config
securityContextConfig := operatorsv1alpha1.Legacy

// Determine the security context configuration based on GrpcPodConfig
if source.Spec.GrpcPodConfig != nil {
	if source.Spec.GrpcPodConfig.SecurityContextConfig != "" {
		securityContextConfig = source.Spec.GrpcPodConfig.SecurityContextConfig
	} else {
		// If SecurityContextConfig is unset, check the PSA configuration of the namespace
		var err error
		securityContextConfig, err = getNamespacePSAConfig(source.Namespace, opClient)
		if err != nil {
			return nil, fmt.Errorf("failed to get namespace PSA configuration: %v", err)
		}
	}
} else {
	// If GrpcPodConfig is nil, apply legacy security settings by default
	securityContextConfig = operatorsv1alpha1.Legacy
}

// Apply the determined security settings to the pod
if securityContextConfig == operatorsv1alpha1.Restricted {
	// Apply 'restricted' security settings
	addSecurityContext(pod, runAsUser) // Predefined function that sets a restricted security context on the pod
} else {
	// For 'Legacy' or any other case, clear all security contexts
	clearAllSecurityContexts(pod) // Predefined function that removes all security contexts from the pod
}
```
```go
// getNamespacePSAConfig checks the namespace for the PSA configuration and returns the applicable security config.
func getNamespacePSAConfig(namespace string, client operatorclient.ClientInterface) (operatorsv1alpha1.SecurityConfig, error) {
	ns, err := client.KubernetesInterface().CoreV1().Namespaces().Get(context.TODO(), namespace, metav1.GetOptions{})
	if err != nil {
		return "", fmt.Errorf("error fetching namespace: %v", err)
	}
	// 'pod-security.kubernetes.io/enforce' is the label used for enforcing namespace-level security,
	// and 'restricted' is the value indicating a restricted security policy.
	if val, exists := ns.Labels["pod-security.kubernetes.io/enforce"]; exists && val == "restricted" {
		return operatorsv1alpha1.Restricted, nil
	}
	return operatorsv1alpha1.Legacy, nil
}
```
Are we still waiting for an update on this?
Yes @tmshort, do you have any ideas for the best way to get a client available at that scope so we can fetch the namespace to check?
I would definitely avoid making a client call inside the `Pod()` function, especially since we call it so many times.
Steve ran into issues where the resolver would use clients to fetch data from the cache and/or apiserver. The result was really strange race conditions because the cache and apiserver can change during the course of a single resolution.
This sounds like it could run into similar problems. I think we should find a way to call `getNamespacePSAConfig` just once during a reconciliation and then use its result anywhere it is needed.
The last commit (f19d68a) adds checking of the namespace PSA restrictions. Currently, I am at about 8 files changed with many changes in each file (git diff is 871 lines). I am close, but still have some failing unit tests.
My basic approach was looking at the call graph and trying to minimize changes. Given that, I am trying to change the two wrapper `Pod()` functions in `configmap.go` and `grpc.go`. These locations already have access to the client and seem like a good place to fetch the namespace object. We check whether it's already cached, to meet the call-only-once-per-reconcile-loop goal.
I guess my question is: does this seem worth continuing to pursue for this bug fix, hammering out the remaining unit test failures? Or should we try some other approach?
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Closing in favor of: #3206
Description of the change:
This change updates the logic for setting security contexts within the OLM pod reconciler. Now, it differentiates between 'Restricted' and 'Legacy' security contexts more explicitly. The 'Restricted' security context applies default security settings unless overridden, while the 'Legacy' context clears all security settings. When no security context is configured, it defaults to restricted. Additionally, the related tests have been updated to reflect these changes and ensure correct behavior.
Motivation for the change:
OCPBUGS-29729
Architectural changes:
None
Testing remarks: