-
Notifications
You must be signed in to change notification settings - Fork 37
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add the design for expanding hub templates access
Relates: https://issues.redhat.com/browse/ACM-12129 Signed-off-by: mprahl <[email protected]>
- Loading branch information
Showing
2 changed files
with
320 additions
and
0 deletions.
There are no files selected for viewing
310 changes: 310 additions & 0 deletions
310
enhancements/sig-policy/122-expanded-hub-templates-access/README.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,310 @@ | ||
# Long-term Compliance History of Policies | ||
|
||
## Release Signoff Checklist | ||
|
||
- [ ] Enhancement is `implementable` | ||
- [ ] Design details are appropriately documented from clear requirements | ||
- [ ] Test plan is defined | ||
- [ ] Graduation criteria for dev preview, tech preview, GA | ||
- [ ] User-facing documentation is created in | ||
[website](https://github.com/open-cluster-management-io/open-cluster-management-io.github.io/) | ||
|
||
## Summary | ||
|
||
Open Cluster Management policy hub templates are limited to referencing objects in the same namespace as the root | ||
policy. The proposal is to add an additional field of `spec.hubTemplateOptions.serviceAccountName` that must reference a | ||
service account in the same namespace as the root policy. The hub templates would execute using that service account. | ||
The user is responsible for providing `list` and `watch` permissions on the objects that the hub templates reference for | ||
that service account. | ||
|
||
## Motivation | ||
|
||
An Open Cluster Management policy can contain hub templates that get executed as replicated policies are created in each | ||
managed cluster namespace on the hub. To prevent privilege escalation and data leakage, there is a current restriction | ||
that only the `ManagedCluster` object and objects in the same namespace as the root policy can be referenced. This is | ||
limiting in situations such as copying a `Secret` from another namespace to the managed cluster. The current workaround | ||
is to have a policy applied to `local-cluster` (Hub) that copies the `Secret` to the root policy namespace. Workarounds | ||
for cluster-scoped objects or objects that have side effects are more complicated. | ||
|
||
### Goals | ||
|
||
1. A user can specify a service account for hub templates to access objects outside of the root policy namespace. | ||
1. The current event-driven nature of hub templates is retained with a custom service account. | ||
1. Tokens for the service account have an expiration of 2 hours. | ||
|
||
### Non-Goals | ||
|
||
1. Allowing service accounts outside of the root policy namespace. | ||
|
||
## Proposal | ||
|
||
### Design | ||
|
||
Define a new field at `spec.hubTemplateOptions.serviceAccountName` to specify a service account that exists in the root | ||
policy namespace. When specified, the service account is used instead of the service account of the Governance Policy | ||
Propagator for hub template execution. Additionally, no namespace or cluster-scoped restrictions will be enforced by the | ||
Governance Policy Propagator, and instead, it will rely on the Kubernetes API performing authorization checks of the | ||
service account. Note that due to the event-driven nature of hub templates, the service account must have `list` and | ||
`watch` permissions for the objects that the hub templates reference. | ||
|
||
Note that for backwards compatibility, if the namespace is not specified in a hub template object lookup, the root | ||
policy namespace will be the default, however, the service account must still have the appropriate permissions in the | ||
root policy namespace to perform the lookup. | ||
|
||
See an example of such a policy below: | ||
|
||
```yaml | ||
apiVersion: policy.open-cluster-management.io/v1 | ||
kind: Policy | ||
metadata: | ||
name: test-config | ||
namespace: test-policies | ||
spec: | ||
disabled: false | ||
remediationAction: enforce | ||
hubTemplateOptions: | ||
serviceAccountName: testsa | ||
policy-templates: | ||
- objectDefinition: | ||
apiVersion: policy.open-cluster-management.io/v1 | ||
kind: ConfigurationPolicy | ||
metadata: | ||
name: test-config | ||
spec: | ||
remediationAction: enforce | ||
severity: high | ||
object-templates: | ||
- complianceType: musthave | ||
objectDefinition: | ||
apiVersion: v1 | ||
data: | ||
field1: '{{hub fromConfigMap "some-other-namespace" "test-configmap" "test-field1" hub}}' | ||
field2: '{{hub fromConfigMap "some-other-namespace" "test-configmap" "test-field1" hub}}' | ||
kind: ConfigMap | ||
metadata: | ||
name: my-configmap | ||
namespace: test | ||
``` | ||
### User Stories | ||
#### Story 1 | ||
As a policy user, I would like to specify hub templates that reference objects outside of the root policy namespace so | ||
that I can stop defining additional policies on `local-cluster` to sync the required information to the root policy | ||
namespace. | ||
|
||
### Implementation Details/Notes/Constraints | ||
|
||
Although the user experience is rather simple, there's complexity in the implementation. One aspect is that the number | ||
of service accounts is not known at startup and can constantly change. It is also a bit complex to prevent duplication | ||
of Kubernetes API watches, underlying TCP connections, and tokens if multiple policies reference the same service | ||
account. | ||
|
||
Currently, the Governance Policy Propagator only uses a single `TemplateResolver` client from the | ||
[go-template-utils](https://github.com/stolostron/go-template-utils) library and sets restrictions based on the root | ||
policy namespace. That will still remain the default, however, we'll need exactly one `TemplateResolver` client per | ||
service account. To do this, I recommend having a new `sync.Map` field named `TemplateResolvers` on the | ||
`ReplicatedPolicyReconciler` struct and each key would be a `NamespacedName` from `k8s.io/apimachinery/pkg/types` | ||
representing the service account. Each value would be an instance of the following struct: | ||
|
||
```go | ||
type templateResolverWithCancel struct { | ||
lock *sync.RWMutex | ||
cancel context.CancelFunc | ||
resolver *templates.TemplateResolver | ||
dynamicWatcher k8sdepwatches.DynamicWatcher | ||
} | ||
``` | ||
|
||
When the replicated policy controller with the Governance Policy Propagator encounters a | ||
`spec.hubTemplateOptions.serviceAccountName` field, it'll perform an operation similar to this to either create the | ||
`TemplateResolver` if it's the first reconcile for that `serviceAccountName` or to use the preexisting one: | ||
|
||
```go | ||
var templateResolver *templates.TemplateResolver | ||
nsName := types.NamespacedName{ | ||
Namespace: rootPlc.Namespace, Name: replicatedPlc.Spec.HubTemplateOptions.ServiceAccountName, | ||
} | ||
resolver := &templateResolverWithCancel{ | ||
lock: &sync.RWMutex{}, | ||
} | ||
resolver.lock.Lock() | ||
oldResolver, loaded := r.TemplateResolvers.LoadOrStore(nsName, resolver) | ||
if loaded { | ||
resolver.lock.Unlock() | ||
oldResolverTyped := oldResolver.(*templateResolverWithCancel) | ||
oldResolverTyped.lock.RLock() | ||
// Handle the case where resolver is getting cleaned up while the lock is held. | ||
// Just try again in this case. | ||
if oldResolverTyped.resolver == nil { | ||
oldResolverTyped.lock.RUnlock() | ||
// TODO: Write code to handle repeating the above code block | ||
} | ||
defer oldResolverTyped.lock.RUnlock() | ||
templateResolver = oldResolverTyped.resolver | ||
break | ||
} | ||
// TODO: Code to get a token and instantiate a TemplateResolver instance for the service account and | ||
// set the values on `resolver`. | ||
``` | ||
|
||
There should be a goroutine that periodically cleans up entries in `TemplateResolvers` that are for service accounts no | ||
longer referenced by any root policy. This is simpler than managing a reference count for each service account. | ||
|
||
Another complication is that when the policy's service account name changes, the watches need to be removed for the | ||
replicated policy on the previous service account `TemplateResolver`. The suggestion is to have another `sync.Map` | ||
called `policyToServiceAccount` on the `ReplicatedPolicyController` struct to keep track of the service account the | ||
replicated policy had. The keys are the `ObjectIdentifier` type from the | ||
[kubernetes-dependency-watches](https://github.com/stolostron/kubernetes-dependency-watches) library representing the | ||
replicated policy. The values are `types.NamespacedName` values of the service account used. Note that for the default | ||
template resolver, we will use the special name of `<default>` with an empty namespace. In the `Reconcile` method, if | ||
the service account changes, the `DynamicWatcher` of the old service account is retrieved from `r.TemplateResolvers` and | ||
`resolver.DynamicWatcher.RemoveWatcher` is called. This should be done regardless of if hub templates are detected since | ||
the update of the service account name may have removed hub templates as well. Note that this `policyToServiceAccount` | ||
map can also be used to clean up watches when a replicated policy is deleted since the original replicated policy spec | ||
is no longer available. | ||
|
||
The other complication is the service account must be periodically refreshed before expiration. This can be handled when | ||
the token is generated using the | ||
[TokenRequest API](https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/) and | ||
launching a goroutine that replaces the token before expiration. Note that if the token is provided in a token file, | ||
client-go will periodically check for a new token in the file and use that for API requests. Taking advantage of this | ||
prevents the Governance Policy Propagator from needing to recreate the `TemplateResolver` instance when the token | ||
changes. See an example function to get a service account token that gets automatically refreshed: | ||
|
||
```go | ||
func (r *ReplicatedPolicyReconciler) getToken(ctx context.Context, namespace string, saName string) (string, error) { | ||
var expiration int64 = 7200 | ||
|
||
req := &authv1.TokenRequest{ | ||
Spec: authv1.TokenRequestSpec{ | ||
// TODO: Is this always the correct value? | ||
Audiences: []string{"https://kubernetes.default.svc.cluster.local"}, | ||
// 2 hours | ||
ExpirationSeconds: &expiration, | ||
}, | ||
} | ||
|
||
returnedTokenReq, err := r.HubClient.CoreV1().ServiceAccounts(namespace).CreateToken( | ||
ctx, saName, req, metav1.CreateOptions{}, | ||
) | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
tokenFile, err := os.CreateTemp("", fmt.Sprintf("token-%s.%s-*", namespace, saName)) | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
_, err = tokenFile.Write([]byte(returnedTokenReq.Status.Token)) | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
tokenFilePath, err := filepath.Abs(tokenFile.Name()) | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
expirationTimestamp := returnedTokenReq.Status.ExpirationTimestamp | ||
|
||
go func() { | ||
defer func() { | ||
err := os.Remove(tokenFilePath) | ||
if err != nil { | ||
log.Error(err, "Failed to clean up the service account token file", "serviceAccount", saName, "namespace", namespace, "path", tokenFilePath) | ||
} | ||
}() | ||
|
||
for { | ||
deadlineCtx, deadlineCtxCancel := context.WithDeadline(ctx, expirationTimestamp.Add(-10*time.Minute)) | ||
|
||
<-deadlineCtx.Done() | ||
if !errors.Is(deadlineCtx.Err(), context.DeadlineExceeded) { | ||
// This really does nothing but this satisfies the linter that cancel is called. | ||
deadlineCtxCancel() | ||
|
||
return | ||
} | ||
|
||
returnedTokenReq, err := r.HubClient.CoreV1().ServiceAccounts(namespace).CreateToken( | ||
ctx, saName, req, metav1.CreateOptions{}, | ||
) | ||
if err != nil { | ||
log.Error(err, "Failed to renew the token. Will retry in 5 seconds.", "serviceAccount", saName, "namespace", namespace) | ||
|
||
time.Sleep(5 * time.Second) | ||
|
||
continue | ||
} | ||
|
||
_, err = tokenFile.Write([]byte(returnedTokenReq.Status.Token)) | ||
if err != nil { | ||
log.Error(err, "Failed to write the token. Will retry in 5 seconds.", "serviceAccount", saName, "namespace", namespace, "path", tokenFilePath) | ||
|
||
time.Sleep(5 * time.Second) | ||
|
||
continue | ||
} | ||
|
||
expirationTimestamp = returnedTokenReq.Status.ExpirationTimestamp | ||
|
||
// This really does nothing but this satisfies the linter that cancel is called. | ||
deadlineCtxCancel() | ||
} | ||
}() | ||
|
||
return tokenFilePath, nil | ||
} | ||
``` | ||
|
||
### Risks and Mitigation | ||
|
||
### Open Questions [optional] | ||
|
||
None yet. | ||
|
||
### Test Plan | ||
|
||
**Note:** _Section not required until targeted at a release._ | ||
|
||
### Graduation Criteria | ||
|
||
N/A | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
Downgrades would cause the `spec.serviceAccountName` field to be removed and fall back to the default service account. | ||
|
||
### Version Skew Strategy | ||
|
||
## Implementation History | ||
|
||
N/A | ||
|
||
## Drawbacks | ||
|
||
- Complex implementation | ||
|
||
## Alternatives | ||
|
||
One could use `SubjectAccessReview` to see if the service account has access and leverage the existing default | ||
`TemplateResolver` but this is expensive on the API server, it would not take into account group membership, and it | ||
would not be responsive to RBAC changes on the service account. | ||
|
||
## Infrastructure Needed [optional] | ||
|
||
N/A |
10 changes: 10 additions & 0 deletions
10
enhancements/sig-policy/122-expanded-hub-templates-access/metadata.yaml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
title: expanded-hub-templates-access | ||
authors: | ||
- "@mprahl" | ||
reviewers: | ||
- TBD | ||
approvers: | ||
- TBD | ||
creation-date: 2024-07-03 | ||
last-updated: 2024-07-03 | ||
status: implementable |