Add the design for expanding hub templates access #122

Conversation

@mprahl (Member) commented Jul 3, 2024

@openshift-ci bot requested review from deads2k and dhaiducek on July 3, 2024
@openshift-ci bot added the approved label on Jul 3, 2024
@mprahl force-pushed the design-expanded-hub-templates-access branch from 1eea630 to c4d8f07 on July 3, 2024
@mprahl force-pushed the design-expanded-hub-templates-access branch from c4d8f07 to 2938ebe on July 3, 2024
@mprahl (Member Author) commented Jul 3, 2024

map can also be used to clean up watches when a replicated policy is deleted since the original replicated policy spec
is no longer available.

The other complication is the service account must be periodically refreshed before expiration. This can be handled when

Reviewer:

Correct me if I am wrong: based on my understanding, if 100 service accounts are saved, then 100 goroutines run to watch for token expiration?

Member Author (@mprahl):

Yes, if there are 100 different serviceAccountName values, then there'd be 100 goroutines. It should be quite cheap since each one essentially sleeps until near the expiration, refreshes the token, and then sleeps again.

Reviewer:

Oh, so the token expiration is 2 hours, and it just sleeps during those 2 hours.
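To make the refresh mechanism described above concrete, here is a minimal Go sketch of such a loop. The `refreshToken` callback, the retry delay, and the five-minute buffer before expiration are illustrative assumptions rather than anything specified in the design:

```go
package tokenrefresh

import (
	"context"
	"time"
)

// tokenRefresher keeps one service account token fresh until ctx is cancelled.
// refreshToken is a hypothetical stand-in for the TokenRequest API call; it
// returns the new token's expiration time.
func tokenRefresher(ctx context.Context, refreshToken func(context.Context) (time.Time, error)) {
	for {
		expiration, err := refreshToken(ctx)
		if err != nil {
			// On failure, retry after a short delay instead of giving up.
			expiration = time.Now().Add(30 * time.Second)
		}

		// Sleep until shortly before the token expires so it never lapses.
		wait := time.Until(expiration) - 5*time.Minute
		if wait < time.Second {
			wait = time.Second
		}

		select {
		case <-ctx.Done():
			return
		case <-time.After(wait):
		}
	}
}
```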


### Implementation Details/Notes/Constraints

Although the user experience is rather simple, there's complexity in the implementation. One aspect is that the number

Reviewer:

Are you going to share this in a tech share? Sorry, it's hard for me to understand 100%; it's quite complex.

Member Author (@mprahl):

Sure, I can also do that during the daily scrum next week. If you have specific questions, I can answer those.


openshift-ci bot commented Jul 5, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mprahl, yiraeChristineKim

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mprahl (Member Author) commented Jul 5, 2024

/hold for other reviews

@openshift-merge-bot bot merged commit 1e552da into open-cluster-management-io:main on Jul 5, 2024 (2 checks passed)
Comment on lines +107 to +108
service account. To do this, I recommend having a new `sync.Map` field named `TemplateResolvers` on the
`ReplicatedPolicyReconciler` struct and each key would be a `NamespacedName` from `k8s.io/apimachinery/pkg/types`
Member:

I think we have used this pattern of a sync.Map of things with individual mutexes before, but it feels a little bit clunky to me. I wonder if any other teams have patterns they use for things like this, just to compare.

Member Author (@mprahl):

We do use this pattern and I agree it's a bit clunky, but the sync.Map is required due to the dynamic nature of the keys since they can change at runtime and come and go. The alternative is a global lock on a normal map, but sync.Map is optimized for many reads and few writes:
https://pkg.go.dev/sync#Map

One thing we could try is sync.OnceValue instead of a lock in the struct alongside the sync.Map, but it uses locks under the hood, so we may be better off managing the lock ourselves for efficiency.
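As a rough illustration of the pattern being discussed (not the actual implementation), the double-checked lookup over a `sync.Map` keyed by `NamespacedName` could look like the following; `templateResolver` and `getResolver` are hypothetical placeholders:

```go
package templates

import (
	"sync"

	"k8s.io/apimachinery/pkg/types"
)

// templateResolver is a placeholder for the per-service-account state (token,
// watches, and so on).
type templateResolver struct{}

type ReplicatedPolicyReconciler struct {
	// TemplateResolvers maps a service account's NamespacedName to its resolver.
	TemplateResolvers sync.Map
	// lock serializes resolver creation; reads go through the sync.Map directly.
	lock sync.Mutex
}

// getResolver returns the resolver for the service account, creating it on first use.
func (r *ReplicatedPolicyReconciler) getResolver(saName types.NamespacedName) *templateResolver {
	// Fast path: the resolver already exists, so no lock is needed.
	if existing, ok := r.TemplateResolvers.Load(saName); ok {
		return existing.(*templateResolver)
	}

	// Slow path: take the lock so only one goroutine creates the resolver,
	// then re-check in case another goroutine created it first.
	r.lock.Lock()
	defer r.lock.Unlock()

	if existing, ok := r.TemplateResolvers.Load(saName); ok {
		return existing.(*templateResolver)
	}

	resolver := &templateResolver{}
	r.TemplateResolvers.Store(saName, resolver)

	return resolver
}
```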

Comment on lines +163 to +164
There should be a goroutine that periodically cleans up entries in `TemplateResolvers` that are for service accounts no
longer referenced by any root policy. This is simpler than managing a reference count for each service account.
Member:

Maybe, since there will be a goroutine to clean things up anyway, that goroutine should just fully manage all of the entries, and we could use channels to synchronize work. Just a thought; I'm not sure exactly what it would look like.

Member Author (@mprahl):

It's an interesting idea, but I think it'd be more complex than what is suggested in the design since you'd need a channel to request the template resolver for the service account and then a channel to receive it. Each reconcile call would somehow need to pick the correct channel to listen on to get the correct template resolver. I think a sync.Map plus a sync.RWMutex to synchronize creation while still allowing concurrent reads is easier to understand and likely more efficient.

Let me know if you had something else in mind than what I pictured!
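Building on the `ReplicatedPolicyReconciler` sketch above (and assuming `context` and `time` are added to its imports), the periodic cleanup described in the design could be a simple ticker-driven loop; the `isReferenced` helper and the ten-minute interval are hypothetical:

```go
// cleanupLoop periodically removes TemplateResolvers entries whose service
// account is no longer referenced by any root policy.
func (r *ReplicatedPolicyReconciler) cleanupLoop(
	ctx context.Context, isReferenced func(types.NamespacedName) bool,
) {
	ticker := time.NewTicker(10 * time.Minute)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}

		r.TemplateResolvers.Range(func(key, _ any) bool {
			saName := key.(types.NamespacedName)
			if !isReferenced(saName) {
				// No root policy references this service account anymore, so drop
				// its resolver (the real implementation would also stop its token
				// refresher and watches).
				r.TemplateResolvers.Delete(saName)
			}

			return true // keep iterating over the remaining entries
		})
	}
}
```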

`watch` permissions for the objects that the hub templates reference.

Note that for backwards compatibility, if the namespace is not specified in a hub template object lookup, the root
policy namespace will be the default, however, the service account must still have the appropriate permissions in the
@gparvin (Member) commented Jul 9, 2024:

So in the example, the following function:

                  field3: '{{hub fromConfigMap "" "test-configmap" "test-field1" hub}}'

would default to the test-policies namespace and would require the service account's roles to include list and watch permissions for ConfigMaps in the test-policies namespace (even though it's the root policy namespace, where no special roles would be needed without the serviceAccountName definition).

Member Author (@mprahl):

Yes, that's right.
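For illustration, the kind of Role described in this thread could be expressed with the `k8s.io/api/rbac/v1` types as follows; the Role name is made up, and only the list and watch verbs mentioned above are included:

```go
package rbacexample

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// testPoliciesConfigMapRole sketches the Role the referenced service account
// would need in the test-policies namespace for the fromConfigMap example above.
var testPoliciesConfigMapRole = rbacv1.Role{
	ObjectMeta: metav1.ObjectMeta{
		Name:      "hub-template-configmap-reader", // hypothetical name
		Namespace: "test-policies",
	},
	Rules: []rbacv1.PolicyRule{
		{
			APIGroups: []string{""}, // core API group, where ConfigMaps live
			Resources: []string{"configmaps"},
			Verbs:     []string{"list", "watch"},
		},
	},
}
```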


### Risks and Mitigation
Member:

There could be some cases where customers have policies deployed to namespaces that already have service accounts, and they could take advantage of these permissions to gain some extra privileges they didn't have before. The best practice of not mixing policies with workloads would be the mitigation.


## Drawbacks

- Complex implementation
Member:

Requires management of an additional service account by the policy author. This makes policy development a little more complex, but in exchange provides the benefit of not needing to copy resources to the policy-authoring namespace.
