
Fine-grained RBAC for Argo Server #6490

Open · alexec opened this issue Aug 4, 2021 · 55 comments

Comments

@alexec
Contributor

alexec commented Aug 4, 2021

Summary

The current model of RBAC leans heavily on Kubernetes RBAC, so it is easy to make secure, but it may not scale well.

Consider the situation where you have thousands of teams and thousands of namespaces. At the very least you may need an OIDC group for each namespace, and then a service account, a role, and a role binding.

It may be better to use Casbin (or similar) to provide a more flexible way to configure this.

Use Cases

When would you use this?


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@alexec added the type/feature (Feature request) label Aug 4, 2021
@andrewm-aero

I would like to add to this. In our setup, each "team" is given a namespace, and each "role" within that team (developer, admin, etc.) in our SSO is mapped onto a k8s service account, and given a role and rolebinding to match. The issue arises when a user belongs to more than one team: there's no way to give them multiple service accounts, so the only options are to:

  1. Generate one service account per user (defeating the purpose of SSO)
  2. Generate one (service account,role,rolebinding triple) per possible combination of SSO roles (combinatorial explosion)
  3. Abandon SSO
  4. Disallow users from belonging to more than one team/role

I hope you'll agree none of these are acceptable outcomes. We agree that using k8s-native RBAC makes a lot of sense for a k8s-native app, so here are some potential fixes we've thought up:

First, consider adding an annotation to service accounts that, if present, indicates that all service accounts matching the criteria are selected, instead of just the first according to priority. From there, each SA would be tried, in order of priority. If a "permission denied"-type error is received, the next would be tried. If the list is exhausted, the request succeeds, or a non-"permission denied"-type error is received, the current behavior would take place. While this would multiply the time each request takes by the number of matched SAs, I think it would allow any arbitrary combination of roles (assuming one role per SA) across namespaces to be assigned to any given user.

Another possibility would be to add an annotation to service accounts indicating that the service account should only be considered if the request is for one of a set of namespaces. The server would keep a cache of matched service accounts for each user, instead of just one, and select the most appropriate one for each request based on namespace and priority. While this would only allow one role (assuming one role per SA) per namespace, that seems like a much friendlier restriction than one SA per cluster, and the mapping of namespace -> SA could be done fairly efficiently.

Finally, the server could dynamically generate SAs on a per-session basis based on OIDC claims, and instead collect a set of (cluster)roles to bind to those generated SAs in the same way that SAs are currently selected (but allowing multiple roles to be selected). This might be more permissions than some admins would wish to give to argo, but it would allow a clean mapping of SSO roles/groups to k8s roles.
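
For reference, a minimal sketch of the current annotation-based selection being described (the account name and rule are illustrative):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-blue-developer
  namespace: team-blue
  annotations:
    # expression evaluated against the user's OIDC claims
    workflows.argoproj.io/rbac-rule: "'team-blue-developers' in groups"
    # when several service accounts match, the highest precedence wins
    workflows.argoproj.io/rbac-rule-precedence: "1"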

@HouseoLogy

HouseoLogy commented Aug 31, 2021

Dear @alexec,

first of all, thanks a lot for the great piece of software you and the community are providing.
We started using it for some simple Workflows and we're already looking into sophisticated ones with dozens of steps/tasks and dependencies.
The Kubernetes-native nature is just the cherry on top of its capabilities. GREAT WORK!

So, I've now given credit to the Ferrari under the hood you've built.
Nevertheless, to roll it out completely in our production environments, we're missing some UI RBAC capabilities/controls.
As is usual in the financial industry and other heavily regulated companies, there are audit and compliance rules to follow.

With respect to segregation of duties and separate ownership of different Workflows/WorkflowTemplates, we need to cover the following requirements:

  • The UI should have a read-only mode
  • Kubernetes objects like Workflows, WorkflowTemplates, CronWorkflows and so on should be immutable in the UI (only parameters and metadata --> no YAML edit)
  • A logged-in user should only see the Workflows they are allowed to list (even in a cluster install)
  • It should be possible to segregate between admins, operators, and viewers

ALL ROADS LEAD TO ROME. There are multiple ways to incorporate these capabilities into Argo Workflows. Here are two suggestions:

Option 1. Predefine Permission Matrix for UI Capabilities

As an example, three predefined permission groups:

  • Admin
  • Operator
  • Read-Only

Admin: Can fully use the current UI and its capabilities, like submitting new Workflows, editing JSON/YAML, uploading files, deleting, terminating, and so on.
Operator: Can submit, resubmit, terminate, and delete workflows, but no creation of custom resources like workflows, workflow templates, sensors, and so on. Also no editing of the "full workflow options" --> YAML read-only.
Read-Only: Can only view and list resources which the logged-in user is allowed to via their Kubernetes RBAC settings.

These UI permission groups could be assigned via annotations directly on the service accounts. If none is assigned, read-only could be set as the default:
e.g. workflows.argoproj.io/rbac-ui-permission-group: "admin"
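
A minimal sketch of a service account carrying this proposed annotation (the annotation key is the one suggested above, not an existing one):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-blue-operator
  namespace: argo
  annotations:
    # hypothetical annotation from Option 1; if absent, read-only is the default
    workflows.argoproj.io/rbac-ui-permission-group: "operator"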

Pros:

  • relatively easy to implement and roll out

Cons:

  • still no possibility of fine-grained RBAC settings on different resources
  • no multi-namespace configuration possible, because a user is always mapped to one service account

Option 2. Use a similar RBAC configuration approach to Argo CD (probably what's intended in your issue description)

Definition of a ConfigMap with additional, fine-grained RBAC roles defined. A default role:readonly could also be applied.

RBAC Permission Structure

Permission policy definition should be feasible for all resources or namespace-scoped (see below):
p, <role/user/group>, <resource>, <action>, <object>, <effect>

Namespace-scoped permission policy definition:
p, <role/user/group>, <resource>, <action>, <namespace>/<object>, <effect>

RBAC Resources and Actions

Ideally the other resources like events and sensors should also be covered, because they're also handled in the UI even if the CRDs are not in argo-workflows itself:
clusterworkflowtemplates, cronworkflows, eventbus, eventsources, sensors, workfloweventbindings, workflows, workflowtemplates

Actions: get, create, delete, terminate, submit, edit

get = get/list resources
create = create resources
delete = delete resources
terminate = terminate a running resource
submit = submit a workflow with possible parameter settings (no YAML edit option)
edit = possible to submit and edit YAML / workflow options

Here is an example of how such a policy could look:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argoworkflow-rbac-cm
  namespace: <yournamespace>
data:
  policy.default: role:readonly
  policy.csv: |
    p, role:workflow-admin, clusterworkflowtemplates, *, *, allow
    p, role:workflow-admin, cronworkflows, *, *, allow
    p, role:workflow-admin, eventbus, *, *, allow
    p, role:workflow-admin, eventsources, *, *, allow
    p, role:workflow-admin, sensors, *, *, allow
    p, role:workflow-admin, workfloweventbindings, *, *, allow
    p, role:workflow-admin, workflows, *, *, allow
    p, role:workflow-admin, workflowtemplates, *, *, allow
    p, role:workflow-ops, workflows, get, *, allow
    p, role:workflow-ops, workflows, delete, *, allow
    p, role:workflow-ops, workflows, submit, *, allow
    p, role:workflow-ops, workflows, terminate, *, allow
    p, role:workflow-team-blue-scoped, workflows, *, targetnamespace-blue/*, allow
    p, role:workflow-team-red-scoped, workflows, *, targetnamespace-red/*, allow

    g, your-admin-group, role:workflow-admin
    g, your-workflow-ops-group, role:workflow-ops
    g, your-team-blue-scoped-group, role:workflow-team-blue-scoped
    g, your-team-red-scoped-group, role:workflow-team-red-scoped

Pros:

  • fine-grained permissions possible
  • user can handle resources over x namespaces

Cons:

  • implementation effort is probably higher for integrating this

Option 1 is more of a "quick fix".

Option 2 is definitely the preferred and bulletproof approach (as seen in Argo CD).

BTW, one part of the setup I don't completely understand: why is a dedicated namespaced install needed?
If you have clear K8s RBAC settings (which you obviously have) and additional groups/policies on top, you could always install it cluster-wide and let the RBAC rules and policies do their job.

@alexec
Contributor Author

alexec commented Aug 31, 2021

@HouseoLogy I don't have much to add to what you said. It is basically correct.

Today's solution is intentionally frugal. Basically, we have service accounts with annotations that select a service account for you to use. The upside is we lean on Kubernetes RBAC (less effort, especially on critical security code); the downside is that it does not scale well, ultimately requiring one service account for each user, and each user needs to have their permissions copied into that account.

There are other solutions I've not mentioned:

If the problem is operational overhead, we could consider using impersonation. Basically, the users must create a service account (typically within their own namespace). The argo-server service account could then use impersonation to become that user. This moves the work from the operator to the user, and enables the user to self-serve.
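
For illustration, the impersonation permission itself is plain Kubernetes RBAC; the argo-server service account would need something like this (a sketch; names illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argo-server-impersonator
rules:
# lets argo-server issue API requests as other users, groups, or service accounts
- apiGroups: [""]
  resources: ["users", "groups", "serviceaccounts"]
  verbs: ["impersonate"]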

We could look at using Kubernetes SSO. I don't know much about this, but it looks like it is not well supported.

Finally, we could have the option to use Casbin. I envisage this implemented as an HTTP interceptor, so the resources are the API URLs, not Kubernetes resources.

@HouseoLogy this issue is not currently on the core team's roadmap, so it won't get looked at for > 6 months. But we do want to do more collaborative features, where someone from the community does the implementation with guidance from the core team. Would you be interested?

@andrewm-aero

I like @HouseoLogy's idea of using Argo CD's approach; we've used it extensively for more than a year with no complaints. If it isn't already, perhaps that behavior could be extracted into https://github.com/argoproj/pkg for shared, generic use.

@alexec Because this is such a blocker for us, I'd be happy to offer some time in implementing one of these solutions.

@alexec
Contributor Author

alexec commented Aug 31, 2021

Thank you @andrewm-aero. I think we probably need to nail down whether "resources" are defined as "API URLs" or as "Kubernetes resources"; I think this is a closed-door decision:

  • The latter is more work - it cannot be done as an interceptor. It must be implemented in code.
  • It is more porous - it's hard to know which API endpoints change which resources.
  • It will break - it is easy to change code functionality (e.g. to read a new resource) and forget to change the security enforcement.

But... it may not be what users want.

If we use "API URLs" then we may need to change some of those URLs to include namespace (for example).

@andrewm-aero have you managed to start the dev set-up?

@HouseoLogy

Hello @alexec & @andrewm-aero,
thanks a lot for the fast feedback, and @andrewm-aero for offering time to implement one of these solutions.
I probably wouldn't be a huge help in this area, because Go isn't my domain and I'm currently overloaded in my daily business. Nevertheless, I'll follow this issue and if I'm able to free up some time for contribution I'll ping you.

As suggested, I would definitely go with the Casbin / Argo CD approach and not implement it as dedicated Kubernetes resources.

@alexec
Contributor Author

alexec commented Sep 1, 2021

I think I might make a U-turn on saying we should do URLs:

  • This cannot support clients using gRPC.
  • This would mean we'd have to change the artifact and archive URLs.
  • It's much nicer to be working with verb+resource+namespace+name.

@alexec
Contributor Author

alexec commented Sep 1, 2021

I've created a PoC. I do not plan to complete this. Would anyone like to volunteer to take this over the finish line?

@jeanlouhallee

Hey @alexec! I would like to offer some help by taking over your PoC. We would also greatly benefit from this feature at my company 👍

@alexec
Contributor Author

alexec commented Sep 2, 2021

@jeanlouhallee I’m assuming you’ve done some Golang before? If so, step 1 is to read the CONTRIBUTING.md guide: clone the source code onto your local machine and check out the casbin-poc branch. You’ll need to start it using make start PROFILE=sso.

The branch has a number of TODOs left on it which need completing. It’ll also need testing with a ConfigMap volume mounted at /casbin.
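
Roughly, the mount would look like this fragment of the argo-server Deployment pod spec (the ConfigMap name is illustrative; the /casbin path is what the branch expects):

spec:
  containers:
  - name: argo-server
    volumeMounts:
    - name: casbin
      mountPath: /casbin
      readOnly: true
  volumes:
  - name: casbin
    configMap:
      name: argo-server-casbin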

We’ll want to get some community members to test it too. We can figure out the details closer to the time.

@jeanlouhallee

Thanks for the pointers @alexec. I have done a bit of Golang, but not much. Will learn a lot by diving into this.

@alexec
Contributor Author

alexec commented Sep 9, 2021

Hi @jeanlouhallee thank you!

@jeanlouhallee

Hello @alexec, do you mind if I ping you directly on Slack for design questions/considerations?

@alexec
Contributor Author

alexec commented Sep 10, 2021

Sure. That's fine.

@HouseoLogy

Hi @jeanlouhallee & @alexec,
were you able to agree on certain design considerations? Do you have any further insights?

@basanthjenuhb
Contributor

basanthjenuhb commented Nov 2, 2021

We have added a new feature related to SSO:
https://github.com/argoproj/argo-workflows/blob/master/docs/argo-server-sso.md#sso-rbac-namespace-delegation

@HouseoLogy @andrewm-aero Would this help in your use case?

@thesuperzapper
Contributor

For all those watching, I have created PR #7193 that implements "SSO Impersonation" support in argo-server by using Kubernetes SubjectAccessReviews with the user's email or sub OIDC claim. This means that argo-server access can be managed by standard Kubernetes RoleBindings and ClusterRoleBindings with your user's email in the subjects.

NOTE: these changes apply equally to the Argo UI and CLI, as they affect all K8s API calls made by argo-server on behalf of users.
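
To illustrate the mechanism: for each API call, argo-server would ask the Kubernetes authorizer a question roughly like this (a sketch with illustrative values; the answer comes back in status.allowed):

apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  # the email (or sub) OIDC claim of the logged-in user
  user: [email protected]
  resourceAttributes:
    group: argoproj.io
    resource: workflows
    verb: create
    namespace: team-blue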

@DingGGu

DingGGu commented Nov 17, 2021

Amazing stuff!
Kiali already supports impersonation with Kubernetes OIDC.

I had a good experience when configuring Kiali's authentication.

Usually, OIDC is already configured for authentication in the Kubernetes cluster.

If Kiali uses the same app as the cluster's authentication, it maps onto the RoleBindings defined in the cluster. Thus, users get the same privileges they have with kubectl.

In my case, there are hundreds of users in the cluster, so permissions are managed via the OIDC groups scope, and RoleBindings are managed in the cluster.

You can set this up in kube-apiserver with --oidc-groups-claim.

And here is a RoleBinding example:

subjects:
- kind: Group
  name: "frontend-admins"
  apiGroup: rbac.authorization.k8s.io

If Argo Workflows can follow this flow, it will be very helpful.

I think the method provided by the current Argo Workflows (v3.2) has management limitations, so I plan to operate it in client auth-mode.

@thesuperzapper
Contributor

@DingGGu as described in the "limitations" section of my PR #7193, the initial implementation won't support Group bindings, only User. This is because it uses SubjectAccessReviews, which require us to explicitly pass the list of Groups a user is in, and I am not sure of the best way to check that.

I see two possibilities:

  1. We run some kind of K8S query based on the User (email or sub), which checks what Groups that user is in (not good because it requires another K8S API call)
  2. We add an alternate mode to the "impersonate" feature which extracts the groups OIDC claim, and runs a SubjectAccessReview on the groups rather than just the email or sub (see the sketch below).
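
For option 2, SubjectAccessReview already accepts a groups list, so the check might look roughly like this (a sketch with illustrative values):

apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: [email protected]
  # taken directly from the OIDC groups claim; any --oidc-groups-prefix
  # configured on the apiserver (e.g. "oidc:") would have to be prepended
  groups: ["oidc:ns-data-readonly", "oidc:ns-workflow-writer"]
  resourceAttributes:
    group: argoproj.io
    resource: workflows
    verb: list
    namespace: data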

@DingGGu

DingGGu commented Nov 17, 2021

@thesuperzapper Looking at your PR comments and the limitations of SubjectAccessReview, I wonder if this is the correct implementation.

My current configuration is as follows.

kube-apiserver
--oidc-issuer-url=https://<oidc.provider.com>
--oidc-username-claim=email
--oidc-groups-claim=groups
--oidc-groups-prefix='oidc:'

The JWT token payload for a user properly authenticated to Kubernetes is:

{
  // ...
  "email": "[email protected]",
  "groups": ["ns-data-readonly", "ns-workflow-writer"],
  // ...
}

The RoleBinding in the namespace is set as follows:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: oidc:namespace-readonly
  namespace: data
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: oidc:namespace-readonly
subjects:
- kind: Group
  name: oidc:ns-data-readonly
  apiGroup: rbac.authorization.k8s.io

In my use case, the group name in the RoleBinding and the group name provided by OIDC are slightly different.

Note the oidc:ns-data-readonly in the RoleBinding versus the ns-data-readonly in the OIDC groups.
This prefix can be configured with --oidc-groups-prefix in kube-apiserver.

In Kiali, even if a prefix is set, permissions are granted properly.
They use SelfSubjectAccessReviews, not SubjectAccessReviews:
https://github.com/kiali/kiali/blob/fb20c789419ba0950cb6a3a5c5d296e1df778e58/kubernetes/kubernetes.go#L472-L508

@alexec
Contributor Author

alexec commented Nov 17, 2021

Most users will not be able to reconfigure the Kubernetes API server (e.g. on EKS, GKE Autopilot, etc.). How do these ideas work for them?

@DingGGu

DingGGu commented Nov 17, 2021

@alexec This option is supported on EKS and kOps.

And I found that GKE supports it too.

@alexec
Contributor Author

alexec commented Nov 17, 2021

There will be many users in enterprise environments who are not able to change this option, regardless of whether it is available.

@DingGGu

DingGGu commented Nov 17, 2021

The EKS and kOps clusters I'm using have all had this applied with an in-place upgrade.

The EKS documentation also explains why this option is useful: it prevents accidentally granting permissions that start with system: via the OIDC provider.

My guess is that Kubernetes was also aware of this problem and provided the prefix option, so there doesn't seem to be any need to rule this option out.

@DingGGu

DingGGu commented Nov 17, 2021

The reason a cluster administrator wants to use "impersonate" is that, to use another web console, there is no need to add or change the authorization scheme already configured in Kubernetes. That means users get the same privileges granted to kubectl.

I manage hundreds of developers across dozens of clusters, and if this feature is added correctly, I think management will be very efficient.

Also, other cluster administrators will be happy to change the kube-apiserver options once they know of these advantages.

Usually kube-apiserver consists of 3 pods, so there is no difficulty in changing it (in my experience).

@Sefriol

Sefriol commented Mar 30, 2022

Since we have our application deployed on AKS under quite strict company control, I would assume that a Casbin-based implementation would be preferable for our use case. Also because we already use Casbin for other services, so having a unified authorization layer (at least in terms of the tech stack being used) would be nice ;)

Hopefully this gets implemented sooner rather than later.

@alexec
Contributor Author

alexec commented Mar 30, 2022

Fixed by #8120

@bygui86

bygui86 commented Mar 31, 2022

@alexec so in your PR #8120 you implemented Casbin RBAC for Argo Server, the same as is available in Argo CD?
that's amazing!!

@alexec
Contributor Author

alexec commented Mar 31, 2022

It was not my original goal, but that is what happened. It does not support groups yet; that still needs to be worked out.

@bygui86

bygui86 commented Mar 31, 2022

that's amazing anyway!!

@Sefriol

Sefriol commented Mar 31, 2022

Amazing. How much work is there still to be done? It already looks like quite a huge feature set.

@alexec
Contributor Author

alexec commented Apr 6, 2022

@Sefriol

Sefriol commented Oct 17, 2022

Any plans to continue this development, or is it resolved by something else @alexec?

@ryancurrah
Contributor

It appears to be on the roadmap under UI/Usability. Not sure how old that is though. https://docs.google.com/document/d/1TzhgIPHnlUI9tVqcjoZVmvjuPAIZf5AyygGqL98BBaI

@thesuperzapper
Contributor

Personally, I think we should offer an option that uses Kubernetes RBAC roles/groups, like what I was doing in #7193 (which was closed due to inactivity).

I am happy to reopen that if the project is still interested in that functionality, I have just been a bit busy moving countries!

@ryancurrah
Contributor

Any solution that doesn't require us to set up a new Okta tile per namespace is okay with me.

@aaron-arellano

aaron-arellano commented Feb 25, 2023

@thesuperzapper SAR would be great to implement as we use this for our services on our cluster and like the ability to maintain control with roles, rolebindings, and SAs.

@tooptoop4
Contributor

tooptoop4 commented Jun 16, 2023

@HouseoLogy I'm very interested in the "Operator" permission you mentioned (can submit, resubmit, terminate, and delete workflows, but no creation of custom resources and no YAML editing). Can you share your role definition for this? I have the Admin and Read-Only roles but am struggling to allow submitting from existing WorkflowTemplates while disabling submission of arbitrary new workflow YAML (as it has not gone through code review).

@agilgur5 added the type/security (Security related) label Feb 20, 2024
@agilgur5

agilgur5 commented Feb 20, 2024

We discussed this in SIG Security last week, as internal RBAC requires a decent amount of maintenance overhead and new CRDs, as is the case with Argo CD. I was looking for ideas there in particular because any auth system can become a common attack vector, so relying on an existing auth system (like k8s RBAC for authZ) is both simpler and more secure (i.e. "don't reinvent the wheel", "don't invent your own crypto", etc).
Note that I was mainly looking at the authZ portion of this, not the authN portion (there may be some good existing ideas on authN above already).

Some authZ ideas there included making (potentially optional) CRDs for actions (e.g. retry, resubmit) and that way it would use built-in k8s RBAC. Imperative actions as CRDs is a bit of an anti-pattern (k8s is declarative), so that is not ideal either, but it would create or affect declarative resources in the end.

Another idea was the configurable Lua script actions that CD has (e.g. to restart a Deployment).

@alexec
Contributor Author

alexec commented Feb 20, 2024

IMHO both Argo CD and Argo Workflows have not done this well.

  • Argo Workflows relies on the Kubernetes RBAC. It's therefore inflexible and cannot do finer-grained auth-z.
  • Argo CD uses CRDs (such as Project). I think that CRDs are unnecessarily complex and create a new vector for security bugs.

My proposal is that the Casbin model and policy are stored in a ConfigMap within the controller namespace and managed using GitOps. They can be mounted as files; very simple.

Casbin policies are standard RBAC, i.e. the rules decide whether the user is allowed to perform an action on a resource. So one big task would be determining what those rules are and making sure the data model is good enough to describe all the potential rules that users would want.
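
A minimal sketch of what such a ConfigMap could contain, using a standard Casbin RBAC model (the name, roles, and rules are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: argo-server-casbin
  namespace: argo
data:
  model.conf: |
    [request_definition]
    r = sub, obj, act

    [policy_definition]
    p = sub, obj, act

    [role_definition]
    g = _, _

    [policy_effect]
    e = some(where (p.eft == allow))

    [matchers]
    m = g(r.sub, p.sub) && r.obj == p.obj && r.act == p.act
  policy.csv: |
    p, role:readonly, workflows, get
    p, role:operator, workflows, submit
    g, [email protected], role:operator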

Essentially, this would be Argo CD simplified.

This proposal does not work if users need to re-configure security at runtime (yuk).

@agilgur5

IMHO both Argo CD and Argo Workflows have not done this well [...]

Yea I'm more or less agreed, that's why I was seeking new ideas and to avoid things CD folks regret in that RBAC model.

stored in a Configmap within the controller namespace

This is certainly simpler than CRDs, but that means that people with ConfigMap access will have access to this as well. And folks want namespace-specific config already as well, so we'd need to support multiple ConfigMaps.

one big task would be determining what they are and making sure the data model is good enough to describe all the potential rules that users would want.

That and uh, adding authZ logic everywhere 😅

And since Casbin is not k8s native, either you have to go through the Server or you can bypass it by editing resources directly in k8s. Do you have any thoughts on that specifically @alexec?

@thesuperzapper
Contributor

I know it's been a while, but I really think using subject access reviews (at least as one of the options) is probably the simplest solution. It delegates everything to Kubernetes RBAC.

Since Argo Server is all about Argo Workflows, and the permission model in workflows is simply "are you allowed to do this action with the CRD", it just makes sense to reflect the same permissions in the user interface by taking the email of the user as the "actor" in the subject access review.

The PR that I made a number of years ago could pretty easily be rebased to work with the current HEAD, and it implements exactly this:

@agilgur5

agilgur5 commented Feb 22, 2024

I know it's been a while, but I really think using subject access reviews

As I wrote above, I was not commenting on authN, and did actually mention past work on that. I have seen and would like to take a look in-depth at your SAR work at some point in the future (also #9325 has some very relevant notes about that approach).

It delegates everything to Kubernetes RBAC.

As Alex, I, and many users have stated, k8s RBAC is not granular enough. Several imperative actions of Workflows do not fall neatly into k8s RBAC, and many users would like to be able to add authZ to those, including retry, resubmit, terminate, allowing full YAML, etc.
An internal RBAC system is one way to remediate that. I wrote above about an alternative idea, using non-standard imperative CRDs. I'd like to explore more options, as delegating to k8s RBAC would be great (for the reasons I stated above and others), but it is not currently sufficient.

@thesuperzapper
Contributor

@agilgur5 these "extra" permissions could just be represented as virtual "subresources", similar to how Kubernetes already exposes a pod/log subresource for Pod RBAC, to allow giving independent access to the logs.
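
For illustration, RBAC rules are matched as strings by the authorizer, so a Role could grant such virtual subresources even though the API server never serves them directly; argo-server would check them via access reviews (the subresource names below are hypothetical):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-operator
  namespace: team-blue
rules:
- apiGroups: ["argoproj.io"]
  resources: ["workflows"]
  verbs: ["get", "list"]
# hypothetical virtual subresources, analogous to pods/log
- apiGroups: ["argoproj.io"]
  resources: ["workflows/retry", "workflows/terminate"]
  verbs: ["create"]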

But either way, we don't need to get bogged down trying to make some crazy solution when simply reflecting the same level of access the user has in kubectl is sufficient for almost all use cases.

It's not like Argo Workflows currently allows more granular access than CREATE/DELETE/LIST on each resource kind (e.g. Workflow), so we might as well get Argo Server at least up to that level.

@agilgur5

agilgur5 commented Feb 22, 2024

reflecting the same level of access the user has in kubectl is sufficient for almost all use cases.

This is not accurate. More granular RBAC is a very popular & common request from users.

It's not like Argo Workflows currently allows more granular access than CREATE/DELETE/LIST on each resource kind

This issue does try to address that and allow more granular RBAC.

@agilgur5 these "extra" permissions could just be represented as virtual "subresources", similar to how Kubernetes already exposes a pod/log subresource for Pod RBAC, to allow giving independent access to the logs.

I was looking into this, but there are some problems with subresources:

  1. Custom subresources are not allowed in CRDs: Support arbitrary subresources for custom resources kubernetes/kubernetes#72637
  2. The imperative vs declarative part still stands
  3. The Workflow resource already has issues as it can get too large
  4. etc

@agilgur5

agilgur5 commented Feb 23, 2024

  1. Custom subresources are not allowed in CRDs: Support arbitrary subresources for custom resources kubernetes/kubernetes#72637

Reading the comments there in more depth, I think custom CRDs for actions might indeed be the right way to go here, as opposed to an internal RBAC. There are a few examples of *Request CRDs given there in the ecosystem, including Velero's operations and k8s's own CertificateSigningRequest. Not mentioned there but similar is cert-manager's CertificateRequest.
There are some caveats to those, like the Velero CLI being used to abstract the CRs, and k8s not having *Request resources for other imperative actions like kubectl rollout restart.

But the general concept works well: it properly delegates these actions to the Controller (whereas the Argo Server is currently responsible for a few things, c.f. #12538) and makes them all accessible via kubectl and as standard k8s resources with standard k8s RBAC -- importantly that means the Server would not be required for these actions.
Caveats are that the Server is still required for DB features like node status offloading and the Workflow Archive, and that new CRDs would be a bit of a hassle to additionally manage (and of course migrating to those would be quite breaking, although we can partly mitigate that by leaving some of the API behavior backward-compatible for some time).

CRDs: RetryRequest, TerminateRequest, StopRequest, SuspendRequest, etc
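
To make the idea concrete, a hypothetical *Request resource might look like this (the CRD, its group/version, and all field names here are illustrative; nothing like this exists yet):

apiVersion: argoproj.io/v1alpha1
kind: RetryRequest            # hypothetical CRD per the list above
metadata:
  generateName: retry-
  namespace: team-blue
spec:
  # the Workflow the controller should retry; access would be governed by
  # ordinary k8s RBAC on create for retryrequests resources
  workflowName: my-failed-workflow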

@alexec
Contributor Author

alexec commented Feb 23, 2024

I think if you start with the customer personas, list the different use cases, and create the resource model you want for RBAC, then you'll be able to tell whether Casbin or something else is the right fit.

This issue should probably not be titled "Casbin RBAC"; perhaps it should be titled "Fine-Grained RBAC" or similar, as the current title puts the solution before the problem?

@andrewm-aero

A technique I've used before is to proxy the k8s API, and implement custom RBAC by using a policy language that translates from an SSO token to a set of fictitious roles/clusterroles that the proxy should act as though the user has. By doing this, you can "invent" whatever custom subresources you want, since it's then your own code checking whether the request is authorized. Direct k8s access would still be subject to whatever roles are on the actual cluster.

@agilgur5 changed the title from "Casbin RBAC for Argo Server" to "Fine-grained RBAC for Argo Server" Oct 14, 2024
@tooptoop4
Contributor

#6755

@agilgur5

agilgur5 commented Oct 22, 2024

#6755

Per the above comments, there are options other than Casbin that align better with the k8s RBAC model, do not require the API to be deployed (i.e. can be used without the API), and leave the security up to k8s RBAC instead of an internal RBAC system (the best way to reduce vulnerability surface area is to not have one at all).
