Add decision page for managing the admin kubeconfig
simu committed Sep 22, 2023
1 parent a874afb commit 85f5aad
Showing 2 changed files with 83 additions and 0 deletions.
@@ -0,0 +1,82 @@
= Admin kubeconfig management

== Problem

We currently store the kubeconfig for the `system:admin` user, which is generated by the `openshift-install` program, in Passbolt for emergency access to clusters.
The client certificate generated for that kubeconfig has a lifetime of 10 years.
Unfortunately, Kubernetes doesn't support revoking client certificates; see https://github.com/kubernetes/kubernetes/issues/18982[this GitHub issue (kubernetes/kubernetes#18982)].

We would like to have another form of emergency access to OpenShift 4 clusters.
The main reason is that having credentials with a lifetime of 10 years which can't be revoked is less than ideal.

=== Goals

* Define a method to manage emergency access credentials for OpenShift 4 clusters
* The credentials should be relatively short-lived and it must be possible to rotate them

=== Non-Goals

* Replace regular authentication

== Proposals

=== Credential type

==== Issue short-lived certificates with cluster-admin privileges

The first approach is to issue client certificates with cluster-admin privileges.
This can be done either through Kubernetes' `CertificateSigningRequest` (CSR) resources, or by manually issuing certificates signed by a self-signed CA whose certificate is installed as a client CA certificate in the cluster.

One point to consider is that Kubernetes doesn't support issuing client certificates for group `system:masters` through CSRs, but group `system:cluster-admins` is allowed, and functionally equivalent on OpenShift 4.

Note that the restriction that we can't revoke existing certificates still applies for certificates issued through CSRs or through a self-managed CA certificate.
However, if we use a self-managed CA certificate, we can invalidate existing certificates by rotating the CA and issuing a new certificate against the new CA.
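
As a rough sketch of the CSR-based variant, the following Python snippet builds a `CertificateSigningRequest` manifest for a short-lived client certificate.
It assumes that the PEM-encoded certificate request was generated separately (for example with `openssl`) with `system:cluster-admins` as the subject organization.
The function name, the object name `emergency-admin` and the one-day default lifetime are illustrative assumptions and not part of any existing tool.

```python
import base64


def build_client_csr(name: str, csr_pem: bytes, lifetime_seconds: int = 86400) -> dict:
    """Build a CertificateSigningRequest manifest for a client certificate."""
    if lifetime_seconds < 600:
        # The API doesn't accept expirationSeconds values below 10 minutes.
        raise ValueError("lifetime_seconds must be at least 600")
    return {
        "apiVersion": "certificates.k8s.io/v1",
        "kind": "CertificateSigningRequest",
        "metadata": {"name": name},
        "spec": {
            # Built-in signer for certificates used to authenticate to the API server.
            "signerName": "kubernetes.io/kube-apiserver-client",
            "expirationSeconds": lifetime_seconds,
            "usages": ["client auth"],
            # The API expects the PEM-encoded request as base64 data.
            "request": base64.b64encode(csr_pem).decode("ascii"),
        },
    }
```

The request still needs to be approved before a certificate is issued, and the signer may issue a certificate with a shorter lifetime than requested.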

==== Use service account tokens with cluster-admin privileges

The second approach is to set up a Kubernetes service account which is granted `cluster-admin` privileges through a `ClusterRoleBinding`, and to issue service account tokens for that service account.
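
A minimal sketch of the two objects involved, written as a Python function which returns the manifests as dictionaries.
The function name and the naming scheme for the `ClusterRoleBinding` are assumptions made for illustration; the name and namespace of the service account are left to the caller.

```python
def admin_service_account_manifests(name: str, namespace: str) -> list:
    """Return a ServiceAccount and a ClusterRoleBinding granting it cluster-admin."""
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": name, "namespace": namespace},
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        # Hypothetical naming scheme, chosen only to keep the binding name unique.
        "metadata": {"name": f"{namespace}-{name}-cluster-admin"},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "cluster-admin",
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": name, "namespace": namespace}
        ],
    }
    return [service_account, binding]
```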

We have two options to generate tokens for service accounts:

. The https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#manually-create-an-api-token-for-a-serviceaccount[TokenRequest] API allows us to generate service account tokens which expire after a defined amount of time.
However, tokens which are manually created through the TokenRequest API (for example with `kubectl create token`) can't be invalidated before they expire.

. Non-expiring API tokens are created by defining a secret of type `kubernetes.io/service-account-token`.
As the name suggests, these tokens don't expire.
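
To illustrate the first option, the following sketch builds the API path and the JSON body for a TokenRequest.
The function name is hypothetical, and actually sending the request (including authentication) is left out.
On the command line, `kubectl create token` with the `--duration` flag should have the same effect.

```python
import json


def token_request(namespace: str, service_account: str, lifetime_seconds: int) -> tuple:
    """Return the API path and JSON body for requesting a time-bound token."""
    # TokenRequest is a subresource of the service account the token is issued for.
    path = f"/api/v1/namespaces/{namespace}/serviceaccounts/{service_account}/token"
    body = {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenRequest",
        "spec": {"expirationSeconds": lifetime_seconds},
    }
    return path, json.dumps(body)
```

The body is sent as a `POST` request to the returned path, and the issued token is returned in the `status.token` field of the response.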

The only way to permanently invalidate service account tokens (both non-expiring and time-bound) is to delete the service account.
Creating a new service account with the same name in the same namespace doesn't reactivate tokens associated with a previous service account.

The proposed approach for service account tokens is to create short-lived tokens through the TokenRequest API, so that the admin credentials expire by default.
Additionally, we introduce a mechanism which forces the tool to recreate the service account, so that any old tokens that might have leaked are invalidated.
That mechanism might be as simple as having the tool reconcile the service account and recreate it if it gets deleted.
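
The reconciliation logic could look roughly like the following sketch.
It only decides which actions to take and doesn't talk to a cluster; the function name, its parameters and the action names are all hypothetical.

```python
def plan_reconcile(service_account_exists: bool, force_rotation: bool,
                   token_needs_renewal: bool) -> list:
    """Return the ordered actions for one reconciliation pass."""
    actions = []
    if service_account_exists and force_rotation:
        # Deleting the service account invalidates every token issued for it.
        actions.append("delete_service_account")
        service_account_exists = False
    if not service_account_exists:
        # Covers both forced rotation and a service account deleted by hand.
        actions.append("create_service_account")
        # Tokens of the previous service account no longer work, so issue a new one.
        actions.append("issue_token")
    elif token_needs_renewal:
        actions.append("issue_token")
    return actions
```

With this structure, an engineer who suspects a leak only needs to delete the service account; the next reconciliation pass recreates it and issues a fresh token.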

=== Credential management

==== Extend Steward to manage credentials and write them to Vault

We can extend https://syn.tools/steward[Steward] to manage and renew the credentials and store them in Vault.

This allows us to issue relatively short-lived credentials (on the order of days), which limits how long admin credentials remain usable after engineers have accessed them in an emergency situation.

Optionally, we can also extend Steward to render a full kubeconfig file based on the managed credentials and store that file in Vault in addition to the raw credentials.
If we store a full kubeconfig file in Vault, we can document a single `vault` CLI command which fetches the emergency kubeconfig for a cluster.
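
Rendering a kubeconfig from a service account token could be sketched as follows.
The function name, the default user name `emergency-admin` and the context naming are assumptions for illustration.
The output is JSON, which `kubectl` should accept as a kubeconfig since JSON can be parsed as YAML.

```python
import json


def render_kubeconfig(cluster_name: str, api_url: str, ca_data_b64: str,
                      token: str, user: str = "emergency-admin") -> str:
    """Render a kubeconfig which authenticates with a bearer token."""
    context = f"{user}@{cluster_name}"
    config = {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{
            "name": cluster_name,
            # ca_data_b64 is the base64-encoded CA bundle of the cluster's API server.
            "cluster": {"server": api_url, "certificate-authority-data": ca_data_b64},
        }],
        "users": [{"name": user, "user": {"token": token}}],
        "contexts": [{
            "name": context,
            "context": {"cluster": cluster_name, "user": user},
        }],
        "current-context": context,
    }
    return json.dumps(config, indent=2)
```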

==== Create a new tool which manages the credentials and writes them to Vault

Instead of extending Steward, we could also create a new tool which manages admin credentials and writes them to Vault.
This would provide some level of separation of concerns, since managing admin credentials isn't necessarily part of the Project Syn bootstrap process.
Additionally, having a separate tool allows us to have releases independent of the fairly complex Steward release process.

==== Manage credentials by hand

Another approach is to manage and renew the admin credentials by hand.

== Decision

== Rationale

== References

* https://access.redhat.com/solutions/4845381[Red Hat solution which gives some details on the admin kubeconfig]
* https://access.redhat.com/solutions/6054981[Red Hat solution describing how to replace the CA for the initial admin kubeconfig]
1 change: 1 addition & 0 deletions docs/modules/ROOT/partials/nav.adoc
@@ -225,3 +225,4 @@
** xref:oc4:ROOT:explanations/decisions/shipping-metrics-to-centralized-instance.adoc[]
** xref:oc4:ROOT:explanations/decisions/scheduled-mr-merges.adoc[]
** xref:oc4:ROOT:explanations/decisions/subscription-tracking.adoc[]
** xref:oc4:ROOT:explanations/decisions/admin-kubeconfig.adoc[]
