Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add BoundServiceAccountToken trigger authentication type #6272

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

maxcao13
Copy link
Contributor

@maxcao13 maxcao13 commented Oct 24, 2024

Provide a description of what has been changed
Implements a new auth provider which allows users to provide authentication using a Kubernetes service account's credentials without creating an explicit long-lived secret.

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-prometheus
  namespace: ingress-operator
spec:
  boundServiceAccountToken:
  - parameter: bearerToken
    serviceAccountName: thanos 
    expiry: 15m

You could already inject Kubernetes service account tokens in triggerAuth refs before by using the Secret trigger auth type, but instead of manually embedding it in a long-lived secret, you can now directly specify the service account instead, and it will embed the sa token in an annotation in the triggerAuth object, and the keda-operator will autorotate the token if the expiry is at least 50% stale. You can specify which parameter you pull into the trigger with parameter, the serviceAccountName in the same namespace as the TriggerAuth CR, and the expiry as a duration. If you use a ClusterTriggerAuth, note that this works similarly to the Secret trigger auth, and the service account then has to be in the KEDA_CLUSTER_OBJECT_NAMESPACE namespace.

Before I write any tests, I'd like to get feedback first! :)

This for example can allow keda to scale a workload based on a prometheus instance that requires kubernetes rbac auth inside the cluster without creating a long-lived secret that can potentially be exposed and unrevokable unless you delete the serviceaccount itself. You must specify the service account name (same namespace as the object), the authentication parameter, and an expiry (duration). Note that Kubernetes doesn't allow expirys less than 10m.

Checklist

Fixes: #6136

@maxcao13 maxcao13 requested a review from a team as a code owner October 24, 2024 20:20
@maxcao13 maxcao13 force-pushed the bound-serviceacctoken-trigauth branch from 538290e to ea01caa Compare October 24, 2024 20:28
controllers/keda/scaledobject_controller.go Outdated Show resolved Hide resolved
Comment on lines 645 to 697
func resolveBoundServiceAccountToken(ctx context.Context, client client.Client, logger logr.Logger, namespace string, bsat *kedav1alpha1.BoundServiceAccountToken, triggerAuthRef *kedav1alpha1.AuthenticationRef, acs *authentication.AuthClientSet) string {
serviceAccountName, expiry := bsat.ServiceAccountName, bsat.Expiry
if serviceAccountName == "" {
logger.Error(fmt.Errorf("error trying to get token"), "serviceAccountName is required")
return ""
}
var err error
expirySeconds := ptr.Int64(3600)
if expiry != "" {
duration, err := time.ParseDuration(expiry)
if err != nil {
logger.Error(err, "error trying to parse expiry duration", "expiry", expiry)
return ""
}
// convert duration to seconds
expirySeconds = ptr.Int64(int64(duration.Seconds()))
}

triggerAuth, _, err := getTriggerAuth(ctx, client, triggerAuthRef, namespace)
if err != nil {
logger.Error(err, "error trying to get [cluster]triggerAuth", "TriggerAuth.Namespace", namespace, "TriggerAuth.Name", triggerAuthRef.Name)
return ""
}
currentAnnotations := triggerAuth.GetAnnotations()
encodedToken := currentAnnotations["keda-serviceAccountToken"]

// check if service account exists in the namespace
serviceAccount := &corev1.ServiceAccount{}
err = client.Get(ctx, types.NamespacedName{Name: serviceAccountName, Namespace: namespace}, serviceAccount)
if err != nil {
logger.Error(err, "error trying to get service account from namespace", "ServiceAccount.Namespace", namespace, "ServiceAccount.Name", serviceAccountName)
return ""
}

// check if token is already referenced in the TriggerAuthentication annotation
if encodedToken != "" {
tokenValid := checkTokenValidity(ctx, logger, encodedToken, expirySeconds, acs)
switch tokenValid {
case tokenStatusInvalid:
// token is invalid, or if more than 50% of the token's expiry has passed, create new token
return generateAndAnnotateNewToken(ctx, client, logger, serviceAccountName, namespace, expirySeconds, triggerAuth, acs)
case tokenStatusValid:
return encodedToken
default:
// tokenStatusUnknown
return ""
}
} else {
// token doesn't exist; create new token and embed it in the the TriggerAuth
logger.Info("Token doesn't exist; creating new token and embedding it in the triggerauth", "ServiceAccount.Namespace", namespace, "ServiceAccount.Name", serviceAccountName)
return generateAndAnnotateNewToken(ctx, client, logger, serviceAccountName, namespace, expirySeconds, triggerAuth, acs)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, KEDA only re-executes the resolver in case of scaler creation. It means that the expiration of the token won't be checked until it has expired, triggering the error in the scaler and re-evaluating this code.
About storing the token in the TriggerAuthentication, I don't like this approach too much as it can be exposed just accessing the TriggerAuthentication CR, which isn't a sensitive API.
Considering these 2 things together, I'd not store the token anywhere and just request a fresh token each time, storing it in the memory of the scaler. It solves the storing issue and removed the verification code (which won't be executed until the token has already expired in the common scenario)

Copy link
Contributor Author

@maxcao13 maxcao13 Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, make sense. Then should users still be able to embed an "expiry"? I would imagine a failing query every expiry period might be annoying to some users for availability reasons, but it might help if it fails at expected intervals? Plus, it will be an extra TokenRequest API request every interval.

Or maybe it would better to register scheduled goroutines with Timer to autorotate every half expiry on each authref?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we try to handle the expired token? The we can try to refresh the token and fail -> error if we are not able to do that.

Copy link
Member

@JorTurFer JorTurFer Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would imagine a failing query every expiry period might be annoying to some users for availability reasons

Actually, the query won't fail because it's silently retried once after the first fail, refreshing the cache and enforcing a new token from TokenRequest API

Copy link
Contributor Author

@maxcao13 maxcao13 Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback everyone! Does my current implementation in 6c9e605 make sense then?

@maxcao13 maxcao13 force-pushed the bound-serviceacctoken-trigauth branch from 44f1a81 to 542d918 Compare December 2, 2024 17:13
@maxcao13
Copy link
Contributor Author

maxcao13 commented Dec 3, 2024

I know in the community meeting we concluded on a range for expiry as 1h-6h, but is there a reason why we shouldn't allow people to use lower expiries like 10m?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support bound service account tokens
3 participants