What steps did you take and what happened:
I encountered an issue while trying to back up a volume with Restic in Velero. Here are the steps I took and the resulting error:
1. I annotated a test deployment with `backup.velero.io/backup-volumes` to enable backup of a volume using Restic. The volume uses the storage class `private-azurefile-csi` (a rough sketch of the manifests follows the error below).
2. I enabled the node-agent deployment in my setup.
3. Upon triggering a backup, I received the following error message:
```
azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/%SUBSCRIPTIONID%/resourceGroups/%RESOURCEGROUP%/providers/Microsoft.Storage/storageAccounts/%STORAGE_ACCOUNT%/listKeys?%24expand=kerb&api-version=2019-06-01: StatusCode=400 -- Original Error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"} Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=%CLIENT_ID%&resource=https%3A%2F%2Fmanagement.azure.com%2F
```
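For context, a minimal sketch of the setup; the deployment, namespace, and PVC names here are placeholders rather than our real manifests:

```yaml
# Excerpt of the test deployment (placeholder names): the pod template
# annotation tells Velero's node agent to back up the named volume with Restic.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-app
  template:
    metadata:
      labels:
        app: test-app
      annotations:
        backup.velero.io/backup-volumes: data
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: test-app-data   # PVC bound via the private-azurefile-csi storage class
```

The node agent is enabled through the chart values (key name as I understand it from the chart docs; worth double-checking against chart 5.1.6):

```yaml
# values.yaml excerpt for the velero Helm chart:
# deploys the node-agent DaemonSet used for Restic / pod volume backups.
deployNodeAgent: true
```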
The identity mentioned in the error does exist, and I can locate it in the Azure portal. I'm wondering if there's an issue with how the node agent handles workload identities or if I'm missing a configuration step.
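For reference, this is my understanding of what the workload identity wiring for the node-agent pods should look like; the annotation and label names come from the Azure Workload Identity documentation, while the service account name below is only a placeholder for whatever the node-agent pods actually run under:

```yaml
# The workload identity webhook injects the federated token and AZURE_*
# environment variables only for pods that opt in, so both pieces matter:
#
#  1) the service account carries the client ID of the user-assigned identity
apiVersion: v1
kind: ServiceAccount
metadata:
  name: velero           # placeholder; the SA the node-agent pods run under
  namespace: velero
  annotations:
    azure.workload.identity/client-id: "%CLIENT_ID%"
#
#  2) the node-agent pods carry the opt-in label:
#       azure.workload.identity/use: "true"
```

If the webhook skips the node-agent pods, I would expect the Azure SDK to fall back to IMDS, which would line up with the 169.254.169.254 endpoint in the error above, but I have not verified that this is what is happening.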
What did you expect to happen:
I expected the node agent to correctly identify and use the client ID for authentication, allowing Velero to perform the Restic backup without any errors.
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
Anything else you would like to add:
Additional resources including our Helmfile, logs, and backup repository description are available in this GitHub gist.
Some historical context that might be relevant: about a year ago, we successfully backed up an Azure File volume using a similar setup. However, at that time, we were utilizing aad-pod-identity instead of the current workload identity. This change might be a contributing factor to the issue, although I cannot rule out other changes like updates in the Kubernetes version.
It's also worth noting that our regular backups, which do not involve File System Backup (FSB), are running without any issues. The problem seems specific to backups involving Restic and is potentially related to the transition from aad-pod-identity to workload identity.
Environment:
- helm version (use `helm version`): v3.13.2
- helm chart version and app version (use `helm list -n <YOUR NAMESPACE>`): Chart: velero-5.1.6, App: 1.12.2
- Kubernetes version (use `kubectl version`): 1.28.4
- Kubernetes installer & version: AKS 1.27.3
- Cloud provider or hardware configuration: Azure
- OS (e.g. from `/etc/os-release`): Ubuntu 22.04