
After upgrading to 2.13.2 (1.5.1 velero) backup location not found error #168

Closed · sagor999 opened this issue Oct 5, 2020 · 29 comments

@sagor999 (Contributor) commented Oct 5, 2020

What steps did you take and what happened:
Upgraded from helm chart version 2.12.17 to 2.13.2.
The Velero pod is now repeatedly logging the following:

time="2020-10-05T20:33:20Z" level=info msg="Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"
time="2020-10-05T20:33:20Z" level=info msg="No backup locations were ready to be verified" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:120"                                                                     

I checked that the backup locations do exist in the cluster (and they were working just fine in the previous release).

I tried removing the chart, removing the CRDs, and re-installing the chart. I checked that the backup locations were added back, and got the same error. For some reason Velero no longer sees the backup locations.

Environment:

  • helm version (use helm version):
    Version:"v3.3.1"
  • helm chart version and app version:
    • helm3 (use helm list -n <YOUR NAMESPACE>):
NAME  	NAMESPACE	REVISION	UPDATED                               	STATUS  	CHART        	APP VERSION
velero	velero   	1       	2020-10-05 20:31:07.85468752 +0000 UTC	deployed	velero-2.13.2	1.5.1  
  • Kubernetes version (use kubectl version):
    1.17.3
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
    ubuntu 18.04
@zubron (Contributor) commented Oct 5, 2020

This is definitely a confusing message, but it does not indicate an error with your setup. The message comes from the BackupStorageLocation controller when it verifies the backup locations. Each backup location has a frequency at which it should be verified; if enough time has passed since the last verification, it is verified again. If, while iterating over the locations, the controller finds that none of them are due for verification, it prints the message that no backup locations were ready to be verified. Although this looks like an error, the state is OK and the backup location will be verified at a later time.

The frequency and wording of this log message are confusing, and this is something we are tracking in the velero project (vmware-tanzu/velero#2942). If you have any further comments or questions about this, it would be really helpful if you could direct them to that issue in the velero project.
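
For reference, the verification interval is a per-location setting on the BackupStorageLocation resource in newer Velero releases; a minimal sketch of such a manifest (the bucket name, provider, and namespace here are placeholders):

apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: my-backup-bucket        # placeholder bucket name
  # How often the controller re-verifies this location; whenever no location
  # is due, the "No backup locations were ready to be verified" line is logged.
  validationFrequency: 1m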

@zubron (Contributor) commented Oct 5, 2020

Given that the issue with the wording is being tracked in that project, I am going to close this issue.

zubron closed this as completed on Oct 5, 2020
@Slutzky commented Nov 14, 2020

Hi @zubron,

All backups are stuck in InProgress status after removing helm chart 1.4.2 + the CRDs and installing 1.5.2:

velero-backup-application-20201114125750   InProgress   2020-11-14 13:57:50 +0100 CET   29d   gcp   <none>
velero-backup-application-20201113125712   InProgress   2020-11-13 13:57:28 +0100 CET   28d   gcp   <none>
velero-backup-kubedb-20201114125750        InProgress   2020-11-14 13:58:07 +0100 CET   29d   gcp   <none>

Log from the velero pod:

time="2020-11-14T20:12:21Z" level=error msg="Error patching backup location's last-synced time" backupLocation=gcp controller=backup-sync error="backupstoragelocations.velero.io \"gcp\" not found" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/controller/backup_sync_controller.go:311" error.function="github.com/vmware-tanzu/velero/pkg/controller.(*backupSyncController).run" logSource="pkg/controller/backup_sync_controller.go:311" time="2020-11-14T20:12:51Z" level=error msg="Error patching backup location's last-synced time" backupLocation=gcp controller=backup-sync error="backupstoragelocations.velero.io \"gcp\" not found" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/controller/backup_sync_controller.go:311" error.function="github.com/vmware-tanzu/velero/pkg/controller.(*backupSyncController).run" logSource="pkg/controller/backup_sync_controller.go:311"

helm ls

velero velero 2 2020-11-14 19:57:51.32498833 +0000 UTC deployed velero-2.12.17 1.4.2

No problem with version 1.4.2.

@jenting (Collaborator) commented Nov 16, 2020

> All backups are stuck in InProgress status after removing helm chart 1.4.2 + the CRDs and installing 1.5.2. [...] No problem with version 1.4.2.

@Slutzky could you please provide the helm install commands and the related values.yaml? Thank you.

@Slutzky commented Nov 16, 2020

I'm using Flux. Flux installs Velero by applying the following HelmRelease:

---
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: velero
  namespace: velero
  annotations:
    fluxcd.io/automated: "false"
spec:
  releaseName: velero
  chart:
    repository: https://vmware-tanzu.github.io/helm-charts
    name: velero
    version: 2.12.17
  values:
    image:
      repository: velero/velero
      tag: v1.5.2    
    configuration:
      backupStorageLocation:
        bucket: bucket
        name: gcp
        prefix: null
      provider: gcp
      resticTimeout: 4h
      volumeSnapshotLocation:
        name: gcp
    credentials:
      existingSecret: cloud-credentials
      useSecret: true
    deployRestic: false
    initContainers:
    - image: velero/velero-plugin-for-gcp:v1.0.0
      imagePullPolicy: IfNotPresent
      name: velero-plugin-for-gcp
      volumeMounts:
      - mountPath: /target
        name: plugins
    metrics:
      enabled: true
      scrapeInterval: 30s
      serviceMonitor:
        additionalLabels:
          prometheus: kube-prometheus
        enabled: true
    schedules:
      backup-volumes:
        schedule: '@every 24h'
        template:
          ttl: 720h
          includeClusterResources: false
          includedNamespaces:
            - dev
          includedResources:
            - persistentvolumes
            - persistentvolumeclaims
          snapshotVolumes: true
          storageLocation: gcp
    serviceAccount:
      server:
        name: velero

@jenting (Collaborator) commented Nov 19, 2020

@Slutzky,
Could you please check the output of velero backup-location get and kubectl get backupstoragelocations.velero.io?
And I'd suggest bumping the velero-plugin-for-gcp plugin tag to v1.1.0.
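
For reference, a sketch of both checks (the velero namespace is an assumption; adjust it to your install):

# Velero's own view of the location, including PHASE and LAST VALIDATED
velero backup-location get -n velero

# The raw custom resources, as the BackupStorageLocation controller sees them
kubectl get backupstoragelocations.velero.io -n velero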

@Slutzky commented Nov 19, 2020

Hi @jenting,

It seems the problem is in the new CRD: the Velero pod now checks whether the backup location is reachable and, if so, marks it as Available. But because of Flux and the lifecycle hook, the gcp backupstoragelocation was constantly re-synced by Flux, which reset the Available flag to empty on each sync. Since Velero only sets the flag at startup, it could not recover.
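
For anyone hitting the same thing, the phase the controller sets lives in the CR's status, so the wipe can be observed directly; a sketch assuming a location named gcp in the velero namespace:

# An empty result here means the Available phase was reset by the re-sync
kubectl get backupstoragelocation gcp -n velero -o jsonpath='{.status.phase}'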

Issue has been solved.

Thank you for your support.

Best

@jenting (Collaborator) commented Nov 19, 2020

> It seems the problem is in the new CRD: the Velero pod marks the backup location as Available, but Flux kept re-syncing the gcp backupstoragelocation and resetting the flag, so Velero could not recover. [...] Issue has been solved.

@Slutzky good to know the root cause, although I don't know how you resolved the issue. If possible, could you please elaborate on how you fixed it? Thank you.

@Slutzky commented Nov 19, 2020

Sure. I'm using Flux to install Velero; I added enableHelmHooks: false to the velero.yaml file:

chart:
    repository: https://vmware-tanzu.github.io/helm-charts
    name: velero
    version:  2.12.13
  values:
    image:
      repository: velero/velero
      tag: v1.5.2    
    configuration:
      backupStorageLocation:
        bucket: prod
      useSecret: true
    deployRestic: false
    initContainers:
    - image: velero/velero-plugin-for-gcp:v1.0.0
      imagePullPolicy: IfNotPresent
      name: velero-plugin-for-gcp
      volumeMounts:
      - mountPath: /target
        name: plugins
    enableHelmHooks: false

https://docs.fluxcd.io/projects/helm-operator/en/1.0.0-rc9/references/helmrelease-custom-resource.html#configuration

@jenting (Collaborator) commented Nov 19, 2020

> Sure. I'm using Flux to install Velero; I added enableHelmHooks: false to the velero.yaml file. [...]

Thank you, so this is related to #171.

@talha0324 commented

Hey, I'm using the latest helm release v2.14.12 and am still seeing this error.
Velero v1.5.3, restic enabled.

@jenting (Collaborator) commented Mar 2, 2021

> Hey, I'm using the latest helm release v2.14.12 and am still seeing this error. Velero v1.5.3, restic enabled.

Did you do a fresh install of the Velero helm chart, or an upgrade?
If you performed an upgrade, remember that helm upgrade won't upgrade the CRD manifests; you have to install the velero v1.5.3 client and run velero install --crds-only --dry-run -o yaml | kubectl apply -f -.
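
Roughly, the full upgrade sequence would look like the following sketch (the vmware-tanzu chart repo alias, namespace, and values file are assumptions):

# 1. Upgrade the CRDs with a velero client matching the target server version;
#    helm upgrade does not touch CRDs, so this step must be done separately
velero install --crds-only --dry-run -o yaml | kubectl apply -f -

# 2. Then upgrade the chart itself
helm upgrade velero vmware-tanzu/velero -n velero -f values.yaml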

@talha0324 commented

I did a fresh install.
I also tried rendering templates from the helm chart and applying everything individually, but I still got the same error.
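
For reference, rendering the chart and applying the manifests by hand looks roughly like this (the chart repo alias, namespace, and values file are assumptions):

# Render the chart locally and apply the resulting manifests directly
helm template velero vmware-tanzu/velero -n velero -f values.yaml | kubectl apply -f -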

@jenting (Collaborator) commented Mar 3, 2021

> I did a fresh install. I also tried rendering templates from the helm chart and applying everything individually, but I still got the same error.

Could you please provide more information, such as the output of velero backup-location get and the log of the Velero server pod? Thank you.

@talha0324 commented

velero backup-location get -n openshift-velero

NAME      PROVIDER   BUCKET/PREFIX   PHASE       LAST VALIDATED                  ACCESS MODE
default   aws        velero_backup   Available   2021-03-03 11:44:33 +0500 PKT   ReadWrite

Velero server logs:

INFO[0021] Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available  controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"
INFO[0021] No backup locations were ready to be verified  controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:120"
ERRO[0021] Error getting backup store for this location  backupLocation=default controller=backup-sync error="unable to locate ObjectStore plugin named velero.io/aws" logSource="pkg/controller/backup_sync_controller.go:168"
ERRO[0021] Error checking repository for stale locks     controller=restic-repository error="error running command=restic unlock --repo=s3:https://object-eu-se-1a.binero.cloud/velero_backup/restic/hanzala-12 --password-file=/tmp/velero-restic-credentials-hanzala-12848765126, stdout=, stderr=: exec: \"restic\": executable file not found in $PATH" error.file="/github.com/vmware-tanzu/velero/pkg/restic/repository_manager.go:286" error.function="github.com/vmware-tanzu/velero/pkg/restic.(*repositoryManager).exec" logSource="pkg/controller/restic_repository_controller.go:144" name=hanzala-12-default-xtbsx namespace=openshift-velero
ERRO[0021] Error checking repository for stale locks     controller=restic-repository error="error running command=restic unlock --repo=s3:https://object-eu-se-1a.binero.cloud/velero_backup/restic/hanzala-12 --password-file=/tmp/velero-restic-credentials-hanzala-12947945837, stdout=, stderr=: exec: \"restic\": executable file not found in $PATH" error.file="/github.com/vmware-tanzu/velero/pkg/restic/repository_manager.go:286" error.function="github.com/vmware-tanzu/velero/pkg/restic.(*repositoryManager).exec" logSource="pkg/controller/restic_repository_controller.go:144" name=hanzala-12-default-xtbsx namespace=openshift-velero
INFO[0021] Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available  controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"
INFO[0021] No backup locations were ready to be verified  controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:120"
INFO[0021] Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available  controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"

@patrickheinzelmann commented

I've installed Velero for the first time and my logs are also filled with the following output:

time="2021-03-08T21:32:27Z" level=info msg="Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"
time="2021-03-08T21:32:28Z" level=info msg="Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"
time="2021-03-08T21:32:28Z" level=info msg="No backup locations were ready to be verified" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:120"
time="2021-03-08T21:33:28Z" level=info msg="Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"
time="2021-03-08T21:33:28Z" level=info msg="Checking for existing backup locations ready to be verified; there needs to be at least 1 backup location available" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:58"
time="2021-03-08T21:33:28Z" level=info msg="No backup locations were ready to be verified" controller=backupstoragelocation logSource="pkg/controller/backupstoragelocation_controller.go:120"

Here is the output of velero backup-location get:

NAME      PROVIDER   BUCKET/PREFIX   PHASE       LAST VALIDATED                  ACCESS MODE
default   azure      velero          Available   2021-03-08 22:36:29 +0100 CET   ReadWrite

@talha0324 commented

Hey @patrickheinzelmann, I found that this log message doesn't prevent Velero from creating backups. I see the same log, but my backups are being created successfully.

@patrickheinzelmann commented

My problem is that Velero is creating the folder and the log file, but isn't writing the backup data.

It worked only once; all other attempts to back up a namespace failed.

@talha0324 commented

Since it pushed logs to the backupStorageLocation, I'm fairly sure this is not related to the backupStorageLocation, though I might be wrong.
You can look through those logs for errors.

@patrickheinzelmann commented

I looked into the error log of the backup creation and of the Velero pod:

Name:         default-202103091015
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.17.13
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=17

Phase:  Failed (run `velero backup logs default-202103091015` for more information)

Errors:    0
Warnings:  1

Namespaces:
  Included:  default
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2021-03-09 11:16:13 +0100 CET
Completed:  2021-03-09 11:16:23 +0100 CET

Expiration:  2021-04-08 12:16:13 +0200 CEST

Total items to be backed up:  76
Items backed up:              76

Velero-Native Snapshots: <none included>

But I can't find any error message beside the backup location messages.

velero-default-202109031015.log

@patrickheinzelmann commented

@talha0324 I agree that the problem seems unrelated to the backup location, but this is the only message I can see in the log file.

@talha0324 commented

As your backup says:
Phase: Failed (run velero backup logs default-202103091015 for more information)

@patrickheinzelmann commented Mar 9, 2021

@talha0324 I know that the backup fails, but I can't see any error when I execute "velero backup logs default-202103091015". I have attached the output in the file velero-default-202109031015.log.

@talha0324 commented

I did go through that file and I'm not sure about this.

@dsu-igeek commented

It looks like you're using the Azure plug-in. There was an issue where the Azure plug-in needed more memory than we were giving the Velero containers by default. That was corrected in 1.5.3, where the default amount of memory was increased to 512M, but I don't see anything in the Helm chart that sets the amount of memory.

There is some info about the error messages you would see in the PR: vmware-tanzu/velero#3234. You will see the messages in the Velero pod's log. They happen after the backup has finished, while the data is being uploaded to the blob store, so they are not tracked properly.

Check the amount of memory on your Velero pod with kubectl describe. It should be set to 512Mi for use with the Azure plug-in.
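
A sketch of that check (the label selector is an assumption; match it to your deployment's labels):

# Show the resource requests/limits on the Velero pod
kubectl -n velero describe pod -l app.kubernetes.io/name=velero | grep -A 6 'Limits'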

@patrickheinzelmann commented

@dsu-igeek The memory is limited to 256Mi.

    Limits:
      cpu:     1
      memory:  256Mi
    Requests:
      cpu:     500m
      memory:  128Mi

I followed the installation guide for the Azure plugin: https://github.com/vmware-tanzu/velero-plugin-for-microsoft-azure#setup. However, increasing the memory wasn't mentioned there.

@dsu-igeek commented

It's fixed in the main Velero install as of 1.5.3. The Helm chart is not setting the resource limit properly; I added an issue for the Helm chart: #231.
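
Until the chart default is fixed, the limit can be raised through the chart's resources value; a sketch for values.yaml (treat the exact numbers as assumptions, with 512Mi matching the fix mentioned above):

resources:
  requests:
    cpu: 500m
    memory: 128Mi
  limits:
    cpu: "1"
    memory: 512Mi   # raised from the 256Mi default for the Azure plug-in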

@jenting (Collaborator) commented Mar 10, 2021

@patrickheinzelmann, has your issue been addressed after increasing the memory limit?

@patrickheinzelmann commented

@jenting Yes, after increasing the memory, the backup data is uploaded to Azure Storage.
