Create volume snapshots in parallel #4242

Closed
wants to merge 4 commits into from

@jglick jglick commented Oct 13, 2021

Experimental patch to run volume snapshotter plugins in parallel, at least for the volume creation. Necessary for vmware-tanzu/velero-plugin-for-aws#90 to be practical since it must wait for an EBS snapshot to be complete before initiating a request to copy it to another region, and this often blocks for about 2m37s (apparently regardless of the size of the delta, or number of snapshots currently pending).
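A minimal sketch of the idea, not the code in this patch: each volume's snapshot call runs in its own goroutine, with a cap on how many run at once, so one slow call does not hold up the rest. `snapshotFunc`, `snapshotAll`, and the `limit` parameter are hypothetical names used for illustration; they are not Velero APIs.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// snapshotFunc is a hypothetical stand-in for a volume snapshotter plugin
// call that creates a snapshot and returns its ID. It is not a Velero type.
type snapshotFunc func(volumeID string) (string, error)

// snapshotAll calls create once per volume, running at most limit calls
// concurrently. It returns snapshot IDs and errors, both keyed by volume ID.
func snapshotAll(volumeIDs []string, limit int, create snapshotFunc) (map[string]string, map[string]error) {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	snapshots := make(map[string]string)
	failures := make(map[string]error)

	var mu sync.Mutex                 // guards both maps
	var wg sync.WaitGroup             // waits for all goroutines to finish
	sem := make(chan struct{}, limit) // counting semaphore

	for _, id := range volumeIDs {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done

			snap, err := create(id)

			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				failures[id] = err
				return
			}
			snapshots[id] = snap
		}(id)
	}
	wg.Wait()
	return snapshots, failures
}

func main() {
	// Simulated slow plugin call; a real one might block for minutes.
	slow := func(volumeID string) (string, error) {
		time.Sleep(50 * time.Millisecond)
		return "snap-" + volumeID, nil
	}

	start := time.Now()
	snaps, errs := snapshotAll([]string{"vol-a", "vol-b", "vol-c", "vol-d"}, 4, slow)
	fmt.Printf("%d snapshots, %d errors, elapsed %v\n", len(snaps), len(errs), time.Since(start))
}
```

With the limit set to the number of volumes, the total elapsed time is roughly that of one call rather than the sum of all calls, which is the property that matters when each call blocks for minutes.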

jglick added a commit to jglick/velero-plugin-for-aws that referenced this pull request Oct 14, 2021
@dsu-igeek
Contributor

We're redoing architecture to support this feature; this is not the direction for handling this.

@dsu-igeek dsu-igeek closed this Oct 18, 2021
@jglick
Author

jglick commented Oct 19, 2021

> We're redoing architecture to support this feature

Good news! Do you have anything I can look at yet?

@jglick
Author

jglick commented Oct 25, 2021

@jglick
Author

jglick commented Oct 27, 2021

In fact, something like this patch is necessary even for the AWS plugin as currently designed (with no cross-zone/region functionality). API calls such as DescribeVolumes (example), which are normally near-instantaneous (only waiting for one HTTP request to complete), can nonetheless fail with RequestLimitExceeded under load, as I found when I tried running backups on a cluster with ~100 PVs, and as others have apparently seen as well: #1135 (comment). The remedy is to pause and retry (example): vmware-tanzu/velero-plugin-for-aws@b5d7c52 is working reliably for me under load, but requires this core patch to be practical. Note that even implementing #3533 would not suffice for that case: the issue is not that the backup has been successfully initiated and the volume snapshotter plugin is merely waiting for infrastructure to mark it as complete, but rather that the snapshot cannot even be initiated without waiting.
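The pause-and-retry remedy can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the referenced plugin commit: `errRequestLimitExceeded` is a hypothetical sentinel standing in for the AWS `RequestLimitExceeded` error code, and `retryThrottled` with its doubling delay is an assumed policy.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRequestLimitExceeded is a hypothetical sentinel standing in for the AWS
// "RequestLimitExceeded" error code; real code would inspect the SDK error.
var errRequestLimitExceeded = errors.New("RequestLimitExceeded")

// retryThrottled invokes call up to attempts times. It retries only when the
// call fails with errRequestLimitExceeded, sleeping between tries and doubling
// the delay each time. Any other result (success or a different error) is
// returned immediately.
func retryThrottled(attempts int, base time.Duration, call func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	delay := base
	var err error
	for i := 0; i < attempts; i++ {
		err = call()
		if err == nil || !errors.Is(err, errRequestLimitExceeded) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay) // pause before the next attempt
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Simulated API call that is throttled twice before succeeding.
	err := retryThrottled(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errRequestLimitExceeded
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

In a serial loop over volumes, every pause here delays all remaining volumes as well, which is why a retry of this kind is only practical when the plugin calls can run concurrently.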

@jglick
Author

jglick commented Nov 2, 2021

see #2888

@jglick
Author

jglick commented Apr 18, 2022

$ COSIGN_EXPERIMENTAL=1 cosign verify ghcr.io/jglick/velero:concurrent-snapshot

Verification for ghcr.io/jglick/velero:concurrent-snapshot --
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - Existence of the claims in the transparency log was verified offline
  - Any certificates were verified against the Fulcio roots.

[{"critical":{"identity":{"docker-reference":"ghcr.io/jglick/velero"},"image":{"docker-manifest-digest":"sha256:b8c86e7c129cd40ae2079f11a79b2f52ed2e28e4f1677a8ba7125eb1aaa95151"},…

@jglick
Author

jglick commented Jul 12, 2023

2 participants