Random race condition in the restore with managed fields #8132

Open
mpryc opened this issue Aug 20, 2024 · 3 comments · May be fixed by #8133
Labels: 1.16-candidate, Bug, Needs triage (we need discussion to understand the problem and decide the priority)

Comments

@mpryc
Contributor

mpryc commented Aug 20, 2024

What steps did you take and what happened:
A race condition in the restore operation causes intermittent errors in the logs. Below is a stripped-down log showing the error during a restore:

time="2024-08-15T12:21:15Z" level=error msg="error patch for managed fields test-velero: Operation cannot be fulfilled on namespaces \"test-velero\": the object has been modified; please apply your changes to the latest version and try again" logSource="/remote-source/velero/app/pkg/restore/restore.go:1682" restore=openshift-adp-2/mysql-test-velero-11abd1b8-3a6e-a161-1b3c-435a2812c5a0

[...]

time="2024-08-15T12:22:43Z" level=error msg="Cluster resource restore error: Operation cannot be fulfilled on namespaces \"test-velero\": the object has been modified; please apply your changes to the latest version and try again" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:587" restore=openshift-adp-2/mysql-test-velero-11abd1b8-3a6e-a161-1b3c-435a2812c5a0

What did you expect to happen:
Managed fields are restored properly, with no error message in the logs.

mpryc added a commit to mpryc/velero that referenced this issue Aug 20, 2024
This commit addresses issue vmware-tanzu#8132, where an error randomly
appears in the logs during the restore operation.

The error occurs due to a race condition when attempting
to patch managed fields on an object that has been modified
in the cluster. The error message indicates that the operation
cannot be fulfilled because the object has been modified,
suggesting that changes should be applied to the latest version.

To resolve this, a retry mechanism has been implemented in the restore
process when encountering this error, ensuring that managed fields
are properly restored without the error message appearing in the logs.

Signed-off-by: Michal Pryc <[email protected]>
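
For readers unfamiliar with this class of error, here is a minimal sketch of the retry-on-conflict pattern the commit message describes. It is not the code from the linked pull request. Everything in it (`fakeServer`, `object`, `errConflict`, `retryOnConflict`, `restoreManagedFields`) is a hypothetical stand-in built on the Go standard library only, and the attempt count and delay are arbitrary. In a real Kubernetes client, the same idea is usually expressed with `retry.RetryOnConflict` from `k8s.io/client-go/util/retry` and `apierrors.IsConflict` from `k8s.io/apimachinery/pkg/api/errors`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConflict is a hypothetical stand-in for the API server's conflict error
// ("the object has been modified; please apply your changes to the latest version").
var errConflict = errors.New("the object has been modified")

// object is a toy resource guarded by optimistic concurrency.
type object struct {
	resourceVersion int
	managedFields   string
}

// fakeServer holds a single object and rejects writes based on a stale version.
type fakeServer struct {
	obj object
}

// get returns a copy of the current object.
func (s *fakeServer) get() object { return s.obj }

// update succeeds only if the caller read the latest resourceVersion.
func (s *fakeServer) update(o object) error {
	if o.resourceVersion != s.obj.resourceVersion {
		return errConflict
	}
	o.resourceVersion++
	s.obj = o
	return nil
}

// retryOnConflict calls fn up to attempts times, sleeping delay between tries.
// It retries only on errConflict; any other result is returned immediately.
func retryOnConflict(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil || !errors.Is(err, errConflict) {
			return err
		}
		time.Sleep(delay)
	}
	return err
}

// restoreManagedFields re-reads the object on every attempt, so each write is
// based on the latest version. beforeWrite is a test hook that lets a caller
// simulate a concurrent writer between the read and the write.
func restoreManagedFields(s *fakeServer, fields string, beforeWrite func()) error {
	return retryOnConflict(5, time.Millisecond, func() error {
		latest := s.get()
		if beforeWrite != nil {
			beforeWrite()
		}
		latest.managedFields = fields
		return s.update(latest)
	})
}

func main() {
	s := &fakeServer{obj: object{resourceVersion: 1}}
	interfered := false
	// Simulate another writer modifying the object once, between our read and our write.
	racer := func() {
		if !interfered {
			interfered = true
			s.obj.resourceVersion++
		}
	}
	err := restoreManagedFields(s, "restored", racer)
	fmt.Println(err, s.obj.managedFields, s.obj.resourceVersion)
}
```

The important detail is that the read happens inside the retried function. Retrying the write with the originally fetched object would fail with the same conflict on every attempt.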
@mpryc mpryc linked a pull request Aug 20, 2024 that will close this issue
3 tasks
mpryc added a commit to mpryc/velero that referenced this issue Aug 21, 2024
mpryc added a commit to mpryc/velero that referenced this issue Aug 21, 2024
@blackpiglet
Contributor

error patch for managed fields test-velero

Could you check which resource the name test-velero refers to?

@reasonerjt
Contributor

@shubham-pampattiwar there seems to be some disagreement in PR #8133

reasonerjt added the Needs triage label Dec 6, 2024
@shubham-pampattiwar
Collaborator

@shubham-pampattiwar there seems to be some disagreement in PR #8133

Tagged @mpryc #8133 (comment)


4 participants