instrument pycloudlib failures in lxc delete /mnt/path --force #231

Open
wants to merge 1 commit into
base: main
from

Conversation

holmanb
Member

@holmanb holmanb commented Jan 12, 2023

Our integration tests are currently seeing lxd vm failures like the following:

failed on teardown with "RuntimeError: Failure (rc=1): Error: Stopping the instance failed: Failed unmounting instance: Failed to unmount "/var/snap/lxd/common/lxd/storage-pools/default/virtual-machines/cloudinit-0111-2315175ouu6z5k": device or resource busy"

Another example

Since lxd doesn't include diagnostic information required to understand why this is happening, I propose temporarily parsing the error message for the mount point and checking the mount for open files.
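As a rough illustration of that idea (not necessarily the code in this PR), the sketch below pulls the mount point out of an error message shaped like the one quoted above and then asks `lsof` which files are open on that filesystem. The function names and the regex are hypothetical, and the regex assumes lxd keeps quoting the path after "Failed to unmount".

```python
import re
import shutil
import subprocess
from typing import Optional

# Assumption: lxd quotes the mount point right after "Failed to unmount",
# as in the error message shown in this PR description.
_UNMOUNT_RE = re.compile(r'Failed to unmount "(?P<path>[^"]+)"')


def parse_mount_point(error_message: str) -> Optional[str]:
    """Return the mount point named in an lxd unmount error, or None."""
    match = _UNMOUNT_RE.search(error_message)
    return match.group("path") if match else None


def open_files_report(mount_point: str) -> str:
    """Best-effort listing of open files on the filesystem at mount_point."""
    if shutil.which("lsof") is None:
        return "lsof not available"
    # Given a mounted-on directory, lsof lists files open on that filesystem.
    result = subprocess.run(
        ["lsof", mount_point], capture_output=True, text=True, check=False
    )
    return result.stdout or result.stderr or "no open files reported"
```

A caller in the failure path could log `open_files_report(path)` whenever `parse_mount_point(str(error))` returns a path, and otherwise re-raise unchanged, so the instrumentation never masks the original error.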

Update: almost two years later, the underlying issue hasn't been resolved, and we are now seeing similar issues on lxd VM restarts as well. I just revived this PR because I'd like to get to the bottom of these failures, but for that we need more information. Triaging constant integration test failures is a waste of time, and I don't see any better proposals at this point.

No tests because this is just some simple instrumentation that I plan to rip out as soon as the underlying issue is resolved. It should only run in the failure path anyways so risk of breaking things is low.

This is a pretty ugly hack. Parsing lxd's error messages isn't ideal, so I didn't go further than open file checking before getting eyes on this. We may want/need to add checks for the various other causes of EBUSY. Open files are a common cause of EBUSY, so this could be enough; however, if a reviewer is okay with the general approach, I can also move forward with something a little more comprehensive than just checking for open files.
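One other well-known cause of EBUSY on unmount is a mount nested below the target. A minimal sketch of such a check, assuming the text of `/proc/self/mounts` is passed in, could look like the following; `nested_mounts` is a hypothetical helper and is not part of this PR.

```python
from typing import List


def nested_mounts(mount_point: str, mounts_text: str) -> List[str]:
    """Return mount targets located strictly below mount_point."""
    prefix = mount_point.rstrip("/") + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line is: device, target, fstype, options, dump, pass.
        # Note: spaces in targets appear escaped as \040 and are not decoded here.
        if len(fields) >= 2 and fields[1].startswith(prefix):
            found.append(fields[1])
    return found
```

Taking the mounts table as a string keeps the helper easy to exercise without root; in the failure path it could be fed the contents of `/proc/self/mounts` read at the time of the error.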

@blackboxsw blackboxsw self-assigned this Jan 12, 2023
@holmanb holmanb force-pushed the holmanb/instrument-introspection branch from 1e2614c to 4b5b0bb Compare January 18, 2023 02:29
@TheRealFalcon
Member

@holmanb is this PR still relevant?

@holmanb
Member Author

holmanb commented Mar 6, 2023

> @holmanb is this PR still relevant?

I think @blackboxsw had a preference for a different approach. I'll close it.

@holmanb holmanb closed this Mar 6, 2023
@holmanb holmanb reopened this Nov 23, 2024
@holmanb holmanb force-pushed the holmanb/instrument-introspection branch 3 times, most recently from 6ad7db0 to e25472f Compare November 23, 2024 02:19
@holmanb holmanb force-pushed the holmanb/instrument-introspection branch from e25472f to c0a4d8f Compare November 23, 2024 02:27
3 participants