CA-405417: add tracing of process holding device #731

Open
wants to merge 1 commit into
base: master
8 changes: 4 additions & 4 deletions drivers/lvutil.py
@@ -446,15 +446,15 @@ def _openExclusive(dev, retry):
     try:
         return os.open("%s" % dev, os.O_RDWR | os.O_EXCL)
     except OSError as ose:
-        opened_by = ''
+        ret = util.pread2(["lsof", dev])
Contributor


I think lsof can get stuck if you have a stuck remote filesystem (NFS/GFS2): as it walks /proc, it might try to resolve links that point into that filesystem.

Or at least fuser would get stuck, and we had to remove a use of it from XAPI to prevent XAPI from getting stuck:
xapi-project/xen-api#6197

Is it possible to run this command with a timeout, so that it can't block everything?
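
One way to bound the call, as a minimal sketch only: the helper below uses the standard-library `subprocess` module rather than `util.pread2`, and `run_with_timeout`, its `timeout_s` parameter, and the placeholder strings it returns are hypothetical names for illustration, not part of sm. It deliberately does not wait for the child after killing it, because a process blocked in uninterruptible I/O on a hung mount may not exit until that I/O returns, and waiting would reintroduce the hang.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=5.0):
    """Return cmd's stdout, or a short placeholder string on failure.

    Hypothetical helper for illustration; not part of the sm code base.
    """
    try:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError as e:
        # The command is missing or not executable.
        return "<could not run %s: %s>" % (cmd[0], e)

    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort: SIGKILL is only acted on once the child leaves
        # uninterruptible sleep, so do not block on proc.wait() here.
        proc.kill()
        proc.stdout.close()
        return "<timed out after %.1fs>" % timeout_s

    # The exit status is ignored on purpose: this is diagnostic output only.
    return out.strip()
```

In `_openExclusive` this could replace the unconditional call, for example `ret = run_with_timeout(["lsof", dev])`, and it could be moved inside the `ose.errno == 16` branch so that lsof only runs when the device is actually busy. The trade-off is that a child abandoned after a timeout is left unreaped until it exits.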

         if ose.errno == 16:
             if retry:
-                util.SMlog('Device %s is busy, settle and one shot retry' %
-                           dev)
+                util.SMlog('Device %s is busy, opened by %s. Settle and one shot retry' %
+                           (dev, ret))
                 util.pread2(['/usr/sbin/udevadm', 'settle'])
                 return _openExclusive(dev, False)
             else:
-                util.SMlog('Device %s is busy after retry' % dev)
+                util.SMlog('Device %s is busy after retry, opened by %s' % (dev, ret))

         util.SMlog('Opening device %s failed with %d' % (dev, ose.errno))
         raise xs_errors.XenError(