yahoo-eng-team team mailing list archive
Message #79239
[Bug 1836212] [NEW] libvirt: Failure to recover from failed detach
Public bug reported:
1020162 ERROR root [req-46fbc6c8-de2c-4afb-9f24-9d75947c9a3c 9ccddbb72e2d42b6ab1a31ad48ea21fb 86bea4eb057b412a98402a1b7e1d9222 - - -] Original exception being dropped:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 390, in _try_detach_device
    self.detach_device(conf, persistent=persistent, live=live)
  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 467, in detach_device
    self._domain.detachDeviceFlags(device_xml, flags=flags)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1194, in detachDeviceFlags
    if ret == -1: raise libvirtError ('virDomainDetachDeviceFlags() failed', dom=self)
libvirtError: invalid argument: no target device vdb
This appears to happen because detach_device_with_retry(live=True)
ultimately calls detachDeviceFlags(flags=VIR_DOMAIN_AFFECT_CONFIG |
VIR_DOMAIN_AFFECT_LIVE). 'no target device' is the error libvirt
generates when it fails to remove the device from the CONFIG
(persistent) domain.

This can happen because the first call with both flags succeeds, and
removes the device from the CONFIG domain, as soon as the removal from
the LIVE domain has been queued. The LIVE removal itself is
asynchronous. Consequently, a subsequent check may still find the
device: it has not yet been (and may never be) removed from the LIVE
domain, although it is already gone from the CONFIG domain. When we
retry with the same flags, the call fails on the CONFIG domain with
'no target device', so libvirt never attempts the LIVE removal again
and the detach can never succeed.
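
The sequence described above can be reproduced with a toy model. This
is only a sketch under stated assumptions: FakeDomain, FakeLibvirtError
and detach_with_retry are hypothetical stand-ins written for this
illustration, not the libvirt bindings or nova's actual retry code. The
model assumes the CONFIG removal is applied immediately, the LIVE
removal is merely queued, and the guest never completes it.

```python
# Toy model only: FakeDomain and FakeLibvirtError are hypothetical
# stand-ins, not the libvirt Python bindings.
AFFECT_LIVE = 1    # same value as libvirt's VIR_DOMAIN_AFFECT_LIVE
AFFECT_CONFIG = 2  # same value as libvirt's VIR_DOMAIN_AFFECT_CONFIG


class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError."""


class FakeDomain(object):
    """Tracks the persistent (config) and running (live) device sets."""

    def __init__(self, devices):
        self.config = set(devices)
        self.live = set(devices)

    def detach_device_flags(self, dev, flags):
        if flags & AFFECT_CONFIG:
            # Assumption: a device missing from CONFIG fails the whole call.
            if dev not in self.config:
                raise FakeLibvirtError(
                    "invalid argument: no target device %s" % dev)
            self.config.discard(dev)
        if flags & AFFECT_LIVE:
            # Assumption: the removal is only queued and the guest never
            # completes it, so self.live is left unchanged.
            pass


def detach_with_retry(domain, dev, max_attempts=3):
    """Retry with both flags until the device is gone from the live set."""
    flags = AFFECT_CONFIG | AFFECT_LIVE
    for _ in range(max_attempts):
        domain.detach_device_flags(dev, flags)
        if dev not in domain.live:
            return True
    return False


dom = FakeDomain(["vdb"])
try:
    outcome = "detached" if detach_with_retry(dom, "vdb") else "gave up"
except FakeLibvirtError as exc:
    outcome = str(exc)
print(outcome)
```

In this model the first attempt succeeds and empties the CONFIG set,
the check still sees vdb in the LIVE set, and the second attempt raises
before the LIVE removal is requested again.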
** Affects: nova
Importance: Undecided
Status: New
** Bug watch added: Red Hat Bugzilla #1669225
https://bugzilla.redhat.com/show_bug.cgi?id=1669225
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1836212