yahoo-eng-team team mailing list archive

[Bug 1836212] [NEW] libvirt: Failure to recover from failed detach

 

Public bug reported:

1020162 ERROR root [req-46fbc6c8-de2c-4afb-9f24-9d75947c9a3c
9ccddbb72e2d42b6ab1a31ad48ea21fb 86bea4eb057b412a98402a1b7e1d9222 - - -]
Original exception being dropped:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 390, in _try_detach_device
    self.detach_device(conf, persistent=persistent, live=live)
  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 467, in detach_device
    self._domain.detachDeviceFlags(device_xml, flags=flags)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1194, in detachDeviceFlags
    if ret == -1: raise libvirtError ('virDomainDetachDeviceFlags() failed', dom=self)
libvirtError: invalid argument: no target device vdb

This appears to happen because detach_device_with_retry(live=True)
ultimately calls detachDeviceFlags(flags=VIR_DOMAIN_AFFECT_CONFIG |
VIR_DOMAIN_AFFECT_LIVE). 'no target device' is the error libvirt
generates when it fails to remove the device from the CONFIG
(persistent) domain. This can happen because the first such call
succeeds, and removes the device from the CONFIG domain, as long as the
removal from the LIVE domain was queued, even though the LIVE removal is
an asynchronous operation. Consequently, a subsequent check for the
device may still find it: it has not yet been (and may never be) removed
from the LIVE domain, although it is already gone from the CONFIG
domain. A retry with both flags then fails on the CONFIG removal, which
prevents libvirt from attempting the LIVE removal again, and so the
detach will never succeed.
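
The sequence described above can be modelled with a small, self-contained
sketch. Nothing here is nova or libvirt code: FakeDomain, FakeLibvirtError,
detach_naive and detach_live_only_retry are hypothetical names made up for
illustration, and the model assumes that the CONFIG check runs before the
LIVE request and that the guest never completes the LIVE unplug. The second
helper is only one conceivable adjustment, not necessarily the fix adopted
upstream.

```python
# Flag values mirror libvirt's VIR_DOMAIN_AFFECT_LIVE (1) and
# VIR_DOMAIN_AFFECT_CONFIG (2); everything else is a toy model.
AFFECT_LIVE = 1
AFFECT_CONFIG = 2


class FakeLibvirtError(Exception):
    """Hypothetical stand-in for libvirtError."""


class FakeDomain(object):
    """Toy domain with separate CONFIG and LIVE device sets.

    Assumption: a LIVE detach is only queued and never completes,
    modelling a guest that ignores the unplug request.
    """

    def __init__(self, devices):
        self.config = set(devices)
        self.live = set(devices)
        self.live_requests = 0  # number of LIVE detaches queued

    def detach_device_flags(self, dev, flags):
        # Assumed ordering: the CONFIG lookup fails before LIVE is tried.
        if flags & AFFECT_CONFIG and dev not in self.config:
            raise FakeLibvirtError(
                'invalid argument: no target device %s' % dev)
        if flags & AFFECT_LIVE:
            self.live_requests += 1  # queued, but never completed
        if flags & AFFECT_CONFIG:
            self.config.discard(dev)  # CONFIG removal is immediate


def detach_naive(domain, dev, attempts=3):
    """Retry with CONFIG | LIVE every time, as the report describes."""
    for _ in range(attempts):
        domain.detach_device_flags(dev, AFFECT_CONFIG | AFFECT_LIVE)
        if dev not in domain.live:
            return True
    return False


def detach_live_only_retry(domain, dev, attempts=3):
    """Hypothetical variant: drop CONFIG once the device has left it."""
    for _ in range(attempts):
        flags = AFFECT_LIVE
        if dev in domain.config:
            flags |= AFFECT_CONFIG
        domain.detach_device_flags(dev, flags)
        if dev not in domain.live:
            return True
    return False
```

In this model detach_naive raises 'no target device' on its second
attempt, after a single LIVE request has been queued, whereas the variant
keeps re-requesting the LIVE removal on every attempt and simply reports
failure if the guest never releases the device.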

** Affects: nova
     Importance: Undecided
         Status: New

** Bug watch added: Red Hat Bugzilla #1669225
   https://bugzilla.redhat.com/show_bug.cgi?id=1669225

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1836212

Title:
  libvirt: Failure to recover from failed detach

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1836212/+subscriptions

