← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1642689] [NEW] ceph: volume detach fails with "libvirtError: operation failed: disk vdb not found"

 

Public bug reported:

Seeing this failure in this job:

tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshot_create_with_volume_in_use[compute
,id-b467b54c-07a4-446d-a1cf-651dedcc3ff1]

While detaching a volume on cleanup it times out waiting for the volume
status to go from 'in-use' back to 'available':

2016-11-17 11:58:59.110301 | Captured traceback:
2016-11-17 11:58:59.110310 | ~~~~~~~~~~~~~~~~~~~
2016-11-17 11:58:59.110322 |     Traceback (most recent call last):
2016-11-17 11:58:59.110342 |       File "tempest/common/waiters.py", line 189, in wait_for_volume_status
2016-11-17 11:58:59.110356 |         raise lib_exc.TimeoutException(message)
2016-11-17 11:58:59.110373 |     tempest.lib.exceptions.TimeoutException: Request timed out
2016-11-17 11:58:59.110405 |     Details: Volume db12eda4-4ce6-4f00-a4e0-9df115f230e5 failed to reach available status (current in-use) within the required time (196 s).

The volume detach request is here:

2016-11-17 11:58:59.031058 |     2016-11-17 11:38:55,018 8316 INFO     [tempest.lib.common.rest_client] Request (VolumesV1SnapshotTestJSON:_run_cleanups): 202 DELETE http://127.0.0.1:8774/v2.1/servers/584a65b5-07fa-4994-a2d5-1676d0e13a8c/os-volume_attachments/db12eda4-4ce6-4f00-a4e0-9df115f230e5 0.277s
2016-11-17 11:58:59.031103 |     2016-11-17 11:38:55,018 8316 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': '<omitted>', 'Accept': 'application/json', 'Content-Type': 'application/json'}
2016-11-17 11:58:59.031113 |             Body: None
2016-11-17 11:58:59.031235 |         Response - Headers: {'content-location': 'http://127.0.0.1:8774/v2.1/servers/584a65b5-07fa-4994-a2d5-1676d0e13a8c/os-volume_attachments/db12eda4-4ce6-4f00-a4e0-9df115f230e5', 'content-type': 'application/json', 'x-openstack-nova-api-version': '2.1', 'date': 'Thu, 17 Nov 2016 11:38:55 GMT', 'content-length': '0', 'status': '202', 'connection': 'close', 'x-compute-request-id': 'req-9f0541d3-6eec-4793-8852-7bd01708932e', 'openstack-api-version': 'compute 2.1', 'vary': 'X-OpenStack-Nova-API-Version'}
2016-11-17 11:58:59.031248 |             Body: 

Following the req-9f0541d3-6eec-4793-8852-7bd01708932e request ID to the
compute logs we see this detach failure:

http://logs.openstack.org/00/398800/1/gate/gate-tempest-dsvm-full-
devstack-plugin-ceph-ubuntu-
xenial/a387fb0/logs/screen-n-cpu.txt.gz?level=TRACE#_2016-11-17_11_39_00_649

2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [req-9f0541d3-6eec-4793-8852-7bd01708932e tempest-VolumesV1SnapshotTestJSON-1819335716 tempest-VolumesV1SnapshotTestJSON-1819335716] [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] Failed to detach volume db12eda4-4ce6-4f00-a4e0-9df115f230e5 from /dev/vdb
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] Traceback (most recent call last):
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/compute/manager.py", line 4757, in _driver_detach_volume
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     encryption=encryption)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1307, in detach_volume
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     wait_for_detach()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 385, in func
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     return evt.wait()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     return hubs.get_hub().switch()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     return self.greenlet.switch()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 136, in _run_loop
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     result = func(*self.args, **self.kw)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 356, in _func
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     result = f(*args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 380, in _do_wait_and_retry_detach
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     self.detach_device(config, persistent=False, live=live)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 401, in detach_device
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     self._domain.detachDeviceFlags(device_xml, flags=flags)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     rv = execute(f, *args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     six.reraise(c, e, tb)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     rv = meth(*args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1190, in detachDeviceFlags
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     if ret == -1: raise libvirtError ('virDomainDetachDeviceFlags() failed', dom=self)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] libvirtError: operation failed: disk vdb not found
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] 

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Failed%20to%20detach%20volume%5C%22%20AND%20message%3A%5C%22_do_wait_and_retry_detach%5C%22%20AND%20message%3A%5C%22libvirtError%3A%20operation%20failed%3A%20disk%5C%22%20AND%20message%3A%5C%22not%20found%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

21 hits in the last 7 days, check and gate, all failures, 90% in ceph
jobs.

Looks like that wait for detach code should be handling the not found
from libvirt and considering the job complete.

** Affects: nova
     Importance: Medium
         Status: Triaged


** Tags: ceph libvirt volumes

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1642689

Title:
  ceph: volume detach fails with "libvirtError: operation failed: disk
  vdb not found"

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  Seeing this failure in this job:

  tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshot_create_with_volume_in_use[compute
  ,id-b467b54c-07a4-446d-a1cf-651dedcc3ff1]

  While detaching a volume on cleanup it times out waiting for the
  volume status to go from 'in-use' back to 'available':

  2016-11-17 11:58:59.110301 | Captured traceback:
  2016-11-17 11:58:59.110310 | ~~~~~~~~~~~~~~~~~~~
  2016-11-17 11:58:59.110322 |     Traceback (most recent call last):
  2016-11-17 11:58:59.110342 |       File "tempest/common/waiters.py", line 189, in wait_for_volume_status
  2016-11-17 11:58:59.110356 |         raise lib_exc.TimeoutException(message)
  2016-11-17 11:58:59.110373 |     tempest.lib.exceptions.TimeoutException: Request timed out
  2016-11-17 11:58:59.110405 |     Details: Volume db12eda4-4ce6-4f00-a4e0-9df115f230e5 failed to reach available status (current in-use) within the required time (196 s).

  The volume detach request is here:

  2016-11-17 11:58:59.031058 |     2016-11-17 11:38:55,018 8316 INFO     [tempest.lib.common.rest_client] Request (VolumesV1SnapshotTestJSON:_run_cleanups): 202 DELETE http://127.0.0.1:8774/v2.1/servers/584a65b5-07fa-4994-a2d5-1676d0e13a8c/os-volume_attachments/db12eda4-4ce6-4f00-a4e0-9df115f230e5 0.277s
  2016-11-17 11:58:59.031103 |     2016-11-17 11:38:55,018 8316 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': '<omitted>', 'Accept': 'application/json', 'Content-Type': 'application/json'}
  2016-11-17 11:58:59.031113 |             Body: None
  2016-11-17 11:58:59.031235 |         Response - Headers: {'content-location': 'http://127.0.0.1:8774/v2.1/servers/584a65b5-07fa-4994-a2d5-1676d0e13a8c/os-volume_attachments/db12eda4-4ce6-4f00-a4e0-9df115f230e5', 'content-type': 'application/json', 'x-openstack-nova-api-version': '2.1', 'date': 'Thu, 17 Nov 2016 11:38:55 GMT', 'content-length': '0', 'status': '202', 'connection': 'close', 'x-compute-request-id': 'req-9f0541d3-6eec-4793-8852-7bd01708932e', 'openstack-api-version': 'compute 2.1', 'vary': 'X-OpenStack-Nova-API-Version'}
  2016-11-17 11:58:59.031248 |             Body: 

  Following the req-9f0541d3-6eec-4793-8852-7bd01708932e request ID to
  the compute logs we see this detach failure:

  http://logs.openstack.org/00/398800/1/gate/gate-tempest-dsvm-full-
  devstack-plugin-ceph-ubuntu-
  xenial/a387fb0/logs/screen-n-cpu.txt.gz?level=TRACE#_2016-11-17_11_39_00_649

  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [req-9f0541d3-6eec-4793-8852-7bd01708932e tempest-VolumesV1SnapshotTestJSON-1819335716 tempest-VolumesV1SnapshotTestJSON-1819335716] [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] Failed to detach volume db12eda4-4ce6-4f00-a4e0-9df115f230e5 from /dev/vdb
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] Traceback (most recent call last):
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/compute/manager.py", line 4757, in _driver_detach_volume
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     encryption=encryption)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1307, in detach_volume
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     wait_for_detach()
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 385, in func
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     return evt.wait()
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     return hubs.get_hub().switch()
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     return self.greenlet.switch()
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 136, in _run_loop
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     result = func(*self.args, **self.kw)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 356, in _func
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     result = f(*args, **kwargs)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 380, in _do_wait_and_retry_detach
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     self.detach_device(config, persistent=False, live=live)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 401, in detach_device
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     self._domain.detachDeviceFlags(device_xml, flags=flags)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     result = proxy_call(self._autowrap, f, *args, **kwargs)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     rv = execute(f, *args, **kwargs)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     six.reraise(c, e, tb)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     rv = meth(*args, **kwargs)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]   File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1190, in detachDeviceFlags
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]     if ret == -1: raise libvirtError ('virDomainDetachDeviceFlags() failed', dom=self)
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] libvirtError: operation failed: disk vdb not found
  2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] 

  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Failed%20to%20detach%20volume%5C%22%20AND%20message%3A%5C%22_do_wait_and_retry_detach%5C%22%20AND%20message%3A%5C%22libvirtError%3A%20operation%20failed%3A%20disk%5C%22%20AND%20message%3A%5C%22not%20found%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

  21 hits in the last 7 days, check and gate, all failures, 90% in ceph
  jobs.

  Looks like that wait for detach code should be handling the not found
  from libvirt and considering the job complete.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1642689/+subscriptions


Follow ups