yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #59248
[Bug 1642689] Re: ceph: volume detach fails with "libvirtError: operation failed: disk vdb not found"
** Also affects: nova/newton
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1642689
Title:
ceph: volume detach fails with "libvirtError: operation failed: disk
vdb not found"
Status in OpenStack Compute (nova):
In Progress
Status in OpenStack Compute (nova) newton series:
In Progress
Bug description:
Seeing this failure in this job:
tempest.api.volume.test_volumes_snapshots.VolumesV1SnapshotTestJSON.test_snapshot_create_with_volume_in_use[compute
,id-b467b54c-07a4-446d-a1cf-651dedcc3ff1]
While detaching a volume on cleanup it times out waiting for the
volume status to go from 'in-use' back to 'available':
2016-11-17 11:58:59.110301 | Captured traceback:
2016-11-17 11:58:59.110310 | ~~~~~~~~~~~~~~~~~~~
2016-11-17 11:58:59.110322 | Traceback (most recent call last):
2016-11-17 11:58:59.110342 | File "tempest/common/waiters.py", line 189, in wait_for_volume_status
2016-11-17 11:58:59.110356 | raise lib_exc.TimeoutException(message)
2016-11-17 11:58:59.110373 | tempest.lib.exceptions.TimeoutException: Request timed out
2016-11-17 11:58:59.110405 | Details: Volume db12eda4-4ce6-4f00-a4e0-9df115f230e5 failed to reach available status (current in-use) within the required time (196 s).
The volume detach request is here:
2016-11-17 11:58:59.031058 | 2016-11-17 11:38:55,018 8316 INFO [tempest.lib.common.rest_client] Request (VolumesV1SnapshotTestJSON:_run_cleanups): 202 DELETE http://127.0.0.1:8774/v2.1/servers/584a65b5-07fa-4994-a2d5-1676d0e13a8c/os-volume_attachments/db12eda4-4ce6-4f00-a4e0-9df115f230e5 0.277s
2016-11-17 11:58:59.031103 | 2016-11-17 11:38:55,018 8316 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': '<omitted>', 'Accept': 'application/json', 'Content-Type': 'application/json'}
2016-11-17 11:58:59.031113 | Body: None
2016-11-17 11:58:59.031235 | Response - Headers: {'content-location': 'http://127.0.0.1:8774/v2.1/servers/584a65b5-07fa-4994-a2d5-1676d0e13a8c/os-volume_attachments/db12eda4-4ce6-4f00-a4e0-9df115f230e5', 'content-type': 'application/json', 'x-openstack-nova-api-version': '2.1', 'date': 'Thu, 17 Nov 2016 11:38:55 GMT', 'content-length': '0', 'status': '202', 'connection': 'close', 'x-compute-request-id': 'req-9f0541d3-6eec-4793-8852-7bd01708932e', 'openstack-api-version': 'compute 2.1', 'vary': 'X-OpenStack-Nova-API-Version'}
2016-11-17 11:58:59.031248 | Body:
Following the req-9f0541d3-6eec-4793-8852-7bd01708932e request ID to
the compute logs we see this detach failure:
http://logs.openstack.org/00/398800/1/gate/gate-tempest-dsvm-full-
devstack-plugin-ceph-ubuntu-
xenial/a387fb0/logs/screen-n-cpu.txt.gz?level=TRACE#_2016-11-17_11_39_00_649
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [req-9f0541d3-6eec-4793-8852-7bd01708932e tempest-VolumesV1SnapshotTestJSON-1819335716 tempest-VolumesV1SnapshotTestJSON-1819335716] [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] Failed to detach volume db12eda4-4ce6-4f00-a4e0-9df115f230e5 from /dev/vdb
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] Traceback (most recent call last):
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/opt/stack/new/nova/nova/compute/manager.py", line 4757, in _driver_detach_volume
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] encryption=encryption)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1307, in detach_volume
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] wait_for_detach()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 385, in func
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] return evt.wait()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] return hubs.get_hub().switch()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] return self.greenlet.switch()
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 136, in _run_loop
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] result = func(*self.args, **self.kw)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 356, in _func
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] result = f(*args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 380, in _do_wait_and_retry_detach
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] self.detach_device(config, persistent=False, live=live)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 401, in detach_device
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] self._domain.detachDeviceFlags(device_xml, flags=flags)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] rv = execute(f, *args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] six.reraise(c, e, tb)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] rv = meth(*args, **kwargs)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1190, in detachDeviceFlags
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] if ret == -1: raise libvirtError ('virDomainDetachDeviceFlags() failed', dom=self)
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c] libvirtError: operation failed: disk vdb not found
2016-11-17 11:39:00.649 2249 ERROR nova.compute.manager [instance: 584a65b5-07fa-4994-a2d5-1676d0e13a8c]
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Failed%20to%20detach%20volume%5C%22%20AND%20message%3A%5C%22_do_wait_and_retry_detach%5C%22%20AND%20message%3A%5C%22libvirtError%3A%20operation%20failed%3A%20disk%5C%22%20AND%20message%3A%5C%22not%20found%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d
21 hits in the last 7 days, check and gate, all failures, 90% in ceph
jobs.
Looks like that wait for detach code should be handling the not found
from libvirt and considering the job complete.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1642689/+subscriptions
References