[Bug 1770211] [NEW] AttachVolumeMultiAttachTest.test_resize_server_with_multiattached_volume intermittently fails with "Unable to detach from guest transient domain."

Public bug reported:

http://logs.openstack.org/37/522537/27/check/nova-multiattach/7af78b6/job-output.txt.gz#_2018-05-08_17_39_54_410553

2018-05-08 17:39:54.410553 | primary | {2} tempest.api.compute.volumes.test_attach_volume.AttachVolumeMultiAttachTest.test_resize_server_with_multiattached_volume [672.593719s] ... FAILED
2018-05-08 17:39:54.412202 | primary |
2018-05-08 17:39:54.412269 | primary | Captured traceback-2:
2018-05-08 17:39:54.412327 | primary | ~~~~~~~~~~~~~~~~~~~~~
2018-05-08 17:39:54.412393 | primary |     Traceback (most recent call last):
2018-05-08 17:39:54.412483 | primary |       File "tempest/common/waiters.py", line 211, in wait_for_volume_resource_status
2018-05-08 17:39:54.412559 | primary |         raise lib_exc.TimeoutException(message)
2018-05-08 17:39:54.412629 | primary |     tempest.lib.exceptions.TimeoutException: Request timed out
2018-05-08 17:39:54.412755 | primary |     Details: volume b4c0ac0e-5814-4092-9fef-658691f2b702 failed to reach available status (current detaching) within the required time (196 s).
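
For context, that traceback is just tempest's generic polling waiter giving up: it re-fetches the volume until the status reaches the target or the timeout expires. In shape it is something like this (a simplified sketch, not tempest's actual code; get_volume is a stand-in for the volumes client's show_volume call):

import time


class TimeoutException(Exception):
    """Stand-in for tempest.lib.exceptions.TimeoutException."""


def wait_for_volume_status(get_volume, volume_id, target_status,
                           timeout=196, interval=1):
    """Poll a volume until it reaches target_status or the timeout expires.

    get_volume is any callable returning a dict with a 'status' key.
    """
    start = time.time()
    status = None
    while time.time() - start < timeout:
        status = get_volume(volume_id)['status']
        if status == target_status:
            return
        time.sleep(interval)
    raise TimeoutException(
        'volume %s failed to reach %s status (current %s) within the '
        'required time (%s s).' % (volume_id, target_status, status, timeout))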

http://logs.openstack.org/37/522537/27/check/nova-multiattach/7af78b6/logs/screen-n-cpu.txt.gz?level=TRACE#_May_08_17_31_46_919238

May 08 17:31:46.919238 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]: WARNING nova.virt.block_device [None req-45144cb8-5878-4f55-9f5b-adac59c02685 tempest-ServerDiskConfigTestJSON-1657103191 tempest-ServerDiskConfigTestJSON-1657103191] [instance: 675ac2f4-9483-4766-b31c-714cb314c53d] Guest refused to detach volume b4c0ac0e-5814-4092-9fef-658691f2b702: DeviceDetachFailed: Device detach failed for vdb: Unable to detach from guest transient domain.
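
That warning comes out of the libvirt driver's detach retry helper: detaching a device from a running domain is asynchronous (the guest has to acknowledge the unplug), so nova issues the detach, re-reads the domain definition, and retries until the device is gone, finally raising DeviceDetachFailed as seen above. Roughly (an illustrative sketch, not nova's actual guest.py code; names and counts are made up):

import time


class DeviceDetachFailed(Exception):
    """Stand-in for nova.exception.DeviceDetachFailed."""


def detach_device_with_retry(detach, device_exists, dev,
                             attempts=8, interval=2):
    """Retry an asynchronous detach until the device disappears.

    detach() issues the (async) detach request against the live domain;
    device_exists() re-reads the domain definition and reports whether
    the target device is still attached.
    """
    for _ in range(attempts):
        detach()
        time.sleep(interval)  # give QEMU/the guest time to release it
        if not device_exists(dev):
            return            # detach completed
    # Mirrors the message seen in the nova-compute log above.
    raise DeviceDetachFailed(
        'Device detach failed for %s: Unable to detach from guest '
        'transient domain.' % dev)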

I'm not sure why the log says "ServerDiskConfigTestJSON" in it; that could be due to our known issue with cached oslo.context request ID information in the service code. But this was definitely a multiattach volume, as seen in the volume GET response (note "multiattach": true):

RESP BODY: {"volume": {
    "status": "in-use",
    "user_id": "97d56cb74d094f3bb5412594d0d69105",
    "attachments": [{
        "server_id": "675ac2f4-9483-4766-b31c-714cb314c53d",
        "attachment_id": "cc3cc4ac-cbf7-4213-a278-3b6c4822faad",
        "attached_at": "2018-05-08T17:29:36.000000",
        "host_name": "ubuntu-xenial-ovh-gra1-0003921747",
        "volume_id": "b4c0ac0e-5814-4092-9fef-658691f2b702",
        "device": "/dev/vdb",
        "id": "b4c0ac0e-5814-4092-9fef-658691f2b702"}],
    "links": [
        {"href": "http://149.202.186.74/volume/v3/2c7c9754edbc493b9f7b35fa1860ce2e/volumes/b4c0ac0e-5814-4092-9fef-658691f2b702", "rel": "self"},
        {"href": "http://149.202.186.74/volume/2c7c9754edbc493b9f7b35fa1860ce2e/volumes/b4c0ac0e-5814-4092-9fef-658691f2b702", "rel": "bookmark"}],
    "availability_zone": "nova",
    "bootable": "false",
    "encrypted": false,
    "created_at": "2018-05-08T17:29:05.000000",
    "description": null,
    "os-vol-tenant-attr:tenant_id": "2c7c9754edbc493b9f7b35fa1860ce2e",
    "updated_at": "2018-05-08T17:43:46.000000",
    "volume_type": "lvmdriver-1",
    "name": "tempest-AttachVolumeMultiAttachTest-volume-317264940",
    "replication_status": null,
    "consistencygroup_id": null,
    "source_volid": null,
    "snapshot_id": null,
    "multiattach": true,
    "metadata": {},
    "id": "b4c0ac0e-5814-4092-9fef-658691f2b702",
    "size": 1}}

This is where we start detaching the volume:

http://logs.openstack.org/37/522537/27/check/nova-multiattach/7af78b6/logs/screen-n-cpu.txt.gz#_May_08_17_30_05_620402

May 08 17:30:05.620402 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]: INFO nova.virt.block_device [None req-5ef02975-4f91-4b24-963b-1cf67fcaac1a tempest-AttachVolumeMultiAttachTest-169541097 tempest-AttachVolumeMultiAttachTest-169541097] [instance: 675ac2f4-9483-4766-b31c-714cb314c53d] Attempting to driver detach volume b4c0ac0e-5814-4092-9fef-658691f2b702 from mountpoint /dev/vdb
May 08 17:30:05.630483 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]: DEBUG nova.virt.libvirt.guest [None req-5ef02975-4f91-4b24-963b-1cf67fcaac1a tempest-AttachVolumeMultiAttachTest-169541097 tempest-AttachVolumeMultiAttachTest-169541097] Attempting initial detach for device vdb {{(pid=27633) detach_device_with_retry /opt/stack/new/nova/nova/virt/libvirt/guest.py:426}}
May 08 17:30:05.632159 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]: DEBUG nova.virt.libvirt.guest [None req-5ef02975-4f91-4b24-963b-1cf67fcaac1a tempest-AttachVolumeMultiAttachTest-169541097 tempest-AttachVolumeMultiAttachTest-169541097] detach device xml: <disk type="block" device="disk">
May 08 17:30:05.632312 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]:   <driver name="qemu" type="raw" cache="none" io="native"/>
May 08 17:30:05.632489 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]:   <source dev="/dev/sde"/>
May 08 17:30:05.632647 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]:   <target bus="virtio" dev="vdb"/>
May 08 17:30:05.632793 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]:   <serial>b4c0ac0e-5814-4092-9fef-658691f2b702</serial>
May 08 17:30:05.632932 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]:   <shareable/>
May 08 17:30:05.633070 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]:   <address type="pci" domain="0x0000" bus="0x00" slot="0x05" function="0x0"/>
May 08 17:30:05.633196 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]: </disk>
May 08 17:30:05.633365 ubuntu-xenial-ovh-gra1-0003921747 nova-compute[27633]:  {{(pid=27633) detach_device /opt/stack/new/nova/nova/virt/libvirt/guest.py:477}}
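
Note that detach_device here ultimately maps to a libvirt-python detachDeviceFlags call against the live domain, which only *requests* the unplug; the guest has to complete it, which is why nova polls and retries afterwards. Bare-bones illustration (the domain name is a placeholder and error handling is omitted; the XML is the snippet logged above):

import libvirt

DISK_XML = """<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native"/>
  <source dev="/dev/sde"/>
  <target bus="virtio" dev="vdb"/>
</disk>"""

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-00000001')  # placeholder domain name

# AFFECT_LIVE asks the running guest to release the device; the call can
# return success before the device is actually gone from the domain.
dom.detachDeviceFlags(DISK_XML, libvirt.VIR_DOMAIN_AFFECT_LIVE)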

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Guest%20refused%20to%20detach%20volume%5C%22%20AND%20message%3A%5C%22Unable%20to%20detach%20from%20guest%20transient%20domain.%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

8 hits in the last 7 days, across multiple changes, all failures, mostly
in the nova-multiattach job.

** Affects: nova
     Importance: Medium
         Status: Confirmed


** Tags: libvirt multiattach volumes

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1770211

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1770211/+subscriptions