[Bug 1807723] Re: swap multiattach volume intermittently fails when servers are on different hosts

 

Looking at the swap volume flow in nova again, I think

https://github.com/openstack/nova/blob/ae3064b7a820ea02f7fc8a1aa4a41f35a06534f1/nova/compute/manager.py#L5798-L5806

is likely intentional, since for volume1/server1 there is only a single
BDM record. It starts out with the old volume_id and attachment_id, and
once the swap is complete the compute manager code, rather than deleting
the old BDM for vol1/server1 and creating a new BDM for vol2/server1,
simply updates the BDM for vol1/server1 to point at vol2 (it updates the
connection_info, attachment_id and volume_id to be for vol2 instead of
vol1).

So when we delete server1, it should clean up volume2 via the BDM that
points at volume2, and when we delete server2, it should clean up
volume1 via the BDM that points at volume1.
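
To illustrate, here is a rough sketch (not the actual nova code) of that
in-place update at the end of ComputeManager.swap_volume; the field names
come from the "Updating volume ... BDM record with ..." DEBUG log quoted
further down in this bug, and the example values are just placeholders:

import json

def swap_volume_bdm_update(bdm, new_volume_id, new_attachment_id, new_cinfo):
    """Repoint the existing vol1/server1 BDM at vol2 instead of
    deleting it and creating a new one."""
    bdm.update({
        'connection_info': json.dumps(new_cinfo),  # now describes vol2
        'source_type': 'volume',
        'destination_type': 'volume',
        'snapshot_id': None,
        'no_device': None,
        'volume_id': new_volume_id,          # flips from vol1 to vol2
        'attachment_id': new_attachment_id,  # attachment for vol2/server1
    })
    return bdm

# Example: the single BDM record that used to describe vol1/server1.
bdm = {'volume_id': '2a48ae08-2833-488c-8771-eea86f41c7b8',
       'attachment_id': 'old-attachment-id',   # placeholder
       'connection_info': '{}'}
swap_volume_bdm_update(bdm,
                       'fa56e173-3bd4-4701-bc71-cb4810c53c51',
                       '769b75cc-4283-4b46-a975-243faae6e263',
                       {'target_iqn': 'iqn.2010-10.org.openstack:volume-'
                                      'fa56e173-3bd4-4701-bc71-cb4810c53c51'})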

The problem likely goes back to the old vol1/server1 attachment delete
failing here:

https://github.com/openstack/cinder/blob/9d4fd16bcd8eca930910798cc519cb5bc5846c59/cinder/volume/manager.py#L4528

Maybe we need to retry in that case if we're hitting a race? Maybe we
need to detect "tgtadm: this target is still active" here:

https://github.com/openstack/cinder/blob/9d4fd16bcd8eca930910798cc519cb5bc5846c59/cinder/volume/targets/tgt.py#L283

And retry?
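
Something like this, as a sketch only (not the existing cinder code;
run_tgtadm_delete and TargetStillActive are hypothetical stand-ins for
the tgt-admin call in cinder/volume/targets/tgt.py and its "this target
is still active" failure):

import time

class TargetStillActive(Exception):
    """tgtadm reported the target still has an active session."""

def remove_iscsi_target_with_retry(run_tgtadm_delete, iqn,
                                   attempts=5, interval=2):
    """Retry 'tgt-admin --force --delete <iqn>' while the target is busy."""
    for attempt in range(1, attempts + 1):
        try:
            run_tgtadm_delete(iqn)
            return
        except TargetStillActive:
            if attempt == attempts:
                raise
            # The initiator may not have finished logging out yet;
            # back off briefly and try again.
            time.sleep(interval)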

** Also affects: cinder
   Importance: Undecided
       Status: New

** Changed in: cinder
       Status: New => Confirmed

** Changed in: cinder
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1807723

Title:
  swap multiattach volume intermittently fails when servers are on
  different hosts

Status in Cinder:
  Confirmed
Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  This was found while debugging the
  tempest.api.compute.admin.test_volume_swap.TestMultiAttachVolumeSwap
  failures in the (multinode) tempest-slow job here:

  https://review.openstack.org/#/c/606981/

  The failure ultimately happens during teardown of the test class, when
  it fails to delete the "old" multiattach volume:

  http://logs.openstack.org/81/606981/4/check/tempest-
  slow/fafde23/controller/logs/screen-c-vol.txt.gz?level=TRACE#_Dec_08_01_45_41_380161

  Dec 08 01:45:41.380161 ubuntu-xenial-ovh-gra1-0001066161 cinder-volume[1776]: ERROR cinder.volume.targets.tgt [req-4aa23d64-309a-44a4-a7c9-381a6fbef330 req-f47e0ea9-7e80-4481-bce1-7a1060c6728b tempest-TestMultiAttachVolumeSwap-1803326020 None] Failed to remove iscsi target for Volume ID: 2a48ae08-2833-488c-8771-eea86f41c7b8: Unexpected error while running command.
  Dec 08 01:45:41.380475 ubuntu-xenial-ovh-gra1-0001066161 cinder-volume[1776]: Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --force --delete iqn.2010-10.org.openstack:volume-2a48ae08-2833-488c-8771-eea86f41c7b8
  Dec 08 01:45:41.380783 ubuntu-xenial-ovh-gra1-0001066161 cinder-volume[1776]: Exit code: 22
  Dec 08 01:45:41.381094 ubuntu-xenial-ovh-gra1-0001066161 cinder-volume[1776]: Stdout: u'Command:\n\ttgtadm -C 0 --mode target --op delete --tid=1\nexited with code: 22.\n'
  Dec 08 01:45:41.381386 ubuntu-xenial-ovh-gra1-0001066161 cinder-volume[1776]: Stderr: u'tgtadm: this target is still active\n': ProcessExecutionError: Unexpected error while running command.
  Dec 08 01:45:49.587081 ubuntu-xenial-ovh-gra1-0001066161 cinder-volume[1776]: ERROR cinder.volume.targets.tgt [None req-9dddc5b9-7e06-4ab9-83a6-ef211e0d4c48 tempest-TestMultiAttachVolumeSwap-1803326020 None] Failed to remove iscsi target for Volume ID: 2a48ae08-2833-488c-8771-eea86f41c7b8: Unexpected error while running command.
  Dec 08 01:45:49.587355 ubuntu-xenial-ovh-gra1-0001066161 cinder-volume[1776]: Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --force --delete iqn.2010-10.org.openstack:volume-2a48ae08-2833-488c-8771-eea86f41c7b8

  After digging through the logs, I can see at least one set of issues
  in the ComputeManager.swap_volume code here:

  https://github.com/openstack/nova/blob/ae3064b7a820ea02f7fc8a1aa4a41f35a06534f1/nova/compute/manager.py#L5798-L5806

  That code updates the "old" bdm with values from the new volume,
  including changing the bdm.volume_id from the old volume to the new
  one, which likely orphans the old record somehow, or messes up the
  cleanup.

  My detailed notes follow:

  server1:

  {"server": {"OS-EXT-STS:task_state": null, "addresses": {}, "links":
  [{"href":
  "https://149.202.185.218/compute/v2.1/servers/d46fba31-9469-4799-b2bf-
  1fbad4369a9a", "rel": "self"}, {"href":
  "https://149.202.185.218/compute/servers/d46fba31-9469-4799-b2bf-
  1fbad4369a9a", "rel": "bookmark"}], "image": {"id": "f55b4d12-f159
  -453f-86cb-8965684121c8", "links": [{"href":
  "https://149.202.185.218/compute/images/f55b4d12-f159-453f-86cb-
  8965684121c8", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "active",
  "OS-SRV-USG:launched_at": "2018-12-08T01:43:11.000000", "flavor":
  {"ephemeral": 0, "ram": 64, "original_name": "m1.nano", "vcpus": 1,
  "extra_specs": {}, "swap": 0, "disk": 0}, "id": "d46fba31-9469-4799
  -b2bf-1fbad4369a9a", "description": null, "user_id":
  "9bc61597b6794ca49c79ce599c26b223", "OS-DCF:diskConfig": "MANUAL",
  "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-
  STS:power_state": 1, "OS-EXT-AZ:availability_zone": "nova",
  "config_drive": "", "status": "ACTIVE", "updated":
  "2018-12-08T01:43:11Z", "hostId":
  "f4600b0fc614ce95d5be30fc1fc82bdb150c47ff8c84b22262be6d19", "OS-SRV-
  USG:terminated_at": null, "tags": [], "key_name": null, "locked":
  false, "name": "tempest-TestMultiAttachVolumeSwap-server-226416218-1",
  "created": "2018-12-08T01:43:04Z", "tenant_id":
  "5db98765c5d94a73bae8aba2f13c7762", "os-extended-
  volumes:volumes_attached": [], "metadata": {}}}

  server1.hostId = f4600b0fc614ce95d5be30fc1fc82bdb150c47ff8c84b22262be6d19
  server1 builds on the subnode (ubuntu-xenial-ovh-gra1-0001066278)

  server2:

  {"server": {"OS-EXT-STS:task_state": null, "addresses": {}, "links":
  [{"href": "https://149.202.185.218/compute/v2.1/servers/e0b74695-f9a0
  -4f4c-8a08-a0de3420cc6c", "rel": "self"}, {"href":
  "https://149.202.185.218/compute/servers/e0b74695-f9a0-4f4c-
  8a08-a0de3420cc6c", "rel": "bookmark"}], "image": {"id":
  "f55b4d12-f159-453f-86cb-8965684121c8", "links": [{"href":
  "https://149.202.185.218/compute/images/f55b4d12-f159-453f-86cb-
  8965684121c8", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "active",
  "OS-SRV-USG:launched_at": "2018-12-08T01:43:10.000000", "flavor":
  {"ephemeral": 0, "ram": 64, "original_name": "m1.nano", "vcpus": 1,
  "extra_specs": {}, "swap": 0, "disk": 0}, "id": "e0b74695-f9a0-4f4c-
  8a08-a0de3420cc6c", "description": null, "user_id":
  "9bc61597b6794ca49c79ce599c26b223", "OS-DCF:diskConfig": "MANUAL",
  "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-
  STS:power_state": 1, "OS-EXT-AZ:availability_zone": "nova",
  "config_drive": "", "status": "ACTIVE", "updated":
  "2018-12-08T01:43:11Z", "hostId":
  "5f1a8abdf427d6cbda8003bd178ef40c605f24f626ec88c3a02816df", "OS-SRV-
  USG:terminated_at": null, "tags": [], "key_name": null, "locked":
  false, "name": "tempest-TestMultiAttachVolumeSwap-server-226416218-2",
  "created": "2018-12-08T01:43:04Z", "tenant_id":
  "5db98765c5d94a73bae8aba2f13c7762", "os-extended-
  volumes:volumes_attached": [], "metadata": {}}}

  server2.hostId = 5f1a8abdf427d6cbda8003bd178ef40c605f24f626ec88c3a02816df
  server2 builds on the primary node (ubuntu-xenial-ovh-gra1-0001066161)

  So clearly the two servers are on different hosts. Also, volume1 is on
  the primary node (same as server2).

  attach volume1 to server1:

  Body: {"volumeAttachment": {"device": "/dev/vdb", "serverId":
  "d46fba31-9469-4799-b2bf-1fbad4369a9a", "id":
  "2a48ae08-2833-488c-8771-eea86f41c7b8", "volumeId":
  "2a48ae08-2833-488c-8771-eea86f41c7b8"}} _log_request_full
  tempest/lib/common/rest_client.py:437

  attach volume1 to server2:

  Body: {"volumeAttachment": {"device": "/dev/vdb", "serverId":
  "e0b74695-f9a0-4f4c-8a08-a0de3420cc6c", "id":
  "2a48ae08-2833-488c-8771-eea86f41c7b8", "volumeId":
  "2a48ae08-2833-488c-8771-eea86f41c7b8"}} _log_request_full
  tempest/lib/common/rest_client.py:437

  swap volume1 (2a48ae08-2833-488c-8771-eea86f41c7b8) to volume2
  (fa56e173-3bd4-4701-bc71-cb4810c53c51) on server1
  (d46fba31-9469-4799-b2bf-1fbad4369a9a):

  2018-12-08 01:43:28.336 8815 INFO tempest.lib.common.rest_client [req-b6cb2f89-ebf6-47b5-9757-8692dd91c812 ] Request (TestMultiAttachVolumeSwap:test_volume_swap_with_multiattach): 202 PUT https://149.202.185.218/compute/v2.1/servers/d46fba31-9469-4799-b2bf-1fbad4369a9a/os-volume_attachments/2a48ae08-2833-488c-8771-eea86f41c7b8 0.618s
  2018-12-08 01:43:28.337 8815 DEBUG tempest.lib.common.rest_client [req-b6cb2f89-ebf6-47b5-9757-8692dd91c812 ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>', 'X-OpenStack-Nova-API-Version': '2.60'}
          Body: {"volumeAttachment": {"volumeId": "fa56e173-3bd4-4701-bc71-cb4810c53c51"}}
      Response - Headers: {u'x-openstack-request-id': 'req-b6cb2f89-ebf6-47b5-9757-8692dd91c812', u'content-length': '0', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version', u'server': 'Apache/2.4.18 (Ubuntu)', u'x-compute-request-id': 'req-b6cb2f89-ebf6-47b5-9757-8692dd91c812', u'content-type': 'application/json', 'content-location': 'https://149.202.185.218/compute/v2.1/servers/d46fba31-9469-4799-b2bf-1fbad4369a9a/os-volume_attachments/2a48ae08-2833-488c-8771-eea86f41c7b8', u'date': 'Sat, 08 Dec 2018 01:43:27 GMT', 'status': '202', u'openstack-api-version': 'compute 2.60', u'connection': 'close', u'x-openstack-nova-api-version': '2.60'}

  Now the test waits for the swap to complete and for the attachments to
  all be recorded correctly; once the swap is done, volume1 should be
  attached to server2 and volume2 should be attached to server1.

  This is the connection_info for the old/new volume during the swap:

  Dec 08 01:43:31.321263 ubuntu-xenial-ovh-gra1-0001066278 nova-
  compute[20206]: DEBUG nova.compute.manager [None req-
  b6cb2f89-ebf6-47b5-9757-8692dd91c812 tempest-
  TestMultiAttachVolumeSwap-477435682 tempest-
  TestMultiAttachVolumeSwap-477435682] [instance: d46fba31-9469-4799
  -b2bf-1fbad4369a9a] swap_volume: Calling driver volume swap with
  connection infos:

  new: {'status': u'reserved', 'multiattach': True, 'detached_at': u'',
  u'volume_id': u'fa56e173-3bd4-4701-bc71-cb4810c53c51', 'attach_mode':
  u'null', 'driver_volume_type': u'iscsi', 'instance':
  u'd46fba31-9469-4799-b2bf-1fbad4369a9a', 'attached_at': u'', 'serial':
  u'fa56e173-3bd4-4701-bc71-cb4810c53c51', 'data': {u'access_mode':
  u'rw', u'target_discovered': False, u'encrypted': False, u'qos_specs':
  None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-
  fa56e173-3bd4-4701-bc71-cb4810c53c51', u'target_portal':
  u'213.32.74.238:3260', u'volume_id':
  u'fa56e173-3bd4-4701-bc71-cb4810c53c51', u'target_lun': 1,
  u'auth_password': u'***', u'auth_username': u'oAJ38A88Rom7xrwQ5A4Y',
  u'auth_method': u'CHAP'}};

  old: {u'status': u'reserved', u'instance': u'd46fba31-9469-4799-b2bf-
  1fbad4369a9a', u'multiattach': True, u'attached_at': u'',
  u'attach_mode': u'null', u'driver_volume_type': u'iscsi',
  u'detached_at': u'', u'volume_id':
  u'2a48ae08-2833-488c-8771-eea86f41c7b8', u'serial':
  u'2a48ae08-2833-488c-8771-eea86f41c7b8', u'data': {u'device_path':
  u'/dev/sda', u'target_discovered': False, u'encrypted': False,
  u'qos_specs': None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-
  2a48ae08-2833-488c-8771-eea86f41c7b8', u'target_portal':
  u'149.202.185.218:3260', u'volume_id':
  u'2a48ae08-2833-488c-8771-eea86f41c7b8', u'auth_password': u'***',
  u'target_lun': 1, u'access_mode': u'rw', u'auth_username':
  u'HMHytc239biDCT6NTgMa', u'auth_method': u'CHAP'}} {{(pid=20206)
  _swap_volume /opt/stack/nova/nova/compute/manager.py:5646}}

  When the test cleans up and tries to delete volume1, it fails trying
  to clean up iqn.2010-10.org.openstack:volume-
  2a48ae08-2833-488c-8771-eea86f41c7b8 from the subnode (which is the
  old connection to server1).

  volume1 connection is supposed to be detached from server1 here:

  http://logs.openstack.org/81/606981/4/check/tempest-
  slow/fafde23/compute1/logs/screen-n-cpu.txt.gz#_Dec_08_01_45_36_699753

  Dec 08 01:45:36.699753 ubuntu-xenial-ovh-gra1-0001066278 nova-
  compute[20206]: DEBUG os_brick.initiator.connectors.iscsi [None req-
  b6cb2f89-ebf6-47b5-9757-8692dd91c812 tempest-
  TestMultiAttachVolumeSwap-477435682 tempest-
  TestMultiAttachVolumeSwap-477435682] ==> disconnect_volume: call
  u"{'args': (<os_brick.initiator.connectors.iscsi.ISCSIConnector object
  at 0x7fe1aa096d50>, {u'device_path': u'/dev/sda',
  u'target_discovered': False, u'encrypted': False, u'qos_specs': None,
  u'target_iqn': u'iqn.2010-10.org.openstack:volume-
  2a48ae08-2833-488c-8771-eea86f41c7b8', u'target_portal':
  u'149.202.185.218:3260', u'volume_id':
  u'2a48ae08-2833-488c-8771-eea86f41c7b8', u'auth_password': u'***',
  u'target_lun': 1, u'access_mode': u'rw', u'auth_username':
  u'HMHytc239biDCT6NTgMa', u'auth_method': u'CHAP'}, None), 'kwargs':
  {}}" {{(pid=20206) trace_logging_wrapper /usr/local/lib/python2.7
  /dist-packages/os_brick/utils.py:146}}

  It says it logged out:

  Dec 08 01:45:37.336156 ubuntu-xenial-ovh-gra1-0001066278 nova-
  compute[20206]: Logout of [sid: 1, target: iqn.2010-10.org.openstack
  :volume-2a48ae08-2833-488c-8771-eea86f41c7b8, portal:
  149.202.185.218,3260] successful.

  This definitely looks wrong (post-swap, updating the old BDM record): it is updating the BDM for volume1 (2a48ae08-2833-488c-8771-eea86f41c7b8) but with the volume2 (fa56e173-3bd4-4701-bc71-cb4810c53c51) connection_info. For the new volume attach flow the connection_info might not matter, since we likely don't use it (we just work with the attachment records), but if
  the attachment_id (769b75cc-4283-4b46-a975-243faae6e263) is for the wrong volume, that could be a problem:

  Dec 08 01:45:37.699221 ubuntu-xenial-ovh-gra1-0001066278 nova-
  compute[20206]: DEBUG nova.compute.manager [None req-
  b6cb2f89-ebf6-47b5-9757-8692dd91c812 tempest-
  TestMultiAttachVolumeSwap-477435682 tempest-
  TestMultiAttachVolumeSwap-477435682] [instance: d46fba31-9469-4799
  -b2bf-1fbad4369a9a] swap_volume: Updating volume
  2a48ae08-2833-488c-8771-eea86f41c7b8 BDM record with {'attachment_id':
  u'769b75cc-4283-4b46-a975-243faae6e263', 'no_device': None,
  'connection_info': '{"status": "reserved", "multiattach": true,
  "detached_at": "", "volume_id":
  "fa56e173-3bd4-4701-bc71-cb4810c53c51", "attach_mode": "null",
  "driver_volume_type": "iscsi", "instance": "d46fba31-9469-4799-b2bf-
  1fbad4369a9a", "attached_at": "", "serial":
  "fa56e173-3bd4-4701-bc71-cb4810c53c51", "data": {"access_mode": "rw",
  "target_discovered": false, "encrypted": false, "qos_specs": null,
  "target_iqn": "iqn.2010-10.org.openstack:volume-
  fa56e173-3bd4-4701-bc71-cb4810c53c51", "target_portal":
  "213.32.74.238:3260", "volume_id":
  "fa56e173-3bd4-4701-bc71-cb4810c53c51", "target_lun": 1,
  "device_path": "/dev/sdb", "auth_password": "Ko6DLxWNfrbqWQ7E",
  "auth_username": "oAJ38A88Rom7xrwQ5A4Y", "auth_method": "CHAP"}}',
  'volume_id': u'fa56e173-3bd4-4701-bc71-cb4810c53c51', 'source_type':
  'volume', 'snapshot_id': None, 'destination_type': 'volume'}
  {{(pid=20206) swap_volume
  /opt/stack/nova/nova/compute/manager.py:5819}}

  Probably the bigger problem is that the BDM for volume1 is updated to
  have volume_id fa56e173-3bd4-4701-bc71-cb4810c53c51. Changing the
  volume_id on volume1's BDM might have something to do with orphaning
  volume1's connections when we go to delete, since deleting server2
  (which still has volume1 attached) likely doesn't clean up volume1
  properly.
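
  To make that hypothesis concrete, here is a purely illustrative sketch
  (not nova's actual delete path; detach_and_delete_attachment is a
  hypothetical helper) of why rewriting the BDM's volume_id could leave
  the old volume1/server1 attachment behind:

  def cleanup_volumes_on_instance_delete(bdms, detach_and_delete_attachment):
      # Delete walks the instance's current BDMs and tears down exactly
      # the volume_id/attachment_id pairs they record right now.
      for bdm in bdms:
          detach_and_delete_attachment(volume_id=bdm['volume_id'],
                                       attachment_id=bdm['attachment_id'])

  # After the swap, server1's single BDM records volume2 and the
  # attachment for volume2. If the old volume1/server1 attachment (or its
  # iscsi target/session on the backend) was not fully torn down during
  # the swap itself, no BDM points at it any longer, so nothing above
  # will ever clean it up, and the final volume1 delete hits
  # "tgtadm: this target is still active".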

  The other things I'm wondering about are:

  1. Is nova-compute deleting the old volume1/server1 attachment record
  here?

  https://github.com/openstack/nova/blob/ae3064b7a820ea02f7fc8a1aa4a41f35a06534f1/nova/compute/manager.py#L5691

  I ask because I didn't see anything about an attachment delete in the
  logs (one way to check this is sketched after these questions).

  2. Is there a periodic task in the cinder-volume service that
  recreates the target connection for volume1 and server1 after we swap
  the volume and nova-compute disconnects the old connection, but before
  nova-compute deletes the old attachment record?
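
  For question 1, one way to check while reproducing this locally is to
  ask cinder which attachment records it still has for volume1 after the
  swap completes; a minimal sketch using python-cinderclient's
  attachments API (microversion 3.27+), assuming an already-authenticated
  keystoneauth1 session (auth_session below is assumed):

  from cinderclient import client as cinder_client

  VOLUME1 = '2a48ae08-2833-488c-8771-eea86f41c7b8'

  # auth_session is assumed to be an existing keystoneauth1 session.
  cinder = cinder_client.Client('3.27', session=auth_session)
  for att in cinder.attachments.list(search_opts={'volume_id': VOLUME1}):
      # If the old volume1/server1 attachment record is still listed
      # here after the swap, nova never deleted it.
      print(att.id, att.status)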

  The first and most obvious thing to do is fix that bdm update code at
  the end of the ComputeManager.swap_volume method. I'm not exactly sure
  yet how the multi-host nature of this comes into play, except possibly
  in the libvirt driver volume disconnect code here:

  https://github.com/openstack/nova/blob/ae3064b7a820ea02f7fc8a1aa4a41f35a06534f1/nova/virt/libvirt/driver.py#L1221

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1807723/+subscriptions

