[Bug 1862375] Re: Subsequent nova-api volume attach request waiting for previous one to complete


I don't think this is a bug. The reason for the synchronous handling of
attachment requests is that we need to atomically select a device name
for the new attachment and then return it in the API response. So two
attachments for the same instance cannot be handled in parallel.

To be able to return an API response asynchronously we would need to
remove the device name from the response. That would be a
backward-incompatible change that needs a nova specification.
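
To illustrate the constraint, here is a minimal sketch (plain Python,
not nova code; FakeCompute and its internals are hypothetical) of why
the attach API has to block: the device name is picked atomically on
the compute side, and the response cannot be built until that pick
comes back.

  import threading

  class FakeCompute:
      # Stands in for the compute RPC server; everything here is made up.
      def __init__(self):
          self._lock = threading.Lock()   # serializes device-name picks
          self._used = set()

      def reserve_block_device_name(self, instance_id):
          # Atomically pick the next free device name for this instance.
          with self._lock:
              for letter in "bcdefgh":
                  dev = "/dev/vd" + letter
                  if (instance_id, dev) not in self._used:
                      self._used.add((instance_id, dev))
                      return dev
              raise RuntimeError("no free device names")

  def attach_volume_api(compute, instance_id, volume_id):
      # The API call must wait for the RPC result, because the response
      # body includes the chosen device name ("device" in the real API).
      device = compute.reserve_block_device_name(instance_id)
      return {"volumeAttachment": {"volumeId": volume_id, "device": device}}

  compute = FakeCompute()
  print(attach_volume_api(compute, "vm", "vol1"))  # device: /dev/vdb
  print(attach_volume_api(compute, "vm", "vol2"))  # device: /dev/vdc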

** Changed in: nova
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1862375

Title:
  Subsequent nova-api volume attach request waiting for previous one to
  complete

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===========
  Subsequent nova-api requests to attach different volumes to the same VM block and wait for the previous attach action to finish and the volume to reach the "in-use" state. In my opinion, this is unnecessary and can lead to timeout errors. Observed on OpenStack Rocky.

  Steps to reproduce
  ==================
  Preconditions:
  - cinder configured with a storage backend, ideally hardware storage where the attach action takes considerable time - say >10s
  - 1 VM ("vm")
  - 2 volumes ("vol1", "vol2")

  Actions:

  1. 
  $ openstack server add volume vm vol1
  -> is accepted immediately by nova-api

  2. immediately after (1.), while vol1 is being attached, run
  $ openstack server add volume vm vol2
  -> this openstack command (i.e. the nova-api call) blocks and does not return until the attach from (1.) has completed and vol1 is in the "in-use" state (see the timing sketch below)
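
  A small timing sketch makes the blocking visible (assuming a
  configured openstack CLI and the vm/vol1/vol2 names above):

    import subprocess
    import threading
    import time

    def attach(volume):
        start = time.monotonic()
        subprocess.run(
            ["openstack", "server", "add", "volume", "vm", volume],
            check=True)
        print("%s accepted after %.1fs" % (volume, time.monotonic() - start))

    t1 = threading.Thread(target=attach, args=("vol1",))
    t2 = threading.Thread(target=attach, args=("vol2",))
    t1.start()
    time.sleep(1)        # make sure vol1's request is in flight first
    t2.start()
    t1.join()
    t2.join()            # vol2's call only returns once vol1 is "in-use"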

  Expected result
  ===============
  Step (2.) should be accepted immediately and handled asynchronously. I don't see a reason why step (2.) should wait until the volume from step (1.) is "in-use".

  Logs
  ====
  In cases when the attachment in (1.) takes more than 60s, request (2.) fails with the following messaging timeout, which also exposes where the call waits - obviously in reserve_block_device_name towards a compute node:
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi [req-44c4d473-9916-4d73-82d6-0115a1305f2a 0b5290e72cf546cb9e1921d81abb303c b21f6c73cba24a4280156f1d3b77af98 - default default] Unexpected exception in API method: MessagingTimeout:
   Timed out waiting for a reply to message ID 3af45090624b4fa29425e6fc05f41149
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi Traceback (most recent call last):
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py", line 801, in wrapped
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     return f(*args, **kwargs)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/api/validation/__init__.py", line 110, in wrapper
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     return func(*args, **kwargs)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/api/validation/__init__.py", line 110, in wrapper
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     return func(*args, **kwargs)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/volumes.py", line 336, in create
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     supports_multiattach=supports_multiattach)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 205, in inner
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     return function(self, context, instance, *args, **kwargs)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 153, in inner
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     return f(self, context, instance, *args, **kw)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 4172, in attach_volume
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     supports_multiattach=supports_multiattach)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 4047, in _attach_volume
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     device_type=device_type, tag=tag)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 3958, in _create_volume_bdm
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     multiattach=volume['multiattach'])
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/nova/compute/rpcapi.py", line 897, in reserve_block_device_name
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     return cctxt.call(ctxt, 'reserve_block_device_name', **kw)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in call
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     retry=self.retry)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in _send
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     retry=retry)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 645, in send
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     call_monitor_timeout, retry=retry)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 634, in _send
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     call_monitor_timeout)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 520, in wait
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     message = self.waiters.get(msg_id, timeout=timeout)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 397, in get
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi     'to message ID %s' % msg_id)
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi MessagingTimeout: Timed out waiting for a reply to message ID 3af45090624b4fa29425e6fc05f41149
  2020-02-06 02:03:14.744 30 ERROR nova.api.openstack.wsgi
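
  The 60s here matches oslo.messaging's default rpc_response_timeout,
  which bounds how long the blocked API call waits for the compute
  node's reply. Given the synchronous behaviour described above, one
  client-side workaround sketch (assuming the openstack CLI and the
  names from the reproduction steps) is to serialize the attaches
  explicitly, polling each volume until it is "in-use" before
  requesting the next one:

    import subprocess
    import time

    def wait_in_use(volume, timeout=300):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            status = subprocess.check_output(
                ["openstack", "volume", "show", volume,
                 "-f", "value", "-c", "status"], text=True).strip()
            if status == "in-use":
                return
            time.sleep(5)
        raise TimeoutError("%s never reached in-use" % volume)

    for volume in ("vol1", "vol2"):
        subprocess.run(
            ["openstack", "server", "add", "volume", "vm", volume],
            check=True)
        wait_in_use(volume)   # attach the next volume only once this one is in-use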

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1862375/+subscriptions

