
yahoo-eng-team team mailing list archive

[Bug 1709287] [NEW] Volume detach fails if there are multiple BDM entries

 

Public bug reported:

Steps to reproduce:
1. Attach a volume to an instance. The attach fails because of an RPC timeout when nova-api calls nova-compute to create the BDM (block device mapping).
2. Attach the same volume to the same instance again. This time the attach succeeds.
3. There are now two BDMs for this volume, and one of them has an empty connection_info. Trying to detach the volume throws an error because of the stale BDM entry created in step 1:

[req-b14eb2a2-10bc-4b1a-b62f-ead07947eb66 7c0126911c154f3db23e4f013c70f5aa b006cefe78734655ad29cf49445f2f67 - - -] Exception during message handling: <type 'NoneType'> can't be decoded
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
    incoming.message))
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 185, in _dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 154, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 110, in wrapped
    payload)
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 89, in wrapped
    return f(self, context, *args, **kw)
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 395, in decorated_function
    kwargs['instance'], e, sys.exc_info())
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 383, in decorated_function
    return function(self, context, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 466, in decorated_function
    instance=instance)
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 456, in decorated_function
    *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 4976, in detach_volume
    attachment_id=attachment_id)
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 4906, in _detach_volume
    connection_info = jsonutils.loads(bdm.connection_info)
  File "/usr/lib/python2.7/dist-packages/oslo_serialization/jsonutils.py", line 229, in loads
    return json.loads(encodeutils.safe_decode(s, encoding), **kwargs)
  File "/usr/lib/python2.7/dist-packages/oslo_utils/encodeutils.py", line 39, in safe_decode
    raise TypeError("%s can't be decoded" % type(text))
TypeError: <type 'NoneType'> can't be decoded
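
The last frames of the traceback show the failure mode: connection_info is None on the stale BDM, and the JSON loader refuses to decode it. A minimal sketch of a guard follows. It uses the standard-library json module as a stand-in for oslo_serialization.jsonutils, and load_connection_info is a hypothetical helper, not a nova function.

```python
import json


def load_connection_info(raw):
    """Parse a BDM's connection_info string, tolerating a missing value.

    Hypothetical helper for illustration only; it is not part of nova.
    Returns None when the field was never populated (None or empty string).
    """
    if not raw:
        return None
    return json.loads(raw)


# The unguarded call fails on None, as jsonutils.loads does in the traceback.
try:
    json.loads(None)
    unguarded_failed = False
except TypeError:
    unguarded_failed = True

# The guarded call returns None for the stale entry and a dict otherwise.
stale = load_connection_info(None)
good = load_connection_info('{"driver_volume_type": "iscsi"}')
```

A guard like this only avoids the TypeError. It does not decide what detach should do with the entry that has no connection_info.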


Catching the timeout and then deleting the BDM entry is not a reliable fix: the entry may be created after the timeout has already fired (we have seen this in our environment), and the cleanup could also delete an entry created by a concurrent attach request.
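
One possible direction, sketched here under stated assumptions and not proposed as the nova fix, is to leave cleanup alone and have detach pick the BDM that actually carries connection_info. FakeBDM and split_bdms are hypothetical names for illustration; real nova BDM objects have more fields.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeBDM:
    """Hypothetical stand-in for a block device mapping row."""
    id: int
    volume_id: str
    connection_info: Optional[str]


def split_bdms(bdms: List[FakeBDM],
               volume_id: str) -> Tuple[List[FakeBDM], List[FakeBDM]]:
    """Split the BDMs for one volume into (usable, suspect) lists.

    "Usable" means connection_info is populated. "Suspect" means it is
    None or empty, which matches the stale entry from step 1.
    """
    usable, suspect = [], []
    for bdm in bdms:
        if bdm.volume_id != volume_id:
            continue  # belongs to a different volume
        if bdm.connection_info:
            usable.append(bdm)
        else:
            suspect.append(bdm)
    return usable, suspect


# Two BDMs for the same volume, as in the reproduction steps.
rows = [
    FakeBDM(id=1, volume_id="vol-a", connection_info=None),
    FakeBDM(id=2, volume_id="vol-a", connection_info='{"data": {}}'),
    FakeBDM(id=3, volume_id="vol-b", connection_info='{"data": {}}'),
]
usable, suspect = split_bdms(rows, "vol-a")
```

The sketch only selects rows and deletes nothing, because an attach that is still in progress would presumably also have an empty connection_info. That is the same race described above, so an empty field alone does not prove an entry is stale.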

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1709287

Title:
  Volume detach fails if there are multiple BDM entries

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1709287/+subscriptions

