← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1499012] [NEW] Nova and Cinder get desynced on volume attachments

 

Public bug reported:

This bug occurred with versions 2015.1.0 of Nova and Cinder installed.

When bulk deleting large numbers of Nova instances, we occasionally
encounter something that appears to be a race condition where Cinder
believes a volume is detached and available but Nova reports the volume
is still attached to an instance.

Example below shows an instance in ERROR state (some output of nova show
truncated for brevity) and Cinder showing that the volume Nova thinks is
attached is available:

+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property                             | Value                                                                                                                                                                                                                                |
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                                                                                                                                                                                               |
| OS-EXT-AZ:availability_zone          | ---                                                                                                                                                                                                                              |
| OS-EXT-SRV-ATTR:host                 | ---                                                                                                                                                                                                                       |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | ---                                                                                                                                                                                                         |
| OS-EXT-SRV-ATTR:instance_name        | instance-0000427f                                                                                                                                                                                                                    |
| OS-EXT-STS:power_state               | 0                                                                                                                                                                                                                                    |
| OS-EXT-STS:task_state                | -                                                                                                                                                                                                                                    |
| OS-EXT-STS:vm_state                  | error                                                                                                                                                                                                                                |
| OS-SRV-USG:launched_at               | 2015-09-22T15:39:06.000000                                                                                                                                                                                                           |
| OS-SRV-USG:terminated_at             | -                                                                                                                                                                                                                                    |
| accessIPv4                           |                                                                                                                                                                                                                                      |
| accessIPv6                           |                                                                                                                                                                                                                                      |
| config_drive                         |                                                                                                                                                                                                                                      |
| created                              | 2015-09-22T14:04:40Z                                                                                                                                                                                                                 |
                                                                                                                                                                              |
| flavor                               | m1.small (2)                                                                                                                                                                                                                         |
| hostId                               | e2d6789d6505366a3d78097d4c0bf99b8f30dd49b5561b1f94e0cdcb                                                                                                                                                                             |
| id                                   | 247d9ebe-6794-4119-b677-d990fd6244b1                                                                                                                                                                                                 |
| image                                | Red Hat Enterprise Linux 7 (current) (297e1979-cfa0-4596-8446-95c29ac640cc)                                                                                                                                                          |
| key_name                             | ---                                                                                                                                                                                                                         |
| metadata                             | {}                                                                                                                                                                                                                                   |
| name                                 | r_img_test_z2_1                                                                                                                                                                                                                      |
| os-extended-volumes:volumes_attached | [{"id": "7adab934-360f-4cd0-9c98-7d6bea1554b3"}]                                                                                                                                                                                     |
| security_groups                      | default                                                                                                                                                                                                                              |
| status                               | ERROR                                                                                                                                                                                                                                |
| tenant_id                            | a9421c18f6fc48fea53c5685654f625a                                                                                                                                                                                                     |
| updated                              | 2015-09-23T13:51:10Z                                                                                                                                                                                                                 |
| user_id                              | ---                                                                                                                                                                                                                            |
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

+--------------------------------------+----------------------------------+-----------+--------------+------+-------------+----------+-------------+
|                  ID                  |            Tenant ID             |   Status  | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+----------------------------------+-----------+--------------+------+-------------+----------+-------------+
| 7adab934-360f-4cd0-9c98-7d6bea1554b3 | a9421c18f6fc48fea53c5685654f625a | available |    test2     |  20  |     SSD     |  false   |             |
+--------------------------------------+----------------------------------+-----------+--------------+------+-------------+----------+-------------+

Attempting to delete the instance while in this state results in the
following traceback from nova-client:

File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 343, in decorated_function
  return function(self, context, *args, **kwargs)
File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2737, in terminate_instance
  do_terminate_instance(instance, bdms)
File \"/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py\", line 445, in inner
  return f(*args, **kwargs)
File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2735, in do_terminate_instance
  self._set_instance_error_state(context, instance)
File \"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py\", line 85, in __exit__
  six.reraise(self.type_, self.value, self.tb)
File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2725, in do_terminate_instance
  self._delete_instance(context, instance, bdms, quotas)
File \"/usr/lib/python2.7/dist-packages/nova/hooks.py\", line 149, in inner
  rv = f(*args, **kwargs)
File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2694, in _delete_instance
  quotas.rollback()
File \"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py\", line 85, in __exit__
  six.reraise(self.type_, self.value, self.tb)
File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2664, in _delete_instance
  self._shutdown_instance(context, instance, bdms)
File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2604, in _shutdown_instance
  self.volume_api.detach(context, bdm.volume_id)
File \"/usr/lib/python2.7/dist-packages/nova/volume/cinder.py\", line 214, in wrapper
  res = method(self, ctx, volume_id, *args, **kwargs)
File \"/usr/lib/python2.7/dist-packages/nova/volume/cinder.py\", line 365, in detach
  cinderclient(context).volumes.detach(volume_id)
File \"/usr/lib/python2.7/dist-packages/cinderclient/v2/volumes.py\", line 334, in detach
  return self._action('os-detach', volume)
File \"/usr/lib/python2.7/dist-packages/cinderclient/v2/volumes.py\", line 311, in _action
  return self.api.client.post(url, body=body)
File \"/usr/lib/python2.7/dist-packages/cinderclient/client.py\", line 91, in post
  return self._cs_request(url, 'POST', **kwargs)
File \"/usr/lib/python2.7/dist-packages/cinderclient/client.py\", line 85, in _cs_request
  return self.request(url, method, **kwargs)
File \"/usr/lib/python2.7/dist-packages/cinderclient/client.py\", line 80, in request
  return super(SessionClient, self).request(*args, **kwargs)
File \"/usr/lib/python2.7/dist-packages/keystoneclient/adapter.py\", line 200, in request
  resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
File \"/usr/lib/python2.7/dist-packages/keystoneclient/adapter.py\", line 89, in request
  return self.session.request(url, method, **kwargs)
File \"/usr/lib/python2.7/dist-packages/keystoneclient/utils.py\", line 318, in inner
  return func(*args, **kwargs)
File \"/usr/lib/python2.7/dist-packages/keystoneclient/session.py\", line 389, in request
  raise exceptions.from_response(resp, method, url)

Cinder logs an exception stating that the volume has no attachments:

2015-09-23 09:49:47.923 42963 ERROR cinder.volume.manager [req-4a2f6878-54ec-45a9-abcf-f4439a6c2de8 cloud_ll2 e000f9d1f3bf4f758e1d77c94da61281 - - -] Volume 7adab934-360f-4cd0-9c98-7d6bea1554b3 doesn't have any attachments to detach
2015-09-23 09:49:47.924 42963 ERROR oslo_messaging.rpc.dispatcher [req-4a2f6878-54ec-45a9-abcf-f4439a6c2de8 cloud_ll2 e000f9d1f3bf4f758e1d77c94da61281 - - -] Exception during message handling: Invalid volume: Volume 7adab934-360f-4cd0-9c98-7d6bea1554b3 doesn't have any attachments to detach
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     executor_callback)
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return f(*args, **kwargs)
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 157, in ldo_inner1
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return ldo_inner2(inst, context, volume_id, attachment_id, **kwargs)
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 445, in inner
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return f(*args, **kwargs)
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 156, in ldo_inner2
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return f(*_args, **_kwargs)
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 882, in detach_volume
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     raise exception.InvalidVolume(reason=msg)
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher InvalidVolume: Invalid volume: Volume 7adab934-360f-4cd0-9c98-7d6bea1554b3 doesn't have any attachments to detach
2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher

While in this state, Nova refuses to take any action on the instance,
including force deletion. Removing the volume via cinder delete unsticks
things; nova delete will remove the instance once the offending volume
has been removed.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1499012

Title:
  Nova and Cinder get desynced on volume attachments

Status in OpenStack Compute (nova):
  New

Bug description:
  This bug occurred with versions 2015.1.0 of Nova and Cinder installed.

  When bulk deleting large numbers of Nova instances, we occasionally
  encounter something that appears to be a race condition where Cinder
  believes a volume is detached and available but Nova reports the
  volume is still attached to an instance.

  Example below shows an instance in ERROR state (some output of nova
  show truncated for brevity) and Cinder showing that the volume Nova
  thinks is attached is available:

  +--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Property                             | Value                                                                                                                                                                                                                                |
  +--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | OS-DCF:diskConfig                    | MANUAL                                                                                                                                                                                                                               |
  | OS-EXT-AZ:availability_zone          | ---                                                                                                                                                                                                                              |
  | OS-EXT-SRV-ATTR:host                 | ---                                                                                                                                                                                                                       |
  | OS-EXT-SRV-ATTR:hypervisor_hostname  | ---                                                                                                                                                                                                         |
  | OS-EXT-SRV-ATTR:instance_name        | instance-0000427f                                                                                                                                                                                                                    |
  | OS-EXT-STS:power_state               | 0                                                                                                                                                                                                                                    |
  | OS-EXT-STS:task_state                | -                                                                                                                                                                                                                                    |
  | OS-EXT-STS:vm_state                  | error                                                                                                                                                                                                                                |
  | OS-SRV-USG:launched_at               | 2015-09-22T15:39:06.000000                                                                                                                                                                                                           |
  | OS-SRV-USG:terminated_at             | -                                                                                                                                                                                                                                    |
  | accessIPv4                           |                                                                                                                                                                                                                                      |
  | accessIPv6                           |                                                                                                                                                                                                                                      |
  | config_drive                         |                                                                                                                                                                                                                                      |
  | created                              | 2015-09-22T14:04:40Z                                                                                                                                                                                                                 |
                                                                                                                                                                                |
  | flavor                               | m1.small (2)                                                                                                                                                                                                                         |
  | hostId                               | e2d6789d6505366a3d78097d4c0bf99b8f30dd49b5561b1f94e0cdcb                                                                                                                                                                             |
  | id                                   | 247d9ebe-6794-4119-b677-d990fd6244b1                                                                                                                                                                                                 |
  | image                                | Red Hat Enterprise Linux 7 (current) (297e1979-cfa0-4596-8446-95c29ac640cc)                                                                                                                                                          |
  | key_name                             | ---                                                                                                                                                                                                                         |
  | metadata                             | {}                                                                                                                                                                                                                                   |
  | name                                 | r_img_test_z2_1                                                                                                                                                                                                                      |
  | os-extended-volumes:volumes_attached | [{"id": "7adab934-360f-4cd0-9c98-7d6bea1554b3"}]                                                                                                                                                                                     |
  | security_groups                      | default                                                                                                                                                                                                                              |
  | status                               | ERROR                                                                                                                                                                                                                                |
  | tenant_id                            | a9421c18f6fc48fea53c5685654f625a                                                                                                                                                                                                     |
  | updated                              | 2015-09-23T13:51:10Z                                                                                                                                                                                                                 |
  | user_id                              | ---                                                                                                                                                                                                                            |
  +--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

  +--------------------------------------+----------------------------------+-----------+--------------+------+-------------+----------+-------------+
  |                  ID                  |            Tenant ID             |   Status  | Display Name | Size | Volume Type | Bootable | Attached to |
  +--------------------------------------+----------------------------------+-----------+--------------+------+-------------+----------+-------------+
  | 7adab934-360f-4cd0-9c98-7d6bea1554b3 | a9421c18f6fc48fea53c5685654f625a | available |    test2     |  20  |     SSD     |  false   |             |
  +--------------------------------------+----------------------------------+-----------+--------------+------+-------------+----------+-------------+

  Attempting to delete the instance while in this state results in the
  following traceback from nova-client:

  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 343, in decorated_function
    return function(self, context, *args, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2737, in terminate_instance
    do_terminate_instance(instance, bdms)
  File \"/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py\", line 445, in inner
    return f(*args, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2735, in do_terminate_instance
    self._set_instance_error_state(context, instance)
  File \"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py\", line 85, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2725, in do_terminate_instance
    self._delete_instance(context, instance, bdms, quotas)
  File \"/usr/lib/python2.7/dist-packages/nova/hooks.py\", line 149, in inner
    rv = f(*args, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2694, in _delete_instance
    quotas.rollback()
  File \"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py\", line 85, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2664, in _delete_instance
    self._shutdown_instance(context, instance, bdms)
  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2604, in _shutdown_instance
    self.volume_api.detach(context, bdm.volume_id)
  File \"/usr/lib/python2.7/dist-packages/nova/volume/cinder.py\", line 214, in wrapper
    res = method(self, ctx, volume_id, *args, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/nova/volume/cinder.py\", line 365, in detach
    cinderclient(context).volumes.detach(volume_id)
  File \"/usr/lib/python2.7/dist-packages/cinderclient/v2/volumes.py\", line 334, in detach
    return self._action('os-detach', volume)
  File \"/usr/lib/python2.7/dist-packages/cinderclient/v2/volumes.py\", line 311, in _action
    return self.api.client.post(url, body=body)
  File \"/usr/lib/python2.7/dist-packages/cinderclient/client.py\", line 91, in post
    return self._cs_request(url, 'POST', **kwargs)
  File \"/usr/lib/python2.7/dist-packages/cinderclient/client.py\", line 85, in _cs_request
    return self.request(url, method, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/cinderclient/client.py\", line 80, in request
    return super(SessionClient, self).request(*args, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/keystoneclient/adapter.py\", line 200, in request
    resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/keystoneclient/adapter.py\", line 89, in request
    return self.session.request(url, method, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/keystoneclient/utils.py\", line 318, in inner
    return func(*args, **kwargs)
  File \"/usr/lib/python2.7/dist-packages/keystoneclient/session.py\", line 389, in request
    raise exceptions.from_response(resp, method, url)

  Cinder logs an exception stating that the volume has no attachments:

  2015-09-23 09:49:47.923 42963 ERROR cinder.volume.manager [req-4a2f6878-54ec-45a9-abcf-f4439a6c2de8 cloud_ll2 e000f9d1f3bf4f758e1d77c94da61281 - - -] Volume 7adab934-360f-4cd0-9c98-7d6bea1554b3 doesn't have any attachments to detach
  2015-09-23 09:49:47.924 42963 ERROR oslo_messaging.rpc.dispatcher [req-4a2f6878-54ec-45a9-abcf-f4439a6c2de8 cloud_ll2 e000f9d1f3bf4f758e1d77c94da61281 - - -] Exception during message handling: Invalid volume: Volume 7adab934-360f-4cd0-9c98-7d6bea1554b3 doesn't have any attachments to detach
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     executor_callback)
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return f(*args, **kwargs)
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 157, in ldo_inner1
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return ldo_inner2(inst, context, volume_id, attachment_id, **kwargs)
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 445, in inner
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return f(*args, **kwargs)
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 156, in ldo_inner2
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     return f(*_args, **_kwargs)
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 882, in detach_volume
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher     raise exception.InvalidVolume(reason=msg)
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher InvalidVolume: Invalid volume: Volume 7adab934-360f-4cd0-9c98-7d6bea1554b3 doesn't have any attachments to detach
  2015-09-23 09:49:47.924 42963 TRACE oslo_messaging.rpc.dispatcher

  While in this state, Nova refuses to take any action on the instance,
  including force deletion. Removing the volume via cinder delete
  unsticks things; nova delete will remove the instance once the
  offending volume has been removed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1499012/+subscriptions


Follow ups