yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #32590
[Bug 1450658] Re: VolumeBackendAPIException during _shutdown_instance are not handled
** Also affects: python-cinderclient
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1450658
Title:
VolumeBackendAPIException during _shutdown_instance are not handled
Status in OpenStack Compute (Nova):
Confirmed
Status in Python client library for Cinder:
New
Bug description:
Use rally for cinder boot from voume test, after test completed, found a lot of VMs were deleted failed with error status, manual retry a few times to delete VM, but can't work.
[root@abba-n04 home]# nova list --all-tenants
+--------------------------------------+-----------------------------------+----------------------------------+--------+------------+-------------+----------+
| ID | Name | Tenant ID | Status | Task State | Power State | Networks |
+--------------------------------------+-----------------------------------+----------------------------------+--------+------------+-------------+----------+
| 335f2ca0-a86d-45a5-b4a6-4d4ea1930e89 | rally_novaserver_aftvfqejftusyflf | 3e87ac6d12264286a10ac68eb913dacb | ERROR | - | NOSTATE | |
| 888aa512-07f0-42ad-ae5e-255bbca6fe34 | rally_novaserver_ajqhjxjxnjojelgm | 7e05339543e743c3b023cee8128bbbbe | ERROR | - | NOSTATE | |
| 84b32483-2997-4a2a-8d75-236395b4ef2f | rally_novaserver_arycquunqovvyxnl | 3e87ac6d12264286a10ac68eb913dacb | ERROR | - | NOSTATE | |
| 535f1ce1-63fe-4681-b215-e21d38f77fbb | rally_novaserver_arzjesynagiqfjgt | 3e87ac6d12264286a10ac68eb913dacb | ERROR | - | NOSTATE | |
| 6ef3fa37-e5a9-4a21-8996-d291070c246a | rally_novaserver_aufevkgzvynumwwo | d37e8219a3de4e5b985828a0b959f1d6 | ERROR | - | NOSTATE | |
| e5d2dc5f-6e86-43e2-8f1f-1a3f6d4951bc | rally_novaserver_ayzjeqcplouwcaht | d37e8219a3de4e5b985828a0b959f1d6 | ERROR | - | NOSTATE | |
| 2dcb294a-e2cc-4989-9e87-74081f1567db | rally_novaserver_bbphsjrexkcgcjtt | 7e05339543e743c3b023cee8128bbbbe | ERROR | - | NOSTATE | |
| 88053991-2fab-4442-86c7-7825ed47ff0a | rally_novaserver_beveqwdokixwdbgi | 3e87ac6d12264286a10ac68eb913dacb | ERROR | - | NOSTATE | |
| 1e109862-34ea-4686-a099-28f5840244cf | rally_novaserver_bimlwsfkndjbrczv | d37e8219a3de4e5b985828a0b959f1d6 | ERROR | - | NOSTATE | |
| 6143bbb2-d3eb-4635-9554-85b30fbb5aa5 | rally_novaserver_bmycsptyoicymdmb | 3e87ac6d12264286a10ac68eb913dacb | ERROR | - | NOSTATE | |
| 5e9308ae-9736-485f-92b7-e6783c613ba1 | rally_novaserver_bsvcnsqeyawcnbgp | d37e8219a3de4e5b985828a0b959f1d6 | ERROR | - | NOSTATE | |
| 13fb2413-15b6-41a3-a8a0-a0b775747261 | rally_novaserver_bvfyfnnixwgisbkk | d37e8219a3de4e5b985828a0b959f1d6 | ERROR | - | NOSTATE | |
| d5ce6fa3-99b5-4d61-ac7c-4b5bc13d1c27 | rally_novaserver_cpbevhulxylepvql | 3e87ac6d12264286a10ac68eb913dacb | ERROR | - | NOSTATE | |
| 23ce9007-ab43-4b57-8686-4aa1ce336f09 | rally_novaserver_cswudaopawepnmms | 7e05339543e743c3b023cee8128bbbbe | ERROR | - | NOSTATE | |
[root@abba-n04 home]# nova delete 335f2ca0-a86d-45a5-b4a6-4d4ea1930e89
Request to delete server 335f2ca0-a86d-45a5-b4a6-4d4ea1930e89 has been accepted.
[root@abba-n04 home]# nova list --all | grep 335f2ca0-a86d-45a5-b4a6-4d4ea1930e89
| 335f2ca0-a86d-45a5-b4a6-4d4ea1930e89 | rally_novaserver_aftvfqejftusyflf | 3e87ac6d12264286a10ac68eb913dacb | ERROR | - | NOSTATE | |
[root@abba-n04 home]#
The system is using local LVM volumes attached via iSCSI. It appears
that something is going wrong when the attempt to detach the volume is
being made:
2015-04-29 07:26:02.680 21775 DEBUG oslo_concurrency.processutils [req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf cinder-rtstool delete-initiator iqn.2010-10.org.openstack:volume-a2907017-dda6-4243-bc50-85fe2164f05f iqn.1994-05.com.redhat:734fb33285d0 execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:199
2015-04-29 07:26:02.813 21775 DEBUG oslo_concurrency.processutils [req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf cinder-rtstool delete-initiator iqn.2010-10.org.openstack:volume-a2907017-dda6-4243-bc50-85fe2164f05f iqn.1994-05.com.redhat:734fb33285d0" returned: 1 in 0.133s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:225
2015-04-29 07:26:02.814 21775 DEBUG oslo_concurrency.processutils [req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] u'sudo cinder-rootwrap /etc/cinder/rootwrap.conf cinder-rtstool delete-initiator iqn.2010-10.org.openstack:volume-a2907017-dda6-4243-bc50-85fe2164f05f iqn.1994-05.com.redhat:734fb33285d0' failed. Not Retrying. execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:258
2015-04-29 07:26:02.815 21775 ERROR cinder.volume.targets.lio [req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] Failed to delete initiator iqn iqn.1994-05.com.redhat:734fb33285d0 to target.
2015-04-29 07:26:02.816 21775 ERROR cinder.volume.manager [req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] Unable to terminate volume connection: Failed to detach iSCSI target for volume a2907017-dda6-4243-bc50-85fe2164f05f.
2015-04-29 07:26:02.817 21775 ERROR oslo_messaging.rpc.dispatcher [req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] Exception during message handling: Bad or unexpected response from the storage volume backend API: Unable to terminate volume connection: Failed to detach iSCSI target for volume a2907017-dda6-4243-bc50-85fe2164f05f.
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher return f(*args, **kwargs)
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 1168, in terminate_connection
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher raise exception.VolumeBackendAPIException(data=err_msg)
2015-04-29 07:26:02.817 21775 TRACE oslo_messaging.rpc.dispatcher VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Unable to terminate volume connection: Failed to detach iSCSI target for volume a2907017-dda6-4243-bc50-85fe2164f05f.
This in turn causes issues for Nova:
2015-04-29 07:26:03.455 18637 INFO nova.scheduler.client.report [req-c0ca7f2a-d2ff-4e6d-8841-06c860781c55 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] Compute_service record updated for ('abba-n04.private.cloud.com', 'abba-n04.private.cloud.com')
2015-04-29 07:26:03.462 18637 ERROR oslo_messaging.rpc.dispatcher [req-c0ca7f2a-d2ff-4e6d-8841-06c860781c55 a9d0160b91f2412d84972a615b7547dc 828348052e494d76a401f669f85829f3 - - -] Exception during message handling: Internal Server Error (HTTP 500) (Request-ID: req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6853, in terminate_instance
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher reservations=reservations)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher payload)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 328, in decorated_function
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher LOG.warning(msg, e, instance_uuid=instance_uuid)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 299, in decorated_function
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 378, in decorated_function
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 356, in decorated_function
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 344, in decorated_function
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2792, in terminate_instance
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher do_terminate_instance(instance, bdms)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 445, in inner
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return f(*args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2790, in do_terminate_instance
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher self._set_instance_error_state(context, instance)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2780, in do_terminate_instance
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher self._delete_instance(context, instance, bdms, quotas)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/hooks.py", line 149, in inner
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher rv = f(*args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2749, in _delete_instance
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher quotas.rollback()
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2719, in _delete_instance
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher self._shutdown_instance(context, instance, bdms)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2658, in _shutdown_instance
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher connector)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 214, in wrapper
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher res = method(self, ctx, volume_id, *args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 375, in terminate_connection
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher connector)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/cinderclient/v2/volumes.py", line 384, in terminate_connection
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher {'connector': connector})
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/cinderclient/v2/volumes.py", line 311, in _action
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return self.api.client.post(url, body=body)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 91, in post
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return self._cs_request(url, 'POST', **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 85, in _cs_request
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return self.request(url, method, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 80, in request
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return super(SessionClient, self).request(*args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/keystoneclient/adapter.py", line 200, in request
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/keystoneclient/adapter.py", line 89, in request
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return self.session.request(url, method, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/keystoneclient/utils.py", line 318, in inner
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher return func(*args, **kwargs)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/keystoneclient/session.py", line 390, in request
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher raise exceptions.from_response(resp, method, url)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher InternalServerError: Internal Server Error (HTTP 500) (Request-ID: req-fec5ff89-5465-4e76-b573-6a5afc0d4ee2)
2015-04-29 07:26:03.462 18637 TRACE oslo_messaging.rpc.dispatcher
It looks to me like the following code in nova/compute/manager.py
_shutdown_instance() might need to be expanded:
for bdm in vol_bdms:
try:
# NOTE(vish): actual driver detach done in driver.destroy, so
# just tell cinder that we are done with it.
connector = self.driver.get_volume_connector(instance)
self.volume_api.terminate_connection(context,
bdm.volume_id,
connector)
self.volume_api.detach(context, bdm.volume_id)
except exception.DiskNotFound as exc:
LOG.debug('Ignoring DiskNotFound: %s', exc,
instance=instance)
except exception.VolumeNotFound as exc:
LOG.debug('Ignoring VolumeNotFound: %s', exc,
instance=instance)
except (cinder_exception.EndpointNotFound,
keystone_exception.EndpointNotFound) as exc:
LOG.warning(_LW('Ignoring EndpointNotFound: %s'), exc,
instance=instance)
It looks to me like things are going wrong because it is getting the
VolumeBackendAPIException and things fall apart from there. It would
make more sense to me, in the case of a volume exception, to log a
warning and then keep cleaning up so we don't have a bunch of orphaned
instances left around.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1450658/+subscriptions
References