yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #64358
[Bug 1693438] Re: error instances remain in "Build" status and can't delete it
Reviewed: https://review.openstack.org/468401
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5fac17ae960d36eeb7a642725a37f82e8ca95ec1
Submitter: Jenkins
Branch: master
commit 5fac17ae960d36eeb7a642725a37f82e8ca95ec1
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date: Fri May 26 08:27:07 2017 -0400
Use targeted context when burying instances in cell0
After Iccdf6b80f5fc8adcc8a89ce6ece3f37b6cbcaee2 we need to
use the yielded context which is targeted to the cell when
we do DB operations, which in this case is creating the
instance in the cell0 database and then updating it's status.
There is another place in here where this was missed, which is
when we're trying to delete a build request which was already
deleted.
Closes-Bug: #1693438
Change-Id: I142f97d691fa55e9824714c9c224f998ad72337e
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1693438
Title:
error instances remain in "Build" status and can't delete it
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Description
===========
When create server API fails because of no valid host, the instance's
status remains in BUILD status and "scheduling" task state.
Additionally, users can't delete the instance by delete server API.
Steps to reproduce
==================
1. create devstack with default configs
2. boot instances until nova scheduler says "no valid host was found"
3. check the error instance's status, then its status remains in "BUILD" and its task state remains in "scheduling".
4. check nova-conductor's log, then it has a following error
5. I can't delete the failed instance by delete instance API
Expected result
===============
the instance goes ERROR status and none task state because of "no
valid host was found". And I can delete the instance by delete
instance API.
Environment
===========
- git log
Merge: bedcf29 3838d5e
Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
Date: Thu May 25 01:15:41 2017 +0000
Merge "Handle uuid in HostAPI._find_service"
- hypervisor: KVM
Logs & Configs
==============
- nova-conductor's log:
nova-conductor[28120]: NoValidHost: No valid host was found. There are not enough hosts available.
nova-conductor[28120]:
nova-conductor[28120]: WARNING nova.scheduler.utils [req-7969ec7f-795d-420d-b847-b5c3c6bc8489 admin admin] [instance: e5e9cfe9-49ec-40a4-b763-d8bda68d5e56] Setting instance to ERROR state.
nova-conductor[28120]: ERROR oslo_messaging.rpc.server [req-7969ec7f-795d-420d-b847-b5c3c6bc8489 admin admin] Exception during message handling
nova-conductor[28120]: ERROR oslo_messaging.rpc.server Traceback (most recent call last):
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 157, in _process_incoming
nova-conductor[28120]: ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
nova-conductor[28120]: ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
nova-conductor[28120]: ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server build_requests=build_requests)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/conductor/manager.py", line 893, in _bury_in_cell0
nova-conductor[28120]: ERROR oslo_messaging.rpc.server exc, legacy_spec)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/conductor/manager.py", line 355, in _set_vm_state_and_notify
nova-conductor[28120]: ERROR oslo_messaging.rpc.server ex, request_spec)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/scheduler/utils.py", line 104, in set_vm_state_and_notify
nova-conductor[28120]: ERROR oslo_messaging.rpc.server instance.save()
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
nova-conductor[28120]: ERROR oslo_messaging.rpc.server return fn(self, *args, **kwargs)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/objects/instance.py", line 781, in save
nova-conductor[28120]: ERROR oslo_messaging.rpc.server columns_to_join=_expected_cols(expected_attrs))
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/db/api.py", line 860, in instance_update_and_get_original
nova-conductor[28120]: ERROR oslo_messaging.rpc.server expected=expected)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 180, in wrapper
nova-conductor[28120]: ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
nova-conductor[28120]: ERROR oslo_messaging.rpc.server ectxt.value = e.inner_exc
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
nova-conductor[28120]: ERROR oslo_messaging.rpc.server self.force_reraise()
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
nova-conductor[28120]: ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
nova-conductor[28120]: ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 251, in wrapped
nova-conductor[28120]: ERROR oslo_messaging.rpc.server return f(context, *args, **kwargs)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 2673, in instance_update_and_get_original
nova-conductor[28120]: ERROR oslo_messaging.rpc.server columns_to_join=columns_to_join)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 1929, in _instance_get_by_uuid
nova-conductor[28120]: ERROR oslo_messaging.rpc.server raise exception.InstanceNotFound(instance_id=uuid)
nova-conductor[28120]: ERROR oslo_messaging.rpc.server InstanceNotFound: Instance e5e9cfe9-49ec-40a4-b763-d8bda68d5e56 could not be found.
- inspections for the bug
* nova_api db has the failed instance info in its db, but it's cell_id is NULL
mysql> select instance_uuid, cell_id from instance_mappings;
+--------------------------------------+---------+
| instance_uuid | cell_id |
+--------------------------------------+---------+
| 86be67fc-86db-4153-b142-2cdd77874e9a | 2 |
| db4f56bc-5069-4f0a-9fdc-f23e7fdb613d | 2 |
| 072e6db4-9ccd-4da3-935e-15e16f300cc7 | 2 |
| f4d577a7-4c3f-4c03-a875-6d3c1d3441e4 | NULL |
| 9720785e-210e-4607-a16c-58250d092efc | NULL |
| 8ced7975-b234-463e-86aa-bd30632812f7 | NULL |
| 5b675dff-d588-491b-8d13-9f0d95aec131 | 2 |
| 3785265d-b05e-47f6-9ff5-5f2300cb9a1f | NULL |
| 50dafdb8-dbd4-4d09-a5b4-cb604c1c28e9 | 2 |
| 901d03fb-a079-4707-a50c-8a9ea12f3347 | NULL |
| b3ddeeb4-d5a5-4fd6-b847-b5bb49d4f984 | NULL |
| 78439d4c-65cd-432b-9d43-835d583529b7 | 2 |
| e5e9cfe9-49ec-40a4-b763-d8bda68d5e56 | NULL |
+--------------------------------------+---------+
mysql> select id,uuid,name from cell_mappings;
+----+--------------------------------------+-------+
| id | uuid | name |
+----+--------------------------------------+-------+
| 1 | 00000000-0000-0000-0000-000000000000 | cell0 |
| 2 | 60d185ff-8de0-4b9a-8832-494d71c3f895 | cell1 |
+----+--------------------------------------+-------+
* all rows in nova_cell0's instances table remain building and scheduling
mysql> select id, uuid, vm_state, task_state from instances;
+----+--------------------------------------+----------+------------+
| id | uuid | vm_state | task_state |
+----+--------------------------------------+----------+------------+
| 1 | 8ced7975-b234-463e-86aa-bd30632812f7 | building | scheduling |
| 2 | 3785265d-b05e-47f6-9ff5-5f2300cb9a1f | building | scheduling |
| 3 | 901d03fb-a079-4707-a50c-8a9ea12f3347 | building | scheduling |
| 4 | b3ddeeb4-d5a5-4fd6-b847-b5bb49d4f984 | building | scheduling |
| 5 | e5e9cfe9-49ec-40a4-b763-d8bda68d5e56 | building | scheduling |
+----+--------------------------------------+----------+------------+
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1693438/+subscriptions
References