yahoo-eng-team team mailing list archive

[Bug 1693438] Re: error instances remain in "Build" status and can't delete it

 

Reviewed:  https://review.openstack.org/468401
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5fac17ae960d36eeb7a642725a37f82e8ca95ec1
Submitter: Jenkins
Branch:    master

commit 5fac17ae960d36eeb7a642725a37f82e8ca95ec1
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Fri May 26 08:27:07 2017 -0400

    Use targeted context when burying instances in cell0
    
    After Iccdf6b80f5fc8adcc8a89ce6ece3f37b6cbcaee2 we need to
    use the yielded context which is targeted to the cell when
    we do DB operations, which in this case is creating the
    instance in the cell0 database and then updating its status.
    
    There is another place in here where this was missed, which is
    when we're trying to delete a build request which was already
    deleted.
    
    Closes-Bug: #1693438
    
    Change-Id: I142f97d691fa55e9824714c9c224f998ad72337e
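
The failure mode described in the commit message can be sketched with a toy model. This is not nova's code: the RequestContext class, the DATABASES dict, and the helper functions are hypothetical stand-ins, and target_cell here only imitates the idea of yielding a copy of the context that points at the cell. The sketch assumes the instance row is created through the targeted context while the status update goes through the original one, which matches the InstanceNotFound error in the log quoted in the bug description.

```python
import contextlib
import copy


class InstanceNotFound(Exception):
    """Raised when an instance row is missing from the database in use."""


class RequestContext:
    """Toy request context; db_name records which database it points at."""

    def __init__(self, db_name="default"):
        self.db_name = db_name


# Hypothetical in-memory stand-ins for the default and cell0 databases.
DATABASES = {"default": {}, "cell0": {}}


@contextlib.contextmanager
def target_cell(context, cell_name):
    """Yield a copy of the context targeted at the named cell.

    The caller's context is left untouched, so only the yielded copy
    reaches the cell database.
    """
    cctxt = copy.copy(context)
    cctxt.db_name = cell_name
    yield cctxt


def create_instance(context, uuid):
    """Insert a new instance row into the database the context points at."""
    DATABASES[context.db_name][uuid] = {
        "vm_state": "building",
        "task_state": "scheduling",
    }


def set_error(context, uuid):
    """Mark the instance as errored in the database the context points at."""
    db = DATABASES[context.db_name]
    if uuid not in db:
        raise InstanceNotFound(uuid)
    db[uuid].update(vm_state="error", task_state=None)


def bury_in_cell0(context, uuid, use_targeted_for_update):
    """Create the instance in cell0, then set it to error.

    With use_targeted_for_update=False the update uses the original,
    untargeted context and cannot find the row created in cell0.
    """
    with target_cell(context, "cell0") as cctxt:
        create_instance(cctxt, uuid)
        set_error(cctxt if use_targeted_for_update else context, uuid)
```

Calling bury_in_cell0 with use_targeted_for_update=False raises InstanceNotFound and leaves the cell0 row in building/scheduling; with True the row ends up in the error state with no task state.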


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1693438

Title:
  error instances remain in "Build" status and can't delete it

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========

  When the create server API fails because no valid host is found, the
  instance stays in BUILD status with the "scheduling" task state.
  Additionally, users can't delete the instance through the delete
  server API.

  
  Steps to reproduce
  ==================

  1. Create a devstack with the default configs.
  2. Boot instances until the nova scheduler reports "No valid host was found".
  3. Check the failed instance: its status remains "BUILD" and its task state remains "scheduling".
  4. Check nova-conductor's log: it contains the error shown under "Logs & Configs".
  5. Try to delete the failed instance with the delete instance API: it can't be deleted.

  Expected result
  ===============

  The instance goes to ERROR status with no task state because "no
  valid host was found", and I can delete the instance through the
  delete instance API.

  Environment
  ===========

  - git log
  Merge: bedcf29 3838d5e
  Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
  Date:   Thu May 25 01:15:41 2017 +0000

      Merge "Handle uuid in HostAPI._find_service"

  
  - hypervisor: KVM

  
  Logs & Configs
  ==============

  - nova-conductor's log:

  nova-conductor[28120]: NoValidHost: No valid host was found. There are not enough hosts available.
  nova-conductor[28120]: 
  nova-conductor[28120]: WARNING nova.scheduler.utils [req-7969ec7f-795d-420d-b847-b5c3c6bc8489 admin admin] [instance: e5e9cfe9-49ec-40a4-b763-d8bda68d5e56] Setting instance to ERROR state.
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server [req-7969ec7f-795d-420d-b847-b5c3c6bc8489 admin admin] Exception during message handling
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 157, in _process_incoming
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     build_requests=build_requests)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/conductor/manager.py", line 893, in _bury_in_cell0
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     exc, legacy_spec)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/conductor/manager.py", line 355, in _set_vm_state_and_notify
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     ex, request_spec)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/scheduler/utils.py", line 104, in set_vm_state_and_notify
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     instance.save()
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     return fn(self, *args, **kwargs)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/objects/instance.py", line 781, in save
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     columns_to_join=_expected_cols(expected_attrs))
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/db/api.py", line 860, in instance_update_and_get_original
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     expected=expected)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 180, in wrapper
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     ectxt.value = e.inner_exc
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     self.force_reraise()
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 251, in wrapped
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     return f(context, *args, **kwargs)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 2673, in instance_update_and_get_original
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     columns_to_join=columns_to_join)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server   File "/devstack/devstack_data/nova/nova/db/sqlalchemy/api.py", line 1929, in _instance_get_by_uuid
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server     raise exception.InstanceNotFound(instance_id=uuid)
  nova-conductor[28120]: ERROR oslo_messaging.rpc.server InstanceNotFound: Instance e5e9cfe9-49ec-40a4-b763-d8bda68d5e56 could not be found.

  - inspections for the bug
      * the nova_api db has a mapping for the failed instance, but its cell_id is NULL

  mysql> select instance_uuid, cell_id from instance_mappings;
  +--------------------------------------+---------+
  | instance_uuid                        | cell_id |
  +--------------------------------------+---------+
  | 86be67fc-86db-4153-b142-2cdd77874e9a |       2 |
  | db4f56bc-5069-4f0a-9fdc-f23e7fdb613d |       2 |
  | 072e6db4-9ccd-4da3-935e-15e16f300cc7 |       2 |
  | f4d577a7-4c3f-4c03-a875-6d3c1d3441e4 |    NULL |
  | 9720785e-210e-4607-a16c-58250d092efc |    NULL |
  | 8ced7975-b234-463e-86aa-bd30632812f7 |    NULL |
  | 5b675dff-d588-491b-8d13-9f0d95aec131 |       2 |
  | 3785265d-b05e-47f6-9ff5-5f2300cb9a1f |    NULL |
  | 50dafdb8-dbd4-4d09-a5b4-cb604c1c28e9 |       2 |
  | 901d03fb-a079-4707-a50c-8a9ea12f3347 |    NULL |
  | b3ddeeb4-d5a5-4fd6-b847-b5bb49d4f984 |    NULL |
  | 78439d4c-65cd-432b-9d43-835d583529b7 |       2 |
  | e5e9cfe9-49ec-40a4-b763-d8bda68d5e56 |    NULL |
  +--------------------------------------+---------+

  mysql> select id,uuid,name from cell_mappings;
  +----+--------------------------------------+-------+
  | id | uuid                                 | name  |
  +----+--------------------------------------+-------+
  |  1 | 00000000-0000-0000-0000-000000000000 | cell0 |
  |  2 | 60d185ff-8de0-4b9a-8832-494d71c3f895 | cell1 |
  +----+--------------------------------------+-------+

  
      * all rows in nova_cell0's instances table remain in vm_state "building" and task_state "scheduling"

  mysql> select id, uuid, vm_state, task_state from instances;
  +----+--------------------------------------+----------+------------+
  | id | uuid                                 | vm_state | task_state |
  +----+--------------------------------------+----------+------------+
  |  1 | 8ced7975-b234-463e-86aa-bd30632812f7 | building | scheduling |
  |  2 | 3785265d-b05e-47f6-9ff5-5f2300cb9a1f | building | scheduling |
  |  3 | 901d03fb-a079-4707-a50c-8a9ea12f3347 | building | scheduling |
  |  4 | b3ddeeb4-d5a5-4fd6-b847-b5bb49d4f984 | building | scheduling |
  |  5 | e5e9cfe9-49ec-40a4-b763-d8bda68d5e56 | building | scheduling |
  +----+--------------------------------------+----------+------------+
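
The two inspections above can be cross-referenced in a few lines. This is only a sketch over the rows copied from the query output; the dict layout and the stuck_in_cell0 helper are illustrative names, not part of nova.

```python
# Rows copied from the instance_mappings query: uuid -> cell_id (None = NULL).
instance_mappings = {
    "86be67fc-86db-4153-b142-2cdd77874e9a": 2,
    "db4f56bc-5069-4f0a-9fdc-f23e7fdb613d": 2,
    "072e6db4-9ccd-4da3-935e-15e16f300cc7": 2,
    "f4d577a7-4c3f-4c03-a875-6d3c1d3441e4": None,
    "9720785e-210e-4607-a16c-58250d092efc": None,
    "8ced7975-b234-463e-86aa-bd30632812f7": None,
    "5b675dff-d588-491b-8d13-9f0d95aec131": 2,
    "3785265d-b05e-47f6-9ff5-5f2300cb9a1f": None,
    "50dafdb8-dbd4-4d09-a5b4-cb604c1c28e9": 2,
    "901d03fb-a079-4707-a50c-8a9ea12f3347": None,
    "b3ddeeb4-d5a5-4fd6-b847-b5bb49d4f984": None,
    "78439d4c-65cd-432b-9d43-835d583529b7": 2,
    "e5e9cfe9-49ec-40a4-b763-d8bda68d5e56": None,
}

# Rows copied from nova_cell0's instances table: uuid -> (vm_state, task_state).
cell0_instances = {
    "8ced7975-b234-463e-86aa-bd30632812f7": ("building", "scheduling"),
    "3785265d-b05e-47f6-9ff5-5f2300cb9a1f": ("building", "scheduling"),
    "901d03fb-a079-4707-a50c-8a9ea12f3347": ("building", "scheduling"),
    "b3ddeeb4-d5a5-4fd6-b847-b5bb49d4f984": ("building", "scheduling"),
    "e5e9cfe9-49ec-40a4-b763-d8bda68d5e56": ("building", "scheduling"),
}


def stuck_in_cell0(mappings, cell0_rows):
    """Return sorted UUIDs that exist in cell0, are still building/scheduling,
    and have no cell_id (or no mapping at all) in instance_mappings."""
    return sorted(
        uuid
        for uuid, (vm_state, task_state) in cell0_rows.items()
        if mappings.get(uuid) is None
        and vm_state == "building"
        and task_state == "scheduling"
    )


stuck = stuck_in_cell0(instance_mappings, cell0_instances)

# Mappings with a NULL cell_id that have no row in cell0 at all.
unmapped_not_in_cell0 = sorted(
    uuid
    for uuid, cell_id in instance_mappings.items()
    if cell_id is None and uuid not in cell0_instances
)
```

With the data shown, every row in cell0 ends up in stuck, including e5e9cfe9-49ec-40a4-b763-d8bda68d5e56 from the conductor log.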

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1693438/+subscriptions

