← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1784093] Re: Build requests can be orphaned without instance mappings

 

Reviewed:  https://review.opendev.org/586742
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=85f8d033d27b31a6398529e0a25da74eae523b08
Submitter: Zuul
Branch:    master

commit 85f8d033d27b31a6398529e0a25da74eae523b08
Author: Mohammed Naser <mnaser@xxxxxxxxxxxx>
Date:   Fri Jul 27 21:09:22 2018 -0400

    Create request spec, build request and mappings in one transaction
    
    The transaction context is currently not shared when creating the
    RequestSpec, BuildRequest and InstanceMapping.  Because of this,
    it is possible that the database ends in an inconsistent state
    due to the fact that one of these was created and the system
    crashed afterwards.
    
    This patch adds a function which handles the creation of all those
    resources in a single transaction.
    
    Co-Authored-By: melanie witt <melwittt@xxxxxxxxx>
    
    Closes-Bug: #1784093
    
    Change-Id: If897a0d721180152ebdceb7a0c23e8f283ce6d10


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1784093

Title:
  Build requests can be orphaned without instance mappings

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  Mohammed reported this in the nova channel today [1] and the RDO cloud
  people have run into the same issue too. The deployment got into a
  situation where instances would show up in a 'nova list' in
  BUILD/scheduling state but were unable to be deleted. (They show up in
  'nova list' because 'nova list' lists build requests and all instances
  in all cells).

  Inspection of the database showed that the "instance" had a build
  request but *no* instance mapping and *no* instance record in any
  cell. And the instance could not be deleted even though it appeared in
  the 'nova list' because the delete API first does a compute API().get
  in order to get the instance object to pass down to the compute
  API().delete method. The compute API().get fails with InstanceNotFound
  because the _get_instance method raises InstanceNotFound if there is
  no instance mapping for the instance.

  Mohammed was able to share this trace [2] which shows the
  instance_mapping.create() failing due to database errors, right after
  the build_request.create() succeeded:

  
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/nova/compute/api.py", line 937, in _provision_instances
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     inst_mapping.create()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     return fn(self, *args, **kwargs)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/nova/objects/instance_mapping.py", line 92, in create
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     db_mapping = self._create_in_db(self._context, changes)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 986, in wrapper
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     return fn(*args, **kwargs)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self.gen.next()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 1036, in _transaction_scope
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     yield resource
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self.gen.next()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 646, in _session
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self.session.rollback()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 907, in rollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self.transaction.rollback()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 532, in rollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     util.reraise(*rollback_err)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 497, in rollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     t[1].rollback()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1632, in rollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self._do_rollback()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1670, in _do_rollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self.connection._rollback_impl()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 706, in _rollback_impl
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self._handle_dbapi_exception(e, None, None, None, None)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1406, in _handle_dbapi_exception
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self._autorollback()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 76, in __exit__
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     compat.reraise(type_, value, traceback)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1406, in _handle_dbapi_exception
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self._autorollback()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 827, in _autorollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self._root._rollback_impl()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 706, in _rollback_impl
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self._handle_dbapi_exception(e, None, None, None, None)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1334, in _handle_dbapi_exception
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     exc_info
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     reraise(type(exception), exception, tb=exc_tb, cause=cause)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 704, in _rollback_impl
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self.engine.dialect.do_rollback(self.connection)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py", line 1773, in do_rollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     dbapi_connection.rollback()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 786, in rollback
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     self._read_ok_packet()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 760, in _read_ok_packet
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     pkt = self._read_packet()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 1018, in _read_packet
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     packet.check_error()
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 384, in check_error
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     err.raise_mysql_exception(self._data)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi   File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi     raise errorclass(errno, errval)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi InternalError: (pymysql.err.InternalError) (1047, u'WSREP has not yet prepared node for application use') (Background on this error at: http://sqlalche.me/e/2j85)
  2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi 

  
  and the API request returned with a 500:

  "HTTP exception thrown: Unexpected API Error. Please report this at
  http://bugs.launchpad.net/nova/ and attach the Nova API log if
  possible."

  Mohammed is going to try a fix to do the build request and instance
  mapping creates in a single database transaction, so that the build
  request cannot be orphaned.

  Another way to handle it would be to leave the creates as-is and make
  the API handle deletion of orphaned build requests, but doing that
  would allow another avenue for instances in ERROR state, whereas doing
  the build request and instance mapping creates in a single transaction
  would avoid that.

  [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/latest.log.html#t2018-07-28T00:27:59
  [2] http://paste.openstack.org/show/726772

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1784093/+subscriptions


References