Message #78214
[Bug 1784093] Re: Build requests can be orphaned without instance mappings
Reviewed: https://review.opendev.org/586742
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=85f8d033d27b31a6398529e0a25da74eae523b08
Submitter: Zuul
Branch: master
commit 85f8d033d27b31a6398529e0a25da74eae523b08
Author: Mohammed Naser <mnaser@xxxxxxxxxxxx>
Date: Fri Jul 27 21:09:22 2018 -0400
Create request spec, build request and mappings in one transaction
The transaction context is currently not shared when creating the
RequestSpec, BuildRequest and InstanceMapping. Because of this,
it is possible that the database ends in an inconsistent state
due to the fact that one of these was created and the system
crashed afterwards.
This patch adds a function which handles the creation of all those
resources in a single transaction.
Co-Authored-By: melanie witt <melwittt@xxxxxxxxx>
Closes-Bug: #1784093
Change-Id: If897a0d721180152ebdceb7a0c23e8f283ce6d10
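The idea behind the commit message above can be illustrated with a minimal,
self-contained sketch. This is not the code from the patch: it uses the
standard library sqlite3 module instead of nova's database layer, and the
table layout, the create_all_in_one_transaction function and the
SimulatedDBError exception are all hypothetical names chosen for the example.
It only shows that when the three inserts share one transaction, a failure
partway through leaves no partial rows behind.

```python
import sqlite3

# Hypothetical, heavily simplified stand-ins for the three API database tables.
SCHEMA = """
CREATE TABLE request_specs (instance_uuid TEXT PRIMARY KEY);
CREATE TABLE build_requests (instance_uuid TEXT PRIMARY KEY);
CREATE TABLE instance_mappings (instance_uuid TEXT PRIMARY KEY);
"""
TABLES = ("request_specs", "build_requests", "instance_mappings")


class SimulatedDBError(Exception):
    """Stands in for a database failure such as the one in the bug report."""


def create_all_in_one_transaction(conn, instance_uuid, fail_before=None):
    """Insert one row per table; commit all of them or none of them."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        for table in TABLES:
            if table == fail_before:
                raise SimulatedDBError("failure before writing " + table)
            conn.execute(
                "INSERT INTO %s (instance_uuid) VALUES (?)" % table,
                (instance_uuid,),
            )


def row_counts(conn):
    """Return the number of rows currently stored in each table."""
    return {
        table: conn.execute("SELECT COUNT(*) FROM %s" % table).fetchone()[0]
        for table in TABLES
    }


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Simulate the failure from the bug: the mapping insert never happens.
try:
    create_all_in_one_transaction(
        conn, "uuid-1", fail_before="instance_mappings")
except SimulatedDBError:
    pass
counts_after_failure = row_counts(conn)

# A request without failures creates all three rows together.
create_all_in_one_transaction(conn, "uuid-2")
counts_after_success = row_counts(conn)

print(counts_after_failure)
print(counts_after_success)
```

In this sketch the failed request leaves every table empty, so no build
request exists without its instance mapping.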
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1784093
Title:
Build requests can be orphaned without instance mappings
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) queens series:
Confirmed
Status in OpenStack Compute (nova) rocky series:
Confirmed
Status in OpenStack Compute (nova) stein series:
Confirmed
Bug description:
Mohammed reported this in the nova channel today [1], and the RDO cloud
people have run into the same issue. The deployment got into a
situation where instances showed up in 'nova list' in BUILD/scheduling
state but could not be deleted. (They show up because 'nova list' lists
build requests as well as all instances in all cells.)
Inspection of the database showed that the "instance" had a build
request but *no* instance mapping and *no* instance record in any
cell. The instance could not be deleted, even though it appeared in
'nova list', because the delete API first calls compute API().get
to obtain the instance object that it then passes to the compute
API().delete method. The compute API().get call fails with
InstanceNotFound because the _get_instance method raises
InstanceNotFound if there is no instance mapping for the instance.
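The mismatch between listing and deleting described above can be modelled
with a small sketch. It is a simplified illustration built only from this
description, not nova's actual code: the dictionaries and the list_servers,
get_server and delete_server functions are hypothetical, and only the
InstanceNotFound name is taken from the bug text.

```python
class InstanceNotFound(Exception):
    """Raised when a lookup finds no instance mapping for a UUID."""


def list_servers(build_requests, cell_instances):
    """List build requests plus all instances in all cells."""
    uuids = set(build_requests)
    for instances in cell_instances.values():
        uuids.update(instances)
    return sorted(uuids)


def get_server(uuid, instance_mappings, cell_instances):
    """Look up an instance through its mapping; fail if there is none."""
    if uuid not in instance_mappings:
        raise InstanceNotFound(uuid)
    cell = instance_mappings[uuid]
    return cell_instances[cell][uuid]


def delete_server(uuid, build_requests, instance_mappings, cell_instances):
    """Delete goes through get first, so an orphan never gets this far."""
    instance = get_server(uuid, instance_mappings, cell_instances)
    build_requests.pop(uuid, None)
    del cell_instances[instance_mappings.pop(uuid)][uuid]
    return instance


# State matching the report: a build request with no mapping and no instance.
build_requests = {"orphan-uuid": {"vm_state": "building"}}
instance_mappings = {}
cell_instances = {"cell1": {}}

listed = list_servers(build_requests, cell_instances)

try:
    delete_server(
        "orphan-uuid", build_requests, instance_mappings, cell_instances)
    delete_failed = False
except InstanceNotFound:
    delete_failed = True

print(listed, delete_failed)
```

Under these assumptions the orphan appears in the listing, the delete attempt
raises InstanceNotFound, and the build request stays in place.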
Mohammed was able to share this trace [2] which shows the
instance_mapping.create() failing due to database errors, right after
the build_request.create() succeeded:
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/nova/compute/api.py", line 937, in _provision_instances
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi inst_mapping.create()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi return fn(self, *args, **kwargs)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/nova/objects/instance_mapping.py", line 92, in create
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi db_mapping = self._create_in_db(self._context, changes)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 986, in wrapper
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi return fn(*args, **kwargs)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self.gen.next()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 1036, in _transaction_scope
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi yield resource
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self.gen.next()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 646, in _session
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self.session.rollback()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 907, in rollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self.transaction.rollback()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 532, in rollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi util.reraise(*rollback_err)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 497, in rollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi t[1].rollback()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1632, in rollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self._do_rollback()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1670, in _do_rollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self.connection._rollback_impl()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 706, in _rollback_impl
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self._handle_dbapi_exception(e, None, None, None, None)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1406, in _handle_dbapi_exception
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self._autorollback()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 76, in __exit__
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi compat.reraise(type_, value, traceback)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1406, in _handle_dbapi_exception
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self._autorollback()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 827, in _autorollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self._root._rollback_impl()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 706, in _rollback_impl
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self._handle_dbapi_exception(e, None, None, None, None)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1334, in _handle_dbapi_exception
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi exc_info
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi reraise(type(exception), exception, tb=exc_tb, cause=cause)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 704, in _rollback_impl
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self.engine.dialect.do_rollback(self.connection)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py", line 1773, in do_rollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi dbapi_connection.rollback()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 786, in rollback
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi self._read_ok_packet()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 760, in _read_ok_packet
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi pkt = self._read_packet()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 1018, in _read_packet
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi packet.check_error()
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/connections.py", line 384, in check_error
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi err.raise_mysql_exception(self._data)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi File "/openstack/venvs/nova-17.0.3/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi raise errorclass(errno, errval)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi InternalError: (pymysql.err.InternalError) (1047, u'WSREP has not yet prepared node for application use') (Background on this error at: http://sqlalche.me/e/2j85)
2018-07-25 04:20:12.946 7926 ERROR nova.api.openstack.wsgi
and the API request returned with a 500:
"HTTP exception thrown: Unexpected API Error. Please report this at
http://bugs.launchpad.net/nova/ and attach the Nova API log if
possible."
Mohammed is going to try a fix to do the build request and instance
mapping creates in a single database transaction, so that the build
request cannot be orphaned.
Another way to handle it would be to leave the creates as-is and make
the API handle deletion of orphaned build requests, but doing that
would allow another avenue for instances in ERROR state, whereas doing
the build request and instance mapping creates in a single transaction
would avoid that.
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/latest.log.html#t2018-07-28T00:27:59
[2] http://paste.openstack.org/show/726772
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1784093/+subscriptions