yahoo-eng-team team mailing list archive
Message #78901
[Bug 1832860] Re: Failed instances stuck in BUILD state after Rocky upgrade
Some things to note:
I'm pretty confident that the DB sync had been run using the Rocky
nova-api container prior to the upgrade.
The 'missing' trusted_certs column did exist in the instance_extra table
in the nova DB prior to performing the workaround DB sync.
No restart of services was necessary.
** Also affects: kolla-ansible/rocky
Importance: Undecided
Status: New
** Also affects: nova
Importance: Undecided
Status: New
** Changed in: kolla-ansible/rocky
Importance: Undecided => High
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1832860
Title:
Failed instances stuck in BUILD state after Rocky upgrade
Status in kolla-ansible:
New
Status in kolla-ansible rocky series:
New
Status in OpenStack Compute (nova):
New
Bug description:
Steps to reproduce
==================
Starting with a cloud running the Queens release, upgrade to Rocky.
Create a flavor that cannot fit on any compute node, e.g.
openstack flavor create --ram 100000000 --disk 2147483647 --vcpus 10000 huge
Then create an instance using that flavor:
openstack server create huge --flavor huge --image cirros --network demo-net
Expected
========
The instance fails to boot and ends up in the ERROR state.
Actual
======
The instance fails to boot and gets stuck in the BUILD state.
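To tell a slow build apart from an instance that is genuinely stuck, it can help to poll the status with a deadline. The following is a minimal sketch, not part of the original report: wait_for_terminal_status and its parameters are hypothetical names, and get_status is assumed to be any callable that returns the current server status string (for example, a wrapper around "openstack server show -f value -c status <server>").

```python
import time
from typing import Callable


def wait_for_terminal_status(
    get_status: Callable[[], str],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status leaves BUILD or the timeout expires.

    Returns the last status seen. A return value of "BUILD" means the
    instance never reached ACTIVE or ERROR within the timeout, which is
    the symptom described in this bug.
    """
    deadline = clock() + timeout
    status = get_status()
    while status == "BUILD" and clock() < deadline:
        sleep(interval)
        status = get_status()
    return status
```

With a healthy deployment and the 'huge' flavor, this would be expected to return "ERROR"; on an affected deployment it returns "BUILD" once the timeout expires. The sleep and clock parameters exist only so the loop can be exercised without real delays.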
From nova-conductor.log:
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1244, in schedule_and_build_instances
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server tags=tags)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1193, in _bury_in_cell0
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server instance.create()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return fn(self, *args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 600, in create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server db_inst = db.instance_create(self._context, updates)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 748, in instance_create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return IMPL.instance_create(context, values)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 170, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 154, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server ectxt.value = e.inner_exc
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server self.force_reraise()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 142, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 227, in wrapped
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return f(context, *args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 1774, in instance_create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server ec2_instance_create(context, instance_ref['uuid'])
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 170, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 227, in wrapped
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return f(context, *args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 5286, in ec2_instance_create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server ec2_instance_ref.save(context.session)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/models.py", line 50, in save
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server session.flush()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2254, in flush
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server self._flush(objects)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2380, in _flush
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server transaction.rollback(_capture_exception=True)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server compat.reraise(exc_type, exc_value, exc_tb)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2344, in _flush
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server flush_context.execute()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 391, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server rec.execute(self)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 556, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server uow
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 181, in save_obj
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server mapper, table, insert)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 866, in _emit_insert_statements
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server execute(statement, params)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return meth(self, multiparams, params)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server return connection._execute_clauseelement(self, multiparams, params)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server compiled_sql, distilled_params
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server context)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1409, in _handle_dbapi_exception
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server util.raise_from_cause(newraise, exc_info)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server reraise(type(exception), exception, tb=exc_tb, cause=cause)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server context)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server cursor.execute(statement, parameters)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 170, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server result = self._query(query)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 328, in _query
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server conn.query(q)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 516, in query
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 727, in _read_query_result
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server result.read()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1066, in read
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server first_packet = self.connection._read_packet()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 683, in _read_packet
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server packet.check_error()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/protocol.py", line 220, in check_error
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server err.raise_mysql_exception(self._data)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server raise errorclass(errno, errval)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server DBError: (pymysql.err.InternalError) (1054, u"Unknown column 'trusted_certs' in 'field list'") [SQL: u'INSERT INTO instance_extra (created_at, updated_at, deleted_at, deleted, instance_uuid, device_metadata, numa_topology, pci_requests, flavor, vcpu_model, migration_context, keypairs, trusted_certs) VALUES (%(created_at)s, %(updated_at)s, %(deleted_at)s, %(deleted)s, %(instance_uuid)s, %(device_metadata)s, %(numa_topology)s, %(pci_requests)s, %(flavor)s, %(vcpu_model)s, %(migration_context)s, %(keypairs)s, %(trusted_certs)s)'] [parameters: {'instance_uuid': u'df1bd38c-67cb-4eb0-b2d2-ac08233dadae', 'keypairs': '{"nova_object.version": "1.3", "nova_object.name": "KeyPairList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}', 'pci_requests': '[]', 'vcpu_model': None, 'device_metadata': None, 'created_at': datetime.datetime(2019, 6, 12, 15, 0, 24, 430084), 'updated_at': None, 'numa_topology': None, 'trusted_certs': None, 'deleted': 0, 'migration_context': None, 'flavor': '{"new": null, "old": null, "cur": {"nova_object.version": "1.2", "nova_object.name": "Flavor", "nova_object.data": {"disabled": false, "root_gb": 214 ... (234 characters truncated) ... , "swap": 0, "rxtx_factor": 1.0, "is_public": true, "deleted_at": null, "vcpu_weight": 0, "id": 6, "name": "huge"}, "nova_object.namespace": "nova"}}', 'deleted_at': None}] (Background on this error at: http://sqlalche.me/e/2j85)
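The useful part of the traceback is its final DBError line: MySQL error 1054 for column 'trusted_certs' during an INSERT into instance_extra. As a rough sketch for spotting the same failure in other nova-conductor logs, the snippet below pulls those three facts out of the log text. The function name and regular expressions are illustrative assumptions, written only against the message format shown above.

```python
import re

# Matches e.g.: (1054, u"Unknown column 'trusted_certs' in 'field list'")
UNKNOWN_COLUMN_RE = re.compile(
    r"""\((?P<code>\d+), u?["']Unknown column '(?P<column>[^']+)' in '(?P<clause>[^']+)'["']\)"""
)
# Matches the table named in the failing statement.
INSERT_TABLE_RE = re.compile(r"INSERT INTO (?P<table>\w+)")


def parse_unknown_column(log_text):
    """Return error code, column and table for an 'Unknown column' error.

    Returns None when the text contains no such error.
    """
    match = UNKNOWN_COLUMN_RE.search(log_text)
    if match is None:
        return None
    table = INSERT_TABLE_RE.search(log_text)
    return {
        "code": int(match.group("code")),
        "column": match.group("column"),
        "table": table.group("table") if table else None,
    }
```

Applied to the DBError line above, this should yield code 1054, column 'trusted_certs' and table 'instance_extra'.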
Workaround
==========
On the controller, perform a nova DB sync:
docker exec -it nova_api nova-manage db sync
Although this made no changes to the database (checked with
mysqldump), it seems to 'fix' nova. New instances created using the
'huge' flavor now go to the ERROR state as expected.
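The "no changes" check with mysqldump can be repeated by comparing a dump taken before the sync with one taken after. A minimal sketch follows, assuming both dumps are already available as strings. The helper names are hypothetical, and lines starting with "--" are skipped because mysqldump writes comments there, including a completion timestamp that differs between runs.

```python
import difflib


def significant_lines(dump_text):
    """Drop blank lines and '--' comment lines from a SQL dump."""
    return [
        line
        for line in dump_text.splitlines()
        if line.strip() and not line.startswith("--")
    ]


def dump_diff(before, after):
    """Return unified-diff lines between two dumps; empty if equivalent."""
    return list(
        difflib.unified_diff(
            significant_lines(before),
            significant_lines(after),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )
```

An empty list from dump_diff corresponds to the observation above that the sync left the database unchanged; any schema change, such as a newly added column, would show up as "+" lines.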