yahoo-eng-team team mailing list archive

[Bug 1832860] Re: Failed instances stuck in BUILD state after Rocky upgrade

 

Some things to note:

I'm pretty confident that the DB sync had been run using the Rocky
nova-api container prior to the upgrade.

The 'missing' trusted_certs column did exist in the instance_extra table
in the nova DB prior to performing the workaround DB sync.

No restart of services was necessary.

** Also affects: kolla-ansible/rocky
   Importance: Undecided
       Status: New

** Also affects: nova
   Importance: Undecided
       Status: New

** Changed in: kolla-ansible/rocky
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1832860

Title:
  Failed instances stuck in BUILD state after Rocky upgrade

Status in kolla-ansible:
  New
Status in kolla-ansible rocky series:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  Steps to reproduce
  ==================

  Starting with a cloud running the Queens release, upgrade to Rocky.

  Create a flavor that cannot fit on any compute node, e.g.

  openstack flavor create --ram 100000000 --disk 2147483647 --vcpus 10000 huge

  Then create an instance using that flavor:

  openstack server create huge --flavor huge --image cirros --network demo-net

  Expected
  ========

  The instance fails to boot and ends up in the ERROR state.

  Actual
  ======

  The instance fails to boot and gets stuck in the BUILD state.
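
  To tell "stuck in BUILD" apart from "still building", it can help to
  poll the server status with a timeout. The following is only a minimal
  sketch: the function names, the timeout values and the choice of
  ACTIVE/ERROR as terminal states are assumptions made for illustration,
  and openstack_status() assumes the openstack CLI is on the PATH with
  credentials already loaded.

```python
import subprocess
import time

# States after which polling stops (an assumption for this sketch).
TERMINAL_STATES = {"ACTIVE", "ERROR"}


def openstack_status(server="huge"):
    """Return the status of a server via the openstack CLI (hypothetical helper)."""
    out = subprocess.check_output(
        ["openstack", "server", "show", server, "-f", "value", "-c", "status"]
    )
    return out.decode().strip()


def wait_for_terminal_status(get_status, timeout=120.0, interval=5.0,
                             sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until a terminal state is seen or the timeout expires.

    Returns the last status seen. A return value of "BUILD" after the
    timeout matches the behaviour described in this report.
    """
    deadline = clock() + timeout
    status = get_status()
    while status not in TERMINAL_STATES and clock() < deadline:
        sleep(interval)
        status = get_status()
    return status
```

  For example, wait_for_terminal_status(openstack_status) would be
  expected to return ERROR on a healthy deployment and BUILD on an
  affected one.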

  From nova-conductor.log:

  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1244, in schedule_and_build_instances
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     tags=tags)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1193, in _bury_in_cell0
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     instance.create()
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return fn(self, *args, **kwargs)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 600, in create
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     db_inst = db.instance_create(self._context, updates)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 748, in instance_create
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return IMPL.instance_create(context, values)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 170, in wrapper
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 154, in wrapper
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     ectxt.value = e.inner_exc
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 142, in wrapper
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 227, in wrapped
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(context, *args, **kwargs)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 1774, in instance_create
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     ec2_instance_create(context, instance_ref['uuid'])
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 170, in wrapper
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 227, in wrapped
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(context, *args, **kwargs)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 5286, in ec2_instance_create
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     ec2_instance_ref.save(context.session)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/models.py", line 50, in save
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     session.flush()
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2254, in flush
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     self._flush(objects)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2380, in _flush
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     transaction.rollback(_capture_exception=True)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     compat.reraise(exc_type, exc_value, exc_tb)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2344, in _flush
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     flush_context.execute()
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 391, in execute
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     rec.execute(self)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 556, in execute
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     uow
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 181, in save_obj
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     mapper, table, insert)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 866, in _emit_insert_statements
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     execute(statement, params)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return meth(self, multiparams, params)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return connection._execute_clauseelement(self, multiparams, params)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     compiled_sql, distilled_params
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     context)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1409, in _handle_dbapi_exception
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     util.raise_from_cause(newraise, exc_info)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     reraise(type(exception), exception, tb=exc_tb, cause=cause)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     context)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     cursor.execute(statement, parameters)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 170, in execute
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     result = self._query(query)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 328, in _query
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     conn.query(q)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 516, in query
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 727, in _read_query_result
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     result.read()
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1066, in read
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     first_packet = self.connection._read_packet()
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 683, in _read_packet
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     packet.check_error()
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/protocol.py", line 220, in check_error
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     err.raise_mysql_exception(self._data)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     raise errorclass(errno, errval)
  2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server DBError: (pymysql.err.InternalError) (1054, u"Unknown column 'trusted_certs' in 'field list'") [SQL: u'INSERT INTO instance_extra (created_at, updated_at, deleted_at, deleted, instance_uuid, device_metadata, numa_topology, pci_requests, flavor, vcpu_model, migration_context, keypairs, trusted_certs) VALUES (%(created_at)s, %(updated_at)s, %(deleted_at)s, %(deleted)s, %(instance_uuid)s, %(device_metadata)s, %(numa_topology)s, %(pci_requests)s, %(flavor)s, %(vcpu_model)s, %(migration_context)s, %(keypairs)s, %(trusted_certs)s)'] [parameters: {'instance_uuid': u'df1bd38c-67cb-4eb0-b2d2-ac08233dadae', 'keypairs': '{"nova_object.version": "1.3", "nova_object.name": "KeyPairList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}', 'pci_requests': '[]', 'vcpu_model': None, 'device_metadata': None, 'created_at': datetime.datetime(2019, 6, 12, 15, 0, 24, 430084), 'updated_at': None, 'numa_topology': None, 'trusted_certs': None, 'deleted': 0, 'migration_context': None, 'flavor': '{"new": null, "old": null, "cur": {"nova_object.version": "1.2", "nova_object.name": "Flavor", "nova_object.data": {"disabled": false, "root_gb": 214 ... (234 characters truncated) ... , "swap": 0, "rxtx_factor": 1.0, "is_public": true, "deleted_at": null, "vcpu_weight": 0, "id": 6, "name": "huge"}, "nova_object.namespace": "nova"}}', 'deleted_at': None}] (Background on this error at: http://sqlalche.me/e/2j85)
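
  The final DBError line carries the useful information: MySQL error
  1054, the column name, and the table in the INSERT statement. As a
  rough sketch (the regular expressions and the function name are
  illustrative assumptions, not part of nova), a conductor log could be
  scanned for such errors like this:

```python
import re

# Matches e.g.: (1054, u"Unknown column 'trusted_certs' in 'field list'")
# The "." stands for the opening quote, which may be ' or ".
UNKNOWN_COLUMN = re.compile(r"\(1054, u?.Unknown column '([^']+)' in '([^']+)'")
# Picks out the table name from the logged SQL statement, if present.
INSERT_TABLE = re.compile(r"INSERT INTO (\w+)")


def find_unknown_columns(lines):
    """Return (column, table) pairs for each 'Unknown column' error line.

    table is None when the line does not include an INSERT statement.
    """
    results = []
    for line in lines:
        col = UNKNOWN_COLUMN.search(line)
        if col is None:
            continue
        table = INSERT_TABLE.search(line)
        results.append((col.group(1), table.group(1) if table else None))
    return results
```

  It can be fed an open file object, for example
  find_unknown_columns(open("nova-conductor.log")), where the path is
  whatever the deployment uses.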

  Workaround
  ==========

  On the controller, perform a nova DB sync:

  docker exec -it nova_api nova-manage db sync

  Although this makes no changes to the database (checked with
  mysqldump), it appears to 'fix' nova: new instances created using the
  'huge' flavor now go to the ERROR state, as expected.
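
  The traceback passes through _bury_in_cell0, so the failing INSERT may
  be aimed at the cell0 database rather than the main nova database.
  When checking whether the column is present, it may therefore be worth
  looking at every nova schema. A minimal sketch, assuming a DB-API
  cursor that returns tuples and uses %s placeholders (as pymysql does);
  the function name and the schema names 'nova' and 'nova_cell0' are
  assumptions about the deployment:

```python
# Lists every schema in which the given table has the given column.
COLUMN_QUERY = (
    "SELECT table_schema FROM information_schema.columns "
    "WHERE table_name = %s AND column_name = %s"
)


def schemas_missing_column(cursor, schemas,
                           table="instance_extra", column="trusted_certs"):
    """Return the entries of schemas where table.column does not exist."""
    cursor.execute(COLUMN_QUERY, (table, column))
    present = {row[0] for row in cursor.fetchall()}
    return [schema for schema in schemas if schema not in present]
```

  With a pymysql connection conn, a call such as
  schemas_missing_column(conn.cursor(), ["nova", "nova_cell0"]) would
  list the schemas that lack the column; an empty list means it is
  present in all of them.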
