yahoo-eng-team team mailing list archive

[Bug 1781300] [NEW] resize reschedule results in CantStartEngineError during up-call to InstanceMappings table

 

Public bug reported:

Seen here:

http://logs.openstack.org/27/581727/1/check/tempest-full-py3/15d7fdc/controller/logs/screen-n-cpu.txt#_Jul_11_13_32_54_822996

Jul 11 13:32:54.822996 ubuntu-xenial-rax-ord-0000660028 nova-compute[22966]: ERROR nova.compute.manager [None req-2b322ff2-8b41-4066-921d-f801f9defdaf tempest-DeleteServersTestJSON-1048472163 tempest-DeleteServersTestJSON-1048472163] [instance: 968d92c5-c972-4368-a2ce-fe8aac8c656c] Error trying to reschedule: oslo_messaging.rpc.client.RemoteError: Remote error: CantStartEngineError No sql_connection parameter is established

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/conductor/manager.py", line 71, in wrapper
    context, instance.uuid)
  File "/usr/local/lib/python3.5/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper
    result = fn(cls, context, *args, **kwargs)
  File "/opt/stack/nova/nova/objects/instance_mapping.py", line 72, in get_by_instance_uuid
    db_mapping = cls._get_by_instance_uuid_from_db(context, instance_uuid)
  File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 992, in wrapper
    with self._transaction_scope(context):
  File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1042, in _transaction_scope
    context=context) as resource:
  File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 645, in _session
    bind=self.connection, mode=self.mode)
  File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 409, in _create_session
    self._start()
  File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 496, in _start
    engine_args, maker_args)
  File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 518, in _setup_for_connection
    "No sql_connection parameter is established")
oslo_db.exception.CantStartEngineError: No sql_connection parameter is established

This happens because, in the default superconductor mode in devstack,
the n-cpu and n-cond-cell1 services aren't configured with a connection
to the nova API DB, so they can't query the instance mappings table
there. When nova-compute casts to the cell conductor's migrate_server
method, that method is decorated with @targets_cell, which tries to look
up the instance mapping to find the cell and blows up.

In this reschedule scenario, the instance (and context) are actually
already targeted to a cell so we should be able to just short-circuit
the targets_cell decorator.
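
A minimal sketch of what such a short-circuit could look like, not the
actual nova code. Everything here is a hypothetical stand-in: FakeContext,
FakeInstance, FakeConductor, lookup_cell_connection, and the assumption
that a non-None db_connection on the context means it is already targeted
to a cell.

```python
import functools


class FakeContext:
    """Hypothetical stand-in for a request context."""

    def __init__(self, db_connection=None):
        # Assumption: non-None once the context is targeted to a cell.
        self.db_connection = db_connection


class FakeInstance:
    """Hypothetical stand-in for an instance object."""

    def __init__(self, uuid):
        self.uuid = uuid


def lookup_cell_connection(context, instance_uuid):
    """Stand-in for the API DB lookup of the instance mapping.

    It always fails, mimicking a cell conductor that has no API DB
    connection configured.
    """
    raise RuntimeError("No sql_connection parameter is established")


def targets_cell(fn):
    """Simplified decorator: only look up the cell when still untargeted."""

    @functools.wraps(fn)
    def wrapper(self, context, instance, *args, **kwargs):
        if context.db_connection is None:
            # Not targeted yet, so the up-call to the API DB is needed.
            context.db_connection = lookup_cell_connection(
                context, instance.uuid)
        # Already targeted (the reschedule case): skip the lookup.
        return fn(self, context, instance, *args, **kwargs)

    return wrapper


class FakeConductor:
    @targets_cell
    def migrate_server(self, context, instance):
        return "migrating %s via %s" % (instance.uuid, context.db_connection)
```

With a context that is already targeted, migrate_server runs without
touching the lookup; with an untargeted context the lookup is attempted
and fails the same way as in the log above.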

** Affects: nova
     Importance: Medium
         Status: Triaged


** Tags: cells reschedule resize upcall

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1781300

Title:
  resize reschedule results in CantStartEngineError during up-call to
  InstanceMappings table

Status in OpenStack Compute (nova):
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1781300/+subscriptions

