← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1680661] [NEW] HostMappingNotFound during update_instance_info

 

Public bug reported:

Noticed this in the nova-scheduler logs during a failed test run today:

http://logs.openstack.org/16/453916/4/check/gate-tempest-dsvm-py35
-ubuntu-
xenial/09850e6/logs/screen-n-sch.txt.gz?level=TRACE#_2017-04-06_23_11_45_834

2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server [req-36e4db2a-38e1-4810-9309-8893598e195a - -] Exception during message handling
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/server.py", line 157, in _process_incoming
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/manager.py", line 125, in update_instance_info
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     instance_info)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 774, in update_instance_info
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     self._recreate_instance_info(context, host_name)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 745, in _recreate_instance_info
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     inst_dict = self._get_instances_by_host(context, host_name)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 717, in _get_instances_by_host
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     hm = objects.HostMapping.get_by_host(context, host_name)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     result = fn(cls, context, *args, **kwargs)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/objects/host_mapping.py", line 100, in get_by_host
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     db_mapping = cls._get_by_host_from_db(context, host)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 963, in wrapper
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     return fn(*args, **kwargs)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/objects/host_mapping.py", line 95, in _get_by_host_from_db
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     raise exception.HostMappingNotFound(name=host)
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server nova.exception.HostMappingNotFound: Host 'ubuntu-xenial-internap-mtl01-8316357' is not mapped to any cell
2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server 

This is most likely a side effect of this change:
https://review.openstack.org/#/c/439891/

This could be a race on startup where the compute node is sending it's
instance information to the scheduler before the compute host is mapped
to a cell, in which case it wouldn't have any instances to send so the
scheduler will try to look them up from the database and blow up because
the host isn't mapped yet.

I see from the log:

Total number of compute nodes: 0 _async_init_instance_info
/opt/stack/new/nova/nova/scheduler/host_manager.py:439

Adding 0 instances for hosts 10-20 _async_init_instance_info
/opt/stack/new/nova/nova/scheduler/host_manager.py:459

It's definitely happening on startup.

** Affects: nova
     Importance: Medium
         Status: Triaged


** Tags: cells scheduler

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1680661

Title:
  HostMappingNotFound during update_instance_info

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  Noticed this in the nova-scheduler logs during a failed test run
  today:

  http://logs.openstack.org/16/453916/4/check/gate-tempest-dsvm-py35
  -ubuntu-
  xenial/09850e6/logs/screen-n-sch.txt.gz?level=TRACE#_2017-04-06_23_11_45_834

  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server [req-36e4db2a-38e1-4810-9309-8893598e195a - -] Exception during message handling
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/server.py", line 157, in _process_incoming
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/manager.py", line 125, in update_instance_info
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     instance_info)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 774, in update_instance_info
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     self._recreate_instance_info(context, host_name)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 745, in _recreate_instance_info
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     inst_dict = self._get_instances_by_host(context, host_name)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 717, in _get_instances_by_host
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     hm = objects.HostMapping.get_by_host(context, host_name)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     result = fn(cls, context, *args, **kwargs)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/objects/host_mapping.py", line 100, in get_by_host
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     db_mapping = cls._get_by_host_from_db(context, host)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.5/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 963, in wrapper
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     return fn(*args, **kwargs)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/objects/host_mapping.py", line 95, in _get_by_host_from_db
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server     raise exception.HostMappingNotFound(name=host)
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server nova.exception.HostMappingNotFound: Host 'ubuntu-xenial-internap-mtl01-8316357' is not mapped to any cell
  2017-04-06 23:11:45.834 24111 ERROR oslo_messaging.rpc.server 

  This is most likely a side effect of this change:
  https://review.openstack.org/#/c/439891/

  This could be a race on startup where the compute node is sending it's
  instance information to the scheduler before the compute host is
  mapped to a cell, in which case it wouldn't have any instances to send
  so the scheduler will try to look them up from the database and blow
  up because the host isn't mapped yet.

  I see from the log:

  Total number of compute nodes: 0 _async_init_instance_info
  /opt/stack/new/nova/nova/scheduler/host_manager.py:439

  Adding 0 instances for hosts 10-20 _async_init_instance_info
  /opt/stack/new/nova/nova/scheduler/host_manager.py:459

  It's definitely happening on startup.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1680661/+subscriptions


Follow ups