
yahoo-eng-team team mailing list archive

[Bug 1809061] [NEW] KeyError when booting multi-stagger-instances

 

Public bug reported:

Description
===========
When many instances are booted in bulk within a short time, and the instances
request different amounts of resources while the compute nodes also have
different capacities, a KeyError may appear in nova-scheduler.log.

Steps to reproduce
==================
For example, I have four compute nodes:
host1-3, with 24 CPUs and 120G RAM
host4, with 12 CPUs and 40G RAM

I then boot 12 instances at the same time, each from a separate command prompt.
One of them needs 16 CPUs and 48G RAM; the others need 1 CPU and 1G RAM each.

The fault then appears, and some of the instances go to ERROR.
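
The steps above could be scripted roughly as follows. This is only a sketch of the reproduction, not the commands the reporter actually ran: the flavor, image, network and server names are hypothetical placeholders, and the script defaults to a dry run that only prints the `openstack server create` commands it would launch in parallel.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names: substitute flavors, image and network from your own cloud.
LARGE_FLAVOR = "repro-16cpu-48g"   # assumed flavor with 16 CPUs and 48G RAM
SMALL_FLAVOR = "repro-1cpu-1g"     # assumed flavor with 1 CPU and 1G RAM
IMAGE = "cirros"
NETWORK = "private"


def build_commands(total=12):
    """Build one 'openstack server create' command per instance.

    The first instance uses the large flavor, all others the small one.
    """
    commands = []
    for index in range(total):
        flavor = LARGE_FLAVOR if index == 0 else SMALL_FLAVOR
        commands.append([
            "openstack", "server", "create",
            "--flavor", flavor,
            "--image", IMAGE,
            "--network", NETWORK,
            "repro-%02d" % index,
        ])
    return commands


def boot_all(commands, dry_run=True):
    """Launch all commands concurrently, or just render them when dry_run is set."""
    if dry_run:
        return [" ".join(command) for command in commands]
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        # Each worker runs one CLI call, so the requests reach nova close together.
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))


commands = build_commands()
preview = boot_all(commands)  # dry run: nothing is sent to the cloud
for line in preview:
    print(line)
```

Calling `boot_all(commands, dry_run=False)` would actually send the twelve requests at once; whether that triggers the error presumably depends on timing inside the scheduler.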


Expected result
===============
All instances boot successfully.

Actual result
=============
Some instances go to ERROR.

Environment
===========
OpenStack version: 
Queens

Hypervisor:
Libvirt + KVM

Storage:
LVM

Networking:
Neutron with Open vSwitch

Logs & Configs
==============
In nova-scheduler.log

2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server [req-7051b4f3-bfdc-4ca0-9436-8fc4448867c8 c3dba5032e49416896c7050ef6c3cad4 de45f83097b64290923d871f7350fd6e - detion during message handling: KeyError: (u'host4', u'host4')
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 160, in _process_incoming
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 232, in inner
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server return func(*args, **kwargs)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 179, in select_destinations
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server alloc_reqs_by_rp_uuid, provider_summaries)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 88, in select_destinations
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server alloc_reqs_by_rp_uuid, provider_summaries)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 167, in _schedule
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server placement_return_available_hosts = list(hosts)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py", line 794, in <genexpr>
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server return (self.host_state_map[host] for host in seen_nodes)
2018-12-10 15:05:15.029 26837 ERROR oslo_messaging.rpc.server KeyError: (u'host4', u'host4')
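
One possible reading of the last two frames is that the generator over `seen_nodes` is consumed only later, in `list(hosts)`, after another concurrent request has changed the shared `host_state_map`. The toy model below is not nova code. It assumes (this is not shown in the traceback) that each call also prunes map entries for nodes outside its own `seen_nodes`, which is what a request too large for host4 (16 CPUs against host4's 12) would do if host4 were not among its candidates. The function name `get_host_states` and the state values are hypothetical.

```python
# Toy model, not nova code: a shared dict plus a lazily evaluated generator.
host_state_map = {
    ("host%d" % i, "host%d" % i): "state-%d" % i for i in range(1, 5)
}


def get_host_states(seen_nodes):
    """Hypothetical stand-in for the scheduler's host-state lookup.

    Assumption: entries for nodes outside seen_nodes are pruned from the
    shared map, and the result is a generator that reads the map lazily.
    """
    for key in set(host_state_map) - seen_nodes:
        del host_state_map[key]
    return (host_state_map[host] for host in seen_nodes)


all_nodes = set(host_state_map)

# Small request: every host is a candidate. Nothing is read from the map yet.
small_request_hosts = get_host_states(all_nodes)

# Large request: host4 is too small, so it is not among the candidates.
# Under the pruning assumption this deletes host4 from the shared map.
large_request_hosts = get_host_states(all_nodes - {("host4", "host4")})

# The small request now consumes its generator and trips over the missing key.
try:
    list(small_request_hosts)
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print("missing key:", missing_key)
```

In this model, building a list inside `get_host_states` instead of returning a generator would avoid the error, because the lookups would finish before any other request could modify the map. Whether that matches the actual fix in nova is not something this report establishes.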

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1809061

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1809061/+subscriptions

