yahoo-eng-team mailing list archive, Message #72625
[Bug 1767139] [NEW] TypeError in _get_inventory_and_update_provider_generation
Public bug reported:
Description
===========
Bringing up a new cluster as part of our CI after switching from 16.1.0 to
16.1.1 on CentOS, I'm seeing this error on some computes:
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager Traceback (most recent call last):
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6752, in update_available_resource_for_node
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager rt.update_available_resource(context, nodename)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 704, in update_available_resource
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager self._update_available_resource(context, resources)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager return f(*args, **kwargs)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 728, in _update_available_resource
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager self._init_compute_node(context, resources)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 585, in _init_compute_node
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager self._update(context, cn)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 886, in _update
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager inv_data,
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 64, in set_inventory_for_provider
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager inv_data,
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager return getattr(self.instance, __name)(*args, **kwargs)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 789, in set_inventory_for_provider
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager self._update_inventory(rp_uuid, inv_data)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 56, in wrapper
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager return f(self, *a, **k)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 675, in _update_inventory
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager if self._update_inventory_attempt(rp_uuid, inv_data):
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 562, in _update_inventory_attempt
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager curr = self._get_inventory_and_update_provider_generation(rp_uuid)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 546, in _get_inventory_and_update_provider_generation
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager if server_gen != my_rp['generation']:
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager TypeError: 'NoneType' object has no attribute '__getitem__'
The error persists for the lifetime of a single nova-compute run.
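The failure mode can be reproduced outside nova: `my_rp` is the cached resource provider record in report.py, and if the placement call that populates it failed, the cache entry is `None`. Subscripting `None` then raises exactly this TypeError (Python 2 reports `'NoneType' object has no attribute '__getitem__'`; Python 3 reports `'NoneType' object is not subscriptable`). A minimal sketch, with `get_provider_generation` as a hypothetical stand-in for the check at report.py line 546:

```python
def get_provider_generation(my_rp):
    """Stand-in for the check in _get_inventory_and_update_provider_generation.

    my_rp mimics the cached resource provider dict; it is None when the
    provider was never successfully created or fetched from placement.
    """
    return my_rp['generation']  # raises TypeError when my_rp is None


# With a populated cache entry the lookup works:
print(get_provider_generation({'generation': 42}))

# With an empty cache entry it raises the TypeError from the traceback above:
try:
    get_provider_generation(None)
except TypeError as exc:
    print(exc)
```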
Steps to reproduce
==================
Nodes were started by our CI infrastructure. We start 3 computes and a
single control node. In 50% of cases, one of the computes comes up in
this bad state.
Expected result
===============
Working cluster.
Actual result
=============
At least one of the 3 compute nodes fails to join the cluster: it is not
picked up by discover_hosts, and the above stack trace is repeated in the
nova-compute logs.
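One way this could be made non-fatal is a defensive guard around the cached provider lookup. This is a sketch of an assumed fix, not nova's actual patch; `safe_provider_generation` and its logging are hypothetical:

```python
import logging

LOG = logging.getLogger(__name__)


def safe_provider_generation(my_rp, rp_uuid):
    """Return the cached provider generation, or None when the cache entry
    is missing, so the periodic task can skip this pass and retry later
    instead of raising TypeError.

    Hypothetical helper; nova's real fix may differ.
    """
    if my_rp is None:
        LOG.warning('Resource provider %s not found in cache; '
                    'skipping inventory update this cycle.', rp_uuid)
        return None
    return my_rp['generation']
```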
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/
$ rpm -qa | grep nova
python-nova-16.1.1-1.el7.noarch
openstack-nova-common-16.1.1-1.el7.noarch
python2-novaclient-9.1.1-1.el7.noarch
openstack-nova-api-16.1.1-1.el7.noarch
openstack-nova-compute-16.1.1-1.el7.noarch
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
$ rpm -qa | grep kvm
libvirt-daemon-kvm-3.2.0-14.el7_4.9.x86_64
qemu-kvm-common-ev-2.9.0-16.el7_4.14.1.x86_64
qemu-kvm-ev-2.9.0-16.el7_4.14.1.x86_64
3. Which storage type did you use?
(For example: Ceph, LVM, GPFS, ...)
What's the version of that?
Not sure
4. Which networking type did you use?
(For example: nova-network, Neutron with OpenVSwitch, ...)
Neutron with Calico (I work on Calico, this is our CI system)
** Affects: nova
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1767139