yahoo-eng-team team mailing list archive
Message #79584
[Bug 1839674] Re: ResourceTracker.compute_nodes won't try to create a ComputeNode a second time if the first create() fails
** Also affects: nova/ocata
Importance: Undecided
Status: New
** Also affects: nova/rocky
Importance: Undecided
Status: New
** Also affects: nova/stein
Importance: Undecided
Status: New
** Also affects: nova/pike
Importance: Undecided
Status: New
** Also affects: nova/queens
Importance: Undecided
Status: New
** Changed in: nova/ocata
Status: New => Triaged
** Changed in: nova/pike
Status: New => Triaged
** Changed in: nova/queens
Status: New => Triaged
** Changed in: nova/stein
Status: New => Triaged
** Changed in: nova/pike
Importance: Undecided => Medium
** Changed in: nova/rocky
Importance: Undecided => Medium
** Changed in: nova/queens
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1839674
Title:
ResourceTracker.compute_nodes won't try to create a ComputeNode a
second time if the first create() fails
Status in OpenStack Compute (nova):
Triaged
Status in OpenStack Compute (nova) ocata series:
Triaged
Status in OpenStack Compute (nova) pike series:
Triaged
Status in OpenStack Compute (nova) queens series:
Triaged
Status in OpenStack Compute (nova) rocky series:
New
Status in OpenStack Compute (nova) stein series:
Triaged
Bug description:
I found this while writing a functional recreate test for bug 1839560.
As of this change in Ocata:
https://github.com/openstack/nova/commit/1c967593fbb0ab8b9dc8b0b509e388591d32f537
The ResourceTracker.compute_nodes dict will store the ComputeNode
object *before* trying to create it:
https://github.com/openstack/nova/blob/6b7d0caad86fe32ffc49a8672de1eb7258f3b919/nova/compute/resource_tracker.py#L570-L571
The problem is that if ComputeNode.create() fails for any reason, the
next run through update_available_resource won't try to create the
ComputeNode again, because the node is already in the compute_nodes
dict and this check short-circuits:
https://github.com/openstack/nova/blob/6b7d0caad86fe32ffc49a8672de1eb7258f3b919/nova/compute/resource_tracker.py#L546
And eventually you get errors like this:
b'2019-08-09 17:02:59,356 ERROR [nova.compute.manager] Error updating resources for node node2.'
b'Traceback (most recent call last):'
b' File "/home/osboxes/git/nova/nova/compute/manager.py", line 8250, in _update_available_resource_for_node'
b' startup=startup)'
b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 715, in update_available_resource'
b' self._update_available_resource(context, resources, startup=startup)'
b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner'
b' return f(*args, **kwargs)'
b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 796, in _update_available_resource'
b' self._update(context, cn, startup=startup)'
b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 1052, in _update'
b' self.old_resources[nodename] = old_compute'
b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__'
b' self.force_reraise()'
b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise'
b' six.reraise(self.type_, self.value, self.tb)'
b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/six.py", line 693, in reraise'
b' raise value'
b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 1046, in _update'
b' compute_node.save()'
b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper'
b' return fn(self, *args, **kwargs)'
b' File "/home/osboxes/git/nova/nova/objects/compute_node.py", line 352, in save'
b' db_compute = db.compute_node_update(self._context, self.id, updates)'
b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 67, in getter'
b' self.obj_load_attr(name)'
b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 603, in obj_load_attr'
b' _("Cannot load \'%s\' in the base class") % attrname)'
b"NotImplementedError: Cannot load 'id' in the base class"
We should only add the ComputeNode to the compute_nodes dict after
create() has succeeded.
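To illustrate the ordering problem outside of nova, here is a minimal, self-contained sketch. It is not the nova code: FlakyDB, FakeComputeNode, BuggyTracker, FixedTracker and run_twice are hypothetical stand-ins, and the simulated save() failure only mimics the NotImplementedError seen in the traceback above. It assumes the first create() fails and later ones succeed.

```python
class CreateFailed(Exception):
    """Simulated failure of the first create() call."""


class FlakyDB:
    """Hypothetical backend: the first create fails, later ones succeed."""

    def __init__(self):
        self.create_calls = 0

    def create(self):
        self.create_calls += 1
        if self.create_calls == 1:
            raise CreateFailed("simulated DB failure")
        return self.create_calls  # pretend this is the new row id


class FakeComputeNode:
    """Hypothetical stand-in for a ComputeNode object."""

    def __init__(self, db):
        self._db = db
        self.id = None  # only set once create() succeeds

    def create(self):
        self.id = self._db.create()

    def save(self):
        # Mimics the error from the traceback: there is no id to update.
        if self.id is None:
            raise NotImplementedError("Cannot load 'id' in the base class")


class BuggyTracker:
    """Caches the node *before* create(), as described in the bug."""

    def __init__(self, db):
        self.db = db
        self.compute_nodes = {}

    def init_compute_node(self, nodename):
        if nodename in self.compute_nodes:
            return self.compute_nodes[nodename]
        cn = FakeComputeNode(self.db)
        self.compute_nodes[nodename] = cn  # cached even if create() raises
        cn.create()
        return cn


class FixedTracker(BuggyTracker):
    """Caches the node only after create() has succeeded."""

    def init_compute_node(self, nodename):
        if nodename in self.compute_nodes:
            return self.compute_nodes[nodename]
        cn = FakeComputeNode(self.db)
        cn.create()
        self.compute_nodes[nodename] = cn
        return cn


def run_twice(tracker_cls):
    """Run two periodic passes and record the outcome of each."""
    tracker = tracker_cls(FlakyDB())
    outcomes = []
    for _ in range(2):
        try:
            tracker.init_compute_node("node2").save()
            outcomes.append("ok")
        except Exception as exc:
            outcomes.append(type(exc).__name__)
    return outcomes
```

With the buggy ordering the second pass reuses the half-initialized object and fails in save(); with the fixed ordering the second pass retries create() and recovers.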
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1839674/+subscriptions