yahoo-eng-team team mailing list archive
Message #81450
[Bug 1857139] Re: TypeError: object of type 'object' has no len() from resources_from_request_spec when cells are down
Reviewed: https://review.opendev.org/700186
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0d9622f581e830e7b7bc9763aaa09ba02e99b8bb
Submitter: Zuul
Branch: master
commit 0d9622f581e830e7b7bc9763aaa09ba02e99b8bb
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date: Fri Dec 20 10:03:23 2019 -0500
Handle cell failures in get_compute_nodes_by_host_or_node
get_compute_nodes_by_host_or_node uses the scatter_gather_cells
function but did not handle a failure result, which is returned
when the called function raises an exception or the cell times
out. This causes problems because the caller of
get_compute_nodes_by_host_or_node expects a ComputeNodeList back
and may call len(nodes) on it, which fails when the result is a
failure marker that has no length.
To be clear, if a cell is down there are going to be problems
which likely result in a NoValidHost error during scheduling, but
this avoids an ugly TypeError traceback in the scheduler logs.
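The filtering described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the commit: did_not_respond_sentinel is a local stand-in for the marker in nova/context.py, is_cell_failure and first_usable_nodes are hypothetical helper names, and the sketch assumes a failed cell is represented either by that sentinel or by an exception instance.

```python
# Local stand-in for the timeout marker; nova defines its own in
# nova/context.py. Everything in this block is illustrative only.
did_not_respond_sentinel = object()


def is_cell_failure(result):
    """Return True if a per-cell result is a timeout marker or an exception.

    Assumption: failures are reported as the sentinel or as an exception
    instance stored in place of the real result.
    """
    return result is did_not_respond_sentinel or isinstance(result, Exception)


def first_usable_nodes(nodes_by_cell):
    """Return the first non-empty, non-failure per-cell result, else []."""
    for nodes in nodes_by_cell.values():
        if not is_cell_failure(nodes) and nodes:
            return nodes
    # No cell produced usable nodes; return an empty, sized container so
    # callers can still call len() on the result.
    return []


# cell1 timed out and cell0 has no matching nodes.
results = {
    "cell0": [],
    "cell1": did_not_respond_sentinel,
}
nodes = first_usable_nodes(results)
print(len(nodes))  # len() is safe because the sentinel was filtered out
```

With the failure results filtered out, the caller always receives something it can measure, so a down cell leads to an empty result (and most likely NoValidHost) rather than a TypeError.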
Change-Id: Ia54b5adf0a125ae1f9b86887a07dd1d79821dd54
Closes-Bug: #1857139
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1857139
Title:
TypeError: object of type 'object' has no len() from
resources_from_request_spec when cells are down
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) train series:
Confirmed
Bug description:
Seen here:
https://zuul.opendev.org/t/openstack/build/c187e207bc1c48a0a7fa49ef9798b696/log/logs/screen-n-sch.txt.gz#2529
cell1 is down, so the call to scatter_gather_cells in
get_compute_nodes_by_host_or_node yields a result, but it is not a
ComputeNodeList; it is the did_not_respond_sentinel object:
https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/scheduler/host_manager.py#L705
https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/context.py#L454
which results in an error here:
https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/scheduler/utils.py#L612
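The error message in the title can be reproduced outside of nova with a bare object() standing in for the sentinel. This is only an illustration of the mechanism; describe is a hypothetical helper, and the sketch assumes the sentinel is a plain object() instance, as the 'object' in the error text suggests.

```python
# Stand-in for the timeout marker: a plain object() has no __len__.
did_not_respond_sentinel = object()


def describe(nodes):
    """Report the node count, or the TypeError raised by len()."""
    try:
        return "%d node(s)" % len(nodes)
    except TypeError as exc:
        return "TypeError: %s" % exc


print(describe(did_not_respond_sentinel))
# prints: TypeError: object of type 'object' has no len()
```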
The HostManager.get_compute_nodes_by_host_or_node method should filter
failure and timeout results out of the scatter_gather_cells results.
We will get a NoValidHost either way, but that is better than a
traceback with the TypeError in it.