Message #96406
[Bug 2122036] Re: /os-hypervisors/detail takes too long to complete for 2.88 microversion
Based on
https://docs.openstack.org/project-team-guide/bugs.html#importance
I am triaging this as medium.
There are several workarounds, such as pagination, increasing the
request timeout, or using a microversion < 2.88, to either mitigate the
performance impact when uptime is not required or to eliminate it if
the data in the older microversion is sufficient for the request.
Given this is also an admin-only API, the scope of the impact is
limited, even if it is acute for those impacted.
Taken together with the fact that this has been latent for so long
without a report, this feels like a medium priority fix in general, but
I think we should backport this to stable branches.
** Changed in: nova
Importance: Undecided => Medium
** Changed in: nova
Assignee: (unassigned) => sean mooney (sean-k-mooney)
** Also affects: nova/2025.1
Importance: Undecided
Status: New
** Also affects: nova/2025.2
Importance: Medium
Assignee: sean mooney (sean-k-mooney)
Status: Fix Released
** Also affects: nova/2024.1
Importance: Undecided
Status: New
** Also affects: nova/2024.2
Importance: Undecided
Status: New
** Changed in: nova/2025.1
Importance: Undecided => Medium
** Changed in: nova/2024.2
Importance: Undecided => Medium
** Changed in: nova/2024.1
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2122036
Title:
/os-hypervisors/detail takes too long to complete for 2.88
microversion
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) 2024.1 series:
New
Status in OpenStack Compute (nova) 2024.2 series:
New
Status in OpenStack Compute (nova) 2025.1 series:
New
Status in OpenStack Compute (nova) 2025.2 series:
Fix Released
Bug description:
To Reproduce
Steps to reproduce the behavior:
In an Antelope environment with a huge number of compute nodes, run the "openstack hypervisor list" command. It can take more than 40 seconds to complete and produce output.
Expected behavior
The command completes quickly by default; extra delays are expected when the operator explicitly asks for extra data.
Bug impact
May block the command from completing with default timeouts (it will fail earlier because HAProxy will return 504). Also, we likely shouldn't enable time-consuming options by default.
Known workaround
Specify earlier API version (2.68 for example)
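As a rough illustration of the workarounds mentioned in this report (an
older microversion, optionally combined with pagination), the sketch
below builds, but does not send, a request for the hypervisor detail
listing. The endpoint URL, the helper name and the page size are
assumptions made for the example. The OpenStack-API-Version header and
the limit/marker query parameters are the usual compute API mechanisms
for microversion selection and pagination.

```python
from urllib.parse import urlencode


def build_hypervisor_detail_request(endpoint, microversion="2.87",
                                    limit=100, marker=None):
    """Return (url, headers) for GET /os-hypervisors/detail.

    Hypothetical helper: nothing is sent, and the caller is expected
    to add authentication (for example an X-Auth-Token header).
    """
    params = {"limit": limit}
    if marker is not None:
        # marker is the ID of the last hypervisor on the previous page
        params["marker"] = marker
    url = f"{endpoint.rstrip('/')}/os-hypervisors/detail?{urlencode(params)}"
    headers = {
        # Any microversion below 2.88 avoids the per-compute uptime lookup
        "OpenStack-API-Version": f"compute {microversion}",
        "Accept": "application/json",
    }
    return url, headers


# Placeholder endpoint, not a real deployment
url, headers = build_hypervisor_detail_request(
    "https://nova.example.com/v2.1", limit=50)
print(url)
print(headers)
```

With the openstack CLI, the same microversion pin can usually be
expressed with the --os-compute-api-version option.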
---
There is another independent case that can cause slowness. The uptime
RPC is only called on computes that are considered up, but if a compute
is down and that fact has not yet been detected by the conductor due to
a missing heartbeat, then the RPC is sent but never answered, causing
unnecessary delay in the API response.
---
The slowness occurs because the 2.88 /os-hypervisors/detail response
includes the compute uptime, and nova gathers that by making an RPC
call down to each compute sequentially.
An older microversion, where uptime is not part of that API, should be
used as a workaround.
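To make the cost of the sequential calls concrete, here is a
back-of-the-envelope model. It is only a sketch: the function name and
every number passed to it are illustrative assumptions, not
measurements from the reported environment. It treats each responsive
compute as one RPC round trip and each compute that is down but not yet
marked down as a full RPC timeout.

```python
def estimate_detail_latency(responsive, per_call_seconds,
                            unresponsive=0, rpc_timeout_seconds=0.0):
    """Estimate wall-clock time of sequential per-compute uptime calls.

    responsive: number of computes that answer the RPC
    per_call_seconds: assumed round-trip time of one answered call
    unresponsive: computes that are down but still considered up
    rpc_timeout_seconds: assumed time spent waiting on each of those
    """
    return (responsive * per_call_seconds
            + unresponsive * rpc_timeout_seconds)


# Illustrative inputs only
print(estimate_detail_latency(responsive=400, per_call_seconds=0.1))
print(estimate_detail_latency(responsive=400, per_call_seconds=0.1,
                              unresponsive=1, rpc_timeout_seconds=60.0))
```

The total grows linearly with the number of computes, and a single
unanswered call can add as much delay as all the answered ones
together.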
As a future mitigation we should implement a periodic task in
nova-compute that reports the uptime to the compute_nodes.stats JSON
blob in the cell DB, in a new service version, and change the API to
make the RPC call down to the compute only if the service version is
old. If the service version is new enough, the API can use the data
directly from the DB.
If we don't introduce a service version but instead use the existence
of the field in the json blob as a condition then we can probably make
the feature backportable.
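A minimal sketch of that field-existence check, under stated
assumptions: the record is modelled as a plain dict with a JSON string
under "stats", the key name "uptime" is hypothetical, and
rpc_get_uptime stands in for whatever RPC call the API uses today. This
is not nova code, only an illustration of the fallback logic.

```python
import json


def get_uptime(compute_node, rpc_get_uptime):
    """Return the uptime for one compute node record.

    compute_node: dict with "host" and an optional "stats" JSON string
        (hypothetical shape, not the real nova object).
    rpc_get_uptime: callable taking a host name; used only when the
        stats blob carries no "uptime" key (an older compute).
    """
    stats = json.loads(compute_node.get("stats") or "{}")
    if "uptime" in stats:
        # Newer compute: the periodic task already stored the value
        return stats["uptime"]
    # Older compute: fall back to the per-host RPC call
    return rpc_get_uptime(compute_node["host"])


rpc_calls = []


def fake_rpc(host):
    # Stand-in for the real RPC; records which hosts were called
    rpc_calls.append(host)
    return "uptime-from-rpc"


nodes = [
    {"host": "cmp1", "stats": json.dumps({"uptime": "uptime-from-db"})},
    {"host": "cmp2", "stats": "{}"},
]
print([get_uptime(node, fake_rpc) for node in nodes])
print(rpc_calls)
```

Only computes whose blob lacks the field trigger an RPC call, so the
cost shrinks as computes are upgraded.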
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2122036/+subscriptions