
yahoo-eng-team team mailing list archive

[Bug 2122036] Re: /os-hypervisors/detail takes too long to complete for 2.88 microversion

 

Based on
https://docs.openstack.org/project-team-guide/bugs.html#importance

I am triaging this as medium.

There are several workarounds, such as pagination, increasing the
request timeout, or using a microversion < 2.88,

to either mitigate the performance impact when uptime is not required or
to eliminate it if the data in the older microversion is sufficient for
the request.

Given this is also an admin-only API, the scope of the impact is limited,
even if it is acute for those impacted.

Taken together with the fact that this has been latent for so long
without a report, this feels like a medium priority fix in general, but I
think we should backport this to stable branches.

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
     Assignee: (unassigned) => sean mooney (sean-k-mooney)

** Also affects: nova/2025.1
   Importance: Undecided
       Status: New

** Also affects: nova/2025.2
   Importance: Medium
     Assignee: sean mooney (sean-k-mooney)
       Status: Fix Released

** Also affects: nova/2024.1
   Importance: Undecided
       Status: New

** Also affects: nova/2024.2
   Importance: Undecided
       Status: New

** Changed in: nova/2025.1
   Importance: Undecided => Medium

** Changed in: nova/2024.2
   Importance: Undecided => Medium

** Changed in: nova/2024.1
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2122036

Title:
  /os-hypervisors/detail takes too long to complete for 2.88
  microversion

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) 2024.1 series:
  New
Status in OpenStack Compute (nova) 2024.2 series:
  New
Status in OpenStack Compute (nova) 2025.1 series:
  New
Status in OpenStack Compute (nova) 2025.2 series:
  Fix Released

Bug description:
  Steps to reproduce the behavior:
  In an Antelope environment with a huge number of compute nodes, run the "openstack hypervisor list" command. It can take more than 40 seconds to complete and produce output.

  Expected behavior
  The command completes quickly by default; extra delays are expected only when the operator explicitly asks for extra data.

  Bug impact
  May prevent the command from completing with default timeouts (it fails earlier because HAProxy returns a 504). Also, we probably shouldn't enable time-consuming options by default.

  Known workaround
  Specify an earlier API version (2.68, for example)
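
A minimal sketch of the workaround at the HTTP level, assuming the client talks to the compute API directly. The endpoint URL and token below are placeholders, and the helper name is hypothetical; the only part taken from the compute API itself is the OpenStack-API-Version header, which pins the request to an older microversion so that uptime is not requested.

```python
def build_hypervisor_request(compute_url, token, microversion="2.68"):
    """Return (url, headers) for GET /os-hypervisors/detail.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    url = compute_url.rstrip("/") + "/os-hypervisors/detail"
    headers = {
        "X-Auth-Token": token,
        # Pin the compute microversion below 2.88.
        "OpenStack-API-Version": "compute " + microversion,
    }
    return url, headers


# Placeholder endpoint and token, for illustration only.
url, headers = build_hypervisor_request(
    "http://controller.example/compute/v2.1/", "TOKEN")
```

The returned pair can be passed to any HTTP client. With the openstack CLI, the equivalent is the --os-compute-api-version option.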

  ---

  There is another, independent case that can cause slowness. The uptime
  RPC is only called on computes that are considered up. However, if a
  compute is down but the conductor has not yet detected this from the
  missing heartbeat, then the RPC is sent but never answered, causing
  unnecessary delay in the API response.

  ---

  The slowness occurs because, at 2.88, hypervisors/detail includes the
  compute uptime, and nova gathers that by making an RPC call down to
  each compute sequentially.
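
As a rough illustration of why sequential calls hurt, the total wait is the sum of the individual RPC round trips, so one unanswered call adds its full timeout. The figures below (400 computes, 0.1 s per call, a 60 s timeout) are assumed for the sake of the arithmetic and are not measurements from this bug.

```python
def sequential_latency(rpc_seconds):
    """Total wait when uptime RPCs are issued one after another."""
    return sum(rpc_seconds)


# Assumed numbers, for illustration only.
healthy = [0.1] * 400                 # every compute answers quickly
with_dead_host = healthy + [60.0]     # one call runs into its timeout

baseline = sequential_latency(healthy)
degraded = sequential_latency(with_dead_host)
print(round(baseline, 1), round(degraded, 1))
```

Under these assumptions the healthy case alone lands around 40 seconds, and a single undetected-down compute adds a further minute.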

  An older microversion, where uptime is not part of that API, should be
  used as a workaround.

  As a future mitigation we should implement a periodic task in
  nova-compute that reports the uptime to the compute_nodes.stats
  JSON blob in the cell DB, in a new service version. The API would then
  RPC down to the compute only if the service version is old. If the
  service version is new enough, the API can use the data directly from
  the DB.

  If we don't introduce a service version but instead use the existence
  of the field in the JSON blob as the condition, then we can probably
  make the feature backportable.
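
The field-existence variant could look roughly like the sketch below. This is not nova code: the record layout (a dict with "host" and a JSON "stats" string), the "uptime" key name, and the rpc_get_uptime callable are all assumptions made for illustration.

```python
import json


def get_uptime(compute_node, rpc_get_uptime):
    """Prefer uptime stored in the stats blob; fall back to RPC.

    compute_node: dict with "host" and an optional JSON "stats" string
    (hypothetical layout). rpc_get_uptime: callable taking a host name.
    """
    stats = json.loads(compute_node.get("stats") or "{}")
    if "uptime" in stats:
        # Newer compute: the periodic task already wrote the value.
        return stats["uptime"]
    # Older compute: field absent, so ask the compute directly.
    return rpc_get_uptime(compute_node["host"])


rpc_calls = []


def fake_rpc(host):
    # Stand-in for the real RPC call; records which hosts were contacted.
    rpc_calls.append(host)
    return "up 1 day"


new_node = {"host": "cmp1", "stats": json.dumps({"uptime": "up 3 days"})}
old_node = {"host": "cmp2", "stats": None}
results = [get_uptime(n, fake_rpc) for n in (new_node, old_node)]
```

Only the compute without the field triggers an RPC, which is what would let mixed old and new computes coexist without a service version check.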

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2122036/+subscriptions


