yahoo-eng-team team mailing list archive

[Bug 2045168] [NEW] instances page fails to load if it takes more than 26 seconds

 

Public bug reported:

Focal-ussuri customer env with lots of resources.

When loading the Project > Instances page, if loading the data takes
more than 26 seconds in total, the page enters a reload loop until the
browser times out after 5 minutes.

The 26 seconds number was obtained in the following way:

1) A 5-minute browser timeout was observed when trying to load the page.
2) The logs showed that some queries were taking very long, e.g. glance ~12 seconds and neutron ~8 seconds. Queries to nova take at most 3 seconds.
3) In a separate env with zero resources, where the page loads instantly, I added a time.sleep in the api/glance.py file where glance is invoked for images (glance is invoked multiple times when loading the instances page). With a 14-second sleep the page times out after 5 minutes, but with a 13-second sleep it does not time out and loads quickly. In the 14-second case, I tailed the logs and noticed that the same group of requests was being repeated for a while, always starting with the flavors request. With the 13-second sleep the requests did not repeat.
4) I removed the sleep from the api/glance.py file and added a 26-second sleep in the get_data method of the project/instances/views.py file, right after

    image_dict, flavor_dict, volume_dict = \
        futurist_utils.call_functions_parallel(
            self._get_images, self._get_flavors, self._get_volumes)

With a 26-second sleep the page neither times out nor repeats the
requests; it loads fine. With a 27-second sleep, however, it times out
after 5 minutes and keeps repeating the requests in the logs.
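
The delay injection used in steps 3 and 4 can be sketched as a small
helper. This is only an illustration of the technique, not Horizon code:
with_injected_delay, timed_call and fake_get_data are hypothetical names,
and the short delay stands in for the 26 and 27 second sleeps.

```python
import functools
import time


def with_injected_delay(seconds):
    """Return a decorator that sleeps `seconds` after the wrapped call.

    Hypothetical helper: it mimics adding time.sleep() right after a
    call, as was done by hand in get_data.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            time.sleep(seconds)  # artificial delay, for testing only
            return result
        return wrapper
    return decorator


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.monotonic()
    result = func(*args, **kwargs)
    return result, time.monotonic() - start


# Stand-in for a slow data-loading method; not the real get_data.
@with_injected_delay(0.05)
def fake_get_data():
    return ["instance-1"]


if __name__ == "__main__":
    data, elapsed = timed_call(fake_get_data)
    print(f"loaded {len(data)} item(s) in {elapsed:.2f}s")
```

Raising the delay one second at a time and checking the logs for repeated
requests is how the 26/27 second boundary above was located.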

My conclusion is that the get_data method does not tolerate taking
longer than 26 seconds: past that point the page "reloads" itself,
entering a loop that never finishes as long as the page cannot be
loaded in 26 seconds or less.
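
As a rough cross-check, the per-call numbers from step 3 and the totals
from step 4 can be compared. The sketch below assumes (this is not
verified against the Horizon code) that the glance sleeps add up
sequentially and that nothing else takes measurable time in the empty
env. consistent_call_counts is a hypothetical helper name.

```python
def consistent_call_counts(per_call_ok, per_call_fail,
                           total_ok, total_fail, max_calls=10):
    """List the call counts n that fit both sets of observations.

    n fits if n sleeps of per_call_ok stay within the largest total
    delay seen to work, and n sleeps of per_call_fail reach the
    smallest total delay seen to fail.
    """
    return [
        n for n in range(1, max_calls + 1)
        if n * per_call_ok <= total_ok and n * per_call_fail >= total_fail
    ]


if __name__ == "__main__":
    # Step 3: 13 s per glance call works, 14 s fails.
    # Step 4: 26 s in get_data works, 27 s fails.
    print(consistent_call_counts(13, 14, 26, 27))  # [2]
```

Under those assumptions, the two experiments agree with each other if
the glance sleep is hit twice in sequence per page load (2 x 13 = 26
works, 2 x 14 = 28 fails), which points at a single limit between 26
and 27 seconds.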

Ideally this internal timeout that causes a reload loop should be
configurable and more tolerant by default.

** Affects: horizon
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/2045168

Title:
  instances page fails to load if it takes more than 26 seconds

Status in OpenStack Dashboard (Horizon):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/2045168/+subscriptions


