yahoo-eng-team team mailing list archive

[Bug 1807219] Re: SchedulerReportClient init slows down nova-api startup

 

Reviewed:  https://review.openstack.org/623246
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=66e44c64297e77395195d77104017fb6fcea58d2
Submitter: Zuul
Branch:    master

commit 66e44c64297e77395195d77104017fb6fcea58d2
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Thu Dec 6 11:17:15 2018 -0500

    Only construct SchedulerReportClient on first access from API
    
    With commit 5d1a50018510b2b281ad33895ae2d9555f5d5b05 each
    API/AggregateAPI class instance constructs a SchedulerReportClient
    which holds an in-memory lock during its initialization.
    
    With at least 34 API extensions constructing at least
    one of those two API classes, the accumulated effect of the
    SchedulerReportClient construction can slow down nova-api startup
    times, especially when running with multiple API workers, like
    in our tempest-full CI job (there are 2 workers, so 68 inits).
    
    This change simply defers constructing the SchedulerReportClient
    until it is used, which is only in a few spots in the API code,
    which should help with nova-api start times.
    
    The AggregateAPI also has to construct the SchedulerQueryClient
    separately because SchedulerClient creates both the query and
    report clients.
    
    Long-term we could consider making it a singleton in nova.compute.api
    if that is safe (the aggregate code might be relying on some caching
    aspects in the SchedulerReportClient).
    
    Change-Id: Idf6e548d725db0181629a451f46b6a3a5850d186
    Closes-Bug: #1807219
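
The deferral described in the commit message can be illustrated with a minimal sketch. This is not the nova code: FakeReportClient, the API stand-in class, and the placementclient attribute name are hypothetical, chosen only to show a client that is built on first access rather than in __init__.

```python
import threading


class FakeReportClient:
    """Hypothetical stand-in for SchedulerReportClient (not nova's class)."""

    _init_lock = threading.Lock()  # mimics the in-memory lock held during init
    init_count = 0                 # counts constructions, for illustration only

    def __init__(self):
        with FakeReportClient._init_lock:
            FakeReportClient.init_count += 1


class API:
    """Hypothetical stand-in for nova.compute.api.API."""

    def __init__(self):
        # Nothing is constructed here, so creating an API object stays cheap.
        self._placementclient = None

    @property
    def placementclient(self):
        # Build the client on first access and reuse it afterwards.
        if self._placementclient is None:
            self._placementclient = FakeReportClient()
        return self._placementclient


# 34 extensions each constructing one API object, as in the commit message.
apis = [API() for _ in range(34)]
constructed_at_startup = FakeReportClient.init_count

# Only the code path that actually needs the client pays for construction.
client = apis[0].placementclient
same_client = apis[0].placementclient
constructed_after_use = FakeReportClient.init_count
```

In this sketch no client is built while the 34 API objects are created, and repeated access on the same API object returns the same client instance.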


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1807219

Title:
  SchedulerReportClient init slows down nova-api startup

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  This is split out from bug 1807044, where nova-api sometimes takes more
  than 60 seconds to start on slow CI nodes, which causes devstack to
  time out and fail.

  Specifically, every nova.compute.api.API constructs a
  SchedulerReportClient, which grabs an in-memory lock per API worker
  during init:

  Dec 05 20:14:27.694593 ubuntu-xenial-ovh-bhs1-0000959981
  devstack@n-api.service[23459]: DEBUG oslo_concurrency.lockutils [None
  req-dfdfad07-2ff4-43ed-9f67-2acd59687e0c None None] Lock
  "placement_client" released by
  "nova.scheduler.client.report._create_client" :: held 0.006s
  {{(pid=23462) inner
  /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:339}}

  We could probably be smarter about either making that a singleton in
  the API or only init on first access since most of the API extensions
  aren't going to even use that SchedulerReportClient.

  There are at least 30 extensions in nova-api that create an API class,
  and there are 2 workers in devstack jobs, and each API class
  constructs that report client which has a lock during init, so it will
  have a snowball effect on startup.

  Furthermore, with this change:

  https://review.openstack.org/#/c/615641/

  The NOTE in the API is no longer true:

  https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/compute/api.py#L256

  So the API likely just needs to add its own lazy-load behavior for
  that client.
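
For comparison, the singleton alternative mentioned in the bug description could look roughly like the sketch below. All names here (FakeReportClient, get_report_client, the module-level variables) are hypothetical and not part of nova; the sketch only shows one process-wide instance created under a lock on first use.

```python
import threading


class FakeReportClient:
    """Hypothetical stand-in for SchedulerReportClient (not nova's class)."""

    instances = 0  # counts constructions, for illustration only

    def __init__(self):
        FakeReportClient.instances += 1


_client = None                    # the single shared instance, once created
_client_lock = threading.Lock()   # guards first-time construction


def get_report_client():
    """Return the shared client, constructing it on the first call only."""
    global _client
    if _client is None:
        with _client_lock:
            # Re-check inside the lock: another thread may have won the race.
            if _client is None:
                _client = FakeReportClient()
    return _client


# Several threads asking for the client at once still yield one construction.
threads = [threading.Thread(target=get_report_client) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A shared instance also shares any state the client caches, which is the safety concern the commit message raises about the aggregate code.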

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1807219/+subscriptions

