
yahoo-eng-team team mailing list archive

[Bug 1786055] Re: performance degradation in placement with large number of resource providers

 

I'm going to close this bug to help keep our backlog down since the two
patches to fix the major factor in the performance degradation have
landed.

"<cdent> melwitt: hmmm. There is more that can be done, but not likely
that more will be done _now_, so I would guess closed is probably a
reasonable state. The major factor has been addressed. Fixing the rest
will involve considerable refactoring"

** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1786055

Title:
  performance degradation in placement with large number of resource
  providers

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Using today's master, there is a big performance degradation in GET
  /allocation_candidates when there is a large number of resource
  providers (in my tests 1000, each with the same inventory as described
  in [1]). The request takes 17s when querying all three resource classes
  with
  http://127.0.0.1:8081/allocation_candidates?resources=VCPU:1,MEMORY_MB:256,DISK_GB:10

  Using a limit does not make any difference; the cost is in generating
  the original data.
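
A toy sketch of why a limit would not help if, as described above, the
cost is in generating the full data set and the limit is only applied
afterwards. All names here (generate_candidates, get_candidates) are
hypothetical and are not taken from the placement code; this only
models the shape of the problem.

```python
def generate_candidates(provider_count, counter):
    """Build one candidate per provider; this stands in for the expensive step."""
    candidates = []
    for i in range(provider_count):
        counter["built"] += 1  # count how much work was actually done
        candidates.append(f"candidate-{i}")
    return candidates


def get_candidates(provider_count, limit=None):
    """Generate everything first, then truncate (assumed ordering of steps)."""
    counter = {"built": 0}
    candidates = generate_candidates(provider_count, counter)
    if limit is not None:
        candidates = candidates[:limit]  # truncation happens after the work
    return candidates, counter["built"]


limited, built_with_limit = get_candidates(1000, limit=5)
unlimited, built_without_limit = get_candidates(1000)
```

In this model the limited call returns fewer results but builds just as
many candidates as the unlimited one, so the generation cost is the same.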

  I did some advanced LOG.debug-based benchmarking and found three
  places where things are a problem, and may even have fixed the worst
  one. See the diff below. The two main culprits are
  ResourceProvider.get_by_uuid calls made in loops over the full set of
  providers. These can be replaced either by using data we already have
  from earlier queries, or by restructuring so that we make single
  queries.

  In the diff I've already changed one of them (the second chunk) to use
  the data that _build_provider_summaries is already getting.
  (functional tests still pass with this change)
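
A minimal sketch of the idea behind that change, using a hypothetical
in-memory CountingStore in place of the database. Provider,
CountingStore and the helper functions are illustrative assumptions,
not nova or placement APIs. It compares re-fetching each provider by
UUID with reusing the objects an earlier step already loaded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    uuid: str


class CountingStore:
    """Hypothetical stand-in for the database; counts per-UUID lookups."""

    def __init__(self, uuids):
        self._rows = {u: Provider(u) for u in uuids}
        self.lookups = 0

    def get_by_uuid(self, uuid):
        self.lookups += 1
        return self._rows[uuid]


def build_summaries(store, uuids):
    """Earlier step: loads every provider once, keyed by UUID."""
    return {u: store.get_by_uuid(u) for u in uuids}


def requests_with_refetch(store, summaries):
    """Looks each provider up again, one lookup per provider."""
    return [store.get_by_uuid(u) for u in summaries]


def requests_with_reuse(summaries):
    """Reuses the provider objects the summaries already hold."""
    return list(summaries.values())


uuids = [f"rp-{i}" for i in range(1000)]
store = CountingStore(uuids)
summaries = build_summaries(store, uuids)
lookups_after_summaries = store.lookups

refetched = requests_with_refetch(store, summaries)
lookups_after_refetch = store.lookups

reused = requests_with_reuse(summaries)
lookups_after_reuse = store.lookups
```

Both variants produce the same providers, but only the re-fetching one
adds another lookup per provider on top of those made while building
the summaries.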

  The third chunk marks a big loop; I suspect there is some duplication
  in it that can be avoided. I have not investigated that closely
  (yet).

  -=-=-
  diff --git a/nova/api/openstack/placement/objects/resource_provider.py b/nova/api/openstack/placement/objects/resource_provider.py
  index 851f9719e4..e6c894b8fe 100644
  --- a/nova/api/openstack/placement/objects/resource_provider.py
  +++ b/nova/api/openstack/placement/objects/resource_provider.py
  @@ -3233,6 +3233,8 @@ def _build_provider_summaries(context, usages, prov_traits):
           if not summary:
               summary = ProviderSummary(
                   context,
  +                # This is _expensive_ when there are a large number of rps.
  +                # Building the objects differently may be better.
                   resource_provider=ResourceProvider.get_by_uuid(context,
                                                                  uuid=rp_uuid),
                   resources=[],
  @@ -3519,8 +3521,7 @@ def _alloc_candidates_multiple_providers(ctx, requested_resources,
           rp_uuid = rp_summary.resource_provider.uuid
           tree_dict[root_id][rc_id].append(
               AllocationRequestResource(
  -                ctx, resource_provider=ResourceProvider.get_by_uuid(ctx,
  -                                                                    rp_uuid),
  +                ctx, resource_provider=rp_summary.resource_provider,
                   resource_class=_RC_CACHE.string_from_id(rc_id),
                   amount=requested_resources[rc_id]))
   
  @@ -3535,6 +3536,8 @@ def _alloc_candidates_multiple_providers(ctx, requested_resources,
       alloc_prov_ids = []
   
       # Let's look into each tree
  +    # With many resource providers this takes a long time, but each trip
  +    # through the loop is not too bad.
       for root_id, alloc_dict in tree_dict.items():
           # Get request_groups, which is a list of lists of
           # AllocationRequestResource(ARR) per requested resource class(rc).
  -=-=-


  
  [1] https://github.com/cdent/placeload/blob/master/placeload/__init__.py#L23

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1786055/+subscriptions

