yahoo-eng-team team mailing list archive

[Bug 1786519] Re: debugging why NoValidHost with placement challenging

 

Reviewed:  https://review.openstack.org/590041
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b5ab9f5acec172d16e46876f60ca338434483905
Submitter: Zuul
Branch:    master

commit b5ab9f5acec172d16e46876f60ca338434483905
Author: Jay Pipes <jaypipes@xxxxxxxxx>
Date:   Wed Aug 8 17:11:25 2018 -0400

    [placement] split gigantor SQL query, add logging
    
    This patch modifies the code paths for the non-granular request group
    allocation candidates processing. It removes the giant multi-join SQL
    query and replaces it with multiple calls to
    _get_providers_with_resource(), logging the number of matched providers
    for each resource class requested and filter (on required traits,
    forbidden traits and aggregate membership).
    
    Here are some examples of the debug output:
    
    - A request for three resources with no aggregate or trait filters:
    
     found 7 providers with available 5 VCPU
     found 9 providers with available 1024 MEMORY_MB
     found 5 providers after filtering by previous result
     found 8 providers with available 1500 DISK_GB
     found 2 providers after filtering by previous result
    
    - The same request, but with a required trait that nobody has, shorts
      out quickly:
    
     found 0 providers after applying required traits filter (['HW_CPU_X86_AVX2'])
    
    - A request for one resource with aggregates and forbidden (but no
      required) traits:
    
     found 2 providers after applying aggregates filter ([['3ed8fb2f-4793-46ee-a55b-fdf42cb392ca']])
     found 1 providers after applying forbidden traits filter ([u'CUSTOM_TWO', u'CUSTOM_THREE'])
     found 3 providers with available 4 VCPU
     found 1 providers after applying initial aggregate and trait filters
    
    Co-authored-by: Eric Fried <efried@xxxxxxxxxx>
    Closes-Bug: #1786519
    Change-Id: If9ddb8a6d2f03392f3cc11136c4a0b026212b95b


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1786519

Title:
  debugging why NoValidHost with placement challenging

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  With the advent of placement, the FilterScheduler no longer provides
  granular information about which class of resource (disk, VCPU, RAM)
  is not available in sufficient quantities to allow a host to be found.

  This is because placement is now making those choices and does not
  (yet) break down the results of its queries into easy to understand
  chunks. If it returns zero results all you know is "we didn't have
  enough resources". Nothing about which resources.

  This can be fixed by changing the way in which queries are made so that
  there are a series of queries. After each one a report of how many
  results are left can be made.
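
  A minimal sketch of that idea, not nova's actual implementation: each
  requested resource class is queried separately, the result is
  intersected with the previous one, and the number of surviving
  providers is reported after every step. The INVENTORY data and the
  names providers_with_resource() and find_candidates() are hypothetical
  and exist only for illustration.

```python
import logging

LOG = logging.getLogger(__name__)

# Hypothetical inventory: provider name -> available amount per resource class.
INVENTORY = {
    "host1": {"VCPU": 8, "MEMORY_MB": 2048, "DISK_GB": 100},
    "host2": {"VCPU": 2, "MEMORY_MB": 4096, "DISK_GB": 2000},
    "host3": {"VCPU": 16, "MEMORY_MB": 512, "DISK_GB": 2000},
}


def providers_with_resource(inventory, resource_class, amount):
    """Return the set of providers with at least `amount` of one resource."""
    return {
        name
        for name, available in inventory.items()
        if available.get(resource_class, 0) >= amount
    }


def find_candidates(inventory, requested):
    """Run one query per resource class and report counts after each step.

    Returns (candidates, report), where report is a list of messages that
    shows which resource class narrowed the result down.
    """
    candidates = None
    report = []
    for resource_class, amount in requested.items():
        matched = providers_with_resource(inventory, resource_class, amount)
        report.append(
            "found %d providers with available %d %s"
            % (len(matched), amount, resource_class)
        )
        if candidates is None:
            candidates = matched
        else:
            candidates &= matched
            report.append(
                "found %d providers after filtering by previous result"
                % len(candidates)
            )
        if not candidates:
            # Nothing left, so later queries cannot change the outcome.
            break
    for message in report:
        LOG.debug(message)
    return (candidates if candidates is not None else set()), report
```

  With this layout, an empty result comes with a report whose last
  lines name the resource class that eliminated the remaining
  providers.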

  While this is relatively straightforward to do for the (currently-)common
  simple non-nested and non-sharing providers situation, it will be more
  difficult for the non-simple cases. Therefore, it makes sense to have
  different code paths for simple and non-simple allocation candidate
  queries. This will also result in performance gains for the common
  case.

  See this email thread for additional discussion and reports of
  problems in the wild:
  http://lists.openstack.org/pipermail/openstack-dev/2018-August/132735.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1786519/+subscriptions

