
yahoo-eng-team team mailing list archive

[Bug 2011127] Re: Nova scheduler stacks allocations in heterogeneous environments

 

** Changed in: nova
       Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2011127

Title:
  Nova scheduler stacks allocations in heterogeneous environments

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Our OpenStack clouds consist of different hypervisor hardware
  configurations, all of which are members of the same cell.

  What we have observed is that many of the weighers in Nova
  encourage "stacking" of allocations instead of "spreading". That is
  to say, the weighers keep assigning greater weights to the
  hypervisors with more resources until those hypervisors are
  objectively over-provisioned compared to the hypervisors with fewer
  resources.

  Suppose, for example, that some of these hypervisors have 1/4th the
  amount of RAM and physical CPU cores compared to others. What we
  observe is that, assuming all hypervisors start empty, the
  hypervisors with 1/4th the amount of RAM will not have a *single*
  instance assigned to them even when the others already have 1/2 or
  more of their resources allocated.

  We dug into why, and landed upon this commit from 2013 which normalized the weights:
  https://github.com/openstack/nova/commit/e5ba8494374a1b049eae257fe05b10c5804049ae

  The normalization on the surface seems correct:
  "weight = w1_multiplier * norm(w1) + w2_multiplier * norm(w2) + ..."

  However, the values computed for w1 by the CPUWeigher, RAMWeigher,
  etc. are objectively *not* correct anymore. The commit mentions that
  all weighers should fall under two cases:

     Case 1: Use of a percentage instead of absolute values (for example, % of free RAM).
     Case 2: Use of absolute values.
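
  For reference, here is a rough sketch of that combination step. It is
  illustrative only, not the actual nova.weights code; it models the
  behaviour described in this report, where each weigher's raw value is
  scaled against the current maximum across the candidate hosts:

    def normalize(values):
        # Scale each raw weigher value against the largest value among
        # the candidate hosts (the behaviour described in this report).
        maximum = max(values)
        return [v / maximum if maximum else 0.0 for v in values]

    def combined_weights(raw_values_per_weigher, multipliers):
        # raw_values_per_weigher: one list of per-host raw values per
        # weigher; multipliers: one multiplier per weigher (for example
        # ram_weight_multiplier, cpu_weight_multiplier).
        totals = [0.0] * len(raw_values_per_weigher[0])
        for raw, mult in zip(raw_values_per_weigher, multipliers):
            for i, norm_value in enumerate(normalize(raw)):
                totals[i] += mult * norm_value
        return totals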

  However, if we look at the current implementation, case 2 has hidden
  implications for some weighers. In the context of the RAMWeigher, for
  example, this is because the normalization occurs with respect to the
  hypervisor which has the most free RAM at the point in time of
  scheduling -- this is not % free RAM per hypervisor, so:

  Suppose we take a fictitious example of two hypervisors, one
  ("HypA") with 2 units of RAM and one ("HypB") with 10 units of RAM,
  and assume that VMs of 0.25 units of RAM each are allocated:

  Upon the first allocation, we compute these weights:
  HypA: 2 units of free RAM, normalized weight = 0.2 (2/10)
  HypB: 10 units of free RAM, normalized weight = 1.0 (10/10)

  And the second:
  HypA: 2 units of free RAM, normalized weight = 0.20512820512820512 (2/9.75)
  HypB: 9.75 units of free RAM, normalized weight = 1.0 (9.75/9.75)

  And the third:
  HypA: 2 units of free RAM, normalized weight = 0.21052631578947367 (2/9.5)
  HypB: 9.5 units of free RAM, normalized weight = 1.0 (9.5/9.5)

  etc...

  Thus the RAMWeigher continues stacking instances on HypB until HypB
  has 2 units of free RAM remaining, at which point it holds 32
  instances of 0.25 units of RAM each. After this point, it begins
  spreading across both hypervisors in lockstep fashion. But up until
  this point, it stacks.
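
  A small simulation of the fictitious example above reproduces this.
  It is illustrative only; it models the divide-by-maximum behaviour
  described in this report rather than the exact Nova code, and it
  breaks ties in HypA's favour just to make the lockstep phase visible:

    def schedule(num_vms=40, vm_ram=0.25):
        free = {"HypA": 2.0, "HypB": 10.0}
        placed = {"HypA": 0, "HypB": 0}
        for _ in range(num_vms):
            most_free = max(free.values())
            weights = {h: f / most_free for h, f in free.items()}
            # Place on the host with the highest normalized weight.
            target = max(sorted(free), key=lambda h: weights[h])
            free[target] -= vm_ram
            placed[target] += 1
        return placed

    print(schedule())
    # -> {'HypA': 4, 'HypB': 36}: HypB takes the first 32 instances
    #    before HypA gets a single one.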

  This same problem occurs with the CPUWeigher, but it is even more
  pernicious in that case because the CPUWeigher acts on vCPUs with
  respect to the operator-supplied CPU allocation ratio.

  For example: let's suppose an operator configures Nova with
  cpu_allocation_ratio = 3.0. In this case, a hypervisor with 2x as
  many cores as another will have its cores over-provisioned (that is,
  more than 1 vCPU allocated per physical CPU core) before the other
  hypervisor gets a single instance!

  This is because the value fed into the normalization is the number
  of free vCPUs, where total vCPUs = # physical CPU cores *
  cpu_allocation_ratio. In this way, stacking occurs on the hypervisor
  with twice the CPU cores up until its physical CPU cores are over-
  provisioned at 1.5 vCPUs per physical CPU core.
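
  To make the arithmetic concrete (hypothetical core counts, not taken
  from a real deployment): with cpu_allocation_ratio = 3.0, a 16-core
  host exposes 48 schedulable vCPUs and a 32-core host exposes 96.
  Stacking on the larger host continues until its free vCPU count drops
  to 48, i.e. until 48 vCPUs are allocated on its 32 physical cores:

    cpu_allocation_ratio = 3.0
    small_pcpus, large_pcpus = 16, 32   # hypothetical physical core counts

    small_vcpus = small_pcpus * cpu_allocation_ratio   # 48 schedulable vCPUs
    large_vcpus = large_pcpus * cpu_allocation_ratio   # 96 schedulable vCPUs

    # The larger host keeps winning until its free vCPUs drop to the
    # smaller host's free vCPUs, i.e. until it has allocated:
    allocated_before_spreading = large_vcpus - small_vcpus   # 48.0 vCPUs
    overcommit = allocated_before_spreading / large_pcpus    # 1.5 vCPUs per pCPU core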

  The documentation refers to "even spreading" but never defines it;
  whatever the intent, this behaviour certainly does not seem correct.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2011127/+subscriptions


