
yahoo-eng-team team mailing list archive

[Bug 1736224] [NEW] ram_allocation_ratio > 1 causes RAMFilter to incorrectly decide on ability to spawn instance

 

Public bug reported:

The problem is inside this function -
https://github.com/openstack/nova/blob/master/nova/scheduler/filters/ram_filter.py#L33

Probably related to https://bugs.launchpad.net/nova/+bug/1635367

The problem is that RAMFilter calculations do not take VM RAM
subscription into account. This causes the scheduler to try spawning VMs
on hosts which are fully oversubscribed but still have some free
physical RAM - this is possible due to KSM, for example.

Consider this scenario:

ram_allocation_ratio = 1.5
Some compute host has 10GB physical RAM and 15 1GB VMs already spawned on it.
At the same time, there is still 2GB free physical RAM on the host, as seen in "free -m" and in nova hypervisor-show.

A new VM is scheduled and RAMFilter is executed:

requested_ram = spec_obj.memory_mb = 1GB
free_ram_mb = host_state.free_ram_mb = 2GB # this is actual free RAM on a host, which does not properly reflect VM subscription
total_usable_ram_mb = host_state.total_usable_ram_mb = 10GB # host has 10GB RAM total

Then the main check which is performed is:

memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio = 15GB
used_ram_mb = total_usable_ram_mb - free_ram_mb = 10 - 2 = 8GB
usable_ram = memory_mb_limit - used_ram_mb = 15 - 8 = 7GB # incorrect assumption that host has 7GB usable RAM left
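
The arithmetic above can be reproduced with a short Python sketch. It only mirrors the calculation as described in this report; the function name is hypothetical and this is not nova's actual code.

```python
GB = 1024  # MB per GB, since the values in this report are tracked in MB


def ramfilter_usable_ram_mb(total_usable_ram_mb, free_ram_mb, ram_allocation_ratio):
    """Usable RAM as computed by the check described above (illustrative only)."""
    memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio
    # "Used" RAM is derived from physical free RAM, not from VM subscription.
    used_ram_mb = total_usable_ram_mb - free_ram_mb
    return memory_mb_limit - used_ram_mb


# Scenario from this report: 10GB host, 2GB physically free, ratio 1.5.
usable = ramfilter_usable_ram_mb(10 * GB, 2 * GB, 1.5)
passes = usable >= 1 * GB  # a 1GB request is accepted
```

With these inputs the host appears to have 7GB usable, so the 1GB request passes even though the 15 existing 1GB VMs already account for the whole 15GB limit.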

Unless I have some incorrect understanding, the logic here is broken.
At first I tried to put together a quick fix, but then realized that the VM subscription RAM value (the sum of RAM of all VMs scheduled on a host) is not present in this code, so a proper calculation cannot be done. It may be available inside the host_state object; I have not checked yet.
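
For comparison, a minimal sketch of a subscription-based check, assuming the per-VM RAM sizes for a host were available. The function and its instance_ram_mb parameter are hypothetical; as noted, it is not confirmed whether host_state exposes this value.

```python
GB = 1024  # MB per GB


def subscription_usable_ram_mb(total_usable_ram_mb, instance_ram_mb, ram_allocation_ratio):
    """Remaining RAM when counting what VMs were allocated, not what is physically free.

    instance_ram_mb is a hypothetical list of the RAM sizes (in MB) of all
    VMs already scheduled on the host.
    """
    memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio
    subscribed_ram_mb = sum(instance_ram_mb)
    return memory_mb_limit - subscribed_ram_mb


# Scenario from this report: 10GB host, fifteen 1GB VMs, ratio 1.5.
remaining = subscription_usable_ram_mb(10 * GB, [1 * GB] * 15, 1.5)
host_passes = remaining >= 1 * GB  # a 1GB request is rejected
```

Under this assumption the host has no remaining capacity, so the new 1GB VM would be filtered out regardless of how much physical RAM KSM has freed.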

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: sche

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1736224

Title:
  ram_allocation_ratio > 1 causes RAMFilter to incorrectly decide on
  ability to spawn instance

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1736224/+subscriptions

