yahoo-eng-team team mailing list archive

[Bug 1899139] [NEW] Live migrations don't properly handle disk overcommitment

 

Public bug reported:

When live-migrating libvirt instances that use image files with disk
overcommit, the destination host doesn't properly check the available
disk space, which leads to migration failures.

Trace: http://paste.openstack.org/raw/798895/

The check seems to rely on resource tracker information that is not
aware of disk overcommitment, so we end up with negative values. The
"local_gb_used" value reflects the total allocated space, not the
disk space actually used.

https://github.com/openstack/nova/blob/20.4.0/nova/compute/resource_tracker.py#L1254
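
To illustrate how the negative values can come about, here is a minimal
sketch of the arithmetic. It is not the resource tracker code: the
helper name and the host numbers are hypothetical, and it only assumes
that free space is computed as total minus "local_gb_used".

```python
def free_disk_gb(local_gb, local_gb_used):
    """Free space as seen by a check that subtracts used from total."""
    return local_gb - local_gb_used


# Hypothetical host: 100 GB of physical disk holding three sparse
# 50 GB image files, each with only 10 GB actually written.
local_gb = 100
allocated_gb = 3 * 50    # sum of virtual (allocated) sizes
actual_used_gb = 3 * 10  # space really consumed on disk

# Counting allocated sizes makes the host look overfull.
free_by_allocation = free_disk_gb(local_gb, allocated_gb)

# Counting the space actually used leaves room for another instance.
free_by_usage = free_disk_gb(local_gb, actual_used_gb)

print(free_by_allocation, free_by_usage)
```

With these made-up numbers the allocation-based figure is negative even
though the disk is mostly empty, which matches the symptom described
above.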

The same incorrect values will be reported by "openstack hypervisor show":
http://paste.openstack.org/raw/798898/

Additionally, the "disk_over_commit" boolean flag is incorrectly
checked. The driver checks if the field exists as part of the
"dest_check_data" dict but doesn't actually check its value.

https://github.com/openstack/nova/blob/20.4.0/nova/virt/libvirt/driver.py#L8224
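
The difference between the two kinds of check can be sketched as
follows. This is only an illustration of the pattern, not the driver
code: a plain dict stands in for "dest_check_data" and both function
names are hypothetical.

```python
def overcommit_by_presence(dest_check_data):
    # Pattern described in this report: true whenever the key is
    # present, even if the caller explicitly passed False.
    return "disk_over_commit" in dest_check_data


def overcommit_by_value(dest_check_data):
    # Value-based check: a missing key or a False value both mean
    # that overcommit was not requested.
    return bool(dest_check_data.get("disk_over_commit", False))


explicit_false = {"disk_over_commit": False}
not_sent = {}

print(overcommit_by_presence(explicit_false))  # key exists, value ignored
print(overcommit_by_value(explicit_false))
print(overcommit_by_presence(not_sent))
print(overcommit_by_value(not_sent))
```

In this sketch an explicit False is treated as True by the presence
check, while a request that omits the field never triggers it, which is
consistent with the behaviour of recent API versions described below.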

The "disk_over_commit" parameter is deprecated. Recent Nova API versions
do not use it, which bypasses the disk allocation check on the libvirt
driver side. This might be used as a workaround (e.g. using nova client
instead of the openstack client or horizon), but this is not ideal.

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: libvirt live-migration resource-tracker

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1899139

Title:
  Live migrations don't properly handle disk overcommitment

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1899139/+subscriptions

