yahoo-eng-team team mailing list archive

[Bug 1644248] Re: Nova incorrectly tracks live migration progress

 

** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Tags added: ocata-rc-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1644248

Title:
  Nova incorrectly tracks live migration progress

Status in OpenStack Compute (nova):
  In Progress
Status in OpenStack Compute (nova) ocata series:
  New

Bug description:
  While monitoring live migration progress, nova relies on the value
  that libvirt reports in the data_remaining property:

  https://github.com/openstack/nova/blob/54482fde22742bc852414c58552fe64ea59d61d5/nova/virt/libvirt/driver.py#L6189-L6193

  However, data_remaining does not give nova any information it can use
  to track live migration progress. It only indicates how much data
  still has to be transferred in the current iteration before that
  iteration ends and a check is made whether the VM can be switched to
  the destination, nothing more.

  As an example, let's assume we have a VM with 4 GB of memory. In the
  very first iteration libvirt will report that there is still 4 GB of
  data to be transferred. During the first iteration this number will go
  down to 0 bytes (or almost 0), and this will end the first iteration.
  Let's say that during the first iteration the VM has dirtied 3 GB of
  memory. At the beginning of the subsequent iteration QEMU will
  calculate number of dirty pages * page size, and libvirt will report
  3 GB of data to be transferred in the second iteration. During the
  second iteration data_remaining will again go down to zero.
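
  The example above can be sketched as a small simulation. This is only
  an illustration and does not call libvirt or QEMU: the function name
  simulate_data_remaining, the linear drain within an iteration, and the
  number of samples per iteration are all assumptions made for the
  sketch, and 1 GB is treated as 2**30 bytes.

```python
GIB = 1024 ** 3


def simulate_data_remaining(iteration_sizes, samples_per_iteration=4):
    """Yield (iteration, data_remaining) pairs for a toy migration.

    Hypothetical model: within each iteration the remaining data drains
    linearly from the iteration's starting size down to zero.
    """
    for number, size in enumerate(iteration_sizes, start=1):
        for step in range(samples_per_iteration + 1):
            left = samples_per_iteration - step
            yield number, size * left // samples_per_iteration


# Iteration 1 starts at 4 GB (all memory), iteration 2 at 3 GB (dirtied pages).
samples = list(simulate_data_remaining([4 * GIB, 3 * GIB]))

for number, remaining in samples:
    print(f"iteration {number}: {remaining / GIB:.2f} GB remaining")
```

  The value drops to zero at the end of the first iteration and then
  jumps back up to 3 GB, so a single reading says nothing about how far
  the migration as a whole has advanced.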

  Given that nova takes a snapshot of this information once every 0.5
  seconds, and that the data remaining reported by libvirt reflects only
  the data remaining in a particular iteration, we can't say whether the
  LM is progressing or not. Therefore the live migration progress
  timeout does not make sense: nova can take a snapshot from libvirt in
  the first iteration that says there is only 150 MB left to be
  transferred to the destination, and very likely no snapshot taken in
  any subsequent iteration will show a smaller amount of data to be
  transferred, so nova will think that the LM is not progressing.
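
  A minimal sketch of this failure mode, assuming the progress check
  amounts to "record progress only when data_remaining falls below the
  lowest value seen so far". This is a simplification for illustration,
  not nova's actual code; the function last_progress_time and the sample
  values are hypothetical.

```python
MIB = 1024 ** 2


def last_progress_time(samples):
    """Return the timestamp of the last sample that set a new low.

    samples is an iterable of (timestamp_seconds, data_remaining_bytes).
    """
    watermark = None
    progress_time = None
    for timestamp, data_remaining in samples:
        if watermark is None or data_remaining < watermark:
            watermark = data_remaining
            progress_time = timestamp
    return progress_time


# Made-up snapshots taken every 0.5 s. Each iteration starts smaller than
# the previous one, so the migration is converging.
samples = [
    (0.0, 4096 * MIB),  # iteration 1 starts
    (0.5, 150 * MIB),   # snapshot late in iteration 1
    (1.0, 3072 * MIB),  # iteration 2 starts
    (1.5, 900 * MIB),
    (2.0, 2048 * MIB),  # iteration 3 starts
    (2.5, 400 * MIB),
    (3.0, 1024 * MIB),  # iteration 4 starts
    (3.5, 200 * MIB),
]

stalled_since = last_progress_time(samples)
print(f"last recorded progress at t={stalled_since}s")
```

  Although every iteration starts with less data than the one before,
  no later snapshot goes below the 150 MB reading, so the recorded
  progress time never moves past t=0.5s and a progress timeout would
  eventually fire.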

  This affects all releases starting from Liberty.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1644248/+subscriptions

