yahoo-eng-team team mailing list archive
Message #61395
[Bug 1644248] Re: Nova incorrectly tracks live migration progress
** Also affects: nova/ocata
Importance: Undecided
Status: New
** Tags added: ocata-rc-potential
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1644248
Title:
Nova incorrectly tracks live migration progress
Status in OpenStack Compute (nova):
In Progress
Status in OpenStack Compute (nova) ocata series:
New
Bug description:
Nova monitors live migration progress based on the value that libvirt
reports in the data_remaining property:
https://github.com/openstack/nova/blob/54482fde22742bc852414c58552fe64ea59d61d5/nova/virt/libvirt/driver.py#L6189-L6193
However, data_remaining does not carry any information that Nova can
use to track live migration progress. It only indicates how much data
still has to be transferred in the current iteration before that
iteration finishes and the VM can be checked for switchover to the
destination, nothing more.
As an example, assume we have a VM with 4 GB of memory. In the very
first iteration libvirt will report that there is still 4 GB of data
to be transferred. During the first iteration this number will go down
to 0 bytes (or almost 0), and this will end the first iteration. Say
that during the first iteration the VM has dirtied 3 GB of memory. At
the beginning of the subsequent iteration QEMU will calculate
number of dirty pages * page size, and libvirt will report 3 GB of
data to be transferred in the second iteration. During the second
iteration data_remaining will again go down to zero by the end of
that iteration.
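The sawtooth pattern described above can be sketched with a small simulation. This is only an illustration under stated assumptions: the function name simulate_data_remaining, the linear drain within an iteration, and the number of samples per iteration are all hypothetical and are not taken from Nova, libvirt, or QEMU.

```python
GIB = 1024 ** 3


def simulate_data_remaining(iteration_sizes, samples_per_iteration=4):
    """Return data_remaining readings across successive iterations.

    iteration_sizes: bytes to transfer at the start of each iteration
    (hypothetical input; the first entry is the VM memory size, later
    entries are the memory dirtied during the previous iteration).
    Each iteration is assumed to drain linearly to zero.
    """
    readings = []
    for size in iteration_sizes:
        for step in range(samples_per_iteration + 1):
            readings.append(size - size * step // samples_per_iteration)
    return readings


# 4 GB VM, 3 GB dirtied during the first iteration.
readings = simulate_data_remaining([4 * GIB, 3 * GIB])
print([r / GIB for r in readings])
# [4.0, 3.0, 2.0, 1.0, 0.0, 3.0, 2.25, 1.5, 0.75, 0.0]
```

The value drops to zero at the end of the first iteration and then jumps back up to 3 GB, so a single reading says nothing about how close the migration is to completing.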
Given that Nova takes a snapshot of this information once every 0.5
seconds and that the data remaining reported by libvirt reflects only
the data remaining in a particular iteration, we can't say whether the
live migration (LM) is progressing or not. Therefore the live
migration progress timeout does not make sense: Nova can take a
snapshot from libvirt in the first iteration saying that only 150 MB
is left to be transferred to the destination, and very likely in every
subsequent iteration Nova will not take a snapshot with a smaller
amount of data to be transferred, so it will think that LM is not
progressing.
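The failure mode above can be illustrated with a simplified model of a low-watermark progress check. This is a sketch, not Nova's actual code: the function last_progress_time and the sample values are hypothetical, and the only assumption carried over from the description is that "progress" is recorded when a snapshot shows less data remaining than any earlier snapshot.

```python
MIB = 1024 ** 2


def last_progress_time(samples):
    """Return (time of last recorded progress, lowest data_remaining seen).

    samples: list of (time_seconds, data_remaining_bytes) snapshots.
    Progress is recorded only when a snapshot is below the previous
    low watermark (a simplified, hypothetical model).
    """
    watermark = None
    progress_time = None
    for now, remaining in samples:
        if watermark is None or remaining < watermark:
            watermark = remaining
            progress_time = now
    return progress_time, watermark


# Hypothetical snapshots taken every 0.5 s across three iterations.
samples = [
    (0.0, 4096 * MIB),  # first iteration starts
    (0.5, 150 * MIB),   # snapshot lands late in the first iteration
    (1.0, 3072 * MIB),  # second iteration starts
    (1.5, 1500 * MIB),
    (2.0, 400 * MIB),
    (2.5, 2048 * MIB),  # third iteration starts
    (3.0, 300 * MIB),
]
progress_time, watermark = last_progress_time(samples)
print(progress_time, watermark // MIB)  # 0.5 150
```

In this model the last recorded progress stays at 0.5 s even though later iterations keep transferring data, so a progress timeout measured from that moment would eventually fire regardless of what the migration is actually doing.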
This affects all releases starting from Liberty.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1644248/+subscriptions