yahoo-eng-team team mailing list archive
Message #69598
[Bug 1732428] Re: Unshelving a VM breaks instance metadata when using qcow2 backed images
Looking at the review comments on
https://review.openstack.org/#/c/72407/, it looks like this was
intentional:
Nikola Dipanov
Feb 10, 2014
Patch Set 2: I would prefer that you didn't merge this
Looking at the code - I am not sure we actually want to do this.
The instance should keep its old image in the db once it has been
unshelved, but it needs the new image because it will download it in the
compute manager when it calls driver.spawn.
At the very least, we should put the actual image back to the instance
once the unshelve is done, (so end of manager call).
Going forward, we might want to change what gets passed to spawn so that
it can decide what to download. Keep in mind that right now we have the
image as a block device in the db (even though we don't use it)
--
I have no idea why we should say the instance is backed by the original
image ref rather than the snapshot image ref from the shelved offloaded
instance, that's totally confusing and wrong IMO.
** Changed in: nova
Assignee: (unassigned) => Matt Riedemann (mriedem)
** Changed in: nova
Status: New => Triaged
** Changed in: nova
Importance: Undecided => Medium
** Also affects: nova/ocata
Importance: Undecided
Status: New
** Also affects: nova/pike
Importance: Undecided
Status: New
** Changed in: nova/ocata
Status: New => Confirmed
** Changed in: nova/pike
Importance: Undecided => Medium
** Changed in: nova/pike
Status: New => Confirmed
** Changed in: nova/ocata
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1732428
Title:
Unshelving a VM breaks instance metadata when using qcow2 backed
images
Status in OpenStack Compute (nova):
Triaged
Status in OpenStack Compute (nova) ocata series:
Confirmed
Status in OpenStack Compute (nova) pike series:
Confirmed
Bug description:
If you unshelve instances on compute nodes that use qcow2 backed
instances, the instance image_ref will point to the original image the
VM was launched from. The base file for
/var/lib/nova/instances/uuid/disk will be the snapshot which was used
for shelving. This causes errors with e.g. resizes and migrations.
Steps to reproduce/what happens:
Have at least 2 compute nodes configured with the standard qcow2 backed images.
1) Launch an instance.
2) Shelve the instance. In the background this should in practice create a flattened snapshot of the VM.
3) Unshelve the instance. The instance will boot on one of the compute
nodes. The /var/lib/nova/instances/uuid/disk should now have the
snapshot as its base file. The instance metadata still claims that the
image_ref is the original image which the VM was launched from, not
the snapshot.
4) Resize/migrate the instance. /var/lib/nova/instances/uuid/disk
should be copied to the other compute node. If you resize to a flavor
with the same size disk, go to 5); if you resize to a flavor with a
larger disk, it probably causes an error here when it tries to grow
the disk.
5a) If the instance was running: When nova tries to start the VM, it
will copy the original base image to the new compute node, not the
snapshot base image. The instance can't boot, since it doesn't find
its actual base file, and it goes to an ERROR state.
5b) If the instance was shut down: You can confirm the resize, but the
VM won't start. The snapshot base file may be removed from the source
machine, causing data loss.
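The mismatch described in step 3 can be checked by hand before attempting
a resize or migration. The following is a minimal sketch, not part of nova:
it takes the JSON printed by "qemu-img info --output=json" for
/var/lib/nova/instances/uuid/disk and compares the backing file name with
the name expected for the instance's image_ref. It assumes that the libvirt
image cache names base files after the SHA-1 hex digest of the image ID,
possibly followed by a "_suffix"; the function names are hypothetical.

```python
import hashlib
import json
import os


def expected_base_name(image_ref):
    # Assumption: the image cache names a base file after the SHA-1
    # hex digest of the Glance image ID.
    return hashlib.sha1(image_ref.encode("utf-8")).hexdigest()


def backing_file_matches(qemu_img_info_json, image_ref):
    """Return True if the disk's backing file corresponds to image_ref."""
    info = json.loads(qemu_img_info_json)
    backing = info.get("backing-filename")
    if backing is None:
        # Disk has no backing file, so there is nothing to compare.
        return True
    name = os.path.basename(backing)
    # Assumption: a cached base file may carry a "_<suffix>" after the
    # digest, so only the part before the first underscore is compared.
    return name.split("_")[0] == expected_base_name(image_ref)
```

On an affected instance, passing the image_ref reported in the instance
metadata would be expected to return False, while passing the ID of the
shelve snapshot would return True.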
What should have happened:
Either the instance image_ref should be updated to the snapshot image, or the snapshot image should be rebased to the original image, or it should force a raw-only image after unshelve, or something else you smart people come up with.
Environment:
RDO Neutron with KVM
rpm -qa | grep nova
openstack-nova-common-14.0.6-1.el7.noarch
python2-novaclient-6.0.1-1.el7.noarch
python-nova-14.0.6-1.el7.noarch
openstack-nova-compute-14.0.6-1.el7.noarch
Also a big thank you to Toni Peltonen and Anton Aksola from nebula.fi
for discovering and debugging this issue.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1732428/+subscriptions