
yahoo-eng-team team mailing list archive

[Bug 1732428] Re: Unshelving a VM breaks instance metadata when using qcow2 backed images

Reviewed:  https://review.opendev.org/696084
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8953a689467f8c3e996086392251de67953a45ba
Submitter: Zuul
Branch:    master

commit 8953a689467f8c3e996086392251de67953a45ba
Author: Alexandre Arents <alexandre.arents@xxxxxxxxxxxx>
Date:   Tue Nov 26 10:26:32 2019 +0000

    Rebase qcow2 images when unshelving an instance
    
    During unshelve, the instance is spawned from the image created by
    shelve, which is deleted just afterwards, while instance.image_ref still
    points to the original instance build image.
    
    In a qcow2 environment this is an issue because the instance backing file
    no longer matches instance.image_ref, and during live-migration/resize the
    target host fetches the image corresponding to instance.image_ref,
    which corrupts the instance.
    
    This change fetches the original image and rebases the instance disk on it.
    This avoids the image_ref mismatch and restores the storage benefit of
    keeping a common image in the cache.
    
    If the original image is no longer available in glance, the backing file is
    merged into the disk (flatten), ensuring instance integrity during the next
    live-migration/resize operation.
    
    Change-Id: I1a33fadf0b7439cf06c06cba2bc06df6cef0945b
    Closes-Bug: #1732428
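
The rebase-or-flatten choice described in the commit message can be sketched as follows. This is an illustration only, not the code from the nova change: the function name plan_disk_rebase and its parameters are hypothetical, and the default backing format of "raw" is an assumption about how the cached base image is stored. It relies only on the documented behaviour of `qemu-img rebase`, where an empty backing file name merges the backing data into the disk.

```python
def plan_disk_rebase(disk_path, base_image_path=None, backing_format="raw"):
    """Build a qemu-img command line for the instance disk (hypothetical helper).

    If the original image could be fetched, rebase the disk onto it.
    Otherwise rebase onto an empty backing file name, which makes qemu-img
    copy the backing data into the disk so that it stands alone (flatten).
    """
    if base_image_path:
        # Original image is available: point the disk at it again.
        return ["qemu-img", "rebase", "-b", base_image_path,
                "-F", backing_format, disk_path]
    # Original image is gone: merge the backing file into the disk.
    return ["qemu-img", "rebase", "-b", "", disk_path]
```

The returned list could be passed to subprocess.run(); the paths used for the disk and the cached base image depend on the deployment.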


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1732428

Title:
  Unshelving a VM breaks instance metadata when using qcow2 backed
  images

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  Confirmed

Bug description:
  If you unshelve instances on compute nodes that use qcow2 backed
  instances, the instance image_ref still points to the original image
  the VM was launched from, while the base file for
  /var/lib/nova/instances/uuid/disk is the snapshot which was used
  for shelving. This causes errors with e.g. resizes and migrations.

  Steps to reproduce/what happens:
  Have at least 2 compute nodes configured with the standard qcow2 backed images.

  1) Launch an instance.
  2) Shelve the instance. In the background this should in practice create a flattened snapshot of the VM.

  3) Unshelve the instance. The instance will boot on one of the compute
  nodes. The /var/lib/nova/instances/uuid/disk should now have the
  snapshot as its base file. The instance metadata still claims that the
  image_ref is the original image which the VM was launched from, not
  the snapshot.

  4) Resize/migrate the instance. /var/lib/nova/instances/uuid/disk
  should be copied to the other compute node. If you resize to a flavor
  with the same size disk, go to 5). If you resize to a flavor with a
  larger disk, it probably causes an error here when it tries to grow
  the disk.

  5a) If the instance was running: When nova tries to start the VM, it
  will copy the original base image to the new compute node, not the
  snapshot base image. The instance can't boot, since it doesn't find
  its actual base file, and it goes to an ERROR state.

  5b) If the instance was shut down: You can confirm the resize, but the
  VM won't start. The snapshot base file may be removed from the source
  machine, causing data loss.
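
  The mismatch from steps 3) to 5) can be checked on a compute node with a
  small script. This is a minimal sketch under two assumptions: that the
  cached base file is named after the SHA-1 hex digest of the image ID
  (which is how nova's image cache is understood to name files under _base;
  verify this on your release), and that the input is the output of
  `qemu-img info --output=json` for the instance disk. The function names
  are hypothetical.

```python
import hashlib
import json
import os


def expected_cache_name(image_ref):
    # Assumption: the base file name is the SHA-1 hex digest of the image ID.
    return hashlib.sha1(image_ref.encode("utf-8")).hexdigest()


def backing_file_matches(qemu_img_info_json, image_ref):
    """Return True if the disk's backing file corresponds to image_ref.

    qemu_img_info_json is the text printed by
    `qemu-img info --output=json <disk>`.
    """
    info = json.loads(qemu_img_info_json)
    backing = info.get("backing-filename")
    if backing is None:
        # A flat disk has no backing file, so there is nothing to mismatch.
        return True
    return os.path.basename(backing) == expected_cache_name(image_ref)
```

  On an affected instance this check would be expected to fail for the
  image_ref stored in the instance metadata and to pass for the ID of the
  shelve snapshot.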

  What should have happened:
  Either the instance image_ref should be updated to the snapshot image,
  or the snapshot image should be rebased to the original image, or it
  should force a raw-only image after unshelve, or something else you
  smart people come up with.

  Environment:
  RDO Neutron with KVM

  rpm -qa |grep nova
  openstack-nova-common-14.0.6-1.el7.noarch
  python2-novaclient-6.0.1-1.el7.noarch
  python-nova-14.0.6-1.el7.noarch
  openstack-nova-compute-14.0.6-1.el7.noarch

  Also a big thank you to Toni Peltonen and Anton Aksola from nebula.fi
  for discovering and debugging this issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1732428/+subscriptions

