openstack team mailing list archive

Libvirt Snapshots

 

Hello Everyone,

I've been trying to come up with a solution for libvirt snapshots to fix the issue with snapshotting when a volume is attached:

https://bugs.launchpad.net/nova/+bug/946830

The main issue here is that calling snapshot in libvirt makes an internal snapshot of the entire vm, which a) doesn't work for attached volumes and b) wastes a bunch of space by snapshotting memory and ephemeral disks that aren't used.

There are two potential approaches to solving the issue, and I've prototyped them below. I need feedback on which approach is better.

OPTION A --> snapshot using qemu-img  

This method shuts down the vm (currently with managedSave) and uses qemu-img to create the snapshot in the disk image.

Pros:
works with older versions of libvirt

Cons:
shutting off the vm during snapshotting is overkill and annoying

Caveats:
 If it is safe to create disk file snapshots while libvirt has a file handle open, I can use suspend/resume, which is better than managedSave.
 If it is safe to delete snapshots while the disk is being written to, I can resume sooner, minimizing pause time.
 If it is additionally safe to create snapshots while the disk is being written to, we can avoid pausing the vm altogether! (sounds dangerous though)

https://github.com/vishvananda/nova/blob/fix-libvirt-snapshot-old/nova/virt/libvirt/connection.py#L619
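For anyone who doesn't want to read the branch, here is a rough sketch of the qemu-img sequence Option A relies on. It is not the code in the link above: the function names, paths and snapshot name are made up for illustration, and it assumes a qcow2 root disk. The qemu-img subcommands used (snapshot -c, convert -s, snapshot -d) are the standard ones.

```python
import subprocess


def qemu_img_snapshot_commands(disk_path, snapshot_name, out_path,
                               image_format="qcow2"):
    """Return the qemu-img invocations for Option A, in order.

    Hypothetical helper for illustration; assumes disk_path is qcow2.
    """
    # 1. Create an internal snapshot inside the disk image.
    create = ["qemu-img", "snapshot", "-c", snapshot_name, disk_path]
    # 2. Copy the snapshot contents out to a standalone image.
    extract = ["qemu-img", "convert", "-f", "qcow2", "-O", image_format,
               "-s", snapshot_name, disk_path, out_path]
    # 3. Delete the internal snapshot so it stops using space.
    delete = ["qemu-img", "snapshot", "-d", snapshot_name, disk_path]
    return [create, extract, delete]


def run_snapshot(disk_path, snapshot_name, out_path):
    """Run the commands; the vm must already be stopped or paused as needed."""
    for cmd in qemu_img_snapshot_commands(disk_path, snapshot_name, out_path):
        subprocess.check_call(cmd)
```

Where the vm gets resumed depends on the answers to the questions further down: after step 3 in the most conservative case, after step 1 if deleting under a running vm is safe.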

OPTION B --> libvirt 0.9.5 snapshots

This method uses the newer snapshot xml in libvirt 0.9.5 to snapshot only the root disk.

Pros:
plays nicely with libvirt, so the vm is only paused for the minimum amount of time

Cons:
requires libvirt 0.9.5, which doesn't exist in oneiric

Caveats:
 This code is untested, and a couple of tests don't pass yet because I haven't made an oneiric vm. I want to make sure this is the right approach before I go through the hassle of updating.

https://github.com/vishvananda/nova/blob/fix-libvirt-snapshot/nova/virt/libvirt/connection.py#L619
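As a rough illustration of what "snapshot only the root disk" looks like in the newer snapshot xml, here is a minimal sketch that builds a domainsnapshot document with a per-disk list. It is not the code in the link above: the helper name and the device names (vda, vdb) are hypothetical, and marking the root disk as an internal snapshot is my assumption for the example.

```python
import xml.etree.ElementTree as ET


def disk_only_snapshot_xml(snapshot_name, root_disk, other_disks=()):
    """Build snapshot xml that includes root_disk and skips other disks.

    Hypothetical helper for illustration; disk names are target devices.
    """
    snap = ET.Element("domainsnapshot")
    ET.SubElement(snap, "name").text = snapshot_name
    disks = ET.SubElement(snap, "disks")
    # Assumption: the root disk is snapshotted internally.
    ET.SubElement(disks, "disk", name=root_disk, snapshot="internal")
    # Attached volumes and ephemeral disks are explicitly excluded.
    for dev in other_disks:
        ET.SubElement(disks, "disk", name=dev, snapshot="no")
    return ET.tostring(snap, encoding="unicode")


# The result would be handed to libvirt roughly like this:
#   dom.snapshotCreateXML(xml, libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
xml = disk_only_snapshot_xml("nova-snap", "vda", other_disks=["vdb"])
```

The disk-only flag is what keeps memory out of the snapshot, and the snapshot="no" entries are what keep attached volumes out of it.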

So I could use some specific feedback from kvm/libvirt folks on the following questions:

a) Is it safe to use qemu-img to create/delete a snapshot in a disk file that libvirt is writing to?
if not:
b) Is it safe to use qemu-img to delete a snapshot in a disk file that libvirt is writing to?
if not:
c) Is it safe to use qemu-img to create/delete a snapshot in a disk file that libvirt has an open file handle to?

And I could use input from the community on which of the approaches above to use:

Do we standardize on libvirt 0.9.5+? Or do we use the compatible version that causes a bigger outage during the snapshot?

Ideal for me would be that at least b) above is true and we can get by with the compatible version.

Vish