yahoo-eng-team team mailing list archive

[Bug 1656242] [NEW] nova live snapshot of rbd instance fails on xen hypervisor

 

Public bug reported:

Description:
We run a Mitaka environment with one control node and three compute nodes, all on openSUSE Leap 42.1. The compute nodes are Xen hypervisors, and our storage backend is Ceph (for nova, glance and cinder).

When we snapshot a running instance, nova always performs a cold
snapshot. nova-compute reports:

2017-01-12 12:55:51.919 [instance: 14b75237-7619-481f-9636-792b64d1be17] Beginning cold snapshot process
2017-01-12 12:59:27.085 [instance: 14b75237-7619-481f-9636-792b64d1be17] Snapshot image upload complete

At the rbd level, live snapshots work as expected, without any downtime
of the instance; we use them for our backup strategy, for example.

After adding some log statements to
/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py, I found
that nova always passes the hard-coded hypervisor driver "qemu" to
_host.has_min_version(). On our Xen hosts that call always returns
False, so "live_snapshot" is disabled. Passing host.HV_DRIVER_XEN
instead of host.HV_DRIVER_QEMU when virt_type is 'xen' results in a
working live snapshot:

---cut here---
compute1:~ # diff -u /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py.mod
--- /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py        2017-01-13 09:33:23.257525708 +0100
+++ /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py.mod    2017-01-13 09:33:46.349105366 +0100
@@ -1649,9 +1649,14 @@
         #               redundant because LVM supports only cold snapshots.
         #               It is necessary in case this situation changes in the
         #               future.
+        if CONF.libvirt.virt_type == 'xen':
+            hv_driver = host.HV_DRIVER_XEN
+        else:
+            hv_driver = host.HV_DRIVER_QEMU
+
         if (self._host.has_min_version(MIN_LIBVIRT_LIVESNAPSHOT_VERSION,
                                        MIN_QEMU_LIVESNAPSHOT_VERSION,
-                                       host.HV_DRIVER_QEMU)
+                                       hv_driver)
              and source_type not in ('lvm')
              and not CONF.ephemeral_storage_encryption.enabled
              and not CONF.workarounds.disable_libvirt_livesnapshot):
---cut here---

With this change applied, nova-compute reports:

2017-01-12 17:20:22.760 [instance: 14b75237-7619-481f-9636-792b64d1be17] instance snapshotting
2017-01-12 17:20:24.049 [instance: 14b75237-7619-481f-9636-792b64d1be17] Beginning live snapshot process
2017-01-12 17:24:38.997 [instance: 14b75237-7619-481f-9636-792b64d1be17] Snapshot image upload complete
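
To make the reasoning behind the patch easier to follow, here is a minimal sketch of why the check fails with the hard-coded driver. It is a simplified model, not nova's actual implementation: FakeHost, select_hv_driver, live_snapshot_allowed, the string values of the HV_DRIVER_* constants and the two minimum-version tuples are all assumptions made for illustration. The only behaviour taken from the report is that the check returns False on a Xen host when "qemu" is passed and succeeds when the Xen driver is passed.

```python
# Simplified stand-in for the host version check; NOT nova's real code.
HV_DRIVER_QEMU = "QEMU"   # assumed value, for illustration only
HV_DRIVER_XEN = "Xen"     # assumed value, for illustration only

# Hypothetical minimum versions; the real values live in driver.py.
MIN_LIBVIRT_LIVESNAPSHOT_VERSION = (1, 3, 1)
MIN_QEMU_LIVESNAPSHOT_VERSION = (1, 3, 0)


class FakeHost(object):
    """Models a libvirt connection that reports versions and a driver type."""

    def __init__(self, libvirt_version, hv_version, hv_type):
        self.libvirt_version = libvirt_version
        self.hv_version = hv_version
        self.hv_type = hv_type

    def has_min_version(self, lv_ver=None, hv_ver=None, hv_type=None):
        # All supplied conditions must hold, including the driver type.
        if lv_ver is not None and self.libvirt_version < lv_ver:
            return False
        if hv_ver is not None and self.hv_version < hv_ver:
            return False
        if hv_type is not None and self.hv_type != hv_type:
            return False
        return True


def select_hv_driver(virt_type):
    """Mirror of the patch: pick the driver constant from virt_type."""
    if virt_type == "xen":
        return HV_DRIVER_XEN
    return HV_DRIVER_QEMU


def live_snapshot_allowed(host, virt_type, patched):
    """Evaluate only the version part of the live snapshot condition."""
    hv_driver = select_hv_driver(virt_type) if patched else HV_DRIVER_QEMU
    return host.has_min_version(MIN_LIBVIRT_LIVESNAPSHOT_VERSION,
                                MIN_QEMU_LIVESNAPSHOT_VERSION,
                                hv_driver)


# A host resembling ours: libvirt 2.5.0, Xen 4.7.0, Xen driver.
xen_host = FakeHost((2, 5, 0), (4, 7, 0), HV_DRIVER_XEN)
print(live_snapshot_allowed(xen_host, "xen", patched=False))  # False
print(live_snapshot_allowed(xen_host, "xen", patched=True))   # True
```

In this model both version comparisons pass on the Xen host, and only the driver type mismatch makes the unpatched check fail, which matches what the extra log statements showed. Note that the sketch compares the Xen version against a minimum named for QEMU; whether that threshold is the right one for Xen is not something this report establishes.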

The versions we use:

compute1:~ # nova --version
3.3.0

compute1:~ # ceph --version
ceph version 0.94.7-84-g8e6f430 (8e6f430683e4d8293e31fd4eb6cb09be96960cfa)

compute1:~ # libvirtd --version
libvirtd (libvirt) 2.5.0

compute1:~ # qemu-img --version
qemu-img version 2.7.0((SUSE Linux)), Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers

compute1:~ # rpm -qa | grep xen
xen-4.7.0_12-461.1.x86_64

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1656242

Title:
  nova live snapshot of rbd instance fails on xen hypervisor

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1656242/+subscriptions