[Bug 1841160] [NEW] With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes

 

Public bug reported:

Description
===========

If nova-compute is configured with libvirt/images_type = rbd, then
snapshots of instances booted from images with hw_qemu_guest_agent=yes
do not trigger the qemu-ga guest-fsfreeze-freeze and guest-fsfreeze-thaw
commands, and thus are not guaranteed to be consistent. Such instances
also appear to silently ignore os_require_quiesce=yes.
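
To make the affected combination concrete, here is a minimal Python sketch.
It is not part of nova: the function name is_affected() and the sample
values are hypothetical, and it assumes nova.conf can be read as a plain
INI file. It only encodes the condition stated above (images_type = rbd,
guest agent requested, instance not booted from volume).

```python
import configparser

# Sample inputs; both are illustrative assumptions, not real deployments.
NOVA_CONF = """
[libvirt]
images_type = rbd
"""
IMAGE_PROPERTIES = {"hw_qemu_guest_agent": "yes", "os_require_quiesce": "yes"}


def is_affected(conf_text, image_props, boot_from_volume=False):
    """Return True if the setup matches the combination described in this report."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Fall back to a placeholder when the option is not set at all.
    images_type = parser.get("libvirt", "images_type", fallback="default")
    wants_agent = image_props.get("hw_qemu_guest_agent", "no").lower() == "yes"
    return images_type == "rbd" and wants_agent and not boot_from_volume


if __name__ == "__main__":
    print(is_affected(NOVA_CONF, IMAGE_PROPERTIES))
```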

Steps to reproduce
==================

The steps to verify whether or not the FIFREEZE and FITHAW ioctls are
received by a guest are described in:

http://lists.openstack.org/pipermail/openstack-discuss/2019-August/008648.html
https://xahteiwi.eu/resources/hints-and-kinks/ftrace-qemu-ga/

Expected result
===============

When you perform the steps described above on an instance running on a
compute node that does *not* set libvirt/images_type = rbd, the
FIFREEZE and FITHAW events are received as expected when the snapshot is
created. This occurs irrespective of whether the instance uses
boot-from-image or boot-from-volume.

Actual result
=============

When you perform the same steps on an instance running on a
compute node that *does* set libvirt/images_type = rbd, *and* the
instance is set to boot from an image, no qemu-ga events are
received at all during snapshots.

The reason appears to be this direct_snapshot() call:

https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2058

This is defined in
https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/imagebackend.py#L1055
and it uses RBD functionality only. Importantly, it never interacts with
qemu-ga, so the guest filesystem is apparently never frozen.

This problem was apparently introduced in
https://opendev.org/openstack/nova/commit/824c3706a3ea691781f4fcc4453881517a9e1c55.

However, the qemu-guest-agent calls *are* received correctly if the
instance is configured to boot from volume.
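
The suspected control flow can be illustrated with a toy model in Python.
This is a simplified sketch, not nova's actual code: the classes Guest,
FileBackend and RbdBackend and the function snapshot() are hypothetical
stand-ins. It assumes the driver tries a storage-side direct snapshot
first and only falls back to a path that quiesces the guest when the
backend does not implement one.

```python
from dataclasses import dataclass, field


@dataclass
class Guest:
    """Stand-in for a guest running qemu-ga; records the commands it receives."""
    events: list = field(default_factory=list)

    def freeze(self):
        self.events.append("guest-fsfreeze-freeze")

    def thaw(self):
        self.events.append("guest-fsfreeze-thaw")


class FileBackend:
    """Hypothetical backend with no storage-side snapshot path."""
    def direct_snapshot(self):
        raise NotImplementedError()


class RbdBackend:
    """Hypothetical backend that snapshots purely on the storage side."""
    def direct_snapshot(self):
        return "rbd-snapshot"


def snapshot(guest, backend):
    try:
        # Storage-side snapshot: the guest agent is never contacted here.
        return backend.direct_snapshot()
    except NotImplementedError:
        # Fallback path: freeze the guest, snapshot, then always thaw.
        guest.freeze()
        try:
            return "generic-snapshot"
        finally:
            guest.thaw()


if __name__ == "__main__":
    for backend in (FileBackend(), RbdBackend()):
        guest = Guest()
        print(type(backend).__name__, snapshot(guest, backend), guest.events)
```

In this model the RBD-style backend returns before any freeze or thaw
command is recorded, which matches the absence of qemu-ga events
observed above.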

Environment
===========

1. OpenStack release: Rocky (but this issue is present in current master).
2. Hypervisor: libvirt/KVM
3. Storage type: Ceph RBD
4. Networking: Neutron/ML2/OVS

Additional information
======================

A detailed discussion of the issue is available at:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/3YQCRO4JP56EDJN5KX5DWW5N2CSBHRHZ/?sort=date

** Affects: nova
     Importance: Undecided
         Status: New

** Summary changed:

- With libvirt/images_type = rbd, instances ignore hw_qemu_guest_agent=yes 
+ With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1841160

Title:
  With libvirt/images_type = rbd, ephemeral instances silently ignore
  hw_qemu_guest_agent=yes

Status in OpenStack Compute (nova):
  New

