yahoo-eng-team team mailing list archive
Message #79716
[Bug 1841160] [NEW] With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes
Public bug reported:
Description
===========
If nova-compute is configured with libvirt/images_type = rbd, then
snapshots of instances booted from images with hw_qemu_guest_agent=yes
do not trigger the qemu-ga guest-fsfreeze-freeze and guest-fsfreeze-thaw
commands, so the snapshots are not guaranteed to be consistent. Such
instances also appear to silently ignore os_require_quiesce=yes.
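For orientation, the two guest agent commands named above are plain JSON requests in the QEMU guest agent protocol. The following is a minimal sketch of the request pair that a quiesced snapshot would be expected to send; the helper names agent_command() and quiesce_sequence() are hypothetical and not part of nova or libvirt.

```python
import json

# Command names as defined by the QEMU guest agent protocol.
FREEZE = "guest-fsfreeze-freeze"
THAW = "guest-fsfreeze-thaw"


def agent_command(name):
    """Serialize an argument-less guest agent request (hypothetical helper)."""
    return json.dumps({"execute": name})


def quiesce_sequence():
    """Return the freeze/thaw request pair that should bracket a snapshot."""
    return [agent_command(FREEZE), agent_command(THAW)]


if __name__ == "__main__":
    for request in quiesce_sequence():
        print(request)
```

On a compute node, a request of this form can be sent to a running guest with `virsh qemu-agent-command`, which is one way to confirm that the agent itself responds independently of nova.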
Steps to reproduce
==================
The steps to verify whether or not the FIFREEZE and FITHAW ioctls are received by a guest are described in:
http://lists.openstack.org/pipermail/openstack-discuss/2019-August/008648.html
https://xahteiwi.eu/resources/hints-and-kinks/ftrace-qemu-ga/
Expected result
===============
When you perform the actions described above on an instance running on a
compute node that does *not* set libvirt/images_type = rbd, the
FIFREEZE and FITHAW events are received as expected when the snapshot is
created. This occurs irrespective of whether the instance uses
boot-from-image or boot-from-volume.
Actual result
=============
When you perform the actions described above on an instance running on a
compute node that *does* set libvirt/images_type = rbd, *and* the
instance is set to boot from an image, no qemu-ga events are
received during snapshots at all.
The reason appears to be this direct_snapshot() call:
https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2058
The method is defined in
https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/imagebackend.py#L1055
and uses RBD functionality only. Importantly, it never interacts with
qemu-ga, so it makes no attempt to freeze the filesystem before the
snapshot is taken.
This problem was apparently introduced in
https://opendev.org/openstack/nova/commit/824c3706a3ea691781f4fcc4453881517a9e1c55.
By contrast, the qemu-guest-agent calls *are* received correctly if the
instance is configured to boot from volume.
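The behaviour described above can be modelled roughly as follows. This is a simplified sketch and not nova's code: the classes, the FakeGuestAgent recorder and the snapshot() function are hypothetical. The only detail taken from this report is that the RBD backend's direct_snapshot() succeeds without touching the guest agent. That other backends decline the direct path and fall through to a quiescing path is an assumption made for illustration.

```python
class FakeGuestAgent:
    """Hypothetical stand-in that records the commands a guest would receive."""

    def __init__(self):
        self.events = []

    def freeze(self):
        self.events.append("guest-fsfreeze-freeze")

    def thaw(self):
        self.events.append("guest-fsfreeze-thaw")


class OtherBackend:
    """Assumed: a non-RBD backend that does not support direct snapshots."""

    def direct_snapshot(self):
        raise NotImplementedError


class RbdBackend:
    """Models the reported behaviour: snapshots using RBD functionality only."""

    def direct_snapshot(self):
        return "rbd-direct-snapshot"


def snapshot(backend, agent):
    """Toy model of the snapshot flow; not the real driver logic."""
    try:
        # The direct path returns early and never quiesces the guest.
        return backend.direct_snapshot()
    except NotImplementedError:
        pass
    # Fallback path: freeze, take the snapshot, then always thaw.
    agent.freeze()
    try:
        return "generic-snapshot"
    finally:
        agent.thaw()


if __name__ == "__main__":
    for backend in (OtherBackend(), RbdBackend()):
        agent = FakeGuestAgent()
        result = snapshot(backend, agent)
        print(type(backend).__name__, result, agent.events)
```

In this model the freeze and thaw calls exist only on the fallback path, so any backend whose direct snapshot succeeds bypasses them. That matches the observation that no qemu-ga events arrive for image-booted instances on RBD.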
Environment
===========
1. OpenStack release: Rocky (the issue is also present in current master).
2. Hypervisor: libvirt/KVM
3. Storage type: Ceph RBD
4. Networking: Neutron/ML2/OVS
Additional information
======================
A detailed discussion of the issue is available at:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/3YQCRO4JP56EDJN5KX5DWW5N2CSBHRHZ/?sort=date
** Affects: nova
Importance: Undecided
Status: New
** Summary changed:
- With libvirt/images_type = rbd, instances ignore hw_qemu_guest_agent=yes
+ With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1841160