[Bug 1930734] Re: Volumes and vNICs are being hot plugged into SEV based instances without iommu='on' causing failures to attach and later detach within the guest OS
Reviewed: https://review.opendev.org/c/openstack/nova/+/794639
Committed: https://opendev.org/openstack/nova/commit/4d8bf15fec15dc3416023e577e0f2c277c216506
Submitter: "Zuul (22348)"
Branch: master
commit 4d8bf15fec15dc3416023e577e0f2c277c216506
Author: Lee Yarwood <lyarwood@xxxxxxxxxx>
Date: Thu Jun 3 16:37:45 2021 +0100
libvirt: Set driver_iommu when attaching virtio devices to SEV instance
As called out in the original spec [1], virtio devices attached to a SEV
enabled instance must have the iommu attribute enabled. This was done
within the original implementation of the spec for all virtio devices
defined when initially spawning the instance, but did not include volumes
and interfaces that are later hot plugged.
This change corrects this for both volumes and nics and in doing so
slightly refactors the original designer code to make it usable in both
cases.
[1] https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/amd-sev-libvirt-support.html#proposed-change
Closes-Bug: #1930734
Change-Id: I11131a3f90b8af85e7151b519fb26d225629c391
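To illustrate the shape of the fix, below is a minimal, self-contained Python sketch of the approach the commit message describes: enabling a driver_iommu flag on the device config during hot plug when the instance is SEV enabled, so the generated device XML carries iommu="on". The class and helper names here are simplified stand-ins for nova's real config objects and designer code, not the merged change itself:

# Illustrative sketch only: SimpleDiskConfig and build_hot_plug_config are
# stand-ins, not nova's real LibvirtConfigGuestDisk or designer helpers.

class SimpleDiskConfig:
    """Minimal stand-in for a libvirt guest disk config object."""

    def __init__(self, source_dev, target_dev):
        self.source_dev = source_dev
        self.target_dev = target_dev
        # Off by default, as in the buggy hot plug path.
        self.driver_iommu = False

    def to_xml(self):
        iommu = ' iommu="on"' if self.driver_iommu else ''
        return (
            '<disk type="block" device="disk">\n'
            f'  <driver name="qemu" type="raw" cache="none" io="native"{iommu}/>\n'
            f'  <source dev="{self.source_dev}"/>\n'
            f'  <target bus="virtio" dev="{self.target_dev}"/>\n'
            '</disk>'
        )


def set_driver_iommu_for_device(dev):
    """Flag a single virtio device for virtualized DMA (iommu='on')."""
    dev.driver_iommu = True


def build_hot_plug_config(source_dev, target_dev, sev_enabled):
    """Build a disk config for hot plug, honouring SEV."""
    cfg = SimpleDiskConfig(source_dev, target_dev)
    if sev_enabled:
        # This is the step the original implementation only performed at
        # initial spawn time, not on later volume/interface attach.
        set_driver_iommu_for_device(cfg)
    return cfg


print(build_hot_plug_config("/dev/sdc", "vdb", sev_enabled=True).to_xml())

Run with sev_enabled=True this prints a driver element carrying iommu="on"; with sev_enabled=False it reproduces the broken XML shown later in the bug description.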
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1930734
Title:
Volumes and vNICs are being hot plugged into SEV based instances
without iommu='on' causing failures to attach and later detach within
the guest OS
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Description
===========
After successfully attaching a disk to a SEV enabled instance, the request to detach the disk never completes, with the following trace regarding the initial attach eventually logged within the guest:
[ 7.773877] pcieport 0000:00:02.5: Slot(0-5): Attention button pressed
[ 7.774743] pcieport 0000:00:02.5: Slot(0-5) Powering on due to button press
[ 7.775714] pcieport 0000:00:02.5: Slot(0-5): Card present
[ 7.776403] pcieport 0000:00:02.5: Slot(0-5): Link Up
[ 7.903183] pci 0000:06:00.0: [1af4:1042] type 00 class 0x010000
[ 7.904095] pci 0000:06:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 7.905024] pci 0000:06:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
[ 7.906977] pcieport 0000:00:02.5: bridge window [io 0x1000-0x0fff] to [bus 06] add_size 1000
[ 7.908069] pcieport 0000:00:02.5: BAR 13: no space for [io size 0x1000]
[ 7.908917] pcieport 0000:00:02.5: BAR 13: failed to assign [io size 0x1000]
[ 7.909832] pcieport 0000:00:02.5: BAR 13: no space for [io size 0x1000]
[ 7.910667] pcieport 0000:00:02.5: BAR 13: failed to assign [io size 0x1000]
[ 7.911586] pci 0000:06:00.0: BAR 4: assigned [mem 0x800600000-0x800603fff 64bit pref]
[ 7.912616] pci 0000:06:00.0: BAR 1: assigned [mem 0x80400000-0x80400fff]
[ 7.913472] pcieport 0000:00:02.5: PCI bridge to [bus 06]
[ 7.915762] pcieport 0000:00:02.5: bridge window [mem 0x80400000-0x805fffff]
[ 7.917525] pcieport 0000:00:02.5: bridge window [mem 0x800600000-0x8007fffff 64bit pref]
[ 7.920252] virtio-pci 0000:06:00.0: enabling device (0000 -> 0002)
[ 7.924487] virtio_blk virtio4: [vdb] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB)
[ 7.926616] vdb: detected capacity change from 0 to 1073741824
[ .. ]
[ 246.751028] INFO: task irq/29-pciehp:173 blocked for more than 120 seconds.
[ 246.752801] Not tainted 4.18.0-305.el8.x86_64 #1
[ 246.753902] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.755457] irq/29-pciehp D 0 173 2 0x80004000
[ 246.756616] Call Trace:
[ 246.757328] __schedule+0x2c4/0x700
[ 246.758185] schedule+0x38/0xa0
[ 246.758966] io_schedule+0x12/0x40
[ 246.759801] do_read_cache_page+0x513/0x770
[ 246.760761] ? blkdev_writepages+0x10/0x10
[ 246.761692] ? file_fdatawait_range+0x20/0x20
[ 246.762659] read_part_sector+0x38/0xda
[ 246.763554] read_lba+0x10f/0x220
[ 246.764367] efi_partition+0x1e4/0x6de
[ 246.765245] ? snprintf+0x49/0x60
[ 246.766046] ? is_gpt_valid.part.5+0x430/0x430
[ 246.766991] blk_add_partitions+0x164/0x3f0
[ 246.767915] ? blk_drop_partitions+0x91/0xc0
[ 246.768863] bdev_disk_changed+0x65/0xd0
[ 246.769748] __blkdev_get+0x3c4/0x510
[ 246.770595] blkdev_get+0xaf/0x180
[ 246.771394] __device_add_disk+0x3de/0x4b0
[ 246.772302] virtblk_probe+0x4ba/0x8a0 [virtio_blk]
[ 246.773313] virtio_dev_probe+0x158/0x1f0
[ 246.774208] really_probe+0x255/0x4a0
[ 246.775046] ? __driver_attach_async_helper+0x90/0x90
[ 246.776091] driver_probe_device+0x49/0xc0
[ 246.776965] bus_for_each_drv+0x79/0xc0
[ 246.777813] __device_attach+0xdc/0x160
[ 246.778669] bus_probe_device+0x9d/0xb0
[ 246.779523] device_add+0x418/0x780
[ 246.780321] register_virtio_device+0x9e/0xe0
[ 246.781254] virtio_pci_probe+0xb3/0x140
[ 246.782124] local_pci_probe+0x41/0x90
[ 246.782937] pci_device_probe+0x105/0x1c0
[ 246.783807] really_probe+0x255/0x4a0
[ 246.784623] ? __driver_attach_async_helper+0x90/0x90
[ 246.785647] driver_probe_device+0x49/0xc0
[ 246.786526] bus_for_each_drv+0x79/0xc0
[ 246.787364] __device_attach+0xdc/0x160
[ 246.788205] pci_bus_add_device+0x4a/0x90
[ 246.789063] pci_bus_add_devices+0x2c/0x70
[ 246.789916] pciehp_configure_device+0x91/0x130
[ 246.790855] pciehp_handle_presence_or_link_change+0x334/0x460
[ 246.791985] pciehp_ist+0x1a2/0x1b0
[ 246.792768] ? irq_finalize_oneshot.part.47+0xf0/0xf0
[ 246.793768] irq_thread_fn+0x1f/0x50
[ 246.794550] irq_thread+0xe7/0x170
[ 246.795299] ? irq_forced_thread_fn+0x70/0x70
[ 246.796190] ? irq_thread_check_affinity+0xe0/0xe0
[ 246.797147] kthread+0x116/0x130
[ 246.797841] ? kthread_flush_work_fn+0x10/0x10
[ 246.798735] ret_from_fork+0x22/0x40
[ 246.799523] INFO: task sfdisk:1129 blocked for more than 120 seconds.
[ 246.800717] Not tainted 4.18.0-305.el8.x86_64 #1
[ 246.801733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.803155] sfdisk D 0 1129 1107 0x00004080
[ 246.804225] Call Trace:
[ 246.804827] __schedule+0x2c4/0x700
[ 246.805590] ? submit_bio+0x3c/0x160
[ 246.806373] schedule+0x38/0xa0
[ 246.807089] schedule_preempt_disabled+0xa/0x10
[ 246.807990] __mutex_lock.isra.6+0x2d0/0x4a0
[ 246.808876] ? wake_up_q+0x80/0x80
[ 246.809636] ? fdatawait_one_bdev+0x20/0x20
[ 246.810508] iterate_bdevs+0x98/0x142
[ 246.811304] ksys_sync+0x6e/0xb0
[ 246.812041] __ia32_sys_sync+0xa/0x10
[ 246.812820] do_syscall_64+0x5b/0x1a0
[ 246.813613] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 246.814652] RIP: 0033:0x7fa9c04924fb
[ 246.815431] Code: Unable to access opcode bytes at RIP 0x7fa9c04924d1.
[ 246.816655] RSP: 002b:00007fff47661478 EFLAGS: 00000246 ORIG_RAX: 00000000000000a2
[ 246.818047] RAX: ffffffffffffffda RBX: 000055d79fc512f0 RCX: 00007fa9c04924fb
[ 246.824526] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055d79fc512f0
[ 246.825714] RBP: 0000000000000000 R08: 000055d79fc51012 R09: 0000000000000006
[ 246.826941] R10: 000000000000000a R11: 0000000000000246 R12: 00007fa9c075e6e0
[ 246.828169] R13: 000055d79fc58c80 R14: 0000000000000001 R15: 00007fff47661590
This is caused by the device XML supplied to libvirt missing the
driver iommu attribute; a diagnostic sketch for checking existing
guests follows the quoted spec text below:
<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>
As called out in the original SEV spec, this is required:
https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/amd-sev-libvirt-support.html
> The iommu attribute is on for all virtio devices.
> Despite the name, this does not require the guest
> or host to have an IOMMU device, but merely enables
> the virtio flag which indicates that virtualized DMA
> should be used. This ties into the SEV code to handle
> memory encryption/decryption, and prevents IO buffers
> being shared between host and guest.
>
> The DMA will go through bounce buffers, so some
> overhead is expected compared to non-SEV guests.
>
> (Note: virtio-net device queues are not encrypted.)
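As a practical aid, here is a small diagnostic sketch (not part of the fix) that scans a dumped domain XML, for example the output of "virsh dumpxml <instance>", and lists virtio disks and interfaces whose driver element lacks iommu="on". The script and its invocation are illustrative assumptions, not an official tool:

# Illustrative diagnostic only: report virtio devices in a libvirt domain
# XML dump whose <driver> element is missing iommu="on".
import sys
import xml.etree.ElementTree as ET


def find_missing_iommu(domain_xml):
    root = ET.fromstring(domain_xml)
    missing = []

    # virtio disks: <target bus="virtio"/> plus <driver .../>
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        if target is not None and target.get("bus") == "virtio":
            if driver is None or driver.get("iommu") != "on":
                missing.append("disk %s" % target.get("dev"))

    # virtio interfaces: <model type="virtio"/> plus <driver .../>
    for iface in root.findall("./devices/interface"):
        model = iface.find("model")
        driver = iface.find("driver")
        if model is not None and model.get("type") == "virtio":
            if driver is None or driver.get("iommu") != "on":
                missing.append("interface type=%s" % iface.get("type"))

    return missing


if __name__ == "__main__":
    # Usage (illustrative): virsh dumpxml instance-00000001 > dom.xml
    #                       python3 check_iommu.py dom.xml
    with open(sys.argv[1]) as f:
        for dev in find_missing_iommu(f.read()):
            print("missing iommu='on':", dev)

Any virtio device this reports on a SEV guest is a candidate for the failed hot plug behaviour described in this bug.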
Steps to reproduce
==================
1. Hot plug a virtio device (a volume or vNIC) into a SEV enabled instance.
Expected result
===============
Hot plug succeeds and the device is visible within the instance.
Actual result
=============
Hot plug appears to succeed but the device is never present within the instance and a trace is later logged.
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/
master
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
libvirt + KVM
3. Which storage type did you use?
(For example: Ceph, LVM, GPFS, ...)
What's the version of that?
N/A
4. Which networking type did you use?
(For example: nova-network, Neutron with OpenVSwitch, ...)
N/A
Logs & Configs
==============
[OSP 16.2] Volumes and vNICs are being hot plugged into SEV based instances without iommu='on' causing failures to attach and later detach within the guest OS
https://bugzilla.redhat.com/show_bug.cgi?id=1967293
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1930734/+subscriptions