yahoo-eng-team team mailing list archive
Message #86211
[Bug 1930734] [NEW] Volumes and vNICs are being hot plugged into SEV based instances without iommu='on' causing failures to attach and later detach within the guest OS
Public bug reported:
Description
===========
After a disk is successfully attached to a SEV-enabled instance, a later request to detach the disk never completes. The guest eventually logs the following trace, which relates to the initial attach:
[ 7.773877] pcieport 0000:00:02.5: Slot(0-5): Attention button pressed
[ 7.774743] pcieport 0000:00:02.5: Slot(0-5) Powering on due to button press
[ 7.775714] pcieport 0000:00:02.5: Slot(0-5): Card present
[ 7.776403] pcieport 0000:00:02.5: Slot(0-5): Link Up
[ 7.903183] pci 0000:06:00.0: [1af4:1042] type 00 class 0x010000
[ 7.904095] pci 0000:06:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 7.905024] pci 0000:06:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
[ 7.906977] pcieport 0000:00:02.5: bridge window [io 0x1000-0x0fff] to [bus 06] add_size 1000
[ 7.908069] pcieport 0000:00:02.5: BAR 13: no space for [io size 0x1000]
[ 7.908917] pcieport 0000:00:02.5: BAR 13: failed to assign [io size 0x1000]
[ 7.909832] pcieport 0000:00:02.5: BAR 13: no space for [io size 0x1000]
[ 7.910667] pcieport 0000:00:02.5: BAR 13: failed to assign [io size 0x1000]
[ 7.911586] pci 0000:06:00.0: BAR 4: assigned [mem 0x800600000-0x800603fff 64bit pref]
[ 7.912616] pci 0000:06:00.0: BAR 1: assigned [mem 0x80400000-0x80400fff]
[ 7.913472] pcieport 0000:00:02.5: PCI bridge to [bus 06]
[ 7.915762] pcieport 0000:00:02.5: bridge window [mem 0x80400000-0x805fffff]
[ 7.917525] pcieport 0000:00:02.5: bridge window [mem 0x800600000-0x8007fffff 64bit pref]
[ 7.920252] virtio-pci 0000:06:00.0: enabling device (0000 -> 0002)
[ 7.924487] virtio_blk virtio4: [vdb] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB)
[ 7.926616] vdb: detected capacity change from 0 to 1073741824
[ .. ]
[ 246.751028] INFO: task irq/29-pciehp:173 blocked for more than 120 seconds.
[ 246.752801] Not tainted 4.18.0-305.el8.x86_64 #1
[ 246.753902] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.755457] irq/29-pciehp D 0 173 2 0x80004000
[ 246.756616] Call Trace:
[ 246.757328] __schedule+0x2c4/0x700
[ 246.758185] schedule+0x38/0xa0
[ 246.758966] io_schedule+0x12/0x40
[ 246.759801] do_read_cache_page+0x513/0x770
[ 246.760761] ? blkdev_writepages+0x10/0x10
[ 246.761692] ? file_fdatawait_range+0x20/0x20
[ 246.762659] read_part_sector+0x38/0xda
[ 246.763554] read_lba+0x10f/0x220
[ 246.764367] efi_partition+0x1e4/0x6de
[ 246.765245] ? snprintf+0x49/0x60
[ 246.766046] ? is_gpt_valid.part.5+0x430/0x430
[ 246.766991] blk_add_partitions+0x164/0x3f0
[ 246.767915] ? blk_drop_partitions+0x91/0xc0
[ 246.768863] bdev_disk_changed+0x65/0xd0
[ 246.769748] __blkdev_get+0x3c4/0x510
[ 246.770595] blkdev_get+0xaf/0x180
[ 246.771394] __device_add_disk+0x3de/0x4b0
[ 246.772302] virtblk_probe+0x4ba/0x8a0 [virtio_blk]
[ 246.773313] virtio_dev_probe+0x158/0x1f0
[ 246.774208] really_probe+0x255/0x4a0
[ 246.775046] ? __driver_attach_async_helper+0x90/0x90
[ 246.776091] driver_probe_device+0x49/0xc0
[ 246.776965] bus_for_each_drv+0x79/0xc0
[ 246.777813] __device_attach+0xdc/0x160
[ 246.778669] bus_probe_device+0x9d/0xb0
[ 246.779523] device_add+0x418/0x780
[ 246.780321] register_virtio_device+0x9e/0xe0
[ 246.781254] virtio_pci_probe+0xb3/0x140
[ 246.782124] local_pci_probe+0x41/0x90
[ 246.782937] pci_device_probe+0x105/0x1c0
[ 246.783807] really_probe+0x255/0x4a0
[ 246.784623] ? __driver_attach_async_helper+0x90/0x90
[ 246.785647] driver_probe_device+0x49/0xc0
[ 246.786526] bus_for_each_drv+0x79/0xc0
[ 246.787364] __device_attach+0xdc/0x160
[ 246.788205] pci_bus_add_device+0x4a/0x90
[ 246.789063] pci_bus_add_devices+0x2c/0x70
[ 246.789916] pciehp_configure_device+0x91/0x130
[ 246.790855] pciehp_handle_presence_or_link_change+0x334/0x460
[ 246.791985] pciehp_ist+0x1a2/0x1b0
[ 246.792768] ? irq_finalize_oneshot.part.47+0xf0/0xf0
[ 246.793768] irq_thread_fn+0x1f/0x50
[ 246.794550] irq_thread+0xe7/0x170
[ 246.795299] ? irq_forced_thread_fn+0x70/0x70
[ 246.796190] ? irq_thread_check_affinity+0xe0/0xe0
[ 246.797147] kthread+0x116/0x130
[ 246.797841] ? kthread_flush_work_fn+0x10/0x10
[ 246.798735] ret_from_fork+0x22/0x40
[ 246.799523] INFO: task sfdisk:1129 blocked for more than 120 seconds.
[ 246.800717] Not tainted 4.18.0-305.el8.x86_64 #1
[ 246.801733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.803155] sfdisk D 0 1129 1107 0x00004080
[ 246.804225] Call Trace:
[ 246.804827] __schedule+0x2c4/0x700
[ 246.805590] ? submit_bio+0x3c/0x160
[ 246.806373] schedule+0x38/0xa0
[ 246.807089] schedule_preempt_disabled+0xa/0x10
[ 246.807990] __mutex_lock.isra.6+0x2d0/0x4a0
[ 246.808876] ? wake_up_q+0x80/0x80
[ 246.809636] ? fdatawait_one_bdev+0x20/0x20
[ 246.810508] iterate_bdevs+0x98/0x142
[ 246.811304] ksys_sync+0x6e/0xb0
[ 246.812041] __ia32_sys_sync+0xa/0x10
[ 246.812820] do_syscall_64+0x5b/0x1a0
[ 246.813613] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 246.814652] RIP: 0033:0x7fa9c04924fb
[ 246.815431] Code: Unable to access opcode bytes at RIP 0x7fa9c04924d1.
[ 246.816655] RSP: 002b:00007fff47661478 EFLAGS: 00000246 ORIG_RAX: 00000000000000a2
[ 246.818047] RAX: ffffffffffffffda RBX: 000055d79fc512f0 RCX: 00007fa9c04924fb
[ 246.824526] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055d79fc512f0
[ 246.825714] RBP: 0000000000000000 R08: 000055d79fc51012 R09: 0000000000000006
[ 246.826941] R10: 000000000000000a R11: 0000000000000246 R12: 00007fa9c075e6e0
[ 246.828169] R13: 000055d79fc58c80 R14: 0000000000000001 R15: 00007fff47661590
This is caused by the device XML supplied to libvirt missing the driver
iommu attribute:
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none" io="native"/>
<source dev="/dev/sdc"/>
<target bus="virtio" dev="vdb"/>
<serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>
As called out in the original SEV spec this is required:
https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/amd-sev-libvirt-support
> The iommu attribute is on for all virtio devices.
> Despite the name, this does not require the guest
> or host to have an IOMMU device, but merely enables
> the virtio flag which indicates that virtualized DMA
> should be used. This ties into the SEV code to handle
> memory encryption/decryption, and prevents IO buffers
> being shared between host and guest.
>
> The DMA will go through bounce buffers, so some
> overhead is expected compared to non-SEV guests.
>
> (Note: virtio-net device queues are not encrypted.)
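To illustrate the missing attribute, here is a minimal sketch that takes the disk XML quoted above and sets iommu="on" on its <driver> element. The helper name enable_virtio_iommu is hypothetical and is not nova's actual code path; it only shows the shape of the device XML that the spec asks for, under the assumption that the device is a virtio disk.

```python
import xml.etree.ElementTree as ET

# Disk XML as quoted in this report (missing the driver iommu attribute).
DISK_XML = """
<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>
"""


def enable_virtio_iommu(device_xml):
    """Return the disk XML with iommu="on" set on <driver>.

    Hypothetical helper for illustration only. Non-virtio disks are
    returned unchanged.
    """
    root = ET.fromstring(device_xml)
    target = root.find("target")
    if target is None or target.get("bus") != "virtio":
        return device_xml
    driver = root.find("driver")
    if driver is None:
        # No <driver> element yet: create one to carry the attribute.
        driver = ET.SubElement(root, "driver")
    driver.set("iommu", "on")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(enable_virtio_iommu(DISK_XML))
```

The resulting <driver> element keeps its existing attributes (name, type, cache, io) and gains iommu="on".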
Steps to reproduce
==================
1. Hot plug a PCIe device into a SEV enabled instance.
Expected result
===============
Hot plug succeeds and the device is visible within the instance.
Actual result
=============
Hot plug appears to succeed, but the device is never present within the instance and a hung-task trace is later logged in the guest.
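One possible way to confirm the cause on an affected instance is to inspect the domain XML (for example, as printed by virsh dumpxml) and list virtio disks and interfaces whose <driver> element lacks iommu="on". The sketch below is an assumption-laden illustration: the function name and the sample domain XML are made up for this example, and only disks and interfaces are checked.

```python
import xml.etree.ElementTree as ET
from typing import List

# Made-up sample domain XML: vda has iommu="on", vdb and the vNIC do not.
SAMPLE_DOMAIN_XML = """
<domain type="kvm">
  <devices>
    <disk type="file" device="disk">
      <driver name="qemu" type="qcow2" iommu="on"/>
      <target bus="virtio" dev="vda"/>
    </disk>
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="none" io="native"/>
      <target bus="virtio" dev="vdb"/>
    </disk>
    <interface type="bridge">
      <mac address="52:54:00:00:00:01"/>
      <model type="virtio"/>
    </interface>
  </devices>
</domain>
"""


def virtio_devices_missing_iommu(domain_xml: str) -> List[str]:
    """List virtio disks and interfaces without driver iommu="on".

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(domain_xml)
    missing = []
    for disk in root.iterfind("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("bus") != "virtio":
            continue
        driver = disk.find("driver")
        if driver is None or driver.get("iommu") != "on":
            missing.append("disk:" + target.get("dev", "?"))
    for iface in root.iterfind("./devices/interface"):
        model = iface.find("model")
        if model is None or model.get("type") != "virtio":
            continue
        driver = iface.find("driver")
        if driver is None or driver.get("iommu") != "on":
            mac = iface.find("mac")
            address = mac.get("address", "?") if mac is not None else "?"
            missing.append("interface:" + address)
    return missing


if __name__ == "__main__":
    for device in virtio_devices_missing_iommu(SAMPLE_DOMAIN_XML):
        print(device)
```

Any device reported by this check was defined without the attribute, which matches the hot-plugged disk shown in this report.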
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/
master
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
libvirt + KVM
3. Which storage type did you use?
(For example: Ceph, LVM, GPFS, ...)
What's the version of that?
N/A
4. Which networking type did you use?
(For example: nova-network, Neutron with OpenVSwitch, ...)
N/A
Logs & Configs
==============
[OSP 16.2] Volumes and vNICs are being hot plugged into SEV based instances without iommu='on' causing failures to attach and later detach within the guest OS
https://bugzilla.redhat.com/show_bug.cgi?id=1967293
** Affects: nova
Importance: Undecided
Assignee: Lee Yarwood (lyarwood)
Status: New
** Tags: libvirt
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1930734
Title:
Volumes and vNICs are being hot plugged into SEV based instances
without iommu='on' causing failures to attach and later detach within
the guest OS
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1930734/+subscriptions