[Bug 1930734] [NEW] Volumes and vNICs are being hot plugged into SEV-based instances without iommu='on', causing failures to attach and later detach within the guest OS


Public bug reported:

Description
===========
After a disk is apparently attached successfully to a SEV-enabled instance, a subsequent request to detach it never completes, and the guest kernel eventually logs the following trace relating to the initial attach:

[    7.773877] pcieport 0000:00:02.5: Slot(0-5): Attention button pressed
[    7.774743] pcieport 0000:00:02.5: Slot(0-5) Powering on due to button press
[    7.775714] pcieport 0000:00:02.5: Slot(0-5): Card present
[    7.776403] pcieport 0000:00:02.5: Slot(0-5): Link Up
[    7.903183] pci 0000:06:00.0: [1af4:1042] type 00 class 0x010000
[    7.904095] pci 0000:06:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
[    7.905024] pci 0000:06:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
[    7.906977] pcieport 0000:00:02.5: bridge window [io  0x1000-0x0fff] to [bus 06] add_size 1000
[    7.908069] pcieport 0000:00:02.5: BAR 13: no space for [io  size 0x1000]
[    7.908917] pcieport 0000:00:02.5: BAR 13: failed to assign [io  size 0x1000]
[    7.909832] pcieport 0000:00:02.5: BAR 13: no space for [io  size 0x1000]
[    7.910667] pcieport 0000:00:02.5: BAR 13: failed to assign [io  size 0x1000]
[    7.911586] pci 0000:06:00.0: BAR 4: assigned [mem 0x800600000-0x800603fff 64bit pref]
[    7.912616] pci 0000:06:00.0: BAR 1: assigned [mem 0x80400000-0x80400fff]
[    7.913472] pcieport 0000:00:02.5: PCI bridge to [bus 06]
[    7.915762] pcieport 0000:00:02.5:   bridge window [mem 0x80400000-0x805fffff]
[    7.917525] pcieport 0000:00:02.5:   bridge window [mem 0x800600000-0x8007fffff 64bit pref]
[    7.920252] virtio-pci 0000:06:00.0: enabling device (0000 -> 0002)
[    7.924487] virtio_blk virtio4: [vdb] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB)
[    7.926616] vdb: detected capacity change from 0 to 1073741824
[ .. ]
[  246.751028] INFO: task irq/29-pciehp:173 blocked for more than 120 seconds.
[  246.752801]       Not tainted 4.18.0-305.el8.x86_64 #1
[  246.753902] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.755457] irq/29-pciehp   D    0   173      2 0x80004000
[  246.756616] Call Trace:
[  246.757328]  __schedule+0x2c4/0x700
[  246.758185]  schedule+0x38/0xa0
[  246.758966]  io_schedule+0x12/0x40
[  246.759801]  do_read_cache_page+0x513/0x770
[  246.760761]  ? blkdev_writepages+0x10/0x10
[  246.761692]  ? file_fdatawait_range+0x20/0x20
[  246.762659]  read_part_sector+0x38/0xda
[  246.763554]  read_lba+0x10f/0x220
[  246.764367]  efi_partition+0x1e4/0x6de
[  246.765245]  ? snprintf+0x49/0x60
[  246.766046]  ? is_gpt_valid.part.5+0x430/0x430
[  246.766991]  blk_add_partitions+0x164/0x3f0
[  246.767915]  ? blk_drop_partitions+0x91/0xc0
[  246.768863]  bdev_disk_changed+0x65/0xd0
[  246.769748]  __blkdev_get+0x3c4/0x510
[  246.770595]  blkdev_get+0xaf/0x180
[  246.771394]  __device_add_disk+0x3de/0x4b0
[  246.772302]  virtblk_probe+0x4ba/0x8a0 [virtio_blk]
[  246.773313]  virtio_dev_probe+0x158/0x1f0
[  246.774208]  really_probe+0x255/0x4a0
[  246.775046]  ? __driver_attach_async_helper+0x90/0x90
[  246.776091]  driver_probe_device+0x49/0xc0
[  246.776965]  bus_for_each_drv+0x79/0xc0
[  246.777813]  __device_attach+0xdc/0x160
[  246.778669]  bus_probe_device+0x9d/0xb0
[  246.779523]  device_add+0x418/0x780
[  246.780321]  register_virtio_device+0x9e/0xe0
[  246.781254]  virtio_pci_probe+0xb3/0x140
[  246.782124]  local_pci_probe+0x41/0x90
[  246.782937]  pci_device_probe+0x105/0x1c0
[  246.783807]  really_probe+0x255/0x4a0
[  246.784623]  ? __driver_attach_async_helper+0x90/0x90
[  246.785647]  driver_probe_device+0x49/0xc0
[  246.786526]  bus_for_each_drv+0x79/0xc0
[  246.787364]  __device_attach+0xdc/0x160
[  246.788205]  pci_bus_add_device+0x4a/0x90
[  246.789063]  pci_bus_add_devices+0x2c/0x70
[  246.789916]  pciehp_configure_device+0x91/0x130
[  246.790855]  pciehp_handle_presence_or_link_change+0x334/0x460
[  246.791985]  pciehp_ist+0x1a2/0x1b0
[  246.792768]  ? irq_finalize_oneshot.part.47+0xf0/0xf0
[  246.793768]  irq_thread_fn+0x1f/0x50
[  246.794550]  irq_thread+0xe7/0x170
[  246.795299]  ? irq_forced_thread_fn+0x70/0x70
[  246.796190]  ? irq_thread_check_affinity+0xe0/0xe0
[  246.797147]  kthread+0x116/0x130
[  246.797841]  ? kthread_flush_work_fn+0x10/0x10
[  246.798735]  ret_from_fork+0x22/0x40
[  246.799523] INFO: task sfdisk:1129 blocked for more than 120 seconds.
[  246.800717]       Not tainted 4.18.0-305.el8.x86_64 #1
[  246.801733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.803155] sfdisk          D    0  1129   1107 0x00004080
[  246.804225] Call Trace:
[  246.804827]  __schedule+0x2c4/0x700
[  246.805590]  ? submit_bio+0x3c/0x160
[  246.806373]  schedule+0x38/0xa0
[  246.807089]  schedule_preempt_disabled+0xa/0x10
[  246.807990]  __mutex_lock.isra.6+0x2d0/0x4a0
[  246.808876]  ? wake_up_q+0x80/0x80
[  246.809636]  ? fdatawait_one_bdev+0x20/0x20
[  246.810508]  iterate_bdevs+0x98/0x142
[  246.811304]  ksys_sync+0x6e/0xb0
[  246.812041]  __ia32_sys_sync+0xa/0x10
[  246.812820]  do_syscall_64+0x5b/0x1a0
[  246.813613]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[  246.814652] RIP: 0033:0x7fa9c04924fb
[  246.815431] Code: Unable to access opcode bytes at RIP 0x7fa9c04924d1.
[  246.816655] RSP: 002b:00007fff47661478 EFLAGS: 00000246 ORIG_RAX: 00000000000000a2
[  246.818047] RAX: ffffffffffffffda RBX: 000055d79fc512f0 RCX: 00007fa9c04924fb
[  246.824526] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055d79fc512f0
[  246.825714] RBP: 0000000000000000 R08: 000055d79fc51012 R09: 0000000000000006
[  246.826941] R10: 000000000000000a R11: 0000000000000246 R12: 00007fa9c075e6e0
[  246.828169] R13: 000055d79fc58c80 R14: 0000000000000001 R15: 00007fff47661590

This is caused by the device XML supplied to libvirt during the hot
plug missing the iommu attribute on its <driver> element:

<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>
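
With SEV the hot plug path should instead mirror the XML Nova already
generates for virtio devices at boot time, i.e. with iommu='on' set on
the <driver> element. A sketch of the expected XML, all other
attributes unchanged:

<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native" iommu="on"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>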

As called out in the original SEV spec, this attribute is required:

https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/amd-sev-libvirt-support

> The iommu attribute is on for all virtio devices. 
> Despite the name, this does not require the guest 
> or host to have an IOMMU device, but merely enables 
> the virtio flag which indicates that virtualized DMA
> should be used. This ties into the SEV code to handle
> memory encryption/decryption, and prevents IO buffers
> being shared between host and guest.
>
> The DMA will go through bounce buffers, so some 
> overhead is expected compared to non-SEV guests.
>
> (Note: virtio-net device queues are not encrypted.)
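
A quick way to check whether an instance is affected is to compare the
<driver> elements in the live domain XML on the compute host (the
domain name below is illustrative):

# On the compute host; libvirt prints attributes with single quotes.
virsh dumpxml instance-00000001 | grep '<driver'

If the hot-plugged disk or interface shows a <driver> line without
iommu='on' while the boot time virtio devices carry it, the guest is
exposed to the hang above.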

Steps to reproduce
==================
1. Hot plug a PCIe device, such as a volume or a vNIC, into a SEV-enabled instance; see the CLI sketch below.
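
For example, using the OpenStack CLI (names are illustrative; assumes
a SEV-capable compute host and an image prepared to boot under SEV):

# Flavor requesting memory encryption, i.e. a SEV guest.
openstack flavor create --vcpus 2 --ram 2048 --disk 20 sev.small
openstack flavor set sev.small --property hw:mem_encryption=true

# Boot the guest, then hot plug a volume into it.
openstack server create --flavor sev.small --image <sev-ready-image> sev-guest
openstack volume create --size 1 sev-vol
openstack server add volume sev-guest sev-vol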

Expected result
===============
Hot plug succeeds and the device is visible within the instance.

Actual result
=============
Hot plug appears to succeed, but the device never becomes usable within the instance and a hung task trace is eventually logged by the guest kernel.

Environment
===========
1. Exact version of OpenStack you are running. See the following
  list for all releases: http://docs.openstack.org/releases/

   master

2. Which hypervisor did you use?
   (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
   What's the version of that?

   libvirt + KVM

3. Which storage type did you use?
   (For example: Ceph, LVM, GPFS, ...)
   What's the version of that?

   N/A

4. Which networking type did you use?
   (For example: nova-network, Neutron with OpenVSwitch, ...)

   N/A

Logs & Configs
==============

[OSP 16.2] Volumes and vNICs are being hot plugged into SEV based instances without iommu='on' causing failures to attach and later detach within the guest OS
https://bugzilla.redhat.com/show_bug.cgi?id=1967293

** Affects: nova
     Importance: Undecided
     Assignee: Lee Yarwood (lyarwood)
         Status: New


** Tags: libvirt

https://bugs.launchpad.net/bugs/1930734