
yahoo-eng-team team mailing list archive

[Bug 1930734] Re: Volumes and vNICs are being hot plugged into SEV based instances without iommu='on' causing failures to attach and later detach within the guest OS

 

Reviewed:  https://review.opendev.org/c/openstack/nova/+/794639
Committed: https://opendev.org/openstack/nova/commit/4d8bf15fec15dc3416023e577e0f2c277c216506
Submitter: "Zuul (22348)"
Branch:    master

commit 4d8bf15fec15dc3416023e577e0f2c277c216506
Author: Lee Yarwood <lyarwood@xxxxxxxxxx>
Date:   Thu Jun 3 16:37:45 2021 +0100

    libvirt: Set driver_iommu when attaching virtio devices to SEV instance
    
    As called out in the original spec [1] virtio devices attached to a SEV
    enabled instance must have the iommu attribute enabled. This was done
    within the original implementation of the spec for all virtio devices
    defined when initially spawning the instance but does not include
    volumes and interfaces that are later hot plugged.
    
    This change corrects this for both volumes and nics and in doing so
    slightly refactors the original designer code to make it usable in both
    cases.
    
    [1] https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/amd-sev-libvirt-support.html#proposed-change
    
    Closes-Bug: #1930734
    Change-Id: I11131a3f90b8af85e7151b519fb26d225629c391


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1930734

Title:
  Volumes and vNICs are being hot plugged into SEV based instances
  without iommu='on' causing failures to attach and later detach within
  the guest OS

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========
  After a disk is successfully attached to a SEV enabled instance, the request to detach it never completes, and the following trace regarding the initial attach is eventually logged:

  [    7.773877] pcieport 0000:00:02.5: Slot(0-5): Attention button pressed
  [    7.774743] pcieport 0000:00:02.5: Slot(0-5) Powering on due to button press
  [    7.775714] pcieport 0000:00:02.5: Slot(0-5): Card present
  [    7.776403] pcieport 0000:00:02.5: Slot(0-5): Link Up
  [    7.903183] pci 0000:06:00.0: [1af4:1042] type 00 class 0x010000
  [    7.904095] pci 0000:06:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
  [    7.905024] pci 0000:06:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
  [    7.906977] pcieport 0000:00:02.5: bridge window [io  0x1000-0x0fff] to [bus 06] add_size 1000
  [    7.908069] pcieport 0000:00:02.5: BAR 13: no space for [io  size 0x1000]
  [    7.908917] pcieport 0000:00:02.5: BAR 13: failed to assign [io  size 0x1000]
  [    7.909832] pcieport 0000:00:02.5: BAR 13: no space for [io  size 0x1000]
  [    7.910667] pcieport 0000:00:02.5: BAR 13: failed to assign [io  size 0x1000]
  [    7.911586] pci 0000:06:00.0: BAR 4: assigned [mem 0x800600000-0x800603fff 64bit pref]
  [    7.912616] pci 0000:06:00.0: BAR 1: assigned [mem 0x80400000-0x80400fff]
  [    7.913472] pcieport 0000:00:02.5: PCI bridge to [bus 06]
  [    7.915762] pcieport 0000:00:02.5:   bridge window [mem 0x80400000-0x805fffff]
  [    7.917525] pcieport 0000:00:02.5:   bridge window [mem 0x800600000-0x8007fffff 64bit pref]
  [    7.920252] virtio-pci 0000:06:00.0: enabling device (0000 -> 0002)
  [    7.924487] virtio_blk virtio4: [vdb] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB)
  [    7.926616] vdb: detected capacity change from 0 to 1073741824
  [ .. ]
  [  246.751028] INFO: task irq/29-pciehp:173 blocked for more than 120 seconds.
  [  246.752801]       Not tainted 4.18.0-305.el8.x86_64 #1
  [  246.753902] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  246.755457] irq/29-pciehp   D    0   173      2 0x80004000
  [  246.756616] Call Trace:
  [  246.757328]  __schedule+0x2c4/0x700
  [  246.758185]  schedule+0x38/0xa0
  [  246.758966]  io_schedule+0x12/0x40
  [  246.759801]  do_read_cache_page+0x513/0x770
  [  246.760761]  ? blkdev_writepages+0x10/0x10
  [  246.761692]  ? file_fdatawait_range+0x20/0x20
  [  246.762659]  read_part_sector+0x38/0xda
  [  246.763554]  read_lba+0x10f/0x220
  [  246.764367]  efi_partition+0x1e4/0x6de
  [  246.765245]  ? snprintf+0x49/0x60
  [  246.766046]  ? is_gpt_valid.part.5+0x430/0x430
  [  246.766991]  blk_add_partitions+0x164/0x3f0
  [  246.767915]  ? blk_drop_partitions+0x91/0xc0
  [  246.768863]  bdev_disk_changed+0x65/0xd0
  [  246.769748]  __blkdev_get+0x3c4/0x510
  [  246.770595]  blkdev_get+0xaf/0x180
  [  246.771394]  __device_add_disk+0x3de/0x4b0
  [  246.772302]  virtblk_probe+0x4ba/0x8a0 [virtio_blk]
  [  246.773313]  virtio_dev_probe+0x158/0x1f0
  [  246.774208]  really_probe+0x255/0x4a0
  [  246.775046]  ? __driver_attach_async_helper+0x90/0x90
  [  246.776091]  driver_probe_device+0x49/0xc0
  [  246.776965]  bus_for_each_drv+0x79/0xc0
  [  246.777813]  __device_attach+0xdc/0x160
  [  246.778669]  bus_probe_device+0x9d/0xb0
  [  246.779523]  device_add+0x418/0x780
  [  246.780321]  register_virtio_device+0x9e/0xe0
  [  246.781254]  virtio_pci_probe+0xb3/0x140
  [  246.782124]  local_pci_probe+0x41/0x90
  [  246.782937]  pci_device_probe+0x105/0x1c0
  [  246.783807]  really_probe+0x255/0x4a0
  [  246.784623]  ? __driver_attach_async_helper+0x90/0x90
  [  246.785647]  driver_probe_device+0x49/0xc0
  [  246.786526]  bus_for_each_drv+0x79/0xc0
  [  246.787364]  __device_attach+0xdc/0x160
  [  246.788205]  pci_bus_add_device+0x4a/0x90
  [  246.789063]  pci_bus_add_devices+0x2c/0x70
  [  246.789916]  pciehp_configure_device+0x91/0x130
  [  246.790855]  pciehp_handle_presence_or_link_change+0x334/0x460
  [  246.791985]  pciehp_ist+0x1a2/0x1b0
  [  246.792768]  ? irq_finalize_oneshot.part.47+0xf0/0xf0
  [  246.793768]  irq_thread_fn+0x1f/0x50
  [  246.794550]  irq_thread+0xe7/0x170
  [  246.795299]  ? irq_forced_thread_fn+0x70/0x70
  [  246.796190]  ? irq_thread_check_affinity+0xe0/0xe0
  [  246.797147]  kthread+0x116/0x130
  [  246.797841]  ? kthread_flush_work_fn+0x10/0x10
  [  246.798735]  ret_from_fork+0x22/0x40
  [  246.799523] INFO: task sfdisk:1129 blocked for more than 120 seconds.
  [  246.800717]       Not tainted 4.18.0-305.el8.x86_64 #1
  [  246.801733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  246.803155] sfdisk          D    0  1129   1107 0x00004080
  [  246.804225] Call Trace:
  [  246.804827]  __schedule+0x2c4/0x700
  [  246.805590]  ? submit_bio+0x3c/0x160
  [  246.806373]  schedule+0x38/0xa0
  [  246.807089]  schedule_preempt_disabled+0xa/0x10
  [  246.807990]  __mutex_lock.isra.6+0x2d0/0x4a0
  [  246.808876]  ? wake_up_q+0x80/0x80
  [  246.809636]  ? fdatawait_one_bdev+0x20/0x20
  [  246.810508]  iterate_bdevs+0x98/0x142
  [  246.811304]  ksys_sync+0x6e/0xb0
  [  246.812041]  __ia32_sys_sync+0xa/0x10
  [  246.812820]  do_syscall_64+0x5b/0x1a0
  [  246.813613]  entry_SYSCALL_64_after_hwframe+0x65/0xca
  [  246.814652] RIP: 0033:0x7fa9c04924fb
  [  246.815431] Code: Unable to access opcode bytes at RIP 0x7fa9c04924d1.
  [  246.816655] RSP: 002b:00007fff47661478 EFLAGS: 00000246 ORIG_RAX: 00000000000000a2
  [  246.818047] RAX: ffffffffffffffda RBX: 000055d79fc512f0 RCX: 00007fa9c04924fb
  [  246.824526] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055d79fc512f0
  [  246.825714] RBP: 0000000000000000 R08: 000055d79fc51012 R09: 0000000000000006
  [  246.826941] R10: 000000000000000a R11: 0000000000000246 R12: 00007fa9c075e6e0
  [  246.828169] R13: 000055d79fc58c80 R14: 0000000000000001 R15: 00007fff47661590

  This is caused by the device XML supplied to libvirt missing the
  driver iommu attribute:

  <disk type="block" device="disk">
    <driver name="qemu" type="raw" cache="none" io="native"/>
    <source dev="/dev/sdc"/>
    <target bus="virtio" dev="vdb"/>
    <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
  </disk>
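  For comparison, a correctly generated definition carries the iommu
  attribute on the driver element, matching what nova already produced at
  spawn time (illustrative fragment based on the spec requirement; other
  values as above):

```xml
<disk type="block" device="disk">
  <!-- iommu="on" is the attribute the fix adds for hot-plugged virtio devices -->
  <driver name="qemu" type="raw" cache="none" io="native" iommu="on"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>
```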

  As called out in the original SEV spec, this is required:

  https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/amd-sev-libvirt-support

  > The iommu attribute is on for all virtio devices. 
  > Despite the name, this does not require the guest 
  > or host to have an IOMMU device, but merely enables 
  > the virtio flag which indicates that virtualized DMA
  > should be used. This ties into the SEV code to handle
  > memory encryption/decryption, and prevents IO buffers
  > being shared between host and guest.
  >
  > The DMA will go through bounce buffers, so some 
  > overhead is expected compared to non-SEV guests.
  >
  > (Note: virtio-net device queues are not encrypted.)
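  The requirement above can be sketched as a small stand-alone helper that
  post-processes a device XML snippet. This is purely illustrative: the
  `set_driver_iommu` function below is hypothetical and is not nova's
  actual code, which instead sets driver_iommu while building the device
  config (per the commit above).

```python
import xml.etree.ElementTree as ET

def set_driver_iommu(device_xml: str) -> str:
    """Hypothetical helper: ensure a virtio device's <driver> element
    carries iommu='on', as the SEV spec requires for virtio devices."""
    root = ET.fromstring(device_xml)
    target = root.find("target")
    # Only virtio devices need the flag; leave other buses untouched.
    if target is not None and target.get("bus") == "virtio":
        driver = root.find("driver")
        if driver is None:
            driver = ET.SubElement(root, "driver")
        driver.set("iommu", "on")
    return ET.tostring(root, encoding="unicode")
```

  Applied to the broken disk XML above, the returned snippet gains
  iommu="on" on its driver element, while non-virtio devices pass through
  unchanged.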

  Steps to reproduce
  ==================
  1. Hot plug a virtio device (a volume or vNIC) into a SEV enabled instance.

  Expected result
  ===============
  Hot plug succeeds and the device is visible within the instance.

  Actual result
  =============
  Hot plug appears to succeed but the device is never present within the instance and a trace is later logged.

  Environment
  ===========
  1. Exact version of OpenStack you are running. See the following
    list for all releases: http://docs.openstack.org/releases/

     master

  2. Which hypervisor did you use?
     (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
     What's the version of that?

     libvirt + KVM

  3. Which storage type did you use?
     (For example: Ceph, LVM, GPFS, ...)
     What's the version of that?

     N/A

  4. Which networking type did you use?
     (For example: nova-network, Neutron with OpenVSwitch, ...)

     N/A

  Logs & Configs
  ==============

  [OSP 16.2] Volumes and vNICs are being hot plugged into SEV based instances without iommu='on' causing failures to attach and later detach within the guest OS
  https://bugzilla.redhat.com/show_bug.cgi?id=1967293

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1930734/+subscriptions
