yahoo-eng-team team mailing list archive

[Bug 2129788] [NEW] Native AIO mode for Cinder volumes is not always appropriate

 

Public bug reported:

Recently, customers have reported issues [1][2] when using Nova with
Cinder volumes (which appear to be sparse volumes) and the guest disk
<driver> XML element sets the attribute io=native. These customers
experienced reduced disk I/O performance or guest hangs with their NFS
or Fibre Channel Cinder volumes and narrowed the cause down to the
io=native attribute that Nova sets in the guest XML.
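
To check whether a given guest disk carries the attribute, the <driver>
element can be inspected directly. The following is a minimal sketch
using only the Python standard library. The disk XML is an illustrative
example (the source path and target device are made up), and
disk_aio_mode is a hypothetical helper name, not part of Nova or
libvirt.

```python
import xml.etree.ElementTree as ET

# Illustrative guest disk definition. The source path and target device
# are invented for this example and do not come from the bug report.
DISK_XML = """
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/example-volume'/>
  <target dev='vdb' bus='virtio'/>
</disk>
"""


def disk_aio_mode(disk_xml):
    """Return the io attribute of the <driver> element, or None if unset."""
    driver = ET.fromstring(disk_xml.strip()).find("driver")
    if driver is None:
        return None
    return driver.get("io")


print(disk_aio_mode(DISK_XML))  # prints "native"
```

A return value of None means the XML does not pin an AIO mode, which is
the case where the choice is left to the hypervisor.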

Nova's hard-coding of io=native in the guest XML for the iSCSI, Fibre
Channel, and NFS Cinder volume backends was added about 10 years ago to
improve disk performance [3]. That determination may no longer be
accurate, or at least it does not hold universally.

QEMU has its own logic for selecting the best AIO mode for the device at
hand. Based on the experiences above, some deployers need to be able to
let QEMU choose the AIO mode rather than have Nova hard-code it.

It's possible that the entire assumption needs to be revisited at a
fundamental level, given how much time has passed since the hard-coding
was added. QEMU may have advanced since then and may now support more
modern AIO modes such as io_uring.

For the immediate term, we can add a [workarounds] config option to
enable deployers to defer AIO mode selection to QEMU if they are having
problems with io=native.
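
As a rough illustration of what such a workaround could look like, the
sketch below gates the io attribute on a boolean flag. Everything here
is an assumption made for illustration: the flag name
defer_aio_to_qemu, the backend labels, and the helper
build_driver_element are hypothetical and are not Nova's actual option
name or code.

```python
import xml.etree.ElementTree as ET

# Backends the report names as getting io=native hard-coded. The string
# labels are hypothetical and chosen only for this example.
NATIVE_AIO_BACKENDS = {"iscsi", "fibre_channel", "nfs"}


def build_driver_element(backend, defer_aio_to_qemu=False):
    """Build a <driver> element, optionally leaving the AIO mode unset.

    defer_aio_to_qemu stands in for a hypothetical [workarounds] option.
    When it is True, no io attribute is written, so the AIO mode is not
    pinned by the generated XML.
    """
    driver = ET.Element("driver", {"name": "qemu", "type": "raw"})
    if backend in NATIVE_AIO_BACKENDS and not defer_aio_to_qemu:
        driver.set("io", "native")
    return driver


# Show the generated element with the workaround off and on.
for defer in (False, True):
    elem = build_driver_element("nfs", defer_aio_to_qemu=defer)
    print(defer, ET.tostring(elem, encoding="unicode"))
```

Defaulting the flag to False keeps the existing behavior for deployers
who are not affected, which matches the opt-in nature of a workaround.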

For the long term, we will need to discuss the topic with the Cinder
team to learn whether there is something we need to change more broadly
in Nova.

[1] https://issues.redhat.com/browse/OSPRH-20325
[2] https://issues.redhat.com/browse/OSPRH-20737
[3] https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/libvirt-aio-mode.html

** Affects: nova
     Importance: Undecided
     Assignee: melanie witt (melwitt)
         Status: In Progress


** Tags: nfs volumes

** Description changed:

  Recently we have had customers reporting issues [1][2] using Nova with
- Cinder volumes (and seem to be sparse volumes) when the disk <driver>
- element sets attribute io=native. Such customers experienced reduced
- disk I/O performance or guest hanging with their NFS or Fibre Channel
- Cinder volumes and narrowed down the cause to the io=native attribute
- set in the guest XML by Nova.
+ Cinder volumes (and seem to be sparse volumes) when the guest disk
+ <driver> XML element sets attribute io=native. Such customers
+ experienced reduced disk I/O performance or guest hanging with their NFS
+ or Fibre Channel Cinder volumes and narrowed down the cause to the
+ io=native attribute set in the guest XML by Nova.
  
  The hard-coding io=native in guest XML in Nova for Cinder volume
  backends iSCSI, Fibre Channel, and NFS was added about 10 years ago to
  improve disk performance [3]. It seems that determination may no longer
  be accurate or at least it is not universally the case.
  
  QEMU has logic inside it that it uses to select the best AIO mode for
  the device at hand and from the aforementioned experiences, some
  deployers need to be able to let QEMU set the best AIO mode and not have
  Nova hard-code it.
  
  It's possible that the entire assumption needs to be revisited at a
  fundamental level given the amount of time that has passed since the
  hard-coding was added. QEMU may have had advancements since then and may
  even have access to more modern AIO modes such as io_uring as well.
  
  For the immediate term, we can add a [workarounds] config option to
  enable deployers to defer AIO mode selection to QEMU if they are having
  problems with io=native.
  
  For the long term, we will need to discuss the topic with the Cinder
  team to learn if there is something we need to change more unilaterally
  in Nova.
  
- 
  [1] https://issues.redhat.com/browse/OSPRH-20325
  [2] https://issues.redhat.com/browse/OSPRH-20737
  [3] https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/libvirt-aio-mode.html

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2129788

Title:
  Native AIO mode for Cinder volumes is not always appropriate

Status in OpenStack Compute (nova):
  In Progress
