
kernel-packages team mailing list archive

[Bug 1319003] Re: Storage performance regression when Xen backend lacks persistent-grants support

 

** Attachment added: "Saucy x86_64 guests without the backports"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1319003/+attachment/4123555/+files/saucy64.png

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1319003

Title:
  Storage performance regression when Xen backend lacks persistent-
  grants support

Status in “linux” package in Ubuntu:
  In Progress
Status in “linux” source package in Saucy:
  In Progress

Bug description:
  Description of problem:
  When used as a Xen guest, Ubuntu 13.10 may show lower storage performance than older releases. This is due to the persistent-grants feature introduced in xen-blkfront in the Linux 3.8 kernel series. From 3.8 to 3.12 (inclusive), xen-blkfront performs an extra set of memcpy() operations regardless of whether the backend (i.e. xen-blkback, qemu, tapdisk) supports persistent grants. Many Xen dom0s do not have backend persistent-grants support (such as Citrix XenServer and any Linux distro with a kernel prior to 3.8). This has been identified and fixed in the 3.13 kernel series [1], but the fix was not backported to previous LTS kernels due to the nature of the bug (performance only).

  While persistent grants reduce the stress on the Xen grant table and
  allow for much better aggregate throughput (at the cost of an extra
  set of memcpy() operations), adding the copy overhead when the backend
  does not support the feature gives the worst of both worlds: the guest
  pays for the copies and the grant table sees no relief. This is
  particularly noticeable when many guests run intensive storage
  workloads at the same time.
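
  As a rough aid for triage, the affected range described above (3.8 to
  3.12 inclusive) can be checked against a guest's kernel release string.
  This is a minimal sketch, not part of the original report: the function
  names are hypothetical, and the check looks at the upstream version
  number only, so a distribution kernel that carries the backported fixes
  would still be flagged.

```python
import re


def parse_kernel_version(release):
    """Return (major, minor) from a release string such as '3.11.0-12-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def in_affected_blkfront_range(release):
    """True if the upstream version falls in 3.8..3.12 inclusive.

    Assumption: the version number alone decides the answer. Kernels
    with the fix backported are not detected by this heuristic.
    """
    return (3, 8) <= parse_kernel_version(release) <= (3, 12)
```

  On a running guest, the release string could be taken from
  platform.release() in the Python standard library.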

  The attached graphs show storage throughput numbers for Linux guests
  using kernel 3.12.9 (Graph 1) and 3.13.7 (Graph 2) running on a Citrix
  XenServer development build. The server had 4 storage repositories
  (SRs) with 1 Micron P320 SSD per SR (e.g. 10 VMs per SR means 40 VMs
  in total). With the 3.12.9 kernel, the regression is clearly visible
  for more than 2 VMs per SR and block sizes larger than 64 KiB. The
  workload consisted of sequential reads on pre-allocated raw LVM
  logical volumes.

  [1] Commits by Roger Pau Monné:
      bfe11d6de1c416cea4f3f0f35f864162063ce3fa
      fbe363c476afe8ec992d3baf682670a4bd1b6ce6

  Version-Release number of selected component (if applicable):
  xen-blkfront of Linux kernel 3.11

  How reproducible:
  This is always reproducible when an Ubuntu 13.10 guest is running on Xen and the storage backend (i.e. xen-blkback, qemu, tapdisk) does not support persistent grants.

  Steps to Reproduce:
  1. Install a Xen dom0 running a kernel prior to 3.8 (without persistent-grants support).
  2. Install a set of Ubuntu 13.10 guests (which use kernel 3.11).
  3. Measure aggregate storage throughput from all guests.

  NOTE: The storage infrastructure (e.g. local SSDs, network-attached
  storage) should not be a bottleneck in itself. If tested on a single
  SATA disk, for example, the issue will probably be unnoticeable as the
  infrastructure will be limiting response time and throughput.
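
  Step 3 can be approximated with a small script run inside each guest,
  with the per-guest results summed afterwards. The sketch below is an
  illustration only and is not the tool used to produce the attached
  graphs; the function names are hypothetical. It reads through the
  page cache, so caches should be dropped first (or a direct-I/O
  benchmark used instead) to keep cached data from inflating the
  numbers.

```python
import time


def sequential_read(path, block_size=1024 * 1024):
    """Read `path` from start to end in fixed-size blocks.

    Returns (bytes_read, elapsed_seconds). `path` may be a regular
    file or a block device node that the caller is allowed to read.
    """
    total = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as handle:
        while True:
            chunk = handle.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start


def aggregate_mib_per_s(results):
    """Sum per-guest throughput, given (bytes_read, elapsed_seconds) pairs."""
    mib = 1024 * 1024
    return sum(nbytes / mib / seconds for nbytes, seconds in results)
```

  Varying block_size across runs (for example below and above 64 KiB)
  gives one data point per block size, as in the attached graphs.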

  Actual results:
  Aggregate storage throughput will be lower than with xen-blkfront versions prior to 3.8 or newer than 3.12.

  Expected results:
  Aggregate storage throughput should be at least as good as with previous (or newer) versions of Ubuntu in cases where the backend doesn't support persistent grants.

  Additional info:
  Given that this is fixed in newer kernels, we ask that the relevant patches be backported to the 3.11 stable branch. According to the rules in https://www.kernel.org/doc/Documentation/stable_kernel_rules.txt, the patches would qualify on the following grounds:

  - Serious issues as reported by a user of a distribution kernel may also
     be considered if they fix a notable performance or interactivity issue.
     As these fixes are not as obvious and have a higher risk of a subtle
     regression they should only be submitted by a distribution kernel
     maintainer and include an addendum linking to a bugzilla entry if it
     exists and additional information on the user-visible impact.
