kernel-packages team mailing list archive

[Bug 1319003] [NEW] Storage performance regression when Xen backend lacks persistent-grants support

 

You have been subscribed to a public bug:

Description of problem:
When used as a Xen guest, Ubuntu 13.10 may show lower storage performance than older releases. This is due to the persistent-grants feature introduced in xen-blkfront in the Linux 3.8 kernel series. From 3.8 to 3.12 (inclusive), xen-blkfront adds an extra set of memcpy() operations regardless of whether the backend (i.e. xen-blkback, qemu, tapdisk) supports persistent grants. Many Xen dom0s lack backend persistent-grants support (for example Citrix XenServer and any Linux distribution with a kernel prior to 3.8). This was identified and fixed in the 3.13 kernel series [1], but the fix was not backported to previous LTS kernels due to the nature of the bug (performance only).
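
As a rough illustration of the affected range described above (3.8 to 3.12, inclusive), the following Python sketch classifies a kernel release string. The function names are hypothetical, and the check is only a heuristic: it looks at the upstream version number and cannot tell whether a distribution kernel carries the fix as a backport.

```python
import re

# Affected xen-blkfront range as stated in this report: 3.8 through 3.12, inclusive.
FIRST_AFFECTED = (3, 8)
LAST_AFFECTED = (3, 12)


def parse_major_minor(release):
    """Extract (major, minor) from a release string such as '3.11.0-12-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % (release,))
    return int(match.group(1)), int(match.group(2))


def in_affected_range(release):
    """Return True if the release falls in the 3.8..3.12 range (heuristic only)."""
    return FIRST_AFFECTED <= parse_major_minor(release) <= LAST_AFFECTED
```

Inside a guest, the string returned by `platform.release()` could be passed to `in_affected_range()`.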

While persistent grants reduce the stress on the Xen grant table and
allow for much better aggregate throughput (at the cost of an extra set
of memcpy() operations), adding the copy overhead when the feature is
unsupported on the backend combines the worst of both worlds. This is
particularly noticeable when intensive storage workloads are active from
many guests.

The graphs attached show storage throughput numbers for Linux guests
using kernel 3.12.9 (Graph 1) and 3.13.7 (Graph 2) running on a Citrix
XenServer development build. The server had 4 storage repositories (SRs)
with 1 Micron P320 SSD per SR (i.e. 10 VMs per SR means 40 VMs in
total). With the 3.12.9 kernel, the regression is clearly visible for
more than 2 VMs per SR and block sizes larger than 64 KiB. The workload
consisted of sequential reads on pre-allocated raw LVM logical volumes.

[1] Commits by Roger Pau Monné:
    bfe11d6de1c416cea4f3f0f35f864162063ce3fa
    fbe363c476afe8ec992d3baf682670a4bd1b6ce6

Version-Release number of selected component (if applicable):
xen-blkfront of Linux kernel 3.11

How reproducible:
This is always reproducible when an Ubuntu 13.10 guest is running on Xen and the storage backend (i.e. xen-blkback, qemu, tapdisk) does not support persistent grants.

Steps to Reproduce:
1. Install a Xen dom0 running a kernel prior to 3.8 (without persistent-grants support).
2. Install a set of Ubuntu 13.10 guests (which use kernel 3.11).
3. Measure aggregate storage throughput from all guests.

NOTE: The storage infrastructure (e.g. local SSDs, network-attached
storage) should not be a bottleneck in itself. If tested on a single
SATA disk, for example, the issue will probably be unnoticeable as the
infrastructure will be limiting response time and throughput.
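
For step 3, a minimal per-guest measurement could look like the Python sketch below. It is an illustration only, not the tool used to produce the attached graphs; the function names and defaults are hypothetical. It reads a file or block device sequentially with a chosen block size and reports MiB/s. Note that it does not bypass the guest page cache, so repeated runs on the same data may report inflated numbers unless caches are dropped or direct I/O is used instead.

```python
import time


def sequential_read_throughput(path, block_size=1024 * 1024, max_bytes=None):
    """Read `path` sequentially and return (bytes_read, MiB per second)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering; the kernel page cache
    # is still in the path (assumption: acceptable for a first-pass check).
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            want = block_size
            if max_bytes is not None:
                want = min(block_size, max_bytes - total)
            chunk = f.read(want)
            if not chunk:
                break  # end of file or device
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mib = total / (1024.0 * 1024.0)
    rate = mib / elapsed if elapsed > 0 else float("inf")
    return total, rate


def aggregate_throughput(per_guest_rates):
    """Sum per-guest MiB/s figures collected from all guests."""
    return sum(per_guest_rates)
```

Each guest would run `sequential_read_throughput()` against its own virtual disk at the same time, and the per-guest rates would then be summed with `aggregate_throughput()` to obtain the aggregate figure that the report compares across kernel versions.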

Actual results:
Aggregate storage throughput will be lower than with xen-blkfront versions prior to 3.8 or newer than 3.12.

Expected results:
Aggregate storage throughput should be at least as good as with previous (or newer) versions of Ubuntu in cases where the backend doesn't support persistent grants.

Additional info:
Given that this is fixed in newer kernels, we ask that a backport of the relevant patches to the 3.11 stable branch be requested. According to the rules in https://www.kernel.org/doc/Documentation/stable_kernel_rules.txt, the patches would be accepted on the following grounds:

- Serious issues as reported by a user of a distribution kernel may also
   be considered if they fix a notable performance or interactivity issue.
   As these fixes are not as obvious and have a higher risk of a subtle
   regression they should only be submitted by a distribution kernel
   maintainer and include an addendum linking to a bugzilla entry if it
   exists and additional information on the user-visible impact.

** Affects: linux-meta (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: bot-comment
-- 
Storage performance regression when Xen backend lacks persistent-grants support
https://bugs.launchpad.net/bugs/1319003
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-meta in Ubuntu.