yahoo-eng-team team mailing list archive

[Bug 1097905] Re: Poor VM disk performance on host using LVM Mirroring

 

I don't think this is something that we can deal with in OpenStack. This
is likely a kernel/LVM issue.

** Changed in: nova
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1097905

Title:
  Poor VM disk performance on host using LVM Mirroring

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  When running an OpenStack VM on a host machine that uses LVM Mirroring
  for the filesystem that is hosting /var/lib/nova, performance can be
  1/10th of native hard drive speeds due to some latency issue with LVM
  Mirroring.

  To reproduce the problem:

  1. Install OpenStack controller/compute node on a single machine. Ensure that the root filesystem, which hosts /var/lib/nova/, is backed by an LVM Mirror. The setup I had was 4 drives, 1 master and 3 mirrors.
  2. Run dd if=/dev/zero of=/tmp/test.dat bs=1G count=1 oflag=direct on the host and ensure near-native disk write speeds. My test showed ~124MB/s.
  3. Start an OpenStack VM and run dd if=/dev/zero of=/tmp/test.dat bs=1G count=1 oflag=direct in the VM and you should get terrible disk write speeds. My test showed ~13MB/s.
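
  The comparison in steps 2 and 3 can be scripted. The following is a
  minimal sketch, not part of the original report: the helper names
  dd_rate_mbs and slowdown are hypothetical, and it assumes GNU dd's
  summary line puts the byte count first and the elapsed seconds right
  after "copied,".

```shell
#!/bin/sh
# Hypothetical helpers for comparing host and VM dd results.

# Read a GNU dd summary line on stdin and print throughput in MB/s
# (1 MB = 1,000,000 bytes), computed from the byte count and elapsed time.
dd_rate_mbs() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "copied,") {
                printf "%.1f\n", $1 / $(i + 1) / 1000000
                exit
            }
        }
    }'
}

# Print how many times slower the VM is than the host.
# $1 = host MB/s, $2 = VM MB/s
slowdown() {
    awk -v host="$1" -v vm="$2" 'BEGIN { printf "%.1f\n", host / vm }'
}
```

  For example, piping the dd command through "2>&1 | tail -n 1 |
  dd_rate_mbs" prints the measured rate, and "slowdown 124 13" gives the
  ratio for the figures above (about 9.5, i.e. roughly a factor of ten).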

  To solve the problem:

  1. On the host machine, do lvconvert -m0 for the root filesystem. Ensure near-native disk write speeds by running the dd command above.
  2. On the VM, run the dd command above. Disk write speeds should be at least 50% of the host's native disk write speeds.
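
  A sketch of these two steps, assuming a hypothetical volume group and
  logical volume path such as vg0/root (the report does not name them).
  remove_mirror and meets_half are illustrative helper names; set
  DRY_RUN=1 to print the lvconvert command instead of running it.

```shell
#!/bin/sh
# Drop all mirror copies from a logical volume (lvconvert -m0).
# $1 = VG/LV path, e.g. vg0/root (hypothetical name).
# With DRY_RUN set to a non-empty value, only print the command.
remove_mirror() {
    run=${DRY_RUN:+echo}
    $run lvconvert -m0 "$1"
}

# Succeed if the VM reaches at least 50% of the host's write speed.
# $1 = host MB/s, $2 = VM MB/s
meets_half() {
    awk -v host="$1" -v vm="$2" 'BEGIN { exit !(vm >= host / 2) }'
}
```

  With the figures from this report, "meets_half 128 65" succeeds and
  "meets_half 124 13" fails.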

  This is most likely a libvirt or LVM2 issue, but it only surfaced when
  using OpenStack and LVM2 Mirroring together.

  Other important configuration details:

  VM host:  Ubuntu 12.04.1 LTS (GNU/Linux 3.2.0-35-generic x86_64)

   # dpkg -l "*nova*" | grep nova
  ii  nova-ajax-console-proxy          2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - AJAX console proxy - transitional package
  ii  nova-api                         2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - API frontend
  ii  nova-cert                        2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - certificate management
  ii  nova-common                      2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - common files
  ii  nova-compute                     2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - compute node
  ii  nova-compute-kvm                 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - compute node (KVM)
  ii  nova-consoleauth                 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - Console Authenticator
  ii  nova-doc                         2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - documentation
  ii  nova-network                     2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - Network manager
  ii  nova-scheduler                   2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - virtual machine scheduler
  ii  nova-volume                      2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - storage
  ii  python-nova                      2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute Python libraries
  ii  python-novaclient                2012.1-0ubuntu1                            client library for OpenStack Compute API

  # dpkg -l "*lvm*"
  ii  lvm2                             2.02.66-4ubuntu7.1                         The Linux Logical Volume Manager

  # dpkg -l "*virt*" | grep libvirt
  ii  libvirt-bin                      0.9.8-2ubuntu17.4                          programs for the libvirt library
  ii  libvirt0                         0.9.8-2ubuntu17.4                          library for interfacing with different virtualization systems
  ii  python-libvirt                   0.9.8-2ubuntu17.4                          libvirt Python bindings

  Here's the question I asked on the #openstack IRC channel, and nobody
  seemed to know the answer to it:

  """
  I'm having virtio disk read/write slowness issues, and I'm trying to debug whether libvirt is set up correctly.
  We are using libvirt via OpenStack. We have two OpenStack setups that are showing the same poor read performance.
  We're running Ubuntu 12.04 for the hosts and the VMs, OpenStack Essex. From what I can tell, the VMs are using virtio. Host is using ext4, VMs are using ext4. Raw disk speed for the host machine is 120MB/s, but VMs top out at 10-20MB/s.
  The command I'm using to benchmark is dd if=/dev/zero of=/tmp/test.dat bs=1G count=1 oflag=direct
  I have double-checked the libvirt.xml files to ensure that they have the appropriate entries for <driver type='qcow2' cache='none'/> and <target dev='vda' bus='virtio'/>
  The VM kernel log says "Booting paravirtualized kernel on KVM", and has the following VirtIO drivers via lspci: 00:03.0 Ethernet controller: Red Hat, Inc Virtio network device, 00:04.0 SCSI storage controller: Red Hat, Inc Virtio block device, 00:05.0 RAM memory: Red Hat, Inc Virtio memory balloon
  I have also booted just a plain ol' cirros image via 'kvm -m 1024 -drive file=cirros.img,if=virtio,index=0 -boot c -net nic -net user -nographic -vnc :0' and gotten dismal read/write speeds of 1-2MB/s.
  So, this leads me to believe that libvirt may be set up incorrectly, but I don't know where to start looking for issues... anyone on here have any pointers?
  """
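
  The libvirt.xml check mentioned in the IRC question can be automated
  with grep. This is a sketch only: check_virtio_disk is a hypothetical
  helper, and the path to the instance's libvirt.xml is passed in rather
  than assumed.

```shell
#!/bin/sh
# Succeed if a libvirt domain XML file contains both the cache='none'
# driver setting and a virtio disk target. This is a plain text match on
# single-quoted attributes, not a real XML parse.
# $1 = path to the instance's libvirt.xml
check_virtio_disk() {
    grep -q "cache='none'" "$1" && grep -q "bus='virtio'" "$1"
}
```

  A non-zero exit status means at least one of the two settings was not
  found in the file as written.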

  Further testing showed the host system could do 120MB/s of throughput,
  but the disk latencies were quite high (via bonnie++):

  (LVM Mirrored)
  Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
  Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
  Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
  production-1 31G  1004  88 120741  23 56172  18  3479  62 141812  20  72.7   2
  Latency             22025us   10328ms   14211ms     157ms     233ms    1104ms                      <------------ !!!HIGH LATENCIES!!!
  Version  1.96       ------Sequential Create------ --------Random Create--------
  production-1     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
                files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                   16  5570   5 +++++ +++ 11761   9 20993  16 +++++ +++ 18943  13
  Latency               528us    1127us     232ms     522us      59us     735us
  1.96,1.96,production-1,1,1357612517,31G,,1004,88,120741,23,56172,18,3479,62,141812,20,72.7,2,16,,,,,5570,5,+++++,+++,11761,9,20993,16,+++++,+++,18943,13,22025us,10328ms,14211ms,157ms,233ms,1104ms,528us,1127us,232ms,522us,59us,735us
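
  To pick out the slow operations without reading the table by eye, the
  latency fields in bonnie++'s CSV line can be filtered. A minimal
  sketch: slow_latencies is a hypothetical helper, the 1000 ms threshold
  used below is an arbitrary choice, and only the "us" and "ms" suffixes
  seen in this output are handled.

```shell
#!/bin/sh
# Read bonnie++ CSV lines on stdin and print every latency field that
# exceeds the given threshold.
# $1 = threshold in milliseconds
slow_latencies() {
    awk -F, -v limit="$1" '{
        for (i = 1; i <= NF; i++) {
            # awk converts "22025us" to 22025 by reading the numeric prefix
            if ($i ~ /^[0-9]+us$/) { ms = $i / 1000 }
            else if ($i ~ /^[0-9]+ms$/) { ms = $i + 0 }
            else { continue }
            if (ms > limit) print $i
        }
    }'
}
```

  Piping the CSV line above through "slow_latencies 1000" prints
  10328ms, 14211ms and 1104ms, which are the block write, rewrite and
  random seek latencies.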

  So, I moved /var/lib/nova to a ramdisk and that helped performance
  tremendously (540MB/s throughput from inside the VMs). I then mounted
  a simple disk with ext3 on /var/lib/nova and that showed good
  throughput as well (128MB/s on the host, 65MB/s on the VM). I then
  tested drive + LVM + ext3 (same good performance). That left LVM
  mirroring on the main OpenStack VM host as the only culprit. I removed
  LVM mirroring via lvconvert -m0 and disk throughput for all of the VMs
  jumped from 13MB/s up to 65MB/s - 85MB/s. Note that I only had one VM
  running at a time to ensure that this wasn't disk contention between
  multiple VMs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1097905/+subscriptions