yahoo-eng-team team mailing list archive
Message #00949
[Bug 1097905] Re: Poor VM disk performance on host using LVM Mirroring
I don't think this is something that we can deal with in OpenStack. This
is likely a kernel/LVM issue.
** Changed in: nova
Status: New => Invalid
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1097905
Title:
Poor VM disk performance on host using LVM Mirroring
Status in OpenStack Compute (Nova):
Invalid
Bug description:
When running an OpenStack VM on a host machine that uses LVM Mirroring
for the filesystem that is hosting /var/lib/nova, performance can be
1/10th of native hard drive speeds due to some latency issue with LVM
Mirroring.
To reproduce the problem:
1. Install OpenStack controller/compute node on a single machine. Ensure that the root filesystem, which hosts /var/lib/nova/, is backed by an LVM Mirror. The setup I had was 4 drives, 1 master and 3 mirrors.
2. Run dd if=/dev/zero of=/tmp/test.dat bs=1G count=1 oflag=direct on the host and ensure near-native disk write speeds. My test showed ~124MB/s.
3. Start an OpenStack VM and run dd if=/dev/zero of=/tmp/test.dat bs=1G count=1 oflag=direct in the VM and you should get terrible disk write speeds. My test showed ~13MB/s.
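The throughput figures in the steps above come straight from dd's summary line on stderr. A small wrapper makes the host-vs-VM comparison repeatable; this is a sketch assuming GNU coreutils dd (which prints e.g. "... 124 MB/s") and a throwaway target path:

```shell
# Sketch of the repro benchmark: run the same O_DIRECT write test
# used in the steps above and print only the throughput figure.
# Assumes GNU dd, whose summary line ends in e.g. "... 124 MB/s".
run_dd_bench() {
  target="$1"
  dd if=/dev/zero of="$target" bs=1G count=1 oflag=direct 2>&1 \
    | awk '/copied/ {print $(NF-1), $NF}'   # last two fields: "124 MB/s"
  rm -f "$target"
}

# Run once on the host, once inside the VM, and compare:
#   run_dd_bench /tmp/test.dat
```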
To solve the problem:
1. On the host machine, do lvconvert -m0 for the root filesystem. Ensure near-native disk write speeds by running the dd command above.
2. On the VM, run the dd command above. Disk speeds should be at least 50% or more of the host's native disk write speeds.
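Before running lvconvert -m0 it helps to confirm which logical volumes are actually mirrored. A sketch, assuming the lv_attr flags reported by lvs (first character 'm' for mirrored, 'r' for RAID1); volume group and LV names will differ per system:

```shell
# List LVs whose attribute flags mark them as mirrored/RAID,
# i.e. the candidates for "lvconvert -m0 <vg>/<lv>".
lvs --noheadings -o vg_name,lv_name,lv_attr | while read -r vg lv attr; do
  case "$attr" in
    m*|r*) echo "mirrored: $vg/$lv  ->  lvconvert -m0 $vg/$lv" ;;
  esac
done
```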
This is most likely a libvirt or LVM2 issue, but it only surfaced when
using OpenStack and LVM2 Mirroring together.
Other important configuration details:
Host and VM OS: Ubuntu 12.04.1 LTS (GNU/Linux 3.2.0-35-generic x86_64)
# dpkg -l "*nova*" | grep nova
ii nova-ajax-console-proxy 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - AJAX console proxy - transitional package
ii nova-api 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - API frontend
ii nova-cert 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - certificate management
ii nova-common 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - common files
ii nova-compute 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - compute node
ii nova-compute-kvm 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - compute node (KVM)
ii nova-consoleauth 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - Console Authenticator
ii nova-doc 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - documentation
ii nova-network 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - Network manager
ii nova-scheduler 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - virtual machine scheduler
ii nova-volume 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute - storage
ii python-nova 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1 OpenStack Compute Python libraries
ii python-novaclient 2012.1-0ubuntu1 client library for OpenStack Compute API
# dpkg -l "*lvm*"
ii lvm2 2.02.66-4ubuntu7.1 The Linux Logical Volume Manager
# dpkg -l "*virt*" | grep libvirt
ii libvirt-bin 0.9.8-2ubuntu17.4 programs for the libvirt library
ii libvirt0 0.9.8-2ubuntu17.4 library for interfacing with different virtualization systems
ii python-libvirt 0.9.8-2ubuntu17.4 libvirt Python bindings
Here's the question I asked on the #openstack IRC channel, and nobody
seemed to know the answer to it:
"""
I'm having virtio disk read/write slowness issues, and I'm trying to debug whether libvirt is set up correctly.
We are using libvirt via OpenStack. We have two OpenStack setups that are showing the same poor read performance.
We're running Ubuntu 12.04 for the hosts and the VMs, OpenStack Essex. From what I can tell, the VMs are using virtio. Host is using ext4, VMs are using ext4. Raw disk speed for the host machine is 120MB/s, but VMs top out at 10-20MB/s.
The command I'm using to benchmark is dd if=/dev/zero of=/tmp/test.dat bs=1G count=1 oflag=direct
I have double-checked the libvirt.xml files to ensure that they have the appropriate entries for <driver type='qcow2' cache='none'/> and <target dev='vda' bus='virtio'/>
The VM kernel log says "Booting paravirtualized kernel on KVM", and has the following VirtIO drivers via lspci: 00:03.0 Ethernet controller: Red Hat, Inc Virtio network device, 00:04.0 SCSI storage controller: Red Hat, Inc Virtio block device, 00:05.0 RAM memory: Red Hat, Inc Virtio memory balloon
I have also booted just a plain ol' cirros image via 'kvm -m 1024 -drive file=cirros.img,if=virtio,index=0 -boot c -net nic -net user -nographic -vnc :0' and gotten dismal read/write speeds (1-2MB/s)
So, this leads me to believe that libvirt may be set up incorrectly, but I don't know where to start looking for issues... anyone on here have any pointers?
"""
Further testing showed the host system could do 120MB/s of throughput,
but the disk latencies were quite high (via bonnie++):
(LVM Mirrored)
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
production-1    31G  1004  88 120741  23 56172  18  3479  62 141812  20  72.7   2
Latency             22025us   10328ms   14211ms     157ms     233ms    1104ms  <------------ !!!HIGH LATENCIES!!!
Version  1.96       ------Sequential Create------ --------Random Create--------
production-1        -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  5570   5 +++++ +++ 11761   9 20993  16 +++++ +++ 18943  13
Latency               528us    1127us     232ms     522us      59us     735us
1.96,1.96,production-1,1,1357612517,31G,,1004,88,120741,23,56172,18,3479,62,141812,20,72.7,2,16,,,,,5570,5,+++++,+++,11761,9,20993,16,+++++,+++,18943,13,22025us,10328ms,14211ms,157ms,233ms,1104ms,528us,1127us,232ms,522us,59us,735us
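The machine-readable line above carries the same numbers as the table: in bonnie++ 1.96 CSV output the final twelve comma-separated fields are the six throughput and six metadata latencies. A sketch pulling out the worst offender here, the sequential block-write latency, which sits ten fields from the end:

```shell
# Extract the sequential block-write latency from bonnie++'s CSV line.
# The last 12 fields are latencies; block write is at NF-10.
# (csv shortened here; awk indexes relative to the end, so the
# leading fields don't matter)
csv='1.96,1.96,production-1,...,22025us,10328ms,14211ms,157ms,233ms,1104ms,528us,1127us,232ms,522us,59us,735us'
echo "$csv" | awk -F, '{print "seq block-write latency:", $(NF-10)}'
```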
So, I moved /var/lib/nova to a ramdisk and that helped performance
tremendously (540MB/s throughput from inside the VMs). I then mounted
a simple disk with ext3 on /var/lib/nova and that showed good
throughput as well (128MB/s on the host, 65MB/s on the VM). I then
tested drive + LVM + ext3 (same good performance). That left LVM
mirroring on the main OpenStack VM host as the only culprit. I removed
LVM mirroring via lvconvert -m0 and disk throughput for all of the VMs
jumped from 13MB/s up to 65MB/s - 85MB/s. Note that I only had one VM
running at a time to ensure that this wasn't disk contention between
multiple VMs.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1097905/+subscriptions