yahoo-eng-team team mailing list archive

[Bug 1303690] [NEW] nova live-migration slow when using volumes

 

Public bug reported:

I have block live migration configured in my environment (no shared storage) and it is very fast for instances which don't use volumes. An instance with a 2.5G disk image takes ~40 seconds to migrate to a different host.
When I migrate instances which do use Ceph-backed volumes, migration takes much longer and the duration depends on the volume size. For example, migrating an instance with a 1G volume takes around 1 minute, 10G takes ~8 minutes, and with 50G I had to wait nearly 50 minutes for the process to complete. It completes without errors every time; it is just very slow.
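
The times above can be turned into an average rate for comparison with link speed. This is a rough sketch only: it assumes "G" means GiB, treats the reported times as exact, and ignores the time spent on the disk image itself, so the figures are approximate. The function and variable names are illustrative.

```python
# Average rate implied by the reported migration times.
# Assumptions: sizes are GiB, times are the approximate totals from the report.
GIB = 1024 ** 3

def effective_mbit_per_s(size_gib, seconds):
    """Average rate in Mbit/s needed to move size_gib in the given time."""
    return size_gib * GIB * 8 / seconds / 1e6

# (volume size in G, reported migration time in seconds)
reported = [(1, 60), (10, 8 * 60), (50, 50 * 60)]
rates = {size: effective_mbit_per_s(size, secs) for size, secs in reported}

for size, rate in rates.items():
    print(f"{size:>3}G volume: ~{rate:.0f} Mbit/s")
```

All three cases come out in the same range of roughly 140 to 180 Mbit/s, which fits the observation that migration time grows more or less linearly with volume size.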

I was looking at the network traffic during migration and it looks a bit
strange. Let's say I am migrating an instance with a 50G volume from
compute node A to compute node B, and Ceph is running on hosts X, Y and Z.

I initiate live migration and, as expected, there is a lot of traffic going from host A to B. This lasts less than 1 minute (disk image transfer). Then traffic from A to B drops to ~200Mbit/s and stays at this level until the migration is completed.
After the initial traffic burst between hosts A and B, host B starts sending data to the Ceph nodes X, Y and Z. I can see between 40 and 80Mbit/s going from host B to each of the Ceph nodes. This continues for ~50 minutes, then the migration completes and network traffic goes idle.
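
A rough sanity check on these rates, assuming the volume is 50 GiB and the observed rates are steady (both are assumptions, and the names below are illustrative): the combined rate from B to the three Ceph nodes brackets the A to B rate, and copying the whole volume at ~200Mbit/s takes the same order of time as the migration did. This is consistent with, but does not prove, the full volume contents being copied from A through B back into Ceph.

```python
GIB = 1024 ** 3  # assumption: "G" in the report means GiB

def transfer_minutes(size_gib, mbit_per_s):
    """Minutes needed to move size_gib at a steady rate in Mbit/s."""
    return size_gib * GIB * 8 / (mbit_per_s * 1e6) / 60

a_to_b = 200                   # Mbit/s, A -> B after the initial burst
ceph_total = (3 * 40, 3 * 80)  # Mbit/s, B -> X, Y and Z combined (low, high)

minutes_50g = transfer_minutes(50, a_to_b)
print(f"B -> Ceph combined: {ceph_total[0]} to {ceph_total[1]} Mbit/s")
print(f"50G at {a_to_b} Mbit/s: ~{minutes_50g:.0f} minutes")
```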

Every time I tried, the migration eventually completed fine, but for
instances with, let's say, a 200G volume it could take nearly 4 hours to
complete.

I am using Havana on Precise.

Compute nodes:
ii  nova-common                      1:2013.2.2-0ubuntu1~cloud0
ii  nova-compute                     1:2013.2.2-0ubuntu1~cloud0
ii  nova-compute-kvm                 1:2013.2.2-0ubuntu1~cloud0

Ceph:
ii  ceph                             0.67.4-0ubuntu2.2~cloud0
ii  ceph-common                      0.67.4-0ubuntu2.2~cloud0
ii  libcephfs1                       0.67.4-0ubuntu2.2~cloud0

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: canonical-is

** Tags added: canonical-is

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1303690
