
yahoo-eng-team team mailing list archive

[Bug 1706083] Re: Post-migration, Cinder volumes lose disk cache value, resulting in I/O latency

 

** Also affects: nova/newton
   Importance: Undecided
       Status: New

** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova/newton
   Importance: Undecided => Medium

** Changed in: nova/ocata
   Importance: Undecided => Medium

** Changed in: nova/ocata
       Status: New => Confirmed

** Changed in: nova/newton
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1706083

Title:
  Post-migration, Cinder volumes lose disk cache value, resulting in I/O
  latency

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) newton series:
  Confirmed
Status in OpenStack Compute (nova) ocata series:
  Confirmed

Bug description:
  Description
  ===========

  [This was initially reported by a Red Hat OSP customer.]

  The I/O latency of a Cinder volume attached to an instance increases
  significantly after the instance is live-migrated, and it stays
  elevated until the VM is stopped and started again. [The VM is booted
  from the Cinder volume.]

  This is not the case when using a disk from the Nova store backend
  [i.e. without a Cinder volume] -- or at least the latency difference
  is not as significant after a live migration.

  The storage backend is Ceph 2.0.
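
  The bug title refers to the `cache` attribute on the volume's
  `<driver>` element in the libvirt domain XML. One way to check the
  suspected cause (not part of the original report; the instance name
  is a placeholder) is to dump the domain XML on the source Compute
  host before the migration and on the destination host afterwards,
  and compare:

      [compute]$ sudo virsh dumpxml instance-0000000a | grep "cache="
      # e.g. before migration:
      #   <driver name='qemu' type='raw' cache='writeback'/>
      # If the cache attribute is missing or changed after migration,
      # the volume is running with a different (or default) cache mode.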

  
  How reproducible: Consistently

  
  Steps to Reproduce
  ==================

  (0) Both the Nova instances and Cinder volumes are located on Ceph

  (1) Create a Nova instance with a Cinder volume attached to it

  (2) Live migrate it to a target Compute node

  (3) Run `ioping` (`ioping -c 10 .`) on the Cinder volume.
      Alternatively, run another I/O benchmark, such as `fio` with
      `direct=1` (which uses non-buffered I/O), as a sanity check to
      get a second opinion on latency; a sketch follows this list.
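
  A sketch of steps (1)-(3) end to end, using the OpenStack client and
  `fio` (volume size, flavor, image, names, and the target host are
  placeholders; the fio parameters are illustrative, not from the
  original report):

      [client]$ openstack volume create --size 10 --image cirros repro-vol
      [client]$ openstack server create --flavor m1.small --volume repro-vol repro-vm
      [client]$ nova live-migration repro-vm target-compute

      # Then, inside the guest, on the volume's mount point:
      [guest]$ fio --name=latency-check --directory=. --size=128m \
                   --rw=randread --bs=4k --direct=1 --iodepth=1 \
                   --runtime=30 --time_based
      # direct=1 bypasses the guest page cache, so the reported "lat"
      # statistics reflect the actual I/O path; compare them before
      # and after the live migration.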

  
  Actual result
  =============

  Before live migration, `ioping` output on the Cinder volume attached
  to the Nova instance:

      [guest]$ sudo ioping -c 10 .
      4 KiB <<< . (xfs /dev/sda1): request=1 time=98.0 us (warmup)
      4 KiB <<< . (xfs /dev/sda1): request=2 time=135.6 us
      4 KiB <<< . (xfs /dev/sda1): request=3 time=155.5 us
      4 KiB <<< . (xfs /dev/sda1): request=4 time=161.7 us
      4 KiB <<< . (xfs /dev/sda1): request=5 time=148.4 us
      4 KiB <<< . (xfs /dev/sda1): request=6 time=354.3 us
      4 KiB <<< . (xfs /dev/sda1): request=7 time=138.0 us (fast)
      4 KiB <<< . (xfs /dev/sda1): request=8 time=150.7 us
      4 KiB <<< . (xfs /dev/sda1): request=9 time=149.6 us
      4 KiB <<< . (xfs /dev/sda1): request=10 time=138.6 us (fast)
      
      --- . (xfs /dev/sda1) ioping statistics ---
      9 requests completed in 1.53 ms, 36 KiB read, 5.87 k iops, 22.9 MiB/s
      generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
      min/avg/max/mdev = 135.6 us / 170.3 us / 354.3 us / 65.6 us

  
  After live migration, `ioping` output on the same Cinder volume:

      [guest]$ sudo ioping -c 10 .
      4 KiB <<< . (xfs /dev/sda1): request=1 time=1.03 ms (warmup)
      4 KiB <<< . (xfs /dev/sda1): request=2 time=948.6 us
      4 KiB <<< . (xfs /dev/sda1): request=3 time=955.7 us
      4 KiB <<< . (xfs /dev/sda1): request=4 time=920.5 us
      4 KiB <<< . (xfs /dev/sda1): request=5 time=1.03 ms
      4 KiB <<< . (xfs /dev/sda1): request=6 time=838.2 us
      4 KiB <<< . (xfs /dev/sda1): request=7 time=1.13 ms (slow)
      4 KiB <<< . (xfs /dev/sda1): request=8 time=868.6 us
      4 KiB <<< . (xfs /dev/sda1): request=9 time=985.2 us
      4 KiB <<< . (xfs /dev/sda1): request=10 time=936.6 us
      
      --- . (xfs /dev/sda1) ioping statistics ---
      9 requests completed in 8.61 ms, 36 KiB read, 1.04 k iops, 4.08 MiB/s
      generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
      min/avg/max/mdev = 838.2 us / 956.9 us / 1.13 ms / 81.0 us

  This goes back to an average of ~200 us after shutting down and
  starting up the instance.
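
  A sketch of that workaround with the OpenStack client (the server
  name is a placeholder):

      [client]$ openstack server stop repro-vm
      [client]$ openstack server start repro-vm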

  
  Expected result
  ===============

  No significant increase in I/O latency on Cinder volumes after a
  live migration.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1706083/+subscriptions

