
yahoo-eng-team team mailing list archive

[Bug 1356552] Re: Live migration: "Disk of instance is too large" when using a volume stored on NFS

 

** Changed in: nova/juno
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1356552

Title:
  Live migration: "Disk of instance is too large" when using a volume
  stored on NFS

Status in OpenStack Compute (Nova):
  Fix Released
Status in OpenStack Compute (nova) juno series:
  Fix Released

Bug description:
  When live-migrating an instance that has a Cinder volume (stored on
  NFS) attached, the operation fails if the volume size is bigger than
  the space left on the destination node. This should not happen, since
  this volume does not have to be migrated. Here is how to reproduce the
  bug on a cluster with one control node and two compute nodes, using
  the NFS backend of Cinder.

  
  $ nova boot --flavor m1.tiny --image 173241e-babb-45c7-a35f-b9b62e8ced78 test_vm
  ...

  $ nova volume-create --display-name test_volume 100
  ...
  | id                  | 6b9e1d03-3f53-4454-add9-a8c32d82c7e6 |
  ...

  
  $ nova volume-attach test_vm 6b9e1d03-3f53-4454-add9-a8c32d82c7e6 auto
  ...

  $ nova show test_vm | grep OS-EXT-SRV-ATTR:host
  | OS-EXT-SRV-ATTR:host                 | t1-cpunode0                                                |

  $ nova service-list | grep nova-compute
  | nova-compute     | t1-cpunode0 | nova     | enabled | up    | 2014-08-13T19:14:40.000000 | -               |
  | nova-compute     | t1-cpunode1 | nova     | enabled | up    | 2014-08-13T19:14:41.000000 | -               |

  Now, let's say I want to live-migrate test_vm to t1-cpunode1:

  $ nova live-migration --block-migrate test_vm t1-cpunode1
  ERROR: Migration pre-check error: Unable to migrate a0d9c991-7931-4710-8684-282b1df4cca6: Disk of instance is too large(available on destination host:46170898432 < need:108447924224) (HTTP 400) (Request-ID: req-b4f00867-df51-44be-8f97-577be385d536)
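
  As a sanity check on the numbers in that error message, the short sketch
  below converts them to GiB. It assumes that the size given to
  `nova volume-create` is expressed in GiB; the variable names are only
  illustrative and do not come from nova.

```python
GIB = 1024 ** 3  # bytes per GiB

needed = 108447924224     # "need" value from the error message, in bytes
available = 46170898432   # "available on destination host" value, in bytes
volume_gib = 100          # size passed to `nova volume-create` (assumed GiB)

volume_bytes = volume_gib * GIB
remainder = needed - volume_bytes  # what is left once the volume is removed

print(needed / GIB)      # 101.0
print(available / GIB)   # 43.0
print(remainder / GIB)   # 1.0
```

  The required size is exactly 101 GiB: the 100 GiB volume plus 1 GiB,
  which would match the root disk of the stock m1.tiny flavor (an
  assumption, since the flavor definition is not shown here). This
  suggests that the attached volume is being counted in full.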

  
  In nova/virt/libvirt/driver.py, _assert_dest_node_has_enough_disk() calls get_instance_disk_info(), which in turn calls _get_instance_disk_info(). In this method, we see that volume devices are supposed to be skipped when computing the amount of space needed to migrate an instance:

  ...
              if disk_type != 'file':
                  LOG.debug('skipping %s since it looks like volume', path)
                  continue

              if target in volume_devices:
                  LOG.debug('skipping disk %(path)s (%(target)s) as it is a '
                            'volume', {'path': path, 'target': target})
                  continue
  ...

  But for some reason, neither of these conditions is met for this volume.

  If we ssh into the compute node where the instance is currently running,
  we can get more information about it:

  $ virsh dumpxml 11
  ...
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='none'/>
        <source file='/var/lib/nova/mnt/84751739e625d0ea9609a65dd9c0a6f1/volume-6b9e1d03-3f53-4454-add9-a8c32d82c7e6'/>
        <target dev='vdb' bus='virtio'/>
        <serial>6b9e1d03-3f53-4454-add9-a8c32d82c7e6</serial>
        <alias name='virtio-disk1'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
      </disk>
  ...

  The disk type is "file", which might explain why this volume is not
  skipped by the first check in the code snippet shown above. When we use
  the default Cinder backend, we get something like:

      <disk type='block' device='disk'>
        <driver name='qemu' type='raw' cache='none'/>
        <source dev='/dev/disk/by-path/ip-192.168.200.250:3260-iscsi-iqn.2010-10.org.openstack:volume-47ecc6a6-8af9-4011-a53f-14a71d14f50b-lun-1'/>
        <target dev='vdb' bus='virtio'/>
        <serial>47ecc6a6-8af9-4011-a53f-14a71d14f50b</serial>
        <alias name='virtio-disk1'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
      </disk>
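
  To make the difference concrete, here is a minimal sketch that applies
  the two checks quoted earlier to trimmed copies of both disk
  definitions. It is not the nova implementation: is_skipped() is a
  hypothetical helper, the XML is reduced to the elements the checks
  read, and volume_devices is passed in by hand because the report does
  not show how nova populates it.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two <disk> elements from the virsh output.
NFS_DISK = """<disk type='file' device='disk'>
  <source file='/var/lib/nova/mnt/84751739e625d0ea9609a65dd9c0a6f1/volume-6b9e1d03-3f53-4454-add9-a8c32d82c7e6'/>
  <target dev='vdb' bus='virtio'/>
</disk>"""

ISCSI_DISK = """<disk type='block' device='disk'>
  <source dev='/dev/disk/by-path/ip-192.168.200.250:3260-iscsi-iqn.2010-10.org.openstack:volume-47ecc6a6-8af9-4011-a53f-14a71d14f50b-lun-1'/>
  <target dev='vdb' bus='virtio'/>
</disk>"""


def is_skipped(disk_xml, volume_devices):
    """Hypothetical helper: True if either quoted check would skip the disk."""
    disk = ET.fromstring(disk_xml)
    disk_type = disk.get('type')
    target = disk.find('target').get('dev')

    # First check: anything that is not a plain file is treated as a volume.
    if disk_type != 'file':
        return True
    # Second check: the target device is listed as a known volume device.
    if target in volume_devices:
        return True
    return False


print(is_skipped(ISCSI_DISK, set()))    # True: type is 'block'
print(is_skipped(NFS_DISK, set()))      # False: type is 'file', no known volumes
print(is_skipped(NFS_DISK, {'vdb'}))    # True: caught by the second check
```

  Under these assumptions, an NFS-backed volume can only be skipped by
  the second check, so it is counted whenever volume_devices does not
  contain its target device.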

  
  I think that the code in LibvirtNFSVolumeDriver.connect_volume() might be wrong: conf.source_type should be set to something other than "file" (and some other changes might be needed), but I must admit I'm not a libvirt expert.

  Any thoughts?

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1356552/+subscriptions

