← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1770640] Re: live block migration of instance with vfat config drive fails

 

Reviewed:  https://review.openstack.org/567899
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=016986f4706e881fed16c85c8790af4d26a7c351
Submitter: Zuul
Branch:    master

commit 016986f4706e881fed16c85c8790af4d26a7c351
Author: Lee Yarwood <lyarwood@xxxxxxxxxx>
Date:   Fri May 11 15:04:48 2018 +0100

    libvirt: Report the virtual size of RAW disks
    
    If642e51a4e186833349a8e30b04224a3687f5594 started to correctly report
    the actual size of preallocated file based disks but missed that this
    value was later used as the virtual disk size for RAW disks.
    
    This is an issue as nova.virt.libvirt.utils.create_image creates these
    disks as sparse files with a reported actual size much smaller than the
    virtual size. During block based LM this then results in disks
    on the destination being created with a much smaller virtual size than
    the disk should have leading to errors during the transfer.
    
    Closes-Bug: #1770640
    Change-Id: I464bc2b88123a012cd12213beac4b572c3c20a56


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1770640

Title:
  live block migration of instance with vfat config drive fails

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  In Progress
Status in OpenStack Compute (nova) pike series:
  In Progress
Status in OpenStack Compute (nova) queens series:
  In Progress

Bug description:
  It seems like the following commit broke block live migrations of
  instances with vfat config drive --
  https://review.openstack.org/561704. Reproduced the problem on pike
  with that patch backport, and I think master is affected as well.

  The exception happening is:

  May 10 17:01:48 cmp02 nova-compute[4443]: Traceback (most recent call last):
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/hubs/poll.py", line 115, in wait
  May 10 17:01:48 cmp02 nova-compute[4443]:     listener.cb(fileno)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
  May 10 17:01:48 cmp02 nova-compute[4443]:     result = function(*args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 1004, in context_wrapper
  May 10 17:01:48 cmp02 nova-compute[4443]:     return func(*args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6568, in _live_migration_operation
  May 10 17:01:48 cmp02 nova-compute[4443]:     LOG.error("Live Migration failure: %s", e, instance=instance)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
  May 10 17:01:48 cmp02 nova-compute[4443]:     self.force_reraise()
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
  May 10 17:01:48 cmp02 nova-compute[4443]:     six.reraise(self.type_, self.value, self.tb)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6562, in _live_migration_operation
  May 10 17:01:48 cmp02 nova-compute[4443]:     bandwidth=CONF.libvirt.live_migration_bandwidth)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 670, in migrate
  May 10 17:01:48 cmp02 nova-compute[4443]:     destination, params=params, flags=flags)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
  May 10 17:01:48 cmp02 nova-compute[4443]:     result = proxy_call(self._autowrap, f, *args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
  May 10 17:01:48 cmp02 nova-compute[4443]:     rv = execute(f, *args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
  May 10 17:01:48 cmp02 nova-compute[4443]:     six.reraise(c, e, tb)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
  May 10 17:01:48 cmp02 nova-compute[4443]:     rv = meth(*args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1702, in migrateToURI3
  May 10 17:01:48 cmp02 nova-compute[4443]:     if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
  May 10 17:01:48 cmp02 nova-compute[4443]: libvirtError: operation failed: migration of disk vdb failed: Input/output error
  May 10 17:01:48 cmp02 nova-compute[4443]: Removing descriptor: 29

  (vdb is a vfat config drive)

  after turning on the info logging in qemu seen the following in the
  syslog:

  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.204+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk0","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-208"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.204+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk0","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-208"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=228 ret=228 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.296+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk1","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk1","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-209"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.296+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk1","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk1","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-209"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=228 ret=228 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.388+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 387891}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive-virtio-disk1", "operation": "write", "action": "report"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.388+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 388730}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-virtio-disk0", "len": 1507328, "offset": 1507328, "speed": 9223372036853727232, "type": "mirror"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.392+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 392754}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk1", "len": 67108864, "offset": 196608, "speed": 9223372036853727232, "type": "mirror", "error": "Input/output error"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.393+0000: 10318: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7f6fe801b920 reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": false, "len": 1507328, "offset": 1507328, "paused": false, "speed": 9223372036853727232, "ready": true, "type": "mirror"}], "id": "libvirt-210"}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.395+0000: 10323: error : qemuMigrationDriveMirrorReady:624 : operation failed: migration of disk vdb failed: Input/output error
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.395+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-211"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.395+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-211"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=95 ret=95 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.398+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk1"},"id":"libvirt-212"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.398+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk1"},"id":"libvirt-212"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=95 ret=95 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.402+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 401820}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 1507328, "offset": 1507328, "speed": 9223372036853727232, "type": "mirror"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.403+0000: 10318: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7f6fe801b920 reply={"id": "libvirt-212", "error": {"class": "DeviceNotActive", "desc": "Block job 'drive-virtio-disk1' not found"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.414+0000: 10318: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7f6fe801b920 reply={"return": [{"device": "drive-virtio-disk0", "parent": {"stats": {"flush_total_time_ns": 0, "wr_highest_offset": 1835008, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block002"}, "stats": {"flush_total_time_ns": 9890804, "wr_highest_offset": 32911872, "wr_total_time_ns": 274684321, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 5, "wr_bytes": 268288, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": true, "rd_total_time_ns": 182145276, "flush_operations": 22, "wr_operations": 87, "rd_merged": 7, "rd_bytes": 20611072, "invalid_flush_operations": 0, "account_failed": true, "idle_time_ns": 783247565442, "rd_operations": 912, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "backing": {"parent": {"stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block264"}, "stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0,
  May 10 17:01:48 cmp02 libvirtd[10318]:  "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block392"}, "node-name": "#block108"}, {"device": "drive-virtio-disk1", "parent": {"stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block458"}, "stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": true, "rd_total_time_ns": 50271508, "flush_operations": 0, "wr_operations": 0, "rd_merged": 8, "rd_bytes": 1889792, "invalid_flush_operations": 0, "account_failed": true, "idle_time_ns": 2412792465348, "rd_operations": 359, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block543"}], "id": "libvirt-214"}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.823+0000: 10323: error : virNetClientProgramDispatchError:177 : migration successfully aborted
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.845+0000: 10318: error : virNetSocketReadWire:1811 : End of file while reading data: Input/output error

  It seems like qemu's drive mirror expects the full disk size to be
  present on dest compute, not only what is allocated.

  Revert of the patch solved the issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1770640/+subscriptions


References