
yahoo-eng-team team mailing list archive

[Bug 1770640] Re: live block migration of instance with vfat config drive fails

 

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/queens
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/rocky
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/ocata
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/pike
   Importance: Undecided
       Status: New

** Changed in: cloud-archive/ocata
       Status: New => Triaged

** Changed in: cloud-archive/pike
       Status: New => Triaged

** Changed in: cloud-archive/queens
       Status: New => Triaged

** Changed in: cloud-archive/rocky
       Status: New => Triaged

** Changed in: cloud-archive/pike
   Importance: Undecided => High

** Changed in: cloud-archive/rocky
   Importance: Undecided => High

** Changed in: cloud-archive/queens
   Importance: Undecided => High

** Changed in: cloud-archive/ocata
   Importance: Undecided => High

** Also affects: nova (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: nova (Ubuntu Cosmic)
   Importance: Undecided
       Status: New

** Also affects: nova (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Changed in: nova (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: nova (Ubuntu Bionic)
       Status: New => Triaged

** Changed in: nova (Ubuntu Cosmic)
   Importance: Undecided => High

** Changed in: nova (Ubuntu Cosmic)
       Status: New => Triaged

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1770640

Title:
  live block migration of instance with vfat config drive fails

Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive ocata series:
  Triaged
Status in Ubuntu Cloud Archive pike series:
  Triaged
Status in Ubuntu Cloud Archive queens series:
  Triaged
Status in Ubuntu Cloud Archive rocky series:
  Triaged
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Fix Committed
Status in OpenStack Compute (nova) pike series:
  Fix Committed
Status in OpenStack Compute (nova) queens series:
  Fix Committed
Status in nova package in Ubuntu:
  Triaged
Status in nova source package in Bionic:
  Triaged
Status in nova source package in Cosmic:
  Triaged

Bug description:
  It seems that the following commit broke block live migration of
  instances with a vfat config drive:
  https://review.openstack.org/561704. I reproduced the problem on pike
  with that patch backported, and I think master is affected as well.

  The exception raised is:

  May 10 17:01:48 cmp02 nova-compute[4443]: Traceback (most recent call last):
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/hubs/poll.py", line 115, in wait
  May 10 17:01:48 cmp02 nova-compute[4443]:     listener.cb(fileno)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
  May 10 17:01:48 cmp02 nova-compute[4443]:     result = function(*args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 1004, in context_wrapper
  May 10 17:01:48 cmp02 nova-compute[4443]:     return func(*args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6568, in _live_migration_operation
  May 10 17:01:48 cmp02 nova-compute[4443]:     LOG.error("Live Migration failure: %s", e, instance=instance)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
  May 10 17:01:48 cmp02 nova-compute[4443]:     self.force_reraise()
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
  May 10 17:01:48 cmp02 nova-compute[4443]:     six.reraise(self.type_, self.value, self.tb)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6562, in _live_migration_operation
  May 10 17:01:48 cmp02 nova-compute[4443]:     bandwidth=CONF.libvirt.live_migration_bandwidth)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 670, in migrate
  May 10 17:01:48 cmp02 nova-compute[4443]:     destination, params=params, flags=flags)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
  May 10 17:01:48 cmp02 nova-compute[4443]:     result = proxy_call(self._autowrap, f, *args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
  May 10 17:01:48 cmp02 nova-compute[4443]:     rv = execute(f, *args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
  May 10 17:01:48 cmp02 nova-compute[4443]:     six.reraise(c, e, tb)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
  May 10 17:01:48 cmp02 nova-compute[4443]:     rv = meth(*args, **kwargs)
  May 10 17:01:48 cmp02 nova-compute[4443]:   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1702, in migrateToURI3
  May 10 17:01:48 cmp02 nova-compute[4443]:     if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
  May 10 17:01:48 cmp02 nova-compute[4443]: libvirtError: operation failed: migration of disk vdb failed: Input/output error
  May 10 17:01:48 cmp02 nova-compute[4443]: Removing descriptor: 29

  (vdb is a vfat config drive)

  After turning on info-level logging in qemu, I saw the following in
  the syslog:

  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.204+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk0","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-208"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.204+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk0","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-208"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=228 ret=228 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.296+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk1","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk1","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-209"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.296+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk1","target":"nbd:10.60.0.105:49153:exportname=drive-virtio-disk1","speed":9223372036853727232,"sync":"top","mode":"existing","format":"raw"},"id":"libvirt-209"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=228 ret=228 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.388+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 387891}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive-virtio-disk1", "operation": "write", "action": "report"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.388+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 388730}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-virtio-disk0", "len": 1507328, "offset": 1507328, "speed": 9223372036853727232, "type": "mirror"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.392+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 392754}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk1", "len": 67108864, "offset": 196608, "speed": 9223372036853727232, "type": "mirror", "error": "Input/output error"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.393+0000: 10318: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7f6fe801b920 reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": false, "len": 1507328, "offset": 1507328, "paused": false, "speed": 9223372036853727232, "ready": true, "type": "mirror"}], "id": "libvirt-210"}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.395+0000: 10323: error : qemuMigrationDriveMirrorReady:624 : operation failed: migration of disk vdb failed: Input/output error
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.395+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-211"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.395+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-211"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=95 ret=95 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.398+0000: 10323: info : qemuMonitorSend:1076 : QEMU_MONITOR_SEND_MSG: mon=0x7f6fe801b920 msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk1"},"id":"libvirt-212"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  fd=-1
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.398+0000: 10318: info : qemuMonitorIOWrite:552 : QEMU_MONITOR_IO_WRITE: mon=0x7f6fe801b920 buf={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk1"},"id":"libvirt-212"}
  May 10 17:01:48 cmp02 libvirtd[10318]:  len=95 ret=95 errno=0
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.402+0000: 10318: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f6fe801b920 event={"timestamp": {"seconds": 1525971708, "microseconds": 401820}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 1507328, "offset": 1507328, "speed": 9223372036853727232, "type": "mirror"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.403+0000: 10318: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7f6fe801b920 reply={"id": "libvirt-212", "error": {"class": "DeviceNotActive", "desc": "Block job 'drive-virtio-disk1' not found"}}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.414+0000: 10318: info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: mon=0x7f6fe801b920 reply={"return": [{"device": "drive-virtio-disk0", "parent": {"stats": {"flush_total_time_ns": 0, "wr_highest_offset": 1835008, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block002"}, "stats": {"flush_total_time_ns": 9890804, "wr_highest_offset": 32911872, "wr_total_time_ns": 274684321, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 5, "wr_bytes": 268288, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": true, "rd_total_time_ns": 182145276, "flush_operations": 22, "wr_operations": 87, "rd_merged": 7, "rd_bytes": 20611072, "invalid_flush_operations": 0, "account_failed": true, "idle_time_ns": 783247565442, "rd_operations": 912, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "backing": {"parent": {"stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block264"}, "stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], 
"failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0,
  May 10 17:01:48 cmp02 libvirtd[10318]:  "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block392"}, "node-name": "#block108"}, {"device": "drive-virtio-disk1", "parent": {"stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block458"}, "stats": {"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, "account_invalid": true, "rd_total_time_ns": 50271508, "flush_operations": 0, "wr_operations": 0, "rd_merged": 8, "rd_bytes": 1889792, "invalid_flush_operations": 0, "account_failed": true, "idle_time_ns": 2412792465348, "rd_operations": 359, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": "#block543"}], "id": "libvirt-214"}
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.823+0000: 10323: error : virNetClientProgramDispatchError:177 : migration successfully aborted
  May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.845+0000: 10318: error : virNetSocketReadWire:1811 : End of file while reading data: Input/output error

  It seems that qemu's drive-mirror expects the full virtual disk size
  to be present on the destination compute host, not only what is
  allocated.
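
  The gap between the two sizes is visible in the BLOCK_JOB_COMPLETED
  event above: vdb reports len 67108864 (the 64 MiB virtual size of the
  config drive) but offset only 196608 (roughly the allocated data). A
  minimal sketch of virtual vs. allocated size for a sparse raw file
  (illustrative only, not nova code):

```python
import os
import tempfile

# Create a sparse raw image the size of a typical config drive (64 MiB).
fd, path = tempfile.mkstemp(suffix=".raw")
os.close(fd)
with open(path, "wb") as f:
    f.truncate(64 * 1024 * 1024)  # extend to the full virtual size
    f.write(b"\x55\xaa")          # but store only a couple of real bytes

st = os.stat(path)
virtual_size = st.st_size       # what the migration target must accommodate
allocated = st.st_blocks * 512  # what the filesystem actually stores

print(virtual_size)  # 67108864
print(allocated)     # far smaller on any sparse-capable filesystem
os.remove(path)
```

  If the destination disk is created from the allocated figure rather
  than the virtual size, drive-mirror hits the end of the target file
  partway through the copy, which would produce exactly the
  Input/output error seen above.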

  Reverting the patch resolved the issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1770640/+subscriptions

