yahoo-eng-team team mailing list archive

[Bug 1488435] [NEW] Resources on destination node doesn't cleanup if live-migration fails

 

Public bug reported:

I deployed a multinode devstack environment and tried to live-migrate a volume-backed instance.
The live migration fails while the glance image is being copied to the destination. When this happens, nova leaves the destination host in an inconsistent state, and any attempt to run the live migration again fails.
Here is a short investigation of the problem:
nova creates the instance directory on the target compute node:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L6152
On the second run it fails here, because the destination node was not properly cleaned up:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L6148
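
The failure mode described above can be reproduced in isolation. The following is only a minimal sketch, not nova's code: `DestinationDiskExists`, `create_instance_dir`, `pre_live_migration` and `failing_copy` are hypothetical stand-ins written for illustration, and the sketch assumes the destination refuses to proceed when the instance directory already exists.

```python
import os


class DestinationDiskExists(Exception):
    """Hypothetical stand-in for the error raised when the directory exists."""


def create_instance_dir(instance_dir):
    # Refuse to proceed if a previous attempt left the directory behind.
    if os.path.exists(instance_dir):
        raise DestinationDiskExists(instance_dir)
    os.mkdir(instance_dir)


def pre_live_migration(instance_dir, copy_image):
    # Simplified destination-side preparation (assumption, not nova's flow).
    create_instance_dir(instance_dir)
    # If this raises, nothing removes instance_dir again.
    copy_image(instance_dir)


def failing_copy(instance_dir):
    # Simulates the image copy to the destination failing.
    raise RuntimeError("image copy to destination failed")
```

Calling `pre_live_migration(path, failing_copy)` twice with the same path fails with `RuntimeError` the first time and with `DestinationDiskExists` the second time, because the directory from the first attempt is still there.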

It is also strange that when pre_live_migration fails on the destination node, control returns to the source, which then decides whether the destination needs to be cleaned up. It would be clearer to wrap pre_live_migration in try/except and roll it back on the destination before responding to the source node:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L4962-L4984
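
A rough sketch of the proposed shape, under the same caveat: this is not nova's compute manager code, and `pre_live_migration_with_rollback` and `copy_image` are hypothetical names. It only illustrates the idea that the destination undoes its own partial work before the error propagates back to the source.

```python
import os
import shutil


def pre_live_migration_with_rollback(instance_dir, copy_image):
    # If the directory already exists, os.mkdir raises FileExistsError and
    # nothing has been created by this attempt, so there is nothing to undo.
    os.mkdir(instance_dir)
    try:
        # Hypothetical step that may fail, e.g. copying the image.
        copy_image(instance_dir)
    except Exception:
        # Roll back on the destination itself, then re-raise so the
        # caller (the source node in the bug report) still sees the error.
        shutil.rmtree(instance_dir, ignore_errors=True)
        raise
```

With this structure a failed attempt leaves no instance directory behind, so a retry starts from a clean destination. Only the directory created by the current attempt is removed; a directory that existed beforehand is never touched.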

** Affects: nova
     Importance: Undecided
     Assignee: Timofey Durakov (tdurakov)
         Status: New


** Tags: live-migration

** Changed in: nova
     Assignee: (unassigned) => Timofey Durakov (tdurakov)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1488435

Title:
  Resources on destination node doesn't cleanup if live-migration fails

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1488435/+subscriptions

