
yahoo-eng-team team mailing list archive

[Bug 1628606] [NEW] live migration does not clean up at target node if a failure occurs during post migration

 

Public bug reported:


If a live migration fails during post processing on the source (e.g. a
failure to disconnect volumes), the instance can be left shut down on
the source node and stuck in a 'migrating' task state. The copy of the
instance on the target node is also left running, although it is not
usable, because neutron networking has not yet been switched to the
target and nova still records the instance as being on the source node.

This situation can be resolved as follows (a scripted sketch of these
steps follows below):

On the target node, destroy the orphaned domain:
virsh destroy <instance domain id>
If the compute nodes are NOT using shared storage, also remove the
instance directory:
sudo rm -rf <instance uuid directory>

Then use the nova client as admin to restart the instance on the source node:
nova reset-state --active <instance uuid>
nova reboot --hard <instance uuid>
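
For operators hitting this repeatedly, the manual steps above can be scripted. A minimal sketch, assuming the cleanup is driven from a host with ssh access to the target node and admin nova credentials loaded; recover() and all of its arguments are hypothetical placeholders, not anything nova provides:

import subprocess

def recover(target_host, domain_id, instance_uuid, instance_dir=None):
    # On the target node: destroy the orphaned copy of the instance.
    subprocess.run(["ssh", target_host, "virsh", "destroy", domain_id],
                   check=True)
    # Only when the compute nodes are NOT using shared storage: remove
    # the instance directory left behind on the target.
    if instance_dir is not None:
        subprocess.run(["ssh", target_host, "sudo", "rm", "-rf",
                        instance_dir], check=True)
    # As admin, put the instance back to ACTIVE and hard reboot it on
    # the source node.
    subprocess.run(["nova", "reset-state", "--active", instance_uuid],
                   check=True)
    subprocess.run(["nova", "reboot", "--hard", instance_uuid], check=True)

# Example invocation (all values are placeholders):
# recover("target-node", "instance-00000042", "<instance uuid>",
#         instance_dir="/var/lib/nova/instances/<instance uuid>")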

I will investigate how to address this issue.
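
One possible direction, sketched below with the same kind of illustrative names (this is not nova's actual API): have the source-side post-processing catch the failure, destroy the orphaned guest on the target, and clear the stuck task state so the instance can be hard-rebooted on the source without manual intervention.

class MigrationError(Exception):
    pass

def source_post_processing(instance):
    # Volume disconnect etc.; assume this is where the failure occurs.
    raise MigrationError("failed to disconnect volumes on source")

def destroy_guest_on_target(instance, target):
    # Equivalent of 'virsh destroy' on the target, plus removing the
    # instance directory when the nodes do not share storage.
    print("destroyed orphaned guest %s on %s" % (instance["uuid"], target))

def post_live_migration(instance, target):
    try:
        source_post_processing(instance)
    except MigrationError:
        destroy_guest_on_target(instance, target)
        # Clearing task_state leaves the instance hard-rebootable on the
        # source without the manual reset-state step above.
        instance["task_state"] = None
        raise

instance = {"uuid": "<instance uuid>", "host": "source",
            "task_state": "migrating"}
try:
    post_live_migration(instance, "target")
except MigrationError:
    print("migration failed, but the target was cleaned up: %s" % instance)

Whether to destroy the target copy on error is a trade-off: it discards the already-migrated guest, but it leaves the cloud in a consistent, operator-recoverable state instead of two half-alive copies.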

** Affects: nova
     Importance: Undecided
     Assignee: Paul Carlton (paul-carlton2)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Paul Carlton (paul-carlton2)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1628606


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1628606/+subscriptions

