yahoo-eng-team team mailing list archive

[Bug 1171526] Re: nova live-migration failed due to exception.MigrationError

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Alan Pevec <1171526@xxxxxxxxxxxxxxxxxx>
Date: Thu, 08 Aug 2013 19:55:00 -0000
Reply-to: Bug 1171526 <1171526@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
** Changed in: nova/grizzly
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1171526

Title:
  nova live-migration failed due to exception.MigrationError

Status in OpenStack Compute (Nova):
  Fix Released
Status in OpenStack Compute (nova) grizzly series:
  Fix Released

Bug description:
  In the Grizzly release, when an instance live migration fails because
  the target host doesn't have enough resources, such as memory, the
  nova scheduler driver raises a MigrationError exception to the caller.
  The instance status changes from 'Active' to 'MIGRATING' and stays
  'MIGRATING' for a long time, and a trace is recorded in the
  'instance_faults' table. As a result, an admin cannot tell what
  happened to the 'MIGRATING' instance without checking the log file.
  Meanwhile, the instance keeps running happily on the source host with
  status 'MIGRATING'.

  I think the above case should behave as follows:

  1. The instance should keep 'Active' status
  2. A record is created in the instance_faults table with the error message and details

  See the following code fragment in nova/scheduler/driver.py:

  
      def _assert_compute_node_has_enough_memory(self, context,
                                                instance_ref, dest):
          """Checks if destination host has enough memory for live migration.

  
          :param context: security context
          :param instance_ref: nova.db.sqlalchemy.models.Instance object
          :param dest: destination host

          """
          # Getting total available memory of host
          avail = self._get_compute_info(context, dest)['free_ram_mb']

          mem_inst = instance_ref['memory_mb']
          if not mem_inst or avail <= mem_inst:
              instance_uuid = instance_ref['uuid']
              reason = _("Unable to migrate %(instance_uuid)s to %(dest)s: "
                         "Lack of memory(host:%(avail)s <= "
                         "instance:%(mem_inst)s)")
              # The exception is raised here.
              raise exception.MigrationError(reason=reason % locals())

  See the following code fragment in nova/scheduler/manager.py:

      def live_migration(self, context, instance, dest,
                         block_migration, disk_over_commit):
          try:
              return self.driver.schedule_live_migration(
                  context, instance, dest,
                  block_migration, disk_over_commit)
          except (exception.ComputeServiceUnavailable,  # MigrationError is not caught here
                  exception.InvalidHypervisorType,
                  exception.UnableToMigrateToSelf,
                  exception.DestinationHypervisorTooOld,
                  exception.InvalidLocalStorage,
                  exception.InvalidSharedStorage) as ex:
              request_spec = {'instance_properties': {
                  'uuid': instance['uuid'], },
              }
              with excutils.save_and_reraise_exception():
                  self._set_vm_state_and_notify('live_migration',
                              dict(vm_state=instance['vm_state'],
                                   task_state=None,
                                   expected_task_state=task_states.MIGRATING,),
                                                context, ex, request_spec)
          except Exception as ex: 
              with excutils.save_and_reraise_exception():
                  self._set_vm_state_and_notify('live_migration',
                                               {'vm_state': vm_states.ERROR},
                                               context, ex, {})
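
  The behavior requested in this report can be pictured with the
  stand-alone sketch below. It is not the nova code and not necessarily
  the fix that was committed: MigrationError, EXPECTED_FAILURES,
  live_migration, not_enough_memory and the faults list are all
  hypothetical stand-ins, and a plain dict plays the role of the
  instance record. The only idea it illustrates is treating
  MigrationError as an expected failure, so that the task state is
  cleared, the VM state is left alone, and the error is recorded.

```python
# Stand-alone sketch; none of these names are imported from nova.
class MigrationError(Exception):
    """Stand-in for nova's exception.MigrationError."""


ACTIVE = 'active'
ERROR = 'error'
MIGRATING = 'migrating'

# Hypothetical tuple of "expected" failures. The handler quoted in the
# report lists several other exception types; only the missing one is
# shown here.
EXPECTED_FAILURES = (MigrationError,)


def live_migration(instance, schedule, faults):
    """Call schedule(instance) and record any failure in faults.

    Expected failures keep the current vm_state and clear task_state;
    any other exception still moves the instance to the error state.
    The exception is re-raised in both cases.
    """
    try:
        return schedule(instance)
    except EXPECTED_FAILURES as ex:
        instance['task_state'] = None
        faults.append((instance['uuid'], str(ex)))
        raise
    except Exception as ex:
        instance['vm_state'] = ERROR
        faults.append((instance['uuid'], str(ex)))
        raise


def not_enough_memory(instance):
    # Simulates the scheduler driver rejecting the destination host.
    raise MigrationError('Lack of memory on destination host')


# The instance is assumed to be marked as migrating before scheduling.
instance = {'uuid': 'fake-uuid', 'vm_state': ACTIVE,
            'task_state': MIGRATING}
faults = []
try:
    live_migration(instance, not_enough_memory, faults)
except MigrationError:
    pass  # the caller still sees the failure
```

  After the failed call in this sketch, the instance dict is back to an
  'active' VM state with no task state, and faults holds one entry
  carrying the error message, which matches points 1 and 2 above.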

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1171526/+subscriptions