yahoo-eng-team team mailing list archive

[Bug 1199885] Re: Handle RPC errors during scheduling a live migration task

 

*** This bug is a duplicate of bug 1171526 ***
    https://bugs.launchpad.net/bugs/1171526

Interesting, an attempt was made to backport the fix to grizzly but was
abandoned:

https://review.openstack.org/#/c/27956/

It looks like it was abandoned only because of inactivity.

Note that there is also another patch out in havana review for the same
bug (adds on to the original patch) which looks like it could be for the
issue you're seeing here:

https://review.openstack.org/#/c/34485/

Actually, yeah, it's the same description:

"These get bubbled up from the conductor as RemoteErrors, and are
easily triggered due to user error.  Not handling them results in
instances getting permanently stuck in a "Migrating" state."

Looks like this is a dupe.

** This bug has been marked a duplicate of bug 1171526
   nova live-migration failed due to exception.MigrationError

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1199885

Title:
  Handle RPC errors during scheduling a live migration task

Status in OpenStack Compute (Nova):
  New

Bug description:
  This was observed on OpenStack Grizzly, but I think it also applies to
  Havana. When starting a live-migration with --block-migrate without a
  defined destination, Nova (Nova Conductor in Havana) tries to find a
  viable compute node. It does so by running checks locally and on the
  compute node via an RPC call. In some cases the RPC call can report an
  exception:

  2013-07-10 13:28:36.896 ERROR nova.openstack.common.rpc.amqp [req-26e82430-3f0b-4501-b352-6224ed9229a6 40198af7c70c4d58b1a6100e66e65dba 57366b9c0f704db79b2f27f99b52a30c] Exception during message handling
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 430, in _process_data
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     rval = self.proxy.dispatch(ctxt, version, method, **args)
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/dispatcher.py", line 133, in dispatch
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     return getattr(proxyobj, method)(ctxt, **kwargs)
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/scheduler/manager.py", line 117, in live_migration
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     context, ex, {})
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     self.gen.next()
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/scheduler/manager.py", line 96, in live_migration
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     block_migration, disk_over_commit)
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/scheduler/driver.py", line 203, in schedule_live_migration
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     disk_over_commit)
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/compute/rpcapi.py", line 240, in check_can_live_migrate_destination
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     ctxt, destination, None))
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     return rpc.call(context, self._get_topic(topic), msg, timeout)
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     return _get_impl().call(CONF, context, topic, msg, timeout)
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     rpc_amqp.get_connection_pool(conf, Connection))
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     rv = list(rv)
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 561, in __iter__
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp     raise result
  2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp RemoteError: Remote error: MigrationError_Remote Migration error: Unable to migrate a89eac08-c3dd-408e-a804-abbbe49fbac1: Disk of instance is too large(available on destination host:3221225472 < need:8589934592)

  
  This exception is not handled because the code only catches exception.Invalid, whereas the exception actually raised is nova.openstack.common.rpc.RemoteError.

  When that happens, the VM remains stuck in the "MIGRATING" state.
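
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the Nova code or the actual fix: the `Invalid` and `RemoteError` classes are local stand-ins for the exceptions named in the report, and `schedule_live_migration`, `check_destination` and the `task_state` dictionary key are hypothetical names chosen for illustration. It only shows the general idea of catching the RPC error alongside `Invalid` and rolling the state back before re-raising.

```python
class Invalid(Exception):
    """Local stand-in for the exception.Invalid named in the report."""


class RemoteError(Exception):
    """Local stand-in for the RPC RemoteError named in the report."""

    def __init__(self, exc_type, value):
        super(RemoteError, self).__init__(
            "Remote error: %s %s" % (exc_type, value))
        self.exc_type = exc_type
        self.value = value


def schedule_live_migration(instance, check_destination):
    """Hypothetical scheduling step, not Nova's implementation.

    `instance` is a plain dict and `check_destination` is any callable
    that plays the role of the remote destination check.
    """
    instance["task_state"] = "migrating"
    try:
        check_destination(instance)
    except (Invalid, RemoteError):
        # Catching only Invalid would skip this rollback for errors that
        # arrive over RPC, leaving the instance marked as migrating.
        instance["task_state"] = None
        raise
    return instance


def failing_check(instance):
    # Simulates the destination reporting that the disk is too large.
    raise RemoteError("MigrationError_Remote",
                      "Disk of instance is too large")


if __name__ == "__main__":
    vm = {"uuid": "example", "task_state": None}
    try:
        schedule_live_migration(vm, failing_check)
    except RemoteError as err:
        print("migration rejected: %s" % err)
    print("task_state after failure: %r" % vm["task_state"])
```

In this sketch the error is still re-raised, so the caller sees the failure, but the instance no longer stays marked as migrating.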

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1199885/+subscriptions