yahoo-eng-team team mailing list archive
Message #03910
[Bug 1199885] Re: Handle RPC errors during scheduling a live migration task
*** This bug is a duplicate of bug 1171526 ***
https://bugs.launchpad.net/bugs/1171526
Interesting: an attempt to backport the fix to Grizzly was made but then
abandoned:
https://review.openstack.org/#/c/27956/
It looks like it was abandoned simply due to inactivity.
Note that there is also another patch out in havana review for the same
bug (adds on to the original patch) which looks like it could be for the
issue you're seeing here:
https://review.openstack.org/#/c/34485/
Actually, yeah, it's the same description:
"These get bubbled up from the conductor as RemoteErrors, and are
easily triggered due to user error. Not handling them results in
instances getting permanently stuck in a "Migrating" state."
Looks like this is a dupe.
** This bug has been marked a duplicate of bug 1171526
nova live-migration failed due to exception.MigrationError
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1199885
Title:
Handle RPC errors during scheduling a live migration task
Status in OpenStack Compute (Nova):
New
Bug description:
This was observed on OpenStack Grizzly, but I think it also applies to
Havana. When starting a live-migration with --block-migrate without a
defined destination, Nova (Nova Conductor in Havana) tries to find a
viable compute node. It does so by running checks locally and on the
compute node via an RPC call. In some cases the RPC call can report an
exception:
2013-07-10 13:28:36.896 ERROR nova.openstack.common.rpc.amqp [req-26e82430-3f0b-4501-b352-6224ed9229a6 40198af7c70c4d58b1a6100e66e65dba 57366b9c0f704db79b2f27f99b52a30c] Exception during message handling
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 430, in _process_data
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp rval = self.proxy.dispatch(ctxt, version, method, **args)
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/dispatcher.py", line 133, in dispatch
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp return getattr(proxyobj, method)(ctxt, **kwargs)
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/scheduler/manager.py", line 117, in live_migration
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp context, ex, {})
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp self.gen.next()
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/scheduler/manager.py", line 96, in live_migration
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp block_migration, disk_over_commit)
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/scheduler/driver.py", line 203, in schedule_live_migration
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp disk_over_commit)
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/compute/rpcapi.py", line 240, in check_can_live_migrate_destination
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp ctxt, destination, None))
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp return rpc.call(context, self._get_topic(topic), msg, timeout)
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp return _get_impl().call(CONF, context, topic, msg, timeout)
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp rpc_amqp.get_connection_pool(conf, Connection))
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp rv = list(rv)
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 561, in __iter__
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp raise result
2013-07-10 13:28:36.896 10708 TRACE nova.openstack.common.rpc.amqp RemoteError: Remote error: MigrationError_Remote Migration error: Unable to migrate a89eac08-c3dd-408e-a804-abbbe49fbac1: Disk of instance is too large(available on destination host:3221225472 < need:8589934592)
This exception is not handled because the code only catches exception.Invalid, whereas the exception actually raised is nova.openstack.common.rpc.RemoteError.
When that happens, the VM is stuck in the state "MIGRATING".
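To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not Nova code: the Invalid and RemoteError classes, the schedule_live_migration function, the dict-based instance, and the "ACTIVE" rollback state are all hypothetical stand-ins chosen for illustration. It only shows why catching the local exception type alone leaves the state unreset, and how also catching the remote error type lets the rollback run.

```python
class Invalid(Exception):
    """Hypothetical stand-in for a locally raised validation error."""


class RemoteError(Exception):
    """Hypothetical stand-in for an error bubbled up from an RPC call."""

    def __init__(self, exc_type, value):
        self.exc_type = exc_type
        self.value = value
        super().__init__("Remote error: %s %s" % (exc_type, value))


def schedule_live_migration(instance, check_destination):
    """Mark the instance as migrating and roll back if the check fails.

    `instance` is a plain dict and `check_destination` is any callable;
    both are assumptions made for this sketch, not Nova's real interfaces.
    """
    instance["state"] = "MIGRATING"
    try:
        check_destination(instance)
    except (Invalid, RemoteError) as ex:
        # Catching only Invalid here would let RemoteError propagate,
        # skip this rollback, and leave the state at "MIGRATING".
        instance["state"] = "ACTIVE"
        instance["last_error"] = str(ex)
        return False
    return True


def failing_check(instance):
    # Simulates the destination host rejecting the migration remotely.
    raise RemoteError("MigrationError_Remote",
                      "Disk of instance is too large")


vm = {"uuid": "a89eac08-c3dd-408e-a804-abbbe49fbac1", "state": "ACTIVE"}
ok = schedule_live_migration(vm, failing_check)
print(ok, vm["state"])
```

With both exception types caught, the simulated failure returns False and the state is reset instead of staying at "MIGRATING". Whether the real fix resets the state in exactly this way is not stated in this report; see the linked reviews for the actual change.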