yahoo-eng-team team mailing list archive
Message #32260
[Bug 1444300] Re: nova-compute service doesn't restart if resize operation fails
** Changed in: nova/kilo
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1444300
Title:
nova-compute service doesn't restart if resize operation fails
Status in OpenStack Compute (Nova):
Fix Committed
Status in OpenStack Compute (nova) kilo series:
Fix Released
Bug description:
If a user deletes an instance while it is resizing, the deletion succeeds. After the deletion, the greenthread that was resizing the instance raises an InstanceNotFound error, which is caught in the errors_out_migration decorator and in turn raises "KeyError: 'migration'".
If the user then restarts the n-cpu service, it fails to start with an
InstanceNotFound error.
Steps to reproduce:
1. Create instance
2. Resize instance
3. Delete instance while resize is in progress (scp/rsync process is running)
4. Instance is deleted successfully and instance files are cleaned from source compute node
5. When the scp/rsync process completes, it raises "InstanceNotFound" and the migration record is left in 'migrating' status. The InstanceNotFound error is then caught by the errors_out_migration decorator, which raises "KeyError: 'migration'" because the decorator expects migration as a keyword argument, but it is passed positionally.
It throws below error:
2015-04-14 23:29:12.466 ERROR nova.compute.manager [req-2b4e3718-a1fa-4603-bd9e-6c9481f75e16 demo demo] [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Setting instance vm_state to ERROR
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Traceback (most recent call last):
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/opt/stack/nova/nova/compute/manager.py", line 6358, in _error_out_instance_on_exception
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] yield
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/opt/stack/nova/nova/compute/manager.py", line 3984, in resize_instance
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] timeout, retry_interval)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6318, in migrate_disk_and_power_off
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] shared_storage)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] six.reraise(self.type_, self.value, self.tb)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6313, in migrate_disk_and_power_off
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] libvirt_utils.copy_image(from_path, img_path, host=dest)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 327, in copy_image
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] execute('scp', src, dest)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 55, in execute
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] return utils.execute(*args, **kwargs)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/opt/stack/nova/nova/utils.py", line 206, in execute
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] return processutils.execute(*cmd, **kwargs)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 238, in execute
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] cmd=sanitized_cmd)
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] ProcessExecutionError: Unexpected error while running command.
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Command: scp /opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config 10.69.4.172:/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09/disk.config
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Exit code: 1
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Stdout: u''
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Stderr: u'/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config: No such file or directory\n'
2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]
2015-04-14 23:29:12.632 DEBUG nova.compute.manager [req-2b4e3718-a1fa-4603-b
d9e-6c9481f75e16 demo demo] [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Instance has been destroyed from under us while trying to set it to ERROR from (pid=28734) _set_instance_error_state /opt/stack/nova/nova/compute/manager.py:741
2015-04-14 23:29:12.684 ERROR root [req-2b4e3718-a1fa-4603-bd9e-6c9481f75e16 demo demo] Original exception being dropped: ['Traceback (most recent call last):\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 269, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 352, in decorated_function\n kwargs[\'instance\'], e, sys.exc_info())\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__\n six.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 340, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 3984, in resize_instance\n timeout, retry_interval)\n', ' File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6318, in migrate_disk_and_power_off\n shared_storage)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__\n six.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6313, in migrate_disk_and_power_off\n libvirt_utils.copy_image(from_path, img_path, host=dest)\n', ' File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 327, in copy_image\n execute(\'scp\', src, dest)\n', ' File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 55, in execute\n return utils.execute(*args, **kwargs)\n', ' File "/opt/stack/nova/nova/utils.py", line 206, in execute\n return processutils.execute(*cmd, **kwargs)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 238, in execute\n cmd=sanitized_cmd)\n', "ProcessExecutionError: Unexpected error while running command.\nCommand: scp /opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config 10.69.4.172:/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09/disk.config\nExit code: 1\nStdout: u''\nStderr: 
u'/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config: No such file or directory\\n'\n"]
2015-04-14 23:29:12.772 ERROR oslo_messaging.rpc.dispatcher [req-2b4e3718-a1fa-4603-bd9e-6c9481f75e16 demo demo] Exception during message handling: 'migration'
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 6732, in resize_instance
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher clean_shutdown=clean_shutdown)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 88, in wrapped
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher payload)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 71, in wrapped
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 324, in decorated_function
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher LOG.warning(msg, e, instance_uuid=instance_uuid)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 295, in decorated_function
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 374, in decorated_function
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 272, in decorated_function
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher migration = kwargs['migration']
2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher KeyError: 'migration'
6. Stop and start nova-compute service
When you start the nova-compute service, it fails with the error
below.
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 145, in wait
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup x.wait()
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 47, in wait
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return self.thread.wait()
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 175, in wait
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return self._exit_event.wait()
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch()
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return self.greenlet.switch()
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/service.py", line 497, in run_service
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup service.start()
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/service.py", line 183, in start
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup self.manager.pre_start_hook()
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/manager.py", line 1287, in pre_start_hook
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup self.update_available_resource(nova.context.get_admin_context())
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/manager.py", line 6236, in update_available_resource
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup rt.update_available_resource(context)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/resource_tracker.py", line 402, in update_available_resource
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup self._update_available_resource(context, resources)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 445, in inner
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return f(*args, **kwargs)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/resource_tracker.py", line 445, in _update_available_resource
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup self._update_usage_from_migrations(context, resources, migrations)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/resource_tracker.py", line 708, in _update_usage_from_migrations
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup instance = migration.instance
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/migration.py", line 80, in instance
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return objects.Instance.get_by_uuid(self._context, self.instance_uuid)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/base.py", line 161, in wrapper
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup args, kwargs)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/conductor/rpcapi.py", line 325, in object_class_action
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup objver=objver, args=args, kwargs=kwargs)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 156, in call
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup retry=self.retry)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup timeout=timeout, retry=retry)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup retry=retry)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 341, in _send
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup raise result
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup InstanceNotFound_Remote: Instance 1f24201e-4eac-4dc4-9532-8fb863949a09 could not be found.
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/conductor/manager.py", line 423, in _object_dispatch
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return getattr(target, method)(*args, **kwargs)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/base.py", line 163, in wrapper
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup result = fn(cls, context, *args, **kwargs)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/objects/instance.py", line 564, in get_by_uuid
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup use_slave=use_slave)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/api.py", line 651, in instance_get_by_uuid
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup columns_to_join, use_slave=use_slave)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 233, in wrapper
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup return f(*args, **kwargs)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 1744, in instance_get_by_uuid
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup columns_to_join=columns_to_join, use_slave=use_slave)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 1756, in _instance_get_by_uuid
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup raise exception.InstanceNotFound(instance_id=uuid)
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup InstanceNotFound: Instance 1f24201e-4eac-4dc4-9532-8fb863949a09 could not be found.
2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
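The "KeyError: 'migration'" in step 5 comes from the decorator looking up the migration in kwargs while the caller passes it positionally. A minimal standalone sketch of that failure mode (hypothetical names, not Nova's actual code) is:

```python
import functools

def errors_out_migration(function):
    # Decorator that, on failure, tries to mark the migration as errored.
    # It assumes 'migration' arrives as a keyword argument.
    @functools.wraps(function)
    def decorated(self, context, *args, **kwargs):
        try:
            return function(self, context, *args, **kwargs)
        except Exception:
            # If the caller passed migration positionally, this lookup
            # fails and masks the original error with KeyError: 'migration'.
            migration = kwargs['migration']
            migration.status = 'error'
            raise
    return decorated

class Migration:
    status = 'migrating'

class Manager:
    @errors_out_migration
    def resize_instance(self, context, instance, migration):
        # Stand-in for the resize greenthread hitting InstanceNotFound
        # after the instance was deleted mid-resize.
        raise RuntimeError('instance could not be found')

m = Manager()
try:
    # migration is passed positionally, so kwargs['migration'] fails.
    m.resize_instance('ctx', 'inst', Migration())
except KeyError as e:
    print(e)  # -> 'migration'
```

The fix direction implied by the report is to make the decorator tolerant of positional arguments (or to require the migration as a keyword-only argument), so the original InstanceNotFound is not masked and the migration can be moved out of 'migrating' status.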
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1444300/+subscriptions