
yahoo-eng-team team mailing list archive

[Bug 1444300] Re: nova-compute service doesn't restart if resize operation fails

 

** Changed in: nova
       Status: Fix Committed => Fix Released

** Changed in: nova
    Milestone: None => liberty-1

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1444300

Title:
  nova-compute service doesn't restart if resize operation fails

Status in OpenStack Compute (Nova):
  Fix Released
Status in OpenStack Compute (nova) kilo series:
  Fix Released

Bug description:
  If an instance is resizing and the user tries to delete it, the
  instance is deleted successfully. After the deletion, the greenthread that was resizing the instance raises an InstanceNotFound error, which is caught in the errors_out_migration decorator and in turn raises "KeyError: 'migration'".

  If the user then tries to restart the n-cpu service, it fails with an
  InstanceNotFound error.

  Steps to reproduce:
  1. Create an instance.
  2. Resize the instance.
  3. Delete the instance while the resize is in progress (while the scp/rsync process is running).
  4. The instance is deleted successfully and its files are cleaned up from the source compute node.
  5. When the scp/rsync process completes, it throws an InstanceNotFound error, and the migration record is left in 'migrating' status. While handling InstanceNotFound, the errors_out_migration decorator throws "KeyError: 'migration'", because it expects migration to arrive as a keyword argument but it is passed positionally.
  The error below is thrown:

  2015-04-14 23:29:12.466 ERROR nova.compute.manager [req-2b4e3718-a1fa-4603-bd9e-6c9481f75e16 demo demo] [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Setting instance vm_state to ERROR
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Traceback (most recent call last):
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/opt/stack/nova/nova/compute/manager.py", line 6358, in _error_out_instance_on_exception
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     yield
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/opt/stack/nova/nova/compute/manager.py", line 3984, in resize_instance
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     timeout, retry_interval)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6318, in migrate_disk_and_power_off
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     shared_storage)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     six.reraise(self.type_, self.value, self.tb)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6313, in migrate_disk_and_power_off
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     libvirt_utils.copy_image(from_path, img_path, host=dest)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 327, in copy_image
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     execute('scp', src, dest)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 55, in execute
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     return utils.execute(*args, **kwargs)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/opt/stack/nova/nova/utils.py", line 206, in execute
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     return processutils.execute(*cmd, **kwargs)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]   File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 238, in execute
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]     cmd=sanitized_cmd)
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] ProcessExecutionError: Unexpected error while running command.
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Command: scp /opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config 10.69.4.172:/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09/disk.config
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Exit code: 1
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Stdout: u''
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Stderr: u'/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config: No such file or directory\n'
  2015-04-14 23:29:12.466 TRACE nova.compute.manager [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09]
  2015-04-14 23:29:12.632 DEBUG nova.compute.manager [req-2b4e3718-a1fa-4603-bd9e-6c9481f75e16 demo demo] [instance: 1f24201e-4eac-4dc4-9532-8fb863949a09] Instance has been destroyed from under us while trying to set it to ERROR from (pid=28734) _set_instance_error_state /opt/stack/nova/nova/compute/manager.py:741
  2015-04-14 23:29:12.684 ERROR root [req-2b4e3718-a1fa-4603-bd9e-6c9481f75e16 demo demo] Original exception being dropped: ['Traceback (most recent call last):\n', '  File "/opt/stack/nova/nova/compute/manager.py", line 269, in decorated_function\n    return function(self, context, *args, **kwargs)\n', '  File "/opt/stack/nova/nova/compute/manager.py", line 352, in decorated_function\n    kwargs[\'instance\'], e, sys.exc_info())\n', '  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__\n    six.reraise(self.type_, self.value, self.tb)\n', '  File "/opt/stack/nova/nova/compute/manager.py", line 340, in decorated_function\n    return function(self, context, *args, **kwargs)\n', '  File "/opt/stack/nova/nova/compute/manager.py", line 3984, in resize_instance\n    timeout, retry_interval)\n', '  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6318, in migrate_disk_and_power_off\n    shared_storage)\n', '  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__\n    six.reraise(self.type_, self.value, self.tb)\n', '  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6313, in migrate_disk_and_power_off\n    libvirt_utils.copy_image(from_path, img_path, host=dest)\n', '  File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 327, in copy_image\n    execute(\'scp\', src, dest)\n', '  File "/opt/stack/nova/nova/virt/libvirt/utils.py", line 55, in execute\n    return utils.execute(*args, **kwargs)\n', '  File "/opt/stack/nova/nova/utils.py", line 206, in execute\n    return processutils.execute(*cmd, **kwargs)\n', '  File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 238, in execute\n    cmd=sanitized_cmd)\n', "ProcessExecutionError: Unexpected error while running command.\nCommand: scp /opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config 10.69.4.172:/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09/disk.config\nExit code: 1\nStdout: u''\nStderr: u'/opt/stack/data/nova/instances/1f24201e-4eac-4dc4-9532-8fb863949a09_resize/disk.config: No such file or directory\\n'\n"]
  2015-04-14 23:29:12.772 ERROR oslo_messaging.rpc.dispatcher [req-2b4e3718-a1fa-4603-bd9e-6c9481f75e16 demo demo] Exception during message handling: 'migration'
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     executor_callback)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/nova/nova/compute/manager.py", line 6732, in resize_instance
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     clean_shutdown=clean_shutdown)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/nova/nova/exception.py", line 88, in wrapped
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     payload)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/nova/nova/exception.py", line 71, in wrapped
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     return f(self, context, *args, **kw)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/nova/nova/compute/manager.py", line 324, in decorated_function
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     LOG.warning(msg, e, instance_uuid=instance_uuid)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/nova/nova/compute/manager.py", line 295, in decorated_function
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/nova/nova/compute/manager.py", line 374, in decorated_function
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/nova/nova/compute/manager.py", line 272, in decorated_function
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher     migration = kwargs['migration']
  2015-04-14 23:29:12.772 TRACE oslo_messaging.rpc.dispatcher KeyError: 'migration'
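  The kwargs-vs-positional mismatch behind the "KeyError: 'migration'" above can be sketched as follows. This is a minimal, self-contained illustration with simplified, hypothetical class and method bodies, not the actual Nova code; only the `kwargs['migration']` lookup mirrors the decorator behaviour described in this report.

  ```python
  import functools

  def errors_out_migration(function):
      """Decorator that marks a migration as errored when the call fails."""
      @functools.wraps(function)
      def decorated(self, context, *args, **kwargs):
          try:
              return function(self, context, *args, **kwargs)
          except Exception:
              # Assumes 'migration' was passed as a keyword argument.
              # If the caller passed it positionally, this raises
              # KeyError: 'migration' and masks the original error.
              kwargs['migration']
              raise
      return decorated

  class FakeMigration(object):
      status = 'migrating'

  class Compute(object):
      @errors_out_migration
      def resize_instance(self, context, instance, migration):
          # Stand-in for the resize failing with InstanceNotFound.
          raise RuntimeError('instance deleted during resize')

  compute = Compute()

  # Keyword call: the decorator finds the migration and the
  # original error propagates as intended.
  try:
      compute.resize_instance('ctx', instance='vm-1',
                              migration=FakeMigration())
  except RuntimeError:
      pass

  # Positional call: the decorator's kwargs lookup fails, so the
  # caller sees KeyError: 'migration' instead of the real error.
  try:
      compute.resize_instance('ctx', 'vm-1', FakeMigration())
  except KeyError as e:
      assert str(e) == "'migration'"
  ```

  Accepting the migration via either args or kwargs (or requiring it as a named parameter of the wrapper) would avoid masking the original exception.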

  6. Stop and start the nova-compute service.

  When the nova-compute service starts, it fails with the error
  below.

  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 145, in wait
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     x.wait()
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 47, in wait
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return self.thread.wait()
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 175, in wait
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return self._exit_event.wait()
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return hubs.get_hub().switch()
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return self.greenlet.switch()
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     result = function(*args, **kwargs)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/openstack/common/service.py", line 497, in run_service
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     service.start()
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/service.py", line 183, in start
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     self.manager.pre_start_hook()
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/compute/manager.py", line 1287, in pre_start_hook
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     self.update_available_resource(nova.context.get_admin_context())
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/compute/manager.py", line 6236, in update_available_resource
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     rt.update_available_resource(context)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 402, in update_available_resource
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     self._update_available_resource(context, resources)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 445, in inner
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return f(*args, **kwargs)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 445, in _update_available_resource
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     self._update_usage_from_migrations(context, resources, migrations)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 708, in _update_usage_from_migrations
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     instance = migration.instance
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/objects/migration.py", line 80, in instance
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return objects.Instance.get_by_uuid(self._context, self.instance_uuid)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/objects/base.py", line 161, in wrapper
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     args, kwargs)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/conductor/rpcapi.py", line 325, in object_class_action
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     objver=objver, args=args, kwargs=kwargs)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 156, in call
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     retry=self.retry)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     timeout=timeout, retry=retry)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     retry=retry)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 341, in _send
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     raise result
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup InstanceNotFound_Remote: Instance 1f24201e-4eac-4dc4-9532-8fb863949a09 could not be found.
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/conductor/manager.py", line 423, in _object_dispatch
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return getattr(target, method)(*args, **kwargs)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/objects/base.py", line 163, in wrapper
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     result = fn(cls, context, *args, **kwargs)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/objects/instance.py", line 564, in get_by_uuid
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     use_slave=use_slave)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/db/api.py", line 651, in instance_get_by_uuid
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     columns_to_join, use_slave=use_slave)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 233, in wrapper
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     return f(*args, **kwargs)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 1744, in instance_get_by_uuid
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     columns_to_join=columns_to_join, use_slave=use_slave)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup   File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 1756, in _instance_get_by_uuid
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup     raise exception.InstanceNotFound(instance_id=uuid)
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup InstanceNotFound: Instance 1f24201e-4eac-4dc4-9532-8fb863949a09 could not be found.
  2015-04-14 23:31:31.256 TRACE nova.openstack.common.threadgroup
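  One defensive pattern that would let the resource tracker tolerate such orphaned migrations is to skip any migration whose instance no longer exists, instead of letting InstanceNotFound escape during startup. The sketch below uses simplified, hypothetical names and is not the committed Nova patch; it only illustrates the general approach.

  ```python
  class InstanceNotFound(Exception):
      """Stand-in for nova.exception.InstanceNotFound."""

  def update_usage_from_migrations(migrations, get_instance):
      """Pair each migration with its instance, skipping stale ones.

      Migrations referencing a deleted instance are ignored rather
      than allowed to crash the caller with InstanceNotFound.
      """
      usable = []
      for migration in migrations:
          try:
              instance = get_instance(migration['instance_uuid'])
          except InstanceNotFound:
              # Instance was deleted mid-resize; ignore the stale
              # 'migrating' record so the service can still start.
              continue
          usable.append((migration, instance))
      return usable

  # Toy instance lookup backed by a dict instead of the Nova DB.
  instances = {'vm-live': {'name': 'ok'}}

  def get_instance(uuid):
      try:
          return instances[uuid]
      except KeyError:
          raise InstanceNotFound(uuid)

  migrations = [{'instance_uuid': 'vm-live'},
                {'instance_uuid': 'vm-deleted'}]
  usable = update_usage_from_migrations(migrations, get_instance)
  assert len(usable) == 1
  ```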

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1444300/+subscriptions