yahoo-eng-team team mailing list archive

[Bug 1897716] [NEW] Down state host rejoins cluster failed

Public bug reported:

Description
===========
When a host in the down state rejoins the cluster, the nova-compute service fails to start. Here is the log:


ERROR oslo_service.service Traceback (most recent call last):
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_service/service.py", line 807, in run_service
ERROR oslo_service.service     service.start()
ERROR oslo_service.service   File "/home/stack/nova/nova/service.py", line 159, in start
ERROR oslo_service.service     self.manager.init_host()
ERROR oslo_service.service   File "/home/stack/nova/nova/compute/manager.py", line 1439, in init_host
ERROR oslo_service.service     context, nodes_by_uuid)
ERROR oslo_service.service   File "/home/stack/nova/nova/compute/manager.py", line 746, in _destroy_evacuated_instances
ERROR oslo_service.service     bdi, destroy_disks)
ERROR oslo_service.service   File "/home/stack/nova/nova/virt/libvirt/driver.py", line 1342, in destroy
ERROR oslo_service.service     destroy_disks)
ERROR oslo_service.service   File "/home/stack/nova/nova/virt/libvirt/driver.py", line 1414, in cleanup
ERROR oslo_service.service     cleanup_instance_disks=cleanup_instance_disks)
ERROR oslo_service.service   File "/home/stack/nova/nova/virt/libvirt/driver.py", line 1493, in _cleanup
ERROR oslo_service.service     instance.save()
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 210, in wrapper
ERROR oslo_service.service     ctxt, self, fn.__name__, args, kwargs)
ERROR oslo_service.service   File "/home/stack/nova/nova/conductor/rpcapi.py", line 248, in object_action
ERROR oslo_service.service     objmethod=objmethod, args=args, kwargs=kwargs)
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 179, in call
ERROR oslo_service.service     transport_options=self.transport_options)
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_messaging/transport.py", line 128, in _send
ERROR oslo_service.service     transport_options=transport_options)
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 654, in send
ERROR oslo_service.service     transport_options=transport_options)
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 644, in _send
ERROR oslo_service.service     raise result
ERROR oslo_service.service nova.exception_Remote.InstanceNotFound_Remote: Instance dd7d0109-34c5-4800-b8a1-0e28b208f75e could not be found.
ERROR oslo_service.service Traceback (most recent call last):
ERROR oslo_service.service 
ERROR oslo_service.service   File "/home/stack/nova/nova/conductor/manager.py", line 139, in _object_dispatch
ERROR oslo_service.service     return getattr(target, method)(*args, **kwargs)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
ERROR oslo_service.service     return fn(self, *args, **kwargs)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/home/stack/nova/nova/objects/instance.py", line 838, in save
ERROR oslo_service.service     columns_to_join=_expected_cols(expected_attrs))
ERROR oslo_service.service 
ERROR oslo_service.service   File "/home/stack/nova/nova/db/api.py", line 685, in instance_update_and_get_original
ERROR oslo_service.service     expected=expected)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/home/stack/nova/nova/db/sqlalchemy/api.py", line 179, in wrapper
ERROR oslo_service.service     return f(*args, **kwargs)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_db/api.py", line 154, in wrapper
ERROR oslo_service.service     ectxt.value = e.inner_exc
ERROR oslo_service.service 
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
ERROR oslo_service.service     self.force_reraise()
ERROR oslo_service.service 
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
ERROR oslo_service.service     six.reraise(self.type_, self.value, self.tb)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/six.py", line 703, in reraise
ERROR oslo_service.service     raise value
ERROR oslo_service.service 
ERROR oslo_service.service   File "/usr/local/lib/python3.6/site-packages/oslo_db/api.py", line 142, in wrapper
ERROR oslo_service.service     return f(*args, **kwargs)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/home/stack/nova/nova/db/sqlalchemy/api.py", line 222, in wrapped
ERROR oslo_service.service     return f(context, *args, **kwargs)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/home/stack/nova/nova/db/sqlalchemy/api.py", line 2106, in instance_update_and_get_original
ERROR oslo_service.service     columns_to_join=columns_to_join)
ERROR oslo_service.service 
ERROR oslo_service.service   File "/home/stack/nova/nova/db/sqlalchemy/api.py", line 1234, in _instance_get_by_uuid
ERROR oslo_service.service     raise exception.InstanceNotFound(instance_id=uuid)
ERROR oslo_service.service 
ERROR oslo_service.service nova.exception.InstanceNotFound: Instance dd7d0109-34c5-4800-b8a1-0e28b208f75e could not be found.
ERROR oslo_service.service 
ERROR oslo_service.service 
DEBUG oslo_concurrency.lockutils [None req-cdd7334f-ef15-45cd-b732-d86e59fe40f6 None None] Acquired lock "singleton_lock" {{(pid=122256) lock /usr/local/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:266}}
DEBUG oslo_concurrency.lockutils [None req-cdd7334f-ef15-45cd-b732-d86e59fe40f6 None None] Releasing lock "singleton_lock" {{(pid=122256) lock /usr/local/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:282}} 

Steps to reproduce
==================
* Force down a compute node, or start with a host that is already down.
* nova evacuate {instance_id}  # the instance must have been running on the down host
* nova delete {instance_id}
* nova service-force-down --unset {service_id}
* Restart the nova-compute service on the host
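
Reading the traceback together with the steps above, the failure appears to come from the startup cleanup of evacuated instances: the cleanup path ends in instance.save(), and the instance record has already been deleted. The following is a minimal, self-contained sketch of that failure mode and of one possible guard. It is not nova code and not a proposed patch; FakeInstance, destroy_local_guest and both destroy_evacuated_* functions are hypothetical names, and InstanceNotFound here is a local stand-in for nova.exception.InstanceNotFound.

```python
class InstanceNotFound(Exception):
    """Local stand-in for nova.exception.InstanceNotFound (assumption)."""


class FakeInstance:
    """Hypothetical stand-in for an instance record; not nova's Instance."""

    def __init__(self, uuid, deleted=False):
        self.uuid = uuid
        self.deleted = deleted

    def save(self):
        # Mirrors the failing call in the traceback: saving a record that
        # was already deleted raises InstanceNotFound.
        if self.deleted:
            raise InstanceNotFound(
                f"Instance {self.uuid} could not be found.")


def destroy_local_guest(instance):
    """Stand-in for the driver cleanup path, which ends in instance.save()."""
    instance.save()


def destroy_evacuated_unguarded(instances):
    """Startup cleanup with no error handling: one deleted instance aborts it."""
    cleaned = []
    for inst in instances:
        destroy_local_guest(inst)
        cleaned.append(inst.uuid)
    return cleaned


def destroy_evacuated_guarded(instances):
    """Same loop, but a missing record is skipped instead of aborting startup."""
    cleaned, skipped = [], []
    for inst in instances:
        try:
            destroy_local_guest(inst)
            cleaned.append(inst.uuid)
        except InstanceNotFound:
            skipped.append(inst.uuid)
    return cleaned, skipped


if __name__ == "__main__":
    evacuated = [FakeInstance("uuid-a"), FakeInstance("uuid-b", deleted=True)]
    try:
        destroy_evacuated_unguarded(evacuated)
    except InstanceNotFound as exc:
        print("unguarded startup aborted:", exc)
    print("guarded:", destroy_evacuated_guarded(evacuated))
```

Whether skipping the missing instance is the right behaviour for nova itself is an open question; the sketch only shows why a single deleted instance is enough to stop the service from starting.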

Expected result
===============
Host rejoins the cluster.

Actual result
=============
The nova-compute service fails to start with InstanceNotFound (see the log in the description).

Environment
===========
The bug was triggered in the Q release and still exists in my devstack (master) environment.

** Affects: nova
     Importance: Undecided
     Assignee: wangzhh (wangzhh)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => wangzhh (wangzhh)

** Summary changed:

- Down state host rejoin cluster failed
+ Down state host rejoins cluster failed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1897716

Title:
  Down state host rejoins cluster failed

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1897716/+subscriptions