yahoo-eng-team team mailing list archive

[Bug 1371587] [NEW] MessagingTimeout errors in unit tests

 

Public bug reported:

MessagingTimeout errors can be seen all over the unit test logs. At least
some of them are caused by tests that fail to mock calls to the conductor
API method build_instances(), which spawns new threads to handle such
builds. The timeouts happen when a call to the scheduler gets no reply
within the configured RPC timeout of 60 seconds.

This does not actually cause any test failures, but it makes debugging
harder because the errors show up at random places in the logs.
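
For illustration only, here is a minimal sketch of the kind of mocking
that is missing. FakeConductorAPI and FakeComputeAPI are hypothetical
stand-ins, not nova classes, and the sketch uses unittest.mock from the
Python 3 standard library (the external mock package offers the same
calls on Python 2.7). The point is that the test patches
build_instances() so the real method, and anything it would spawn, is
never reached.

```python
import unittest
from unittest import mock


class FakeConductorAPI(object):
    """Hypothetical stand-in for the conductor API; not nova code."""

    def build_instances(self, context, instances):
        # The real method would hand the build off to another thread and
        # end up in an RPC call that nobody answers in a unit test.
        raise RuntimeError('unmocked build_instances() reached')


class FakeComputeAPI(object):
    """Hypothetical caller that delegates builds to the conductor."""

    def __init__(self):
        self.conductor_api = FakeConductorAPI()

    def create(self, context, instances):
        self.conductor_api.build_instances(context, instances=instances)
        return instances


class CreateTestCase(unittest.TestCase):

    def test_create_does_not_reach_conductor(self):
        api = FakeComputeAPI()
        # Replace the method for the duration of the call only.
        with mock.patch.object(api.conductor_api,
                               'build_instances') as mock_build:
            result = api.create('ctxt', ['inst-1'])
        self.assertEqual(['inst-1'], result)
        mock_build.assert_called_once_with('ctxt', instances=['inst-1'])
```

Asserting on the mock also documents that the code under test is
expected to call build_instances(), which a silent background failure
never did.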

A typical error looks like this:

Traceback (most recent call last):
  File "nova/conductor/manager.py", line 614, in build_instances
    request_spec, filter_properties)
  File "nova/scheduler/client/__init__.py", line 49, in select_destinations
    context, request_spec, filter_properties)
  File "nova/scheduler/client/__init__.py", line 35, in __run_method
    return getattr(self.instance, __name)(*args, **kwargs)
  File "nova/scheduler/client/query.py", line 34, in select_destinations
    context, request_spec, filter_properties)
  File "nova/scheduler/rpcapi.py", line 107, in select_destinations
    request_spec=request_spec, filter_properties=filter_properties)
  File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 152, in call
    retry=self.retry)
  File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send
    timeout=timeout, retry=retry)
  File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_fake.py", line 194, in send
    return self._send(target, ctxt, message, wait_for_reply, timeout)
  File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_fake.py", line 186, in _send
    'No reply on topic %s' % target.topic)
MessagingTimeout: No reply on topic scheduler
WARNING [nova.scheduler.driver] Setting instance to ERROR state.

This is followed by an attempt to set the instance to ERROR state, which
fails because the instance does not exist in the database.

Traceback (most recent call last):
  File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 455, in fire_timers
    timer()
  File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "nova/utils.py", line 949, in wrapper
    return func(*args, **kwargs)
  File "nova/conductor/manager.py", line 618, in build_instances
    instance.uuid, request_spec)
  File "nova/scheduler/driver.py", line 67, in handle_schedule_error
    'task_state': None})
  File "nova/db/api.py", line 746, in instance_update_and_get_original
    columns_to_join=columns_to_join)
  File "nova/db/sqlalchemy/api.py", line 143, in wrapper
    return f(*args, **kwargs)
  File "nova/db/sqlalchemy/api.py", line 2282, in instance_update_and_get_original
    columns_to_join=columns_to_join)
  File "nova/db/sqlalchemy/api.py", line 2320, in _instance_update
    columns_to_join=columns_to_join)
  File "nova/db/sqlalchemy/api.py", line 1713, in _instance_get_by_uuid
    raise exception.InstanceNotFound(instance_id=uuid)
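
The chain of the two errors above can be reproduced in miniature. This
is only a sketch under stated assumptions: the exception classes and
functions below are simplified stand-ins defined here, a plain dict
plays the role of the database, and the 'vm_state' value is assumed
(the traceback only shows 'task_state': None).

```python
class MessagingTimeout(Exception):
    """Stand-in for the messaging library's timeout exception."""


class InstanceNotFound(Exception):
    """Stand-in for nova's exception of the same name."""


def select_destinations():
    # No scheduler is listening in a unit test, so the call times out.
    raise MessagingTimeout('No reply on topic scheduler')


def instance_update(db, uuid, values):
    # The update needs an existing row, like the database layer does.
    if uuid not in db:
        raise InstanceNotFound(uuid)
    db[uuid].update(values)


def build_instance(db, uuid):
    try:
        select_destinations()
    except MessagingTimeout:
        # Error path: try to mark the instance as failed.
        instance_update(db, uuid,
                        {'vm_state': 'error', 'task_state': None})


# The test never created the instance, so the error path fails as well.
empty_db = {}
try:
    build_instance(empty_db, 'fake-uuid')
except InstanceNotFound as exc:
    second_error = exc

# With the row present, the same path succeeds and records the state.
db = {'fake-uuid': {'vm_state': 'building', 'task_state': 'scheduling'}}
build_instance(db, 'fake-uuid')
```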

** Affects: nova
     Importance: Undecided
     Assignee: Hans Lindgren (hanlind)
         Status: New


** Tags: testing

** Description changed:

  These can be seen all over the unit test logs. At least some of them are
  caused by tests failing to mock calls to conductor api method
- create_instances(), which is spawning new threads to handle such
- creations. The timeouts happen when calls to scheduler gets no reply
- within the configured rpc timeout of 60 secs.
+ build_instances(), which is spawning new threads to handle such builds.
+ The timeouts happen when calls to scheduler gets no reply within the
+ configured rpc timeout of 60 secs.
  
  This is not actually causing any test failures but makes debugging
  harder since errors show up randomly in logs.
  
  A typical error looks like this:
  
  Traceback (most recent call last):
-   File "nova/conductor/manager.py", line 614, in build_instances
-     request_spec, filter_properties)
-   File "nova/scheduler/client/__init__.py", line 49, in select_destinations
-     context, request_spec, filter_properties)
-   File "nova/scheduler/client/__init__.py", line 35, in __run_method
-     return getattr(self.instance, __name)(*args, **kwargs)
-   File "nova/scheduler/client/query.py", line 34, in select_destinations
-     context, request_spec, filter_properties)
-   File "nova/scheduler/rpcapi.py", line 107, in select_destinations
-     request_spec=request_spec, filter_properties=filter_properties)
-   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 152, in call
-     retry=self.retry)
-   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send
-     timeout=timeout, retry=retry)
-   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_fake.py", line 194, in send
-     return self._send(target, ctxt, message, wait_for_reply, timeout)
-   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_fake.py", line 186, in _send
-     'No reply on topic %s' % target.topic)
+   File "nova/conductor/manager.py", line 614, in build_instances
+     request_spec, filter_properties)
+   File "nova/scheduler/client/__init__.py", line 49, in select_destinations
+     context, request_spec, filter_properties)
+   File "nova/scheduler/client/__init__.py", line 35, in __run_method
+     return getattr(self.instance, __name)(*args, **kwargs)
+   File "nova/scheduler/client/query.py", line 34, in select_destinations
+     context, request_spec, filter_properties)
+   File "nova/scheduler/rpcapi.py", line 107, in select_destinations
+     request_spec=request_spec, filter_properties=filter_properties)
+   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 152, in call
+     retry=self.retry)
+   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send
+     timeout=timeout, retry=retry)
+   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_fake.py", line 194, in send
+     return self._send(target, ctxt, message, wait_for_reply, timeout)
+   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_fake.py", line 186, in _send
+     'No reply on topic %s' % target.topic)
  MessagingTimeout: No reply on topic scheduler
  WARNING [nova.scheduler.driver] Setting instance to ERROR state.
  
  Then followed by an attempt to set the instance to ERROR state, which
  fails since the instance does not exist in the database.
  
  Traceback (most recent call last):
-   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 455, in fire_timers
-     timer()
-   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/eventlet/hubs/timer.py", line 58, in __call__
-     cb(*args, **kw)
-   File "nova/utils.py", line 949, in wrapper
-     return func(*args, **kwargs)
-   File "nova/conductor/manager.py", line 618, in build_instances
-     instance.uuid, request_spec)
-   File "nova/scheduler/driver.py", line 67, in handle_schedule_error
-     'task_state': None})
-   File "nova/db/api.py", line 746, in instance_update_and_get_original
-     columns_to_join=columns_to_join)
-   File "nova/db/sqlalchemy/api.py", line 143, in wrapper
-     return f(*args, **kwargs)
-   File "nova/db/sqlalchemy/api.py", line 2282, in instance_update_and_get_original
-     columns_to_join=columns_to_join)
-   File "nova/db/sqlalchemy/api.py", line 2320, in _instance_update
-     columns_to_join=columns_to_join)
-   File "nova/db/sqlalchemy/api.py", line 1713, in _instance_get_by_uuid
-     raise exception.InstanceNotFound(instance_id=uuid)
+   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 455, in fire_timers
+     timer()
+   File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/eventlet/hubs/timer.py", line 58, in __call__
+     cb(*args, **kw)
+   File "nova/utils.py", line 949, in wrapper
+     return func(*args, **kwargs)
+   File "nova/conductor/manager.py", line 618, in build_instances
+     instance.uuid, request_spec)
+   File "nova/scheduler/driver.py", line 67, in handle_schedule_error
+     'task_state': None})
+   File "nova/db/api.py", line 746, in instance_update_and_get_original
+     columns_to_join=columns_to_join)
+   File "nova/db/sqlalchemy/api.py", line 143, in wrapper
+     return f(*args, **kwargs)
+   File "nova/db/sqlalchemy/api.py", line 2282, in instance_update_and_get_original
+     columns_to_join=columns_to_join)
+   File "nova/db/sqlalchemy/api.py", line 2320, in _instance_update
+     columns_to_join=columns_to_join)
+   File "nova/db/sqlalchemy/api.py", line 1713, in _instance_get_by_uuid
+     raise exception.InstanceNotFound(instance_id=uuid)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1371587

Title:
  MessagingTimeout errors in unit tests

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1371587/+subscriptions

