
yahoo-eng-team team mailing list archive

[Bug 1547544] Re: heat: MessagingTimeout: Timed out waiting for a reply to message ID

 

I suspect you have simply overloaded the system, so these requests are
taking too long to complete. dstat output collected during the run would
help confirm that.

** Also affects: oslo.messaging
   Importance: Undecided
       Status: New

** Changed in: nova
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1547544

Title:
  heat: MessagingTimeout: Timed out waiting for a reply to message ID

Status in OpenStack Compute (nova):
  Incomplete
Status in oslo.messaging:
  New

Bug description:
  Setup:

  Single controller (48 GB RAM, 16 vCPUs, 120 GB disk)
  3 network nodes
  100 ESX hypervisors distributed across 10 nova-compute nodes

  Test:

  1. Create a /16 network
  2. Use a Heat template that launches 100 instances on the network created in step 1
  3. Create 10 stacks back to back, without waiting for the previous stack to complete, so that we reach 1000 instances
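
The attached template is not reproduced in this notification. As a rough sketch of what steps 2 and 3 above might look like, the snippet below builds a HOT template as a Python dict, using an OS::Heat::ResourceGroup of 100 OS::Nova::Server resources, and generates names for 10 stacks. The network, image, and flavor names, the stack name prefix, and both helper functions are assumptions made for illustration; they are not taken from the bug report.

```python
import json


def build_template(instance_count, network, image, flavor):
    """Return a HOT template (as a dict) that boots `instance_count` servers.

    All argument values are placeholders chosen by the caller; the real
    template attached to the bug may differ.
    """
    return {
        "heat_template_version": "2015-04-30",
        "resources": {
            "servers": {
                "type": "OS::Heat::ResourceGroup",
                "properties": {
                    "count": instance_count,
                    "resource_def": {
                        "type": "OS::Nova::Server",
                        "properties": {
                            # %index% is replaced by Heat with the member index
                            "name": "vm-%index%",
                            "image": image,
                            "flavor": flavor,
                            "networks": [{"network": network}],
                        },
                    },
                },
            }
        },
    }


def stack_names(stack_count, prefix="scale-stack"):
    """Return hypothetical names for stacks created back to back."""
    return ["%s-%d" % (prefix, i) for i in range(1, stack_count + 1)]


# Hypothetical inputs: a /16 network named "net-16", a small image and flavor.
template = build_template(100, "net-16", "cirros", "m1.tiny")
names = stack_names(10)
group = template["resources"]["servers"]["properties"]
total_instances = len(names) * group["count"]

# JSON is valid YAML, so the serialized dict can be saved as a template file.
template_json = json.dumps(template, indent=2)
```

Each generated name would be used for one stack-create request issued without waiting for the previous one to finish, which is what produces the burst of 1000 instance boots described in the test.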

  Observation:

  Stack creations fail while nova is running its periodic tasks
  (run_periodic_tasks), at different points such as
  _heal_instance_info_cache, _sync_scheduler_instance_info,
  _update_available_resource, etc.

  I have attached a sample Heat template, the Heat logs, and the
  nova-compute log from one of the hosts.

  
  Logs:

  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     return f(*args, **kwargs)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 553, in _update_available_resource
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     context, self.host, self.nodename)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 174, in wrapper
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     args, kwargs)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/opt/stack/nova/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     args=args, kwargs=kwargs)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     retry=self.retry)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     timeout=timeout, retry=retry)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 465, in send
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     retry=retry)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 454, in _send
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     result = self._waiter.wait(msg_id, timeout)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 337, in wait
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     message = self.waiters.get(msg_id, timeout=timeout)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 239, in get
  2016-02-19 04:21:54.691 TRACE nova.compute.manager     'to message ID %s' % msg_id)
  2016-02-19 04:21:54.691 TRACE nova.compute.manager MessagingTimeout: Timed out waiting for a reply to message ID a87a7f358a0948efa3ab5beb0c8f45e7
  --
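
The final frames of the traceback show the caller blocking on a per-message reply and raising once the timeout expires. The snippet below is a simplified illustration of that wait-for-reply pattern only. It is not the oslo.messaging implementation, and the ReplyWaiters class, the call helper, and the local MessagingTimeout exception are stand-ins invented for this sketch.

```python
import queue
import threading


class MessagingTimeout(Exception):
    """Local stand-in for the exception named in the traceback."""


class ReplyWaiters(object):
    """Keeps one reply queue per outstanding message ID (illustrative only)."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def add(self, msg_id):
        with self._lock:
            self._queues[msg_id] = queue.Queue()

    def put(self, msg_id, reply):
        # Called when a reply arrives; replies for unknown IDs are dropped.
        with self._lock:
            q = self._queues.get(msg_id)
        if q is not None:
            q.put(reply)

    def get(self, msg_id, timeout):
        with self._lock:
            q = self._queues[msg_id]
        try:
            # Block until a reply arrives or the timeout elapses.
            return q.get(timeout=timeout)
        except queue.Empty:
            raise MessagingTimeout(
                "Timed out waiting for a reply to message ID %s" % msg_id)
        finally:
            with self._lock:
                self._queues.pop(msg_id, None)


def call(waiters, msg_id, timeout, responder=None):
    """Register a waiter, optionally start a fake responder, wait for a reply."""
    waiters.add(msg_id)
    if responder is not None:
        threading.Thread(target=responder, args=(waiters, msg_id)).start()
    return waiters.get(msg_id, timeout)
```

If the service on the other end is too busy to answer within the timeout, the caller sees the timeout even though the request may still be processed later, which is consistent with the overload explanation in the comment above. In a real deployment the wait is bounded by the oslo.messaging rpc_response_timeout option.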

  
  stack@esx-compute-9:/opt/stack/nova$ git log -1
  commit d51c5670d8d26e989d92eb29658eed8113034c0f
  Merge: 4fade90 30d5d80
  Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
  Date:   Thu Feb 18 17:56:32 2016 +0000

      Merge "reset task_state after select_destinations failed."
  stack@esx-compute-9:/opt/stack/nova$

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1547544/+subscriptions

