yahoo-eng-team team mailing list archive
Message #46521
[Bug 1547544] [NEW] heat: MessagingTimeout: Timed out waiting for a reply to message ID
Public bug reported:
Setup:
Single controller [48 GB RAM, 16 vCPU, 120 GB disk]
3 network nodes
100 ESX hypervisors distributed across 10 nova-compute nodes
Test:
1. Create a /16 network
2. Use a Heat template which will launch 100 instances on the network created in step 1
3. Create 10 stacks back to back, without waiting for the previous stack to complete, so that we reach 1000 instances
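The test steps above can be sketched as a small Python driver. This is only an illustration, not the script used for this report: the `openstack stack create` invocation, the template file name and the stack name prefix are all assumptions. The command is issued without a wait option, so each stack is requested while the previous ones are still being built.

```python
import subprocess


def build_stack_commands(template_path, count=10, prefix="scale-stack"):
    """Return one stack-create command per stack (names are hypothetical)."""
    return [
        ["openstack", "stack", "create", "-t", template_path, f"{prefix}-{i}"]
        for i in range(1, count + 1)
    ]


def launch_back_to_back(commands, dry_run=True):
    """Issue the commands one after another without polling stack status."""
    for cmd in commands:
        if dry_run:
            # Only show what would be run; nothing is sent to Heat.
            print(" ".join(cmd))
        else:
            # Returns once the CLI exits, not when the stack is complete.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "100-instances.yaml" is a placeholder for the attached template.
    launch_back_to_back(build_stack_commands("100-instances.yaml"))
```

With dry_run left at True the script only prints the ten commands, which is a safe way to check the names before pointing it at a real deployment.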
Observation:
Stack creations fail while nova is in run_periodic_tasks. The failures
show up at different places, such as _heal_instance_info_cache,
_sync_scheduler_instance_info and _update_available_resource.
I have attached a sample Heat template, the Heat logs, and the nova-compute
log from one of the hosts.
Logs:
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
2016-02-19 04:21:54.691 TRACE nova.compute.manager return f(*args, **kwargs)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 553, in _update_available_resource
2016-02-19 04:21:54.691 TRACE nova.compute.manager context, self.host, self.nodename)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 174, in wrapper
2016-02-19 04:21:54.691 TRACE nova.compute.manager args, kwargs)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/opt/stack/nova/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
2016-02-19 04:21:54.691 TRACE nova.compute.manager args=args, kwargs=kwargs)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2016-02-19 04:21:54.691 TRACE nova.compute.manager retry=self.retry)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2016-02-19 04:21:54.691 TRACE nova.compute.manager timeout=timeout, retry=retry)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 465, in send
2016-02-19 04:21:54.691 TRACE nova.compute.manager retry=retry)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 454, in _send
2016-02-19 04:21:54.691 TRACE nova.compute.manager result = self._waiter.wait(msg_id, timeout)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 337, in wait
2016-02-19 04:21:54.691 TRACE nova.compute.manager message = self.waiters.get(msg_id, timeout=timeout)
2016-02-19 04:21:54.691 TRACE nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 239, in get
2016-02-19 04:21:54.691 TRACE nova.compute.manager 'to message ID %s' % msg_id)
2016-02-19 04:21:54.691 TRACE nova.compute.manager MessagingTimeout: Timed out waiting for a reply to message ID a87a7f358a0948efa3ab5beb0c8f45e7
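To see which nova code paths hit the timeout across a whole nova-compute log, the trace lines can be grouped per MessagingTimeout. The following is a rough sketch that assumes the log layout shown above (timestamp, TRACE, logger name, message); the function name and the "/nova/" path filter are choices made for this example, not part of nova.

```python
import re

# Matches the final line of a trace such as the one above.
TIMEOUT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+TRACE\s+\S+\s+"
    r"MessagingTimeout: Timed out waiting for a reply to message ID "
    r"(?P<msg_id>[0-9a-f]+)"
)
# Matches the 'File "...", line N, in func' frame lines.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line \d+, in (?P<func>\w+)')


def summarize_timeouts(lines):
    """Return (timestamp, message_id, nova_functions) for each timeout."""
    nova_funcs = []
    results = []
    for line in lines:
        frame = FRAME_RE.search(line)
        if frame:
            # Keep only frames from the nova tree, skipping oslo libraries.
            if "/nova/" in frame.group("path"):
                nova_funcs.append(frame.group("func"))
            continue
        hit = TIMEOUT_RE.search(line)
        if hit:
            results.append((hit.group("ts"), hit.group("msg_id"), nova_funcs))
            nova_funcs = []
    return results
```

Applied to the trace above, this should report the frames in resource_tracker.py and conductor/rpcapi.py together with the message ID, which makes it easier to count how often each periodic task is affected.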
--
stack@esx-compute-9:/opt/stack/nova$ git log -1
commit d51c5670d8d26e989d92eb29658eed8113034c0f
Merge: 4fade90 30d5d80
Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
Date: Thu Feb 18 17:56:32 2016 +0000
Merge "reset task_state after select_destinations failed."
stack@esx-compute-9:/opt/stack/nova$
** Affects: nova
Importance: Undecided
Status: New
** Tags: vmware
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1547544
Title:
heat: MessagingTimeout: Timed out waiting for a reply to message ID
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1547544/+subscriptions