← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1415106] [NEW] gate-tempest-dsvm-large-ops fails with RPC MessagingTimeout in _try_deallocate_network

 

Public bug reported:

http://logs.openstack.org/60/147460/7/gate/gate-tempest-dsvm-large-
ops/12ba360/logs/screen-n-cpu-1.txt.gz?level=TRACE

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiU2V0dGluZyBpbnN0YW5jZSB2bV9zdGF0ZSB0byBFUlJPUlwiIEFORCBtZXNzYWdlOlwiX3RyeV9kZWFsbG9jYXRlX25ldHdvcmtcIiBBTkQgbWVzc2FnZTpcIk1lc3NhZ2luZ1RpbWVvdXRcIiBBTkQgYnVpbGRfbmFtZTpcImdhdGUtdGVtcGVzdC1kc3ZtLWxhcmdlLW9wc1wiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxNDIyMzc3NTI3MzgwLCJtb2RlIjoiIiwiYW5hbHl6ZV9maWVsZCI6IiJ9

61 hits in 7 days, check and gate, all failures.

2015-01-27 10:55:34.025 ERROR nova.compute.manager [req-34f0ef2f-34f0-4b06-9e30-fd72565a3388 TestLargeOpsScenario-405264936 TestLargeOpsScenario-221766999] [instance: 63145652-811e-4124-8ef7-cc64b594336a] Setting instance vm_state to ERROR
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a] Traceback (most recent call last):
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2569, in do_terminate_instance
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._delete_instance(context, instance, bdms, quotas)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/hooks.py", line 149, in inner
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     rv = f(*args, **kwargs)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2538, in _delete_instance
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     quotas.rollback()
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 82, in __exit__
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     six.reraise(self.type_, self.value, self.tb)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2515, in _delete_instance
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._shutdown_instance(context, instance, bdms)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2449, in _shutdown_instance
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._try_deallocate_network(context, instance, requested_networks)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2379, in _try_deallocate_network
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._set_instance_error_state(context, instance)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 82, in __exit__
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     six.reraise(self.type_, self.value, self.tb)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2374, in _try_deallocate_network
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._deallocate_network(context, instance, requested_networks)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 1954, in _deallocate_network
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     context, instance, requested_networks=requested_networks)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/network/api.py", line 47, in wrapped
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     return func(self, context, *args, **kwargs)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/network/api.py", line 292, in deallocate_for_instance
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     requested_networks=requested_networks)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/network/rpcapi.py", line 209, in deallocate_for_instance
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     return cctxt.call(ctxt, 'deallocate_for_instance', **kwargs)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 389, in call
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     return self.prepare().call(ctxt, method, **kwargs)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 152, in call
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     retry=self.retry)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/transport.py", line 90, in _send
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     timeout=timeout, retry=retry)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 436, in send
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     retry=retry)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 425, in _send
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     result = self._waiter.wait(msg_id, timeout)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 328, in wait
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     reply, ending, trylock = self._poll_queue(msg_id, timer)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 267, in _poll_queue
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     message = self.waiters.get(msg_id, timeout=timer.check_return(msg_id))
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 173, in get
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     'to message ID %s' % msg_id)
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a] MessagingTimeout: Timed out waiting for a reply to message ID 2717e1139f6d414dbebbc85164e05f9c
2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a] 


This job uses nova-network, it's doing an rpc call on the network deallocate so if something is holding a lock in the same area would could be timing out that way, need to check the code to see what's all using locks at the same time in these flows.

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: network

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1415106

Title:
  gate-tempest-dsvm-large-ops fails with RPC MessagingTimeout in
  _try_deallocate_network

Status in OpenStack Compute (Nova):
  New

Bug description:
  http://logs.openstack.org/60/147460/7/gate/gate-tempest-dsvm-large-
  ops/12ba360/logs/screen-n-cpu-1.txt.gz?level=TRACE

  http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiU2V0dGluZyBpbnN0YW5jZSB2bV9zdGF0ZSB0byBFUlJPUlwiIEFORCBtZXNzYWdlOlwiX3RyeV9kZWFsbG9jYXRlX25ldHdvcmtcIiBBTkQgbWVzc2FnZTpcIk1lc3NhZ2luZ1RpbWVvdXRcIiBBTkQgYnVpbGRfbmFtZTpcImdhdGUtdGVtcGVzdC1kc3ZtLWxhcmdlLW9wc1wiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxNDIyMzc3NTI3MzgwLCJtb2RlIjoiIiwiYW5hbHl6ZV9maWVsZCI6IiJ9

  61 hits in 7 days, check and gate, all failures.

  2015-01-27 10:55:34.025 ERROR nova.compute.manager [req-34f0ef2f-34f0-4b06-9e30-fd72565a3388 TestLargeOpsScenario-405264936 TestLargeOpsScenario-221766999] [instance: 63145652-811e-4124-8ef7-cc64b594336a] Setting instance vm_state to ERROR
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a] Traceback (most recent call last):
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2569, in do_terminate_instance
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._delete_instance(context, instance, bdms, quotas)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/hooks.py", line 149, in inner
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     rv = f(*args, **kwargs)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2538, in _delete_instance
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     quotas.rollback()
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     six.reraise(self.type_, self.value, self.tb)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2515, in _delete_instance
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._shutdown_instance(context, instance, bdms)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2449, in _shutdown_instance
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._try_deallocate_network(context, instance, requested_networks)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2379, in _try_deallocate_network
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._set_instance_error_state(context, instance)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     six.reraise(self.type_, self.value, self.tb)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 2374, in _try_deallocate_network
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     self._deallocate_network(context, instance, requested_networks)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/compute/manager.py", line 1954, in _deallocate_network
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     context, instance, requested_networks=requested_networks)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/network/api.py", line 47, in wrapped
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     return func(self, context, *args, **kwargs)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/network/api.py", line 292, in deallocate_for_instance
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     requested_networks=requested_networks)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/opt/stack/new/nova/nova/network/rpcapi.py", line 209, in deallocate_for_instance
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     return cctxt.call(ctxt, 'deallocate_for_instance', **kwargs)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 389, in call
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     return self.prepare().call(ctxt, method, **kwargs)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 152, in call
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     retry=self.retry)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/transport.py", line 90, in _send
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     timeout=timeout, retry=retry)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 436, in send
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     retry=retry)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 425, in _send
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     result = self._waiter.wait(msg_id, timeout)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 328, in wait
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     reply, ending, trylock = self._poll_queue(msg_id, timer)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 267, in _poll_queue
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     message = self.waiters.get(msg_id, timeout=timer.check_return(msg_id))
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 173, in get
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a]     'to message ID %s' % msg_id)
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a] MessagingTimeout: Timed out waiting for a reply to message ID 2717e1139f6d414dbebbc85164e05f9c
  2015-01-27 10:55:34.025 32626 TRACE nova.compute.manager [instance: 63145652-811e-4124-8ef7-cc64b594336a] 

  
  This job uses nova-network, it's doing an rpc call on the network deallocate so if something is holding a lock in the same area would could be timing out that way, need to check the code to see what's all using locks at the same time in these flows.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1415106/+subscriptions


Follow ups

References