yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #38166
[Bug 1491949] Re: gate-tempest-dsvm-large-ops fails to deallocate instance network due to rpc timeout
Reviewed: https://review.openstack.org/220525
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=975243189216561f66ca91520495e0c6e2f747e2
Submitter: Jenkins
Branch: master
commit 975243189216561f66ca91520495e0c6e2f747e2
Author: Matt Riedemann <mriedem@xxxxxxxxxx>
Date: Fri Sep 4 14:15:27 2015 +0000
Revert "turn multi host true for nova network by default"
This reverts commit 2e1a91c50b73ca7f46871d3a906ade93bbcac6a7
It looks like this introduced race bug 1491949 in the
gate-tempest-dsvm-large-ops job causing rpc timeouts when
deallocating network information for an instance,
specifically around the dnsmasq callback to release the
fixed IP that the instance was using which triggers the
disassociation between the fixed IP and the instance in the
nova database.
Change-Id: I163cdeea75e92485f241647c69aea0d7456c3258
Closes-Bug: #1491949
** Changed in: devstack
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1491949
Title:
gate-tempest-dsvm-large-ops fails to deallocate instance network due
to rpc timeout
Status in devstack:
Fix Released
Status in OpenStack Compute (nova):
Invalid
Bug description:
http://logs.openstack.org/96/219696/4/check/gate-tempest-dsvm-large-
ops/158f061/logs/screen-n-cpu-1.txt.gz?level=TRACE
2015-09-03 15:11:10.090 ERROR nova.compute.manager [req-ae96c425-a199-472f-a0db-e1b48147bb4c tempest-TestLargeOpsScenario-1690771764 tempest-TestLargeOpsScenario-1474206998] [instance: 195361d7-95c3-4740-825b-1acab707969e] Failed to deallocate network for instance.
2015-09-03 15:11:11.051 ERROR nova.compute.manager [req-ae96c425-a199-472f-a0db-e1b48147bb4c tempest-TestLargeOpsScenario-1690771764 tempest-TestLargeOpsScenario-1474206998] [instance: 195361d7-95c3-4740-825b-1acab707969e] Setting instance vm_state to ERROR
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] Traceback (most recent call last):
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/compute/manager.py", line 2396, in do_terminate_instance
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] self._delete_instance(context, instance, bdms, quotas)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/hooks.py", line 149, in inner
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] rv = f(*args, **kwargs)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/compute/manager.py", line 2375, in _delete_instance
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] quotas.rollback()
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] six.reraise(self.type_, self.value, self.tb)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/compute/manager.py", line 2338, in _delete_instance
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] self._shutdown_instance(context, instance, bdms)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/compute/manager.py", line 2265, in _shutdown_instance
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] self._try_deallocate_network(context, instance, requested_networks)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/compute/manager.py", line 2194, in _try_deallocate_network
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] self._set_instance_obj_error_state(context, instance)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] six.reraise(self.type_, self.value, self.tb)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/compute/manager.py", line 2189, in _try_deallocate_network
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] self._deallocate_network(context, instance, requested_networks)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/compute/manager.py", line 1812, in _deallocate_network
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] context, instance, requested_networks=requested_networks)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/network/api.py", line 49, in wrapped
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] return func(self, context, *args, **kwargs)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/network/api.py", line 297, in deallocate_for_instance
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] requested_networks=requested_networks)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/opt/stack/new/nova/nova/network/rpcapi.py", line 175, in deallocate_for_instance
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] return cctxt.call(ctxt, 'deallocate_for_instance', **kwargs)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] retry=self.retry)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] timeout=timeout, retry=retry)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 431, in send
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] retry=retry)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 420, in _send
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] result = self._waiter.wait(msg_id, timeout)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 318, in wait
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] message = self.waiters.get(msg_id, timeout=timeout)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 223, in get
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] 'to message ID %s' % msg_id)
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e] MessagingTimeout: Timed out waiting for a reply to message ID 20d504c45f1c45e980c0194ac492d531
2015-09-03 15:11:11.051 22635 ERROR nova.compute.manager [instance: 195361d7-95c3-4740-825b-1acab707969e]
I see this in the n-net logs which I think we only do if something
else fails:
http://logs.openstack.org/96/219696/4/check/gate-tempest-dsvm-large-
ops/158f061/logs/screen-n-net.txt.gz#_2015-09-03_15_11_10_664
2015-09-03 15:11:10.664 DEBUG nova.network.manager [req-ae96c425-a199
-472f-a0db-e1b48147bb4c tempest-TestLargeOpsScenario-1690771764
tempest-TestLargeOpsScenario-1474206998] [instance: 195361d7-95c3-4740
-825b-1acab707969e] Explicitly disassociating fixed IP %s from
instance. deallocate_fixed_ip
/opt/stack/new/nova/nova/network/manager.py:1077
https://github.com/openstack/nova/blob/master/nova/network/manager.py#L1068
# NOTE(yufang521247): This is probably a failed dhcp fixed
# ip. DHCPRELEASE packet sent to dnsmasq would not trigger
# dhcp-bridge to run. Thus it is better to disassociate
# such fixed ip here.
fixed_ip_ref = objects.FixedIP.get_by_address(
context, address)
if (instance_uuid == fixed_ip_ref.instance_uuid and
not fixed_ip_ref.leased):
LOG.debug('Explicitly disassociating fixed IP %s from '
'instance.', instance_uuid=instance_uuid)
fixed_ip_ref.disassociate()
To manage notifications about this bug go to:
https://bugs.launchpad.net/devstack/+bug/1491949/+subscriptions
References