
yahoo-eng-team team mailing list archive

[Bug 1611871] [NEW] Timeouts in conductor when updating large sets of security group rules (liberty)

 

Public bug reported:


I have a project with 130+ instances in it.  When I set a 'source group' security rule in that project, the rule is never applied on the compute nodes.

nova-compute logs include timeout errors like the one pasted below.
The timeout only happens in 'big' cases.  If I add a single port to
every instance in the project, everything's fine.  If I add a 'source
group' rule to a project with fewer instances, everything is also fine.
I only get the timeout in the n^2 case (n instances, each needing rules
for all n members of the source group) when n is large.
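
A rough sketch of why this case grows as n^2, assuming (as the stack
trace below suggests) that each instance's rule refresh makes one
get_by_security_group call through conductor and gets back every member
of the source group. The function name is hypothetical and the sketch
only counts records; it says nothing about how long conductor actually
takes.

```python
def records_fetched_per_rule_update(instances_in_group):
    """Count instance records returned by conductor for one rule update.

    Assumption: all n instances refresh their rules, and each refresh
    fetches the full member list (n records) of the source group.
    """
    refreshes = instances_in_group            # one refresh per instance
    records_per_refresh = instances_in_group  # each returns every member
    return refreshes * records_per_refresh


if __name__ == "__main__":
    for n in (10, 50, 130):
        print(n, records_fetched_per_rule_update(n))
```

Under these assumptions a 130-instance project moves 16900 instance
records for a single rule change, against 100 for a 10-instance
project.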

Increasing my rpc_response_timeout setting from the default of 60
seconds makes the issue go away.  From this I conclude that conductor is
not choking; the call really does take longer than 60 seconds.
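
For reference, a minimal sketch of that workaround in nova.conf on the
compute nodes, where nova-compute waits for conductor's reply. The value
180 is an assumption for illustration; the report does not say what
value was actually used.

```ini
[DEFAULT]
# Seconds to wait for an RPC reply before raising MessagingTimeout.
# The default is 60; 180 is an illustrative value, not a recommendation.
rpc_response_timeout = 180
```

nova-compute has to be restarted to pick up the change. This only hides
the slow call; it does not make the refresh any cheaper.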

OpenStack Liberty, KVM, running on Ubuntu Trusty servers with
3.13-series kernels.


Sample stack trace:


2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher [req-7af38b91-cfaa-4739-88ec-cbcf10142653 andrew tools - - -] Exception during message handling: Timed out waiting for a reply to message ID 77988fad9aa940aa929752826bde7cdc
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     executor_callback))
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     executor_callback)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 470, in decorated_function
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 89, in wrapped
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     payload)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 72, in wrapped
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return f(self, context, *args, **kw)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1387, in refresh_instance_security_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return _sync_refresh()
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 254, in inner
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return f(*args, **kwargs)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1382, in _sync_refresh
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return self.driver.refresh_instance_security_rules(instance)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5074, in refresh_instance_security_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     self.firewall_driver.refresh_instance_security_rules(instance)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/firewall.py", line 434, in refresh_instance_security_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     self.do_refresh_instance_rules(instance)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/firewall.py", line 467, in do_refresh_instance_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     ipv4_rules, ipv6_rules = self.instance_rules(instance, network_info)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/firewall.py", line 399, in instance_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     ctxt, rule['grantee_group']))
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 1228, in get_by_security_group
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return cls.get_by_security_group_id(context, security_group.id)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 169, in wrapper
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     args, kwargs)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 229, in object_class_action
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     args, kwargs)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 237, in object_class_action_versions
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     args=args, kwargs=kwargs)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     retry=self.retry)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     timeout=timeout, retry=retry)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 462, in send
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     retry=retry)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 451, in _send
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     result = self._waiter.wait(msg_id, timeout)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 348, in wait
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     message = self.waiters.get(msg_id, timeout=timeout)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 253, in get
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     'to message ID %s' % msg_id)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher MessagingTimeout: Timed out waiting for a reply to message ID 77988fad9aa940aa929752826bde7cdc
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1611871

Title:
  Timeouts in conductor when updating large sets of security group rules
  (liberty)

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1611871/+subscriptions