yahoo-eng-team team mailing list archive

[Bug 1528895] Re: Timeouts in update_device_list (too slow with large # of VIFs)

 

[Expired for neutron because there has been no activity for 60 days.]

** Changed in: neutron
       Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1528895

Title:
  Timeouts in update_device_list (too slow with large # of VIFs)

Status in neutron:
  Expired

Bug description:
  In our environment, we have some large compute nodes with a large
  number of VIFs.  The update_device_list call is made when the agent
  starts up:

  https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L842

  This call takes a very long time because the server side seems to
  loop over each port, contact Nova, and do much more. The default RPC
  timeout of 60 seconds is not enough, and the call fails on a server
  with around 120 VIFs.  After raising the timeout to 120 seconds, it
  seems to work with no problems.

  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-1e6cc46d-eb52-4d99-bd77-bf2e8424a1ea - - - - -] Error while processing VIF ports
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1752, in rpc_loop
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     ovs_restarted)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1507, in process_network_ports
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     self._bind_devices(need_binding_devices)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 847, in _bind_devices
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     self.conf.host)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/neutron/agent/rpc.py", line 179, in update_device_list
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     agent_id=agent_id, host=host)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=self.retry)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     timeout=timeout, retry=retry)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 431, in send
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=retry)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 420, in _send
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     result = self._waiter.wait(msg_id, timeout)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 318, in wait
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     message = self.waiters.get(msg_id, timeout=timeout)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 223, in get
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     'to message ID %s' % msg_id)
  2015-12-23 15:27:27.373 38588 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent MessagingTimeout: Timed out waiting for a reply to message ID c42c1ffc801b41ca89aa4472696bbf1a

  I don't think that an RPC call should ever take that long. The
  neutron-server is not under load, and adding more servers doesn't
  seem to resolve it, because a single RPC responder answers this
  call.
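
One possible mitigation for the timeout described above, sketched here
only as an illustration and not as what neutron does, is to split the
device list into smaller batches so that no single RPC call has to
process every VIF before the timeout expires. The names `chunked`,
`update_in_batches`, `send`, and `batch_size` are hypothetical; `send`
stands in for whatever function performs the real RPC call and is
assumed to return the list of device IDs the server acknowledged.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def update_in_batches(
    devices: Sequence[str],
    send: Callable[[List[str]], List[str]],
    batch_size: int = 20,
) -> List[str]:
    """Call `send` once per batch and collect the acknowledged device IDs.

    `send` is a hypothetical stand-in for the real RPC call. Each batch is
    a separate call, so each one gets its own timeout window.
    """
    acknowledged: List[str] = []
    for batch in chunked(devices, batch_size):
        acknowledged.extend(send(batch))
    return acknowledged
```

With 120 devices and a batch size of 50, this makes three calls of 50,
50, and 20 devices. Batching does not make the server-side work any
faster; it only bounds how much work a single call has to finish within
the timeout.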

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1528895/+subscriptions

