
yahoo-eng-team team mailing list archive

[Bug 2028185] [NEW] Large number of FIPs causes slow sync_routers response

 

Public bug reported:

Description
-----------

When in DVR mode, the sync_routers RPC call (specifically the
_get_dvr_sync_data function) becomes very slow if there are a large
number of FIPs configured for a router.

This appears to be due to it fetching every FIP in the network and then
filtering out the ones that are needed within the Python code (and
sometimes via additional DB calls) rather than only fetching the
required FIPs from the database.
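
To illustrate the difference described above, here is a minimal sketch using an in-memory SQLite database. It is not Neutron's actual code: the `floatingips` table layout, the `router_id` and `host` columns, and all function names are hypothetical and chosen only for the example. The first function fetches every row and filters in Python, the second pushes the filter into the query so only the required rows leave the database.

```python
import sqlite3


def build_db(n_routers=3, fips_per_router=4):
    """Create a toy in-memory table of floating IPs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE floatingips ("
        "id INTEGER PRIMARY KEY, router_id TEXT, host TEXT)"
    )
    rows = [
        (f"router-{r}", f"host-{i % 2}")
        for r in range(n_routers)
        for i in range(fips_per_router)
    ]
    conn.executemany(
        "INSERT INTO floatingips (router_id, host) VALUES (?, ?)", rows
    )
    return conn


def fips_filtered_in_python(conn, router_id, host):
    """Slow pattern: load every FIP, then discard most of them in Python."""
    all_rows = conn.execute(
        "SELECT id, router_id, host FROM floatingips ORDER BY id"
    ).fetchall()
    return [r for r in all_rows if r[1] == router_id and r[2] == host]


def fips_filtered_in_db(conn, router_id, host):
    """Faster pattern: let the database return only the required FIPs."""
    return conn.execute(
        "SELECT id, router_id, host FROM floatingips "
        "WHERE router_id = ? AND host = ? ORDER BY id",
        (router_id, host),
    ).fetchall()


if __name__ == "__main__":
    conn = build_db()
    # Both approaches return the same rows; only the amount of data
    # transferred from the database differs.
    print(fips_filtered_in_python(conn, "router-1", "host-0"))
    print(fips_filtered_in_db(conn, "router-1", "host-0"))
```

Both functions return the same result, but the first one transfers the whole table on every call, so its cost grows with the total number of FIPs rather than with the number actually needed.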


Preconditions
-------------
* Neutron is set up with DVR and multiple hosts
* A network is created with a large number of FIPs (thousands should be enough to make this issue visible)


Step by step reproduction steps
--------------------------------
* Restart the neutron_l3_agent and note the time cost logged when calling the sync_routers RPC method


Expected output
---------------
* This RPC method returns in a reasonable amount of time (10s or less)


Actual output
-------------
* This RPC method returns in 40s or more, causing unnecessary load on the Neutron server


Version
-------
* OpenStack Zed

** Affects: neutron
     Importance: Undecided
     Assignee: Adam Oswick (adamoswick)
         Status: New


** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

** Changed in: neutron
     Assignee: (unassigned) => Adam Oswick (adamoswick)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2028185

Title:
  Large number of FIPs causes slow sync_routers response

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2028185/+subscriptions