yahoo-eng-team team mailing list archive

[Bug 1413314] [NEW] dvr router update does not scale for many fips

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Erik Colnick <erik.colnick@xxxxxx>
Date: Wed, 21 Jan 2015 17:22:15 -0000
Reply-to: Bug 1413314 <1413314@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Public bug reported:

On an installation with 7 compute nodes and distributed routing enabled,
the time it takes to process a sync_routers request on the controller
grows linearly with the number of floating IPs. With 120 floating IPs
associated with VM instances distributed across the compute nodes (all
attached to one router), associating or disassociating a floating IP
has been observed to take over 40 seconds. This was with 3 load-balanced
controller nodes, each configured with 10 RPC worker threads and 10 API
worker threads.

Tracing the logs shows that, as the number of floating IPs associated
with VMs increases, the largest share of the time is spent in the
'_process_floating_ips' method of l3_dvr_db.py, which makes multiple DB
requests per floating IP. The second-largest share is spent in the
get_sync_data method itself, before the call to _process_floating_ips,
where get_vm_port_hostid is called once for each floating IP.
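
To illustrate the access pattern described above, here is a minimal,
self-contained sketch. It is not Neutron code: the two-table schema and
the names make_db, hosts_per_fip, and hosts_batched are hypothetical and
exist only for this example. It contrasts one DB lookup per floating IP
with a single joined query that returns the same data.

```python
import sqlite3


def make_db(num_fips):
    """Build an in-memory DB with one port row per floating IP (toy schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ports (id TEXT PRIMARY KEY, host TEXT)")
    conn.execute("CREATE TABLE floatingips (id TEXT PRIMARY KEY, port_id TEXT)")
    for i in range(num_fips):
        # Spread the ports over 7 hypothetical compute hosts.
        conn.execute("INSERT INTO ports VALUES (?, ?)",
                     (f"port-{i}", f"compute-{i % 7}"))
        conn.execute("INSERT INTO floatingips VALUES (?, ?)",
                     (f"fip-{i}", f"port-{i}"))
    return conn


def hosts_per_fip(conn):
    """Look up the host with one query per floating IP.

    Returns (mapping, query_count); query_count grows with the FIP count.
    """
    fips = conn.execute("SELECT id, port_id FROM floatingips").fetchall()
    queries = 1
    result = {}
    for fip_id, port_id in fips:
        row = conn.execute("SELECT host FROM ports WHERE id = ?",
                           (port_id,)).fetchone()
        queries += 1
        result[fip_id] = row[0]
    return result, queries


def hosts_batched(conn):
    """Fetch the same mapping with a single JOIN, whatever the FIP count."""
    rows = conn.execute(
        "SELECT f.id, p.host FROM floatingips f "
        "JOIN ports p ON p.id = f.port_id"
    ).fetchall()
    return dict(rows), 1


if __name__ == "__main__":
    conn = make_db(120)
    slow, n_slow = hosts_per_fip(conn)
    fast, n_fast = hosts_batched(conn)
    print(n_slow, n_fast, slow == fast)
```

In this toy setup the per-IP version issues one query for the list plus
one per floating IP, while the joined version issues one query in total.
Whether the same batching applies directly to _process_floating_ips or
get_vm_port_hostid would have to be checked against the actual code.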

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1413314

Title:
  dvr router update does not scale for many fips

Status in OpenStack Neutron (virtual network service):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1413314/+subscriptions

Follow ups

[Bug 1413314] Re: dvr router update does not scale for many fips
From: Thierry Carrez, 2015-04-09
[Bug 1413314] [NEW] dvr router update does not scale for many fips
From: Erik Colnick, 2015-01-21
