yahoo-eng-team team mailing list archive
Message #91457
[Bug 2009509] [NEW] Large number of FIPs and subnets causes slow sync_routers response
Public bug reported:
When a large number of subnets and FIPs are configured on a network, the
response time for neutron.api.rpc.handlers.l3_rpc.sync_routers
increases significantly.
Based on profiling data, a large amount of time is spent waiting on
_get_sync_floating_ips
(https://opendev.org/openstack/neutron/src/commit/0a214b0437874fd7f5379ec94fd07ef5d3ff4bbe/neutron/db/l3_db.py#L1879).
ncalls tottime percall cumtime percall filename:lineno(function)
...TRUNCATED...
16/2 0.000 0.000 19.827 9.913 /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/db/l3_db.py:1873(_get_sync_floating_ips)
...TRUNCATED...
In the above example, the total execution time logged for sync_routers
was 26.645s.
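To put the profile row in context, the sketch below recomputes the
per-call figure and the share of the total sync_routers time from the
numbers quoted above. It assumes the row uses the standard Python
cProfile/pstats layout, in which the two leading numbers are the total
and primitive call counts and the cumulative percall is cumtime divided
by the primitive calls. The variable names are illustrative only.

```python
# Figures copied from the profile row and the log line quoted above.
total_calls = 16          # assumed: total calls, including recursive ones
primitive_calls = 2       # assumed: primitive (non-recursive) calls
cumtime_s = 19.827        # cumulative time under _get_sync_floating_ips
sync_routers_s = 26.645   # total logged execution time for sync_routers

# pstats derives the cumulative "percall" column as cumtime / primitive calls.
percall_s = cumtime_s / primitive_calls

# Fraction of the whole sync_routers call spent under _get_sync_floating_ips.
share = cumtime_s / sync_routers_s

print(f"cumulative time per primitive call: {percall_s:.4f}s")
print(f"share of sync_routers time: {share:.1%}")
```

Under these assumptions, roughly three quarters of the logged
sync_routers time falls under _get_sync_floating_ips.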
Further investigation shows that the call to
l3_obj.FloatingIP.get_scoped_floating_ips inside _get_sync_floating_ips
spends a large amount of time mapping SQL results to ORM Python objects.
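One way to check whether time is going into object construction rather
than into the query itself is to time the two separately. The sketch
below is only an illustration of that measurement using the Python
standard library: it is not Neutron or SQLAlchemy code, and the
floatingips table, the FloatingIPRow class and the helper functions are
all hypothetical stand-ins.

```python
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class FloatingIPRow:
    """Hypothetical stand-in for an ORM-mapped floating IP object."""
    id: int
    address: str
    router_id: str


def build_db(n_rows: int) -> sqlite3.Connection:
    """Create an in-memory table holding n_rows illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE floatingips "
        "(id INTEGER PRIMARY KEY, address TEXT, router_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO floatingips (id, address, router_id) VALUES (?, ?, ?)",
        ((i, f"203.0.113.{i % 256}", "router-1") for i in range(n_rows)),
    )
    return conn


def time_fetch(conn: sqlite3.Connection, hydrate: bool):
    """Return (row count, elapsed seconds); optionally build one object per row."""
    start = time.perf_counter()
    rows = conn.execute(
        "SELECT id, address, router_id FROM floatingips"
    ).fetchall()
    if hydrate:
        # This list comprehension is the part being measured: one Python
        # object is constructed for every row returned by the query.
        rows = [FloatingIPRow(*row) for row in rows]
    return len(rows), time.perf_counter() - start


conn = build_db(1000)  # about the number of attached FIPs in the report
raw_count, raw_s = time_fetch(conn, hydrate=False)
obj_count, obj_s = time_fetch(conn, hydrate=True)
print(f"raw tuples:       {raw_count} rows in {raw_s:.6f}s")
print(f"hydrated objects: {obj_count} rows in {obj_s:.6f}s")
```

The absolute timings from this toy example say nothing about Neutron
itself; the point is only that comparing the two measurements separates
query cost from per-row object construction cost.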
Reproduction steps:
- Set up OpenStack with DVR enabled
- Create a network
- Attach a large number of subnets (the above has 27)
- Create a large number of FIPs and attach them to VMs (the above has around 1000 attached FIPs)
- Restart neutron_l3_agent on a compute node and observe slow response times for the get_routers() RPC
Version:
- OpenStack: Zed
- Kernel/distro: N/A
** Affects: neutron
Importance: Undecided
Assignee: Adam (adamoswick)
Status: In Progress
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2009509