yahoo-eng-team team mailing list archive
Message #87961
[Bug 1954763] Re: Creating ports in bulks can be very slow due to many IPAM module conflicts
Reviewed: https://review.opendev.org/c/openstack/neutron/+/821727
Committed: https://opendev.org/openstack/neutron/commit/82aabb0aa962a3c5c5ce5ad1067952d8f3d9f992
Submitter: "Zuul (22348)"
Branch: master
commit 82aabb0aa962a3c5c5ce5ad1067952d8f3d9f992
Author: Slawek Kaplonski <skaplons@xxxxxxxxxx>
Date: Tue Dec 14 12:44:31 2021 +0100
Allocate IPs in bulk requests in separate transactions
In the ML2 plugin's create_port_bulk method, we iterate over the list
of ports to be created and do everything for all ports in a single DB
transaction (which makes total sense, as this is a bulk request).
But one of the things done during that huge transaction was the
allocation of IP addresses for all ports. That action is prone to race
conditions and can fail often, especially when there are not many IP
addresses available in the subnet(s) for the ports.
If IP allocation failed for even one port from the bulk request, the
whole create_port_bulk method was retried, so the allocations (and
everything else) for all ports were reverted and started from scratch.
That takes a lot of time, so some requests could take a very long time
to process, e.g. 2-3 minutes in my tests.
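The cost difference described above can be illustrated with a toy model. This is only a sketch and not Neutron code: Conflict, make_allocator and both bulk_* functions are hypothetical names, and the "allocator" fails on scripted call numbers instead of real DB races. It counts how many allocation attempts are needed when a single conflict hits the last of 10 ports.

```python
class Conflict(Exception):
    """Stand-in for an IP allocation conflict (hypothetical name)."""


def make_allocator(failing_calls):
    """Return (allocate, stats); allocate raises Conflict on the given call numbers."""
    stats = {"calls": 0}

    def allocate(port):
        stats["calls"] += 1
        if stats["calls"] in failing_calls:
            raise Conflict(port)
        return f"ip-for-{port}"

    return allocate, stats


def bulk_single_transaction(ports, allocate):
    """Model of the old behaviour: any conflict restarts the whole bulk."""
    while True:
        try:
            return [allocate(port) for port in ports]
        except Conflict:
            continue  # all allocations done so far are thrown away


def bulk_separate_transactions(ports, allocate):
    """Model of the patched behaviour: only the conflicting port is retried."""
    allocated = []
    for port in ports:
        while True:
            try:
                allocated.append(allocate(port))
                break
            except Conflict:
                continue  # earlier ports keep their addresses
    return allocated


ports = [f"port-{i}" for i in range(10)]

allocate, old_stats = make_allocator(failing_calls={10})
bulk_single_transaction(ports, allocate)
print(old_stats["calls"])  # 20: nine successes lost, then ten more attempts

allocate, new_stats = make_allocator(failing_calls={10})
bulk_separate_transactions(ports, allocate)
print(new_stats["calls"])  # 11: only the failed allocation is repeated
```

In this model the wasted work of the single-transaction variant grows with the size of the bulk request and with the number of conflicts, which is consistent with the long request times reported here.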
To reproduce the issue I wrote a simple script which created a network
with a /24 subnet and then sent 24 requests, each creating 10 ports in
bulk. That was 240 ports in total created in that subnet.
I measured the time needed to create all those ports on the current
master branch (without this patch) and with the patch. The results are
below:
+-----+---------------+------------+---------------------------+
| Run | Master branch | This patch | Simulate bulk by creation |
|     | [mm:ss]       | [mm:ss]    | of 10 ports one by one    |
|     |               |            | [mm:ss]                   |
+-----+---------------+------------+---------------------------+
| 1   | 01:37         | 01:02      | 00:57                     |
| 2   | 02:06         | 00:40      | 01:03                     |
| 3   | 02:08         | 00:41      | 00:59                     |
| 4   | 02:14         | 00:45      | 00:55                     |
| 5   | 01:58         | 00:45      | 00:57                     |
| 6   | 02:37         | 00:53      | 01:05                     |
| 7   | 01:59         | 00:42      | 00:58                     |
| 8   | 02:01         | 00:41      | 00:57                     |
| 9   | 02:39         | 00:42      | 00:55                     |
| 10  | 01:59         | 00:41      | 00:56                     |
+-----+---------------+------------+---------------------------+
| AVG | 02:07         | 00:45      | 00:58                     |
+-----+---------------+------------+---------------------------+
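As a cross-check of the AVG row, the averages can be recomputed from the ten runs in the table. The helper names below are made up for illustration, and the sketch assumes the averages were truncated (not rounded) to whole seconds, which is what matches the reported values.

```python
def to_seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def average_mmss(times):
    """Mean of 'mm:ss' values, truncated to whole seconds."""
    mean = sum(to_seconds(t) for t in times) // len(times)
    return f"{mean // 60:02d}:{mean % 60:02d}"


master = ["01:37", "02:06", "02:08", "02:14", "01:58",
          "02:37", "01:59", "02:01", "02:39", "01:59"]
patched = ["01:02", "00:40", "00:41", "00:45", "00:45",
           "00:53", "00:42", "00:41", "00:42", "00:41"]
one_by_one = ["00:57", "01:03", "00:59", "00:55", "00:57",
              "01:05", "00:58", "00:57", "00:55", "00:56"]

print(average_mmss(master))      # 02:07
print(average_mmss(patched))     # 00:45
print(average_mmss(one_by_one))  # 00:58

# Ratio of total time on master to total time with the patch.
speedup = sum(map(to_seconds, master)) / sum(map(to_seconds, patched))
print(round(speedup, 1))         # 2.8
```

On these numbers the patched code is roughly 2.8 times faster than master on average, and also faster than creating the 10 ports one by one.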
Closes-Bug: #1954763
Change-Id: I8877c658446fed155130add6f1c69f2772113c27
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1954763
Title:
Creating ports in bulks can be very slow due to many IPAM module
conflicts
Status in neutron:
Fix Released
Bug description:
When ports are created in bulk, the ML2 plugin tries to do everything for all ports in a single DB transaction. That means that if IP allocation fails for even one of the ports, the whole create_port_bulk method is retried, so everything is redone for all ports.
That can be very inefficient in some use cases where many ports are created in bulk in the same network.
I did a simple test:
- create 2 networks with a /24 subnet in each network,
- run 24 API requests, each creating 10 ports in bulk, so in total 240 IPs from the subnet will be allocated. All those requests were sent in parallel.
- measure how long it takes for all ports to be created.
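The steps above could be sketched roughly as follows. This is not the original reproducer script, which is not included in this report. The bulk body shape ({"ports": [...]} sent to POST /v2.0/ports) is the Neutron bulk-create format; everything else is an assumption: send is a caller-supplied function that would perform the authenticated HTTP request, and network_id, the port names and the function names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def build_bulk_payload(network_id, request_no, ports_per_request=10):
    """Body of one bulk request, i.e. POST /v2.0/ports with a "ports" list."""
    return {
        "ports": [
            {"network_id": network_id, "name": f"bulk-{request_no}-{i}"}
            for i in range(ports_per_request)
        ]
    }


def run_reproducer(send, network_id, requests=24, ports_per_request=10):
    """Send all bulk requests in parallel; return the slowest request's duration."""

    def timed(request_no):
        payload = build_bulk_payload(network_id, request_no, ports_per_request)
        start = time.monotonic()
        send(payload)  # assumption: blocks until the API replies
        return time.monotonic() - start

    # One worker per request so that all requests start together.
    with ThreadPoolExecutor(max_workers=requests) as pool:
        return max(pool.map(timed, range(requests)))


if __name__ == "__main__":
    # Dry run with a fake sender; a real one would POST to the Neutron API.
    sent = []
    slowest = run_reproducer(sent.append, network_id="NETWORK_ID_PLACEHOLDER")
    print(len(sent), sum(len(body["ports"]) for body in sent))  # 24 240
```

Returning the duration of the slowest request is one possible way to measure the run; any other definition of "all ports created" would need a different measurement.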
I ran my reproducer script 10 times because the IPAM module works in a somewhat random way, so results may differ between runs.
The results of that simple test are below:
Run Execution time
1 00:01:37
2 00:02:06
3 00:02:08
4 00:02:14
5 00:01:58
6 00:02:37
7 00:01:59
8 00:02:01
9 00:02:39
10 00:01:59
AVG 00:02:07
The execution time here is in fact the execution time of the longest
API request (all of them started together). So there were many
retries done internally in Neutron to allocate IP addresses for those
ports, and the client sometimes had to wait more than 150 seconds to
get a reply to a request.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1954763/+subscriptions