
yahoo-eng-team team mailing list archive

[Bug 1954763] Re: Creating ports in bulks can be very slow due to many IPAM module conflicts

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/821727
Committed: https://opendev.org/openstack/neutron/commit/82aabb0aa962a3c5c5ce5ad1067952d8f3d9f992
Submitter: "Zuul (22348)"
Branch:    master

commit 82aabb0aa962a3c5c5ce5ad1067952d8f3d9f992
Author: Slawek Kaplonski <skaplons@xxxxxxxxxx>
Date:   Tue Dec 14 12:44:31 2021 +0100

    Allocate IPs in bulk requests in separate transactions
    
    In the ML2 plugin's create_port_bulk method, we iterate over the list
    of ports to be created and do everything for all ports in a single DB
    transaction (which makes total sense, as this is a bulk request).
    But one of the things done during that huge transaction was the
    allocation of IP addresses for all ports. That action is prone to
    race conditions and can fail often, especially when there are not many
    IP addresses available in the subnet(s) for the ports.
    If allocating an IP address failed for even one port in the bulk
    request, the whole create_port_bulk method was retried, so allocations
    (and everything else) for all ports were reverted and started from
    scratch. That takes a lot of time, so some requests could take very
    long to process, e.g. 2-3 minutes in my tests.
    
    To reproduce the issue I wrote a simple script which created a network
    with a /24 subnet and then sent 24 requests, each creating 10 ports in
    bulk. That was 240 ports in total created in that subnet.
    I measured the time needed to create all those ports on the current
    master branch (without this patch) and with the patch. The results are
    below:
    
    +-----+---------------+------------+---------------------------+
    | Run | Master branch | This patch | Simulate bulk by creation |
    |     | [mm:ss]       | [mm:ss]    | of 10 ports one by one    |
    +-----+---------------+------------+---------------------------+
    | 1   | 01:37         | 01:02      | 00:57                     |
    | 2   | 02:06         | 00:40      | 01:03                     |
    | 3   | 02:08         | 00:41      | 00:59                     |
    | 4   | 02:14         | 00:45      | 00:55                     |
    | 5   | 01:58         | 00:45      | 00:57                     |
    | 6   | 02:37         | 00:53      | 01:05                     |
    | 7   | 01:59         | 00:42      | 00:58                     |
    | 8   | 02:01         | 00:41      | 00:57                     |
    | 9   | 02:39         | 00:42      | 00:55                     |
    | 10  | 01:59         | 00:41      | 00:56                     |
    +-----+---------------+------------+---------------------------+
    | AVG | 02:07         | 00:45      | 00:58                     |
    +-----+---------------+------------+---------------------------+
    
    Closes-Bug: #1954763
    Change-Id: I8877c658446fed155130add6f1c69f2772113c27
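
The cost difference described in the commit message can be illustrated with a toy model. This is only a sketch and not Neutron code: `FlakyAllocator`, `AllocationConflict`, and both `bulk_*` functions are hypothetical names, and conflicts are scripted by call number instead of coming from real races. It counts how many allocation attempts are needed when one conflict forces the whole bulk to start over, compared with retrying only the port that hit the conflict.

```python
from collections import deque


class AllocationConflict(Exception):
    """Stands in for an IPAM conflict on a single IP allocation."""


class FlakyAllocator:
    """Hypothetical allocator that fails on scripted call numbers."""

    def __init__(self, fail_on_calls, pool_size=254):
        self.fail_on_calls = set(fail_on_calls)
        self.calls = 0  # total allocation attempts, including failures
        self.free = deque(f"10.0.0.{host}" for host in range(1, pool_size + 1))

    def allocate(self):
        self.calls += 1
        if self.calls in self.fail_on_calls:
            raise AllocationConflict()
        return self.free.popleft()

    def release(self, ips):
        # Return rolled-back addresses to the front of the pool.
        self.free.extendleft(reversed(ips))


def bulk_single_transaction(allocator, n_ports, max_retries=10):
    """One conflict rolls back every allocation and restarts the bulk."""
    for _ in range(max_retries):
        allocated = []
        try:
            for _ in range(n_ports):
                allocated.append(allocator.allocate())
            return allocated
        except AllocationConflict:
            allocator.release(allocated)
    raise RuntimeError("retries exhausted")


def bulk_per_port_transactions(allocator, n_ports, max_retries=10):
    """A conflict only retries the allocation for the affected port."""
    allocated = []
    for _ in range(n_ports):
        for _ in range(max_retries):
            try:
                allocated.append(allocator.allocate())
                break
            except AllocationConflict:
                continue
        else:
            raise RuntimeError("retries exhausted")
    return allocated


single = FlakyAllocator(fail_on_calls={5, 9})
bulk_single_transaction(single, n_ports=10)

per_port = FlakyAllocator(fail_on_calls={5, 9})
bulk_per_port_transactions(per_port, n_ports=10)

print(single.calls, per_port.calls)
```

With the same two scripted conflicts, the single-transaction variant repeats all the work done before each conflict, while the per-port variant pays only one extra attempt per conflict. The real change also has to deal with the rest of the port creation, which this model ignores.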


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1954763

Title:
  Creating ports in bulks can be very slow due to many IPAM module
  conflicts

Status in neutron:
  Fix Released

Bug description:
  When ports are created in bulk, the ML2 plugin tries to do everything for all ports in a single DB transaction. That means that if IP allocation fails for even one of the ports, the whole create_port_bulk method is retried, redoing everything for all ports.
  That can be very inefficient in some use cases, where many ports are created in bulk in the same network.

  I did a simple test:
  - create 2 networks with a /24 subnet in each network,
  - run 24 API requests, each creating 10 ports in bulk, so that 240 IPs in total are allocated from the subnet. All those requests were sent in parallel.
  - measure how long it takes for all ports to be created.
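
A minimal sketch of how such a test could be driven is shown below. The original reproducer script is not part of this message, so everything here is an assumption: `build_bulk_body`, `timed_call`, and `run_reproducer` are hypothetical helpers, and `send` is a caller-supplied function expected to POST the body to Neutron's bulk port-create endpoint (`/v2.0/ports`) with valid credentials. The sketch sends all requests in parallel and reports the duration of the slowest one.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def build_bulk_body(network_id, ports_per_request=10, prefix="bulk"):
    """Build a bulk port-create body: a list of ports under the "ports" key."""
    return {
        "ports": [
            {"network_id": network_id, "name": f"{prefix}-{i}"}
            for i in range(ports_per_request)
        ]
    }


def timed_call(send, body):
    """Return how long one request took, in seconds."""
    start = time.monotonic()
    send(body)
    return time.monotonic() - start


def run_reproducer(send, network_id, n_requests=24, ports_per_request=10):
    """Send all bulk requests in parallel; return the slowest duration."""
    bodies = [
        build_bulk_body(network_id, ports_per_request, prefix=f"req{r}")
        for r in range(n_requests)
    ]
    # One worker per request, so that all requests start together.
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        durations = list(pool.map(lambda body: timed_call(send, body), bodies))
    return max(durations)
```

Passing `send` in as a parameter keeps the sketch independent of any particular HTTP client or authentication setup.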

  I ran my reproducer script 10 times, as the IPAM module works in a somewhat random way, so results may differ between runs.
  The results of that simple test are below:

  Run	Execution time [hh:mm:ss]
  1	00:01:37
  2	00:02:06
  3	00:02:08
  4	00:02:14
  5	00:01:58
  6	00:02:37
  7	00:01:59
  8	00:02:01
  9	00:02:39
  10	00:01:59
  AVG	00:02:07
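
As a quick check of the AVG row, the ten run times can be converted to seconds and averaged. The helper name `to_seconds` is illustrative only.

```python
def to_seconds(hhmmss):
    """Convert an "hh:mm:ss" string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


runs = [
    "00:01:37", "00:02:06", "00:02:08", "00:02:14", "00:01:58",
    "00:02:37", "00:01:59", "00:02:01", "00:02:39", "00:01:59",
]
seconds = [to_seconds(run) for run in runs]
mean = sum(seconds) / len(seconds)

# 127.8 s on average, i.e. 00:02:07 when truncated to whole seconds.
print(f"mean = {mean:.1f} s, slowest = {max(seconds)} s")
```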

  The execution time here is in fact the execution time of the
  longest API request (all of them started together). So many
  retries were done internally in Neutron to allocate IP addresses for
  those ports, and the client sometimes had to wait more than 150 seconds
  to get a reply to its request.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1954763/+subscriptions


