yahoo-eng-team team mailing list archive
Message #10976
[Bug 1214115] Re: ipavailabilityranges race condition when allocating from same range on multiple neutron-servers
** Changed in: neutron
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1214115
Title:
ipavailabilityranges race condition when allocating from same range on
multiple neutron-servers
Status in OpenStack Neutron (virtual network service):
Fix Released
Bug description:
Let's say that we start with an allocation pool that looks like
this:
+--------------------------------------+----------------+----------------+
| allocation_pool_id                   | first_ip       | last_ip        |
+--------------------------------------+----------------+----------------+
| 0f175416-a378-463b-9a84-18528f396e6f | 192.168.1.10   | 192.168.1.254  |
+--------------------------------------+----------------+----------------+
We then allocate a few of those IPs, let's say 10-19. Our pool now
looks like this:
+--------------------------------------+----------------+----------------+
| allocation_pool_id                   | first_ip       | last_ip        |
+--------------------------------------+----------------+----------------+
| 0f175416-a378-463b-9a84-18528f396e6f | 192.168.1.20   | 192.168.1.254  |
+--------------------------------------+----------------+----------------+
Now we free a few of those IPs, let's say 16, 17 and 18. We now have
this in the db:
+--------------------------------------+----------------+----------------+
| allocation_pool_id                   | first_ip       | last_ip        |
+--------------------------------------+----------------+----------------+
| 0f175416-a378-463b-9a84-18528f396e6f | 192.168.1.16   | 192.168.1.18   |
| 0f175416-a378-463b-9a84-18528f396e6f | 192.168.1.20   | 192.168.1.254  |
+--------------------------------------+----------------+----------------+
The race condition I'm about to describe will probably hamper the
above operation, but that's okay; let's just pretend for the sake of
illustration. Now suppose that I have 2 neutron-servers running: one
gets a request to allocate 192.168.1.16 and the other gets a request
to free 192.168.1.15. Both servers are going to generate UPDATEs to
the DB that look something like this:
SERVER 1: UPDATE ipavailabilityranges SET first_ip = '192.168.1.17' WHERE first_ip = '192.168.1.16'
SERVER 2: UPDATE ipavailabilityranges SET first_ip = '192.168.1.15' WHERE first_ip = '192.168.1.16'
Depending on ordering, how busy your neutron-servers are, and how busy
your database is, one of the above statements is going to fail: once
the first UPDATE lands, no row has first_ip = '192.168.1.16' any more.
That's okay, since the failure is reported up through the API. The
issue we see is that retries also tend to fail, since usually only one
operation affecting a given row in the table ever succeeds. If you
have a very active neutron API with lots of free and allocate
requests, you end up in an unusable state where active periods for the
API are full of errors, and requests get bogged down and fail until
activity stops.
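The two UPDATEs above can be replayed against a throwaway database to see why the second one has nothing left to match. This is only a rough sketch: it uses an in-memory SQLite table with an assumed three-column schema, runs the statements one after the other on a single connection rather than from two real neutron-servers, and the helper name move_first_ip is made up for illustration.

```python
import sqlite3

POOL = "0f175416-a378-463b-9a84-18528f396e6f"

# Assumed, simplified schema; the real table may carry more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ipavailabilityranges ("
    "allocation_pool_id TEXT, first_ip TEXT, last_ip TEXT)"
)
conn.executemany(
    "INSERT INTO ipavailabilityranges VALUES (?, ?, ?)",
    [
        (POOL, "192.168.1.16", "192.168.1.18"),
        (POOL, "192.168.1.20", "192.168.1.254"),
    ],
)
conn.commit()


def move_first_ip(conn, old_first, new_first):
    """Hypothetical helper: issue the UPDATE and report how many rows matched."""
    cur = conn.execute(
        "UPDATE ipavailabilityranges SET first_ip = ? WHERE first_ip = ?",
        (new_first, old_first),
    )
    conn.commit()
    return cur.rowcount


# SERVER 1 allocates 192.168.1.16, so the range should now start at .17.
server1_rows = move_first_ip(conn, "192.168.1.16", "192.168.1.17")
# SERVER 2 frees 192.168.1.15, but still believes the range starts at .16.
server2_rows = move_first_ip(conn, "192.168.1.16", "192.168.1.15")

print(server1_rows, server2_rows)  # prints: 1 0
```

Whichever statement runs second matches zero rows, because its WHERE clause is based on a first_ip value that the other statement has already replaced.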
This is one example of the race condition. There are obviously other
ways to trigger it if you sit down and look at the applicable piece of
code. Some kind of concurrency management is probably in order, but
I'm not sure what the best way to solve this would be...
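One possible form of concurrency management, sketched here purely as an illustration and not as the fix that was released for this bug, is to re-read the range before every attempt and treat an UPDATE that matches zero rows as a signal to retry with fresh data. The schema, the helper name allocate_lowest_ip, the max_attempts parameter and the use of SQLite's rowid for ordering are all assumptions made for this example.

```python
import ipaddress
import sqlite3

POOL = "0f175416-a378-463b-9a84-18528f396e6f"


def allocate_lowest_ip(conn, pool_id, max_attempts=5):
    """Hypothetical helper: take the first IP of the pool's first range.

    Each attempt re-reads the row, so a retry never reuses a stale first_ip.
    """
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT first_ip, last_ip FROM ipavailabilityranges "
            "WHERE allocation_pool_id = ? ORDER BY rowid LIMIT 1",
            (pool_id,),
        ).fetchone()
        if row is None:
            return None  # no free ranges left in this pool
        first_ip, last_ip = row
        key = (pool_id, first_ip, last_ip)
        if first_ip == last_ip:
            # Taking the last IP of a range removes the range entirely.
            cur = conn.execute(
                "DELETE FROM ipavailabilityranges WHERE "
                "allocation_pool_id = ? AND first_ip = ? AND last_ip = ?",
                key,
            )
        else:
            next_ip = str(ipaddress.ip_address(first_ip) + 1)
            cur = conn.execute(
                "UPDATE ipavailabilityranges SET first_ip = ? WHERE "
                "allocation_pool_id = ? AND first_ip = ? AND last_ip = ?",
                (next_ip,) + key,
            )
        conn.commit()
        if cur.rowcount == 1:
            return first_ip
        # Zero rows matched: another writer changed the range, so loop and re-read.
    raise RuntimeError("range kept changing; gave up after %d attempts" % max_attempts)


# Assumed, simplified schema, seeded with the state from the description.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ipavailabilityranges ("
    "allocation_pool_id TEXT, first_ip TEXT, last_ip TEXT)"
)
conn.executemany(
    "INSERT INTO ipavailabilityranges VALUES (?, ?, ?)",
    [
        (POOL, "192.168.1.16", "192.168.1.18"),
        (POOL, "192.168.1.20", "192.168.1.254"),
    ],
)
conn.commit()

allocated = [allocate_lowest_ip(conn, POOL) for _ in range(4)]
print(allocated)
```

A loop like this only helps if each retry starts from the current row contents; under heavy contention it can still give up, so a row lock or a different allocation scheme may be a better fit.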