yahoo-eng-team team mailing list archive

[Bug 1099966] Re: Race condition when rapidly deleting and creating tokens

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Morgan Fainberg <morgan.fainberg@xxxxxxxxx>
Date: Sun, 01 Jun 2014 04:23:39 -0000
Reply-to: Bug 1099966 <1099966@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

We have included microsecond data (should be unique per token except in
some fairly narrow scenarios). Further improvements are likely to
require a lot of work for not a lot of benefit.

At this point I don't think we're seeing much of this error in either
test or real deployments, so I'm marking this as "Won't Fix". We can
revisit this later if the problem resurfaces.

** Changed in: keystone
Status: Triaged => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Keystone.
https://bugs.launchpad.net/bugs/1099966

Title:
Race condition when rapidly deleting and creating tokens

Status in OpenStack Identity (Keystone):
Won't Fix

Bug description:
Environment: the token backend is SQL and PKI is enabled, in a
multi-node setup with the database on a separate node from the
keystone server.

The symptom looks like this:

http://paste.openstack.org/show/29472/

This occurs on random tests in Tempest's identity admin tests, but it
occurs consistently when the tests are run in a PKI environment with
the database server on a separate node. It apparently does not occur
when using devstack, which has a local MySQL instance (and may have
some caching enabled?)

The race condition happens like so:

Thread 1:

POST /tokens with auth data, passing the token matching the PKI CMS
record for a user

Hits this block of code:

https://github.com/openstack/keystone/blob/master/keystone/token/controllers.py#L124 [1]

The call to token_api.create_token() fails with an IntegrityError from
SQLAlchemy. This is a planned-for event, apparently, as the code on
line 132 [2] catches Exception, with the following in-line code
comment:

# an identical token may have been created already.
# if so, return the token_data as it is also identical

Now in Thread 2:

A call to DELETE /tokens (or possibly some token expiration code?)
proceeds to delete the same token for the user that just resulted in
the IntegrityError raised in thread 1.

Back in Thread 1:

The call to token_api.get_token() now fails with a NotFound exception,
which causes the original exception (IntegrityError) to be re-raised
and sent back across the wire to the end-user.
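
The interleaving above can be reproduced deterministically without a
database. The following is only a sketch: FakeTokenBackend, its
after_failed_create hook, and authenticate() are hypothetical stand-ins
invented for illustration, not keystone's token_api or controller code,
and the two exception classes merely imitate SQLAlchemy's IntegrityError
and keystone's NotFound.

```python
class IntegrityError(Exception):
    """Stand-in for SQLAlchemy's IntegrityError (duplicate key on insert)."""


class NotFound(Exception):
    """Stand-in for the token-not-found exception."""


class FakeTokenBackend:
    """Hypothetical in-memory token store, used only to show the race."""

    def __init__(self):
        self.tokens = {}
        # Hook that runs just before a failed insert is reported; it plays
        # the role of Thread 2 landing between create and get.
        self.after_failed_create = None

    def create_token(self, token_id, data):
        if token_id in self.tokens:
            if self.after_failed_create is not None:
                self.after_failed_create()
            raise IntegrityError(token_id)
        self.tokens[token_id] = data
        return data

    def get_token(self, token_id):
        if token_id not in self.tokens:
            raise NotFound(token_id)
        return self.tokens[token_id]

    def delete_token(self, token_id):
        self.tokens.pop(token_id, None)


def authenticate(backend, token_id, data):
    """Simplified shape of the create-then-fallback logic described above."""
    try:
        return backend.create_token(token_id, data)
    except Exception as original:
        # An identical token may already exist; try to return it instead.
        try:
            return backend.get_token(token_id)
        except NotFound:
            # The token vanished in between, so the original error surfaces.
            raise original


backend = FakeTokenBackend()
backend.create_token("abc", {"user": "demo"})  # an identical token exists
backend.after_failed_create = lambda: backend.delete_token("abc")  # Thread 2

try:
    authenticate(backend, "abc", {"user": "demo"})
    outcome = "ok"
except IntegrityError:
    outcome = "IntegrityError"

print(outcome)
```

In this sketch the caller ends up with the IntegrityError even though,
by the time it is raised, no conflicting token exists any more.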

Proposed Solution:

Instead of re-raising the original exception on line 139 [3], drop
into a simple loop with a randomized timeout that calls create_token()
again with the token ID and token data from line 125.
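
A minimal sketch of such a retry loop is shown below. It is not
keystone code: create_or_get_token(), its parameters (attempts,
max_sleep, sleep), and the two exception classes are assumptions made
for illustration, and the create/get callables stand in for
token_api.create_token() and token_api.get_token(). Unlike the code
under discussion, which catches Exception, this sketch retries only on
the duplicate-key error.

```python
import random
import time


class IntegrityError(Exception):
    """Stand-in for SQLAlchemy's IntegrityError (duplicate key on insert)."""


class NotFound(Exception):
    """Stand-in for the token-not-found exception."""


def create_or_get_token(create, get, token_id, data,
                        attempts=5, max_sleep=0.05, sleep=time.sleep):
    """Create a token; on a duplicate, return the existing one or retry.

    create/get are callables taking (token_id, data) and (token_id).
    All names and defaults here are hypothetical.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return create(token_id, data)
        except IntegrityError as exc:
            last_error = exc
        try:
            # An identical token may already exist; return it if so.
            return get(token_id)
        except NotFound:
            # It was deleted in between: wait a random interval, then retry.
            sleep(random.uniform(0, max_sleep))
    raise last_error


# Usage: the first insert collides with a token that is deleted right
# after the collision, so the second pass through the loop succeeds.
store = {"abc": {"user": "demo"}}


def create(token_id, data):
    if token_id in store:
        del store[token_id]  # simulate the concurrent DELETE
        raise IntegrityError(token_id)
    store[token_id] = data
    return data


def get(token_id):
    if token_id not in store:
        raise NotFound(token_id)
    return store[token_id]


token = create_or_get_token(create, get, "abc", {"user": "demo"},
                            sleep=lambda seconds: None)
print(token)
```

Bounding the number of attempts keeps a persistent conflict from
looping forever; once the attempts are exhausted, the last
IntegrityError is raised as before.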

[1] Same block in Folsom: https://github.com/openstack/keystone/blob/stable/folsom/keystone/service.py#L437
[2] Line 445 in Folsom code.
[3] Line 452 in Folsom code.

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1099966/+subscriptions