yahoo-eng-team team mailing list archive
Message #87920
[Bug 1954662] Re: Quota driver "DbQuotaNoLockDriver" can lock when removing the expired reservations
Reviewed: https://review.opendev.org/c/openstack/neutron/+/821592
Committed: https://opendev.org/openstack/neutron/commit/2dd3ffa271d68b4e042ff64fcc2657af6990e95f
Submitter: "Zuul (22348)"
Branch: master
commit 2dd3ffa271d68b4e042ff64fcc2657af6990e95f
Author: Rodolfo Alonso Hernandez <ralonsoh@xxxxxxxxxx>
Date: Mon Dec 13 14:29:47 2021 +0000
Remove the expired reservations in a separate DB transaction
In "DbQuotaNoLockDriver", when a new reservation is being made,
first the expired reservations are removed. That guarantees the
freshness of the existing reservations.
In systems with high concurrency of operations, the
"DbQuotaNoLockDriver.make_reservation" method will be called in
parallel. The expired reservations removal implies a deletion
on the "reservation" table that could be executed by several
workers at the same time (in the same controller or not). That
could lead to a "DBDeadlock" exception if multiple workers try
to delete the same records.
If an API worker receives this exception, it should continue,
because the expired reservations have been deleted by another
worker. It should not retry this operation.
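The behaviour described above can be sketched as follows. This is a
minimal illustration only, not the neutron code: it uses sqlite3 from
the standard library, a hypothetical "reservations" table with an
"expiration" column in epoch seconds, and a local "DBDeadlock" class
standing in for the real exception. The function names and the
"delete" parameter are assumptions made for the example.

```python
import sqlite3


class DBDeadlock(Exception):
    """Hypothetical stand-in for the "DBDeadlock" exception named above."""


def delete_expired(conn, now):
    """Delete every reservation whose expiration time has passed."""
    cur = conn.execute(
        "DELETE FROM reservations WHERE expiration < ?", (now,))
    return cur.rowcount


def remove_expired_reservations(conn, now, delete=delete_expired):
    """Run the cleanup in its own transaction; never retry on deadlock."""
    try:
        with conn:  # commits on success, rolls back on error
            return delete(conn, now)
    except DBDeadlock:
        # Another worker is deleting the same rows, so just carry on.
        return 0


def make_reservation(conn, project_id, resource, now, timeout=120,
                     delete=delete_expired):
    """Clean up first, then create the reservation in a second transaction."""
    remove_expired_reservations(conn, now, delete)
    with conn:
        conn.execute(
            "INSERT INTO reservations (project_id, resource, expiration) "
            "VALUES (?, ?, ?)",
            (project_id, resource, now + timeout),
        )
```

Because the cleanup has its own transaction, a deadlock there rolls
back only the deletion, and the reservation is still created.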
If the reservations are not deleted, the quota engine will filter
out those expired reservations when counting the current number of
reservations [1][2][3]. That means even if in a particular request
the expired reservations are not deleted, these won't count in the
resource quota calculation.
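As a rough illustration of why leftover expired reservations are
harmless, the counting step can be sketched like this. The
"Reservation" fields and the "count_reserved" helper are hypothetical
names chosen for the example and do not mirror the code in [1][2][3].

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    resource: str
    amount: int
    expiration: float  # epoch seconds; hypothetical field layout


def count_reserved(reservations, resource, now):
    """Sum reserved amounts for a resource, skipping expired entries."""
    return sum(
        r.amount
        for r in reservations
        if r.resource == resource and r.expiration > now
    )
```

An expired row that was never deleted simply fails the
"expiration > now" check, so it adds nothing to the total.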
The default reservation expiration timeout is set to 120 seconds
(as it should have been set initially), which has been the default
expiration delta for a reservation since 2015.
[1]https://github.com/openstack/neutron/blob/e99d9a9d0697a21ba7ec84465f415f60041f3767/neutron/quota/resource.py#L340
[2]https://github.com/openstack/neutron/blob/e99d9a9d0697a21ba7ec84465f415f60041f3767/neutron/db/quota/api.py#L226
[3]https://github.com/openstack/neutron/blob/e99d9a9d0697a21ba7ec84465f415f60041f3767/neutron/objects/quota.py#L100-L101
Closes-Bug: #1954662
Change-Id: I8af6565d2537db7f0df2e8e567ea046a0a6e003a
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1954662
Title:
Quota driver "DbQuotaNoLockDriver" can lock when removing the expired
reservations
Status in neutron:
Fix Released
Bug description:
Just in case, this is related to [1].
In [1] we found that we were always deleting the reservations for a
specific resource and project, regardless of the date. The solution
was to introduce a timeout (with a reasonable value of 20 seconds) to
filter the existing reservations. Any recent reservation, created by
an ongoing request transaction, is kept in the DB.
This bug shows another problem related to situations with very high
concurrency. The deletion of the expired reservations cannot be
executed at the same time by two or more concurrent transactions. In
case this happens, only one transaction will succeed and the others
will fail, triggering the DB retry and ending in a DB lock state.
Error log: https://paste.opendev.org/show/811637/
[1]https://bugs.launchpad.net/neutron/+bug/1940311
[2]https://github.com/openstack/neutron/blob/e99d9a9d0697a21ba7ec84465f415f60041f3767/neutron/db/quota/driver_nolock.py#L53-L58
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1954662/+subscriptions