
yahoo-eng-team team mailing list archive

[Bug 1783613] [NEW] [ocata only] quota usage not decremented during boot/delete race

 

Public bug reported:

A customer found that, when sending rapid parallel boot and delete
requests to nova, quota usage was not decremented after the instances
were deleted. As a result, the quota usage no longer matched the number
of instances actually in use (quota out of sync).

Looking at the code, the _delete_while_booting method checks for the
build request. If the build request is found, we delete it, look up the
instance, and commit the quota usage decrement if the instance is
found. However, we *don't* commit the quota usage decrement if the
instance is not found (after finding the build request).

I think that's wrong. If we found the build request, we're in the
middle of booting. If we then don't find the instance, it means
conductor either a) hasn't created the instance record yet or b)
deleted the instance record after getting BuildRequestNotFound, which
it got because we (compute/api) deleted the build request. In either
case, conductor isn't going to do anything to decrement the quota
usage, so we need to do it in compute/api.

I think the fix is to decrement the quota usage whether we find the
instance or not, after we successfully delete the build request.
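
The difference between the current and the proposed behaviour can be
sketched roughly as below. This is a simplified, self-contained
illustration and not nova's actual code: FakeQuotas, the dict-based
build request and instance stores, and the always_commit flag are all
hypothetical names invented for the example.

```python
class FakeQuotas:
    """Hypothetical stand-in for a quota reservation (not nova's object)."""

    def __init__(self, usage):
        self.usage = usage        # instances currently counted against quota
        self.committed = False

    def commit(self):
        # Apply the reserved delta of -1 instance to the recorded usage.
        self.usage -= 1
        self.committed = True


def delete_while_booting(build_requests, instances, uuid, quotas,
                         always_commit):
    """Sketch of the delete path taken while an instance is still booting.

    build_requests and instances are plain dicts keyed by instance uuid
    (an assumption made for this example). Returns True if the delete
    was handled here, False if no build request was found.
    """
    if uuid not in build_requests:
        # Not in the middle of booting; the regular delete path applies.
        return False

    # We won the race with conductor: remove the build request.
    del build_requests[uuid]

    # The instance record may not exist yet, or conductor may already
    # have removed it after hitting BuildRequestNotFound.
    instance_found = instances.pop(uuid, None) is not None

    if instance_found or always_commit:
        # Proposed behaviour (always_commit=True): decrement usage
        # whether or not the instance record was found.
        quotas.commit()
    # Reported behaviour (always_commit=False): when the instance is
    # missing, the commit is skipped and the usage stays too high.
    return True


if __name__ == "__main__":
    for flag in (False, True):
        quotas = FakeQuotas(usage=1)
        delete_while_booting({"uuid-1": object()}, {}, "uuid-1",
                             quotas, always_commit=flag)
        print("always_commit=%s -> usage=%d" % (flag, quotas.usage))
```

With always_commit=False the build request is deleted but the usage
stays at 1, which matches the out-of-sync quota described above. With
always_commit=True the usage drops to 0.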

** Affects: nova
     Importance: Undecided
         Status: Invalid

** Affects: nova/ocata
     Importance: Undecided
     Assignee: melanie witt (melwitt)
         Status: In Progress


** Tags: quotas

** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Changed in: nova/ocata
       Status: New => Triaged

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1783613

Title:
  [ocata only] quota usage not decremented during boot/delete race

Status in OpenStack Compute (nova):
  Invalid
Status in OpenStack Compute (nova) ocata series:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1783613/+subscriptions

