yahoo-eng-team team mailing list archive

[Bug 1270680] [NEW] v3 extensions api inherently racey wrt instances

 

Public bug reported:

The pci extension for the v3 API does a second instance lookup against
the database for instance objects. When you are doing something like a
list_* operation on instances, this means we make a second trip to the
database that is distinct from the first lookup in the request handling.
If an instance is deleted between the first lookup and the extension
hook running, this generates a database exception, which turns into an
InstanceNotFound and 404s the list operation *if any instance was
deleted during the request*.
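
The race described above can be sketched in a few lines of Python. This
is only an illustration, not nova's code: the in-memory dict standing in
for the database, the helper names (db_list_instances, db_get_instance,
racy_hook, tolerant_hook, simulate) and the local InstanceNotFound class
are all hypothetical. It shows a hook that re-fetches each instance and
fails the whole list when one has vanished, next to one possible
mitigation that skips instances deleted since the first lookup.

```python
class InstanceNotFound(Exception):
    """Local stand-in for the error raised when an instance is missing."""


# Hypothetical in-memory "database": instance uuid -> record.
DB = {
    "a": {"uuid": "a", "pci_devices": ["0000:00:1f.0"]},
    "b": {"uuid": "b", "pci_devices": []},
}


def db_list_instances(db):
    """First lookup: the servers the list handler fetched for the request."""
    return [{"uuid": uuid} for uuid in db]


def db_get_instance(db, uuid):
    """Second lookup: raises if the instance was deleted in the meantime."""
    try:
        return db[uuid]
    except KeyError:
        raise InstanceNotFound(uuid)


def racy_hook(db, servers):
    """Re-fetch every server; a single missing row fails the whole list."""
    for server in servers:
        instance = db_get_instance(db, server["uuid"])
        server["pci_devices"] = instance["pci_devices"]
    return servers


def tolerant_hook(db, servers):
    """Drop servers deleted since the first lookup instead of failing."""
    result = []
    for server in servers:
        try:
            instance = db_get_instance(db, server["uuid"])
        except InstanceNotFound:
            continue  # deleted mid-request: leave it out of the listing
        server["pci_devices"] = instance["pci_devices"]
        result.append(server)
    return result


def simulate(hook):
    """Run one list request with a delete landing between the lookups."""
    db = {uuid: dict(rec) for uuid, rec in DB.items()}
    servers = db_list_instances(db)
    del db["b"]  # concurrent delete between first lookup and the hook
    return hook(db, servers)
```

With racy_hook, simulate raises InstanceNotFound for the whole request
even though instance "a" is fine; with tolerant_hook the request
succeeds and simply omits the deleted instance. Another option, under
the same assumptions, would be for the hook to reuse the data from the
first lookup and avoid the second trip to the database entirely.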

We are managing to hit this quite frequently in tempest with our
test_list_servers_by_admin_with_all_tenants (even at only concurrency 2):
http://logs.openstack.org/80/67480/1/gate/gate-tempest-dsvm-full/24f9aab/console.html#_2014-01-20_01_18_11_102

The explosion looks like this:
http://logs.openstack.org/80/67480/1/gate/gate-tempest-dsvm-full/24f9aab/logs/screen-n-api.txt.gz?level=INFO#_2014-01-20_00_57_44_352

Logstash picks up these tracebacks really easily. This kind of explosion
doesn't always trigger a Tempest failure, because sometimes it happens
in cleanup code, where we protect against 404s (though that probably
means we are leaking a lot of resources on a normal run).

Logstash query -
http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiVFJBQ0Ugbm92YS5hcGkub3BlbnN0YWNrXCIgQU5EIG1lc3NhZ2U6XCJJbnN0YW5jZU5vdEZvdW5kOiBJbnN0YW5jZVwiIEFORCBmaWxlbmFtZTpcImxvZ3Mvc2NyZWVuLW4tYXBpLnR4dFwiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxMzkwMTgzNzk1ODI1fQ==

** Affects: nova
     Importance: Critical
         Status: Confirmed

** Changed in: nova
       Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Critical

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1270680

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1270680/+subscriptions

