yahoo-eng-team team mailing list archive
Message #08386
[Bug 1270680] [NEW] v3 extensions api inherently racey wrt instances
Public bug reported:
The pci extension for the v3 API does a second instance lookup against
the database for instance objects. When you are doing something like a
list_* operation on instances, this means we make a second trip to the
database that is distinct from the first lookup in the request
handling. If an instance gets deleted between the initial lookup and
the extension hook running, the second lookup raises a database
exception, which turns into an InstanceNotFound and 404s the whole list
operation *if any instance was deleted during the request*.
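
A minimal sketch of the race described above, using an in-memory stand-in
for the database. Every name here (FakeDB, extend_naive, extend_tolerant,
extend_from_snapshot, the pci_devices field) is hypothetical and chosen
for illustration; none of it is nova's actual code. It shows why a second
per-instance lookup can fail the whole list, and two possible ways an
extension hook might avoid that.

```python
# Hypothetical stand-ins for illustration only; not nova's real classes.
class InstanceNotFound(Exception):
    """Raised when a lookup targets an instance that no longer exists."""


class FakeDB:
    def __init__(self, instances):
        self._instances = {inst["uuid"]: inst for inst in instances}

    def list_instances(self):
        # First trip: the snapshot taken while handling the list request.
        return list(self._instances.values())

    def get_instance(self, uuid):
        # Second trip: what an extension hook would do per instance.
        try:
            return self._instances[uuid]
        except KeyError:
            raise InstanceNotFound(uuid)

    def delete_instance(self, uuid):
        self._instances.pop(uuid, None)


def extend_naive(db, servers):
    # One instance deleted since the snapshot aborts the entire response.
    return [
        {"id": s["uuid"], "pci": db.get_instance(s["uuid"])["pci_devices"]}
        for s in servers
    ]


def extend_tolerant(db, servers):
    # Option 1 (assumption): drop instances that vanished mid-request.
    result = []
    for s in servers:
        try:
            fresh = db.get_instance(s["uuid"])
        except InstanceNotFound:
            continue
        result.append({"id": s["uuid"], "pci": fresh["pci_devices"]})
    return result


def extend_from_snapshot(servers):
    # Option 2 (assumption): reuse the first lookup, no second trip at all.
    return [{"id": s["uuid"], "pci": s["pci_devices"]} for s in servers]
```

With two instances listed and one deleted before the hook runs,
extend_naive raises InstanceNotFound, while the other two variants still
return a list. Which behaviour is right for nova is a design decision this
sketch does not settle.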
We are managing to hit this quite frequently in tempest with our
test_list_servers_by_admin_with_all_tenants (even at only concurrency 2) -
http://logs.openstack.org/80/67480/1/gate/gate-tempest-dsvm-full/24f9aab/console.html#_2014-01-20_01_18_11_102
The explosion looks like this -
http://logs.openstack.org/80/67480/1/gate/gate-tempest-dsvm-full/24f9aab/logs/screen-n-api.txt.gz?level=INFO#_2014-01-20_00_57_44_352
Logstash picks up these tracebacks really easily. This kind of explosion
doesn't always trigger a Tempest failure, because sometimes it happens
in cleanup code, where we protect against 404s (though that probably
means we are leaking a lot of resources on a normal run).
Logstash query -
http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiVFJBQ0Ugbm92YS5hcGkub3BlbnN0YWNrXCIgQU5EIG1lc3NhZ2U6XCJJbnN0YW5jZU5vdEZvdW5kOiBJbnN0YW5jZVwiIEFORCBmaWxlbmFtZTpcImxvZ3Mvc2NyZWVuLW4tYXBpLnR4dFwiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxMzkwMTgzNzk1ODI1fQ==
** Affects: nova
Importance: Critical
Status: Confirmed
** Changed in: nova
Status: New => Confirmed
** Changed in: nova
Importance: Undecided => Critical
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1270680
Title:
v3 extensions api inherently racey wrt instances
Status in OpenStack Compute (Nova):
Confirmed
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1270680/+subscriptions