yahoo-eng-team team mailing list archive
Message #70096
[Bug 1739658] Re: DBDeadlock trying to update instance_actions.updated_at
Reviewed: https://review.openstack.org/529672
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3e3bc32db064de5783d2aab3ebd3977d43d8eacb
Submitter: Zuul
Branch: master
commit 3e3bc32db064de5783d2aab3ebd3977d43d8eacb
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date: Thu Dec 21 13:41:41 2017 -0500
Add retry_on_deadlock decorator to action_event_start
This decorator already exists on action_event_finish
but we didn't think we'd need it on action_event_start,
but looking at failures in the logs over the last 10 days
it's clear we do.
Change-Id: If11ebc64a3275a76d9ee6b39f1cd00f2ef0413de
Closes-Bug: #1739658
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1739658
Title:
DBDeadlock trying to update instance_actions.updated_at
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Seeing this here:
http://logs.openstack.org/58/511358/47/check/legacy-tempest-dsvm-neutron-full/539f670/logs/screen-n-cpu.txt.gz?level=TRACE#_Dec_21_14_57_35_530720
Dec 21 14:57:35.530720 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server [None req-bf61a8c5-88f4-419c-a3c3-4f076b60ec96 tempest-ServerActionsTestJSON-1969878288 tempest-ServerActionsTestJSON-1969878288] Exception during message handling: RemoteError: Remote error: DBDeadlock (pymysql.err.InternalError) (1213, u'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'UPDATE instance_actions SET updated_at=%(updated_at)s WHERE instance_actions.id = %(instance_actions_id)s'] [parameters: {'instance_actions_id': 313, 'updated_at': datetime.datetime(2017, 12, 21, 14, 57, 35, 420626)}]
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22RemoteError%3A%20Remote%20error%3A%20DBDeadlock%20(pymysql.err.InternalError)%20(1213%2C%20u'Deadlock%20found%20when%20trying%20to%20get%20lock%3B%20try%20restarting%20transaction')%20%5BSQL%3A%20u'UPDATE%20instance_actions%20SET%20updated_at%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=10d
This would be a result of the @wrap_instance_event decorator in the
compute manager creating an instance action event:
Dec 21 14:57:35.534644 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/utils.py", line 910, in decorated_function
Dec 21 14:57:35.534785 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server with EventReporter(context, event_name, instance_uuid):
Dec 21 14:57:35.534905 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/utils.py", line 881, in __enter__
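For readers unfamiliar with the pattern in the traceback, here is a minimal sketch of a decorator that wraps a function call in a context manager which records a start event on entry and a finish event on exit. It is not Nova's code: the in-memory `log` list, the `prefix` argument, and the simplified signatures are assumptions made for illustration, and only the names `wrap_instance_event`, `EventReporter`, and `decorated_function` are taken from the trace.

```python
import functools


class EventReporter:
    """Illustrative stand-in: records start/finish events around a block."""

    def __init__(self, log, event_name, instance_uuid):
        self.log = log  # hypothetical in-memory store instead of a database
        self.event_name = event_name
        self.instance_uuid = instance_uuid

    def __enter__(self):
        # The traceback above passes through __enter__, i.e. the "start" side.
        self.log.append(("start", self.event_name, self.instance_uuid))
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        result = "Error" if exc_type else "Success"
        self.log.append(
            ("finish", self.event_name, self.instance_uuid, result))
        return False  # never swallow the exception


def wrap_instance_event(log, prefix):
    """Decorator factory: report an event around each call (sketch only)."""
    def decorator(function):
        @functools.wraps(function)
        def decorated_function(instance_uuid, *args, **kwargs):
            event_name = "{0}_{1}".format(prefix, function.__name__)
            with EventReporter(log, event_name, instance_uuid):
                return function(instance_uuid, *args, **kwargs)
        return decorated_function
    return decorator
```

In this sketch, a failure while recording the start event propagates out of `__enter__` before the wrapped function runs at all, which matches the shape of the trace.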
Logstash shows that this started within the last 10 days, which would
have been when this change merged:
https://github.com/openstack/nova/commit/011b36aa93fed0fd2f3b06e9dd5bd27ea9ccbf6e
We have the retry decorator on action_event_finish but not
action_event_start because we didn't think we'd need it, but I guess
we do.
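The general idea of a retry-on-deadlock decorator can be sketched as follows. This is a simplified illustration, not the decorator Nova uses: the `DBDeadlock` class here is a local stand-in for the exception in the log, and the `max_retries` and `retry_interval` parameters, the `calls` counter, and the body of `action_event_start` are all hypothetical.

```python
import functools
import time


class DBDeadlock(Exception):
    """Local stand-in for the deadlock error seen in the log (assumption)."""


def retry_on_deadlock(max_retries=5, retry_interval=0.0):
    """Re-run the wrapped function when it raises DBDeadlock (sketch only)."""
    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return function(*args, **kwargs)
                except DBDeadlock:
                    attempt += 1
                    if attempt > max_retries:
                        raise  # give up and surface the deadlock
                    time.sleep(retry_interval)
        return wrapper
    return decorator


# Hypothetical usage: the first two calls deadlock, the third succeeds.
calls = {"count": 0}


@retry_on_deadlock(max_retries=3)
def action_event_start():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DBDeadlock("Deadlock found when trying to get lock")
    return "started"
```

A retry is a reasonable response here because the database error itself says "try restarting transaction": the deadlocked statement was rolled back, so running the whole operation again is safe as long as the wrapped function has no side effects outside that transaction.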
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1739658/+subscriptions