← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1739658] Re: DBDeadlock trying to update instance_actions.updated_at

 

Reviewed:  https://review.openstack.org/529672
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3e3bc32db064de5783d2aab3ebd3977d43d8eacb
Submitter: Zuul
Branch:    master

commit 3e3bc32db064de5783d2aab3ebd3977d43d8eacb
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Thu Dec 21 13:41:41 2017 -0500

    Add retry_on_deadlock decorator to action_event_start
    
    This decorator already exists on action_event_finish
    but we didn't think we'd need it on action_event_start,
    but looking at failures in the logs over the last 10 days
    it's clear we do.
    
    Change-Id: If11ebc64a3275a76d9ee6b39f1cd00f2ef0413de
    Closes-Bug: #1739658


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1739658

Title:
  DBDeadlock trying to update instance_actions.updated_at

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Seeing this here:

  http://logs.openstack.org/58/511358/47/check/legacy-tempest-dsvm-
  neutron-
  full/539f670/logs/screen-n-cpu.txt.gz?level=TRACE#_Dec_21_14_57_35_530720

  Dec 21 14:57:35.530720 ubuntu-xenial-ovh-bhs1-0001558681 nova-
  compute[15980]: ERROR oslo_messaging.rpc.server [None req-
  bf61a8c5-88f4-419c-a3c3-4f076b60ec96 tempest-
  ServerActionsTestJSON-1969878288 tempest-
  ServerActionsTestJSON-1969878288] Exception during message handling:
  RemoteError: Remote error: DBDeadlock (pymysql.err.InternalError)
  (1213, u'Deadlock found when trying to get lock; try restarting
  transaction') [SQL: u'UPDATE instance_actions SET
  updated_at=%(updated_at)s WHERE instance_actions.id =
  %(instance_actions_id)s'] [parameters: {'instance_actions_id': 313,
  'updated_at': datetime.datetime(2017, 12, 21, 14, 57, 35, 420626)}]

  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22RemoteError%3A%20Remote%20error%3A%20DBDeadlock%20(pymysql.err.InternalError)%20(1213%2C%20u'Deadlock%20found%20when%20trying%20to%20get%20lock%3B%20try%20restarting%20transaction')%20%5BSQL%3A%20u'UPDATE%20instance_actions%20SET%20updated_at%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=10d

  This would be a result of the @wrap_instance_event decorator in the
  compute manager creating an instance action event:

  Dec 21 14:57:35.534644 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/compute/utils.py", line 910, in decorated_function
  Dec 21 14:57:35.534785 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server     with EventReporter(context, event_name, instance_uuid):
  Dec 21 14:57:35.534905 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/compute/utils.py", line 881, in __enter__

  Logstash shows that this started within the last 10 days, which would
  have been when this change merged:

  https://github.com/openstack/nova/commit/011b36aa93fed0fd2f3b06e9dd5bd27ea9ccbf6e

  We have the retry decorator on action_event_finish but not
  action_event_start because we didn't think we'd need it, but I guess
  we do.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1739658/+subscriptions


References