← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1739658] [NEW] DBDeadlock trying to update instance_actions.updated_at

 

Public bug reported:

Seeing this here:

http://logs.openstack.org/58/511358/47/check/legacy-tempest-dsvm-
neutron-
full/539f670/logs/screen-n-cpu.txt.gz?level=TRACE#_Dec_21_14_57_35_530720

Dec 21 14:57:35.530720 ubuntu-xenial-ovh-bhs1-0001558681 nova-
compute[15980]: ERROR oslo_messaging.rpc.server [None req-bf61a8c5-88f4
-419c-a3c3-4f076b60ec96 tempest-ServerActionsTestJSON-1969878288
tempest-ServerActionsTestJSON-1969878288] Exception during message
handling: RemoteError: Remote error: DBDeadlock
(pymysql.err.InternalError) (1213, u'Deadlock found when trying to get
lock; try restarting transaction') [SQL: u'UPDATE instance_actions SET
updated_at=%(updated_at)s WHERE instance_actions.id =
%(instance_actions_id)s'] [parameters: {'instance_actions_id': 313,
'updated_at': datetime.datetime(2017, 12, 21, 14, 57, 35, 420626)}]

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22RemoteError%3A%20Remote%20error%3A%20DBDeadlock%20(pymysql.err.InternalError)%20(1213%2C%20u'Deadlock%20found%20when%20trying%20to%20get%20lock%3B%20try%20restarting%20transaction')%20%5BSQL%3A%20u'UPDATE%20instance_actions%20SET%20updated_at%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=10d

This would be a result of the @wrap_instance_event decorator in the
compute manager creating an instance action event:

Dec 21 14:57:35.534644 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/compute/utils.py", line 910, in decorated_function
Dec 21 14:57:35.534785 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server     with EventReporter(context, event_name, instance_uuid):
Dec 21 14:57:35.534905 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/compute/utils.py", line 881, in __enter__

Logstash shows that this started within the last 10 days, which would
have been when this change merged:

https://github.com/openstack/nova/commit/011b36aa93fed0fd2f3b06e9dd5bd27ea9ccbf6e

We have the retry decorator on action_event_finish but not
action_event_start because we didn't think we'd need it, but I guess we
do.

** Affects: nova
     Importance: High
     Assignee: Matt Riedemann (mriedem)
         Status: In Progress


** Tags: db

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1739658

Title:
  DBDeadlock trying to update instance_actions.updated_at

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Seeing this here:

  http://logs.openstack.org/58/511358/47/check/legacy-tempest-dsvm-
  neutron-
  full/539f670/logs/screen-n-cpu.txt.gz?level=TRACE#_Dec_21_14_57_35_530720

  Dec 21 14:57:35.530720 ubuntu-xenial-ovh-bhs1-0001558681 nova-
  compute[15980]: ERROR oslo_messaging.rpc.server [None req-
  bf61a8c5-88f4-419c-a3c3-4f076b60ec96 tempest-
  ServerActionsTestJSON-1969878288 tempest-
  ServerActionsTestJSON-1969878288] Exception during message handling:
  RemoteError: Remote error: DBDeadlock (pymysql.err.InternalError)
  (1213, u'Deadlock found when trying to get lock; try restarting
  transaction') [SQL: u'UPDATE instance_actions SET
  updated_at=%(updated_at)s WHERE instance_actions.id =
  %(instance_actions_id)s'] [parameters: {'instance_actions_id': 313,
  'updated_at': datetime.datetime(2017, 12, 21, 14, 57, 35, 420626)}]

  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22RemoteError%3A%20Remote%20error%3A%20DBDeadlock%20(pymysql.err.InternalError)%20(1213%2C%20u'Deadlock%20found%20when%20trying%20to%20get%20lock%3B%20try%20restarting%20transaction')%20%5BSQL%3A%20u'UPDATE%20instance_actions%20SET%20updated_at%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=10d

  This would be a result of the @wrap_instance_event decorator in the
  compute manager creating an instance action event:

  Dec 21 14:57:35.534644 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/compute/utils.py", line 910, in decorated_function
  Dec 21 14:57:35.534785 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server     with EventReporter(context, event_name, instance_uuid):
  Dec 21 14:57:35.534905 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/compute/utils.py", line 881, in __enter__

  Logstash shows that this started within the last 10 days, which would
  have been when this change merged:

  https://github.com/openstack/nova/commit/011b36aa93fed0fd2f3b06e9dd5bd27ea9ccbf6e

  We have the retry decorator on action_event_finish but not
  action_event_start because we didn't think we'd need it, but I guess
  we do.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1739658/+subscriptions


Follow ups