← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1282922] [NEW] nec pluign: ResourceClosedError from sqlalchemy when multiple delete-port in parallel

 

Public bug reported:

When multiple delete_port are run in parallel, nec plugin detects ResourceClosedError from sqlalchemy. The error itself can occur when a port row is deleted from ports table, but the problem is it occurs even after retrying db_plugin.delete_port by restarting a new transaction in plugin.delete_port().
At now it seems only nec plugin hits this problem, but potentially other plugins can hit this issue.

The case I observed is that delete-port from dhcp-agent
(release_dhcp_port RPC call) and delete-port from delete-network API
request are run in parallel. plugin.delete-port in nec plugin calls REST
API call to an external controller in addition to operates on neutron
database.

After my investigation and testing, db_plugin.delete_ports() calls plugin.delete_port() under a transaction.
https://github.com/openstack/neutron/blob/master/neutron/db/db_base_plugin_v2.py#L1367
This means the transaction continues over API calls to external controller and it leads to a long transaction.
When plugin.delete_ports() and plugin.delete_port() are run at the same time, even if plugin.delete_port() avoid long transaction, db operations in plugin.delete_port() is blocked and they can fail with timeout.

One example is : http://133.242.19.163:8000/neutron-ci-
logs/Neutron_Master/917/logs/devstack/q-svc-filtered.txt.gz

2014-02-18 18:38:41.771 16025 ERROR neutron.api.v2.resource [-] delete failed
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource Traceback (most recent call last):
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/api/v2/resource.py", line 84, in resource
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     result = method(request=request, **args)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/api/v2/base.py", line 438, in delete
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     obj_deleter(request.context, id, **kwargs)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/common/log.py", line 43, in wrapper
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     data['exc'] = unicode(e)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/openstack/common/excutils.py", line 68, in __exit__
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     six.reraise(self.type_, self.value, self.tb)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/common/log.py", line 40, in wrapper
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     ret = method(*args, **kwargs)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/plugins/nec/nec_plugin.py", line 419, in delete_network
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     super(NECPluginV2, self).delete_network(context, id)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/db/db_base_plugin_v2.py", line 968, in delete_network
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     for p in ports)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/db/db_base_plugin_v2.py", line 967, in <genexpr>
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     only_auto_del = all(p['device_owner'] in AUTO_DELETE_PORT_OWNERS
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/loading.py", line 65, in instances
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     fetch = cursor.fetchall()
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 752, in fetchall
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     self.cursor, self.context)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1027, in _handle_dbapi_exception
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     util.reraise(*exc_info)
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 746, in fetchall
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     l = self.process_rows(self._fetchall_impl())
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 715, in _fetchall_impl
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     self._non_result()
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 720, in _non_result
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     "This result object does not return rows. "
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource ResourceClosedError: This result object does not return rows. It has been closed automatically.
2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource 
2014-02-18 18:38:41.772 16025 DEBUG neutron.wsgi [-] Data Request Failed: internal server error while processing your request. type is <type 'unicode'> _to_xml_node /opt/stack/neutron/neutron/wsgi.py:533
2014-02-18 18:38:41.772 16025 INFO neutron.wsgi [-] 10.56.45.201 - - [18/Feb/2014 18:38:41] "DELETE /v2.0/networks/451384e4-a4f2-4558-8da0-40dc60f1a17f HTTP/1.1" 500 517 103.278109

** Affects: neutron
     Importance: High
     Assignee: Akihiro Motoki (amotoki)
         Status: New


** Tags: db nec

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1282922

Title:
  nec pluign: ResourceClosedError from sqlalchemy when multiple delete-
  port in parallel

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  When multiple delete_port are run in parallel, nec plugin detects ResourceClosedError from sqlalchemy. The error itself can occur when a port row is deleted from ports table, but the problem is it occurs even after retrying db_plugin.delete_port by restarting a new transaction in plugin.delete_port().
  At now it seems only nec plugin hits this problem, but potentially other plugins can hit this issue.

  The case I observed is that delete-port from dhcp-agent
  (release_dhcp_port RPC call) and delete-port from delete-network API
  request are run in parallel. plugin.delete-port in nec plugin calls
  REST API call to an external controller in addition to operates on
  neutron database.

  After my investigation and testing, db_plugin.delete_ports() calls plugin.delete_port() under a transaction.
  https://github.com/openstack/neutron/blob/master/neutron/db/db_base_plugin_v2.py#L1367
  This means the transaction continues over API calls to external controller and it leads to a long transaction.
  When plugin.delete_ports() and plugin.delete_port() are run at the same time, even if plugin.delete_port() avoid long transaction, db operations in plugin.delete_port() is blocked and they can fail with timeout.

  One example is : http://133.242.19.163:8000/neutron-ci-
  logs/Neutron_Master/917/logs/devstack/q-svc-filtered.txt.gz

  2014-02-18 18:38:41.771 16025 ERROR neutron.api.v2.resource [-] delete failed
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource Traceback (most recent call last):
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/api/v2/resource.py", line 84, in resource
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     result = method(request=request, **args)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/api/v2/base.py", line 438, in delete
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     obj_deleter(request.context, id, **kwargs)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/common/log.py", line 43, in wrapper
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     data['exc'] = unicode(e)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/openstack/common/excutils.py", line 68, in __exit__
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     six.reraise(self.type_, self.value, self.tb)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/common/log.py", line 40, in wrapper
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     ret = method(*args, **kwargs)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/plugins/nec/nec_plugin.py", line 419, in delete_network
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     super(NECPluginV2, self).delete_network(context, id)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/db/db_base_plugin_v2.py", line 968, in delete_network
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     for p in ports)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/opt/stack/neutron/neutron/db/db_base_plugin_v2.py", line 967, in <genexpr>
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     only_auto_del = all(p['device_owner'] in AUTO_DELETE_PORT_OWNERS
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/loading.py", line 65, in instances
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     fetch = cursor.fetchall()
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 752, in fetchall
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     self.cursor, self.context)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1027, in _handle_dbapi_exception
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     util.reraise(*exc_info)
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 746, in fetchall
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     l = self.process_rows(self._fetchall_impl())
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 715, in _fetchall_impl
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     self._non_result()
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 720, in _non_result
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource     "This result object does not return rows. "
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource ResourceClosedError: This result object does not return rows. It has been closed automatically.
  2014-02-18 18:38:41.771 16025 TRACE neutron.api.v2.resource 
  2014-02-18 18:38:41.772 16025 DEBUG neutron.wsgi [-] Data Request Failed: internal server error while processing your request. type is <type 'unicode'> _to_xml_node /opt/stack/neutron/neutron/wsgi.py:533
  2014-02-18 18:38:41.772 16025 INFO neutron.wsgi [-] 10.56.45.201 - - [18/Feb/2014 18:38:41] "DELETE /v2.0/networks/451384e4-a4f2-4558-8da0-40dc60f1a17f HTTP/1.1" 500 517 103.278109

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1282922/+subscriptions


Follow ups

References