← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1354226] [NEW] NSX: neutron sync backend race condition

 

Public bug reported:

If you run with multiple_api  workers there is a race condition in the
nsx sync backend as we start reading from the database before it's
connected if running with 1 worker this doesn't occur as things don't
occur concurrently.


-07 16:47:44.311 ERROR neutron.openstack.common.loopingcall [req-fa4f72be-3b10-49ef-a9a6-0204c1af1920 None None] in dynamic looping call
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall Traceback (most recent call last):
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/openstack/common/loopingcall.py", line 123, in _inner
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     idle = self.f(*self.args, **self.kw)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/plugins/vmware/common/sync.py", line 670, in _synchronize_state
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     scan_missing=scan_missing)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/plugins/vmware/common/sync.py", line 344, in _synchronize_lswitches
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     filters=filters)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/db/common_db_mixin.py", line 176, in _get_collection
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     items = [dict_func(c, fields) for c in query]
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2353, in __iter__
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     return self._execute_and_instances(context)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2368, in _execute_and_instances
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     result = conn.execute(querycontext.statement, self._params)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 662, in execute
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     params)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 761, in _execute_clauseelement
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     compiled_sql, distilled_params
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 874, in _execute_context
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     context)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/compat/handle_error.py", line 123, in _handle_dbapi_exception
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     six.reraise(type(newraise), newraise, sys.exc_info()[2])
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/compat/handle_error.py", line 100, in _handle_dbapi_exception
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     per_fn = fn(ctx)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/exc_filters.py", line 319, in handler
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     context.is_disconnect)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/exc_filters.py", line 260, in _is_db_connection_error
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     raise exception.DBConnectionError(operational_error)
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall DBConnectionError: (OperationalError) (2006, 'MySQL server has gone away') 'SELECT networks.tenant_id AS networks_tenant_id, networks.id AS networks_id, networks.name AS networks_name, networks.status AS networks_status, networks.admin_state_up AS networks_admin_state_up, networks.shared AS networks_shared, ipavailabilityranges_1.allocation_pool_id AS ipavailabilityranges_1_allocation_pool_id, ipavailabilityranges_1.first_ip AS ipavailabilityranges_1_first_ip, ipavailabilityranges_1.last_ip AS ipavailabilityranges_1_last_ip, ipallocationpools_1.id AS ipallocationpools_1_id, ipallocationpools_1.subnet_id AS ipallocationpools_1_subnet_id, ipallocationpools_1.first_ip AS ipallocationpools_1_first_ip, ipallocationpools_1.last_ip AS ipallocationpools_1_last_ip, subnets_1.tenant_id AS subnets_1_tenant_id, subnets_1.id AS subnets_1_id, subnets_1.name AS subnets_1_name, subnets_1.network_id AS subnets_1_network_id, subnets_1.ip_version AS subnets_1_ip_version, subnets_1.cidr AS subnets_1_cidr, subnets_1.gateway_ip AS subnets_1_gateway_ip, subnets_1.enable_dhcp AS subnets_1_enable_dhcp, subnets_1.shared AS subnets_1_shared, subnets_1.ipv6_ra_mode AS subnets_1_ipv6_ra_mode, subnets_1.ipv6_address_mode AS subnets_1_ipv6_address_mode, networksecuritybindings_1.network_id AS networksecuritybindings_1_network_id, networksecuritybindings_1.port_security_enabled AS networksecuritybindings_1_port_security_enabled, networkqueuemappings_1.network_id AS networkqueuemappings_1_network_id, networkqueuemappings_1.queue_id AS networkqueuemappings_1_queue_id, externalnetworks_1.network_id AS externalnetworks_1_network_id \nFROM networks LEFT OUTER JOIN externalnetworks ON networks.id = externalnetworks.network_id LEFT OUTER JOIN subnets AS subnets_1 ON networks.id = subnets_1.network_id LEFT OUTER JOIN ipallocationpools AS ipallocationpools_1 ON subnets_1.id = ipallocationpools_1.subnet_id LEFT OUTER JOIN ipavailabilityranges AS ipavailabilityranges_1 ON ipallocationpools_1.id = ipavailabilityranges_1.allocation_pool_id LEFT OUTER JOIN networksecuritybindings AS networksecuritybindings_1 ON networks.id = networksecuritybindings_1.network_id LEFT OUTER JOIN networkqueuemappings AS networkqueuemappings_1 ON networks.id = networkqueuemappings_1.network_id LEFT OUTER JOIN externalnetworks AS externalnetworks_1 ON networks.id = externalnetworks_1.network_id \nWHERE externalnetworks.network_id IS NULL' ()
2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall

** Affects: neutron
     Importance: Undecided
     Assignee: Aaron Rosen (arosen)
         Status: New

** Changed in: neutron
     Assignee: (unassigned) => Aaron Rosen (arosen)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1354226

Title:
  NSX: neutron sync backend race condition

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  If you run with multiple_api  workers there is a race condition in the
  nsx sync backend as we start reading from the database before it's
  connected if running with 1 worker this doesn't occur as things don't
  occur concurrently.

  
  -07 16:47:44.311 ERROR neutron.openstack.common.loopingcall [req-fa4f72be-3b10-49ef-a9a6-0204c1af1920 None None] in dynamic looping call
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall Traceback (most recent call last):
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/openstack/common/loopingcall.py", line 123, in _inner
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     idle = self.f(*self.args, **self.kw)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/plugins/vmware/common/sync.py", line 670, in _synchronize_state
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     scan_missing=scan_missing)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/plugins/vmware/common/sync.py", line 344, in _synchronize_lswitches
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     filters=filters)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/opt/stack/neutron/neutron/db/common_db_mixin.py", line 176, in _get_collection
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     items = [dict_func(c, fields) for c in query]
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2353, in __iter__
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     return self._execute_and_instances(context)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2368, in _execute_and_instances
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     result = conn.execute(querycontext.statement, self._params)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 662, in execute
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     params)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 761, in _execute_clauseelement
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     compiled_sql, distilled_params
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 874, in _execute_context
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     context)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/compat/handle_error.py", line 123, in _handle_dbapi_exception
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     six.reraise(type(newraise), newraise, sys.exc_info()[2])
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/compat/handle_error.py", line 100, in _handle_dbapi_exception
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     per_fn = fn(ctx)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/exc_filters.py", line 319, in handler
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     context.is_disconnect)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/exc_filters.py", line 260, in _is_db_connection_error
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall     raise exception.DBConnectionError(operational_error)
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall DBConnectionError: (OperationalError) (2006, 'MySQL server has gone away') 'SELECT networks.tenant_id AS networks_tenant_id, networks.id AS networks_id, networks.name AS networks_name, networks.status AS networks_status, networks.admin_state_up AS networks_admin_state_up, networks.shared AS networks_shared, ipavailabilityranges_1.allocation_pool_id AS ipavailabilityranges_1_allocation_pool_id, ipavailabilityranges_1.first_ip AS ipavailabilityranges_1_first_ip, ipavailabilityranges_1.last_ip AS ipavailabilityranges_1_last_ip, ipallocationpools_1.id AS ipallocationpools_1_id, ipallocationpools_1.subnet_id AS ipallocationpools_1_subnet_id, ipallocationpools_1.first_ip AS ipallocationpools_1_first_ip, ipallocationpools_1.last_ip AS ipallocationpools_1_last_ip, subnets_1.tenant_id AS subnets_1_tenant_id, subnets_1.id AS subnets_1_id, subnets_1.name AS subnets_1_name, subnets_1.network_id AS subnets_1_network_id, subnets_1.ip_version AS subnets_1_ip_version, subnets_1.cidr AS subnets_1_cidr, subnets_1.gateway_ip AS subnets_1_gateway_ip, subnets_1.enable_dhcp AS subnets_1_enable_dhcp, subnets_1.shared AS subnets_1_shared, subnets_1.ipv6_ra_mode AS subnets_1_ipv6_ra_mode, subnets_1.ipv6_address_mode AS subnets_1_ipv6_address_mode, networksecuritybindings_1.network_id AS networksecuritybindings_1_network_id, networksecuritybindings_1.port_security_enabled AS networksecuritybindings_1_port_security_enabled, networkqueuemappings_1.network_id AS networkqueuemappings_1_network_id, networkqueuemappings_1.queue_id AS networkqueuemappings_1_queue_id, externalnetworks_1.network_id AS externalnetworks_1_network_id \nFROM networks LEFT OUTER JOIN externalnetworks ON networks.id = externalnetworks.network_id LEFT OUTER JOIN subnets AS subnets_1 ON networks.id = subnets_1.network_id LEFT OUTER JOIN ipallocationpools AS ipallocationpools_1 ON subnets_1.id = ipallocationpools_1.subnet_id LEFT OUTER JOIN ipavailabilityranges AS ipavailabilityranges_1 ON ipallocationpools_1.id = ipavailabilityranges_1.allocation_pool_id LEFT OUTER JOIN networksecuritybindings AS networksecuritybindings_1 ON networks.id = networksecuritybindings_1.network_id LEFT OUTER JOIN networkqueuemappings AS networkqueuemappings_1 ON networks.id = networkqueuemappings_1.network_id LEFT OUTER JOIN externalnetworks AS externalnetworks_1 ON networks.id = externalnetworks_1.network_id \nWHERE externalnetworks.network_id IS NULL' ()
  2014-08-07 16:47:44.311 TRACE neutron.openstack.common.loopingcall

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1354226/+subscriptions


Follow ups

References