← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1426427] Re: Improve tunnel_sync server side rpc to handle race conditions

 

[Expired for neutron because there has been no activity for 60 days.]

** Changed in: neutron
       Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1426427

Title:
  Improve tunnel_sync server side rpc to handle race conditions

Status in neutron:
  Expired

Bug description:
  We  have a concern that we may have race conditions with the following
  code snippet:

              if host:
                  host_endpoint = driver.obj.get_endpoint_by_host(host)
                  ip_endpoint = driver.obj.get_endpoint_by_ip(tunnel_ip)

                  if (ip_endpoint and ip_endpoint.host is None
                      and host_endpoint is None):
                      driver.obj.delete_endpoint(ip_endpoint.ip_address)
                  elif (ip_endpoint and ip_endpoint.host != host):
                      msg = (_("Tunnel IP %(ip)s in use with host %(host)s"),
                             {'ip': ip_endpoint.ip_address,
                              'host': ip_endpoint.host})
                      raise exc.InvalidInput(error_message=msg)
                  elif (host_endpoint and host_endpoint.ip_address != tunnel_ip):
                      # Notify all other listening agents to delete stale tunnels
                      self._notifier.tunnel_delete(rpc_context,
                          host_endpoint.ip_address, tunnel_type)
                      driver.obj.delete_endpoint(host_endpoint.ip_address)

  Consider two threads (A and B), where for

  Thread A we have following use case:
  if Host is passed from an agent and it is not found in DB but the passed tunnel_ip is found, delete the endpoint from DB and add the endpoint with 
  (tunnel_ip, host), it's an upgrade case.

  whereas for Thread B we have following use case:
  if passed host and tunnel_ip are not found in the DB, it is a new endpoint.

  Both threads will do the following in the end:

              tunnel = driver.obj.add_endpoint(tunnel_ip, host)
              tunnels = driver.obj.get_endpoints()
              entry = {'tunnels': tunnels}
              # Notify all other listening agents
              self._notifier.tunnel_update(rpc_context, tunnel.ip_address,
                                           tunnel_type)
              # Return the list of tunnels IP's to the agent
              return entry

  
  Since, Thread A first deletes the endpoints and adds it, we may have chances where Thread B doesn't get that endpoint in get_endpoints call during race condition.

  One way to overcome this problem would be instead of doing
  delete_endpoint we could introduce update_endpoint method in
  type_drivers.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1426427/+subscriptions


References