yahoo-eng-team team mailing list archive

[Bug 1234026] Re: race condition: disable_dhcp_helper

 

*** This bug is a duplicate of bug 1251874 ***
    https://bugs.launchpad.net/bugs/1251874

** This bug has been marked a duplicate of bug 1251874
   reduce severity of network notfound trace when looked up by dhcp agent

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1234026

Title:
  race condition: disable_dhcp_helper

Status in OpenStack Neutron (virtual network service):
  Fix Committed

Bug description:
  While investigating https://bugs.launchpad.net/bugs/1232525, I found a
  race condition in disable_dhcp_helper. Here is an example gate log:
  http://logs.openstack.org/20/48720/3/gate/gate-tempest-devstack-vm-neutron/f10cd53/logs/screen-q-dhcp.txt.gz?level=TRACE

  The dhcp_agent has a periodic task, periodic_resync (spawned with
  eventlet), which runs sync_state and calls
  disable_dhcp_helper(deleted_id).

  However, an agent notification such as network.delete.end (related
  ones like update or refresh may also cause it) invokes
  disable_dhcp_helper as well.

      def disable_dhcp_helper(self, network_id):
          """Disable DHCP for a network known to the agent."""
          network = self.cache.get_network_by_id(network_id)
          if network:
              if (self.conf.use_namespaces and
                  self.conf.enable_isolated_metadata):
                  self.disable_isolated_metadata_proxy(network)
              if self.call_driver('disable', network):
                  self.cache.remove(network)

  class NetworkCache(object):
      def remove(self, network):
          del self.cache[network.id]
          ...

  So two concurrent disable_dhcp_helper calls can both see that network
  is not None. Whichever of them calls self.cache.remove after the other
  then raises KeyError, because the entry has already been deleted.
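
  The interleaving described above can be reproduced deterministically
  with a small stand-alone sketch. This is not the agent code: Network,
  the put() method, disable_steps and run_interleaved are hypothetical
  names, the cache models only the id mapping, and a generator yield
  stands in for the point where the driver call might hand control to
  another green thread (an assumption about where the switch happens).

```python
class Network:
    """Hypothetical stand-in for a network object; only the id matters."""

    def __init__(self, network_id):
        self.id = network_id


class NetworkCache:
    """Simplified cache: models only the id -> network mapping."""

    def __init__(self):
        self.cache = {}

    def put(self, network):
        self.cache[network.id] = network

    def get_network_by_id(self, network_id):
        return self.cache.get(network_id)

    def remove(self, network):
        # Same unconditional delete as in the bug report.
        del self.cache[network.id]


def disable_steps(cache, network_id):
    """Generator model of disable_dhcp_helper.

    The yield marks an assumed switch point between the lookup and
    the removal (for example, while the driver call is running).
    """
    network = cache.get_network_by_id(network_id)
    if network:
        yield "driver call in progress"
        cache.remove(network)


def run_interleaved():
    """Drive two callers through the problematic ordering."""
    cache = NetworkCache()
    cache.put(Network("net-1"))

    resync = disable_steps(cache, "net-1")   # periodic task path
    notify = disable_steps(cache, "net-1")   # notification path

    # Both callers look up the network and find it.
    next(resync)
    next(notify)

    # The first caller finishes and removes the entry.
    for _ in resync:
        pass

    # The second caller resumes and tries to remove it again.
    try:
        for _ in notify:
            pass
    except KeyError as exc:
        return exc
    return None
```

  Calling run_interleaved() returns the KeyError raised by the second
  remove(), which matches the failure mode described above.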

  My simplest fix is to add a check for the network in remove(), such as:

      if network.id not in self.cache:
          return

  But a race still exists, even though the time window is much smaller.
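
  One possible way to avoid a separate check and delete, shown here
  only as a sketch, is to let a single dict.pop call with a default do
  both. TolerantNetworkCache, Network and put() are hypothetical names,
  and the elided remainder of the real remove() is not modeled, so this
  says nothing about whatever other state that method cleans up.

```python
class Network:
    """Hypothetical stand-in for a network object; only the id matters."""

    def __init__(self, network_id):
        self.id = network_id


class TolerantNetworkCache:
    """Simplified cache whose remove() tolerates a missing entry."""

    def __init__(self):
        self.cache = {}

    def put(self, network):
        self.cache[network.id] = network

    def get_network_by_id(self, network_id):
        return self.cache.get(network_id)

    def remove(self, network):
        """Remove the network if present.

        Returns True if this call removed the entry and False if it
        was already gone. pop() with a default never raises KeyError,
        so there is no gap between a membership test and a delete.
        """
        return self.cache.pop(network.id, None) is not None
```

  With this shape, a second remove() of the same network returns False
  instead of raising, and the caller can decide whether the other
  cleanup steps still need to run.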

  I'm not quite sure about this approach. Any help?

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1234026/+subscriptions