yahoo-eng-team team mailing list archive

[Bug 1202722] Re: Caught Exception in dhcp agent sync_state may block or delay configuration of new networks

 

** Changed in: neutron
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1202722

Title:
  Caught Exception in dhcp agent sync_state may block or delay
  configuration of new networks

Status in OpenStack Neutron (virtual network service):
  Fix Released

Bug description:
  
  The following error sometimes appears in dhcp-agent.log.
  dhcp_agent.ini is configured with no router_id defined. A single DHCP agent manages all DHCP servers on one node.

  This is the Traceback:
  2013-07-10 16:16:15 ERROR [quantum.agent.dhcp_agent] Unable to sync network state.
  Traceback (most recent call last):
    File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 152, in sync_state
      self.disable_dhcp_helper(deleted_id)
    File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 197, in disable_dhcp_helper
      self.disable_isolated_metadata_proxy(network)
    File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 340, in disable_isolated_metadata_proxy
      pm.disable()
    File "/usr/lib/python2.7/dist-packages/quantum/agent/linux/external_process.py", line 67, in disable
      ip_wrapper.netns.execute(cmd)
    File "/usr/lib/python2.7/dist-packages/quantum/agent/linux/ip_lib.py", line 407, in execute
      check_exit_code=check_exit_code)
    File "/usr/lib/python2.7/dist-packages/quantum/agent/linux/utils.py", line 61, in execute
      raise RuntimeError(m)
  RuntimeError:...

  
  The dhcp_agent.py in commit 1bd456371f9909d5cb33536e84a3fdd7aac40f8c shows:

      def sync_state(self):
          """Sync the local DHCP state with Neutron."""
          LOG.info(_('Synchronizing state'))
          pool = eventlet.GreenPool(cfg.CONF.num_sync_threads)
          known_network_ids = set(self.cache.get_network_ids())

          try:
              active_networks = self.plugin_rpc.get_active_networks_info()
              active_network_ids = set(network.id for network in active_networks)
              for deleted_id in known_network_ids - active_network_ids:
                  self.disable_dhcp_helper(deleted_id)

              for network in active_networks:
                  pool.spawn_n(self.configure_dhcp_for_network, network)

          except Exception:
              self.needs_resync = True
              LOG.exception(_('Unable to sync network state.'))

  
  When disable_dhcp_helper() raises inside the first loop, control jumps straight to the except block and the entire "for network in active_networks" loop is skipped. As a result, DHCP configuration for those networks is either delayed or never happens.
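
  The skip can be reproduced in isolation. The sketch below copies only the
  control flow of the quoted sync_state; the callables, the network ids and
  the returned flag are hypothetical stand-ins (they are not part of the
  agent), and the active networks are configured synchronously rather than
  through an eventlet pool.

```python
def sync_state(known_ids, active_ids, disable, configure):
    """Mimic the quoted control flow with stand-in callables."""
    try:
        for deleted_id in sorted(known_ids - active_ids):
            disable(deleted_id)       # an exception here jumps to "except"
        for network_id in sorted(active_ids):
            configure(network_id)     # not reached once disable() has raised
    except Exception:
        return True                   # stands in for self.needs_resync = True
    return False


def failing_disable(network_id):
    # Hypothetical failure, standing in for the RuntimeError in the traceback.
    raise RuntimeError("cannot disable DHCP for %s" % network_id)


configured = []
needs_resync = sync_state({"net-old", "net-a"}, {"net-a", "net-b"},
                          failing_disable, configured.append)
```

  After this run, needs_resync is set and configured is still empty: one
  failure while disabling a deleted network kept both active networks from
  being configured.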

  This routine touches many networks in a system, so any exception
  should be caught inside disable_dhcp_helper(). That way a failure on
  one network, whether caused by a bug or not, does not block
  processing of the others.
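
  One possible shape for that containment is sketched below. It is not
  necessarily the fix that was released; safe_disable, the callables and
  the returned flag are hypothetical names used for illustration only, and
  the active networks are configured synchronously for brevity.

```python
import logging

LOG = logging.getLogger(__name__)


def safe_disable(disable, network_id):
    """Call disable(network_id); log a failure instead of propagating it."""
    try:
        disable(network_id)
        return True
    except Exception:
        LOG.exception("Unable to disable DHCP for %s", network_id)
        return False


def sync_state(known_ids, active_ids, disable, configure):
    """Disable deleted networks one by one, then configure the active ones."""
    needs_resync = False
    for deleted_id in sorted(known_ids - active_ids):
        if not safe_disable(disable, deleted_id):
            needs_resync = True       # retry later, but keep going
    for network_id in sorted(active_ids):
        configure(network_id)         # reached even if a disable failed
    return needs_resync
```

  With a disable callable that always raises, this version still calls
  configure for every active network and reports that a resync is needed,
  so the failed cleanup can be retried without delaying new networks.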

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1202722/+subscriptions