← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1574881] Re: use_helper_for_ns_read=False breaks dhcp agent and l3 agent when /var/run/netns doesn't exist

 

Reviewed:  https://review.openstack.org/237653
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=339a1ccbb931ff906143ee6339376836f6f4563e
Submitter: Jenkins
Branch:    master

commit 339a1ccbb931ff906143ee6339376836f6f4563e
Author: Ryan Moats <rmoats@xxxxxxxxxx>
Date:   Tue Oct 20 15:51:37 2015 +0000

    Revert "Improve performance of ensure_namespace"
    
    This reverts commit 81823e86328e62850a89aef9f0b609bfc0a6dacd.
    
    Unneeded optimization: this commit only improves execution
    time on the order of milliseconds, which is less than 1% of
    the total router update execution time at the network node.
    
    This also
    
    Closes-bug: #1574881
    
    Change-Id: Icbcdf4725ba7d2e743bb6761c9799ae436bd953b


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1574881

Title:
  use_helper_for_ns_read=False breaks dhcp agent and l3 agent when
  /var/run/netns doesn't exist

Status in neutron:
  Fix Released

Bug description:
  A bug was introduced in https://review.openstack.org/#/c/227589/ .
  The idea of that patch was to improve performance by not shelling out
  to "ip netns list" to get a list of network namespaces and doing it in
  python instead (os.listdir).

  The iproute2 C code which implements "ip netns list" will first check
  if the "/var/run/netns" directory exists, before trying to enumerate
  the contents. The Python code tries to enumerate the directory
  contents (os.listdir), but doesn't check for the directory's
  existence. ip netns add would be able to create the directory.
  However, since an exception is thrown, that code path is no longer
  reached.

  The result is that the agents are non-functional when the directory is
  not present, and are unable to recover on their own.

  
  When use_helper_for_ns_read is True (the default value), then the existence of the directory is a non-issue, as "ip netns list" is run instead and sidesteps the broken behavior.

  Steps to reproduce:

  1- Start with a machine with no /var/run/netns directory (such as a newly provisioned VM)
  2- Disable use_helper_for_ns_read.
  In devstack:
  [[post-config|$NEUTRON_CONF]]
  [agent]
  use_helper_for_ns_read=False

  3- Run stack.sh
  4- At this point, q-l3 errors should already start appearing in the logs
  5- Create a new network and subnetwork
  6- There will be stacktraces in the dhcp agent logs.
  7- Observe that no router or dhcp namespaces were created

  Expected behavior:

  - No errors in the logs
  - /var/run/netns directory and mountpoint created (if not yet present)
  - network namespaces are created

  One possible fix would be to merge
  (https://review.openstack.org/#/c/237653/) and restore the old
  behavior.


  q-dhcp.log:

  Unable to plug DHCP port for network 37b27bd3-5072-4c3a-b26d-b7b67c2bc788. Releasing port.
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp Traceback (most recent call last):
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp   File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 1234, in setup
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp     mtu=network.get('mtu'))
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp   File "/opt/stack/neutron/neutron/agent/linux/interface.py", line 250, in plug
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp     bridge, namespace, prefix, mtu)
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp   File "/opt/stack/neutron/neutron/agent/linux/interface.py", line 354, in plug_new
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp     namespace_obj = ip.ensure_namespace(namespace)
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp   File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 195, in ensure_namespace
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp     if not self.netns.exists(name):
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp   File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 883, in exists
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp     return name in os.listdir(IP_NETNS_PATH)
  2016-04-22 23:53:21.492 TRACE neutron.agent.linux.dhcp OSError: [Errno 2] No such file or directory: '/var/run/netns'


  Unable to enable dhcp for 37b27bd3-5072-4c3a-b26d-b7b67c2bc788.
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent Traceback (most recent call last):
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent   File "/opt/stack/neutron/neutron/agent/dhcp/agent.py", line 112, in call_driver
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent     getattr(driver, action)(**action_kwargs)
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent   File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 209, in enable
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent     interface_name = self.device_manager.setup(self.network)
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent   File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 1240, in setup
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent     self.plugin.release_dhcp_port(network.id, port.device_id)
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent     self.force_reraise()
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent     six.reraise(self.type_, self.value, self.tb)
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent   File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 1234, in setup
  2016-04-22 23:53:21.614 TRACE neutron.agent.dhcp.agent     mtu=network.get('mtu'))

  q-l3.log(repeatedly):

  2016-04-22 23:54:14.659 ESC[01;31mERROR oslo_service.periodic_task
  req-43829b71-bb53-48d1-826b-7421a9a3612e None None Error during L3NATAgentWithStateReport.periodic_sync_routers_task
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task Traceback (most recent call last):
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task   File "/usr/local/lib/python2.7/dist-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task     task(self, context)
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task   File "/opt/stack/neutron/neutron/agent/l3/agent.py", line 540, in periodic_sync_routers_task
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task     with self.namespaces_manager as ns_manager:
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task   File "/opt/stack/neutron/neutron/agent/l3/namespace_manager.py", line 72, in __enter__
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task     self._all_namespaces = self.list_all()
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task   File "/opt/stack/neutron/neutron/agent/l3/namespace_manager.py", line 119, in list_all
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task     namespaces = root_ip.get_namespaces()
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task   File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 250, in get_namespaces
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task     return os.listdir(IP_NETNS_PATH)
  2016-04-22 23:54:14.659 TRACE oslo_service.periodic_task OSError: [Errno 2] No such file or directory: '/var/run/netns'

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1574881/+subscriptions


References