yahoo-eng-team team mailing list archive

[Bug 1611612] [NEW] linuxbridge and dhcp agents race removing tap

 

Public bug reported:

When a network is deleted, an exception can occur because the lb-agent
tries to remove the dhcp tap from the bridge at about the same time as
the dhcp-agent is deleting the tap. The unhandled exception leaves the
bridge not cleaned up and puts an error and stacktrace in the
logs.

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%20%5C%22self.remove_interface%5C%22

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 803, in network_delete
    self.agent.mgr.delete_bridge(bridge_name)
  File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 521, in delete_bridge
    self.remove_interface(bridge_name, interface)
  File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 568, in remove_interface
    if bridge_device.delif(interface_name):
  File "/opt/stack/new/neutron/neutron/agent/linux/bridge_lib.py", line 80, in delif
    return self._brctl(['delif', self.name, interface])
  File "/opt/stack/new/neutron/neutron/agent/linux/bridge_lib.py", line 55, in _brctl
    return ip_wrapper.netns.execute(cmd, run_as_root=True)
  File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 876, in execute
    log_fail_as_error=log_fail_as_error, **kwargs)
  File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 138, in execute
    raise RuntimeError(msg)
RuntimeError: Exit code: 1; Stdin: ; Stdout: ; Stderr: device tap1aa0d45a-39 is not a slave of brq6d449049-5c
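
One way the lb-agent side could tolerate this race is to treat a failed
delif as success when the interface is no longer on the bridge. The sketch
below is only an illustration under assumptions: FakeBridge, RacyBridge,
owns_interface and this standalone remove_interface are hypothetical
stand-ins, not neutron's actual classes or the fix that was merged.

```python
class FakeBridge:
    """Hypothetical stand-in for a bridge device (not neutron's class)."""

    def __init__(self, name, interfaces):
        self.name = name
        self.interfaces = set(interfaces)

    def owns_interface(self, interface):
        return interface in self.interfaces

    def delif(self, interface):
        # Mimic the failure in the traceback: removing an interface that
        # is no longer attached to the bridge raises RuntimeError.
        if interface not in self.interfaces:
            raise RuntimeError(
                "device %s is not a slave of %s" % (interface, self.name))
        self.interfaces.remove(interface)
        return True


class RacyBridge(FakeBridge):
    """Simulates another agent deleting the tap just before delif runs."""

    def delif(self, interface):
        self.interfaces.discard(interface)   # the other agent wins the race
        return super().delif(interface)      # so this call now raises


def remove_interface(bridge, interface):
    """Detach interface from bridge, tolerating concurrent removal.

    Returns True when the interface is off the bridge afterwards.
    """
    if not bridge.owns_interface(interface):
        return True
    try:
        bridge.delif(interface)
        return True
    except RuntimeError:
        # Re-check after the failure: if the interface is already gone,
        # someone else removed it and the cleanup can continue.
        if not bridge.owns_interface(interface):
            return True
        raise  # a genuine failure: the interface is still attached


bridge = RacyBridge("brq6d449049-5c", ["tap1aa0d45a-39"])
removed = remove_interface(bridge, "tap1aa0d45a-39")
```

With this shape the caller deleting the bridge would continue after a lost
race, while an error raised when the interface is still attached is
re-raised unchanged.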

** Affects: neutron
     Importance: Undecided
     Assignee: Darragh O'Reilly (darragh-oreilly)
         Status: In Progress


** Tags: linuxbridge

** Tags added: linuxbridge

** Changed in: neutron
     Assignee: (unassigned) => Darragh O'Reilly (darragh-oreilly)

** Changed in: neutron
       Status: New => In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1611612

Title:
  linuxbridge and dhcp agents race removing tap

Status in neutron:
  In Progress

Bug description:
  When a network is deleted, an exception can occur because the
  lb-agent tries to remove the dhcp tap from the bridge at about the
  same time as the dhcp-agent is deleting the tap. The unhandled
  exception leaves the bridge not cleaned up and puts an error and
  stacktrace in the logs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1611612/+subscriptions

