← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1776469] Re: neutron-netns-cleanup explodes when trying to delete an OVS internal port

 

Reviewed:  https://review.openstack.org/574712
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7458575cfbc00a9bedf4d514a95e9b891639d5e8
Submitter: Zuul
Branch:    master

commit 7458575cfbc00a9bedf4d514a95e9b891639d5e8
Author: Miguel Angel Ajo <majopela@xxxxxxxxxx>
Date:   Tue Jun 12 14:35:39 2018 +0200

    Convert missing exception on device.link.delete()
    
    Once we started using oslo.privsep the call to device.link.delete()
    should return RuntimeError when the device can't be handled by ip link
    for example, when it's an ovs internal device.
    
    Closes-Bug: #1776469
    
    Change-Id: Ibf4b0bbb54aef38fc569036880668c745cb5c096


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1776469

Title:
  neutron-netns-cleanup explodes when trying to delete an OVS internal
  port

Status in neutron:
  Fix Released

Bug description:
  
  Apparently, the exception is not bubbling up out of privsep, and the cleanup exits instead of retrying with ovsdb del port:

  https://github.com/openstack/neutron/blob/100491cec72ecf694cc8cbd6cd17b66a191a5bd7/neutron/cmd/netns_cleanup.py#L124

  
  def unplug_device(conf, device):
      orig_log_fail_as_error = device.get_log_fail_as_error()
      device.set_log_fail_as_error(False)
      try:
          device.link.delete()
      except RuntimeError:
          device.set_log_fail_as_error(orig_log_fail_as_error)
          # Maybe the device is OVS port, so try to delete
          ovs = ovs_lib.BaseOVS()
          bridge_name = ovs.get_bridge_for_iface(device.name)
          if bridge_name:
              bridge = ovs_lib.OVSBridge(bridge_name)
              bridge.delete_port(device.name)
          else:
              LOG.debug('Unable to find bridge for device: %s', device.name)
      finally:
          device.set_log_fail_as_error(orig_log_fail_as_error)


  neutron-netns-cleanup --config-file /usr/share/neutron/neutron-
  dist.conf --config-dir /usr/share/neutron/l3_agent --config-file
  /etc/neutron/neutron.conf --config-file /etc/neutron/l3_agent.ini
  --config-dir /etc/neutron/conf.d/common --config-dir
  /etc/neutron/conf.d/neutron-l3-agent --agent-type l3 -d --force

  
  2018-06-12 11:39:26.868 254573 INFO neutron.common.config [-] Logging enabled!
  2018-06-12 11:39:26.868 254573 INFO neutron.common.config [-] /usr/bin/neutron-netns-cleanup version 13.0.0.0b2.dev174
  2018-06-12 11:39:26.868 254573 DEBUG neutron.common.config [-] command line: /usr/bin/neutron-netns-cleanup --config-file /usr/share/neutron/neutron-dist.conf --config-dir /usr/share/neutron/l3_agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l3_agent.ini --config-dir /etc/neutron/conf.d/common --config-dir /etc/neutron/conf.d/neutron-l3-agent --agent-type l3 -d --force setup_logging /usr/lib/python2.7/site-packages/neutron/common/config.py:104
  2018-06-12 11:39:26.869 254573 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/usr/share/neutron/neutron-dist.conf', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/l3_agent.ini', '--config-dir', '/etc/neutron/conf.d/neutron-l3-agent', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpNU7Loh/privsep.sock']
  2018-06-12 11:39:27.455 254573 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap
  2018-06-12 11:39:27.456 254573 DEBUG oslo.privsep.daemon [-] Accepted privsep connection to /tmp/tmpNU7Loh/privsep.sock __init__ /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:331
  2018-06-12 11:39:27.386 254707 INFO oslo.privsep.daemon [-] privsep daemon starting
  2018-06-12 11:39:27.390 254707 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0
  2018-06-12 11:39:27.395 254707 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN|CAP_SYS_ADMIN/CAP_NET_ADMIN|CAP_SYS_ADMIN/none
  2018-06-12 11:39:27.395 254707 INFO oslo.privsep.daemon [-] privsep daemon running as pid 254707
  2018-06-12 11:39:27.458 254707 DEBUG oslo.privsep.daemon [-] privsep: request[140529299646096]: (1,) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:443
  2018-06-12 11:39:27.458 254707 DEBUG oslo.privsep.daemon [-] privsep: reply[140529299646096]: (2,) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:456
  2018-06-12 11:39:27.458 254707 DEBUG oslo.privsep.daemon [-] privsep: request[140529299646096]: (3, 'neutron.privileged.agent.linux.ip_lib.list_netns', (), {}) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:443
  2018-06-12 11:39:27.501 254707 DEBUG oslo.privsep.daemon [-] privsep: reply[140529299646096]: (4, ['qdhcp-64aa11b0-d9ff-47c3-9a44-2906bc22d724', 'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a', 'qdhcp-4b523888-7121-4133-b0c1-ff6a81a40dcd']) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:456
  2018-06-12 11:39:30.179 254573 DEBUG neutron.agent.linux.utils [-] Unable to access /var/lib/neutron/dhcp/qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a/pid get_value_from_file /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:254
  2018-06-12 11:39:30.179 254707 DEBUG oslo.privsep.daemon [-] privsep: request[140529299646096]: (3, 'neutron.privileged.agent.linux.ip_lib.list_netns', (), {}) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:443
  2018-06-12 11:39:30.179 254707 DEBUG oslo.privsep.daemon [-] privsep: reply[140529299646096]: (4, ['qdhcp-64aa11b0-d9ff-47c3-9a44-2906bc22d724', 'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a', 'qdhcp-4b523888-7121-4133-b0c1-ff6a81a40dcd']) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:456
  2018-06-12 11:39:30.180 254573 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a', 'netstat', '-nlp'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108
  2018-06-12 11:39:30.189 254573 DEBUG oslo_rootwrap.client [-] Popen for ['sudo', 'neutron-rootwrap-daemon', '/etc/neutron/rootwrap.conf'] command has been instantiated _initialize /usr/lib/python2.7/site-packages/oslo_rootwrap/client.py:73
  2018-06-12 11:39:30.327 254573 INFO oslo_rootwrap.client [-] Spawned new rootwrap daemon process with pid=255039
  2018-06-12 11:39:30.411 254573 DEBUG neutron.agent.linux.utils [-] Running command: ['ps', '-o', 'ppid=', '64523'] create_process /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:92
  2018-06-12 11:39:30.450 254573 DEBUG neutron.agent.linux.utils [-] Running command: ['ps', '-o', 'ppid=', '64507'] create_process /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:92
  2018-06-12 11:39:30.486 254573 WARNING neutron.cmd.netns_cleanup [-] Killing (15) [64507] /usr/sbin/haproxy -Ds -f /var/lib/neutron/ns-metadata-proxy/c24debdc-7bcd-40d7-9
  2018-06-12 11:39:30.486 254573 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['kill', '-15', '64507'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108
  2018-06-12 11:39:30.492 254573 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a', 'netstat', '-nlp'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108
  2018-06-12 11:39:31.543 254573 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a', 'netstat', '-nlp'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108
  2018-06-12 11:39:31.615 254573 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a', 'find', '/sys/class/net', '-maxdepth', '1', '-type', 'l', '-printf', '%f '] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108
  2018-06-12 11:39:31.658 254707 DEBUG oslo.privsep.daemon [-] privsep: request[140529299646096]: (3, 'neutron.privileged.agent.linux.ip_lib.delete_interface', (u'qr-5c4dbfdf-2e', 'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a'), {}) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:443
  2018-06-12 11:39:31.696 254707 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[140529299646096]: (95, 'Operation not supported') loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:449
  Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 445, in loop
      reply = self._process_cmd(*msg)
    File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 428, in _process_cmd
      ret = func(*f_args, **f_kwargs)
    File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 209, in _wrap
      return func(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 242, in delete_interface
      _run_iproute_link("del", ifname, namespace, **kwargs)
    File "/usr/lib/python2.7/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 131, in _run_iproute_link
      return ip.link(command, index=idx, **kwargs)
    File "/usr/lib/python2.7/site-packages/pyroute2/iproute.py", line 1310, in link
      msg_flags=msg_flags)
    File "/usr/lib/python2.7/site-packages/pyroute2/netlink/nlsocket.py", line 804, in nlm_request
      return do_try()
    File "/usr/lib/python2.7/site-packages/pyroute2/netlink/nlsocket.py", line 783, in do_try
      callback=callback)
    File "/usr/lib/python2.7/site-packages/pyroute2/netlink/nlsocket.py", line 648, in get
      raise msg['header']['error']
  NetlinkError: (95, 'Operation not supported')
  2018-06-12 11:39:31.697 254707 DEBUG oslo.privsep.daemon [-] privsep: reply[140529299646096]: (5, 'pyroute2.netlink.exceptions.NetlinkError', (95, 'Operation not supported')) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:456
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup [-] Error unable to destroy namespace: qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a: NetlinkError: (95, 'Operation not supported')
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup Traceback (most recent call last):
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup   File "/usr/lib/python2.7/site-packages/neutron/cmd/netns_cleanup.py", line 251, in destroy_namespace
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup     unplug_device(conf, device)
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup   File "/usr/lib/python2.7/site-packages/neutron/cmd/netns_cleanup.py", line 126, in unplug_device
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup     device.link.delete()
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 549, in delete
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup     privileged.delete_interface(self.name, self._parent.namespace)
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup   File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, in _wrap
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup     return self.channel.remote_call(name, args, kwargs)
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup   File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in remote_call
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup     raise exc_type(*result[2])
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup NetlinkError: (95, 'Operation not supported')
  2018-06-12 11:39:31.709 254573 ERROR neutron.cmd.netns_cleanup
  2018-06-12 11:39:31.712 254573 INFO oslo_rootwrap.client [-] Stopping rootwrap daemon process with pid=255039


  neutron l3-agent is a bit more smart while deleting ports and it will
  check if br-int is in the system first, then delete the port:

  11:11:40.861 85586 DEBUG neutron.agent.l3.router_info [-] removing
  internal network: port(567309c3-a831-4086-bc77-d3523b149e74) interface
  (qr-567309c3-a8) internal_network_removed /usr/lib/python2.7/site-
  packages/neutron/agent/l3/router_info.py:478

  11:11:40.862 95332 DEBUG oslo.privsep.daemon [-] privsep: request[140596640998416]: (3, 'neutron.privileged.agent.linux.ip_lib.interface_exists', (u'qr-567309c3-a8', u'qrouter-c24debdc-7bcd-40d7-90b9-32e0ec9bb11a'), {}) loop /usr/lib/python2.7/site-packages/oslo_privsep/daemon.py:443
  13:11 <ajo> 2018-06-12 11:11:40.877 85586 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn command(idx=0): DelPortCommand(if_exists=True, bridge=br-int, port=qr-567309c3-a8) do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1776469/+subscriptions


References