yahoo-eng-team team mailing list archive

[Bug 1290068] [NEW] Some dhcp tap devices remain after tempest run

 

Public bug reported:

After running tempest.api.network, 'ip link' shows many tap devices that
were not cleaned up. These are tap devices for dhcp ports that no longer
exist in neutron.

The sequence that leaves these taps behind is:

- in dhcp_agent.py, call_driver() is called with 'enable' for the
network.

- in dhcp.py, DeviceManager.setup() makes an RPC to create the logical
dhcp port and sets up the tap device, both successfully. setup() finally
calls _set_default_route(), which attempts another RPC to the plugin to
get the port. But tempest has finished the test and deleted the network
in the time since the port was created, so a NetworkNotFound is raised.
https://github.com/openstack/neutron/blob/5f8617bacf05d02db995289e734345bebab8124e/neutron/agent/linux/dhcp.py#L652

- the NetworkNotFound is caught by call_driver() and the network is not
put into the cache.
https://github.com/openstack/neutron/blob/5f8617bacf05d02db995289e734345bebab8124e/neutron/agent/dhcp_agent.py#L214

- then network_delete_end is processed, but this does nothing as there
is no network in the cache. So unplug() is never called for the dhcp
tap device.
https://github.com/openstack/neutron/blob/5f8617bacf05d02db995289e734345bebab8124e/neutron/agent/dhcp_agent.py#L224
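
The sequence above can be reproduced in miniature. What follows is only a sketch with hypothetical stand-ins (FakePlugin, FakeAgent and their methods are invented for illustration and are not the real neutron classes or RPC API); it models the cache check and the swallowed exception, nothing more.

```python
class NetworkNotFound(Exception):
    """Stand-in for the error the plugin raises for a deleted network."""


class FakePlugin:
    """Hypothetical plugin stub; not the real neutron RPC API."""

    def __init__(self):
        self.network_exists = True

    def create_dhcp_port(self, network_id):
        return {"id": "port-1", "network_id": network_id}

    def get_dhcp_port(self, port_id):
        if not self.network_exists:
            raise NetworkNotFound(port_id)
        return {"id": port_id}


class FakeAgent:
    """Hypothetical agent that folds the driver and DeviceManager together."""

    def __init__(self):
        self.plugin = FakePlugin()
        self.cache = set()  # networks the agent believes it serves
        self.taps = set()   # tap devices present on the host

    def setup(self, network_id):
        port = self.plugin.create_dhcp_port(network_id)
        self.taps.add("tap-" + port["id"])
        # Race window: the test deletes the network right here.
        self.plugin.network_exists = False
        # Models the extra RPC made from _set_default_route().
        self.plugin.get_dhcp_port(port["id"])

    def call_driver_enable(self, network_id):
        try:
            self.setup(network_id)
        except NetworkNotFound:
            return False  # swallowed; the tap created above is not removed
        return True

    def enable_dhcp_helper(self, network_id):
        if self.call_driver_enable(network_id):
            self.cache.add(network_id)

    def network_delete_end(self, network_id):
        if network_id not in self.cache:
            return  # nothing cached, so nothing is unplugged
        self.cache.discard(network_id)
        self.taps.clear()  # stands in for unplug()


agent = FakeAgent()
agent.enable_dhcp_helper("net-1")
agent.network_delete_end("net-1")
print(sorted(agent.taps))  # ['tap-port-1']
```

The tap outlives the network because the only cleanup path is keyed on the cache, and the cache entry is never written once setup() raises.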
 

It's not clear to me why _set_default_route() ever needs to make the plugin.get_dhcp_port() RPC. When the dhcp agent calls the 'enable' method on the helper, DeviceManager.setup() already has the port and could pass it in the chained call to _set_default_route().
And when the dhcp agent calls the 'restart' method, that calls enable(), but it then also calls DeviceManager.update(), which just calls _set_default_route() a second time. _set_default_route() also gets called on every reload_allocation, so it already runs frequently enough.
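
One possible shape of that suggestion, as a sketch only: setup() hands the port it already holds to _set_default_route(), so no second RPC happens inside the race window. StubPlugin, SketchDeviceManager, the routes list and the two-argument _set_default_route() signature are all assumptions made for illustration; this is not the actual neutron code or the eventual patch.

```python
class NetworkNotFound(Exception):
    """Stand-in for the error raised when the network is already gone."""


class StubPlugin:
    """Hypothetical plugin stub that counts get_dhcp_port() RPCs."""

    def __init__(self):
        self.network_deleted = False
        self.get_calls = 0

    def create_dhcp_port(self, network_id):
        return {"id": "port-1", "network_id": network_id}

    def get_dhcp_port(self, port_id):
        self.get_calls += 1
        if self.network_deleted:
            raise NetworkNotFound(port_id)
        return {"id": port_id}


class SketchDeviceManager:
    """Hypothetical manager; records routes instead of touching the host."""

    def __init__(self, plugin):
        self.plugin = plugin
        self.routes = []

    def setup(self, network_id):
        port = self.plugin.create_dhcp_port(network_id)
        # The network may be deleted at this point; setup() no longer
        # depends on a second RPC, so it still completes.
        self.plugin.network_deleted = True
        self._set_default_route(network_id, port)
        return port

    def _set_default_route(self, network_id, port):
        # Assumed signature: the caller supplies the port it already has.
        self.routes.append((network_id, port["id"]))


plugin = StubPlugin()
manager = SketchDeviceManager(plugin)
manager.setup("net-1")
print(plugin.get_calls, manager.routes)  # 0 [('net-1', 'port-1')]
```

With setup() returning normally, the agent would put the network into its cache, and the later network_delete_end would find it and unplug the tap.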

** Affects: neutron
     Importance: Undecided
     Assignee: Darragh O'Reilly (darragh-oreilly)
         Status: In Progress


** Tags: l3-ipam-dhcp

** Changed in: neutron
     Assignee: (unassigned) => Darragh O'Reilly (darragh-oreilly)

** Changed in: neutron
       Status: New => In Progress

** Tags added: l3-ipam-dhcp

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1290068

Title:
  Some dhcp tap devices remain after tempest run

Status in OpenStack Neutron (virtual network service):
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1290068/+subscriptions
