yahoo-eng-team team mailing list archive

[Bug 1841967] [NEW] ML2 mech driver sometimes receives network context without provider attributes in delete_network_postcommit

 

Public bug reported:

When a network is deleted, sometimes the delete_network_postcommit
method of my ML2 mechanism driver receives a network object in the
context that has the provider attributes set to None.

I am running Rocky (neutron 13.0.4) on CentOS 7.5 with RDO, deployed
with kolla-ansible. I have three controllers running neutron-server.

Specifically, the mechanism driver is networking-generic-switch. It
needs the provider information in order to configure VLANs on physical
switches, and without it I am left with stale switch configuration.
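
To illustrate why the missing attributes matter, here is a minimal sketch
of a driver-side check. It is not the networking-generic-switch code; the
function name and the warning text are hypothetical, and only the
provider:* keys are the standard provider attribute names.

```python
import logging

LOG = logging.getLogger(__name__)


def vlan_to_clean_up(network):
    """Return the VLAN ID to remove from switches, or None if unknown.

    Hypothetical helper: `network` is the network dict a mechanism
    driver would read from its context in delete_network_postcommit.
    """
    net_type = network.get("provider:network_type")
    if net_type is None:
        # Provider attributes are missing, so the VLAN cannot be
        # determined and the switch configuration would be left stale.
        LOG.warning("Network %s has no provider attributes", network["id"])
        return None
    if net_type != "vlan":
        # Nothing to remove for non-VLAN networks in this sketch.
        return None
    return network.get("provider:segmentation_id")


healthy = {
    "id": "net-1",
    "provider:network_type": "vlan",
    "provider:segmentation_id": 100,
}
broken = {
    "id": "net-1",
    "provider:network_type": None,
    "provider:segmentation_id": None,
}
```

With the healthy dict the helper returns the VLAN ID; with the broken one
it can only log a warning, because the information needed to clean up the
switch is gone.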

In my testing, reducing the number of neutron-server instances reduces
the likelihood of hitting this issue. I did not see it with only one
instance running, although I only tested about 10 times.

I have collected logs from a broken case and a working case, and one key
difference I can see is that in the working case I see two of these
messages, and in the broken case I see three:

Network 3ed87da6-0b3a-455a-b813-7d069dc9e112 has no segments _extend_network_dict_provider /usr/lib/python2.7/site-packages/neutron/plugins/ml2/managers.py:168

Indeed, _extend_network_dict_provider sets the provider attributes to
None if there are no segments found in the DB.
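
The behaviour described above can be sketched roughly as follows. This is
a simplified standalone model, not the neutron source: the function
signature and the segment dict layout are assumptions, and multi-segment
networks are ignored.

```python
# Standard provider attribute names exposed on a network dict.
PROVIDER_ATTRIBUTES = (
    "provider:network_type",
    "provider:physical_network",
    "provider:segmentation_id",
)


def extend_network_dict_provider(network, segments):
    """Fill the provider attributes of `network` from its segments."""
    if not segments:
        # No segments found: every provider attribute becomes None.
        for attr in PROVIDER_ATTRIBUTES:
            network[attr] = None
        return network
    segment = segments[0]  # single-segment case only in this sketch
    network["provider:network_type"] = segment["network_type"]
    network["provider:physical_network"] = segment["physical_network"]
    network["provider:segmentation_id"] = segment["segmentation_id"]
    return network


with_segment = extend_network_dict_provider(
    {"id": "net-1"},
    [{
        "network_type": "vlan",
        "physical_network": "physnet1",
        "segmentation_id": 100,
    }],
)
without_segment = extend_network_dict_provider({"id": "net-1"}, [])
```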

It seems to be a race condition between segment deletion and creation of
the _mech_context in the network precommit.
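
The suspected ordering problem can be shown with a toy, single-threaded
model. All names here are hypothetical and this is not neutron code; it
only demonstrates that the network dict handed to postcommit depends on
whether it is built before or after the segments are removed. Whether
this is the actual cause is unconfirmed.

```python
def build_network_dict(db, network_id):
    """Build the network dict from whatever segments are in the DB now."""
    segments = db["segments"].get(network_id, [])
    vlan = segments[0]["segmentation_id"] if segments else None
    return {"id": network_id, "provider:segmentation_id": vlan}


def delete_network(db, network_id, snapshot_first):
    """Return the network dict a postcommit hook would receive."""
    if snapshot_first:
        # Context built while the segments still exist.
        context = build_network_dict(db, network_id)
        db["segments"].pop(network_id, None)
    else:
        # Segments already gone (for example, removed by another
        # worker) by the time the context is built.
        db["segments"].pop(network_id, None)
        context = build_network_dict(db, network_id)
    return context


def fresh_db():
    return {"segments": {"net-1": [{"segmentation_id": 100}]}}


good = delete_network(fresh_db(), "net-1", snapshot_first=True)
bad = delete_network(fresh_db(), "net-1", snapshot_first=False)
```

In the first ordering the hook still sees VLAN 100; in the second it sees
None, which matches the symptom in the broken case.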

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1841967

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1841967/+subscriptions

