yahoo-eng-team team mailing list archive

[Bug 1950679] [NEW] [ovn] neutron_ovn_db_sync_util hangs on sync_routers_and_rports

Public bug reported:

neutron-ovn-db-sync-util hangs in certain scenarios while running
sync_routers_and_rports.

Specifically, it seems to hang along the following call chain:

self.l3_plugin.get_routers(ctx)
-> get_routers(...) in neutron/db/l3_db.py calls model_query.get_collection(...)
-> get_collection(...) in neutron_lib/db/model_query.py runs dict_funcs, which
   somehow reach the nb_ovn property accessor in
   neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py
-> the accessor calls self._post_fork_event.wait()

_post_fork_event is an event rather than a mutex. It never seems to be
"set", so wait() blocks further execution, presumably because setting it
is not applicable to this flow.
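
The blocking pattern described above can be reproduced in isolation with a
minimal sketch using only the Python standard library. The class below is a
hypothetical stand-in, not the Neutron mechanism driver; only the idea of a
property accessor that waits on _post_fork_event is taken from the report.
It assumes the event behaves like threading.Event, and it adds a timeout
(which the reported code path does not have) so that the example terminates
instead of hanging.

```python
import threading


class FakeDriver:
    """Hypothetical stand-in for a driver whose accessor waits on an event."""

    def __init__(self):
        # Nothing in this flow ever calls set() on the event.
        self._post_fork_event = threading.Event()
        self._nb_ovn = "nb-connection"  # placeholder value for illustration

    @property
    def nb_ovn(self):
        # A bare wait() would block forever here; the timeout is an
        # assumption added only to make the hang observable.
        if not self._post_fork_event.wait(timeout=0.1):
            raise TimeoutError("post-fork event was never set")
        return self._nb_ovn


def try_access(driver):
    """Return the attribute, or a message if the accessor would have hung."""
    try:
        return driver.nb_ovn
    except TimeoutError as exc:
        return f"blocked: {exc}"
```

Calling try_access() on a fresh FakeDriver reports the block; after
driver._post_fork_event.set() the same call returns the placeholder value.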

It looks like neutron-ovn-db-sync-util might need to set
_post_fork_event unconditionally, since it already mocks other parts of
the NB/DB client, much as some unit tests do.
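
One way the suggestion above could look is sketched below. This is not the
actual fix and not the real sync utility code: BaseDriver, SyncUtilDriver,
and post_fork_initialize are hypothetical names, and the sketch assumes that
_post_fork_event is a threading.Event that is normally set by some post-fork
hook. The idea is simply that a single-process tool which never forks sets
the event itself when it constructs its driver.

```python
import threading


class BaseDriver:
    """Hypothetical driver: accessors block until a post-fork hook runs."""

    def __init__(self):
        self._post_fork_event = threading.Event()
        self._nb_ovn = None

    def post_fork_initialize(self, connection):
        # In a forking server, a hook like this would release the waiters.
        self._nb_ovn = connection
        self._post_fork_event.set()

    @property
    def nb_ovn(self):
        self._post_fork_event.wait()
        return self._nb_ovn


class SyncUtilDriver(BaseDriver):
    """Hypothetical variant for a one-shot tool that never forks."""

    def __init__(self, connection):
        super().__init__()
        self._nb_ovn = connection
        # No post-fork hook will run in this process, so set the event
        # up front; later accessor calls then return immediately.
        self._post_fork_event.set()
```

With this arrangement, SyncUtilDriver("conn").nb_ovn returns without
blocking, whereas a BaseDriver whose hook is never called would wait
indefinitely.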

I'm not yet sure exactly which circumstances lead to that access and
that wait(). Syncing via the util to an empty OVN NB/DB seems to work;
I see the issue more frequently on subsequent runs.

** Affects: neutron
     Importance: Undecided
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1950679

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1950679/+subscriptions