yahoo-eng-team team mailing list archive

[Bug 1950679] Re: [ovn] neutron_ovn_db_sync_util hangs on sync_routers_and_rports

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/817637
Committed: https://opendev.org/openstack/neutron/commit/7e2f73350ffdc90f7b340788db36edc439f96f6e
Submitter: "Zuul (22348)"
Branch:    master

commit 7e2f73350ffdc90f7b340788db36edc439f96f6e
Author: Daniel Speichert <Daniel_Speichert@xxxxxxxxxxx>
Date:   Thu Nov 11 13:18:49 2021 -0500

    [OVN] Fix deadlock in neutron_ovn_db_sync_util.py
    
    A feature to synchronize OVN DB connections when handling events
    introduced in 90980f496cfa3cc5df1c93cf834a44f33d3f1f6f is not applicable
    to the offline sync process executed by this utility.
    
    Closes-bug: #1950679
    Change-Id: Iac4eb364bfc1c44f5d4526bae71967bede29cc36


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1950679

Title:
  [ovn] neutron_ovn_db_sync_util hangs on sync_routers_and_rports

Status in neutron:
  Fix Released

Bug description:
  neutron-ovn-db-sync-util hangs in certain scenarios while running
  sync_routers_and_rports.

  Specifically, it seems to hang in the following call chain:
  self.l3_plugin.get_routers(ctx)
  -> model_query.get_collection(...), called by get_routers(...) in neutron.db.l3_db.py
  -> get_collection(...) in neutron_lib.db.model_query.py runs dict_funcs, which somehow reach the nb_ovn property accessor in neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver.py
  -> which runs self._post_fork_event.wait()

  That event seems never to be "set", so wait() blocks further execution,
  possibly because the event is not applicable to this flow.

  It looks like the neutron-ovn-db-sync-util might need to always "set"
  it since it mocks other parts of the NB/DB client in a similar fashion
  to some unit tests.

  I'm not yet sure exactly what circumstances lead to that access and
  that wait(); syncing via the util to an empty OVN NB/DB seems to
  work. I see the issue more frequently on subsequent runs.
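
  The blocking pattern described above can be sketched in isolation. This
  is a minimal illustration, not neutron code: FakeMechDriver,
  nb_ovn_with_timeout and demo are hypothetical names, and a timeout is
  added to wait() only so the example cannot hang. It assumes the event is
  a standard threading.Event, and it shows that an accessor guarded by an
  event that nothing ever sets will block, while setting the event up
  front (as suggested for the sync utility) lets the access proceed.

```python
import threading


class FakeMechDriver:
    """Hypothetical stand-in for the OVN mechanism driver (not neutron code)."""

    def __init__(self):
        # Event that a server process would set after its workers fork.
        self._post_fork_event = threading.Event()
        self._nb_ovn = object()  # placeholder for the NB DB client

    def nb_ovn_with_timeout(self, timeout):
        # Same shape as a blocking accessor, but with a timeout so this
        # demonstration cannot hang. Returns None if the event was never set.
        if not self._post_fork_event.wait(timeout):
            return None
        return self._nb_ovn


def demo():
    driver = FakeMechDriver()

    # Offline flow: nothing ever sets the event, so the accessor times out.
    blocked = driver.nb_ovn_with_timeout(0.05) is None

    # Workaround suggested in the report: set the event up front.
    driver._post_fork_event.set()
    unblocked = driver.nb_ovn_with_timeout(0.05) is not None

    return blocked, unblocked
```

  Calling demo() exercises both cases. Without the timeout, the first
  access would wait forever, which matches the hang reported here.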

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1950679/+subscriptions


