yahoo-eng-team team mailing list archive
Message #93159
[Bug 2024205] Re: [OVN] Hash Ring nodes removed when "periodic worker" is killed
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2024205
Title:
[OVN] Hash Ring nodes removed when "periodic worker" is killed
Status in neutron:
Fix Released
Bug description:
Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2213910
In the ML2/OVN driver we set a signal handler for SIGTERM to remove
the hash ring nodes upon the service exit [0]. However, while
investigating a customer bug, we identified that if an unrelated
Neutron worker is killed (such as the "periodic worker" in this
case), that process can end up removing the entries from the
ovn_hash_ring table for that hostname.
If this happens on all controllers, the ovn_hash_ring table is
rendered empty and OVSDB events are no longer processed by ML2/OVN.
Proposed solution:
This LP proposes to make this more reliable: instead of removing the
nodes from the ovn_hash_ring table on exit, we would mark them as
offline. That way, if a worker dies, the nodes remain registered in
the table and the heartbeat thread sets them as online again on the
next beat. If the service is properly stopped, the heartbeat won't be
running and the nodes will be seen as offline by the Hash Ring
manager.
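The difference between the two exit behaviours can be illustrated
with a small in-memory model. This is only a sketch, not the Neutron
code: the HashRingTable class, its method names, the TIMEOUT value and
the "ctrl-0" host are all hypothetical, and "offline" is modelled here
simply as a stale heartbeat timestamp.

```python
from dataclasses import dataclass, field

TIMEOUT = 60  # assumed: seconds without a heartbeat before a node is offline


@dataclass
class HashRingTable:
    """In-memory stand-in for the ovn_hash_ring table (hypothetical)."""
    # Maps (hostname, node_name) -> timestamp of the last heartbeat.
    rows: dict = field(default_factory=dict)

    def add_node(self, hostname, node_name, now):
        self.rows[(hostname, node_name)] = now

    def remove_nodes(self, hostname):
        # Behaviour described in the bug: delete every row for this host.
        self.rows = {k: v for k, v in self.rows.items() if k[0] != hostname}

    def mark_offline(self, hostname):
        # Proposed behaviour: keep the rows but make them look stale.
        for key in self.rows:
            if key[0] == hostname:
                self.rows[key] = 0.0

    def heartbeat(self, hostname, now):
        # The heartbeat can only refresh rows that still exist.
        for key in self.rows:
            if key[0] == hostname:
                self.rows[key] = now

    def online_nodes(self, now):
        return sorted(k for k, ts in self.rows.items() if now - ts <= TIMEOUT)


def simulate(on_exit):
    """Kill one worker on a host, then let the heartbeat run once."""
    table = HashRingTable()
    table.add_node("ctrl-0", "api-worker", now=1000.0)
    table.add_node("ctrl-0", "periodic-worker", now=1000.0)
    getattr(table, on_exit)("ctrl-0")      # exit handler runs in the killed worker
    table.heartbeat("ctrl-0", now=1010.0)  # heartbeat of the surviving service
    return table.online_nodes(now=1010.0)


print(simulate("remove_nodes"))  # rows are gone, nothing to refresh
print(simulate("mark_offline"))  # rows survive and come back online
```

In this model, removing the rows leaves the heartbeat with nothing to
refresh, whereas marking them offline lets the next beat restore them.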
As a note, upon the next startup of the service, the nodes matching
the server hostname will be removed from the ovn_hash_ring table and
added again accordingly as Neutron workers are spawned [1].
[0] https://github.com/openstack/neutron/blob/cbb89fdb1414a1b3a8e8b3a9a4154ef627bb9d1a/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py#L295-L296
[1] https://github.com/openstack/neutron/blob/cbb89fdb1414a1b3a8e8b3a9a4154ef627bb9d1a/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py#L316
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2024205/+subscriptions