yahoo-eng-team team mailing list archive
Message #89811
[Bug 1991655] Re: Hash Ring nodes considered dead because of delayed probing
Reviewed: https://review.opendev.org/c/openstack/neutron/+/860233
Committed: https://opendev.org/openstack/neutron/commit/240f2c6aebb5a958e3cdea9b9188e7f605238494
Submitter: "Zuul (22348)"
Branch: master
commit 240f2c6aebb5a958e3cdea9b9188e7f605238494
Author: Lucas Alvares Gomes <lucasagomes@xxxxxxxxx>
Date: Tue Oct 4 10:27:04 2022 +0100
Split Hash Ring probing from the maintenance task
This patch splits the Hash Ring probing out of the maintenance task
into its own thread. The idea is to speed up the start of probing by
doing it right after adding a node to the Hash Ring.
By doing that, we avoid the case where the connection with OVSDB takes
longer than expected, probing is delayed, and the hash ring nodes are
considered dead because they weren't probed in time.
The patch re-uses the same classes as before to start this new thread
(instead of reusing the maintenance task thread). It adds a layer of
synchronization with a lock to make sure that only one new Hash Ring
probing thread is started.
Closes-Bug: #1991655
Change-Id: Ic04493f20eb9aecda563942c51f343dc4202523a
Signed-off-by: Lucas Alvares Gomes <lucasagomes@xxxxxxxxx>
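The lock-guarded start described in the commit message can be sketched
roughly as follows. This is only an illustration of the pattern, not the
code from the patch: the class name HashRingProber, the start_once()
method, and the touch_node callback are all hypothetical.

```python
import threading
import time


class HashRingProber:
    """Hypothetical helper; not the actual neutron class."""

    def __init__(self, touch_node, interval=1.0):
        self._touch_node = touch_node      # callback that marks the node as alive
        self._interval = interval          # seconds between probes (assumed value)
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def start_once(self):
        """Start the probing thread; only the first caller actually starts it."""
        with self._lock:
            if self._thread is not None:
                return False
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()
            return True

    def _run(self):
        # Probe immediately, then again every `interval` seconds until stopped.
        while not self._stop.is_set():
            self._touch_node()
            self._stop.wait(self._interval)

    def stop(self):
        self._stop.set()
        with self._lock:
            thread = self._thread
        if thread is not None:
            thread.join()
```

Calling start_once() right after the node is added to the ring means the
first probe no longer waits for the OVSDB connection, and repeated calls
are harmless because the lock lets only one of them create the thread.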
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1991655
Title:
Hash Ring nodes considered dead because of delayed probing
Status in neutron:
Fix Released
Bug description:
Right now, probing the hash ring nodes happens as part of the
maintenance task thread but this thread is only started after we
establish a connection with the OVSDB servers in the
post_fork_initialize() method for ML2/OVN.
If this connection with OVSDB takes longer than expected, it's
possible that the nodes in the hash ring (that have to be added prior
to this connection) will time out because the maintenance task thread
has not yet been started.
Ideally, we would need to separate the probing into its own periodic
thread that is started before the connections with the OVSDBs to avoid
this problem.
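As a rough illustration of the failure mode described above, the sketch
below treats a node as alive only if it was probed within a timeout. The
function names, the dictionary layout, and the 60-second timeout are
assumptions made for the example, not taken from the neutron code.

```python
import time


def is_node_alive(last_probed_at, timeout, now=None):
    """Return True if the node was probed within `timeout` seconds."""
    now = time.time() if now is None else now
    return (now - last_probed_at) <= timeout


def live_nodes(nodes, timeout, now=None):
    """Filter a {node_name: last_probed_at} mapping down to live nodes."""
    return [name for name, ts in nodes.items()
            if is_node_alive(ts, timeout, now)]


# Node added at t=0; the first probe only happens once OVSDB connects.
ring = {"worker-1": 0.0}
TIMEOUT = 60.0  # assumed value for illustration

# OVSDB connects quickly (10s): the node is still within the timeout.
print(live_nodes(ring, TIMEOUT, now=10.0))

# OVSDB takes 90s to connect and nothing has probed the node meanwhile:
# the node is considered dead even though the worker is running.
print(live_nodes(ring, TIMEOUT, now=90.0))
```

Starting the probing thread before the OVSDB connection keeps refreshing
the timestamp, so the second case no longer arises.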
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1991655/+subscriptions