[Bug 2093347] [NEW] [ovn-octavia-provider] first request is stuck on OVN txn
Public bug reported:
On a fresh environment, the first action on the ovn-provider gets stuck for
180s on the first transaction against the OVN NB DB.

After a deeper analysis of the threads running there, we saw that the GC is
invoked on the driver class, which then calls shutdown() on the helper, doing
a join() on the daemon thread responsible for managing the requests over the
helper. This way we have a deadlock: any further transaction on the OVN DB
done by ovsdbapp is performed under a lock, and the join() is also waiting on
that lock, leaving the thread hung for 180s (the ovsdbapp timeout).
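For illustration, below is a minimal, self-contained sketch of that pattern
(all class, queue and thread names here are hypothetical stand-ins, not the
real ovn-octavia-provider or ovsdbapp code): a __del__ that join()s the
request-handler thread while that thread is waiting on a result that only the
thread running the finalizer can deliver. Run under CPython, both threads stay
blocked, just like in the trace that follows.

import queue
import threading
import time


class Helper:
    """Stand-in for the provider helper: a request_handler thread that hands
    'transactions' to a connection thread and blocks until the result is back."""

    def __init__(self):
        self.requests = queue.Queue()   # API requests (e.g. lb_create)
        self.txns = queue.Queue()       # commits handed to the connection thread
        self.results = queue.Queue()    # commit results coming back
        self.worker = threading.Thread(
            target=self._request_handler, name="request_handler", daemon=True)
        self.worker.start()

    def _request_handler(self):
        while True:
            req = self.requests.get()
            if req == "exit":
                return
            self.txns.put(req)
            # like lb_create -> execute -> commit(): wait for the connection
            # thread to process the transaction
            self.results.get()

    def shutdown(self):
        self.requests.put("exit")
        self.worker.join()              # the join() done by helper shutdown


class Driver:
    """Stand-in for the provider driver whose __del__ shuts the helper down."""

    def __init__(self, helper):
        self.helper = helper

    def __del__(self):
        self.helper.shutdown()


def connection_run(helper, holder):
    # Plays the role of the ovsdbapp connection thread: it picks up the commit,
    # but mid-commit the last reference to the driver is dropped on *this*
    # thread (simulating the GC run seen in the trace).
    helper.txns.get()
    del holder[0]                       # __del__ -> shutdown() -> join(request_handler)
    helper.results.put("done")          # never reached: request_handler is still
                                        # waiting on results, so the join() never returns


helper = Helper()
holder = [Driver(helper)]
conn = threading.Thread(target=connection_run, args=(helper, holder),
                        name="connection", daemon=True)
conn.start()

helper.requests.put("lb_create")
time.sleep(2)
print("connection thread alive:", conn.is_alive())               # True: stuck in join()
print("request_handler thread alive:", helper.worker.is_alive()) # True: stuck in get()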
Inspecting the threads during the stuck period shows this behaviour:
Process 2249966: /usr/bin/uwsgi --ini /etc/octavia/octavia-uwsgi.ini --venv /opt/stack/data/venv
Python v3.12.3 (/usr/bin/uwsgi-core)
Thread 2062601 (active): "uWSGIWorker1Core0"
Thread 2250013 (idle): "Thread-2 (run)"
_wait_for_tstate_lock (threading.py:1167)
join (threading.py:1147)
shutdown (ovn_octavia_provider/helper.py:112)
__del__ (ovn_octavia_provider/driver.py:51)
__subclasscheck__ (<frozen abc>:123)
__subclasscheck__ (<frozen abc>:123)
__subclasscheck__ (<frozen abc>:123)
__subclasscheck__ (<frozen abc>:123)
__instancecheck__ (<frozen abc>:119)
db_replace_record (ovsdbapp/backend/ovs_idl/idlutils.py:452)
set_column (ovsdbapp/backend/ovs_idl/command.py:62)
set_columns (ovsdbapp/backend/ovs_idl/command.py:67)
run_idl (ovsdbapp/backend/ovs_idl/command.py:115)
do_commit (ovsdbapp/backend/ovs_idl/transaction.py:92)
run (ovsdbapp/backend/ovs_idl/connection.py:118)
run (threading.py:1010)
_bootstrap_inner (threading.py:1073)
_bootstrap (threading.py:1030)
Thread 2250332 (idle): "Thread-3 (request_handler)"
wait (threading.py:359)
get (queue.py:180)
commit (ovsdbapp/backend/ovs_idl/transaction.py:54)
__exit__ (ovsdbapp/api.py:71)
transaction (ovsdbapp/api.py:114)
__exit__ (contextlib.py:144)
transaction (impl_idl_ovn.py:180)
__exit__ (contextlib.py:144)
execute (ovsdbapp/backend/ovs_idl/command.py:49)
lb_create (ovn_octavia_provider/helper.py:1146)
request_handler (ovn_octavia_provider/helper.py:401)
run (threading.py:1010)
_bootstrap_inner (threading.py:1073)
_bootstrap (threading.py:1030)
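(For reference, the per-thread dump above looks like the output of a sampling
profiler; something along the lines of "py-spy dump --pid 2249966" against the
uwsgi worker produces this kind of listing.)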
** Affects: neutron
Importance: Undecided
Assignee: Fernando Royo (froyoredhat)
Status: In Progress
** Tags: ovn-octavia-provider
** Changed in: neutron
Assignee: (unassigned) => Fernando Royo (froyoredhat)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2093347
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2093347/+subscriptions