yahoo-eng-team team mailing list archive
Message #95250
[Bug 2095590] [NEW] [OVN] Neutron API stops refreshing the hash ring node
Public bug reported:
The Neutron API, when using WSGI, stops refreshing the hash ring node if
there is no activity. The OVN IDL connections (NB and SB) also
disconnect.
There are plenty of errors in the testing job [1]. For example, from [2]:
* The last command executed before running the tempest tests happens at 08:22:41.
* The tempest tests start at 08:24:23 [3].
* There is an activity gap in the Neutron API logs of around 100 seconds (from [4]):
"""
Jan 23 08:22:41.500324 np0039651784 devstack@neutron-api.service[60810]: [pid: 60810|app: 0|req: 10/40] 2001:4802:7805:104:be76:4eff:fe20:1d8f () {68 vars in 1517 bytes} [Thu Jan 23 08:22:40 2025] POST /networking/v2.0/subnets => generated 665 bytes in 840 msecs (HTTP/1.1 201) 4 headers in 162 bytes (1 switches on core 0)
Jan 23 08:24:23.878996 np0039651784 devstack@neutron-api.service[60811]: DEBUG futurist.periodics [-] Submitting periodic callback 'neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance.HashRingHealthCheckPeriodics.touch_hash_ring_node' {{(pid=60811) _process_scheduled /opt/stack/data/venv/lib/python3.12/site-packages/futurist/periodics.py:638}}
"""
* One of the workers (PID 60810) resumes activity, to update the hash ring, about 2 minutes after the last log line (the hash ring should now be updated every 15 seconds). That leads to:
** An incorrect count of the active nodes: "Hash Ring loaded. 3 active nodes. 0 offline nodes"
** The IDL disconnections [4].
It is worth mentioning that the uWSGI option "enable-threads" is currently enabled.
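To make the size of the gap concrete, the two timestamps in the log excerpt above can be compared against the 15-second touch interval. The following is only a rough sketch using the Python standard library: the helper names (parse_journal_timestamp, activity_gap_seconds, missed_touches) are hypothetical and are not part of Neutron, and the year is taken from the uWSGI request line in the excerpt.

```python
from datetime import datetime

# Timestamps copied from the two log lines quoted in the report.
LAST_REQUEST = "Jan 23 08:22:41.500324"
NEXT_ACTIVITY = "Jan 23 08:24:23.878996"

# Touch interval stated in the report, in seconds.
TOUCH_INTERVAL = 15


def parse_journal_timestamp(text: str, year: int = 2025) -> datetime:
    """Parse a 'Mon DD HH:MM:SS.ffffff' timestamp; the year is passed in
    because the journal prefix does not carry one."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S.%f")


def activity_gap_seconds(start: str, end: str) -> float:
    """Return the elapsed time between two journal timestamps, in seconds."""
    delta = parse_journal_timestamp(end) - parse_journal_timestamp(start)
    return delta.total_seconds()


def missed_touches(gap_seconds: float, interval: int = TOUCH_INTERVAL) -> int:
    """Return how many whole touch intervals fit inside the gap."""
    return int(gap_seconds // interval)


gap = activity_gap_seconds(LAST_REQUEST, NEXT_ACTIVITY)
print(f"gap: {gap:.3f} s, missed touches: {missed_touches(gap)}")
```

With these inputs the gap is a little over 102 seconds, so roughly six consecutive 15-second touches would have been expected in that window and none appear in the excerpt.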
[1] https://review.opendev.org/c/openstack/neutron/+/932601
[2] https://8a3e2af9f348776bb6b6-c0288c15cf27fe5a39c9948ecafb7329.ssl.cf2.rackcdn.com/932601/17/check/neutron-ovn-tempest-ipv6-only-ovs-release-wsgi-4/e3be37a/testr_results.html
[3] https://8a3e2af9f348776bb6b6-c0288c15cf27fe5a39c9948ecafb7329.ssl.cf2.rackcdn.com/932601/17/check/neutron-ovn-tempest-ipv6-only-ovs-release-wsgi-4/e3be37a/controller/logs/tempest_log.txt
[4] https://paste.opendev.org/show/buu2O3Jt3AGcfTfsWrzf/
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2095590
Title:
[OVN] Neutron API stops refreshing the hash ring node
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2095590/+subscriptions