yahoo-eng-team team mailing list archive
Message #87340
[Bug 1946187] [NEW] HA routers not going to be "primary" at all
Public bug reported:
From time to time in the CI, many tests fail because an HA router
stays in the backup state the whole time and never transitions to
primary on the node.
Examples of the failure:
https://3142cc95d58eb8a4ee07-043369ac575bbfe29758366f4ba498a1.ssl.cf1.rackcdn.com/765072/8/check/neutron-tempest-plugin-scenario-openvswitch/499b47d/controller/logs/screen-q-l3.txt
https://6599da62140c9583e14a-cd7f53ffbb0b86c69deae453da021fe8.ssl.cf5.rackcdn.com/811746/4/check/neutron-tempest-plugin-scenario-openvswitch/3cafcd7/testr_results.html
https://zuul.opendev.org/t/openstack/build/75c056464b6f445ebde18c1b07f5bcce
Example of stacktrace:
Traceback (most recent call last):
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/common/utils.py", line 80, in wait_until_true
    eventlet.sleep(sleep)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/eventlet/greenthread.py", line 36, in sleep
    hub.switch()
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch
    return self.greenlet.switch()
eventlet.timeout.Timeout: 600 seconds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/test_basic.py", line 35, in test_basic_instance
    self.setup_network_and_server()
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 281, in setup_network_and_server
    router = self.create_router_by_client(**kwargs)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 209, in create_router_by_client
    cls._wait_for_router_ha_active(router['id'])
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 228, in _wait_for_router_ha_active
    utils.wait_until_true(_router_active_on_l3_agent,
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/common/utils.py", line 84, in wait_until_true
    raise exception
tempest.lib.exceptions.TimeoutException: Request timed out
Details: Router 1c4ce297-5a04-4794-9720-20fdec9ca4e5 is not active on any of the L3 agents
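
For readers not familiar with the helper named in the traceback, here is a
minimal sketch of the polling pattern it suggests: call a predicate
repeatedly until it returns True, and raise the supplied exception once the
timeout expires. This is not the neutron_tempest_plugin code. It uses
time.sleep instead of eventlet, the TimeoutException class is a local
stand-in, and the "ha_state" field on each agent record is an assumption
about what the L3 agent listing returns.

```python
import time


class TimeoutException(Exception):
    """Local stand-in for tempest.lib.exceptions.TimeoutException."""


def wait_until_true(predicate, timeout=60, sleep=1, exception=None):
    """Poll predicate() until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() >= deadline:
            if exception is None:
                exception = TimeoutException("Request timed out")
            raise exception
        time.sleep(sleep)


def router_active_on_any_agent(agents):
    """Return True if at least one agent reports the router as active.

    Assumption: each agent is a dict with an "ha_state" key whose value
    is "active" on the node where the router is primary.
    """
    return any(agent.get("ha_state") == "active" for agent in agents)


if __name__ == "__main__":
    # Hypothetical data mirroring the failure: every agent stays in standby.
    agents = [{"host": "controller", "ha_state": "standby"}]
    error = TimeoutException(
        "Router is not active on any of the L3 agents")
    try:
        wait_until_true(lambda: router_active_on_any_agent(agents),
                        timeout=1, sleep=0.2, exception=error)
    except TimeoutException as exc:
        print("Timed out:", exc)
```

With a router that never leaves the backup state, the predicate never
returns True, so the wait runs for the full timeout (600 seconds in the
traceback above) before the test fails.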
** Affects: neutron
Importance: High
Status: Confirmed
** Tags: gate-failure l3-ha
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1946187
Title:
HA routers not going to be "primary" at all
Status in neutron:
Confirmed