yahoo-eng-team team mailing list archive
Message #92544
[Bug 2025129] [NEW] DvrLocalRouter init references namespace before it is created
Public bug reported:
Description
-----------
When the DvrLocalRouter object is instantiated, it calls the
_load_used_fip_information() function. In some cases this function will
try to add ip rules in a specific network namespace; however, that
namespace may not exist at the time. This results in
neutron.privileged.agent.linux.ip_lib.NetworkNamespaceNotFound being
raised.
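The ordering problem can be illustrated with a small self-contained model. This is only a sketch and not Neutron code: FakeNamespaces, ToyDvrLocalRouter, PLACEHOLDER_PRIORITY and the local NetworkNamespaceNotFound class are hypothetical stand-ins, and only the order of operations is meant to mirror the description above.

```python
class NetworkNamespaceNotFound(RuntimeError):
    """Local stand-in for the Neutron exception of the same name."""


class FakeNamespaces:
    """Hypothetical in-memory registry standing in for the host's namespaces."""

    def __init__(self):
        self.rules = {}  # namespace name -> list of (ip, priority)

    def create(self, name):
        self.rules.setdefault(name, [])

    def add_rule(self, name, ip, priority):
        if name not in self.rules:
            raise NetworkNamespaceNotFound(name)
        self.rules[name].append((ip, priority))


PLACEHOLDER_PRIORITY = 100  # arbitrary value, for illustration only


class ToyDvrLocalRouter:
    """Toy model: __init__ touches the namespace before anything creates it."""

    def __init__(self, router_id, fips, known_priorities, namespaces):
        self.ns_name = "qrouter-" + router_id
        self.namespaces = namespaces
        self._load_used_fip_information(fips, known_priorities)

    def _load_used_fip_information(self, fips, known_priorities):
        for ip in fips:
            if ip in known_priorities:
                continue  # entry present in the state file: nothing to add
            # Entry missing: a rule is added immediately, which only works
            # if the namespace already exists.
            self.namespaces.add_rule(self.ns_name, ip, PLACEHOLDER_PRIORITY)
```

In this model, constructing ToyDvrLocalRouter with a FIP that has no known priority raises NetworkNamespaceNotFound unless create() was called for the namespace first, while a FIP with a known priority never touches the namespace at all.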
Pre-conditions
--------------
- DVR is in use and the created router is distributed and HA
- The state file 'fip-priorities' is missing some entries, which results in https://opendev.org/openstack/neutron/src/commit/0c5d4b872899497437d1399c845be756103a46d3/neutron/agent/l3/dvr_local_router.py#L76 being skipped
- The qrouter network namespace does not exist (possibly due to a reboot of the host or something similar)
Step-by-step reproduction steps
-------------------------------
- Set up OpenStack with DVR enabled
- Create an HA router with an external subnet attached so we can use the IPs as FIPs
- Create a VM with a FIP attached from the aforementioned router
- SSH to the host running the aforementioned VM and:
  - Delete the qrouter namespace associated with this router
  - Remove the entry for the FIP from the fip-priorities state file in the Neutron state directory
  - Restart the Neutron L3 agent
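Before restarting the agent, it may help to confirm that both pre-conditions actually hold on the host. The sketch below assumes the state file stores one "<floating-ip>,<priority>" pair per line and that the namespace is named "qrouter-<router-id>". Both are assumptions about the deployment, and all helper names are hypothetical.

```python
def parse_netns_list(output):
    """Return namespace names from `ip netns list` output.

    Lines may look like "qrouter-abc (id: 3)"; only the first token is kept.
    """
    return {line.split()[0] for line in output.splitlines() if line.strip()}


def missing_fip_entries(state_text, fips, delimiter=","):
    """Return the FIPs that have no entry in the state file text."""
    recorded = set()
    for line in state_text.splitlines():
        if delimiter in line:
            recorded.add(line.split(delimiter, 1)[0].strip())
    return [ip for ip in fips if ip not in recorded]


def bug_preconditions_met(netns_output, state_text, router_id, fips):
    """True when the namespace is absent and at least one FIP entry is missing."""
    ns_missing = ("qrouter-" + router_id) not in parse_netns_list(netns_output)
    return ns_missing and bool(missing_fip_entries(state_text, fips))
```

Here netns_output would be the text printed by `ip netns list` on the host, and state_text the contents of the fip-priorities file; how those are collected depends on the deployment (for example, inside the L3 agent container under Kolla Ansible).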
Expected output
---------------
Neutron L3 agent should restart without any errors.
Actual output
-------------
Neutron L3 agent raises a NetworkNamespaceNotFound exception for each
missing FIP in the fip-priorities state file, fails to set up the router
and then retries. Note that if there are more than 5 missing FIP entries
in the fip-priorities file then the router setup fails completely as it
hits the retry limit specified in
https://opendev.org/openstack/neutron/src/commit/0c5d4b872899497437d1399c845be756103a46d3/neutron/agent/l3/agent.py#L730-L733.
This leaves the router completely broken and not set up on the node,
resulting in broken networking for all VMs using that router on that
particular host.
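The retry arithmetic can be sketched as a toy simulation. It assumes that each failed attempt still records a priority for the FIP it tripped on, so the next attempt gets one entry further, and it assumes a limit of 5 retries with the boundary chosen to match the behaviour described above. simulate_router_setup is a hypothetical name and none of this is the agent's actual code.

```python
def simulate_router_setup(missing_entries, retry_limit=5):
    """Return (succeeded, failures) for a toy model of the retry loop.

    Each attempt fails on the first FIP without a state-file entry, but is
    assumed to persist that entry, so the next attempt progresses by one.
    """
    failures = 0
    remaining = missing_entries
    while remaining > 0:
        remaining -= 1   # the entry that caused this failure is now recorded
        failures += 1
        if failures > retry_limit:
            return False, failures  # retry budget exhausted, router left broken
    return True, failures
```

Under these assumptions, up to 5 missing entries are eventually worked through (at the cost of one exception each), while 6 or more exhaust the retries before the router is set up.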
Version
-------
- OpenStack version - master/zed
- Linux distro - AlmaLinux9
- Deployed via Kolla Ansible
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2025129