yahoo-eng-team team mailing list archive
[Bug 1952907] [NEW] Gratuitous ARPs are not sent during master transition
Public bug reported:
* High-level description:
When a router transitions to MASTER state, keepalived should send GARPs, but it fails because the qg-* interface is down (it comes up about 1 second later, so this may be a race condition).
Keepalived should also send another round of GARPs after 60 seconds (garp_master_delay), but it doesn't (probably because the first ones fail, but I'm not 100% sure).
When I add a random port to this router to trigger a keepalived reload,
all GARPs are sent properly (because the netns is already configured and
the qg-* interface is up the whole time).
* Pre-conditions:
Operating System: Ubuntu 20.04
Keepalived version: 2.0.19
Affected neutron releases:
- my AIO env: Xena (master/106fa3e6d3f0b1c32ef28fe9dd6b125b9317e9cf # HEAD as of 29.09.2021)
- my prod env: Victoria
- (most likely all versions after this change https://review.opendev.org/c/openstack/neutron/+/707406)
* Step-by-step reproduction:
Simply perform a failover on an HA router.
The same result can also be achieved by removing all L3 agents from the router and then adding one back:
# openstack router create neutron-bug
# openstack router set --external-gateway public neutron-bug
# neutron l3-agent-list-hosting-router neutron-bug
# (for all l3 agents): neutron l3-agent-router-remove L3_AGENT_ID neutron-bug
# (for a single l3 agent): neutron l3-agent-router-add L3_AGENT_ID neutron-bug
(GARPs are not sent)
# openstack router add port neutron-bug test-port
(GARPs are sent properly)
* Expected output:
Gratuitous ARPs should be sent from the router's namespace during the
MASTER transition.
* Actual output:
Gratuitous ARPs are not sent.
Keepalived complains: Error 100 (Network is down) sending gratuitous ARP on qg-4a2f0239-5c for 172.29.249.194
The qg-* interface comes up about 1 second after keepalived tries to send the GARPs.
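To check whether other routers hit the same failure, the keepalived error line above can be pulled out of log text with sed. This is only a rough sketch: failed_garps is a made-up helper name, not part of keepalived or neutron, and the pattern assumes the message is worded exactly as quoted above (keepalived 2.0.19); other versions may word it differently.

```shell
# Hypothetical helper (not part of keepalived or neutron).
# Reads keepalived log text on stdin and prints "<interface> <address>"
# for every gratuitous ARP that failed with "Network is down".
# Assumes the message wording quoted in this report (keepalived 2.0.19).
failed_garps() {
    sed -n 's/.*Error 100 (Network is down) sending gratuitous ARP on \([^ ]*\) for \([^ ]*\).*/\1 \2/p'
}
```

Feed it log text on stdin, for example: failed_garps < keepalived.log (the file name is a placeholder; where keepalived logs end up depends on the deployment).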
* Attachments:
Keepalived logs: https://paste.openstack.org/raw/811372/
Interfaces inside router's netns + tcpdump from master transition: https://paste.openstack.org/raw/811373/
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1952907
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1952907/+subscriptions