yahoo-eng-team team mailing list archive
Message #85585
[Bug 1846198] Re: packet loss during active L3 HA agent restart
Fix has been released for Victoria
** Changed in: openstack-ansible
Status: New => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1846198
Title:
packet loss during active L3 HA agent restart
Status in neutron:
Invalid
Status in openstack-ansible:
Fix Released
Bug description:
Deployment:
OpenStack-Ansible 19.0.3 (Stein) with two dedicated network nodes (is_metal=True) + linuxbridge + vxlan.
Ubuntu 16.04.6, kernel 4.15.0-62-generic
neutron l3-agent-list-hosting-router R1
+--------------------------------------+---------------+----------------+-------+----------+
| id                                   | host          | admin_state_up | alive | ha_state |
+--------------------------------------+---------------+----------------+-------+----------+
| 1b3b1b5d-08e7-48a1-ab8d-256d94099fb6 | test-network2 | True           | :-)   | standby  |
| fa402ada-7716-4ad4-a004-7f8114fb1edf | test-network1 | True           | :-)   | active   |
+--------------------------------------+---------------+----------------+-------+----------+
How to reproduce: restart the active L3 agent
(systemctl restart neutron-l3-agent.service).
test-network1 server side events:
systemctl restart neutron-l3-agent: @02:58:56.135635630
ip monitor terminated (kill -9) @02:58:56.208922038
vip ips removed @02:58:56.268074480
keepalived terminated @02:58:57.318596743
l3-agent terminated @02:59:07.504366398
keepalived-state-change terminated @03:01:07.735281710
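To make the ordering above easier to read, the timestamps can be converted to offsets from the moment the restart was issued. This is only a reading aid, a minimal sketch that reuses the values listed above; the helper names `parse` and `offsets` are made up for illustration, and nanoseconds are truncated to microseconds.

```python
from datetime import datetime

# Timestamps copied from the test-network1 event list above.
# datetime only stores microseconds, so the last three digits are dropped.
EVENTS = [
    ("systemctl restart neutron-l3-agent", "02:58:56.135635630"),
    ("ip monitor terminated (kill -9)", "02:58:56.208922038"),
    ("vip ips removed", "02:58:56.268074480"),
    ("keepalived terminated", "02:58:57.318596743"),
    ("l3-agent terminated", "02:59:07.504366398"),
    ("keepalived-state-change terminated", "03:01:07.735281710"),
]


def parse(ts):
    """Parse HH:MM:SS.nnnnnnnnn, keeping only microsecond precision."""
    return datetime.strptime(ts[:15], "%H:%M:%S.%f")


def offsets(events):
    """Return (name, seconds since the first event) for every event."""
    start = parse(events[0][1])
    return [(name, (parse(ts) - start).total_seconds()) for name, ts in events]


for name, off in offsets(EVENTS):
    print(f"+{off:11.6f}s  {name}")
```

Read this way, the VIPs are removed about 0.13 s after the restart is issued, keepalived is gone after about 1.18 s, the l3-agent itself exits after about 11.4 s, and keepalived-state-change only after about 131.6 s.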
test-network1 journal:
@02:58:56 test-network1 systemd[1]: Stopping neutron-l3-agent service...
@02:58:56 test-network1 Keepalived_vrrp[24400]: VRRP_Instance(VR_217) sent 0 priority
@02:58:56 test-network1 Keepalived_vrrp[24400]: VRRP_Instance(VR_217) removing protocol Virtual Routes
@02:58:56 test-network1 Keepalived_vrrp[24400]: VRRP_Instance(VR_217) removing protocol VIPs.
@02:58:56 test-network1 Keepalived_vrrp[24400]: VRRP_Instance(VR_217) removing protocol E-VIPs.
@02:58:56 test-network1 Keepalived[24394]: Stopping
@02:58:56 test-network1 neutron-keepalived-state-change[24278]: 2019-10-01 02:58:56.193 24278 DEBUG neutron.agent.linux.utils [-] enax_custom_log: pid: 24283, signal: 9 kill_process /openstack/venvs/neutron-19.0.4.dev1/lib/python2.7/site-packages/neutron/agent/linux/utils.py:243
@02:58:56 test-network1 audit[24089]: USER_END pid=24089 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:session_close acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
@02:58:56 test-network1 sudo[24089]: pam_unix(sudo:session): session closed for user root
@02:58:56 test-network1 audit[24089]: CRED_DISP pid=24089 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
@02:58:57 test-network1 Keepalived_vrrp[24400]: Stopped
@02:58:57 test-network1 Keepalived[24394]: Stopped Keepalived v1.3.9 (10/21,2017)
TCPDUMP qrouter-24010932-a0a4-4454-9539-27c1535c5ed8 ha-57528491-1b:
@02:58:53.130735 IP 169.254.195.168 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 50, authtype simple, intvl 2s, length 20
@02:58:55.131926 IP 169.254.195.168 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 50, authtype simple, intvl 2s, length 20
@02:58:56.188558 IP 169.254.195.168 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 0, authtype simple, intvl 2s, length 20
@02:58:56.215889 IP 169.254.195.168 > 224.0.0.22: igmp v3 report, 1 group record(s)
@02:58:56.539804 IP 169.254.195.168 > 224.0.0.22: igmp v3 report, 1 group record(s)
@02:58:56.995386 IP 169.254.194.242 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 50, authtype simple, intvl 2s, length 20
@02:58:58.998565 ARP, Request who-has 169.254.0.217 (ff:ff:ff:ff:ff:ff) tell 169.254.0.217, length 28
@02:58:59.000138 ARP, Request who-has 169.254.0.217 (ff:ff:ff:ff:ff:ff) tell 169.254.0.217, length 28
@02:58:59.001063 ARP, Request who-has 169.254.0.217 (ff:ff:ff:ff:ff:ff) tell 169.254.0.217, length 28
@02:58:59.002173 ARP, Request who-has 169.254.0.217 (ff:ff:ff:ff:ff:ff) tell 169.254.0.217, length 28
@02:58:59.003018 ARP, Request who-has 169.254.0.217 (ff:ff:ff:ff:ff:ff) tell 169.254.0.217, length 28
@02:58:59.003860 IP 169.254.194.242 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 50, authtype simple, intvl 2s, length 20
@02:59:01.004772 IP 169.254.194.242 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 50, authtype simple, intvl 2s, length 20
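The capture above can be reduced to two intervals: from the priority 0 advertisement sent by the old master to the first advertisement from the other node, and from the same priority 0 advertisement to the first ARP announcement for 169.254.0.217. The following is a rough sketch of that calculation, not a diagnosis of the root cause. It uses four lines copied from the capture, the function name `failover_gaps` and the regular expressions are illustrative assumptions, and it expects all three events to be present in the input.

```python
import re
from datetime import datetime

# Four lines copied verbatim from the tcpdump output above.
CAPTURE = """\
@02:58:55.131926 IP 169.254.195.168 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 50, authtype simple, intvl 2s, length 20
@02:58:56.188558 IP 169.254.195.168 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 0, authtype simple, intvl 2s, length 20
@02:58:56.995386 IP 169.254.194.242 > 224.0.0.18: VRRPv2, Advertisement, vrid 217, prio 50, authtype simple, intvl 2s, length 20
@02:58:58.998565 ARP, Request who-has 169.254.0.217 (ff:ff:ff:ff:ff:ff) tell 169.254.0.217, length 28
"""

VRRP = re.compile(
    r"@(\S+) IP (\S+) > 224\.0\.0\.18: VRRPv2, Advertisement, vrid (\d+), prio (\d+)"
)
ARP = re.compile(r"@(\S+) ARP, Request who-has (\S+) .* tell (\S+),")


def ts(text):
    return datetime.strptime(text, "%H:%M:%S.%f")


def failover_gaps(capture):
    """Return (seconds to first advertisement from the new master,
    seconds to first ARP announcement), both measured from the
    priority 0 advertisement of the old master."""
    release = takeover = announce = old_master = None
    for line in capture.splitlines():
        m = VRRP.match(line)
        if m:
            when, src, _vrid, prio = m.groups()
            if prio == "0" and release is None:
                release, old_master = ts(when), src
            elif release is not None and src != old_master and takeover is None:
                takeover = ts(when)
            continue
        m = ARP.match(line)
        # "who-has X tell X" means sender and target are the same address.
        if m and release is not None and announce is None and m.group(2) == m.group(3):
            announce = ts(m.group(1))
    return (
        (takeover - release).total_seconds(),
        (announce - release).total_seconds(),
    )


to_takeover, to_announce = failover_gaps(CAPTURE)
print(f"first advertisement from new master: +{to_takeover:.6f}s")
print(f"first ARP announcement:              +{to_announce:.6f}s")
```

For the lines shown, the other node starts advertising about 0.81 s after the priority 0 advertisement, and the first ARP announcement follows about 2.81 s after it.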
After the l3-agent restart:
neutron l3-agent-list-hosting-router R1
+--------------------------------------+---------------+----------------+-------+----------+
| id                                   | host          | admin_state_up | alive | ha_state |
+--------------------------------------+---------------+----------------+-------+----------+
| 1b3b1b5d-08e7-48a1-ab8d-256d94099fb6 | test-network2 | True           | :-)   | active   |
| fa402ada-7716-4ad4-a004-7f8114fb1edf | test-network1 | True           | :-)   | standby  |
+--------------------------------------+---------------+----------------+-------+----------+
Logs and configs in the attachment.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1846198/+subscriptions