yahoo-eng-team team mailing list archive
Message #85743
[Bug 1887148] Re: Network loop between physical networks with DVR
This bug was fixed in the package neutron - 2:12.1.1-0ubuntu4
---------------
neutron (2:12.1.1-0ubuntu4) bionic; urgency=medium
* Fix interrupt of VLAN traffic on reboot of neutron-ovs-agent:
- d/p/0001-ovs-agent-signal-to-plugin-if-tunnel-refresh-needed.patch (LP: #1853613)
- d/p/0002-Do-not-block-connection-between-br-int-and-br-phys-o.patch (LP: #1869808)
- d/p/0003-Ensure-that-stale-flows-are-cleaned-from-phys_bridge.patch (LP: #1864822)
- d/p/0004-DVR-Reconfigure-re-created-physical-bridges-for-dvr-.patch (LP: #1864822)
- d/p/0005-Ensure-drop-flows-on-br-int-at-agent-startup-for-DVR.patch (LP: #1887148)
- d/p/0006-Don-t-check-if-any-bridges-were-recrected-when-OVS-w.patch (LP: #1864822)
- d/p/0007-Not-remove-the-running-router-when-MQ-is-unreachable.patch (LP: #1871850)
-- Edward Hope-Morley <edward.hope-morley@xxxxxxxxxxxxx> Mon, 22 Feb 2021 16:55:40 +0000
** Changed in: neutron (Ubuntu Bionic)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1887148
Title:
Network loop between physical networks with DVR
Status in Ubuntu Cloud Archive:
Invalid
Status in Ubuntu Cloud Archive queens series:
Fix Committed
Status in Ubuntu Cloud Archive rocky series:
Fix Committed
Status in neutron:
Fix Released
Status in neutron package in Ubuntu:
Fix Released
Status in neutron source package in Bionic:
Fix Released
Bug description:
(For SRU template, please see bug 1869808, as the SRU info there
applies to this bug also)
Our CI experienced a network loop due to https://review.opendev.org/#/c/733568/ . The loop occurred with DVR enabled, more than one physical bridge mapping configured, and the neutron server unavailable when the ovs agents were started.
Steps
=====
# add more physical bridges
ovs-vsctl add-br br-physnet1
ip link set dev br-physnet1 up
ovs-vsctl add-br br-physnet2
ip link set dev br-physnet2 up
# set a broadcast going from one bridge
ip address add 1.1.1.1/31 dev br-physnet1
arping -b -I br-physnet1 1.1.1.1
# listen on the other
tcpdump -eni br-physnet2
# Update /etc/neutron/plugins/ml2/ml2_conf.ini
[ml2_type_vlan]
network_vlan_ranges = public,physnet1,physnet2
[ovs]
datapath_type = system
bridge_mappings = public:br-ex,physnet1:br-physnet1,physnet2:br-physnet2
tunnel_bridge = br-tun
local_ip = 127.0.0.1
[agent]
tunnel_types = vxlan
root_helper_daemon = sudo /usr/local/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
root_helper = sudo /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
enable_distributed_routing = True
l2_population = True
# stop server and agent
systemctl stop devstack@q-svc
systemctl stop devstack@q-agt
# clear all flows
for BR in $(sudo ovs-vsctl list-br); do echo "$BR"; sudo ovs-ofctl del-flows "$BR"; done
# start agent
systemctl start devstack@q-agt
$ sudo tcpdump -eni br-physnet2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-physnet2, link-type EN10MB (Ethernet), capture size 262144 bytes
09:46:56.577183 e2:ab:d4:16:46:4d > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 1.1.1.1 (ff:ff:ff:ff:ff:ff) tell 1.1.1.1, length 28
09:46:57.577568 e2:ab:d4:16:46:4d > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 1.1.1.1 (ff:ff:ff:ff:ff:ff) tell 1.1.1.1, length 28
...
If more than one node is running the ovs agent in this state, then
there will be a network loop, and packets can multiply quickly and
overwhelm the network. We saw ~1 million packets/sec.
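To put a rough number on the packet rate while reproducing this, one option is to sample the kernel's receive counter for the bridge interface twice and divide by the interval. This is only a sketch: the helper name packet_rate is hypothetical, it assumes the interface exposes /sys/class/net/<dev>/statistics/rx_packets, and the interval must be a whole number of seconds.

```shell
# Hypothetical helper (not part of neutron or OVS): approximate packets/sec
# from a monotonically increasing counter file.
#   $1: counter file, e.g. /sys/class/net/br-physnet2/statistics/rx_packets
#   $2: sampling interval in whole seconds (defaults to 1)
packet_rate() {
    counter_file=$1
    interval=${2:-1}
    before=$(cat "$counter_file")   # first sample
    sleep "$interval"
    after=$(cat "$counter_file")    # second sample
    # Integer division; fractions of a packet per second are dropped.
    echo $(( (after - before) / interval ))
}
```

For example: packet_rate /sys/class/net/br-physnet2/statistics/rx_packets 5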
I think that, because the neutron server is not available, the get_dvr_mac_address RPC call blocks and the required drop flows are not installed:
https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py#L138
https://github.com/openstack/neutron/blob/5999716cfc4a00ac426e016eabbb51247ba0b190/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py#L230-L234
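A quick way to check whether a node is in this state might be to count the drop flows on br-int after the agent has started. This is a minimal sketch, not taken from the fix: the helper name count_drop_flows is hypothetical, and it only assumes that the missing flows show up with actions=drop in the ovs-ofctl dump-flows output. It does not check which ports or tables they apply to.

```shell
# Hypothetical helper: count flows whose action is a plain drop.
# Reads `ovs-ofctl dump-flows` output on stdin and prints the count.
count_drop_flows() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # `|| true` keeps the helper's exit status at zero in that case.
    grep -c 'actions=drop' || true
}
```

For example: sudo ovs-ofctl dump-flows br-int | count_drop_flows
A count of 0 on br-int after the agent has started would be consistent with the drops not having been installed.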
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1887148/+subscriptions