yahoo-eng-team team mailing list archive
Message #83346
[Bug 1887148] Re: Network loop between physical networks with DVR
Reviewed: https://review.opendev.org/740724
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c1a77ef8b74bb9b5abbc5cb03fb3201383122eb8
Submitter: Zuul
Branch: master
commit c1a77ef8b74bb9b5abbc5cb03fb3201383122eb8
Author: Darragh O'Reilly <doreilly@xxxxxxxx>
Date: Mon Jul 13 14:48:10 2020 +0000
Ensure drop flows on br-int at agent startup for DVR too
Commit 90212b12 changed the OVS agent so adding vital drop flows on
br-int (table 0 priority 2) for packets from physical bridges was
deferred until DVR initialization later on. But if br-int has no flows
from a previous run (e.g. after a host reboot), then these packets will hit
the NORMAL flow in table 60. And if there is more than one physical
bridge, then the physical interfaces from the different bridges are now
essentially connected at layer 2 and a network loop is possible in the
time before the flows are added by DVR. Also the DVR code won't add them
until after RPC calls to the server, so a loop is more likely if the
server is not available.
This patch restores adding these flows at the point where the physical
bridges are first configured. It also updates a comment that was no
longer correct and updates the unit test.
Change-Id: I42c33fefaae6a7bee134779c840f35632823472e
Closes-Bug: #1887148
Related-Bug: #1869808
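For illustration only, the kind of flow the commit message describes (table 0, priority 2, drop on a br-int port facing a physical bridge) could be written by hand roughly as below. This is a sketch, not the agent's code: the helper name drop_flow_spec is hypothetical, and the port name int-br-physnet1 in the usage comment is an assumption about how the agent names its patch ports.

```shell
# Build an ovs-ofctl flow spec for a low-priority drop on one br-int port.
# drop_flow_spec is a hypothetical helper, not part of neutron or OVS.
drop_flow_spec() {
    # $1: OpenFlow port number of the br-int port facing a physical bridge
    printf 'table=0,priority=2,in_port=%s,actions=drop\n' "$1"
}

# Possible usage on a host with Open vSwitch (needs root; the port name
# int-br-physnet1 is an assumption, check "ovs-vsctl list-ports br-int"):
#   ofport=$(sudo ovs-vsctl get Interface int-br-physnet1 ofport)
#   sudo ovs-ofctl add-flow br-int "$(drop_flow_spec "$ofport")"
```

With such a flow in place, traffic arriving from that physical bridge is dropped unless a higher-priority flow matches it, so it cannot fall through to the NORMAL action.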
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1887148
Title:
Network loop between physical networks with DVR
Status in neutron:
Fix Released
Bug description:
Our CI experienced a network loop due to
https://review.opendev.org/#/c/733568/ . DVR was enabled, there was
more than one physical bridge mapping, and the neutron server was not
available when the OVS agents were started.
Steps
=====
# add more physical bridges
ovs-vsctl add-br br-physnet1
ip link set dev br-physnet1 up
ovs-vsctl add-br br-physnet2
ip link set dev br-physnet2 up
# set a broadcast going from one bridge
ip address add 1.1.1.1/31 dev br-physnet1
arping -b -I br-physnet1 1.1.1.1
# listen on the other
tcpdump -eni br-physnet2
# Update /etc/neutron/plugins/ml2/ml2_conf.ini
[ml2_type_vlan]
network_vlan_ranges = public,physnet1,physnet2
[ovs]
datapath_type = system
bridge_mappings = public:br-ex,physnet1:br-physnet1,physnet2:br-physnet2
tunnel_bridge = br-tun
local_ip = 127.0.0.1
[agent]
tunnel_types = vxlan
root_helper_daemon = sudo /usr/local/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
root_helper = sudo /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
enable_distributed_routing = True
l2_population = True
# stop server and agent
systemctl stop devstack@q-svc
systemctl stop devstack@q-agt
# clear all flows
for BR in $(sudo ovs-vsctl list-br); do echo "$BR"; sudo ovs-ofctl del-flows "$BR"; done
# start agent
systemctl start devstack@q-agt
$ sudo tcpdump -eni br-physnet2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-physnet2, link-type EN10MB (Ethernet), capture size 262144 bytes
09:46:56.577183 e2:ab:d4:16:46:4d > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 1.1.1.1 (ff:ff:ff:ff:ff:ff) tell 1.1.1.1, length 28
09:46:57.577568 e2:ab:d4:16:46:4d > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 1.1.1.1 (ff:ff:ff:ff:ff:ff) tell 1.1.1.1, length 28
...
If there is more than one node running the ovs agent in this state,
then there will be a network loop and packets can multiply quickly and
overwhelm the network. We saw ~1 million packets/sec.
I think that, because the neutron server is not available, the get_dvr_mac_address RPC is blocked and the required drop flows are not installed:
https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py#L138
https://github.com/openstack/neutron/blob/5999716cfc4a00ac426e016eabbb51247ba0b190/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py#L230-L234
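One way to check whether the drops are present after the agent starts is to scan the br-int flow dump for a priority-2 drop keyed on an input port. The following is a minimal sketch under assumptions: has_phys_drop_flow is a hypothetical helper, and the pattern assumes the usual ovs-ofctl dump-flows layout of comma-separated match fields followed by a space and "actions=".

```shell
# Succeeds (exit 0) if the flow dump on stdin contains a priority-2 drop
# flow that matches on in_port. has_phys_drop_flow is a hypothetical helper.
has_phys_drop_flow() {
    grep -Eq 'priority=2,in_port=[^ ,]+ actions=drop'
}

# Possible usage on the affected host (needs root and Open vSwitch):
#   if sudo ovs-ofctl dump-flows br-int table=0 | has_phys_drop_flow; then
#       echo "drop flows present"
#   else
#       echo "drop flows MISSING"
#   fi
```

This only confirms that at least one such flow exists. It does not verify that every port facing a physical bridge is covered.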