yahoo-eng-team team mailing list archive
Message #79391
[Bug 1732067] Re: openvswitch firewall flows cause flooding on integration bridge
Since this report concerns a possible security risk, an incomplete
security advisory task has been added while the core security reviewers
for the affected project or projects confirm the bug and discuss the
scope of any vulnerability along with potential solutions.
(The duplicate bug 1813439 was previously being tracked as a potential
vulnerability report.)
** Also affects: ossa
Importance: Undecided
Status: New
** Changed in: ossa
Status: New => Incomplete
** Information type changed from Public to Public Security
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1732067
Title:
openvswitch firewall flows cause flooding on integration bridge
Status in neutron:
In Progress
Status in OpenStack Security Advisory:
Incomplete
Bug description:
Environment: OpenStack Newton
Driver: ML2 w/ OVS
Firewall: openvswitch
In this environment, we have observed OVS flooding network traffic
across all ports in a given VLAN on the integration bridge due to the
lack of an FDB entry for the destination MAC address. Across our
fleet of 240+ nodes, this causes a considerable amount of noise on
any given node.
In this test, we have 3 machines:
Client: fa:16:3e:e8:59:00 (10.10.60.2)
Server: fa:16:3e:80:cb:0a (10.10.60.9)
Bystander: fa:16:3e:a0:ee:02 (10.10.60.10)
The server is running a web server using netcat:
while true ; do sudo nc -l -p 80 < index.html ; done
The client requests the page using curl:
ip netns exec qdhcp-b07e6cb3-0943-45a2-b5ff-efb7e99e4d3d curl http://10.10.60.9/
We would expect the communication to be limited to the client and
server. However, the captures below show the server->client
responses being flooded out of all tap interfaces connected to br-int
in the same local VLAN:
root@osa-newton-ovs-compute01:~# tcpdump -i tap5f03424d-1c -ne port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap5f03424d-1c, link-type EN10MB (Ethernet), capture size 262144 bytes
02:20:30.190675 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 74: 10.10.60.2.54796 > 10.10.60.9.80: Flags [S], seq 213484442, win 29200, options [mss 1460,sackOK,TS val 140883559 ecr 0,nop,wscale 7], length 0
02:20:30.191926 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 74: 10.10.60.9.80 > 10.10.60.2.54796: Flags [S.], seq 90006557, ack 213484443, win 14480, options [mss 1460,sackOK,TS val 95716 ecr 140883559,nop,wscale 4], length 0
02:20:30.192837 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.54796 > 10.10.60.9.80: Flags [.], ack 1, win 229, options [nop,nop,TS val 140883560 ecr 95716], length 0
02:20:30.192986 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 140: 10.10.60.2.54796 > 10.10.60.9.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 140883560 ecr 95716], length 74: HTTP: GET / HTTP/1.1
02:20:30.195806 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 79: 10.10.60.9.80 > 10.10.60.2.54796: Flags [P.], seq 1:14, ack 1, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 13: HTTP
02:20:30.196207 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.54796 > 10.10.60.9.80: Flags [.], ack 14, win 229, options [nop,nop,TS val 140883561 ecr 95717], length 0
02:20:30.197481 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 66: 10.10.60.9.80 > 10.10.60.2.54796: Flags [.], ack 75, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 0
^^^ On the server tap we see the bi-directional traffic
root@osa-newton-ovs-compute01:/home/ubuntu# tcpdump -i tapb8051da9-60 -ne port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapb8051da9-60, link-type EN10MB (Ethernet), capture size 262144 bytes
02:20:30.192165 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 74: 10.10.60.9.80 > 10.10.60.2.54796: Flags [S.], seq 90006557, ack 213484443, win 14480, options [mss 1460,sackOK,TS val 95716 ecr 140883559,nop,wscale 4], length 0
02:20:30.195827 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 79: 10.10.60.9.80 > 10.10.60.2.54796: Flags [P.], seq 1:14, ack 1, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 13: HTTP
02:20:30.197500 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 66: 10.10.60.9.80 > 10.10.60.2.54796: Flags [.], ack 75, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 0
^^^ On the bystander tap we see the flooded traffic
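A quick way to confirm flooding in captures like the two above is to look
for unicast frames on a tap that are neither sent by nor addressed to the
instance behind that tap. The following is a minimal sketch; the helper
name foreign_frames is hypothetical and not part of this report, and it
only understands the `tcpdump -ne` line layout shown here.

```python
import re

# Matches the "<src MAC> > <dst MAC>," prefix printed by `tcpdump -e`.
_MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
_FRAME_RE = re.compile(rf"({_MAC}) > ({_MAC}),")


def foreign_frames(capture_lines, local_mac):
    """Return (src, dst) pairs for unicast frames that do not involve local_mac."""
    local_mac = local_mac.lower()
    foreign = []
    for line in capture_lines:
        match = _FRAME_RE.search(line.lower())
        if match is None:
            continue  # banner lines such as "listening on ..."
        src, dst = match.groups()
        # Broadcast/multicast frames are legitimately seen on every port.
        if int(dst[:2], 16) & 1:
            continue
        if local_mac not in (src, dst):
            foreign.append((src, dst))
    return foreign
```

Fed the bystander capture and the bystander MAC (fa:16:3e:a0:ee:02), every
server->client frame is reported; fed the server capture and the server MAC,
nothing is.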
The FDB tables show that there is no CAM entry for the client on the
br-int bridge. I would expect to see the MAC address on the patch uplink:
root@osa-newton-ovs-compute01:/home/ubuntu# ovs-appctl fdb/show br-int | grep 'fa:16:3e:e8:59:00'
root@osa-newton-ovs-compute01:/home/ubuntu# ovs-appctl fdb/show br-provider | grep 'fa:16:3e:e8:59:00'
2 850 fa:16:3e:e8:59:00 3
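The same check can be scripted across many nodes. The sketch below assumes
the four-column layout (port, VLAN, MAC, age) that `ovs-appctl fdb/show`
prints above; the function name fdb_ports is hypothetical, and collecting
the command output from each node is left out.

```python
def fdb_ports(fdb_show_output, mac):
    """Return the ports on which `mac` appears in `ovs-appctl fdb/show` output."""
    mac = mac.lower()
    ports = []
    for line in fdb_show_output.splitlines():
        fields = line.split()
        # Data rows have four columns: port, VLAN, MAC, age.
        # The header row also has four, but its third field is "MAC".
        if len(fields) == 4 and fields[2].lower() == mac:
            ports.append(fields[0])
    return ports
```

An empty list for br-int together with a non-empty list for br-provider
corresponds to the symptom shown above.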
According to [1], an 'output' action bypasses the MAC learning that OVS
performs as part of the NORMAL action. The related table 82 entries are
below, and the code is here [2]:
cookie=0x94ebb7913c37a0ec, duration=415.490s, table=82, n_packets=5, n_bytes=424, idle_age=31, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=strip_vlan,output:13
cookie=0x94ebb7913c37a0ec, duration=415.489s, table=82, n_packets=354, n_bytes=35229, idle_age=154, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=strip_vlan,output:13
cookie=0x94ebb7913c37a0ec, duration=415.489s, table=82, n_packets=1, n_bytes=78, idle_age=154, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=ct(commit,zone=NXM_NX_REG6[0..15]),strip_vlan,output:13
cookie=0x94ebb7913c37a0ec, duration=415.489s, table=82, n_packets=1, n_bytes=78, idle_age=415, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=ct(commit,zone=NXM_NX_REG6[0..15]),strip_vlan,output:13
cookie=0x94ebb7913c37a0ec, duration=415.491s, table=82, n_packets=120, n_bytes=7920, idle_age=305, priority=50,ct_state=+est-rel+rpl,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
cookie=0x94ebb7913c37a0ec, duration=415.491s, table=82, n_packets=0, n_bytes=0, idle_age=415, priority=50,ct_state=-new-est+rel-inv,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
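To find such entries in a larger flow dump, one can filter for flows whose
actions send to a fixed port without going through NORMAL. This is a
minimal sketch that works on flow dump lines in the format shown above; the
function name flows_bypassing_learning is hypothetical, and the rule "output
without NORMAL means no learning" is the working assumption taken from [1].

```python
import re

_OUTPUT_RE = re.compile(r"\boutput:\w+")


def flows_bypassing_learning(dump_lines):
    """Return flow lines that output to a port without using NORMAL."""
    flagged = []
    for line in dump_lines:
        # Everything after " actions=" is the action list of the flow.
        _, sep, actions = line.partition(" actions=")
        if not sep:
            continue  # not a flow entry
        if _OUTPUT_RE.search(actions) and "NORMAL" not in actions:
            flagged.append(line)
    return flagged
```

All six table 82 entries above would be flagged by this filter.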
My testing shows that modifying the flow rules to replace the
'strip_vlan,output:13' actions with 'mod_vlan_vid:4,NORMAL' (for the
sake of getting it working) results in the expected behavior. Only the
priority=70 rules were changed:
cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=2110, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=mod_vlan_vid:4,NORMAL
cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=518, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=mod_vlan_vid:4,NORMAL
cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=392, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=ct(commit,zone=NXM_NX_REG6[0..15]),mod_vlan_vid:4,NORMAL
cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=185, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=ct(commit,zone=NXM_NX_REG6[0..15]),mod_vlan_vid:4,NORMAL
cookie=0x85cd1a977dd54be0, duration=0.361s, table=82, n_packets=0, n_bytes=0, idle_age=5263, priority=50,ct_state=+est-rel+rpl,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
cookie=0x85cd1a977dd54be0, duration=0.361s, table=82, n_packets=0, n_bytes=0, idle_age=5373, priority=50,ct_state=-new-est+rel-inv,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
The client's MAC address now shows up in the br-int FDB:
root@osa-newton-ovs-compute01:/home/ubuntu# ovs-appctl fdb/show br-int | grep 'fa:16:3e:e8:59:00'
1 4 fa:16:3e:e8:59:00 2
The test below shows that traffic is now seen only on the server tap
and not on the bystander tap:
root@osa-newton-ovs-compute01:/home/ubuntu# tcpdump -i tap5f03424d-1c -ne port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap5f03424d-1c, link-type EN10MB (Ethernet), capture size 262144 bytes
03:46:52.606940 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 74: 10.10.60.2.55808 > 10.10.60.9.80: Flags [S], seq 3645914146, win 29200, options [mss 1460,sackOK,TS val 142179163 ecr 0,nop,wscale 7], length 0
03:46:52.608880 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 74: 10.10.60.9.80 > 10.10.60.2.55808: Flags [S.], seq 3531519972, ack 3645914147, win 14480, options [mss 1460,sackOK,TS val 1391324 ecr 142179163,nop,wscale 4], length 0
03:46:52.610175 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.55808 > 10.10.60.9.80: Flags [.], ack 1, win 229, options [nop,nop,TS val 142179164 ecr 1391324], length 0
03:46:52.610273 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 140: 10.10.60.2.55808 > 10.10.60.9.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 142179164 ecr 1391324], length 74: HTTP: GET / HTTP/1.1
03:46:52.613851 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 66: 10.10.60.9.80 > 10.10.60.2.55808: Flags [.], ack 75, win 905, options [nop,nop,TS val 1391325 ecr 142179164], length 0
03:46:52.614007 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 79: 10.10.60.9.80 > 10.10.60.2.55808: Flags [P.], seq 1:14, ack 75, win 905, options [nop,nop,TS val 1391325 ecr 142179164], length 13: HTTP
03:46:52.614314 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.55808 > 10.10.60.9.80: Flags [.], ack 14, win 229, options [nop,nop,TS val 142179165 ecr 1391325], length 0
root@osa-newton-ovs-compute01:/home/ubuntu# tcpdump -i tapb8051da9-60 -ne port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapb8051da9-60, link-type EN10MB (Ethernet), capture size 262144 bytes
>> Nothing! As expected.
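The difference between the two runs can be illustrated with a toy model of
a learning bridge. This is a deliberately simplified sketch, not how OVS is
implemented: VLANs and conntrack are ignored, and the class, function and
port names (ToyBridge, reply_ports, "patch", "tap-server", "tap-bystander")
are made up for illustration.

```python
class ToyBridge:
    """Very small model of a learning bridge; not how OVS is implemented."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # MAC address -> port it was learned on

    def normal(self, in_port, src, dst):
        """Learn the source MAC, then forward like an L2 switch."""
        self.fdb[src] = in_port
        if dst in self.fdb:
            return {self.fdb[dst]}
        return self.ports - {in_port}  # unknown destination: flood

    def output(self, port):
        """Send to a fixed port; nothing is learned."""
        return {port}


CLIENT, SERVER = "fa:16:3e:e8:59:00", "fa:16:3e:80:cb:0a"
UPLINK, SERVER_TAP, BYSTANDER_TAP = "patch", "tap-server", "tap-bystander"


def reply_ports(ingress_uses_normal):
    """Ports that receive the server's reply after one client request."""
    bridge = ToyBridge([UPLINK, SERVER_TAP, BYSTANDER_TAP])
    if ingress_uses_normal:
        bridge.normal(UPLINK, CLIENT, SERVER)   # client MAC is learned
    else:
        bridge.output(SERVER_TAP)               # client MAC is never learned
    return bridge.normal(SERVER_TAP, SERVER, CLIENT)
```

With reply_ports(False) the reply is flooded to the uplink and the
bystander tap; with reply_ports(True) it goes to the uplink only, which
matches the two sets of captures.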
I need to build out an environment using the master branch, but the
code at [3] seems to indicate the 'output' action is still specified.
Thanks for taking a look and let me know if you have any questions.
[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2016-August/042276.html
[2] https://github.com/openstack/neutron/blob/newton-eol/neutron/agent/linux/openvswitch_firewall/rules.py#L73
[3] https://github.com/openstack/neutron/blob/master/neutron/agent/linux/openvswitch_firewall/rules.py#L80
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1732067/+subscriptions