yahoo-eng-team team mailing list archive

[Bug 1732067] Re: openvswitch firewall flows cause flooding on integration bridge

 

Since this report concerns a possible security risk, an incomplete
security advisory task has been added while the core security reviewers
for the affected project or projects confirm the bug and discuss the
scope of any vulnerability along with potential solutions.

(The duplicate bug 1813439 was previously being tracked as a potential
vulnerability report.)

** Also affects: ossa
   Importance: Undecided
       Status: New

** Changed in: ossa
       Status: New => Incomplete

** Information type changed from Public to Public Security

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1732067

Title:
  openvswitch firewall flows cause flooding on integration bridge

Status in neutron:
  In Progress
Status in OpenStack Security Advisory:
  Incomplete

Bug description:
  Environment: OpenStack Newton
  Driver: ML2 w/ OVS
  Firewall: openvswitch

  In this environment, we have observed OVS flooding unicast traffic
  to all ports in a given VLAN on the integration bridge because the
  FDB has no entry for the destination MAC address. Across our fleet of
  240+ nodes, this causes a considerable amount of unwanted traffic on
  any given node.

  In this test, we have 3 machines:

  Client: fa:16:3e:e8:59:00 (10.10.60.2)
  Server: fa:16:3e:80:cb:0a (10.10.60.9)
  Bystander: fa:16:3e:a0:ee:02 (10.10.60.10)

  The server is running a web server using netcat:

  while true ; do sudo nc -l -p 80 < index.html ; done

  The client requests the page using curl:

  ip netns exec qdhcp-b07e6cb3-0943-45a2-b5ff-efb7e99e4d3d curl http://10.10.60.9/

  We would expect the traffic to be limited to the client and the
  server. However, the captures below show the server->client
  responses being flooded out of all tap interfaces connected to br-int
  in the same local VLAN:

  root@osa-newton-ovs-compute01:~# tcpdump -i tap5f03424d-1c -ne port 80
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on tap5f03424d-1c, link-type EN10MB (Ethernet), capture size 262144 bytes
  02:20:30.190675 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 74: 10.10.60.2.54796 > 10.10.60.9.80: Flags [S], seq 213484442, win 29200, options [mss 1460,sackOK,TS val 140883559 ecr 0,nop,wscale 7], length 0
  02:20:30.191926 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 74: 10.10.60.9.80 > 10.10.60.2.54796: Flags [S.], seq 90006557, ack 213484443, win 14480, options [mss 1460,sackOK,TS val 95716 ecr 140883559,nop,wscale 4], length 0
  02:20:30.192837 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.54796 > 10.10.60.9.80: Flags [.], ack 1, win 229, options [nop,nop,TS val 140883560 ecr 95716], length 0
  02:20:30.192986 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 140: 10.10.60.2.54796 > 10.10.60.9.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 140883560 ecr 95716], length 74: HTTP: GET / HTTP/1.1
  02:20:30.195806 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 79: 10.10.60.9.80 > 10.10.60.2.54796: Flags [P.], seq 1:14, ack 1, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 13: HTTP
  02:20:30.196207 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.54796 > 10.10.60.9.80: Flags [.], ack 14, win 229, options [nop,nop,TS val 140883561 ecr 95717], length 0
  02:20:30.197481 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 66: 10.10.60.9.80 > 10.10.60.2.54796: Flags [.], ack 75, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 0

  ^^^ On the server tap we see the bi-directional traffic

  root@osa-newton-ovs-compute01:/home/ubuntu# tcpdump -i tapb8051da9-60 -ne port 80
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on tapb8051da9-60, link-type EN10MB (Ethernet), capture size 262144 bytes
  02:20:30.192165 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 74: 10.10.60.9.80 > 10.10.60.2.54796: Flags [S.], seq 90006557, ack 213484443, win 14480, options [mss 1460,sackOK,TS val 95716 ecr 140883559,nop,wscale 4], length 0
  02:20:30.195827 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 79: 10.10.60.9.80 > 10.10.60.2.54796: Flags [P.], seq 1:14, ack 1, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 13: HTTP
  02:20:30.197500 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 66: 10.10.60.9.80 > 10.10.60.2.54796: Flags [.], ack 75, win 905, options [nop,nop,TS val 95717 ecr 140883560], length 0

  ^^^ On the bystander tap we see the flooded traffic
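
  The check done by eye on the bystander capture can be sketched in a
  few lines of Python. This is only an illustration: the function name
  flooded_frames is hypothetical, and the sample lines are abbreviated
  from the bystander capture above. It reports unicast frames whose
  source and destination MAC are both foreign to the tap being captured.

```python
import re

# Matches the "src > dst," MAC pair printed by `tcpdump -e`.
MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
MAC_PAIR = re.compile(rf"({MAC}) > ({MAC}),")


def flooded_frames(capture_text, local_mac):
    """Return (src, dst) for unicast frames neither sent by nor addressed to local_mac."""
    local_mac = local_mac.lower()
    flooded = []
    for line in capture_text.splitlines():
        match = MAC_PAIR.search(line.lower())
        if not match:
            continue  # banner lines and anything without a MAC pair
        src, dst = match.groups()
        # The lowest bit of the first octet marks broadcast/multicast frames,
        # which are legitimately delivered to every port.
        is_group_address = int(dst[:2], 16) & 1
        if not is_group_address and local_mac not in (src, dst):
            flooded.append((src, dst))
    return flooded


# Abbreviated from the bystander capture in this report.
BYSTANDER_CAPTURE = """\
02:20:30.192165 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 74: 10.10.60.9.80 > 10.10.60.2.54796: Flags [S.], length 0
02:20:30.195827 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 79: 10.10.60.9.80 > 10.10.60.2.54796: Flags [P.], length 13: HTTP
"""

if __name__ == "__main__":
    for src, dst in flooded_frames(BYSTANDER_CAPTURE, "fa:16:3e:a0:ee:02"):
        print(f"flooded unicast frame: {src} -> {dst}")
```

  Any hit on a tap means the bridge delivered a unicast frame to a port
  that is neither its sender nor its destination.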

  The FDB tables show that br-int has no entry for the client's MAC
  address, while br-provider does. I would expect br-int to have
  learned the MAC address on the patch uplink:

  root@osa-newton-ovs-compute01:/home/ubuntu# ovs-appctl fdb/show br-int | grep 'fa:16:3e:e8:59:00'
  root@osa-newton-ovs-compute01:/home/ubuntu# ovs-appctl fdb/show br-provider | grep 'fa:16:3e:e8:59:00'
      2   850  fa:16:3e:e8:59:00    3
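
  For checking many nodes, the grep above could be replaced by a small
  parser. The sketch below is hypothetical (fdb_lookup is not part of
  any OVS tooling) and assumes the usual four-column "port VLAN MAC Age"
  layout of 'ovs-appctl fdb/show'; the header line in the sample is an
  assumption, the data row is the one shown above.

```python
def fdb_lookup(fdb_text, mac):
    """Return (port, vlan, age) tuples for every FDB row that matches mac."""
    mac = mac.lower()
    hits = []
    for line in fdb_text.splitlines():
        fields = line.split()
        # Data rows have four columns: port, VLAN, MAC, age.
        # The header row also has four fields but never matches a MAC.
        if len(fields) != 4 or fields[2].lower() != mac:
            continue
        port, vlan, _, age = fields
        hits.append((port, int(vlan), int(age)))
    return hits


# Sample text: assumed header plus the br-provider row from this report.
BR_PROVIDER_FDB = """\
 port  VLAN  MAC                Age
    2   850  fa:16:3e:e8:59:00    3
"""
# br-int had no row for the client, so only the header remains.
BR_INT_FDB = " port  VLAN  MAC                Age\n"

if __name__ == "__main__":
    client = "fa:16:3e:e8:59:00"
    print("br-provider:", fdb_lookup(BR_PROVIDER_FDB, client))
    print("br-int:", fdb_lookup(BR_INT_FDB, client))
```

  An empty result for br-int together with a hit on br-provider
  corresponds to the situation described in this report.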
      
  According to [1], an explicit 'output' action bypasses the MAC
  learning mechanism in OVS, which is performed as part of the 'NORMAL'
  action. The related table 82 entries are below, and the code is
  here [2]:

  cookie=0x94ebb7913c37a0ec, duration=415.490s, table=82, n_packets=5, n_bytes=424, idle_age=31, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=strip_vlan,output:13
  cookie=0x94ebb7913c37a0ec, duration=415.489s, table=82, n_packets=354, n_bytes=35229, idle_age=154, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=strip_vlan,output:13
  cookie=0x94ebb7913c37a0ec, duration=415.489s, table=82, n_packets=1, n_bytes=78, idle_age=154, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=ct(commit,zone=NXM_NX_REG6[0..15]),strip_vlan,output:13
  cookie=0x94ebb7913c37a0ec, duration=415.489s, table=82, n_packets=1, n_bytes=78, idle_age=415, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=ct(commit,zone=NXM_NX_REG6[0..15]),strip_vlan,output:13
  cookie=0x94ebb7913c37a0ec, duration=415.491s, table=82, n_packets=120, n_bytes=7920, idle_age=305, priority=50,ct_state=+est-rel+rpl,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
  cookie=0x94ebb7913c37a0ec, duration=415.491s, table=82, n_packets=0, n_bytes=0, idle_age=415, priority=50,ct_state=-new-est+rel-inv,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
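
  To make the mechanism concrete, here is a toy model of a learning
  bridge. It is not OVS code and ignores VLANs, conntrack and aging; the
  class and port names are invented for illustration. It assumes that
  the server's replies are forwarded with NORMAL while the client's
  frames are delivered either by an explicit output or by NORMAL.

```python
class LearningBridge:
    """Toy L2 learning bridge, used only to illustrate the flooding."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # MAC address -> port it was learned on

    def normal(self, in_port, src, dst):
        """Model NORMAL: learn the source MAC, then forward or flood."""
        self.fdb[src] = in_port
        if dst in self.fdb:
            return {self.fdb[dst]}
        return self.ports - {in_port}  # unknown destination: flood

    def output(self, out_port):
        """Model an explicit output: deliver the frame, learn nothing."""
        return {out_port}


CLIENT, SERVER = "fa:16:3e:e8:59:00", "fa:16:3e:80:cb:0a"


def reply_ports(ingress_uses_normal):
    """Return the ports that receive the server's reply to the client."""
    bridge = LearningBridge(["patch", "server", "bystander"])
    # Client request arrives on the patch port.
    if ingress_uses_normal:
        bridge.normal("patch", CLIENT, SERVER)
    else:
        bridge.output("server")
    # Server reply leaves the server port and is forwarded with NORMAL.
    return bridge.normal("server", SERVER, CLIENT)


if __name__ == "__main__":
    print("ingress via output:", sorted(reply_ports(False)))
    print("ingress via NORMAL:", sorted(reply_ports(True)))
```

  With an explicit output on the ingress path the client's MAC is never
  learned, so the reply reaches the bystander port as well. With NORMAL
  on the ingress path the reply goes to the patch port only.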

  My testing shows that changing the priority-70 flow rules to replace
  'strip_vlan,output:13' with 'mod_vlan_vid:4,NORMAL' (a quick hack for
  the sake of getting it working) results in the expected behavior:

  cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=2110, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=mod_vlan_vid:4,NORMAL
  cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=518, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=mod_vlan_vid:4,NORMAL
  cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=392, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=ct(commit,zone=NXM_NX_REG6[0..15]),mod_vlan_vid:4,NORMAL
  cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=185, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=22 actions=ct(commit,zone=NXM_NX_REG6[0..15]),mod_vlan_vid:4,NORMAL
  cookie=0x85cd1a977dd54be0, duration=0.361s, table=82, n_packets=0, n_bytes=0, idle_age=5263, priority=50,ct_state=+est-rel+rpl,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
  cookie=0x85cd1a977dd54be0, duration=0.361s, table=82, n_packets=0, n_bytes=0, idle_age=5373, priority=50,ct_state=-new-est+rel-inv,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
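
  Note that the two priority-50 flows in the dump above still end in
  'output:13'. A rough way to audit a dump for this is sketched below.
  The helper name split_flows_by_forwarding is hypothetical, it only
  inspects the last action of each flow, and the sample lines are taken
  from the modified dump above.

```python
import re

KEY_FIELDS = re.compile(r"priority=(\d+),ct_state=([^, ]+)")


def split_flows_by_forwarding(dump_text, table=82):
    """Group flows of one table by their final action: NORMAL, output:N or other."""
    groups = {"normal": [], "output": [], "other": []}
    for line in dump_text.splitlines():
        if f"table={table}," not in line or " actions=" not in line:
            continue
        match_part, actions = line.rsplit(" actions=", 1)
        fields = KEY_FIELDS.search(match_part)
        if not fields:
            continue
        key = (int(fields.group(1)), fields.group(2))
        # Only the last action decides how the packet leaves the bridge.
        last_action = actions.strip().rsplit(",", 1)[-1]
        if last_action.upper() == "NORMAL":
            groups["normal"].append(key)
        elif last_action.startswith("output:"):
            groups["output"].append(key)
        else:
            groups["other"].append(key)
    return groups


# Three of the six flows from the modified dump in this report.
SAMPLE_DUMP = """\
cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=2110, priority=70,ct_state=+est-rel-rpl,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=mod_vlan_vid:4,NORMAL
cookie=0x85cd1a977dd54be0, duration=0.359s, table=82, n_packets=0, n_bytes=0, idle_age=392, priority=70,ct_state=+new-est,tcp,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a,tp_dst=80 actions=ct(commit,zone=NXM_NX_REG6[0..15]),mod_vlan_vid:4,NORMAL
cookie=0x85cd1a977dd54be0, duration=0.361s, table=82, n_packets=0, n_bytes=0, idle_age=5263, priority=50,ct_state=+est-rel+rpl,ct_zone=4,ct_mark=0,reg5=0xd,dl_dst=fa:16:3e:80:cb:0a actions=strip_vlan,output:13
"""

if __name__ == "__main__":
    for kind, flows in split_flows_by_forwarding(SAMPLE_DUMP).items():
        print(kind, flows)
```

  Flows listed under "output" are the ones that, per [1], would not
  contribute to MAC learning.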

  The client's MAC address now shows up in the br-int FDB:

  root@osa-newton-ovs-compute01:/home/ubuntu# ovs-appctl fdb/show br-int | grep 'fa:16:3e:e8:59:00'
      1     4  fa:16:3e:e8:59:00    2

  The test below shows that traffic is now seen only on the server tap
  and not on the bystander tap:

  root@osa-newton-ovs-compute01:/home/ubuntu# tcpdump -i tap5f03424d-1c -ne port 80
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on tap5f03424d-1c, link-type EN10MB (Ethernet), capture size 262144 bytes
  03:46:52.606940 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 74: 10.10.60.2.55808 > 10.10.60.9.80: Flags [S], seq 3645914146, win 29200, options [mss 1460,sackOK,TS val 142179163 ecr 0,nop,wscale 7], length 0
  03:46:52.608880 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 74: 10.10.60.9.80 > 10.10.60.2.55808: Flags [S.], seq 3531519972, ack 3645914147, win 14480, options [mss 1460,sackOK,TS val 1391324 ecr 142179163,nop,wscale 4], length 0
  03:46:52.610175 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.55808 > 10.10.60.9.80: Flags [.], ack 1, win 229, options [nop,nop,TS val 142179164 ecr 1391324], length 0
  03:46:52.610273 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 140: 10.10.60.2.55808 > 10.10.60.9.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 142179164 ecr 1391324], length 74: HTTP: GET / HTTP/1.1
  03:46:52.613851 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 66: 10.10.60.9.80 > 10.10.60.2.55808: Flags [.], ack 75, win 905, options [nop,nop,TS val 1391325 ecr 142179164], length 0
  03:46:52.614007 fa:16:3e:80:cb:0a > fa:16:3e:e8:59:00, ethertype IPv4 (0x0800), length 79: 10.10.60.9.80 > 10.10.60.2.55808: Flags [P.], seq 1:14, ack 75, win 905, options [nop,nop,TS val 1391325 ecr 142179164], length 13: HTTP
  03:46:52.614314 fa:16:3e:e8:59:00 > fa:16:3e:80:cb:0a, ethertype IPv4 (0x0800), length 66: 10.10.60.2.55808 > 10.10.60.9.80: Flags [.], ack 14, win 229, options [nop,nop,TS val 142179165 ecr 1391325], length 0

  root@osa-newton-ovs-compute01:/home/ubuntu# tcpdump -i tapb8051da9-60 -ne port 80
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on tapb8051da9-60, link-type EN10MB (Ethernet), capture size 262144 bytes

  >> Nothing! As expected.

  I still need to build out an environment using the master branch to
  confirm, but the code at [3] seems to indicate that the 'output'
  action is still specified there.

  Thanks for taking a look and let me know if you have any questions.

  [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2016-August/042276.html
  [2] https://github.com/openstack/neutron/blob/newton-eol/neutron/agent/linux/openvswitch_firewall/rules.py#L73
  [3] https://github.com/openstack/neutron/blob/master/neutron/agent/linux/openvswitch_firewall/rules.py#L80

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1732067/+subscriptions

