
yahoo-eng-team team mailing list archive

[Bug 2051351] Re: explicity_egress_direct prevents learning of local MACs and causes flooding of ingress packets, firewall_driver = openvswitch

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/907382
Committed: https://opendev.org/openstack/neutron/commit/d6f56c5f96c42e1682f3d1723a65253429778c20
Submitter: "Zuul (22348)"
Branch:    master

commit d6f56c5f96c42e1682f3d1723a65253429778c20
Author: LIU Yulong <i@xxxxxxxxxxxx>
Date:   Thu Jan 27 17:01:43 2022 +0800

    Add a default goto table=94 for openvswitch fw
    
    If explicitly_egress_direct=True is enabled and a port has no
    security group and port_security=False, the ingress flood
    reappears. The pipeline is:
    Ingress
    table_0 -> table_60 -> NORMAL -> VM
    Egress
    table_0 -> ... -> table_94 -> output
    
    Because the final ingress action is NORMAL, br-int learns the
    source MAC of ingress packets. The final egress action is output,
    which does not learn, so br-int never learns the VM's MAC and
    ingress packets to the VM are flooded again.
    
    This patch adds a default direct flow to table 94 during
    openflow security group init when explicitly_egress_direct=True, so
    the pipeline becomes:
    Ingress
    table_0 -> table_60 -> table_94 -> output VM
    Egress
    table_0 -> ... -> table_94 -> output
    
    This patch also adds flows for packets coming from the patch port:
    they match the local vlan and then go to table 94, which applies
    the same direct actions.
    
    These flows address the flood issue described above.
    
    Closes-Bug: #2051351
    Change-Id: Ia61784174ee610b338f26660b2954330abc131a1
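
The learning behaviour described in the commit message can be sketched with a toy model. This is only an illustration, not OVS or neutron code: the function name `simulate_bridge` and its input format are hypothetical, and the model assumes just two behaviours, namely that NORMAL learns the source MAC before forwarding and that a direct output action leaves the FDB untouched. The port numbers and MACs are the ones from the traces in the bug description.

```shell
#!/bin/sh
# Toy model of a learning bridge (hypothetical helper, not OVS code).
# Input lines: "<final_action> <in_port> <src_mac> <dst_mac>"
#   NORMAL: learn src_mac on in_port, then forward to the learned port or flood.
#   output: direct output action, the FDB is not updated.
simulate_bridge() {
    awk '
        $1 == "NORMAL" {
            fdb[$3] = $2                       # learn source MAC on ingress port
            if ($4 in fdb) print "NORMAL: " $4 " -> port " fdb[$4]
            else           print "NORMAL: " $4 " -> flood"
            next
        }
        $1 == "output" { print "output: " $4 " -> direct (nothing learned)" }
    '
}

# Before the fix: VM egress ends in output, ingress ends in NORMAL.
simulate_bridge <<'EOF'
output 45 fa:16:3e:96:58:ab 82:e8:18:67:7e:40
NORMAL 1 82:e8:18:67:7e:40 fa:16:3e:96:58:ab
EOF
```

In this model the second packet is flooded because the VM's MAC was never learned. If the first line also ended in NORMAL, the second would be forwarded to port 45. With the fix, ingress ends in a direct output as well, so it no longer depends on the FDB at all.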


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2051351

Title:
  explicity_egress_direct prevents learning of local MACs and causes
  flooding of ingress packets, firewall_driver = openvswitch

Status in neutron:
  Fix Released

Bug description:
  I believe this issue was already reported earlier:

  https://bugs.launchpad.net/neutron/+bug/1884708

  That bug has a fix committed:

  https://review.opendev.org/c/openstack/neutron/+/738551

  However, I believe the above change fixed only part of the issue (with firewall_driver=noop).
  The same problem is still not fixed with firewall_driver=openvswitch.

  First, I re-opened bug #1884708, but then I realized that nobody would
  notice a status change on a several-year-old bug, so I opened this
  new bug report instead.

  Reproduction:

  # config
  ml2_conf.ini:
  [securitygroup]
  firewall_driver = openvswitch
  [agent]
  explicitly_egress_direct = True
  [ovs]
  bridge_mappings = physnet0:br-physnet0,...

  # a random IP on net0 we can ping
  sudo ip link set up dev br-physnet0
  sudo ip link add link br-physnet0 name br-physnet0.100 type vlan id 100
  sudo ip link set up dev br-physnet0.100
  sudo ip address add dev br-physnet0.100 10.0.100.1/24

  # code
  devstack 6b0f055b
  neutron $ git log --oneline -n2
  27601f8eea (HEAD, origin/bug/2048785, origin/HEAD) Set trunk parent port as access port in ovs to avoid loop
  3ef02cc2fb (origin/master) Consume code from neutron-lib
  openvswitch 2.17.8-0ubuntu0.22.04.1
  linux 5.15.0-91-generic

  # clean up first
  openstack server delete vm0 --wait
  openstack port delete port0
  openstack network delete net0

  # build the environment
  openstack network create net0 --provider-network-type vlan --provider-physical-network physnet0 --provider-segment 100
  openstack subnet create --network net0 --subnet-range 10.0.100.0/24 subnet0
  openstack port create --no-security-group --disable-port-security --network net0 --fixed-ip ip-address=10.0.100.10 port0
  openstack server create --flavor cirros256 --image cirros-0.6.2-x86_64-disk --nic port-id=port0 --availability-zone :devstack0a --wait vm0

  # mac addresses for reference
  $ openstack port show port0 -f value -c mac_address
  fa:16:3e:96:58:ab
  $ ifdata -ph br-physnet0
  82:E8:18:67:7E:40

  # generate traffic that will keep fdb entries fresh
  sudo virsh console "$( openstack server show vm0 -f value -c OS-EXT-SRV-ATTR:instance_name )"
  ping 10.0.100.1

  # clear all past junk
  for br in br-physnet0 br-int ; do sudo ovs-appctl fdb/flush "$br" ; done

  # br-int does not learn port0's mac despite the ongoing ping
  for br in br-physnet0 br-int ; do echo ">>> $br <<<" ; sudo ovs-appctl fdb/show "$br" | egrep -i "$( openstack port show port0 -f value -c mac_address )|$( ifdata -ph br-physnet0 )" ; done
  >>> br-physnet0 <<<
      1   100  fa:16:3e:96:58:ab    0
  LOCAL   100  82:e8:18:67:7e:40    0
  >>> br-int <<<
      1     4  82:e8:18:67:7e:40    0
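
  The check above can also be scripted. The following is a minimal sketch, assuming the `fdb/show` column layout shown above (port, VLAN, MAC, age); `fdb_has_mac` is a hypothetical helper name, not an OVS command.

```shell
#!/bin/sh
# fdb_has_mac MAC
# Reads `ovs-appctl fdb/show <bridge>` output on stdin and succeeds if MAC
# appears in the third column (compared case-insensitively).
fdb_has_mac() {
    awk -v mac="$1" '
        tolower($3) == tolower(mac) { found = 1 }
        END { exit !found }
    '
}

# Sample line copied from the br-int output above.
br_int_fdb='    1     4  82:e8:18:67:7e:40    0'

if printf '%s\n' "$br_int_fdb" | fdb_has_mac fa:16:3e:96:58:ab; then
    echo "port0 MAC learned"
else
    echo "port0 MAC missing"
fi
```

  On a live host the sample variable would be replaced by piping `sudo ovs-appctl fdb/show br-int` into the helper.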

  # port and physnet bridge mac in all fdbs, egress == vnic -> physnet bridge
  # in br-int we have a direct output action
  $ sudo ovs-appctl ofproto/trace br-int in_port="$( sudo ovs-vsctl -- --columns=ofport find Interface name=$( echo "tap$( openstack port show port0 -f value -c id )" | cut -b1-14 ) | awk '{ print $3 }' )",dl_vlan=0,dl_dst=$( ifdata -ph br-physnet0 ),dl_src=$( openstack port show port0 -f value -c mac_address )
  Flow: in_port=45,dl_vlan=0,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=fa:16:3e:96:58:ab,dl_dst=82:e8:18:67:7e:40,dl_type=0x0000

  bridge("br-int")
  ----------------
   0. priority 0, cookie 0x2b36d6b4a42fe7b5
      goto_table:58
  58. priority 0, cookie 0x2b36d6b4a42fe7b5
      goto_table:60
  60. in_port=45, priority 100, cookie 0x2b36d6b4a42fe7b5
      set_field:0x2d->reg5
      set_field:0x4->reg6
      resubmit(,73)
  73. reg5=0x2d, priority 80, cookie 0x2b36d6b4a42fe7b5
      resubmit(,94)
  94. reg6=0x4,dl_src=fa:16:3e:96:58:ab,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00, priority 10, cookie 0x2b36d6b4a42fe7b5
      push_vlan:0x8100
      set_field:4100->vlan_vid
      output:1

  bridge("br-physnet0")
  ---------------------
   0. in_port=1,dl_vlan=4, priority 4, cookie 0x85bc1a5077d54d3f
      set_field:4196->vlan_vid
      NORMAL
       -> forwarding to learned port

  Final flow: reg5=0x2d,reg6=0x4,in_port=45,dl_vlan=4,dl_vlan_pcp=0,dl_vlan1=0,dl_vlan_pcp1=0,dl_src=fa:16:3e:96:58:ab,dl_dst=82:e8:18:67:7e:40,dl_type=0x0000
  Megaflow: recirc_id=0,eth,in_port=45,dl_vlan=0,dl_vlan_pcp=0,dl_src=fa:16:3e:96:58:ab,dl_dst=82:e8:18:67:7e:40,dl_type=0x0000
  Datapath actions: pop_vlan,push_vlan(vid=100,pcp=0),1

  # port and physnet bridge mac in all fdbs, ingress == physnet bridge -> vnic
  # in br-int we have the normal action flooding, despite the ongoing ping
  $ sudo ovs-appctl ofproto/trace br-physnet0 in_port=LOCAL,dl_vlan=100,dl_src=$( ifdata -ph br-physnet0 ),dl_dst=$( openstack port show port0 -f value -c mac_address )
  Flow: in_port=LOCAL,dl_vlan=100,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=82:e8:18:67:7e:40,dl_dst=fa:16:3e:96:58:ab,dl_type=0x0000

  bridge("br-physnet0")
  ---------------------
   0. priority 0, cookie 0x85bc1a5077d54d3f
      NORMAL
       -> forwarding to learned port

  bridge("br-int")
  ----------------
   0. in_port=1,dl_vlan=100, priority 3, cookie 0x2b36d6b4a42fe7b5
      set_field:4100->vlan_vid
      goto_table:58
  58. priority 0, cookie 0x2b36d6b4a42fe7b5
      goto_table:60
  60. priority 3, cookie 0x2b36d6b4a42fe7b5
      NORMAL
       -> no learned MAC for destination, flooding

  bridge("br-tun")
  ----------------
   0. in_port=1, priority 1, cookie 0xc8cfff9c6bbea88d
      goto_table:2
   2. dl_dst=00:00:00:00:00:00/01:00:00:00:00:00, priority 0, cookie 0xc8cfff9c6bbea88d
      goto_table:20
  20. priority 0, cookie 0xc8cfff9c6bbea88d
      goto_table:22
  22. priority 0, cookie 0xc8cfff9c6bbea88d
      drop

  Final flow: unchanged
  Megaflow: recirc_id=0,eth,in_port=LOCAL,dl_vlan=100,dl_vlan_pcp=0,dl_src=82:e8:18:67:7e:40,dl_dst=fa:16:3e:96:58:ab,dl_type=0x0000
  Datapath actions: pop_vlan,push_vlan(vid=4,pcp=0),8,13,pop_vlan,9,11
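
  One rough way to compare the two traces is to count the output ports in the "Datapath actions:" line. The sketch below assumes that every bare number in that line is an output port, which holds for the two traces in this report but is not a general parser for datapath actions; `count_output_ports` is a hypothetical helper name.

```shell
#!/bin/sh
# count_output_ports
# Reads ofproto/trace output on stdin and prints how many bare port numbers
# appear in the "Datapath actions:" line (0 if there is no such line).
count_output_ports() {
    awk '
        /^ *Datapath actions:/ {
            sub(/^ *Datapath actions: */, "")
            n = split($0, tok, ",")
            # Tokens such as "push_vlan(vid=4" or "pcp=0)" are not purely
            # numeric, so only real port numbers are counted.
            for (i = 1; i <= n; i++) if (tok[i] ~ /^[0-9]+$/) count++
        }
        END { print count + 0 }
    '
}

# Egress trace: a single direct output.
echo 'Datapath actions: pop_vlan,push_vlan(vid=100,pcp=0),1' | count_output_ports
# Ingress trace: flooded to several ports.
echo 'Datapath actions: pop_vlan,push_vlan(vid=4,pcp=0),8,13,pop_vlan,9,11' | count_output_ports
```

  For the two lines above this prints 1 and 4, matching the direct egress output and the flooded ingress.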

  This bug has a long history:

  round #1 - some unnecessary flooding in the egress direction
  https://bugs.launchpad.net/neutron/+bug/1732067
  https://bugs.launchpad.net/neutron/+bug/1841622
  fix introducing explicitly_egress_direct:
  https://review.opendev.org/c/openstack/neutron/+/666991

  round #2 - the fix above introduced some unnecessary ingress flooding
  https://bugs.launchpad.net/neutron/+bug/1884708
  fix for firewall_driver=noop
  https://review.opendev.org/c/openstack/neutron/+/738551
  also related:
  https://bugs.launchpad.net/neutron/+bug/1732067/comments/50
  https://bugs.launchpad.net/neutron/+bug/1732067/comments/79
  may be related:
  https://bugs.launchpad.net/neutron/+bug/1866445

  round #3 (today)
  https://bugs.launchpad.net/neutron/+bug/2048785/comments/2
  https://bugs.launchpad.net/neutron/+bug/1884708/comments/29

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2051351/+subscriptions


