yahoo-eng-team team mailing list archive

[Bug 2048785] [NEW] Trunk parent port (tpt port) vlan_mode is wrong in ovs

 

Public bug reported:

... therefore a forwarding loop, packet duplication, packet loss and
double tagging are possible.

Today a trunk bridge with one parent and one subport looks like this:

# ovs-vsctl show
...
    Bridge tbr-b2781877-3
        datapath_type: system
        Port spt-28c9689e-9e
            tag: 101
            Interface spt-28c9689e-9e
                type: patch
                options: {peer=spi-28c9689e-9e}
        Port tap3709f1a1-a5
            Interface tap3709f1a1-a5
        Port tpt-3709f1a1-a5
            Interface tpt-3709f1a1-a5
                type: patch
                options: {peer=tpi-3709f1a1-a5}
        Port tbr-b2781877-3
            Interface tbr-b2781877-3
                type: internal
...

# ovs-vsctl find Port name=tpt-3709f1a1-a5 | egrep 'tag|vlan_mode|trunks'
tag                 : []
trunks              : []
vlan_mode           : []

# ovs-vsctl find Port name=spt-28c9689e-9e | egrep 'tag|vlan_mode|trunks'
tag                 : 101
trunks              : []
vlan_mode           : []

I believe the vlan_mode of the tpt port is wrong (at least when the port is not "vlan_transparent") and it should have the value "access".
Even when the port is "vlan_transparent", forwarding loops between br-int and a trunk bridge should be prevented.

According to: http://www.openvswitch.org/support/dist-docs/ovs-vswitchd.conf.db.5.txt

"""
       vlan_mode: optional string, one of access, dot1q-tunnel, native-tagged,
       native-untagged, or trunk
              The VLAN mode of the port, as described above. When this  column
              is empty, a default mode is selected as follows:

              •      If  tag contains a value, the port is an access port. The
                     trunks column should be empty.

              •      Otherwise, the port is a trunk port.  The  trunks  column
                     value is honored if it is present.
"""

"""
       trunks: set of up to 4,096 integers, in range 0 to 4,095
              For  a trunk, native-tagged, or native-untagged port, the 802.1Q
              VLAN or VLANs that this port trunks; if it is  empty,  then  the
              port trunks all VLANs. Must be empty if this is an access port.

              A native-tagged or native-untagged port always trunks its native
              VLAN, regardless of whether trunks includes that VLAN.
"""

The above combination of tag, trunks and vlan_mode for the tpt port
means that it is in trunk mode (in the ovs sense) and it forwards both
untagged and tagged frames with any vlan tag. But the tpt port should
only forward untagged frames.
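
The default rule quoted above can be written down as a small model. This is only a sketch of the documented behaviour, not OVS code: the function names and the use of None / an empty set for an empty column ("[]") are assumptions made for illustration.

```python
from typing import Optional, Set

# Modes for which the quoted "trunks" text says an empty column means "all VLANs".
TRUNKING_MODES = {"trunk", "native-tagged", "native-untagged"}


def effective_vlan_mode(tag: Optional[int], vlan_mode: Optional[str]) -> str:
    """Return the mode OVS is documented to select for a Port row."""
    if vlan_mode is not None:
        return vlan_mode  # an explicit vlan_mode always wins
    # Empty vlan_mode: access if tag is set, otherwise trunk.
    return "access" if tag is not None else "trunk"


def trunks_all_vlans(tag: Optional[int], trunks: Set[int],
                     vlan_mode: Optional[str]) -> bool:
    """True if the port carries every VLAN (trunking mode, empty trunks)."""
    return effective_vlan_mode(tag, vlan_mode) in TRUNKING_MODES and not trunks


# Column values taken from the `ovs-vsctl find` output in this report.
tpt = dict(tag=None, trunks=set(), vlan_mode=None)
spt = dict(tag=101, trunks=set(), vlan_mode=None)

print(effective_vlan_mode(tpt["tag"], tpt["vlan_mode"]), trunks_all_vlans(**tpt))
# -> trunk True
print(effective_vlan_mode(spt["tag"], spt["vlan_mode"]), trunks_all_vlans(**spt))
# -> access False
```

With the values reported for the tpt port, the model ends up in trunk mode carrying all VLANs, which matches the conclusion drawn above.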

Feel free to treat this as the end of the bug report. Below I'll add
more about how we found this bug, under what conditions it can be
triggered, and what consequences it may have. However, please keep in
mind that I don't have a full upstream reproduction at the moment, nor
do I have a full analysis of every suspicion mentioned below.

I'm aware of a full reproduction of this bug only in a downstream
environment, described below. While the following was sufficient to
reproduce the problem, it was likely far from a minimal reproduction,
and some or many of the steps below are unnecessary.

* [securitygroup].firewall_driver = noop
* [ovs].explicitly_egress_direct = True
* 2 VMs started on the same compute.
* Both having a trunk port with one parent and one subport.
* The parent and the subport of each trunk have the same MAC address.
* All ports are on vlan networks belonging to the same physnet.
* All ports are created with --disable-port-security and --no-security-group.
* The subport segmentation IDs and the corresponding vlan network segmentation IDs were the same (as if they used "inherit").
* Traffic was generated from a 3rd VM on a different compute, addressed to the subport IP of one of the VMs, for which the destination MAC was not yet learned by either br-int or the two trunk bridges on the host.

I believe the environment looked like this:

openstack network create net0 --provider-network-type vlan --provider-physical-network physnet0 --provider-segment 100
openstack network create net1 --provider-network-type vlan --provider-physical-network physnet0 --provider-segment 101

openstack subnet create --network net0 --subnet-range 10.0.100.0/24 subnet0
openstack subnet create --network net1 --subnet-range 10.0.101.0/24 subnet1

openstack port create --no-security-group --disable-port-security --network net0 --fixed-ip ip-address=10.0.100.10 port0a
port0a_mac="$( openstack port show port0a -f value -c mac_address )"
openstack port create --no-security-group --disable-port-security --mac-address "$port0a_mac" --network net1 --fixed-ip ip-address=10.0.101.10 port1a

openstack port create --no-security-group --disable-port-security --network net0 --fixed-ip ip-address=10.0.100.11 port0b
port0b_mac="$( openstack port show port0b -f value -c mac_address )"
openstack port create --no-security-group --disable-port-security --mac-address "$port0b_mac" --network net1 --fixed-ip ip-address=10.0.101.11 port1b

openstack network trunk create --parent-port port0a trunka
openstack network trunk set --subport port=port1a,segmentation-type=vlan,segmentation-id=101 trunka

openstack network trunk create --parent-port port0b trunkb
openstack network trunk set --subport port=port1b,segmentation-type=vlan,segmentation-id=101 trunkb

openstack server create --flavor ds1G --image u1804 --nic port-id=port0a --wait vma
openstack server create --flavor ds1G --image u1804 --nic port-id=port0b --wait vmb # booted on the same compute as vma

At the moment I don't have a reproduction independent of that
environment that re-creates the same state of the bridges' FDBs and the
same kind of traffic.

Anyway, in this environment colleagues observed:
* Lost frames.
* Duplicated frames arriving at the vNIC of one of the VMs.
* Unexpectedly double-tagged frames on the physical bridge leaving the compute host.

Local analysis showed that when the traffic arrived at br-int, which did not have the dst MAC in its FDB, br-int had to flood it to all ports.
This way the frame ended up on both trunk bridges.
One of these trunk bridges was on the proper path to the destination address.
But the other trunk bridge, also not having the dst MAC in its FDB, had to flood the frame to all ports as well.
This trunk bridge therefore also flooded the frame to its tpt port, back to br-int.
But the tpt port is conceptually in a different VLAN, and the frame should never have been flooded to that port.
However, the tpt port has the wrong configuration and forwards traffic from the wrong VLANs.
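
The flooding step on the second trunk bridge can be illustrated with a toy model. This is a simplified sketch, not a reproduction: it only distinguishes access ports from trunk-like ports, omits the bridge's internal port, and treats untagged traffic as VLAN 0. The tag=0 given to the access-mode tpt port is a modeling assumption, not a statement about the eventual neutron fix. Port names are taken from the `ovs-vsctl show` output in this report.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

UNTAGGED = 0  # modeling assumption: untagged traffic is VLAN 0


@dataclass
class Port:
    name: str
    tag: Optional[int] = None
    vlan_mode: Optional[str] = None
    trunks: Set[int] = field(default_factory=set)

    def mode(self) -> str:
        # Documented default when vlan_mode is empty.
        if self.vlan_mode is not None:
            return self.vlan_mode
        return "access" if self.tag is not None else "trunk"

    def carries(self, vlan: int) -> bool:
        if self.mode() == "access":
            return vlan == (self.tag if self.tag is not None else UNTAGGED)
        # Every non-access mode is treated as a trunk in this sketch;
        # an empty trunks column means the port carries every VLAN.
        return not self.trunks or vlan in self.trunks


def flood(ports: List[Port], ingress: str, vlan: int) -> List[str]:
    """Names of the ports a flooded frame in `vlan` is sent out of."""
    return [p.name for p in ports if p.name != ingress and p.carries(vlan)]


def trunk_bridge(tpt: Port) -> List[Port]:
    return [Port("spt-28c9689e-9e", tag=101), Port("tap3709f1a1-a5"), tpt]


current = trunk_bridge(Port("tpt-3709f1a1-a5"))
proposed = trunk_bridge(Port("tpt-3709f1a1-a5", tag=0, vlan_mode="access"))

# A frame flooded by br-int enters the trunk bridge via the subport patch port.
print(flood(current, "spt-28c9689e-9e", 101))
# -> ['tap3709f1a1-a5', 'tpt-3709f1a1-a5']
print(flood(proposed, "spt-28c9689e-9e", 101))
# -> ['tap3709f1a1-a5']
```

In this model, the current configuration floods the VLAN 101 frame out of the tpt port and so back towards br-int, while an access-mode tpt port does not.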

After the looped frame got back to br-int, it reached the intended VM's
vNIC via the trunk parent (sic!) port. This means that the latter trunk
bridge now learned the traffic generator's source MAC on the wrong port.
I suspect that this may have led to the unexpectedly double-tagged
packets in the other direction.

** Affects: neutron
     Importance: Undecided
     Assignee: Bence Romsics (bence-romsics)
         Status: In Progress


** Tags: trunk

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2048785
