yahoo-eng-team team mailing list archive
Message #93271
[Bug 2048785] [NEW] Trunk parent port (tpt port) vlan_mode is wrong in ovs
Public bug reported:
... therefore a forwarding loop, packet duplication, packet loss and
double tagging are possible.
Today a trunk bridge with one parent and one subport looks like this:
# ovs-vsctl show
...
Bridge tbr-b2781877-3
datapath_type: system
Port spt-28c9689e-9e
tag: 101
Interface spt-28c9689e-9e
type: patch
options: {peer=spi-28c9689e-9e}
Port tap3709f1a1-a5
Interface tap3709f1a1-a5
Port tpt-3709f1a1-a5
Interface tpt-3709f1a1-a5
type: patch
options: {peer=tpi-3709f1a1-a5}
Port tbr-b2781877-3
Interface tbr-b2781877-3
type: internal
...
# ovs-vsctl find Port name=tpt-3709f1a1-a5 | egrep 'tag|vlan_mode|trunks'
tag : []
trunks : []
vlan_mode : []
# ovs-vsctl find Port name=spt-28c9689e-9e | egrep 'tag|vlan_mode|trunks'
tag : 101
trunks : []
vlan_mode : []
I believe the vlan_mode of the tpt port is wrong (at least when the port is not "vlan_transparent") and it should have the value "access".
Even when the port is "vlan_transparent", forwarding loops between br-int and a trunk bridge should be prevented.
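To try the proposed value by hand on a test host, something like the following could be used. This is only a sketch of a manual experiment, not the Neutron fix itself: set_tpt_access and the OVS_VSCTL variable are hypothetical names introduced here, and it is an assumption that nothing else (for example the agent) rewrites the port afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical helper for a manual experiment on a test host.
# OVS_VSCTL is an assumption made here so that the command can be swapped
# out (for example with "echo") to preview what would be run.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

set_tpt_access() {
    local port="$1"
    # Put the trunk parent patch port into access mode explicitly,
    # instead of relying on the default chosen for an empty vlan_mode.
    $OVS_VSCTL set Port "$port" vlan_mode=access
}
```

Running `OVS_VSCTL=echo set_tpt_access tpt-3709f1a1-a5` only prints the ovs-vsctl arguments, which is a safe way to check the command before applying it. This does not address the "vlan_transparent" case.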
According to: http://www.openvswitch.org/support/dist-docs/ovs-vswitchd.conf.db.5.txt
"""
vlan_mode: optional string, one of access, dot1q-tunnel, native-tagged,
native-untagged, or trunk
The VLAN mode of the port, as described above. When this column
is empty, a default mode is selected as follows:
• If tag contains a value, the port is an access port. The
trunks column should be empty.
• Otherwise, the port is a trunk port. The trunks column
value is honored if it is present.
"""
"""
trunks: set of up to 4,096 integers, in range 0 to 4,095
For a trunk, native-tagged, or native-untagged port, the 802.1Q
VLAN or VLANs that this port trunks; if it is empty, then the
port trunks all VLANs. Must be empty if this is an access port.
A native-tagged or native-untagged port always trunks its native
VLAN, regardless of whether trunks includes that VLAN.
"""
The above combination of tag, trunks and vlan_mode for the tpt port
means that it is in trunk mode (in the ovs sense) and it forwards both
untagged and tagged frames with any vlan tag. But the tpt port should
only forward untagged frames.
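The default selection rule quoted above can be written down as a small shell function. This is a sketch for illustration only: effective_vlan_mode is a hypothetical helper, not an OVS command, and it assumes the column values are passed as ovs-vsctl prints them, with "[]" meaning empty.

```shell
#!/usr/bin/env bash
# Sketch of the default rule quoted from ovs-vswitchd.conf.db(5).
# effective_vlan_mode is a hypothetical helper, not part of OVS.
# Arguments: the tag and vlan_mode column values, where "[]" means empty.
effective_vlan_mode() {
    local tag="$1" vlan_mode="$2"
    if [ "$vlan_mode" != "[]" ]; then
        # An explicitly configured vlan_mode is used as-is.
        echo "$vlan_mode"
    elif [ "$tag" != "[]" ]; then
        # Empty vlan_mode with a tag: the port defaults to access.
        echo "access"
    else
        # Empty vlan_mode and empty tag: the port defaults to trunk.
        echo "trunk"
    fi
}

effective_vlan_mode "[]" "[]"    # tpt port as shown above: prints "trunk"
effective_vlan_mode "101" "[]"   # spt port as shown above: prints "access"
```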
Feel free to treat this as the end of the bug report. Below I'll add
more about how we found this bug, under what conditions it can be
triggered, and what consequences it may have. However, please keep in
mind that I don't have a full upstream reproduction at the moment, nor
do I have a full analysis of every suspicion mentioned below.
I'm aware of a full reproduction of this bug only in a downstream
environment, which is described below. While the following was sufficient
to reproduce the problem, it was likely far from a minimal
reproduction, and some or many of the steps below are likely unnecessary.
* [securitygroup].firewall_driver = noop
* [ovs].explicitly_egress_direct = True
* 2 VMs started on the same compute.
* Both having a trunk port with one parent and one subport.
* The parent and the subport of each trunk have the same MAC address.
* All ports are on vlan networks belonging to the same physnet.
* All ports are created with --disable-port-security and --no-security-group.
* The subport segmentation IDs and the corresponding vlan network segmentation IDs were the same (as if they used "inherit").
* Traffic was generated from a 3rd VM on a different compute, addressed to the subport IP of one of the VMs, for which the destination MAC was not yet learned by either br-int or the two trunk bridges on the host.
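The last condition (destination MAC not yet learned) can be checked per bridge from the output of "ovs-appctl fdb/show <bridge>". The following is a sketch under assumptions: mac_in_fdb is a hypothetical helper, and it assumes the usual fdb/show layout with the MAC address in the third column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "ovs-appctl fdb/show <bridge>" output on stdin
# and succeeds (exit 0) only if the given MAC appears in the MAC column.
# Assumes the column order "port VLAN MAC Age".
mac_in_fdb() {
    local mac="$1"
    awk -v mac="$mac" '
        tolower($3) == tolower(mac) { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example use on a compute host (the MAC below is a placeholder):
#   ovs-appctl fdb/show br-int | mac_in_fdb fa:16:3e:00:00:01 \
#       && echo learned || echo "not learned"
```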
I believe the environment looked like this:
openstack network create net0 --provider-network-type vlan --provider-physical-network physnet0 --provider-segment 100
openstack network create net1 --provider-network-type vlan --provider-physical-network physnet0 --provider-segment 101
openstack subnet create --network net0 --subnet-range 10.0.100.0/24 subnet0
openstack subnet create --network net1 --subnet-range 10.0.101.0/24 subnet1
openstack port create --no-security-group --disable-port-security --network net0 --fixed-ip ip-address=10.0.100.10 port0a
port0a_mac="$( openstack port show port0a -f value -c mac_address )"
openstack port create --no-security-group --disable-port-security --mac-address "$port0a_mac" --network net1 --fixed-ip ip-address=10.0.101.10 port1a
openstack port create --no-security-group --disable-port-security --network net0 --fixed-ip ip-address=10.0.100.11 port0b
port0b_mac="$( openstack port show port0b -f value -c mac_address )"
openstack port create --no-security-group --disable-port-security --mac-address "$port0b_mac" --network net1 --fixed-ip ip-address=10.0.101.11 port1b
openstack network trunk create --parent-port port0a trunka
openstack network trunk set --subport port=port1a,segmentation-type=vlan,segmentation-id=101 trunka
openstack network trunk create --parent-port port0b trunkb
openstack network trunk set --subport port=port1b,segmentation-type=vlan,segmentation-id=101 trunkb
openstack server create --flavor ds1G --image u1804 --nic port-id=port0a --wait vma
openstack server create --flavor ds1G --image u1804 --nic port-id=port0b --wait vmb # booted on the same compute as vma
At the moment I don't have a reproduction independent of that
environment, that re-creates the same state of the bridges' FDBs and the
same kind of traffic.
Anyway, in this environment colleagues observed:
* Lost frames.
* Duplicated frames arriving at the vNIC of one of the VMs.
* Unexpectedly double-tagged frames on the physical bridge leaving the compute host.
Local analysis showed that when the traffic arrived at br-int, which did not have the dst MAC in its FDB, br-int had to flood it to all ports.
This way the frame ended up on both trunk bridges.
One of these trunk bridges was on the correct path to the destination address.
But the other trunk bridge, also not having the dst MAC in its FDB, had to flood the frame to all of its ports.
So this trunk bridge also flooded the frame to its tpt port, back to br-int.
But the tpt port is conceptually in a different VLAN and the frame should never have been flooded to that port.
However, the tpt port has the wrong configuration and forwards traffic from the wrong VLANs.
After the looped frame got back to br-int, it reached the intended VM's
vNIC via the trunk parent (sic!) port. This means that the latter trunk
bridge now learned the traffic generator's source MAC on the wrong port.
I have a suspicion that this may have led to the unexpectedly
double-tagged packets in the other direction.
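To see which trunk parent patch ports on a host are in the default (trunk) mode described in this report, the Port table can be filtered. This is a sketch under assumptions: find_default_trunk_tpt_ports is a hypothetical helper, and it assumes that "ovs-vsctl --format=csv --data=bare --no-headings --columns=name,tag,vlan_mode list Port" prints one "name,tag,vlan_mode" row per port with empty columns left blank.

```shell
#!/usr/bin/env bash
# Hypothetical audit helper: reads "name,tag,vlan_mode" CSV rows on stdin
# and prints the tpt-* ports where both tag and vlan_mode are empty,
# i.e. the ports that OVS treats as trunk ports by default.
find_default_trunk_tpt_ports() {
    awk -F, '
        $1 ~ /^"?tpt-/ && $2 == "" && $3 == "" {
            gsub(/"/, "", $1)   # drop quotes if the name was quoted
            print $1
        }
    '
}

# Example use on a compute host:
#   ovs-vsctl --format=csv --data=bare --no-headings \
#       --columns=name,tag,vlan_mode list Port | find_default_trunk_tpt_ports
```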
** Affects: neutron
Importance: Undecided
Assignee: Bence Romsics (bence-romsics)
Status: In Progress
** Tags: trunk
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2048785