yahoo-eng-team team mailing list archive
Message #61383
[Bug 1663232] [NEW] Openvswitch VLAN stripping issue with tunneling
Public bug reported:
Openvswitch VLAN tag stripping and MTU issue
When L2 population is not used in an environment running Openvswitch as
the ML2 mechanism driver, or when the learned flows match instead of the
L2 population flows, the VLAN tag that Neutron uses internally is not
stripped before tunneling. For VXLAN, the encapsulation overhead then
exceeds the MTU reduction applied to the virtual networks, which causes
MTU issues.
In my setup, I have several OpenStack clouds (Newton) deployed with
Fuel, using VXLAN segmentation and Openvswitch, running on Ubuntu 16.04.
Some machines in the tenant virtual networks act as bridges, so L2
population is not sufficient and the learning feature of br-tun is
required. The deployments are the most basic that can be performed with
Fuel 10 (no additional services).
The overhead of VXLAN is 50 bytes if the original Ethernet frame does
not have a VLAN tag. If the frame has a VLAN tag, the overhead is 54
bytes. When setting the virtual network MTU, Neutron assumes that there
is no VLAN tag. However, Neutron uses VLAN tags internally to isolate
the networks in br-int and br-tun. When L2 population is used, the flows
it installs in br-tun strip the VLAN tag before tunneling, so everything
works properly. But when L2 population is not used, or its flows are not
hit and learning takes over, the learned flows do not strip the VLAN
tag; they only zero the VLAN ID. The overhead is then 54 bytes and
communication is broken.
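The arithmetic behind the 50 and 54 byte figures can be sketched as
follows. This is only an illustration: the header sizes are the standard
ones for VXLAN over IPv4, while the 1500 byte physical MTU and 1450 byte
tenant MTU are assumed values that do not come from the report.

```python
# Standard header sizes in bytes for VXLAN over IPv4.
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14
DOT1Q_TAG = 4  # 802.1Q tag left on the inner frame


def vxlan_overhead(inner_vlan_tag):
    """Bytes added on top of the tenant IP packet by VXLAN encapsulation."""
    overhead = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET
    if inner_vlan_tag:
        overhead += DOT1Q_TAG
    return overhead


def fits(physical_mtu, tenant_mtu, inner_vlan_tag):
    """True if a full-size tenant packet still fits in the physical MTU."""
    return tenant_mtu + vxlan_overhead(inner_vlan_tag) <= physical_mtu


# Assumed values for illustration only (not taken from the report).
PHYSICAL_MTU = 1500
TENANT_MTU = 1450

print(vxlan_overhead(False), fits(PHYSICAL_MTU, TENANT_MTU, False))
print(vxlan_overhead(True), fits(PHYSICAL_MTU, TENANT_MTU, True))
```

Under these assumptions, a full-size tenant packet fits exactly when the
tag is stripped and exceeds the physical MTU by 4 bytes when the tag is
left in place.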
The following learning rule in table 10 of br-tun installs flows that
zero the VLAN tag but do not remove it:
#table=10,priority=0,actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xa9e495b5c54cf90c,OXM_OF_VLAN_VID[],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->OXM_OF_VLAN_VID[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1
resulting in flows in table 20 like this:
#table=20,priority=1,vlan_tci=0x0003/0x0fff,dl_dst=fa:16:3e:b6:53:e2 actions=load:0->OXM_OF_VLAN_VID[],load:0x18->NXM_NX_TUN_ID[],output:2
This flow does not remove the VLAN tag. When L2 population is used, flows with a higher priority are inserted that do strip the VLAN tag correctly. However, the learned flows are still used whenever the L2 population flows do not match.
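To spot affected flows in a flow dump, a small helper along these lines
could be used. It is a rough sketch, not part of Neutron: the function
names are hypothetical, the parser only handles flat action lists (no
nested learn(...) actions), and the second sample flow is a made-up
example of a flow that strips the tag.

```python
# OVS action names that actually remove the 802.1Q header.
STRIP_ACTIONS = ("strip_vlan", "pop_vlan")
# Action seen in the learned flows: it only sets the VLAN ID to 0.
ZERO_VID_ACTION = "load:0->OXM_OF_VLAN_VID[]"


def split_actions(flow):
    """Return the flat list of actions of a flow line (no nesting support)."""
    _, _, actions = flow.partition("actions=")
    return [a.strip() for a in actions.split(",") if a.strip()]


def removes_vlan_tag(flow):
    """True if the flow has an action that pops the VLAN header."""
    return any(a in STRIP_ACTIONS for a in split_actions(flow))


def only_zeroes_vlan_id(flow):
    """True if the flow zeroes the VLAN ID without removing the tag."""
    return ZERO_VID_ACTION in split_actions(flow) and not removes_vlan_tag(flow)


# Learned flow quoted in the report.
learned = ("table=20,priority=1,vlan_tci=0x0003/0x0fff,dl_dst=fa:16:3e:b6:53:e2 "
           "actions=load:0->OXM_OF_VLAN_VID[],load:0x18->NXM_NX_TUN_ID[],output:2")
# Hypothetical flow that strips the tag, for comparison only.
stripping = ("table=20,priority=2,dl_vlan=3,dl_dst=fa:16:3e:b6:53:e2 "
             "actions=strip_vlan,load:0x18->NXM_NX_TUN_ID[],output:2")

for flow in (learned, stripping):
    print(only_zeroes_vlan_id(flow), removes_vlan_tag(flow))
```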
Expected output: traffic without a VLAN tag tunneled in VXLAN with a 50-byte overhead
Actual output: traffic with a VLAN tag (VLAN ID 0) tunneled in VXLAN with a 54-byte overhead
The issue does not happen with GRE, as the 4 additional bytes still fit
within the 50-byte MTU reduction on the tenant network.
** Affects: neutron
Importance: Undecided
Status: New
** Tags: mtu openvswitch tunnel vlan
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1663232
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1663232/+subscriptions