yahoo-eng-team team mailing list archive
Message #56647
[Bug 1622017] Re: OVS agent is not removing VLAN tags before tunnels when configured with native OF interface
Reviewed: https://review.openstack.org/368553
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4361f7543f984cf5f09c0c7070ac6b0f22f3b6b1
Submitter: Jenkins
Branch: master
commit 4361f7543f984cf5f09c0c7070ac6b0f22f3b6b1
Author: IWAMOTO Toshihiro <iwamoto@xxxxxxxxxxxxx>
Date: Mon Sep 12 14:36:18 2016 +0900
of_interface: Use vlan_tci instead of vlan_vid
To pop VLAN tags in learn action generated flows, vlan_tci should
be used instead of vlan_vid. Otherwise, VLAN tags with VID=0 are
left.
Change-Id: Ie38ab860424f6e2e2448abac82c428dae3a8a544
Closes-bug: #1622017
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1622017
Title:
OVS agent is not removing VLAN tags before tunnels when configured
with native OF interface
Status in neutron:
Fix Released
Bug description:
While investigating an MTU issue, an unaccounted-for overhead of 4
bytes was discovered. tcpdump showed a spurious 802.1Q header when
attempting to connect to a guest via floating IP. The tenant network
type is VXLAN and the VXLAN endpoints themselves are on a VLAN. This
issue effectively breaks communication with guests via floating IP
for some system configurations.
The test system is configured with the default global_physnet_mtu of
1500, and inspection of the router namespace confirms that the tenant
network's router interface has been automatically configured with an
MTU of 1450. Ping was used to test, e.g. ping -M do -s 1422
192.0.2.58 (1422 is the maximum payload that should fit in the 1450
MTU without fragmentation).
With the system configured as described, "ping -s 1420 <floating ip>"
fails.
tcpdump on the controller reveals:
[root@overcloud-controller-0 heat-admin]# tcpdump -vvv -e -i any icmp
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
18:32:49.163223 P 52:54:00:01:09:3c (oui Unknown) ethertype IPv4 (0x0800), length 1464: (tos 0x0, ttl 64, id 37535, offset 0, flags [DF], proto ICMP (1), length 1448)
192.0.2.1 > 192.0.2.58: ICMP echo request, id 16083, seq 1, length 1428
18:32:49.163340 In 00:00:00:00:00:00 (oui Ethernet) ethertype IPv4 (0x0800), length 592: (tos 0xc0, ttl 64, id 4395, offset 0, flags [none], proto ICMP (1), length 576)
overcloud-controller-0.tenant.localdomain > overcloud-controller-0.tenant.localdomain: ICMP overcloud-novacompute-0.tenant.localdomain unreachable - need to frag (mtu 1500), length 556
(tos 0x0, ttl 64, id 22077, offset 0, flags [DF], proto UDP (17), length 1502)
overcloud-controller-0.tenant.localdomain.51706 > overcloud-novacompute-0.tenant.localdomain.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36
Adjusting the ping size to allow for a 4 byte header (e.g. ping -s 1418 <floating ip>) succeeds.
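The arithmetic behind these ping sizes can be sketched in a few lines of Python. This is only an illustration: the constant and function names are made up for this note, and the header sizes assume IPv4 without options and a single 802.1Q tag on the inner frame.

```python
# Header sizes in bytes (IPv4 without options; names are illustrative only).
IPV4_HEADER = 20
ICMP_HEADER = 8
UDP_HEADER = 8
VXLAN_HEADER = 8
ETHERNET_HEADER = 14
DOT1Q_TAG = 4


def max_ping_payload(mtu, extra_overhead=0):
    """Largest `ping -s` value that fits in `mtu` without fragmentation."""
    return mtu - IPV4_HEADER - ICMP_HEADER - extra_overhead


def outer_ip_length(inner_ip_length, inner_vlan_tagged):
    """Length of the outer IPv4 packet after VXLAN encapsulation."""
    inner_frame = ETHERNET_HEADER + inner_ip_length
    if inner_vlan_tagged:
        inner_frame += DOT1Q_TAG
    return IPV4_HEADER + UDP_HEADER + VXLAN_HEADER + inner_frame


# Payload limit for the 1450 MTU, without and with the stray 4-byte tag.
print(max_ping_payload(1450))
print(max_ping_payload(1450, DOT1Q_TAG))

# `ping -s 1420` produces an inner IP packet of 1448 bytes.
print(outer_ip_length(1448, inner_vlan_tagged=True))
print(outer_ip_length(1448, inner_vlan_tagged=False))
```

With the stray tag, the 1448-byte inner packet becomes a 1502-byte outer packet, which matches the UDP length in the "need to frag (mtu 1500)" message above. Without the tag, the same packet would be 1498 bytes and would fit.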
Using an alternate tcpdump command to inspect the VXLAN traffic reveals an unusual extra 802.1Q header with a VLAN ID of 0:
[root@overcloud-controller-0 heat-admin]# tcpdump -vvv -n -e -i any udp
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
18:36:48.095985 Out 56:13:19:d8:af:27 ethertype IPv4 (0x0800), length 1516: (tos 0x0, ttl 64, id 22088, offset 0, flags [DF], proto UDP (17), length 1500)
172.16.0.5.51706 > 172.16.0.10.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36
fa:16:3e:99:37:ce > fa:16:3e:06:65:6f, ethertype 802.1Q (0x8100), length 1464: vlan 0, p 0, ethertype IPv4, (tos 0x0, ttl 63, id 37541, offset 0, flags [DF], proto ICMP (1), length 1446)
192.0.2.1 > 192.168.2.101: ICMP echo request, id 16422, seq 1, length 1426
18:36:48.097861 P ea:0c:37:f7:69:5e ethertype 802.1Q (0x8100), length 1520: vlan 50, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 22354, offset 0, flags [DF], proto UDP (17), length 1500)
172.16.0.10.50337 > 172.16.0.5.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36
The flow table is similar to the following. (This was taken from the
compute node, not the controller, but the br-tun flow tables follow
the same form, differing only in the values of local segment IDs.)
[root@overcloud-novacompute-0 ml2]# ovs-ofctl -O OpenFlow13 dump-flows br-tun
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0xb13175655506ca2e, duration=11.785s, table=0, n_packets=0, n_bytes=0, priority=1,in_port=1 actions=goto_table:2
cookie=0xb13175655506ca2e, duration=10.955s, table=0, n_packets=0, n_bytes=0, priority=1,in_port=2 actions=goto_table:4
cookie=0xb13175655506ca2e, duration=11.783s, table=0, n_packets=0, n_bytes=0, priority=0 actions=drop
cookie=0xb13175655506ca2e, duration=11.781s, table=2, n_packets=0, n_bytes=0, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=goto_table:20
cookie=0xb13175655506ca2e, duration=11.779s, table=2, n_packets=0, n_bytes=0, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=goto_table:22
cookie=0xb13175655506ca2e, duration=11.778s, table=3, n_packets=0, n_bytes=0, priority=0 actions=drop
cookie=0xb13175655506ca2e, duration=10.677s, table=4, n_packets=0, n_bytes=0, priority=1,tun_id=0x24 actions=push_vlan:0x8100,set_field:4097->vlan_vid,goto_table:10
cookie=0xb13175655506ca2e, duration=11.777s, table=4, n_packets=0, n_bytes=0, priority=0 actions=drop
cookie=0xb13175655506ca2e, duration=11.776s, table=6, n_packets=0, n_bytes=0, priority=0 actions=drop
cookie=0xb13175655506ca2e, duration=11.774s, table=10, n_packets=0, n_bytes=0, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xb13175655506ca2e,OXM_OF_VLAN_VID[],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->OXM_OF_VLAN_VID[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1
cookie=0xb13175655506ca2e, duration=11.772s, table=20, n_packets=0, n_bytes=0, priority=0 actions=goto_table:22
cookie=0xb13175655506ca2e, duration=10.680s, table=22, n_packets=0, n_bytes=0, priority=1,dl_vlan=1 actions=pop_vlan,set_field:0x24->tun_id,output:2
cookie=0xb13175655506ca2e, duration=11.771s, table=22, n_packets=0, n_bytes=0, priority=0 actions=drop
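One way to check whether an agent is affected is to look at which field the learn action in table 10 loads 0 into. The Python sketch below extracts that from a dump-flows line. The function name and regular expression are illustrative only, and the sample line is abbreviated from the table 10 entry above.

```python
import re

# Matches "load:0->FIELD[]" inside a flow's actions (illustrative pattern).
LOAD_ZERO = re.compile(r"load:0->(\w+)\[\]")


def fields_zeroed_by_learn(flow_line):
    """Return the field names that a flow with a learn() action loads 0 into."""
    if "learn(" not in flow_line:
        return []
    return LOAD_ZERO.findall(flow_line)


# Abbreviated copy of the table=10 flow from the native-interface dump.
sample = (
    "table=10, priority=1 actions=learn(table=20,hard_timeout=300,"
    "priority=1,OXM_OF_VLAN_VID[],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],"
    "load:0->OXM_OF_VLAN_VID[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],"
    "output:OXM_OF_IN_PORT[]),output:1"
)

print(fields_zeroed_by_learn(sample))
```

A result naming the VLAN VID field rather than the VLAN TCI field corresponds to the behaviour the fix commit describes.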
On a hunch, the same trials were performed with the openvswitch agents
on the controller and compute nodes configured to use the ovs-ofctl OF
interface. With that setting, ping -s 1422 192.0.2.58 succeeds, as do
ssh to the guests and copies of large amounts of data. The same
tcpdump command shows that the extra 802.1Q header is not present:
#with ofctl instead of native
[root@overcloud-controller-0 ml2]# tcpdump -vvv -n -e -i any udp
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
19:10:31.570425 Out 56:13:19:d8:af:27 ethertype IPv4 (0x0800), length 1512: (tos 0x0, ttl 64, id 22104, offset 0, flags [DF], proto UDP (17), length 1496)
172.16.0.5.51706 > 172.16.0.10.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36
fa:16:3e:99:37:ce > fa:16:3e:06:65:6f, ethertype IPv4 (0x0800), length 1460: (tos 0x0, ttl 63, id 37549, offset 0, flags [DF], proto ICMP (1), length 1446)
192.0.2.1 > 192.168.2.101: ICMP echo request, id 19062, seq 1, length 1426
19:10:31.572143 P ea:0c:37:f7:69:5e ethertype 802.1Q (0x8100), length 1520: vlan 50, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 22370, offset 0, flags [DF], proto UDP (17), length 1500)
172.16.0.10.50337 > 172.16.0.5.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36
The flow table is also different, using strip_vlan instead of pop_vlan
(as well as other obvious differences):
[root@overcloud-novacompute-0 ml2]# ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
cookie=0xb4814c0ff5ea6fd4, duration=2095.101s, table=0, n_packets=115156, n_bytes=8744100, idle_age=546, priority=1,in_port=1 actions=resubmit(,2)
cookie=0xb4814c0ff5ea6fd4, duration=2094.475s, table=0, n_packets=346419, n_bytes=274503223, idle_age=546, priority=1,in_port=2 actions=resubmit(,4)
cookie=0xb4814c0ff5ea6fd4, duration=2095.100s, table=0, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
cookie=0xb4814c0ff5ea6fd4, duration=2095.099s, table=2, n_packets=115155, n_bytes=8744058, idle_age=546, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
cookie=0xb4814c0ff5ea6fd4, duration=2095.099s, table=2, n_packets=1, n_bytes=42, idle_age=1263, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22)
cookie=0xb4814c0ff5ea6fd4, duration=2095.098s, table=3, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
cookie=0xb4814c0ff5ea6fd4, duration=2094.227s, table=4, n_packets=346419, n_bytes=274503223, idle_age=546, priority=1,tun_id=0x24 actions=mod_vlan_vid:1,resubmit(,10)
cookie=0xb4814c0ff5ea6fd4, duration=2095.097s, table=4, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
cookie=0xb4814c0ff5ea6fd4, duration=2095.097s, table=6, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
cookie=0xb4814c0ff5ea6fd4, duration=2095.096s, table=10, n_packets=346419, n_bytes=274503223, idle_age=546, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xb4814c0ff5ea6fd4,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
cookie=0xb4814c0ff5ea6fd4, duration=2095.096s, table=20, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=resubmit(,22)
cookie=0xb4814c0ff5ea6fd4, duration=2094.235s, table=22, n_packets=1, n_bytes=42, idle_age=1263, dl_vlan=1 actions=strip_vlan,set_tunnel:0x24,output:2
cookie=0xb4814c0ff5ea6fd4, duration=2095.086s, table=22, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
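The two learn actions differ in the field they clear: the native interface loads 0 into OXM_OF_VLAN_VID, while the ovs-ofctl interface loads 0 into NXM_OF_VLAN_TCI. The Python sketch below is a simplified model of why that matters, not Open vSwitch code. It assumes the 16-bit TCI layout of 3 priority bits, 1 "tag present" bit (0x1000) and 12 VID bits, and it assumes that writing 0 to the VID touches only the low 12 bits. The helper names are made up for this note.

```python
# Simplified 16-bit VLAN TCI model (assumed layout, see the note above).
TAG_PRESENT = 0x1000  # same value as OFPVID_PRESENT in OpenFlow 1.2+
VID_MASK = 0x0FFF


def describe_tci(tci):
    """Render a TCI value the way tcpdump summarises a tag."""
    if tci & TAG_PRESENT == 0:
        return "untagged"
    return "vlan %d, p %d" % (tci & VID_MASK, (tci >> 13) & 0x7)


def load_zero_into_vid(tci):
    """Model of load:0->OXM_OF_VLAN_VID[]: only the VID bits are cleared."""
    return tci & ~VID_MASK & 0xFFFF


def load_zero_into_tci(tci):
    """Model of load:0->NXM_OF_VLAN_TCI[]: the whole field is cleared."""
    return 0


# 4097 is the value set by set_field:4097->vlan_vid (present bit + VID 1).
tagged = 4097
print(describe_tci(tagged))
print(describe_tci(load_zero_into_vid(tagged)))
print(describe_tci(load_zero_into_tci(tagged)))
```

Under these assumptions, clearing only the VID leaves a tag with VLAN ID 0 and priority 0, which is what the first VXLAN capture shows, while clearing the whole TCI removes the tag.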
System details follow:
System info: CentOS Linux release 7.2.1511 (Core)
Kernel version: 3.10.0-327.28.3.el7.x86_6
The system is a TripleO deployment using a network-isolation environment (see the TripleO docs for details).
Deployment command line:
openstack overcloud deploy --templates ./tripleo-heat-templates
-e ~/tripleo-heat-templates/environments/network-isolation.yaml
-e ~/tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml
-e ~/for_net_isolation.yaml
All templates are stock except for the last, which contains:
parameter_defaults:
EC2MetadataIp: 192.0.2.1
ControlPlaneDefaultRoute: 192.0.2.1
OpenStack packages
openvswitch.x86_64 2.5.0-2.el7 @delorean-newton-testing
openstack-neutron-openvswitch.noarch 1:9.0.0-0.20160907193737.dc6508a.el7.centos @delorean
[root@overcloud-controller-0 ~]# ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.5.0
Compiled Mar 18 2016 15:00:11
DB Schema 7.12.1
[root@overcloud-controller-0 ~]# ovs-ofctl --version
ovs-ofctl (Open vSwitch) 2.5.0
Compiled Mar 18 2016 15:00:11
OpenFlow versions 0x1:0x4
python-ryu-common.noarch 4.3-2.el7 @delorean-newton-testing
python2-ryu.noarch 4.3-2.el7 @delorean-newton-testing
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1622017/+subscriptions