yahoo-eng-team team mailing list archive

[Bug 1445089] Re: allowed-address-pairs broken with l2pop/arp responder and LinuxBridge/VXLAN

 

Reviewed:  https://review.openstack.org/278597
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bbd881f3a970143e1954cb277e5235dddd26c5d0
Submitter: Jenkins
Branch:    master

commit bbd881f3a970143e1954cb277e5235dddd26c5d0
Author: Mark McClain <mark@xxxxxxxxxxx>
Date:   Wed Feb 10 13:28:21 2016 -0500

    add arp_responder flag to linuxbridge agent
    
    When the ARP responder is enabled, secondary IP addresses explicitly
    allowed via the allowed-address-pairs extension do not resolve.
    This change adds the ability to enable the local ARP responder similar
    to the feature in the OVS agent.  This change disables local ARP
    responses by default, so ARP traffic will be sent over the overlay.
    
    DocImpact
    UpgradeImpact
    
    Change-Id: I5da4afa44fc94032880ea59ec574df504470fb4a
    Closes-Bug: 1445089
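
For illustration, the flag described in the commit message would be set in the Linux bridge agent's configuration. This is a minimal sketch, not taken from the commit itself: it assumes the option is named arp_responder and sits in the [vxlan] section next to l2_population, and the file name (for example linuxbridge_agent.ini) varies by release and distribution, so check both against the release in use.

```ini
[vxlan]
enable_vxlan = True
l2_population = True
# Default after this change: ARP requests are sent over the overlay,
# so addresses added via allowed-address-pairs can still be resolved.
arp_responder = False
```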


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1445089

Title:
  allowed-address-pairs broken with l2pop/arp responder and
  LinuxBridge/VXLAN

Status in neutron:
  Fix Released

Bug description:
  Problem:

  In Icehouse/Juno, when using ML2/LinuxBridge and VXLAN networks,
  allowed-address-pairs functionality is broken. It appears to be a case
  where the node drops broadcast traffic (ff:ff:ff:ff:ff:ff),
  specifically ARP requests, from an instance.

  Steps to reproduce:

  1. Create two instances in the same VXLAN network on two different hosts
  2. Add a secondary IP address to instance #1, and add it to the port using --allowed-address-pairs
  3. Ping from instance #1 to instance #2 using the secondary IP address
  4. On the compute node hosting instance #2, observe that the ARP request can be seen on the vxlan interface, but not the parent interface

  Steps to resolve:

  1. Add static ARP entry to instance #2 
  2. -OR- Add static ARP entry/neighbor entry to compute node hosting instance #2

  The resolutions above become problematic when the allowed addresses
  are networks rather than single IPs, as in the cases where instances
  are acting as routers or NFV devices of some kind.
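
To make the scaling concern concrete, here is a small sketch that expands one allowed-address-pairs entry into the static ARP commands the workaround would need. The helper name static_arp_commands and the 10.0.0.0/24 range are hypothetical, chosen only for illustration; the command format mirrors the arp -s workaround shown later in this report.

```python
import ipaddress


def static_arp_commands(allowed_cidr, mac, device):
    """Return one `arp -s` command per address covered by an allowed entry."""
    network = ipaddress.ip_network(allowed_cidr)
    # Iterating a network yields every address in it, including the
    # network and broadcast addresses; a /32 yields exactly one address.
    return [f"arp -s {ip} {mac} -i {device}" for ip in network]


# A single shared IP needs one entry per remote compute node ...
single = static_arp_commands("192.168.100.254/32", "fa:16:3e:bf:b0:a1", "vxlan-2")
# ... while a routed range (hypothetical 10.0.0.0/24) needs one per address.
routed = static_arp_commands("10.0.0.0/24", "fa:16:3e:bf:b0:a1", "vxlan-2")
print(len(single), len(routed))
```

Each of these entries would also have to be kept in sync by hand whenever the port's MAC address or host changes.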

  -------------------

  Example:

  Create network:
  neutron net-create testnet
  neutron subnet-create testnet 192.168.100.0/24

  Create ports, one for each instance:
  neutron port-create 56c413ca-6ef1-45c8-a3e5-6241ad24bb23
  neutron port-create 56c413ca-6ef1-45c8-a3e5-6241ad24bb23

  Add security group and allowed-address-pairs to each port (IP to be shared):
  neutron port-update 6d6796cd-455f-4b48-9e1a-8316bd336aa4 --security-group 378e3851-ae7f-40b3-94e3-c05cad5cb56b --allowed-address-pairs type=dict list=true ip_address=192.168.100.254
  neutron port-update 0715121b-4cc8-4437-8840-aa74be619c2e --security-group 378e3851-ae7f-40b3-94e3-c05cad5cb56b --allowed-address-pairs type=dict list=true ip_address=192.168.100.254

  Boot instances:
  nova boot --flavor 2 --image 0af87835-f50f-4461-abaa-b6f088c64744 --nic port-id=6d6796cd-455f-4b48-9e1a-8316bd336aa4 --key_name rpc_support --availability-zone nova:626976-Compute001 20150331-COMP1-TEST
  nova boot --flavor 2 --image 0af87835-f50f-4461-abaa-b6f088c64744 --nic port-id=0715121b-4cc8-4437-8840-aa74be619c2e --key_name rpc_support --availability-zone nova:626977-Compute002 20150331-COMP2-TEST

  Observe that the proper iptables rules are in place on the compute
  nodes:

  root@Compute001:~# iptables-save | grep 6d6796cd
  -A neutron-linuxbri-s6d6796cd-4 -s 192.168.100.254/32 -m mac --mac-source FA:16:3E:BF:B0:A1 -j RETURN
  -A neutron-linuxbri-s6d6796cd-4 -s 192.168.100.5/32 -m mac --mac-source FA:16:3E:BF:B0:A1 -j RETURN
  -A neutron-linuxbri-s6d6796cd-4 -j DROP

  root@Compute002:~# iptables-save | grep 0715121b
  -A neutron-linuxbri-s0715121b-4 -s 192.168.100.254/32 -m mac --mac-source FA:16:3E:1C:9D:55 -j RETURN
  -A neutron-linuxbri-s0715121b-4 -s 192.168.100.6/32 -m mac --mac-source FA:16:3E:1C:9D:55 -j RETURN
  -A neutron-linuxbri-s0715121b-4 -j DROP
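
The check above can also be done mechanically. The following sketch extracts the (source address, MAC) pairs that a port's anti-spoofing chain accepts from iptables-save output; the helper name allowed_sources is hypothetical and the regular expression only covers the rule layout shown in this report.

```python
import re

# Matches the per-port anti-spoofing rules shown above: source IP plus source MAC.
RULE = re.compile(
    r"-s (?P<ip>\S+) -m mac --mac-source (?P<mac>\S+) -j RETURN"
)


def allowed_sources(iptables_save_output):
    """Return (source CIDR, lower-cased MAC) pairs accepted by the chain."""
    pairs = []
    for line in iptables_save_output.splitlines():
        match = RULE.search(line)
        if match:
            pairs.append((match.group("ip"), match.group("mac").lower()))
    return pairs


SAMPLE = """\
-A neutron-linuxbri-s6d6796cd-4 -s 192.168.100.254/32 -m mac --mac-source FA:16:3E:BF:B0:A1 -j RETURN
-A neutron-linuxbri-s6d6796cd-4 -s 192.168.100.5/32 -m mac --mac-source FA:16:3E:BF:B0:A1 -j RETURN
-A neutron-linuxbri-s6d6796cd-4 -j DROP
"""

for cidr, mac in allowed_sources(SAMPLE):
    print(cidr, mac)
```

Both the fixed IP and the shared address appear with the port's MAC, which suggests the security group layer is not what blocks the traffic.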

  Verify that ARP entries exist on the compute nodes (instances can ping
  each other at fixed IP as expected):

  root@Compute001:~# arp -an | grep 192.168.100
  ? (192.168.100.4) at fa:16:3e:4d:73:7b [ether] PERM on vxlan-2
  ? (192.168.100.6) at fa:16:3e:1c:9d:55 [ether] PERM on vxlan-2
  ? (192.168.100.2) at fa:16:3e:d4:53:75 [ether] PERM on vxlan-2
  ? (192.168.100.3) at fa:16:3e:a6:a4:03 [ether] PERM on vxlan-2

  root@Compute002:~# arp -an | grep 192.168.100
  ? (192.168.100.3) at fa:16:3e:a6:a4:03 [ether] PERM on vxlan-2
  ? (192.168.100.4) at fa:16:3e:4d:73:7b [ether] PERM on vxlan-2
  ? (192.168.100.2) at fa:16:3e:d4:53:75 [ether] PERM on vxlan-2
  ? (192.168.100.5) at fa:16:3e:bf:b0:a1 [ether] PERM on vxlan-2
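
Neither table above contains an entry for 192.168.100.254, only for the fixed IPs. As a rough sketch, the same comparison can be scripted by parsing arp -an output and listing the addresses that have no entry on the VXLAN device. The helper names are hypothetical, and the parser assumes the net-tools output format shown in this report.

```python
import re

# One line of `arp -an` output, e.g.
# "? (192.168.100.5) at fa:16:3e:bf:b0:a1 [ether] PERM on vxlan-2"
ENTRY = re.compile(
    r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-f:]+) \[ether\]\s+"
    r"(?:(?P<flag>\S+)\s+)?on\s+(?P<dev>\S+)"
)


def neighbor_table(arp_output, device):
    """Map IP address to MAC for the entries bound to one device."""
    table = {}
    for line in arp_output.splitlines():
        match = ENTRY.search(line)
        if match and match.group("dev") == device:
            table[match.group("ip")] = match.group("mac")
    return table


def missing_entries(arp_output, device, wanted_ips):
    """Return the wanted addresses that have no entry on the device."""
    table = neighbor_table(arp_output, device)
    return [ip for ip in wanted_ips if ip not in table]


SAMPLE = """\
? (192.168.100.3) at fa:16:3e:a6:a4:03 [ether] PERM on vxlan-2
? (192.168.100.4) at fa:16:3e:4d:73:7b [ether] PERM on vxlan-2
? (192.168.100.2) at fa:16:3e:d4:53:75 [ether] PERM on vxlan-2
? (192.168.100.5) at fa:16:3e:bf:b0:a1 [ether] PERM on vxlan-2
"""

print(missing_entries(SAMPLE, "vxlan-2", ["192.168.100.5", "192.168.100.254"]))
```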

  !!!!! TEST !!!!!

  Test: Configure 192.168.100.254 as a secondary address on INSTANCE#1
  and ping out to INSTANCE#2

  root@20150331-comp1-test:~# ip a a 192.168.100.254/32 dev eth0

  root@20150331-comp1-test:~# ping -I 192.168.100.254 192.168.100.6
  PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.
  ^C
  --- 192.168.100.6 ping statistics ---
  26 packets transmitted, 0 received, 100% packet loss, time 25200ms

  Result: Failure to reach destination

  !!!!! TROUBLESHOOT !!!!!

  Process: 
  1. Start ping:

  root@20150331-comp1-test:~# ping -I 192.168.100.254 192.168.100.6
  PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.

  2. Dump on vxlan interface on local compute node:

  root@Compute001:~# tcpdump -i vxlan-2 -ne
  tcpdump: WARNING: vxlan-2: no IPv4 address assigned
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on vxlan-2, link-type EN10MB (Ethernet), capture size 65535 bytes
  14:22:06.595700 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 28, length 64
  14:22:07.603721 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 29, length 64
  14:22:08.611701 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 30, length 64
  14:22:09.619712 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 31, length 64

  3. Dump on parent interface of local compute node:

  root@Compute001:~# tcpdump -i bond1.206 -ne
  tcpdump: WARNING: bond1.206: no IPv4 address assigned
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on bond1.206, link-type EN10MB (Ethernet), capture size 65535 bytes
  14:31:15.655396 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 4, length 64
  14:31:16.663468 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 5, length 64
  14:31:17.671412 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 6, length 64
  14:31:18.679443 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 7, length 64
  14:31:19.687445 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 8, length 64
  ^C

  NOTE: ICMP requests are being sent to 192.168.100.6 from
  192.168.100.254 with no response.

  4. Dump on parent interface on remote compute node:

  root@Compute002:~# tcpdump -i bond1.206 -ne
  tcpdump: WARNING: bond1.206: no IPv4 address assigned
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on bond1.206, link-type EN10MB (Ethernet), capture size 65535 bytes
  14:27:12.889311 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 333, length 64
  14:27:13.889318 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 334, length 64
  14:27:14.889392 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 335, length 64
  14:27:15.889315 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 336, length 64
  14:27:16.889357 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
  fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 337, length 64

  5. Dump on bridge interface on remote compute node:

  root@Compute002:~# tcpdump -i brq56c413ca-6e -ne
  tcpdump: WARNING: brq56c413ca-6e: no IPv4 address assigned
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on brq56c413ca-6e, link-type EN10MB (Ethernet), capture size 65535 bytes
  14:34:00.950062 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:34:00.969137 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 168, length 64
  14:34:01.977167 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 169, length 64
  14:34:01.977443 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:34:02.974092 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:34:02.985166 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 170, length 64
  14:34:03.974131 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:34:03.993172 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 171, length 64
  14:34:05.001197 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 172, length 64
  14:34:05.001449 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:34:05.998187 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:34:06.009204 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 173, length 64

  6. Dump on vxlan interface on remote compute node:

  root@Compute002:~# tcpdump -i vxlan-2 -ne
  tcpdump: WARNING: vxlan-2: no IPv4 address assigned
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on vxlan-2, link-type EN10MB (Ethernet), capture size 65535 bytes
  14:23:04.052320 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 85, length 64
  14:23:04.052704 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:23:05.049944 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:23:05.060333 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 86, length 64
  14:23:06.049961 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:23:06.068312 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 87, length 64
  14:23:07.076355 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 88, length 64
  14:23:07.076655 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:23:08.074033 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
  14:23:08.084299 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 89, length 64

  NOTE: The remote instance is sending ARP requests for the source
  address but is getting no response. In fact, the requests appear to
  be dropped between vxlan-2 and its parent interface, bond1.206.
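
The comparison between captures can be sketched as a small filter that pulls broadcast ARP requests out of tcpdump -ne output. The function name arp_requests is hypothetical, and the sample lines are copied from the Compute002 captures above; the pattern only handles the line format shown in this report.

```python
import re

# Broadcast ARP request as printed by `tcpdump -ne`.
ARP_REQUEST = re.compile(
    r"(?P<src>[0-9a-f:]{17}) > ff:ff:ff:ff:ff:ff, ethertype ARP \(0x0806\), "
    r"length \d+: Request who-has (?P<target>\S+) tell (?P<asker>[^,\s]+)"
)


def arp_requests(capture):
    """Return (source MAC, target IP, asking IP) for each ARP request line."""
    return [
        (m.group("src"), m.group("target"), m.group("asker"))
        for m in map(ARP_REQUEST.search, capture.splitlines())
        if m
    ]


VXLAN_CAPTURE = """\
14:23:04.052320 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 85, length 64
14:23:04.052704 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
"""

PARENT_CAPTURE = """\
14:27:12.889311 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 333, length 64
"""

print("vxlan-2:", arp_requests(VXLAN_CAPTURE))
print("bond1.206:", arp_requests(PARENT_CAPTURE))
```

Running it over the two samples returns the who-has request for the vxlan-2 capture and nothing for bond1.206, matching the observation in the note.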

  !!!!! GETTING IT TO WORK !!!!!

  1a. Add an ARP entry on instance #2

  arp -s 192.168.100.254 fa:16:3e:bf:b0:a1

  Result: Success!

  root@20150331-comp1-test:~# ping -I 192.168.100.254 192.168.100.6
  PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.

  
  64 bytes from 192.168.100.6: icmp_seq=455 ttl=64 time=2014 ms
  64 bytes from 192.168.100.6: icmp_seq=456 ttl=64 time=1014 ms
  64 bytes from 192.168.100.6: icmp_seq=457 ttl=64 time=14.9 ms
  64 bytes from 192.168.100.6: icmp_seq=458 ttl=64 time=0.939 ms

  1b. -OR- Add an ARP entry on Compute002

  arp -s 192.168.100.254 fa:16:3e:bf:b0:a1 -i vxlan-2

  Result: Success!

  root@20150331-comp1-test:~# ping -I 192.168.100.254 192.168.100.6
  PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.
  64 bytes from 192.168.100.6: icmp_seq=543 ttl=64 time=1.17 ms
  64 bytes from 192.168.100.6: icmp_seq=544 ttl=64 time=0.812 ms
  64 bytes from 192.168.100.6: icmp_seq=545 ttl=64 time=0.819 ms
  64 bytes from 192.168.100.6: icmp_seq=546 ttl=64 time=0.810 ms
  64 bytes from 192.168.100.6: icmp_seq=547 ttl=64 time=0.794 ms
  64 bytes from 192.168.100.6: icmp_seq=548 ttl=64 time=0.820 ms

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1445089/+subscriptions

