yahoo-eng-team team mailing list archive

[Bug 2031087] Re: ICMPv6 Neighbor Advertisement packets from VM's link-local address dropped by OVS

 

** Changed in: neutron
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2031087

Title:
  ICMPv6 Neighbor Advertisement packets from VM's link-local address
  dropped by OVS

Status in neutron:
  Fix Released

Bug description:
  When a VM transmits an ICMPv6 Neighbor Advertisement (NA) packet from
  its link-local (fe80::/64) address, the packet is dropped by OVS and
  is not forwarded to the external provider network. This causes
  connectivity issues, as the external router is unable to resolve the
  link-layer (MAC) address for the VM's link-local IPv6 address. NA
  packets from the VM's global IPv6 address are forwarded correctly.

  Adding a security group rule such as "Egress,IPv6,Any,Any,::/0" does
  *not* help; the drop rule appears to be built in and cannot be
  overridden. However, disabling port security altogether does make the
  problem go away.

  We are running OpenStack Antelope, neutron 22.0.2 and OVN 23.03. The
  platform is AlmaLinux 9.2 with RDO packages.

  We believe, but are not 100% sure, that this problem may have started
  after upgrading from OVN 22.12. Reverting the upgrade to confirm is
  unfortunately a complicated task, so we would like to avoid that if
  possible.

  Tcpdump can be used to confirm that the packets vanish inside OVS.
  First, on the tap interface connected to the VM, we can see the
  external router (fe80::669d:99ff:fe3a:3d58) transmit NS packets to the
  VM's solicited-node multicast address, and the VM
  (fe80::18:59ff:fe37:204a) respond to each with a unicast NA packet:

  $ sudo tcpdump -i tapb7c872a4-a5 host fe80::669d:99ff:fe3a:3d58 and icmp6
  08:41:24.201970 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
  08:41:24.202004 IP6 fe80::18:59ff:fe37:204a > fe80::669d:99ff:fe3a:3d58: ICMP6, neighbor advertisement, tgt is fe80::18:59ff:fe37:204a, length 32
  08:41:25.366752 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
  08:41:25.366775 IP6 fe80::18:59ff:fe37:204a > fe80::669d:99ff:fe3a:3d58: ICMP6, neighbor advertisement, tgt is fe80::18:59ff:fe37:204a, length 32
  08:41:26.374637 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
  08:41:26.374693 IP6 fe80::18:59ff:fe37:204a > fe80::669d:99ff:fe3a:3d58: ICMP6, neighbor advertisement, tgt is fe80::18:59ff:fe37:204a, length 32
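
  The addresses in this capture can be cross-checked against the MAC
  addresses that appear elsewhere in this report. The sketch below
  (Python, standard library only) assumes that the link-local addresses
  were formed with the modified EUI-64 scheme from RFC 4291, and derives
  the solicited-node multicast address that the NS packets are sent to.
  The function names are illustrative only.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Return the modified EUI-64 link-local address for a MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Interface ID: first three MAC octets, ff:fe, last three MAC octets.
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80" + "00" * 6)  # fe80::/64
    return ipaddress.IPv6Address(prefix + iid)


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)
```

  With the VM's MAC address 02:18:59:37:20:4a this gives
  fe80::18:59ff:fe37:204a and ff02::1:ff37:204a, which matches the
  source and destination addresses seen in the capture above.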

  However, when capturing the same traffic on the external interface
  (bond0), filtered on the provider VLAN tag the network is using, the
  NA packets are no longer there:

  $ sudo tcpdump -i bond0 vlan 882 and host fe80::669d:99ff:fe3a:3d58 and icmp6
  08:41:24.201964 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
  08:41:25.366747 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
  08:41:26.374625 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32

  This explains why there are so many NS packets - the router keeps
  retrying forever.

  Compare this with NA packets from the VM's global address, which work
  as expected:

  $ sudo tcpdump -ni tapb7c872a4-a5 ether host 64:9d:99:3a:3d:58 and icmp6 and net not fe80::/10
  08:56:03.015378 IP6 2a02:c0:200:f012:ffff:0:1:1 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has 2a02:c0:200:f012:18:59ff:fe37:204a, length 32
  08:56:03.015408 IP6 2a02:c0:200:f012:18:59ff:fe37:204a > 2a02:c0:200:f012:ffff:0:1:1: ICMP6, neighbor advertisement, tgt is 2a02:c0:200:f012:18:59ff:fe37:204a, length 32

  $ sudo tcpdump -ni bond0 vlan 882 and ether host 64:9d:99:3a:3d:58 and icmp6 and net not fe80::/10
  08:56:03.015292 IP6 2a02:c0:200:f012:ffff:0:1:1 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has 2a02:c0:200:f012:18:59ff:fe37:204a, length 32
  08:56:03.015539 IP6 2a02:c0:200:f012:18:59ff:fe37:204a > 2a02:c0:200:f012:ffff:0:1:1: ICMP6, neighbor advertisement, tgt is 2a02:c0:200:f012:18:59ff:fe37:204a, length 32

  We can further confirm it by finding an explicit drop rule within OVS:

  $ sudo ovs-appctl dpif/dump-flows br-int | grep drop
  recirc_id(0),in_port(8),eth(src=02:18:59:37:20:4a),eth_type(0x86dd),ipv6(src=fe80::18:59ff:fe37:204a,proto=58,hlimit=255,frag=no),icmpv6(type=136,code=0),nd(target=fe80::18:59ff:fe37:204a,tll=02:18:59:37:20:4a), packets:104766, bytes:9009876, used:0.202s, actions:drop
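
  On a busy hypervisor the datapath flow dump can be long, so it may
  help to pick out the dropped NA flows programmatically. The following
  is a minimal sketch that assumes the line format shown above (an
  icmpv6(type=136,...) key and a trailing "actions:drop"); the function
  name is hypothetical, and the format may differ between OVS versions.

```python
import re


def dropped_na_flows(dump: str):
    """Yield (source address, packet count) for NA flows that are dropped."""
    for line in dump.splitlines():
        line = line.rstrip()
        # Keep only Neighbor Advertisement flows whose sole action is drop.
        if "icmpv6(type=136" not in line or not line.endswith("actions:drop"):
            continue
        src = re.search(r"ipv6\(src=([0-9a-f:]+)", line)
        packets = re.search(r"packets:(\d+)", line)
        if src and packets:
            yield src.group(1), int(packets.group(1))
```

  Passing the output of the dump-flows command above to this function
  should yield one tuple per dropped NA flow, which makes it easy to see
  whether only link-local source addresses are affected.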

  We see that there is a large number of built-in default rules
  pertaining to NA packets:

  $ sudo ovs-ofctl dump-flows br-int | grep -c icmp_type=136
  178

  This is not unexpected, as ICMPv6 ND messages (NS/NA/RS/RA/etc.) are
  essential parts of the IPv6 protocol (like ARP in IPv4) and should not
  be dropped even if the VM is using a "block everything" security
  group. Our assumption is that the logic in these rules is flawed
  somehow, so that they inadvertently end up blocking the NA packets
  from the VM's link-local address.
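
  To narrow down which of these rules might be involved, the matching
  flows can be grouped by OpenFlow table. This is only a sketch: it
  assumes that each line of "ovs-ofctl dump-flows" output carries a
  "table=N" field and the literal "icmp_type=136" that the grep above
  matches on, and the function name is hypothetical.

```python
import re
from collections import Counter


def na_flows_by_table(dump: str) -> Counter:
    """Count flows matching icmp_type=136, keyed by OpenFlow table number."""
    counts = Counter()
    for line in dump.splitlines():
        if "icmp_type=136" not in line:
            continue
        table = re.search(r"\btable=(\d+)", line)
        if table:
            counts[int(table.group(1))] += 1
    return counts
```

  Sorting the result by table number shows where in the pipeline the NA
  handling is concentrated, which can then be compared with the tables
  visited in a trace.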

  We have been unable to reproduce the problem using ofproto/trace,
  probably because it does not allow setting the icmp_type attribute
  for some reason. If we add ",icmp_type=136" to the command line below,
  it fails with "prerequisites not met for setting icmp_type". We have
  no idea what that missing prerequisite could possibly be - any
  suggestions would be greatly appreciated.

  $ sudo ovs-appctl ofproto/trace br-int in_port=161,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,icmp6,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58 
  Flow: icmp6,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0

  bridge("br-int")
  ----------------
   0. in_port=161, priority 100, cookie 0x2f9439aa
      set_field:0x3e->reg13
      set_field:0x3f->reg11
      set_field:0x3d->reg12
      set_field:0x9->metadata
      set_field:0x2->reg14
      resubmit(,8)
   8. metadata=0x9, priority 50, cookie 0x59f248ee
      set_field:0/0x1000->reg10
      resubmit(,73)
      73. ipv6,reg14=0x2,metadata=0x9,dl_src=02:18:59:37:20:4a,ipv6_src=fe80::18:59ff:fe37:204a, priority 90, cookie 0x2f9439aa
              resubmit(,74)
          74. No match.
              drop
      move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
       -> NXM_NX_XXREG0[111] is now 0
      resubmit(,9)
   9. metadata=0x9, priority 0, cookie 0xcc8526d3
      resubmit(,10)
  10. metadata=0x9, priority 0, cookie 0xc47fdc5d
      resubmit(,11)
  11. metadata=0x9, priority 0, cookie 0xddf6f6b9
      resubmit(,12)
  12. ipv6,metadata=0x9, priority 100, cookie 0x26ff06cc
      set_field:0x1000000000000000000000000/0x1000000000000000000000000->xxreg0
      resubmit(,13)
  13. metadata=0x9, priority 0, cookie 0xda44fc0c
      resubmit(,14)
  14. ipv6,reg0=0x1/0x1,metadata=0x9, priority 100, cookie 0xe977b8b8
      ct(table=15,zone=NXM_NX_REG13[0..15])
      drop
       -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 15.
       -> Sets the packet to an untracked state, and clears all the conntrack fields.

  Final flow: icmp6,reg0=0x1,reg11=0x3f,reg12=0x3d,reg13=0x3e,reg14=0x2,metadata=0x9,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0
  Megaflow: recirc_id=0,eth,icmp6,in_port=161,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,nw_ttl=0,nw_frag=no,icmp_type=0x0/0x80,nd_target=::,nd_tll=00:00:00:00:00:00
  Datapath actions: ct(zone=62),recirc(0x23b8)

  ===============================================================================
  recirc(0x23b8) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
  ===============================================================================

  Flow:
  recirc_id=0x23b8,ct_state=new|trk,ct_zone=62,eth,icmp6,reg0=0x1,reg11=0x3f,reg12=0x3d,reg13=0x3e,reg14=0x2,metadata=0x9,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0

  bridge("br-int")
  ----------------
      thaw
          Resuming from table 15
  15. ct_state=+new-est+trk,metadata=0x9, priority 7, cookie 0x94acb803
      set_field:0x80000000000000000000000000/0x80000000000000000000000000->xxreg0
      set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
      resubmit(,16)
  16. ipv6,reg0=0x80/0x80,reg14=0x2,metadata=0x9, priority 2002, cookie 0x4cdb3154
      set_field:0x2000000000000000000000000/0x2000000000000000000000000->xxreg0
      resubmit(,17)
  17. metadata=0x9, priority 0, cookie 0x77e302aa
      resubmit(,18)
  18. metadata=0x9, priority 0, cookie 0x97ee4db3
      resubmit(,19)
  19. metadata=0x9, priority 0, cookie 0x6b46ef3d
      resubmit(,20)
  20. metadata=0x9, priority 0, cookie 0x238074d5
      resubmit(,21)
  21. metadata=0x9, priority 0, cookie 0x4b2f00cb
      resubmit(,22)
  22. metadata=0x9, priority 0, cookie 0x1de1893e
      resubmit(,23)
  23. metadata=0x9, priority 0, cookie 0x1b7c54a9
      resubmit(,24)
  24. metadata=0x9, priority 0, cookie 0x91b808bf
      resubmit(,25)
  25. metadata=0x9, priority 0, cookie 0x827a7c62
      resubmit(,26)
  26. ipv6,reg0=0x2/0x2002,metadata=0x9, priority 100, cookie 0xf51cd562
      ct(commit,zone=NXM_NX_REG13[0..15],nat(src),exec(set_field:0/0x1->ct_mark))
      nat(src)
      set_field:0/0x1->ct_mark
       -> Sets the packet to an untracked state, and clears all the conntrack fields.
      resubmit(,27)
  27. metadata=0x9, priority 0, cookie 0xe9561f7f
      resubmit(,28)
  28. metadata=0x9, priority 0, cookie 0x426dc5bb
      resubmit(,29)
  29. metadata=0x9, priority 0, cookie 0xeab289c
      resubmit(,30)
  30. metadata=0x9, priority 0, cookie 0x620602c5
      resubmit(,31)
  31. metadata=0x9, priority 0, cookie 0x5504e379
      resubmit(,32)
  32. metadata=0x9, priority 0, cookie 0x5e1c22f5
      resubmit(,33)
  33. metadata=0x9, priority 0, cookie 0x8233a381
      set_field:0->reg15
      resubmit(,71)
      71. No match.
              drop
      resubmit(,34)
  34. reg15=0,metadata=0x9, priority 50, cookie 0x2dc6c0b8
      set_field:0x8001->reg15
      resubmit(,37)
  37. priority 0
      resubmit(,39)
  39. priority 0
      resubmit(,40)
  40. reg15=0x8001,metadata=0x9, priority 100, cookie 0xa23e45f
      set_field:0x3->reg13
      set_field:0x1->reg15
      resubmit(,41)
      41. priority 0
              set_field:0->reg0
              set_field:0->reg1
              set_field:0->reg2
              set_field:0->reg3
              set_field:0->reg4
              set_field:0->reg5
              set_field:0->reg6
              set_field:0->reg7
              set_field:0->reg8
              set_field:0->reg9
              resubmit(,42)
          42. ipv6,reg15=0x1,metadata=0x9, priority 110, cookie 0x6ae0a674
              resubmit(,43)
          43. ipv6,reg15=0x1,metadata=0x9, priority 110, cookie 0x9147caee
              resubmit(,44)
          44. metadata=0x9, priority 0, cookie 0xcbd84a69
              resubmit(,45)
          45. ct_state=-trk,metadata=0x9, priority 5, cookie 0xec86b1c8
              set_field:0x100000000000000000000000000/0x100000000000000000000000000->xxreg0
              set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
              resubmit(,46)
          46. metadata=0x9, priority 0, cookie 0x9ae00a32
              resubmit(,47)
          47. metadata=0x9, priority 0, cookie 0x98ca16da
              resubmit(,48)
          48. metadata=0x9, priority 0, cookie 0x7eb5b6c5
              resubmit(,49)
          49. metadata=0x9, priority 0, cookie 0x149995b7
              resubmit(,50)
          50. metadata=0x9, priority 0, cookie 0x9158534f
              set_field:0/0x1000->reg10
              resubmit(,75)
              75. No match.
                      drop
              move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
               -> NXM_NX_XXREG0[111] is now 0
              resubmit(,51)
          51. metadata=0x9, priority 0, cookie 0xb046f48c
              resubmit(,64)
          64. priority 0
              resubmit(,65)
          65. reg15=0x1,metadata=0x9, priority 100, cookie 0xfed4d5d9
              push_vlan:0x8100
              set_field:4978->vlan_vid
              output:69

              bridge("br-ex")
              ---------------
                   0. priority 0
                      NORMAL
                       -> forwarding to learned port
              pop_vlan
      set_field:0x8001->reg15

  Final flow: recirc_id=0x23b8,eth,icmp6,reg0=0x300,reg11=0x3f,reg12=0x3d,reg13=0x3,reg14=0x2,reg15=0x8001,metadata=0x9,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0
  Megaflow: recirc_id=0x23b8,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x1,eth,icmp6,in_port=161,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::/10,ipv6_dst=fe80::669d:99ff:fe3a:3d58,nw_ttl=0,nw_frag=no,icmp_type=0x0/0x80
  Datapath actions: ct(commit,zone=62,mark=0/0x1,nat(src)),push_vlan(vid=882,pcp=0),2
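
  Note that the trace above ran with icmp_type=0 and nw_ttl=0, whereas
  the dropped datapath flow matches type=136 and hlimit=255, so the
  trace did not exercise the NA-specific rules. One possible
  explanation for the "prerequisites" error, not confirmed in this
  report: ovs-fields(7) documents icmp_type as the ICMPv4 field, with
  icmpv6_type, icmpv6_code, nd_target and nd_tll as the ICMPv6 and ND
  counterparts. Under that assumption, the sketch below builds a flow
  string that describes an actual NA packet. The function name is
  hypothetical, and whether ofproto/trace accepts this exact string on
  OVS as shipped with OVN 23.03 is untested here.

```python
def na_trace_flow(in_port, vm_mac, router_mac, vm_ll, router_ll):
    """Build an ofproto/trace flow string describing a unicast NA packet."""
    fields = [
        f"in_port={in_port}",
        f"dl_src={vm_mac}",
        f"dl_dst={router_mac}",
        "icmp6",
        f"ipv6_src={vm_ll}",
        f"ipv6_dst={router_ll}",
        "nw_ttl=255",       # ND packets are sent with hop limit 255 (RFC 4861)
        "icmpv6_type=136",  # assumption: ICMPv6 field name from ovs-fields(7)
        "icmpv6_code=0",
        # The ND fields come after the type, since they depend on it.
        f"nd_target={vm_ll}",
        f"nd_tll={vm_mac}",
    ]
    return ",".join(fields)


flow = na_trace_flow(
    161,
    "02:18:59:37:20:4a",
    "64:9d:99:3a:3d:58",
    "fe80::18:59ff:fe37:204a",
    "fe80::669d:99ff:fe3a:3d58",
)
print(f"sudo ovs-appctl ofproto/trace br-int {flow}")
```

  If the resulting command is accepted, the trace should at least
  follow the same match fields as the datapath flow that ends in
  "actions:drop", which would make the two directly comparable.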
