yahoo-eng-team team mailing list archive
Message #93521
[Bug 2031087] Re: ICMPv6 Neighbor Advertisement packets from VM's link-local address dropped by OVS
** Changed in: neutron
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2031087
Title:
ICMPv6 Neighbor Advertisement packets from VM's link-local address
dropped by OVS
Status in neutron:
Fix Released
Bug description:
When a VM transmits an ICMPv6 Neighbor Advertisement (NA) packet from
its link-local (fe80::/64) address, the packet is dropped by OVS and is
not forwarded to the external provider network. This causes
connectivity issues, because the external router is unable to resolve
the link-layer (MAC) address for the VM's link-local IPv6 address. NA
packets from the VM's global IPv6 address are forwarded correctly.
Adding a security group rule such as "Egress,IPv6,Any,Any,::/0" does
*not* help; the drop rule appears to be built in and cannot be
overridden. However, disabling port security altogether does make the
problem go away.
We are running OpenStack Antelope, neutron 22.0.2 and OVN 23.03.
Platform is AlmaLinux 9.2, RDO packages.
We believe, but are not 100% sure, that this problem may have started
after upgrading from OVN 22.12. Reverting the upgrade to confirm is
unfortunately a complicated task, so we would like to avoid that if
possible.
Tcpdump can be used to confirm that the packets vanish inside OVS.
First, on the tap interface connected to the VM, we can see the
external router (fe80::669d:99ff:fe3a:3d58) transmit NS packets to the
VM's solicited-node multicast address, and the VM
(fe80::18:59ff:fe37:204a) respond with a unicast NA packet:
$ sudo tcpdump -i tapb7c872a4-a5 host fe80::669d:99ff:fe3a:3d58 and icmp6
08:41:24.201970 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
08:41:24.202004 IP6 fe80::18:59ff:fe37:204a > fe80::669d:99ff:fe3a:3d58: ICMP6, neighbor advertisement, tgt is fe80::18:59ff:fe37:204a, length 32
08:41:25.366752 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
08:41:25.366775 IP6 fe80::18:59ff:fe37:204a > fe80::669d:99ff:fe3a:3d58: ICMP6, neighbor advertisement, tgt is fe80::18:59ff:fe37:204a, length 32
08:41:26.374637 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
08:41:26.374693 IP6 fe80::18:59ff:fe37:204a > fe80::669d:99ff:fe3a:3d58: ICMP6, neighbor advertisement, tgt is fe80::18:59ff:fe37:204a, length 32
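As a side note on the destination of the NS packets above: the solicited-node multicast address is formed by appending the low 24 bits of the target address to ff02::1:ff00:0/104 (RFC 4291). The sketch below is only an illustration of that mapping; the function name is made up for this example and is not part of any OVS or neutron tooling.

```python
import ipaddress

def solicited_node_multicast(addr):
    """Map a unicast IPv6 address to its solicited-node multicast address."""
    unicast = int(ipaddress.IPv6Address(addr))
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    # Keep only the low 24 bits of the unicast address.
    return str(ipaddress.IPv6Address(prefix | (unicast & 0xFFFFFF)))

# The VM's link-local address from the capture above.
print(solicited_node_multicast("fe80::18:59ff:fe37:204a"))  # ff02::1:ff37:204a
```

The VM's global address (2a02:c0:200:f012:18:59ff:fe37:204a) shares the same low 24 bits, so it maps to the same multicast group.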
However, when capturing the same traffic on the external interface
(bond0), on the provider VLAN the network is using, the NA packets
are no longer there:
$ sudo tcpdump -i bond0 vlan 882 and host fe80::669d:99ff:fe3a:3d58 and icmp6
08:41:24.201964 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
08:41:25.366747 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
08:41:26.374625 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
This explains why there are so many NS packets: the router never
receives an answer, so it keeps retrying forever.
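The comparison between the two captures can also be done mechanically. The following is a rough sketch, assuming tcpdump's default one-line text output as shown above; the helper names (count_nd, lost_between) and the sample constants are invented for this illustration, and the sample lines are copied from the first exchange in each capture.

```python
import re
from collections import Counter

# Matches lines such as:
# "08:41:24.202004 IP6 <src> > <dst>: ICMP6, neighbor advertisement, ..."
ND_LINE = re.compile(
    r"IP6 (?P<src>\S+) > (?P<dst>\S+): ICMP6, "
    r"neighbor (?P<kind>solicitation|advertisement)"
)

def count_nd(capture):
    """Count NS/NA packets per (source address, message kind)."""
    counts = Counter()
    for line in capture.splitlines():
        m = ND_LINE.search(line)
        if m:
            counts[(m.group("src"), m.group("kind"))] += 1
    return counts

def lost_between(inner, outer):
    """Packets seen in the inner capture but missing from the outer one."""
    return count_nd(inner) - count_nd(outer)

TAP = """\
08:41:24.201970 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
08:41:24.202004 IP6 fe80::18:59ff:fe37:204a > fe80::669d:99ff:fe3a:3d58: ICMP6, neighbor advertisement, tgt is fe80::18:59ff:fe37:204a, length 32
"""
BOND0 = """\
08:41:24.201964 IP6 fe80::669d:99ff:fe3a:3d58 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has fe80::18:59ff:fe37:204a, length 32
"""

print(lost_between(TAP, BOND0))
```

With the sample lines, the only entry left over is the NA sent from the VM's link-local address, which matches what is seen by eye.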
Compare this with NA packets from the VM's global address, which work
as expected:
$ sudo tcpdump -ni tapb7c872a4-a5 ether host 64:9d:99:3a:3d:58 and icmp6 and net not fe80::/10
08:56:03.015378 IP6 2a02:c0:200:f012:ffff:0:1:1 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has 2a02:c0:200:f012:18:59ff:fe37:204a, length 32
08:56:03.015408 IP6 2a02:c0:200:f012:18:59ff:fe37:204a > 2a02:c0:200:f012:ffff:0:1:1: ICMP6, neighbor advertisement, tgt is 2a02:c0:200:f012:18:59ff:fe37:204a, length 32
$ sudo tcpdump -ni bond0 vlan 882 and ether host 64:9d:99:3a:3d:58 and icmp6 and net not fe80::/10
08:56:03.015292 IP6 2a02:c0:200:f012:ffff:0:1:1 > ff02::1:ff37:204a: ICMP6, neighbor solicitation, who has 2a02:c0:200:f012:18:59ff:fe37:204a, length 32
08:56:03.015539 IP6 2a02:c0:200:f012:18:59ff:fe37:204a > 2a02:c0:200:f012:ffff:0:1:1: ICMP6, neighbor advertisement, tgt is 2a02:c0:200:f012:18:59ff:fe37:204a, length 32
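Note that the two captures above match the router by MAC address (64:9d:99:3a:3d:58), while the earlier ones match it by link-local address (fe80::669d:99ff:fe3a:3d58). The addresses in this report are consistent with modified EUI-64 interface identifiers (RFC 4291, Appendix A), although that is an assumption and not something the report states. A minimal sketch of the derivation, with a made-up function name:

```python
import ipaddress

def eui64_link_local(mac):
    """Derive the fe80::/64 address a modified EUI-64 host would use."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC address.
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    packed = bytes.fromhex("fe80") + bytes(6) + iid
    return str(ipaddress.IPv6Address(packed))

print(eui64_link_local("64:9d:99:3a:3d:58"))  # fe80::669d:99ff:fe3a:3d58
print(eui64_link_local("02:18:59:37:20:4a"))  # fe80::18:59ff:fe37:204a
```

Both results agree with the router and VM link-local addresses seen in the captures.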
We can further confirm it by finding an explicit drop rule within OVS:
$ sudo ovs-appctl dpif/dump-flows br-int | grep drop
recirc_id(0),in_port(8),eth(src=02:18:59:37:20:4a),eth_type(0x86dd),ipv6(src=fe80::18:59ff:fe37:204a,proto=58,hlimit=255,frag=no),icmpv6(type=136,code=0),nd(target=fe80::18:59ff:fe37:204a,tll=02:18:59:37:20:4a), packets:104766, bytes:9009876, used:0.202s, actions:drop
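For anyone wanting to check for this kind of flow on many hypervisors, a line like the one above can be picked apart with a small script. This is only a sketch: the function names are invented here, and the regular expression handles only flat key(...) groups as in this particular line, not nested actions such as ct(...,nat(src)).

```python
import re

NA_DROP = (
    "recirc_id(0),in_port(8),eth(src=02:18:59:37:20:4a),eth_type(0x86dd),"
    "ipv6(src=fe80::18:59ff:fe37:204a,proto=58,hlimit=255,frag=no),"
    "icmpv6(type=136,code=0),"
    "nd(target=fe80::18:59ff:fe37:204a,tll=02:18:59:37:20:4a), "
    "packets:104766, bytes:9009876, used:0.202s, actions:drop"
)

def parse_dp_flow(line):
    """Split a datapath flow line into match groups, packet count and actions."""
    fields = dict(re.findall(r"(\w+)\(([^()]*)\)", line))
    packets = int(re.search(r"packets:(\d+)", line).group(1))
    actions = re.search(r"actions:(.*)$", line).group(1).strip()
    return fields, packets, actions

def is_dropped_na(line):
    """True if the flow drops ICMPv6 type 136 (Neighbor Advertisement)."""
    fields, _packets, actions = parse_dp_flow(line)
    icmp = dict(kv.split("=", 1) for kv in fields.get("icmpv6", "").split(",") if "=" in kv)
    return actions == "drop" and icmp.get("type") == "136"

fields, packets, actions = parse_dp_flow(NA_DROP)
print(is_dropped_na(NA_DROP), packets, fields["nd"])
```

The packet counter (104766 here) is a quick way to see that the flow is hit continuously rather than once.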
We see that there are a ton of built-in default rules pertaining to NA
packets:
$ sudo ovs-ofctl dump-flows br-int | grep -c icmp_type=136
178
This is not unexpected, as ICMPv6 ND messages (NS/NA/RS/RA, etc.) are
essential parts of the IPv6 protocol (like ARP in IPv4) and should not
be dropped even if the VM is using a "block everything" security group.
Our assumption is that the logic in these rules is flawed somehow, so
that they inadvertently end up blocking the NA packets from the VM's
link-local address.
We have been unable to reproduce the problem using ofproto/trace,
probably because it does not allow setting the icmp_type attribute for
some reason. If we add ",icmp_type=136" to the command line below, it
fails with "prerequisites not met for setting icmp_type". We have no
idea what that missing prerequisite could possibly be - any
suggestions would be greatly appreciated. Without it, the trace below
runs with icmp_type=0 and the packet is forwarded to br-ex rather than
dropped.
$ sudo ovs-appctl ofproto/trace br-int in_port=161,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,icmp6,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58
Flow: icmp6,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0
bridge("br-int")
----------------
0. in_port=161, priority 100, cookie 0x2f9439aa
set_field:0x3e->reg13
set_field:0x3f->reg11
set_field:0x3d->reg12
set_field:0x9->metadata
set_field:0x2->reg14
resubmit(,8)
8. metadata=0x9, priority 50, cookie 0x59f248ee
set_field:0/0x1000->reg10
resubmit(,73)
73. ipv6,reg14=0x2,metadata=0x9,dl_src=02:18:59:37:20:4a,ipv6_src=fe80::18:59ff:fe37:204a, priority 90, cookie 0x2f9439aa
resubmit(,74)
74. No match.
drop
move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
-> NXM_NX_XXREG0[111] is now 0
resubmit(,9)
9. metadata=0x9, priority 0, cookie 0xcc8526d3
resubmit(,10)
10. metadata=0x9, priority 0, cookie 0xc47fdc5d
resubmit(,11)
11. metadata=0x9, priority 0, cookie 0xddf6f6b9
resubmit(,12)
12. ipv6,metadata=0x9, priority 100, cookie 0x26ff06cc
set_field:0x1000000000000000000000000/0x1000000000000000000000000->xxreg0
resubmit(,13)
13. metadata=0x9, priority 0, cookie 0xda44fc0c
resubmit(,14)
14. ipv6,reg0=0x1/0x1,metadata=0x9, priority 100, cookie 0xe977b8b8
ct(table=15,zone=NXM_NX_REG13[0..15])
drop
-> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 15.
-> Sets the packet to an untracked state, and clears all the conntrack fields.
Final flow: icmp6,reg0=0x1,reg11=0x3f,reg12=0x3d,reg13=0x3e,reg14=0x2,metadata=0x9,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0
Megaflow: recirc_id=0,eth,icmp6,in_port=161,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,nw_ttl=0,nw_frag=no,icmp_type=0x0/0x80,nd_target=::,nd_tll=00:00:00:00:00:00
Datapath actions: ct(zone=62),recirc(0x23b8)
===============================================================================
recirc(0x23b8) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
===============================================================================
Flow:
recirc_id=0x23b8,ct_state=new|trk,ct_zone=62,eth,icmp6,reg0=0x1,reg11=0x3f,reg12=0x3d,reg13=0x3e,reg14=0x2,metadata=0x9,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0
bridge("br-int")
----------------
thaw
Resuming from table 15
15. ct_state=+new-est+trk,metadata=0x9, priority 7, cookie 0x94acb803
set_field:0x80000000000000000000000000/0x80000000000000000000000000->xxreg0
set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
resubmit(,16)
16. ipv6,reg0=0x80/0x80,reg14=0x2,metadata=0x9, priority 2002, cookie 0x4cdb3154
set_field:0x2000000000000000000000000/0x2000000000000000000000000->xxreg0
resubmit(,17)
17. metadata=0x9, priority 0, cookie 0x77e302aa
resubmit(,18)
18. metadata=0x9, priority 0, cookie 0x97ee4db3
resubmit(,19)
19. metadata=0x9, priority 0, cookie 0x6b46ef3d
resubmit(,20)
20. metadata=0x9, priority 0, cookie 0x238074d5
resubmit(,21)
21. metadata=0x9, priority 0, cookie 0x4b2f00cb
resubmit(,22)
22. metadata=0x9, priority 0, cookie 0x1de1893e
resubmit(,23)
23. metadata=0x9, priority 0, cookie 0x1b7c54a9
resubmit(,24)
24. metadata=0x9, priority 0, cookie 0x91b808bf
resubmit(,25)
25. metadata=0x9, priority 0, cookie 0x827a7c62
resubmit(,26)
26. ipv6,reg0=0x2/0x2002,metadata=0x9, priority 100, cookie 0xf51cd562
ct(commit,zone=NXM_NX_REG13[0..15],nat(src),exec(set_field:0/0x1->ct_mark))
nat(src)
set_field:0/0x1->ct_mark
-> Sets the packet to an untracked state, and clears all the conntrack fields.
resubmit(,27)
27. metadata=0x9, priority 0, cookie 0xe9561f7f
resubmit(,28)
28. metadata=0x9, priority 0, cookie 0x426dc5bb
resubmit(,29)
29. metadata=0x9, priority 0, cookie 0xeab289c
resubmit(,30)
30. metadata=0x9, priority 0, cookie 0x620602c5
resubmit(,31)
31. metadata=0x9, priority 0, cookie 0x5504e379
resubmit(,32)
32. metadata=0x9, priority 0, cookie 0x5e1c22f5
resubmit(,33)
33. metadata=0x9, priority 0, cookie 0x8233a381
set_field:0->reg15
resubmit(,71)
71. No match.
drop
resubmit(,34)
34. reg15=0,metadata=0x9, priority 50, cookie 0x2dc6c0b8
set_field:0x8001->reg15
resubmit(,37)
37. priority 0
resubmit(,39)
39. priority 0
resubmit(,40)
40. reg15=0x8001,metadata=0x9, priority 100, cookie 0xa23e45f
set_field:0x3->reg13
set_field:0x1->reg15
resubmit(,41)
41. priority 0
set_field:0->reg0
set_field:0->reg1
set_field:0->reg2
set_field:0->reg3
set_field:0->reg4
set_field:0->reg5
set_field:0->reg6
set_field:0->reg7
set_field:0->reg8
set_field:0->reg9
resubmit(,42)
42. ipv6,reg15=0x1,metadata=0x9, priority 110, cookie 0x6ae0a674
resubmit(,43)
43. ipv6,reg15=0x1,metadata=0x9, priority 110, cookie 0x9147caee
resubmit(,44)
44. metadata=0x9, priority 0, cookie 0xcbd84a69
resubmit(,45)
45. ct_state=-trk,metadata=0x9, priority 5, cookie 0xec86b1c8
set_field:0x100000000000000000000000000/0x100000000000000000000000000->xxreg0
set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
resubmit(,46)
46. metadata=0x9, priority 0, cookie 0x9ae00a32
resubmit(,47)
47. metadata=0x9, priority 0, cookie 0x98ca16da
resubmit(,48)
48. metadata=0x9, priority 0, cookie 0x7eb5b6c5
resubmit(,49)
49. metadata=0x9, priority 0, cookie 0x149995b7
resubmit(,50)
50. metadata=0x9, priority 0, cookie 0x9158534f
set_field:0/0x1000->reg10
resubmit(,75)
75. No match.
drop
move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
-> NXM_NX_XXREG0[111] is now 0
resubmit(,51)
51. metadata=0x9, priority 0, cookie 0xb046f48c
resubmit(,64)
64. priority 0
resubmit(,65)
65. reg15=0x1,metadata=0x9, priority 100, cookie 0xfed4d5d9
push_vlan:0x8100
set_field:4978->vlan_vid
output:69
bridge("br-ex")
---------------
0. priority 0
NORMAL
-> forwarding to learned port
pop_vlan
set_field:0x8001->reg15
Final flow: recirc_id=0x23b8,eth,icmp6,reg0=0x300,reg11=0x3f,reg12=0x3d,reg13=0x3,reg14=0x2,reg15=0x8001,metadata=0x9,in_port=161,vlan_tci=0x0000,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::18:59ff:fe37:204a,ipv6_dst=fe80::669d:99ff:fe3a:3d58,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,icmp_type=0,icmp_code=0
Megaflow: recirc_id=0x23b8,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x1,eth,icmp6,in_port=161,dl_src=02:18:59:37:20:4a,dl_dst=64:9d:99:3a:3d:58,ipv6_src=fe80::/10,ipv6_dst=fe80::669d:99ff:fe3a:3d58,nw_ttl=0,nw_frag=no,icmp_type=0x0/0x80
Datapath actions: ct(commit,zone=62,mark=0/0x1,nat(src)),push_vlan(vid=882,pcp=0),2
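Two numbers in the trace above may look inconsistent with the tcpdump filters at first glance: the OpenFlow action sets vlan_vid to 4978, while the datapath action pushes vid=882, and reg13 is set to 0x3e while the datapath uses ct zone 62. The sketch below shows how these line up, on the understanding that the OpenFlow vlan_vid field carries the OFPVID_PRESENT bit (0x1000) on top of the 12-bit VLAN ID. The function name is made up for this illustration.

```python
OFPVID_PRESENT = 0x1000  # "VLAN tag present" bit in the OpenFlow vlan_vid field

def decode_vlan_vid(value):
    """Split an OpenFlow vlan_vid value into (tag present, 12-bit VLAN ID)."""
    return bool(value & OFPVID_PRESENT), value & 0x0FFF

# set_field:4978->vlan_vid in table 65
print(decode_vlan_vid(4978))  # (True, 882)

# set_field:0x3e->reg13 in table 0, later used as zone=NXM_NX_REG13[0..15]
print(int("0x3e", 16))  # 62
```

So the VLAN pushed in the trace is the same provider VLAN (882) used in the bond0 captures.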
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2031087/+subscriptions