Message #40089
[Bug 1505781] [NEW] Unexpected SNAT behavior between instances when SNAT disabled on router
Public bug reported:
= Scenario =
• Kilo/Juno
• Single Neutron router with enable_snat=false
• two instances in two tenant networks attached to router
• each instance has a floating IP
INSTANCE A: TestNet1=192.167.7.3, 10.1.1.7
INSTANCE B: TestNet2=10.0.8.3, 10.1.1.6
When instances communicate outbound (i.e., to the Internet), their traffic
is correctly SNAT'd to their respective floating IPs. If an instance does
not have a floating IP, its traffic is routed out without SNAT.
When instances in tenant networks behind the same router communicate via
their fixed IPs, the source address is SNAT'd to the sender's floating IP
while the destination is unmodified:
Pinging from INSTANCE A to INSTANCE B:
$ ping 10.0.8.3 -c1
PING 10.0.8.3 (10.0.8.3): 56 data bytes
64 bytes from 10.0.8.3: seq=0 ttl=63 time=7.483 ms
From the Neutron router:
root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
10:37:48.840404: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64
10:37:48.840467: 10.1.1.7 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64 <-- SNAT as FLOAT
10:37:48.842506: 10.0.8.3 > 10.1.1.7: ICMP echo reply, id 37121, seq 12, length 64
10:37:48.842565: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 37121, seq 12, length 64
This behavior is a problem for two reasons:
1. The expectation is that traffic between the two instances behind the same router using fixed IPs would not be source NAT'd
2. Security group rules that use 'Remote Security Group' rather than 'Remote IP Prefix' fail to work since the source address is modified
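To illustrate reason 2, here is a minimal Python sketch of a 'Remote Security Group' check. It assumes that such a rule admits traffic only when the source address is one of the fixed IPs of the group's member ports, and that both instances belong to the group. The names GROUP_MEMBER_IPS and remote_group_allows are hypothetical and are not part of Neutron.

```python
import ipaddress

# Assumption: the remote security group contains the ports of INSTANCE A
# and INSTANCE B, so its members are identified by their fixed IPs.
GROUP_MEMBER_IPS = {
    ipaddress.ip_address("192.167.7.3"),  # INSTANCE A fixed IP
    ipaddress.ip_address("10.0.8.3"),     # INSTANCE B fixed IP
}


def remote_group_allows(source: str) -> bool:
    """Return True if the packet source is a fixed IP of a group member."""
    return ipaddress.ip_address(source) in GROUP_MEMBER_IPS


# Source as sent by INSTANCE A, before any translation.
print(remote_group_allows("192.167.7.3"))
# Source as seen by INSTANCE B once the router has SNAT'd it to A's floating IP.
print(remote_group_allows("10.1.1.7"))
```

Under these assumptions the untranslated source is accepted and the translated one is not, which matches the failure described in reason 2.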
When SNAT is enabled on the router, traffic between the instances via
their fixed IPs works as expected:
From INSTANCE A to B:
$ ping 10.0.8.3 -c 1
PING 10.0.8.3 (10.0.8.3): 56 data bytes
64 bytes from 10.0.8.3: seq=0 ttl=63 time=8.024 ms
From the Neutron router:
root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
10:52:19.945863: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 39425, seq 0, length 64
10:52:19.945953: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 39425, seq 0, length 64
10:52:19.951498: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 39425, seq 0, length 64
10:52:19.951554: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 39425, seq 0, length 64
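The two captures above can also be compared mechanically. The following is a rough Python sketch that scans tcpdump text for the same ICMP echo request (same id, seq and destination) appearing with two different source addresses, which is how the SNAT shows up in the first capture. The function name rewritten_sources and the regular expression are assumptions made for this illustration; they only handle the ICMP line format shown in this report.

```python
import re

# Matches the "src > dst: ICMP echo request|reply, id N, seq N" part of a line.
LINE_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def rewritten_sources(capture: str) -> set:
    """Return (original, translated) source pairs seen for one echo request."""
    first_source = {}  # (id, seq, dst) -> first source address observed
    pairs = set()
    for line in capture.splitlines():
        match = LINE_RE.search(line)
        if match is None or match["kind"] != "request":
            continue
        key = (match["id"], match["seq"], match["dst"])
        original = first_source.setdefault(key, match["src"])
        if original != match["src"]:
            pairs.add((original, match["src"]))
    return pairs


snat_disabled_capture = """\
10:37:48.840404: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64
10:37:48.840467: 10.1.1.7 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64
"""
print(rewritten_sources(snat_disabled_capture))
```

For the SNAT-disabled capture this reports the pair (192.167.7.3, 10.1.1.7); for the SNAT-enabled capture, where both request lines carry 192.167.7.3, it reports nothing.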
We believe the existence of the following iptables nat rule causes the
desired behavior, in that traffic not traversing the qg interface is not
NAT'd:
-A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT
That rule only exists when SNAT is *enabled* on the router, and not when
it is disabled, as shown below:
SNAT enabled:
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
-A neutron-l3-agent-OUTPUT -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
-A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
-A neutron-l3-agent-PREROUTING -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
-A neutron-l3-agent-float-snat -s 10.0.8.3/32 -j SNAT --to-source 10.1.1.6
-A neutron-l3-agent-float-snat -s 192.167.7.3/32 -j SNAT --to-source 10.1.1.7
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-l3-agent-snat -o qg-80aa20be-9b -j SNAT --to-source 10.1.1.5
-A neutron-l3-agent-snat -m mark ! --mark 0x2 -m conntrack --ctstate DNAT -j SNAT --to-source 10.1.1.5
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
SNAT disabled:
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
-A neutron-l3-agent-OUTPUT -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
-A neutron-l3-agent-PREROUTING -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
-A neutron-l3-agent-float-snat -s 10.0.8.3/32 -j SNAT --to-source 10.1.1.6
-A neutron-l3-agent-float-snat -s 192.167.7.3/32 -j SNAT --to-source 10.1.1.7
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
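The difference between the two listings is easier to see as a set difference. This is a small Python sketch, not a Neutron tool; the helper name rule_diff is hypothetical, and the example input is only an excerpt of the rules shown above.

```python
def rule_diff(enabled_dump: str, disabled_dump: str):
    """Return (only_in_enabled, only_in_disabled) as sorted lists of rules."""

    def rules(dump: str) -> set:
        # Keep only appended rules ("-A ..."), ignoring blank or other lines.
        stripped = (line.strip() for line in dump.splitlines())
        return {line for line in stripped if line.startswith("-A ")}

    enabled, disabled = rules(enabled_dump), rules(disabled_dump)
    return sorted(enabled - disabled), sorted(disabled - enabled)


# Excerpt of the listings above, limited to three rules for brevity.
enabled_excerpt = """\
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
"""
disabled_excerpt = """\
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
"""
only_enabled, only_disabled = rule_diff(enabled_excerpt, disabled_excerpt)
for rule in only_enabled:
    print("only when SNAT is enabled:", rule)
```

Run against the complete listings, the same comparison would also report the two SNAT rules to 10.1.1.5 as present only in the SNAT-enabled listing, and the 169.254.169.254 REDIRECT rule as present only in the SNAT-disabled listing.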
In the event the rule is added manually, traffic between instances works
as expected in that the source address is not SNAT'd as the floating IP:
Adding the rule:
ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f iptables -t nat -A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT
Results in:
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
-A neutron-l3-agent-OUTPUT -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
-A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT <---- RULE ADDED
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
-A neutron-l3-agent-PREROUTING -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
-A neutron-l3-agent-float-snat -s 10.0.8.3/32 -j SNAT --to-source 10.1.1.6
-A neutron-l3-agent-float-snat -s 192.167.7.3/32 -j SNAT --to-source 10.1.1.7
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
Ping from A to B:
$ ping 10.0.8.3 -c 1
PING 10.0.8.3 (10.0.8.3): 56 data bytes
64 bytes from 10.0.8.3: seq=0 ttl=63 time=8.458 ms
On the router we see that the traffic is unmodified:
root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
12:58:08.915940: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 41217, seq 0, length 64
12:58:08.916004: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 41217, seq 0, length 64
12:58:08.921698: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 41217, seq 0, length 64
12:58:08.921750: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 41217, seq 0, length 64
Outbound SNAT behavior is not impacted:
Ping from A to Google DNS:
$ ping 8.8.8.8 -c1
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=50 time=33.121 ms
Traffic is properly source NAT'd as floating IP:
root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
13:26:06.474405 In fa:16:3e:43:52:58 ethertype IPv4 (0x0800), length 100: 192.167.7.3 > 8.8.8.8: ICMP echo request, id 41985, seq 0, length 64
13:26:06.474485 Out fa:16:3e:c3:7a:33 ethertype IPv4 (0x0800), length 100: 10.1.1.7 > 8.8.8.8: ICMP echo request, id 41985, seq 0, length 64 <-- SNAT as FLOAT
13:26:06.505296 In 00:e0:1c:70:06:32 ethertype IPv4 (0x0800), length 100: 8.8.8.8 > 10.1.1.7: ICMP echo reply, id 41985, seq 0, length 64
13:26:06.505326 Out fa:16:3e:e8:c9:6b ethertype IPv4 (0x0800), length 100: 8.8.8.8 > 192.167.7.3: ICMP echo reply, id 41985, seq 0, length 64
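The observations in this report can be summarized with a simplified Python model of the nat POSTROUTING traversal. It is only a sketch under stated assumptions: the qr- interface names are hypothetical (the captures used -i any and do not show them), the router's own SNAT rules to 10.1.1.5 are left out, and rule matching is reduced to the fields that matter here. The names Packet and postrouting_source are invented for this illustration.

```python
from dataclasses import dataclass

GATEWAY_DEV = "qg-80aa20be-9b"

# Fixed IP -> floating IP, taken from the neutron-l3-agent-float-snat chain.
FLOAT_SNAT = {"10.0.8.3": "10.1.1.6", "192.167.7.3": "10.1.1.7"}


@dataclass
class Packet:
    src: str
    dst: str
    in_dev: str
    out_dev: str
    dnat_applied: bool = False  # True if conntrack state is DNAT


def postrouting_source(pkt: Packet, accept_rule_present: bool) -> str:
    """Return the source address after the modeled nat POSTROUTING chains."""
    # neutron-l3-agent-POSTROUTING:
    #   ! -i qg ! -o qg -m conntrack ! --ctstate DNAT -j ACCEPT
    if (
        accept_rule_present
        and pkt.in_dev != GATEWAY_DEV
        and pkt.out_dev != GATEWAY_DEV
        and not pkt.dnat_applied
    ):
        return pkt.src  # ACCEPT: the float-snat chain is never reached
    # neutron-postrouting-bottom -> neutron-l3-agent-snat -> float-snat
    return FLOAT_SNAT.get(pkt.src, pkt.src)


# Hypothetical internal interface names for TestNet1 and TestNet2.
a_to_b = Packet("192.167.7.3", "10.0.8.3", "qr-testnet1", "qr-testnet2")
a_to_internet = Packet("192.167.7.3", "8.8.8.8", "qr-testnet1", GATEWAY_DEV)

print(postrouting_source(a_to_b, accept_rule_present=False))
print(postrouting_source(a_to_b, accept_rule_present=True))
print(postrouting_source(a_to_internet, accept_rule_present=True))
```

In this model, traffic from A to B leaves with 10.1.1.7 as its source when the ACCEPT rule is absent and with 192.167.7.3 when it is present, while traffic from A to 8.8.8.8 is still translated to 10.1.1.7 in either case because it exits through the qg interface.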
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1505781
Title:
Unexpected SNAT behavior between instances when SNAT disabled on
router
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1505781/+subscriptions