
yahoo-eng-team team mailing list archive

[Bug 1505781] Re: Unexpected SNAT behavior between instances when SNAT disabled on router

 

Reviewed:  https://review.openstack.org/235832
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4341a4faeed937d014e95a94b77844d5a835acbe
Submitter: Jenkins
Branch:    master

commit 4341a4faeed937d014e95a94b77844d5a835acbe
Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
Date:   Fri Oct 16 02:26:57 2015 +0000

    Don't snat traffic between fixed IPs behind same router
    
    This fixes a bug where an iptables rule to not snat traffic between
    fixed IPs is only being added if enable_snat=true. We should add
    this rule no matter what the value is for enable_snat.
    
    Without this patch, the current code breaks the following use case:
    two fixed IPs behind the same router both have a floating IP
    associated, and the router has enable_snat=false. When fixed IP A
    pings fixed IP B, fixed IP A gets the reply from fixed IP B's
    floating IP.
    
    More details can be found in the bug description.
    
    Change-Id: I322e8d454ef1d529ceda541fb5fe577cd70b412f
    Closes-bug: #1505781
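
The change described in the commit message can be sketched roughly as follows. This is a simplified illustration, not Neutron's actual code: the function name and the (chain, rule) tuple layout are hypothetical, and the rule strings and the gateway address 10.1.1.5 are copied from the iptables listings in the bug description.

```python
def external_gateway_nat_rules(gw_iface, gw_ip, enable_snat):
    """Return (chain, rule) pairs for a router's external gateway.

    Hypothetical helper: it only illustrates that the "don't SNAT
    internal traffic" rule is emitted regardless of enable_snat.
    """
    dont_snat_internal = (
        "POSTROUTING",
        f"! -i {gw_iface} ! -o {gw_iface} "
        "-m conntrack ! --ctstate DNAT -j ACCEPT",
    )
    # Always added, whatever the value of enable_snat.
    rules = [dont_snat_internal]
    if enable_snat:
        # Default SNAT rules, only wanted when SNAT is enabled.
        rules.append(("snat", f"-o {gw_iface} -j SNAT --to-source {gw_ip}"))
        rules.append((
            "snat",
            "-m mark ! --mark 0x2 -m conntrack --ctstate DNAT "
            f"-j SNAT --to-source {gw_ip}",
        ))
    return rules


if __name__ == "__main__":
    for snat in (True, False):
        print(f"enable_snat={snat}")
        for chain, rule in external_gateway_nat_rules(
                "qg-80aa20be-9b", "10.1.1.5", snat):
            print(f"  -A neutron-l3-agent-{chain} {rule}")
```

Before the fix, the equivalent of dont_snat_internal was only produced inside the enable_snat branch, which is why the rule was missing on routers with SNAT disabled.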


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1505781

Title:
  Unexpected SNAT behavior between instances when SNAT disabled on
  router

Status in neutron:
  Fix Released

Bug description:
  = Scenario =

  • Kilo/Juno
  • Single Neutron router with enable_snat=false
  • Two instances in two tenant networks attached to the router
  • Each instance has a floating IP

  INSTANCE A: TestNet1=192.167.7.3, floating IP 10.1.1.7
  INSTANCE B: TestNet2=10.0.8.3, floating IP 10.1.1.6

  When instances communicate out (i.e., to the Internet), they are
  properly SNAT'd using their respective floating IP. If an instance
  does not have a floating IP, the traffic is routed out without SNAT.

  When instances in tenant networks behind the same router communicate
  via their fixed IPs, the source address is SNAT'd as the respective
  floating IP while the destination is unmodified:

  Pinging from INSTANCE A to INSTANCE B:

  $ ping 10.0.8.3 -c1
  PING 10.0.8.3 (10.0.8.3): 56 data bytes
  64 bytes from 10.0.8.3: seq=0 ttl=63 time=7.483 ms

  From the Neutron router:

  root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
  10:37:48.840404: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64
  10:37:48.840467: 10.1.1.7 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64 <-- SNAT as FLOAT
  10:37:48.842506: 10.0.8.3 > 10.1.1.7: ICMP echo reply, id 37121, seq 12, length 64
  10:37:48.842565: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 37121, seq 12, length 64
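
  The rewrite is visible in the capture as the same echo request (same ICMP id and seq) appearing with two different source addresses. A minimal sketch for spotting that pattern in tcpdump output like the above follows; the regular expression and the function name are assumptions made for this illustration, and the pattern only covers the ICMP echo line format quoted in this report.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "10:37:48.840467: 10.1.1.7 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64"
LINE_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def find_rewritten_sources(lines):
    """Map (id, seq) to the sorted source addresses of echo requests
    that were captured with more than one source address."""
    sources = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("kind") == "request":
            key = (int(match.group("id")), int(match.group("seq")))
            sources[key].add(match.group("src"))
    return {key: sorted(addrs) for key, addrs in sources.items()
            if len(addrs) > 1}


if __name__ == "__main__":
    capture = [
        "10:37:48.840404: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64",
        "10:37:48.840467: 10.1.1.7 > 10.0.8.3: ICMP echo request, id 37121, seq 12, length 64",
        "10:37:48.842506: 10.0.8.3 > 10.1.1.7: ICMP echo reply, id 37121, seq 12, length 64",
        "10:37:48.842565: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 37121, seq 12, length 64",
    ]
    print(find_rewritten_sources(capture))
```

  An empty result means every echo request kept a single source address on its way through the router namespace.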

  This behavior is a problem for two reasons:

  1. Traffic between two instances behind the same router using fixed IPs is expected not to be source NAT'd.
  2. Security group rules that use 'Remote Security Group' rather than 'Remote IP Prefix' fail to match because the source address is modified.

  When SNAT is enabled on the router, traffic between the instances via
  their fixed IP works as expected:

  From INSTANCE A to B:

  $ ping 10.0.8.3 -c 1
  PING 10.0.8.3 (10.0.8.3): 56 data bytes
  64 bytes from 10.0.8.3: seq=0 ttl=63 time=8.024 ms

  From the Neutron router:

  root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
  10:52:19.945863: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 39425, seq 0, length 64
  10:52:19.945953: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 39425, seq 0, length 64
  10:52:19.951498: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 39425, seq 0, length 64
  10:52:19.951554: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 39425, seq 0, length 64

  We believe the existence of the following iptables nat rule causes the
  desired behavior, in that traffic not traversing the qg interface is
  not NAT'd:

  -A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT
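
  To make the effect of that rule concrete, here is a small model of the nat POSTROUTING decision for a router with SNAT disabled. It is a simplification written for this report, not how iptables or Neutron is implemented: the Packet class, the function names, and the qr-* interface names in the usage lines are hypothetical, while the gateway interface and the floating IP mappings are taken from the listings in this description.

```python
from dataclasses import dataclass

GW_IFACE = "qg-80aa20be-9b"  # external gateway interface from the report

# Mappings from the neutron-l3-agent-float-snat chain (fixed -> floating).
FLOAT_SNAT = {"192.167.7.3": "10.1.1.7", "10.0.8.3": "10.1.1.6"}


@dataclass(frozen=True)
class Packet:
    src: str
    in_iface: str
    out_iface: str
    dnat: bool = False  # True if conntrack reports the connection as DNAT'd


def accept_rule_matches(pkt):
    """Model of: ! -i qg ! -o qg -m conntrack ! --ctstate DNAT -j ACCEPT"""
    return (pkt.in_iface != GW_IFACE
            and pkt.out_iface != GW_IFACE
            and not pkt.dnat)


def postrouting_source(pkt, accept_rule_present):
    """Source address after nat POSTROUTING (SNAT-disabled rule set)."""
    if accept_rule_present and accept_rule_matches(pkt):
        # ACCEPT ends traversal of the nat table, so the float-snat
        # chain is never reached and the source is left untouched.
        return pkt.src
    # Otherwise the packet falls through to neutron-l3-agent-float-snat.
    return FLOAT_SNAT.get(pkt.src, pkt.src)


if __name__ == "__main__":
    internal = Packet("192.167.7.3", "qr-aaaa", "qr-bbbb")  # A -> B
    outbound = Packet("192.167.7.3", "qr-aaaa", GW_IFACE)   # A -> Internet
    print(postrouting_source(internal, accept_rule_present=False))
    print(postrouting_source(internal, accept_rule_present=True))
    print(postrouting_source(outbound, accept_rule_present=True))
```

  In this model, traffic between the two fixed IPs is rewritten to the floating IP only when the ACCEPT rule is absent, and outbound traffic through the qg interface is SNAT'd either way.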

  That rule only exists when SNAT is *enabled* on the router, and not
  when it is disabled, as shown below:

  SNAT enabled:

  -A PREROUTING -j neutron-l3-agent-PREROUTING
  -A OUTPUT -j neutron-l3-agent-OUTPUT
  -A POSTROUTING -j neutron-l3-agent-POSTROUTING
  -A POSTROUTING -j neutron-postrouting-bottom
  -A neutron-l3-agent-OUTPUT -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
  -A neutron-l3-agent-OUTPUT -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
  -A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT
  -A neutron-l3-agent-PREROUTING -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
  -A neutron-l3-agent-PREROUTING -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
  -A neutron-l3-agent-float-snat -s 10.0.8.3/32 -j SNAT --to-source 10.1.1.6
  -A neutron-l3-agent-float-snat -s 192.167.7.3/32 -j SNAT --to-source 10.1.1.7
  -A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
  -A neutron-l3-agent-snat -o qg-80aa20be-9b -j SNAT --to-source 10.1.1.5
  -A neutron-l3-agent-snat -m mark ! --mark 0x2 -m conntrack --ctstate DNAT -j SNAT --to-source 10.1.1.5
  -A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat

  SNAT disabled:

  -A PREROUTING -j neutron-l3-agent-PREROUTING
  -A OUTPUT -j neutron-l3-agent-OUTPUT
  -A POSTROUTING -j neutron-l3-agent-POSTROUTING
  -A POSTROUTING -j neutron-postrouting-bottom
  -A neutron-l3-agent-OUTPUT -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
  -A neutron-l3-agent-OUTPUT -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
  -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
  -A neutron-l3-agent-PREROUTING -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
  -A neutron-l3-agent-PREROUTING -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
  -A neutron-l3-agent-float-snat -s 10.0.8.3/32 -j SNAT --to-source 10.1.1.6
  -A neutron-l3-agent-float-snat -s 192.167.7.3/32 -j SNAT --to-source 10.1.1.7
  -A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
  -A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
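
  The difference between the two listings can also be found mechanically. The sketch below assumes each rule set is available as a list of rule lines (for example, saved iptables output from the router namespace); the function name is hypothetical.

```python
def missing_rules(reference, candidate):
    """Return rules present in `reference` but absent from `candidate`,
    preserving the order in which they appear in `reference`."""
    present = {rule.strip() for rule in candidate}
    return [rule.strip() for rule in reference
            if rule.strip() and rule.strip() not in present]


if __name__ == "__main__":
    # Excerpts of the two listings above.
    snat_enabled = [
        "-A POSTROUTING -j neutron-l3-agent-POSTROUTING",
        "-A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b"
        " -m conntrack ! --ctstate DNAT -j ACCEPT",
        "-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat",
        "-A neutron-l3-agent-snat -o qg-80aa20be-9b -j SNAT --to-source 10.1.1.5",
    ]
    snat_disabled = [
        "-A POSTROUTING -j neutron-l3-agent-POSTROUTING",
        "-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat",
    ]
    for rule in missing_rules(snat_enabled, snat_disabled):
        print(rule)
```

  Applied to the full listings, the comparison reports the ACCEPT rule along with the two default SNAT rules that use --to-source 10.1.1.5; only the ACCEPT rule is relevant to traffic between fixed IPs.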

  In the event the rule is added manually, traffic between instances
  works as expected in that the source address is not SNAT'd as the
  floating IP:

  Adding the rule:

  ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f iptables -t nat -A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT

  Results in:

  -A PREROUTING -j neutron-l3-agent-PREROUTING
  -A OUTPUT -j neutron-l3-agent-OUTPUT
  -A POSTROUTING -j neutron-l3-agent-POSTROUTING
  -A POSTROUTING -j neutron-postrouting-bottom
  -A neutron-l3-agent-OUTPUT -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
  -A neutron-l3-agent-OUTPUT -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
  -A neutron-l3-agent-POSTROUTING ! -i qg-80aa20be-9b ! -o qg-80aa20be-9b -m conntrack ! --ctstate DNAT -j ACCEPT  <---- RULE ADDED
  -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
  -A neutron-l3-agent-PREROUTING -d 10.1.1.6/32 -j DNAT --to-destination 10.0.8.3
  -A neutron-l3-agent-PREROUTING -d 10.1.1.7/32 -j DNAT --to-destination 192.167.7.3
  -A neutron-l3-agent-float-snat -s 10.0.8.3/32 -j SNAT --to-source 10.1.1.6
  -A neutron-l3-agent-float-snat -s 192.167.7.3/32 -j SNAT --to-source 10.1.1.7
  -A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
  -A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat

  Ping from A to B:

  $ ping 10.0.8.3 -c 1
  PING 10.0.8.3 (10.0.8.3): 56 data bytes
  64 bytes from 10.0.8.3: seq=0 ttl=63 time=8.458 ms

  On the router we see that the traffic is unmodified:

  root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
  12:58:08.915940: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 41217, seq 0, length 64
  12:58:08.916004: 192.167.7.3 > 10.0.8.3: ICMP echo request, id 41217, seq 0, length 64
  12:58:08.921698: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 41217, seq 0, length 64
  12:58:08.921750: 10.0.8.3 > 192.167.7.3: ICMP echo reply, id 41217, seq 0, length 64

  Outbound SNAT behavior is not impacted:

  Ping from A to Google DNS:

  $ ping 8.8.8.8 -c1
  PING 8.8.8.8 (8.8.8.8): 56 data bytes
  64 bytes from 8.8.8.8: seq=0 ttl=50 time=33.121 ms

  Traffic is properly source NAT'd as floating IP:

  root@controller01:~# ip netns exec qrouter-dd15e8f3-8612-4925-81d4-88fcad49807f tcpdump -i any -ne icmp
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
  13:26:06.474405  In fa:16:3e:43:52:58 ethertype IPv4 (0x0800), length 100: 192.167.7.3 > 8.8.8.8: ICMP echo request, id 41985, seq 0, length 64
  13:26:06.474485 Out fa:16:3e:c3:7a:33 ethertype IPv4 (0x0800), length 100: 10.1.1.7 > 8.8.8.8: ICMP echo request, id 41985, seq 0, length 64 <-- SNAT as FLOAT
  13:26:06.505296  In 00:e0:1c:70:06:32 ethertype IPv4 (0x0800), length 100: 8.8.8.8 > 10.1.1.7: ICMP echo reply, id 41985, seq 0, length 64
  13:26:06.505326 Out fa:16:3e:e8:c9:6b ethertype IPv4 (0x0800), length 100: 8.8.8.8 > 192.167.7.3: ICMP echo reply, id 41985, seq 0, length 64

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1505781/+subscriptions

