
yahoo-eng-team team mailing list archive

[Bug 1776778] Re: Floating IPs broken after kernel upgrade to Centos/RHEL 7.5 - DNAT not working

 

** Changed in: neutron
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1776778

Title:
  Floating IPs broken after kernel upgrade to Centos/RHEL 7.5 - DNAT not
  working

Status in neutron:
  Invalid

Bug description:
  Since upgrading to CentOS 7.5 (kernel 3.10.0-862), floating IP
  functionality has been completely broken. Inbound packets reach the
  qrouter namespace from the fip namespace via the rfp interface, but
  they are neither DNAT'd nor routed: we see nothing going out the qr-
  interface. Outbound packets leaving the VM are fine, but all
  responses are then dropped inbound at the qrouter after arriving on
  rfp. The DNAT rules in the nat table ("-t nat") within the qrouter
  namespace do not appear to be hit (packet counters are zero).
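
  A minimal sketch of how the DNAT packet counters mentioned above could
  be read, assuming the "iptables-save -c" format in which each rule
  line is prefixed with [packets:bytes]. The function name
  dnat_counters is hypothetical; the sample line reuses the DNAT rule
  from this report's nat table together with the zero counter described
  above.

```shell
#!/bin/sh
# Hypothetical helper: read `iptables-save -c -t nat` output on stdin and
# print "<packets> <rule>" for every rule whose target is DNAT.
dnat_counters() {
    sed -n 's/^\[\([0-9]*\):[0-9]*] \(-A .* -j DNAT .*\)$/\1 \2/p'
}

# Sample input built from the rule in this report (counter assumed to be 0,
# as the report states). On a real host the input would come from:
#   ip netns exec <qrouter namespace> iptables-save -c -t nat | dnat_counters
printf '%s\n' \
    '[3:252] -A POSTROUTING -j neutron-l3d-POSTROUTING' \
    '[0:0] -A neutron-l3d-PREROUTING -d 10.8.17.52/32 -i rfp-f48d5536-e -j DNAT --to-destination 192.168.94.9' \
    | dnat_counters
```

  A leading 0 on the printed line means the rule has never matched a
  packet.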

  SNAT works once we remove the floating IP from the VM (the VM can
  then ping outbound). So the problem seems isolated to DNAT, that is,
  to the qrouter receiving packets from the fip namespace.

  We can reproduce this 100% of the time by updating one of our
  working CentOS 7.2 or CentOS 7.4 hosts to 7.5. Nothing changes
  except a "yum update". All routes, ip rules, and iptables rules are
  identical on a working older host and on a broken CentOS 7.5 host.
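
  One way such a comparison between hosts could be made, as a sketch
  only: strip the parts of iptables-save output that always differ
  (timestamp comments and chain counters) so the remaining rules can be
  compared with diff. The function name normalize_ruleset and the file
  names in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make iptables-save output comparable across hosts.
# - deletes the "# Generated ..." / "# Completed ..." comment lines
# - resets chain policy counters such as [5384:453413] to [0:0]
normalize_ruleset() {
    sed -e '/^#/d' \
        -e 's/^\(:[^ ]* [^ ]*\) \[[0-9]*:[0-9]*]$/\1 [0:0]/'
}

# Small demonstration on lines taken from the raw table in this report.
printf '%s\n' \
    '# Generated by iptables-save v1.4.21 on Wed Jun 13 15:20:58 2018' \
    '*raw' \
    ':PREROUTING ACCEPT [5384:453413]' \
    '-A PREROUTING -j neutron-l3d-PREROUTING' \
    'COMMIT' \
    | normalize_ruleset
```

  On each host the output of "iptables-save" inside the qrouter
  namespace could be piped through normalize_ruleset into a file (for
  example working.rules and broken.rules), and the two files compared
  with "diff -u".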

  I added some basic logging rules at the top of the PREROUTING chain
  in the raw, mangle, and nat tables, matching either my source IP or
  all packets on the rfp ingress interface (-i). The packet counters
  increment for raw and mangle but remain at 0 for nat, indicating
  that the nat table is not invoked for PREROUTING.
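
  The exact rules used are not shown in this report. A rough sketch of
  equivalent probes follows, with one swap stated plainly: instead of a
  LOG target it uses rules with no target at all, which only increment
  their counters and do not affect the packet. The function name
  gen_probe_rules is hypothetical, and the script only prints the
  commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the iptables commands that
# would insert a counting-only rule at position 1 of PREROUTING in the
# raw, mangle and nat tables of the given namespace.
#   $1 = network namespace, $2 = ingress interface to match
gen_probe_rules() {
    netns=$1
    iface=$2
    for table in raw mangle nat; do
        # No -j target: the rule only counts matching packets.
        echo "ip netns exec $netns iptables -t $table -I PREROUTING 1 -i $iface"
    done
}

# Namespace and interface names as they appear in this report.
gen_probe_rules qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 rfp-f48d5536-e
```

  After running the printed commands as root and sending traffic to the
  floating IP, "iptables -t <table> -L PREROUTING -v -n" in the same
  namespace shows the per-rule packet counters for each table.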

  Floating IP = 10.8.17.52, Fixed IP = 192.168.94.9.

  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 tcpdump -l -evvvnn -i rfp-f48d5536-e
  tcpdump: listening on rfp-f48d5536-e, link-type EN10MB (Ethernet), capture size 262144 bytes
  13:42:00.345440 7a:3b:f1:c7:5d:4e > aa:24:89:9e:c8:f0, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 62, id 1832, offset 0, flags [DF], proto ICMP (1), length 84)
      10.4.165.22 > 10.8.17.52: ICMP echo request, id 5771, seq 1, length 64
  13:42:01.344047 7a:3b:f1:c7:5d:4e > aa:24:89:9e:c8:f0, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 1833, offset 0, flags [DF], proto ICMP (1), length 84)
      10.4.165.22 > 10.8.17.52: ICMP echo request, id 5771, seq 2, length 64
  13:42:02.398300 7a:3b:f1:c7:5d:4e > aa:24:89:9e:c8:f0, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 1834, offset 0, flags [DF], proto ICMP (1), length 84)
      10.4.165.22 > 10.8.17.52: ICMP echo request, id 5771, seq 3, length 64
  13:42:03.344345 7a:3b:f1:c7:5d:4e > aa:24:89:9e:c8:f0, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 1835, offset 0, flags [DF], proto ICMP (1), length 84)
      10.4.165.22 > 10.8.17.52: ICMP echo request, id 5771, seq 4, length 64
  ^C
  4 packets captured
  4 packets received by filter
  0 packets dropped by kernel
  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 tcpdump -l -evvvnn -i qr-295f9857-21
  tcpdump: listening on qr-295f9857-21, link-type EN10MB (Ethernet), capture size 262144 bytes

  *** no output: no packets are seen on the qr- interface ***

  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 ip a
  1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      inet 127.0.0.1/8 scope host lo
         valid_lft forever preferred_lft forever
      inet6 ::1/128 scope host
         valid_lft forever preferred_lft forever
  2: rfp-f48d5536-e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      link/ether aa:24:89:9e:c8:f0 brd ff:ff:ff:ff:ff:ff link-netnsid 0
      inet 169.254.106.114/31 scope global rfp-f48d5536-e
         valid_lft forever preferred_lft forever
      inet6 fe80::a824:89ff:fe9e:c8f0/64 scope link
         valid_lft forever preferred_lft forever
  59: qr-295f9857-21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
      link/ether fa:16:3e:3d:f1:12 brd ff:ff:ff:ff:ff:ff
      inet 192.168.94.1/24 brd 192.168.94.255 scope global qr-295f9857-21
         valid_lft forever preferred_lft forever
      inet6 fe80::f816:3eff:fe3d:f112/64 scope link
         valid_lft forever preferred_lft forever

  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 ip route
  169.254.106.114/31 dev rfp-f48d5536-e proto kernel scope link src 169.254.106.114
  192.168.94.0/24 dev qr-295f9857-21 proto kernel scope link src 192.168.94.1
  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 ip rule
  0:	from all lookup local
  32766:	from all lookup main
  32767:	from all lookup default
  57481:	from 192.168.94.9 lookup 16
  3232259585:	from 192.168.94.1/24 lookup 3232259585
  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 ip route show table 16
  default via 169.254.106.115 dev rfp-f48d5536-e
  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 ip neighbor
  169.254.106.115 dev rfp-f48d5536-e lladdr 7a:3b:f1:c7:5d:4e STALE
  192.168.94.4 dev qr-295f9857-21 lladdr fa:16:3e:cf:a1:08 PERMANENT
  192.168.94.13 dev qr-295f9857-21 lladdr fa:16:3e:91:37:54 PERMANENT
  192.168.94.2 dev qr-295f9857-21 lladdr fa:16:3e:b2:18:5e PERMANENT
  192.168.94.9 dev qr-295f9857-21 lladdr fa:16:3e:6c:4a:3b PERMANENT

  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 iptables-save
  # Generated by iptables-save v1.4.21 on Wed Jun 13 15:20:58 2018
  *raw
  :PREROUTING ACCEPT [5384:453413]
  :OUTPUT ACCEPT [65:5637]
  :neutron-l3d-OUTPUT - [0:0]
  :neutron-l3d-PREROUTING - [0:0]
  -A PREROUTING -j neutron-l3d-PREROUTING
  -A OUTPUT -j neutron-l3d-OUTPUT
  COMMIT
  # Completed on Wed Jun 13 15:20:58 2018
  # Generated by iptables-save v1.4.21 on Wed Jun 13 15:20:58 2018
  *mangle
  :PREROUTING ACCEPT [5281:443604]
  :INPUT ACCEPT [4:336]
  :FORWARD ACCEPT [20:1680]
  :OUTPUT ACCEPT [4:336]
  :POSTROUTING ACCEPT [24:2016]
  :neutron-l3d-FORWARD - [0:0]
  :neutron-l3d-INPUT - [0:0]
  :neutron-l3d-OUTPUT - [0:0]
  :neutron-l3d-POSTROUTING - [0:0]
  :neutron-l3d-PREROUTING - [0:0]
  :neutron-l3d-float-snat - [0:0]
  :neutron-l3d-floatingip - [0:0]
  :neutron-l3d-mark - [0:0]
  :neutron-l3d-scope - [0:0]
  -A PREROUTING -j neutron-l3d-PREROUTING
  -A INPUT -j neutron-l3d-INPUT
  -A FORWARD -j neutron-l3d-FORWARD
  -A OUTPUT -j neutron-l3d-OUTPUT
  -A POSTROUTING -j neutron-l3d-POSTROUTING
  -A neutron-l3d-PREROUTING -j neutron-l3d-mark
  -A neutron-l3d-PREROUTING -j neutron-l3d-scope
  -A neutron-l3d-PREROUTING -m connmark ! --mark 0x0/0xffff0000 -j CONNMARK --restore-mark --nfmask 0xffff0000 --ctmask 0xffff0000
  -A neutron-l3d-PREROUTING -j neutron-l3d-floatingip
  -A neutron-l3d-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff
  -A neutron-l3d-float-snat -m connmark --mark 0x0/0xffff0000 -j CONNMARK --save-mark --nfmask 0xffff0000 --ctmask 0xffff0000
  -A neutron-l3d-scope -i qr-295f9857-21 -j MARK --set-xmark 0x4000000/0xffff0000
  -A neutron-l3d-scope -i rfp-f48d5536-e -j MARK --set-xmark 0x4000000/0xffff0000
  COMMIT
  # Completed on Wed Jun 13 15:20:59 2018
  # Generated by iptables-save v1.4.21 on Wed Jun 13 15:20:59 2018
  *nat
  :PREROUTING ACCEPT [0:0]
  :INPUT ACCEPT [0:0]
  :OUTPUT ACCEPT [1:84]
  :POSTROUTING ACCEPT [3:252]
  :neutron-l3d-OUTPUT - [0:0]
  :neutron-l3d-POSTROUTING - [0:0]
  :neutron-l3d-PREROUTING - [0:0]
  :neutron-l3d-float-snat - [0:0]
  :neutron-l3d-snat - [0:0]
  :neutron-postrouting-bottom - [0:0]
  -A PREROUTING -j neutron-l3d-PREROUTING
  -A OUTPUT -j neutron-l3d-OUTPUT
  -A POSTROUTING -j neutron-l3d-POSTROUTING
  -A POSTROUTING -j neutron-postrouting-bottom
  -A neutron-l3d-POSTROUTING ! -i rfp-f48d5536-e ! -o rfp-f48d5536-e -m conntrack ! --ctstate DNAT -j ACCEPT
  -A neutron-l3d-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
  -A neutron-l3d-PREROUTING -d 10.8.17.52/32 -i rfp-f48d5536-e -j DNAT --to-destination 192.168.94.9
  -A neutron-l3d-float-snat -s 192.168.94.9/32 -j SNAT --to-source 10.8.17.52
  -A neutron-l3d-snat -j neutron-l3d-float-snat
  -A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3d-snat
  COMMIT
  # Completed on Wed Jun 13 15:20:59 2018
  # Generated by iptables-save v1.4.21 on Wed Jun 13 15:20:59 2018
  *filter
  :INPUT ACCEPT [4:336]
  :FORWARD ACCEPT [20:1680]
  :OUTPUT ACCEPT [4:336]
  :neutron-filter-top - [0:0]
  :neutron-l3d-FORWARD - [0:0]
  :neutron-l3d-INPUT - [0:0]
  :neutron-l3d-OUTPUT - [0:0]
  :neutron-l3d-local - [0:0]
  :neutron-l3d-scope - [0:0]
  -A INPUT -j neutron-l3d-INPUT
  -A FORWARD -j neutron-filter-top
  -A FORWARD -j neutron-l3d-FORWARD
  -A OUTPUT -j neutron-filter-top
  -A OUTPUT -j neutron-l3d-OUTPUT
  -A neutron-filter-top -j neutron-l3d-local
  -A neutron-l3d-FORWARD -j neutron-l3d-scope
  -A neutron-l3d-INPUT -m mark --mark 0x1/0xffff -j ACCEPT
  -A neutron-l3d-INPUT -p tcp -m tcp --dport 9697 -j DROP
  -A neutron-l3d-scope -o qr-295f9857-21 -m mark ! --mark 0x4000000/0xffff0000 -j DROP
  -A neutron-l3d-scope -o rfp-f48d5536-e -m mark ! --mark 0x4000000/0xffff0000 -j DROP
  COMMIT
  # Completed on Wed Jun 13 15:20:59 2018

  As you can also see, the qrouter namespace itself can ping the VM's
  fixed IP. It just does not DNAT or route packets arriving from the
  fip namespace:

  [root@centos7-neutron-template ~]# ip netns exec qrouter-f48d5536-eefa-4410-b17b-1b3d14426323 ping 192.168.94.9
  PING 192.168.94.9 (192.168.94.9) 56(84) bytes of data.
  64 bytes from 192.168.94.9: icmp_seq=1 ttl=64 time=6.37 ms
  64 bytes from 192.168.94.9: icmp_seq=2 ttl=64 time=1.02 ms
  64 bytes from 192.168.94.9: icmp_seq=3 ttl=64 time=1.11 ms
  64 bytes from 192.168.94.9: icmp_seq=4 ttl=64 time=0.599 ms

  This is on the Newton release, by the way.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1776778/+subscriptions


