yahoo-eng-team team mailing list archive

[Bug 2003532] [NEW] Floating IP stuck in snat-ns after binding host to associated fixed IP

 

Public bug reported:

We encountered a problem where the floating IP (FIP) is not removed from the snat-ns when the FIP moves from the centralized to the distributed state (i.e. when a host is bound to the port holding the associated fixed IP address).
This happens when the port holding the fixed IP was originally created with a non-empty device_owner field.

Steps to reproduce:
Create a distributed router, a port on a private network, and a FIP associated with this port:

[root@devstack0 ~]# openstack router create --distributed r1 --external-gateway public
[root@devstack0 ~]# openstack router add subnet r1 private
[root@devstack0 ~]# openstack port create my-port --network private --device-owner compute:nova
+--------------+-------------------------------------------------------------------------------+
| Field        | Value                                                                         |
+--------------+-------------------------------------------------------------------------------+
| device_owner | compute:nova                                                                  |
| fixed_ips    | ip_address='192.168.10.133', subnet_id='8ec1cd23-363a-474c-a53f-bab4692c312f' |
+--------------+-------------------------------------------------------------------------------+
[root@devstack0 ~]# openstack floating ip create public --port my-port -c floating_ip_address
+---------------------+---------------+
| Field               | Value         |
+---------------------+---------------+
| floating_ip_address | 10.136.17.171 |
+---------------------+---------------+
[root@devstack0 ~]#

The FIP is added to the snat-ns:

[root@devstack0 ~]# ip netns exec snat-b961c902-8cd9-4c5c-a03c-6595368a2314 ip a
...
38: qg-6a663b96-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:bf:85:ab brd ff:ff:ff:ff:ff:ff
    inet 10.136.17.175/20 brd 10.136.31.255 scope global qg-6a663b96-e1
       valid_lft forever preferred_lft forever
    inet 10.136.17.171/32 brd 10.136.17.171 scope global qg-6a663b96-e1
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:febf:85ab/64 scope link
       valid_lft forever preferred_lft forever
...
[root@devstack0 ~]#

Create a VM with `my-port` and boot it on another node:

[root@devstack0 ~]# openstack server create vm --port my-port --image cirros-0.5.2-x86_64-disk --flavor 1 --host devstack2


Check the FIP state on the node with the VM (OK):

[root@devstack2 ~]# ip netns exec qrouter-b961c902-8cd9-4c5c-a03c-6595368a2314 ip rule
...
65426:  from 192.168.10.133 lookup 16
3232238081:     from 192.168.10.1/24 lookup 3232238081
[root@devstack2 ~]#

Check the FIP on the node with the snat-ns (not OK, it is still there):

[root@devstack0 ~]# ip netns exec snat-b961c902-8cd9-4c5c-a03c-6595368a2314 ip a
...
38: qg-6a663b96-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:bf:85:ab brd ff:ff:ff:ff:ff:ff
    inet 10.136.17.175/20 brd 10.136.31.255 scope global qg-6a663b96-e1
       valid_lft forever preferred_lft forever
    inet 10.136.17.171/32 brd 10.136.17.171 scope global qg-6a663b96-e1
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:febf:85ab/64 scope link
       valid_lft forever preferred_lft forever
...
[root@devstack0 ~]#
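
The manual check above can be automated when reproducing the issue repeatedly. The following is only a minimal sketch: the helper names (`addresses_in_output`, `fip_still_in_snat_ns`) are hypothetical and not part of Neutron, and the sample text is copied from the output shown above.

```python
import re
from typing import Set


def addresses_in_output(ip_addr_output: str) -> Set[str]:
    """Collect the IPv4 addresses (without prefix length) from `ip a` output."""
    found = set()
    for line in ip_addr_output.splitlines():
        # Matches "inet <addr>/<prefix>"; "inet6" lines do not match
        # because whitespace is required directly after "inet".
        match = re.match(r"\s*inet\s+(\d+\.\d+\.\d+\.\d+)/\d+", line)
        if match:
            found.add(match.group(1))
    return found


def fip_still_in_snat_ns(ip_addr_output: str, floating_ip: str) -> bool:
    """Return True if the floating IP is still configured in the namespace."""
    return floating_ip in addresses_in_output(ip_addr_output)


# Sample taken from the snat-ns output in this report.
SAMPLE = """\
38: qg-6a663b96-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:bf:85:ab brd ff:ff:ff:ff:ff:ff
    inet 10.136.17.175/20 brd 10.136.31.255 scope global qg-6a663b96-e1
       valid_lft forever preferred_lft forever
    inet 10.136.17.171/32 brd 10.136.17.171 scope global qg-6a663b96-e1
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:febf:85ab/64 scope link
       valid_lft forever preferred_lft forever
"""

if __name__ == "__main__":
    print(fip_still_in_snat_ns(SAMPLE, "10.136.17.171"))
```

In practice the text would come from running `ip netns exec snat-<router-id> ip a` on the snat node after the VM has booted; a True result at that point indicates the stale FIP described here.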


We found that the FIP status "moving" notification is not sent to the snat nodes in this scenario, see [1].
There was also a brief discussion about why the notification should be sent only when device_owner changes from empty to non-empty [2].
This behavior looks like a bug.
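
To illustrate why this scenario slips through, here is a simplified model of the condition as described in this report. It is not the actual Neutron code: both function names are hypothetical, the ports are plain dictionaries, and the second function is only one possible alternative check, assumed for illustration.

```python
def notifies_on_device_owner_change(original_port: dict, new_port: dict) -> bool:
    """Model of the reported behavior: notify only when device_owner
    goes from empty to non-empty."""
    return (not original_port.get("device_owner")
            and bool(new_port.get("device_owner")))


def notifies_on_host_binding(original_port: dict, new_port: dict) -> bool:
    """Hypothetical alternative: notify when the port gains a host binding,
    regardless of device_owner."""
    return (not original_port.get("binding:host_id")
            and bool(new_port.get("binding:host_id")))


# The port from the reproduction steps: device_owner is already set at
# creation time, and the host binding only appears when the VM boots.
before = {"device_owner": "compute:nova", "binding:host_id": ""}
after = {"device_owner": "compute:nova", "binding:host_id": "devstack2"}

if __name__ == "__main__":
    print(notifies_on_device_owner_change(before, after))
    print(notifies_on_host_binding(before, after))
```

Under these assumptions the first check never fires for a port created with `--device-owner compute:nova`, which matches the stale FIP seen in the snat-ns, while a check keyed on the host binding would fire.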


[1] https://opendev.org/openstack/neutron/src/commit/c1eff1dd440b2243a4a31cf3c3af06a01e899f1d/neutron/db/l3_dvrscheduler_db.py#L647
[2] https://review.opendev.org/c/openstack/neutron/+/609924/10/neutron/db/l3_dvrscheduler_db.py#503

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2003532
