
yahoo-eng-team team mailing list archive

[Bug 1883089] Re: [L3] floating IP failed to bind due to no agent gateway port(fip-ns)

 

The fix for UCA Ussuri has been released to ussuri-updates in package
2:16.3.2-0ubuntu3~cloud0, so I am marking the status as Fix Released for
UCA Ussuri.

** Changed in: cloud-archive/ussuri
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1883089

Title:
  [L3] floating IP failed to bind due to no agent gateway port(fip-ns)

Status in Ubuntu Cloud Archive:
  New
Status in Ubuntu Cloud Archive ussuri series:
  Fix Released
Status in Ubuntu Cloud Archive victoria series:
  Fix Released
Status in neutron:
  Fix Released
Status in neutron package in Ubuntu:
  Fix Released
Status in neutron source package in Focal:
  Fix Released
Status in neutron source package in Groovy:
  Fix Released
Status in neutron source package in Hirsute:
  Fix Released
Status in neutron source package in Impish:
  Fix Released

Bug description:
  Patch [1] introduced a DB unique constraint on the L3 agent gateway
  binding. In some extreme cases the DvrFipGatewayPortAgentBinding
  exists in the DB while the gateway port does not. The current code
  path only checks that the binding exists, so it passes a "None" port
  to the following code path, which results in an AttributeError.

  [1] https://review.opendev.org/#/c/702547/

  Exception log:

  2020-06-11 15:39:28.361 1285214 INFO neutron.db.l3_dvr_db [None req-d6a41187-2495-46bf-a424-ab7195c0ecb1 - - - - -] Floating IP Agent Gateway port for network 3fcb7702-ae0b-46b4-807f-8ae94d656dd3 does not exist on host host-compute-1. Creating one.
  2020-06-11 15:39:28.370 1285214 DEBUG neutron.db.l3_dvr_db [None req-d6a41187-2495-46bf-a424-ab7195c0ecb1 - - - - -] Floating IP Agent Gateway port for network 3fcb7702-ae0b-46b4-807f-8ae94d656dd3 already exists on host host-compute-1. Probably it was just created by other worker. create_fip_agent_gw_port_if_not_exists /usr/lib/python2.7/site-packages/neutron/db/l3_dvr_db.py:927
  2020-06-11 15:39:28.390 1285214 DEBUG neutron.db.l3_dvr_db [None req-d6a41187-2495-46bf-a424-ab7195c0ecb1 - - - - -] Floating IP Agent Gateway port None found for the destination host: host-compute-1 create_fip_agent_gw_port_if_not_exists /usr/lib/python2.7/site-packages/neutron/db/l3_dvr_db.py:933
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server [None req-d6a41187-2495-46bf-a424-ab7195c0ecb1 - - - - -] Exception during message handling: AttributeError: 'NoneType' object has no attribute 'get'
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 170, in _process_incoming
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 91, in wrapped
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     setattr(e, '_RETRY_EXCEEDED', True)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 87, in wrapped
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     ectxt.value = e.inner_exc
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 135, in wrapper
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 126, in wrapped
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     LOG.debug("Retry wrapper got retriable exception: %s", e)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 122, in wrapped
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     return f(*dup_args, **dup_kwargs)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/api/rpc/handlers/l3_rpc.py", line 348, in get_agent_gateway_port
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     admin_ctx, network_id, host)
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/db/l3_dvr_db.py", line 953, in create_fip_agent_gw_port_if_not_exists
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     self._populate_mtu_and_subnets_for_ports(context, [agent_port])
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/db/l3_db.py", line 1978, in _populate_mtu_and_subnets_for_ports
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     for p in self._each_port_having_fixed_ips(ports)]
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/neutron/db/l3_db.py", line 1925, in _each_port_having_fixed_ips
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server     fixed_ips = port.get('fixed_ips', [])
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server AttributeError: 'NoneType' object has no attribute 'get'
  2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.rpc.server
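
  The last frames of the traceback can be illustrated in isolation. The
  snippet below is a minimal sketch, not neutron code: the function is a
  simplified stand-in for _each_port_having_fixed_ips that only keeps the
  port.get('fixed_ips', []) call shown in the log, and agent_port is set to
  None by hand to mimic a binding whose port is missing.

```python
def each_port_having_fixed_ips(ports):
    """Simplified stand-in for the helper named in the traceback."""
    for port in ports or []:
        # Same call as the failing line in the traceback.
        fixed_ips = port.get('fixed_ips', [])
        if fixed_ips:
            yield port


# Assumption for illustration: the binding row exists, but looking up the
# gateway port returned nothing, so the caller holds None.
agent_port = None

try:
    list(each_port_having_fixed_ips([agent_port]))
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

  Any caller that wraps the missing port in a list, as
  create_fip_agent_gw_port_if_not_exists does in the traceback, reaches the
  same AttributeError.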

  -------------------------------------------------------------------------

  [SRU]

  [Impact]
  In some cases the DvrFipGatewayPortAgentBinding is in the DB but the gateway port does not exist.
  This results in FIP connectivity issues for new VMs launched on that compute node.
  The fix creates the gateway port if it does not exist.
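
  The idea behind the fix can be sketched as follows. This is a hedged
  illustration using in-memory dicts, not the actual neutron patch: the
  function name get_or_create_fip_agent_gw_port, the bindings and ports
  dicts, and the create_port callback are all hypothetical. The point is
  that the decision to create a port depends on the port itself being
  present, not only on the binding.

```python
def get_or_create_fip_agent_gw_port(bindings, ports, network_id, host,
                                    create_port):
    """Return the gateway port for (network_id, host), creating it if needed.

    bindings maps (network_id, host) -> port id; ports maps port id -> port.
    Both are hypothetical in-memory stand-ins for the DB tables.
    """
    key = (network_id, host)
    port_id = bindings.get(key)
    port = ports.get(port_id) if port_id is not None else None
    if port is None:
        # Either there is no binding yet, or the binding is stale because
        # the port it points to no longer exists. Create the port in both
        # cases and refresh the binding.
        port = create_port(network_id, host)
        ports[port['id']] = port
        bindings[key] = port['id']
    return port


created = []


def fake_create_port(network_id, host):
    # Hypothetical port factory used only for this example.
    port = {'id': 'new-port', 'network_id': network_id,
            'binding:host_id': host,
            'fixed_ips': [{'ip_address': '203.0.113.10'}]}
    created.append(port['id'])
    return port


# Stale binding: the binding row exists, the port does not.
bindings = {('ext-net', 'host-compute-1'): 'deleted-port'}
ports = {}

first = get_or_create_fip_agent_gw_port(
    bindings, ports, 'ext-net', 'host-compute-1', fake_create_port)
second = get_or_create_fip_agent_gw_port(
    bindings, ports, 'ext-net', 'host-compute-1', fake_create_port)
```

  With a stale binding the first call creates the port and repairs the
  binding; the second call finds the existing port and creates nothing.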

  [Test Plan]
  This is a race condition and is difficult to reproduce. The test case simulates the error condition to verify the fix.

  * Deploy OpenStack with DVR, L3 HA and centralised SNAT on the neutron nodes
  * Deploy instances and delete them. This step ensures that FIP Agent gateways are created on the compute nodes

    Run the following command to see the FIP Agent gateway information
    openstack port list --network ext_net -c id -c device_id -c binding_host_id -c device_owner -c fixed_ips | grep floatingip_agent_gateway

  * Pick one of the compute nodes that has no instances and delete its FIP Agent gateway port (the port ID can be determined from the above command)
    openstack port delete <port id>

  * Launch an instance on the compute node
    openstack server create --wait --image cirros --flavor m1.cirros --nic net-id=<network id> --availability-zone nova:<hostname> cirros-test1

  * Check the neutron-server logs for the error
    ERROR oslo_messaging.rpc.server AttributeError: 'NoneType' object has no attribute 'get'

  * Assign a floating IP to the instance and try to ping it; the ping fails

  [Where problems could occur]
  The fix itself only adds an extra check to determine when the gateway port needs to be created,
  so it is not expected to cause any regression.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1883089/+subscriptions

