yahoo-eng-team team mailing list archive

[Bug 1926531] [NEW] SNAT namespace prematurely created then deleted on hosts, resulting in removal of RFP/FPR link to FIP namespace

 

Public bug reported:

Seems like collateral from
https://bugs.launchpad.net/neutron/+bug/1850779


I think this fix causes problems. We have multiple nodes running in
DVR_SNAT mode, and the SNAT namespace is scheduled to only one of them.

When the l3-agent is restarted on the other nodes, initialize() is now
always invoked for DvrEdgeRouter, which creates the SNAT namespace
prematurely. This in turn causes external_gateway_added() to later
detect that this host is NOT hosting the SNAT router but that the
namespace exists, so it removes the namespace by triggering
external_gateway_removed() (dvr_edge_router --> dvr_local_router).

The problem is that the dvr_local_router code for
external_gateway_removed() ends up DELETING the rfp/fpr pair, which
severs the qrouter connection to the fip namespace (and deletes all the
FIP routes in the fip namespace as a result).

Prior to this bug fix, _create_snat_namespace was only invoked in
_create_dvr_gateway(), which was only invoked when the node was actually
hosting SNAT for the router.

Even without the breaking issue of deleting the rtr_2_fip link, this fix
unnecessarily creates the SNAT namespace on every host, only for it to
be deleted.
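
The sequence above can be illustrated with a toy model. This is only a
sketch of the behaviour as described in this report, not neutron code:
ToyHost, hosts_snat, create_snat_on_init, restart_agent and the boolean
rtr_2_fip_link are hypothetical names made up for illustration, and the
method bodies are simplified stand-ins for the real ones.

```python
class ToyHost:
    """Toy stand-in for one dvr_snat node (hypothetical, not neutron code)."""

    def __init__(self, hosts_snat):
        self.hosts_snat = hosts_snat   # is SNAT scheduled to this host?
        self.namespaces = {"fip"}      # namespaces present on the host
        self.rtr_2_fip_link = True     # stands in for the rfp/fpr pair
        self.events = []               # record of what happened, in order

    def _create_snat_namespace(self):
        if "snat" not in self.namespaces:
            self.namespaces.add("snat")
            self.events.append("snat namespace created")

    def initialize(self, create_snat_on_init):
        self.namespaces.add("qrouter")
        # Assumed behaviour after the fix for 1850779: always create it here.
        if create_snat_on_init:
            self._create_snat_namespace()

    def external_gateway_added(self):
        if self.hosts_snat:
            # Assumed earlier path: created only where SNAT is hosted.
            self._create_snat_namespace()
        elif "snat" in self.namespaces:
            # Namespace exists but SNAT is scheduled elsewhere: clean up.
            self.external_gateway_removed()

    def external_gateway_removed(self):
        # As described in the report, the cleanup also removes the rfp/fpr pair.
        self.namespaces.discard("snat")
        self.rtr_2_fip_link = False
        self.events.append("snat namespace deleted, rfp/fpr pair removed")


def restart_agent(hosts_snat, create_snat_on_init):
    """Replay the agent restart sequence on one toy host."""
    host = ToyHost(hosts_snat)
    host.initialize(create_snat_on_init)
    host.external_gateway_added()
    return host


if __name__ == "__main__":
    for on_init in (False, True):
        host = restart_agent(hosts_snat=False, create_snat_on_init=on_init)
        print(on_init, host.rtr_2_fip_link, host.events)
```

In this model, a host that is not hosting SNAT keeps its rtr_2_fip link
only when the namespace is not created during initialize(), which is the
difference the report attributes to the fix for 1850779.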

FYI this is for non-HA routers


1. Where the qrouter to FIP link is deleted:
https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_local_router.py#L599

This results in connectivity breakage

2. #1 above is triggered by this code in the edge router, which sees the
SNAT namespace even though SNAT is scheduled to a different host:
https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_edge_router.py#L56

3. The SNAT namespace is created on the wrong host because of the bug fix
for 1850779, which moved its creation to DvrEdgeRouter initialization

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: l3-dvr-backlog l3-ha

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1926531

Title:
  SNAT namespace prematurely created then deleted on hosts, resulting in
  removal of RFP/FPR link to FIP namespace

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1926531/+subscriptions