yahoo-eng-team team mailing list archive

[Bug 1749982] [NEW] After l3-agent-router-add inactive router gets all traffic

 

Public bug reported:

OpenStack Ocata (10.0.2)

From time to time, when we try to recover a router on an L3 agent, the
traffic ends up in the recovered but inactive router instance.

Expected result: the router gets added, but traffic continues to flow to
the already existing, active router on another L3 agent.

Actual result: the router gets added and is inactive, but traffic from
the outside world ends up on the newly created, inactive router. As a
result, there is no connection to the active router or to the VMs behind it.

Workaround: Send a packet (e.g. a ping) from the active router namespace,
or ping from an instance behind the active router. This updates the
ARP tables in the outside world, and traffic flows again.
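
A minimal sketch of the namespace ping described above, to be run on the
network node hosting the active instance. It assumes the usual
qrouter-<router UUID> namespace naming of the L3 agent. The helper name
build_arp_refresh_cmd and the target address 192.0.2.1 are hypothetical
placeholders, not taken from this deployment. The helper only prints the
command so it can be checked before being run as root.

```shell
# Hypothetical helper: prints the workaround command instead of running it.
# $1 = router UUID, $2 = an address reachable through the external network
#      (assumption: e.g. the external gateway).
build_arp_refresh_cmd() {
    printf 'ip netns exec qrouter-%s ping -c 3 %s\n' "$1" "$2"
}

# Placeholder values; substitute the real router UUID and target address.
build_arp_refresh_cmd "${routerID:-ROUTER_ID}" 192.0.2.1
```

Pinging an outside address from the namespace makes the active router
emit traffic from its own MAC address, which is what lets upstream
devices relearn where the router address lives.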

Steps to reproduce:
1. Start with one router that has one active and one standby instance on different network nodes.
2. Remove the active router instance -> failover works nicely, 3-5 pings lost, the inactive router instance becomes active.
3. Add the router again -> the router gets created and is in standby status, but the ARP tables of the outside world get changed so they point to the added, inactive router -> connectivity lost.

Commands used:

# neutron l3-agent-list-hosting-router $routerID
+--------------------------------------+-------+----------------+-------+----------+
| id                                   | host  | admin_state_up | alive | ha_state |
+--------------------------------------+-------+----------------+-------+----------+
| $router0                             | net00 | True           | :-)   | active   |
| $router1                             | net01 | True           | :-)   | standby  |
+--------------------------------------+-------+----------------+-------+----------+
# neutron l3-agent-router-remove $router0 $routerID
# neutron l3-agent-router-add $router0 $routerID


Packages used:
neutron-common/xenial-updates,now 2:10.0.4-0ubuntu1~cloud0 all [installed,automatic]
neutron-lbaas-common/xenial-updates,now 2:10.0.1-0ubuntu1~cloud0 all [installed]
neutron-metadata-agent/xenial-updates,now 2:10.0.4-0ubuntu1~cloud0 all [installed,automatic]
neutron-plugin-ml2/xenial-updates,now 2:10.0.4-0ubuntu1~cloud0 all [installed]
neutron-server/xenial-updates,now 2:10.0.4-0ubuntu1~cloud0 all [installed]
neutron-vpn-agent/xenial-updates,now 2:10.0.0-0ubuntu1~cloud0 all [installed]

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1749982

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1749982/+subscriptions