yahoo-eng-team team mailing list archive
Message #54746
[Bug 1557909] Re: SNAT namespace is not getting cleared after the manual move of SNAT with dead agent
Reviewed: https://review.openstack.org/326729
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=acd04d668bd414cd21f2715adc6a35a0eaed59a3
Submitter: Jenkins
Branch: master
commit acd04d668bd414cd21f2715adc6a35a0eaed59a3
Author: Swaminathan Vasudevan <swaminathan.vasudevan@xxxxxxx>
Date: Tue Jun 7 13:31:56 2016 -0700
DVR: Clean stale snat-ns by checking its existence when agent restarts
At present there is no clear way to distinguish between the
snat_namespace object being initialized and the actual namespace being
created. There is no way to check whether the namespace already exists.
The code only checked for the snat_namespace object instead of checking
whether the namespace itself exists.
This patch addresses the issue by adding in an exists method to the
namespace object to identify the existence of the namespace in the
given agent.
This would allow us to check for the existence of the namespace,
and also allow us to identify the stale snat namespace and
delete the namespace when the gateway is cleared as the agent restarts.
This also applies for conditions when the router is manually moved
from one agent to another agent while the agent is dead. When the
agent wakes up it would clean up the stale snat namespace.
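The idea in the commit message can be illustrated with a minimal sketch. This is not the Neutron implementation: the class `SnatNamespace`, the helper `cleanup_stale_snat`, and the injectable `lister` and `delete` callables are hypothetical names chosen for illustration. The only external command assumed is `ip netns list` from iproute2.

```python
import subprocess


def list_namespaces():
    """Return the namespace names reported by `ip netns list` (iproute2)."""
    out = subprocess.run(
        ["ip", "netns", "list"], capture_output=True, text=True, check=True
    ).stdout
    # Each line starts with the namespace name; some iproute2 versions
    # append extra text such as "(id: N)" after it.
    return [line.split()[0] for line in out.splitlines() if line.strip()]


class SnatNamespace:
    """Hypothetical stand-in for the agent's snat namespace object."""

    PREFIX = "snat-"

    def __init__(self, router_id, lister=list_namespaces):
        # Creating this object does NOT create the namespace on the host.
        self.name = self.PREFIX + router_id
        self._lister = lister

    def exists(self):
        # Ask the host, rather than trusting that the object was created.
        return self.name in self._lister()


def cleanup_stale_snat(router_id, hosts_snat, delete, lister=list_namespaces):
    """Delete the snat namespace if this agent no longer hosts SNAT.

    `hosts_snat` says whether the router's gateway is scheduled here;
    `delete` is a callable that removes a namespace by name (assumed).
    Returns True when a stale namespace was found and deleted.
    """
    ns = SnatNamespace(router_id, lister)
    if not hosts_snat and ns.exists():
        delete(ns.name)
        return True
    return False
```

On agent restart, a check of this shape would run once per router: the namespace is removed only when it is actually present on the host and the agent is no longer responsible for it.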
Change-Id: Icb00297208813436c2a9e9a003275462293ad643
Closes-Bug: #1557909
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1557909
Title:
SNAT namespace is not getting cleared after the manual move of SNAT
with dead agent
Status in neutron:
Fix Released
Bug description:
Latest patch (2016-06-10): https://review.openstack.org/#/c/326729/
Stale snat namespace on the controller after recovery of dead l3
agent.
Note: Only on Stable/LIBERTY Branch:
Setup:
Multiple controller (DVR_SNAT) setup.
Steps:
1) Create tenant network, subnet and router.
2) Create an external network.
3) Attach the internal and external networks to a router.
4) Create VM on above tenant network.
5) Make sure VM can reach outside using CSNAT.
6) Find the l3 agent hosting the router and stop that l3 agent.
7) Manually move the router to the other controller (dvr_snat mode). The SNAT namespace should be created on the new controller node.
8) Start the l3 agent on the controller where it was stopped in step 6.
9) Notice that the snat namespace is now present on both controllers and is not deleted by the agent that no longer hosts it.
Example:
| cfa97c12-b975-4515-86c3-9710c9b88d76 | L3 agent | vm2-ctl2-936 | :-) | True | neutron-l3-agent |
| df4ca7c5-9bae-4cfb-bc83-216612b2b378 | L3 agent | vm1-ctl1-936 | :-) | True | neutron-l3-agent |
mysql> select * from csnat_l3_agent_bindings;
+--------------------------------------+--------------------------------------+---------+------------------+
| router_id                            | l3_agent_id                          | host_id | csnat_gw_port_id |
+--------------------------------------+--------------------------------------+---------+------------------+
| 0fb68420-9e69-41bb-8a88-8ab53b0faabb | cfa97c12-b975-4515-86c3-9710c9b88d76 | NULL    | NULL             |
+--------------------------------------+--------------------------------------+---------+------------------+
On vm1-ctl1-936
Stale SNAT namespace on the initially hosting controller.
ubuntu@vm1-ctl1-936:~/devstack$ sudo ip netns
snat-0fb68420-9e69-41bb-8a88-8ab53b0faabb
qrouter-0fb68420-9e69-41bb-8a88-8ab53b0faabb
On vm2-ctl2-936 (2nd Controller)
ubuntu@vm2-ctl2-936:~$ ip netns
snat-0fb68420-9e69-41bb-8a88-8ab53b0faabb
qrouter-0fb68420-9e69-41bb-8a88-8ab53b0faabb
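The evidence above (agent list, csnat binding row, and `ip netns` output per host) can be cross-checked mechanically. The following is a rough sketch under stated assumptions: the function `find_stale_snat` and its three dictionary inputs are hypothetical, and the data is copied by hand from this report rather than queried from Neutron.

```python
def find_stale_snat(bindings, agent_hosts, namespaces_by_host):
    """Return (host, namespace) pairs for snat namespaces on the wrong host.

    bindings:           router_id -> l3_agent_id (csnat_l3_agent_bindings)
    agent_hosts:        l3_agent_id -> host name (agent list)
    namespaces_by_host: host name -> namespace names (`ip netns` output)
    """
    stale = []
    for host, names in sorted(namespaces_by_host.items()):
        for name in names:
            if not name.startswith("snat-"):
                continue  # qrouter namespaces are expected on every node
            router_id = name[len("snat-"):]
            bound_host = agent_hosts.get(bindings.get(router_id))
            if bound_host != host:
                stale.append((host, name))
    return stale


# Data transcribed from the example in this bug report.
ROUTER = "0fb68420-9e69-41bb-8a88-8ab53b0faabb"
bindings = {ROUTER: "cfa97c12-b975-4515-86c3-9710c9b88d76"}
agent_hosts = {
    "cfa97c12-b975-4515-86c3-9710c9b88d76": "vm2-ctl2-936",
    "df4ca7c5-9bae-4cfb-bc83-216612b2b378": "vm1-ctl1-936",
}
namespaces_by_host = {
    "vm1-ctl1-936": ["snat-" + ROUTER, "qrouter-" + ROUTER],
    "vm2-ctl2-936": ["snat-" + ROUTER, "qrouter-" + ROUTER],
}

stale = find_stale_snat(bindings, agent_hosts, namespaces_by_host)
```

With this data, the only entry flagged is the snat namespace on vm1-ctl1-936, since the binding row points at the agent running on vm2-ctl2-936.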