yahoo-eng-team team mailing list archive
Message #43814
[Bug 1530070] [NEW] Neutron Netns Cleanup script fails to delete namespaces after reboot
Public bug reported:
After rebooting a node that hosted an active VRRP router, a DHCP agent, and a metadata agent, the neutron-netns-cleanup utility failed to delete the stale namespaces.
The utility fails with:
seting the network namespace "qrouter-3d4e5634-59f0-401e-9f28-6c8daaec311c" failed: Invalid argument
The cause is a bug in iproute, which cannot perform any operation on a stale namespace. Stale namespaces appear in /var/run/netns like this:
root@stratonode66 ~# ls -l /var/run/netns/
total 0
rrr- 1 root root 0 Dec 24 13:38 qdhcp-0a348422-97e2-4ab6-bb22-55994a125823
rrr- 1 root root 0 Dec 24 11:54 qdhcp-2258aa3f-d256-4c9f-9e48-16811fc57981
rrr- 1 root root 0 Dec 24 13:38 qdhcp-3ceb1f27-e3fc-413a-a184-567041f073e2
rrr- 1 root root 0 Dec 24 11:54 qdhcp-62a51b66-d0e2-42fc-bdf2-2d622a889e75
rrr- 1 root root 0 Dec 24 11:54 qdhcp-81b550a2-c483-4280-a83a-b560ecdc416b
---------- 1 root root 0 Dec 23 13:54 qrouter-3d4e5634-59f0-401e-9f28-6c8daaec311c
---------- 1 root root 0 Dec 24 11:25 qrouter-69d20923-da78-4c6b-bb24-967dd67acb1d
---------- 1 root root 0 Dec 23 13:54 qrouter-cc649801-96ec-4d59-90de-1004fc026024
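One way to spot such entries before running the cleanup is sketched below. It relies on the fact that "ip netns add" bind-mounts the namespace onto its file under /var/run/netns, so an entry that is not listed as a mount point in /proc/self/mountinfo is a leftover file that cannot be entered. The function names parse_mount_points and find_stale_namespaces are hypothetical helpers written for this illustration; they are not part of neutron or iproute.

```python
import os


def parse_mount_points(mountinfo_text):
    """Return the set of mount point paths found in mountinfo-formatted text."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            # The fifth field of each mountinfo line is the mount point.
            points.add(fields[4])
    return points


def find_stale_namespaces(netns_dir, mount_points):
    """List entries in netns_dir that are not mount points (assumed stale)."""
    # /var/run is often a symlink to /run, and mountinfo reports the resolved
    # path, so resolve the directory before comparing.
    real_dir = os.path.realpath(netns_dir)
    stale = []
    for name in sorted(os.listdir(real_dir)):
        if os.path.join(real_dir, name) not in mount_points:
            stale.append(name)
    return stale
```

On a Linux node this could be called as find_stale_namespaces("/var/run/netns", parse_mount_points(open("/proc/self/mountinfo").read())). It only reports candidates and removes nothing.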
This bug is related, but its fix doesn't solve the issue after a reboot:
https://bugs.launchpad.net/neutron/+bug/1052535.
I solved it by fixing the neutron-netns-cleanup --force code path with this patch:
diff --git a/neutron/agent/netns_cleanup_util.py b/neutron/agent/netns_cleanup_util.py
index 771a77f..3c43480 100644
--- a/neutron/agent/netns_cleanup_util.py
+++ b/neutron/agent/netns_cleanup_util.py
@@ -132,8 +132,13 @@ def destroy_namespace(conf, namespace, force=False):
             # NOTE: The dhcp driver will remove the namespace if is it empty,
             # so a second check is required here.
             if ip.netns.exists(namespace):
-                for device in ip.get_devices(exclude_loopback=True):
-                    unplug_device(conf, device)
+                try:
+                    for device in ip.get_devices(exclude_loopback=True):
+                        unplug_device(conf, device)
+                except RuntimeError:
+                    LOG.info(_('Keep calm, and destroy namespace: %s'), namespace)
+                    ip.netns.delete(namespace)
+                    return
 
         ip.garbage_collect_namespace()
     except Exception:
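The control flow of the patch can be shown in isolation with a small stand-alone sketch. FakeStaleNamespace and force_destroy are hypothetical stand-ins invented for this illustration and do not reproduce neutron's ip_lib API. The point is only that when listing devices raises RuntimeError because the namespace cannot be entered, the entry is deleted directly instead of aborting the cleanup.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeStaleNamespace:
    """Hypothetical stand-in for a namespace wrapper whose target is stale."""

    def __init__(self, name):
        self.name = name
        self.deleted = False

    def get_devices(self):
        # A stale entry cannot be entered, so any operation inside it fails.
        raise RuntimeError('cannot enter namespace "%s"' % self.name)

    def delete(self):
        self.deleted = True


def force_destroy(ns, unplug):
    """Unplug every device in ns; if ns cannot be entered, delete it directly.

    Returns True when the fallback deletion was used, False when the devices
    were unplugged and the normal cleanup path should continue.
    """
    try:
        for device in ns.get_devices():
            unplug(device)
    except RuntimeError:
        LOG.info("Cannot enter namespace %s, deleting it directly", ns.name)
        ns.delete()
        return True
    return False
```

As in the patch, only RuntimeError triggers the fallback, so other failures still surface through the caller's own error handling.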
When I run the patched neutron-netns-cleanup --force after a reboot, the namespaces are cleaned up, and they are recreated when neutron-openvswitch-agent.service, neutron-dhcp-agent.service, and neutron-l3-agent.service are started.
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1530070
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1530070/+subscriptions