yahoo-eng-team team mailing list archive

[Bug 1482521] [NEW] L3 Agent in DVR mode is removing FIP namespace on startup

 

Public bug reported:

Restarting the L3 agent in DVR mode causes network downtime for VMs that have a floating IP configured.
The cause is that the FIP namespace is removed at startup.

Steps to reproduce:
1. Configure OpenStack with a tenant network and an external network with floating IPs.
2. Launch a VM and assign a floating IP to it.
3. Ping the VM from a machine on the external network.
4. Restart the L3 agent on the compute node hosting the VM.
5. Observe that a few pings are lost.
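
To put a number on step 5, the ping output can be scanned for gaps in the sequence numbers. This is only a rough sketch: it assumes Linux iputils-style output containing "icmp_seq=N", and the function name and sample address are made up for illustration.

```python
import re

# Matches the sequence number in lines such as:
#   64 bytes from 203.0.113.10: icmp_seq=3 ttl=63 time=0.41 ms
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def missing_sequences(ping_output):
    """Return the icmp_seq numbers that never got a reply.

    Hypothetical helper: only gaps between the first and last reply
    seen are reported, so losses at the very start or end are missed.
    """
    seen = {int(m.group(1)) for m in SEQ_RE.finditer(ping_output)}
    if not seen:
        return []
    expected = set(range(min(seen), max(seen) + 1))
    return sorted(expected - seen)


if __name__ == "__main__":
    # Made-up sample: replies 3 and 4 are missing around the agent restart.
    sample = "\n".join(
        "64 bytes from 203.0.113.10: icmp_seq=%d ttl=63 time=0.4 ms" % n
        for n in (1, 2, 5, 6)
    )
    print(missing_sequences(sample))
```

Saving the output of the ping from step 3 to a file and passing its contents to missing_sequences() gives the lost sequence numbers around the restart.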

I suspect the problem occurs at startup, when the existing network
namespaces are enumerated: the FIP namespace is not included in the
message from the L3 server, so it is treated as stale and removed.
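
The suspected behaviour can be modelled in a few lines. This is a simplified sketch and not the actual namespace_manager code: the class, the sync() function and the fip_networks parameter are hypothetical, and only the "qrouter-" and "fip-" name prefixes are taken from how Neutron names its namespaces.

```python
# Illustrative model only; this is NOT Neutron's namespace_manager code.
ROUTER_PREFIX = "qrouter-"
FIP_PREFIX = "fip-"


class NamespaceTracker(object):
    """Deletes every namespace that was not re-registered during a sync."""

    def __init__(self, existing, delete):
        self._existing = set(existing)  # namespaces found on the host at startup
        self._delete = delete           # callback that removes one namespace
        self._kept = set()

    def keep(self, name):
        self._kept.add(name)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Clean up only after a successful sync, mirroring the
        # __exit__ -> _cleanup -> delete path seen in the traceback.
        if exc_type is None:
            for name in sorted(self._existing - self._kept):
                self._delete(name)
        return False


def sync(existing, routers_from_server, fip_networks=()):
    """Return the namespaces that one full sync would delete."""
    deleted = []
    with NamespaceTracker(existing, deleted.append) as tracker:
        for router_id in routers_from_server:
            tracker.keep(ROUTER_PREFIX + router_id)
        # Assumed remedy: also register FIP namespaces that are still in use.
        for net_id in fip_networks:
            tracker.keep(FIP_PREFIX + net_id)
    return deleted


if __name__ == "__main__":
    host = ["qrouter-r1", "fip-net1"]
    print(sync(host, ["r1"]))            # FIP namespace is treated as stale
    print(sync(host, ["r1"], ["net1"]))  # FIP namespace survives the sync
```

In this model the first call deletes the FIP namespace because nothing marked it as still in use during the sync, which matches the behaviour described above; the second call shows one possible way to avoid that, under the assumption that the agent can learn which external networks still need a FIP namespace.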

To find the caller, I raised an exception in /neutron/neutron/agent/l3/dvr_fip_ns.py FipNamespace.delete(). The resulting log and traceback:
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.external_process [-] No process started for 8223e12e-837b-49d4-9793-63603fccbc9f from (pid=70216) disable /opt/openstack/neutron/neutron/agent/linux/external_process.py:113
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 117, in switch
    self.greenlet.switch(value)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 612, in run_service
    service.start()
  File "/opt/openstack/neutron/neutron/service.py", line 233, in start
    self.manager.after_start()
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 641, in after_start
    self.periodic_sync_routers_task(self.context)
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 519, in periodic_sync_routers_task
    self.fetch_and_sync_all_routers(context, ns_manager)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 91, in __exit__
    self._cleanup(_ns_prefix, ns_id)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 140, in _cleanup
    ns.delete()
  File "/opt/openstack/neutron/neutron/agent/l3/dvr_fip_ns.py", line 147, in delete
    raise TypeError("ss")
TypeError: ss

The full log is here:
http://pastebin.com/xRa77kk6

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1482521

Title:
  L3 Agent in DVR mode is removing FIP namespace on startup

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1482521/+subscriptions

