
yahoo-eng-team team mailing list archive

[Bug 1662804] Re: [SRU] Agent is failing to process HA router if initialize() fails

 

This bug was fixed in the package neutron - 2:9.4.0-0ubuntu1~cloud0
---------------

 neutron (2:9.4.0-0ubuntu1~cloud0) xenial-newton; urgency=medium
 .
   * New upstream release for the Ubuntu Cloud Archive.
 .
 neutron (2:9.4.0-0ubuntu1) yakkety; urgency=medium
 .
   * New upstream point release for OpenStack Newton (LP: #1696133, #1662804).


** Changed in: cloud-archive/newton
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1662804

Title:
  [SRU] Agent is failing to process HA router if initialize() fails

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive mitaka series:
  Fix Released
Status in Ubuntu Cloud Archive newton series:
  Fix Released
Status in neutron:
  Fix Released
Status in neutron package in Ubuntu:
  Fix Released
Status in neutron source package in Xenial:
  Fix Released
Status in neutron source package in Yakkety:
  Fix Released

Bug description:
  [Impact]

  This patch resolves, amongst other things, issues with a create and
  delete router request race condition when using l3 HA. At the time of
  backport this patch is already available from Ocata onwards and has
  been verified as sufficiently minimal and safe for backport to Newton
  and Mitaka. Essentially, the error case results from an incorrectly
  initialised router update action being executed without proper checks,
  and this patch fixes that.

  [Test Case]

   * Deploy OpenStack Mitaka - http://pastebin.ubuntu.com/24637244/ -
  with neutron-l3-agent configured to provide HA (vrrp) routers.

   * Repeatedly create and delete routers in rapid succession and check
  that the l3 agent does not go into an infinite error loop, i.e. run
  http://pastebin.ubuntu.com/24634950/ and run tail -F
  /var/log/neutron/neutron-l3-agent.log on all l3 agent units. Also
  check that qrouter- namespaces are not stacking up. For Mitaka I
  typically hit the error after ~20 create/deletes.
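
  The create/delete loop in this test case could be sketched roughly as
  follows. This is not the script from the pastebin link, which is not
  reproduced here. The router name prefix, the default count of 20, the
  injectable run argument, and the use of the "openstack router create" and
  "openstack router delete" commands are all assumptions for illustration;
  routers are assumed to come up as HA routers through the deployment's own
  configuration.

```python
import subprocess


def churn_routers(count=20, prefix="ha-race-test", run=subprocess.check_call):
    """Create and immediately delete `count` routers, one after another.

    `run` is a hypothetical injection point so the loop can be exercised
    without a cloud; by default each command goes through
    subprocess.check_call.
    """
    commands = []
    for i in range(count):
        name = "%s-%d" % (prefix, i)
        for cmd in (["openstack", "router", "create", name],
                    ["openstack", "router", "delete", name]):
            run(cmd)
            commands.append(cmd)
    return commands


def count_qrouter_namespaces(netns_output):
    """Count qrouter- namespaces in the text printed by `ip netns list`."""
    return sum(1 for line in netns_output.splitlines()
               if line.strip().startswith("qrouter-"))
```

  Calling churn_routers() on a client node while tailing the l3 agent log,
  and then feeding the output of "ip netns list" from each l3 agent unit
  to count_qrouter_namespaces(), gives a quick check that namespaces are
  not accumulating.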

  [Regression Potential]

   * I do not envisage any regression potential from this patch.

  ====

  When the HA router initialize() function fails for some reason (rabbitmq
  restart or no ha_port), keepalived_manager or KeepalivedInstance won't
  be configured. In this case, _process_router_if_compatible fails with
  an exception, and _resync_router(update) then tries to process this
  router again in a loop. Because initialize() is attempted only once (and
  that attempt failed), every retry of _process_router_if_compatible fails
  (no keepalived manager or instance) and the router is never configured
  (see the trace below).

  2017-02-06 18:34:18.539 26120 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-114a72fe-02ae-4b87-a2e7-70f962df0951', 'ip', '-o', 'link', 'show', 'qr-e63406e1-e7'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:101
  2017-02-06 18:34:18.544 26120 DEBUG neutron.agent.linux.utils [-]
  Command: ['ip', 'netns', 'exec', u'qrouter-114a72fe-02ae-4b87-a2e7-70f962df0951', 'ip', '-o', 'link', 'show', u'qr-e63406e1-e7']
  Exit code: 0
   execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:156
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'get_process'
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info Traceback (most recent call last):
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     return func(*args, **kwargs)
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 744, in process
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     self._process_internal_ports(agent.pd)
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 394, in _process_internal_ports
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     self.internal_network_added(p)
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 275, in internal_network_added
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     self._disable_ipv6_addressing_on_interface(interface_name)
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 235, in _disable_ipv6_addressing_on_interface
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     if self._should_delete_ipv6_lladdr(ipv6_lladdr):
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 217, in _should_delete_ipv6_lladdr
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info     if manager.get_process().active:
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info AttributeError: 'NoneType' object has no attribute 'get_process'
  2017-02-06 18:34:18.544 26120 ERROR neutron.agent.l3.router_info
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '114a72fe-02ae-4b87-a2e7-70f962df0951'
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent Traceback (most recent call last):
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 506, in _process_router_update
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 445, in _process_router_if_compatible
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._process_updated_router(router)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 459, in _process_updated_router
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     ri.process(self)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 377, in process
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     super(HaRouter, self).process(agent)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 362, in call
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self.logger(e)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 204, in __exit__
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     return func(*args, **kwargs)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 744, in process
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._process_internal_ports(agent.pd)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 394, in _process_internal_ports
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self.internal_network_added(p)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 275, in internal_network_added
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     self._disable_ipv6_addressing_on_interface(interface_name)
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 235, in _disable_ipv6_addressing_on_interface
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     if self._should_delete_ipv6_lladdr(ipv6_lladdr):
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 217, in _should_delete_ipv6_lladdr
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent     if manager.get_process().active:
  2017-02-06 18:34:18.549 26120 ERROR neutron.agent.l3.agent AttributeError: 'NoneType' object has no attribute 'get_process'
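
  The failure mode in the trace above can be modelled with a small
  self-contained sketch. FakeHaRouter and resync are hypothetical
  stand-ins, not neutron classes or functions, and the retry_initialize
  guard is only one illustrative way to break the loop; it is not claimed
  to be the shape of the upstream patch.

```python
class FakeHaRouter(object):
    """Toy stand-in for an HA router; not the neutron HaRouter class."""

    def __init__(self, init_succeeds):
        self.init_succeeds = init_succeeds
        self.keepalived_manager = None

    def initialize(self):
        if not self.init_succeeds:
            raise RuntimeError("initialize() failed (e.g. no ha_port)")
        self.keepalived_manager = object()

    def process(self):
        # Mirrors the traceback: the manager is used while it is still None.
        if self.keepalived_manager is None:
            raise AttributeError(
                "'NoneType' object has no attribute 'get_process'")
        return "configured"


def resync(router, attempts, retry_initialize):
    """Process `router` up to `attempts` times and record each outcome."""
    outcomes = []
    for _ in range(attempts):
        try:
            # Assumed guard: re-run initialize() if it never completed.
            if retry_initialize and router.keepalived_manager is None:
                router.initialize()
            outcomes.append(router.process())
        except (AttributeError, RuntimeError) as exc:
            outcomes.append(type(exc).__name__)
    return outcomes


router = FakeHaRouter(init_succeeds=False)
try:
    router.initialize()          # the single, failed initialisation
except RuntimeError:
    pass
router.init_succeeds = True      # the transient cause has since cleared

stuck = resync(router, 3, retry_initialize=False)
recovered = resync(router, 3, retry_initialize=True)
```

  Without the guard, every attempt ends in the same AttributeError even
  though the original cause has gone away; with it, the first retry
  completes initialisation and the router is processed normally.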

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1662804/+subscriptions

