yahoo-eng-team team mailing list archive
Message #72029
[Bug 1755243] Re: AttributeError when updating DvrEdgeRouter objects running on network nodes
Reviewed: https://review.openstack.org/552097
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8c2dae659a806fdc20331de4b8a917ec3ae0e6f6
Submitter: Zuul
Branch: master
commit 8c2dae659a806fdc20331de4b8a917ec3ae0e6f6
Author: Daniel Gonzalez <daniel@xxxxxxxxxxxxxxxxxxxxx>
Date: Mon Mar 12 17:48:54 2018 +0100
Fix l3-agent crash on routers without ha_state
l3-agent checks the HA state of routers when a router is updated.
To ensure that the HA state is only checked on HA routers the following
check is performed: `if router.get('ha') and not is_dvr_only_agent`.
This check should ensure that the check is only performed on
DvrEdgeHaRouter and HaRouter objects.
Unfortunately, there are cases where we have DvrEdgeRouter objects
running on 'dvr_snat' agents. E.g. when deploying a loadbalancer with
neutron-lbaas in a landscape with 6 network nodes and
max_l3_agents_per_router set to 3, it may happen that the loadbalancer
is placed on a network node that does not have a DvrEdgeHaRouter running
on it. In such a case, neutron will deploy a DvrEdgeRouter object on the
network node to serve the loadbalancer, just like it would deploy a
DvrEdgeRouter on a compute node when deploying a VM.
Under such circumstances each update to the router will lead to an
AttributeError, because the DvrEdgeRouter object does not have the
ha_state attribute.
This patch circumvents the issue by doing an additional check on the
router object to ensure that it actually has the ha_state attribute.
Change-Id: I755990324db445efd0ee0b8a9db1f4d7bfb58e26
Closes-Bug: #1755243
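
The additional check described in the commit message could look roughly
like the sketch below. It is an illustration, not the code from the
patch: the two classes are hypothetical stand-ins for the real neutron
router classes, and should_check_ha_state is an invented helper name.

```python
# Hypothetical stand-ins for the neutron router classes; only the
# presence or absence of ha_state matters for this sketch.
class DvrEdgeRouter:
    """Non-HA edge router: deliberately has no ha_state attribute."""


class DvrEdgeHaRouter(DvrEdgeRouter):
    """HA edge router: carries an ha_state attribute."""

    def __init__(self, ha_state):
        self.ha_state = ha_state


def should_check_ha_state(router, router_info, is_dvr_only_agent):
    """Return True only if the HA state can safely be checked.

    router            -- dict describing the router, as sent by the server
    router_info       -- router object running on this agent
    is_dvr_only_agent -- True if the agent does not run in 'dvr_snat' mode
    (all three parameter names are assumptions made for this example)
    """
    # The extra guard: the local router object must actually expose ha_state.
    has_ha_state = getattr(router_info, "ha_state", None) is not None
    return bool(router.get("ha")) and not is_dvr_only_agent and has_ha_state


router = {"id": "router-1", "ha": True}  # made-up router dict

# HA router object on a dvr_snat agent: the state is checked.
print(should_check_ha_state(router, DvrEdgeHaRouter("master"), False))
# Plain DvrEdgeRouter serving the load balancer: the check is skipped.
print(should_check_ha_state(router, DvrEdgeRouter(), False))
```

Using getattr with a default keeps the guard in one expression and also
treats an ha_state of None as "no state to check".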
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1755243
Title:
AttributeError when updating DvrEdgeRouter objects running on network
nodes
Status in neutron:
Fix Released
Bug description:
In a configuration with L3 HA, DVR and neutron-lbaasv2, it can happen
that the update of a router with a connected load balancer crashes
with the following stack trace (line numbers may be a bit outdated):
Failed to process compatible router: 192c77b2-1487-4bc4-af40-26563e959989
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 543, in _process_router_update
    self._process_router_if_compatible(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 464, in _process_router_if_compatible
    self._process_updated_router(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 480, in _process_updated_router
    router['id'], router.get(l3_constants.HA_ROUTER_STATE_KEY))
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha.py", line 132, in check_ha_state_for_router
    if ri and current_state != TRANSLATION_MAP[ri.ha_state]:
AttributeError: 'DvrEdgeRouter' object has no attribute 'ha_state'
The issue is that in a deployment with more network nodes than
'max_l3_agents_per_router', e.g. 6 network nodes and
max_l3_agents_per_router = 3, a load balancer may be scheduled on a
network node that does not have a DvrEdgeHaRouter for that router
deployed on it. In such a case, neutron deploys a DvrEdgeRouter on the
network node to serve the LB. Every time neutron updates that router,
e.g. to assign a floating IP to the LB, the update fails with the above
stack trace because the agent expects to find a DvrEdgeHaRouter on the
network node whose HA state it can check.
To decide whether it has to check the HA state of a router object,
neutron evaluates the following condition:
  if router.get('ha') and not is_dvr_only_agent
In our case the condition is true, because the agent runs in mode
'dvr_snat' and the router is HA. But the actual router object running
on the network node is of type DvrEdgeRouter and therefore has no
ha_state attribute, causing the update to fail.
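
The failure can be reproduced outside neutron with a few lines of
Python. This is a minimal sketch under stated assumptions: the class is
an empty stand-in for the real DvrEdgeRouter, the TRANSLATION_MAP values
are illustrative, and the function only mirrors the failing line from
the traceback.

```python
# Illustrative mapping; the real values live in neutron and are assumed here.
TRANSLATION_MAP = {"master": "active", "backup": "standby"}


class DvrEdgeRouter:
    """Empty stand-in: like the real class, it has no ha_state attribute."""


def check_ha_state_for_router(ri, current_state):
    # Mirrors the line shown in the traceback: ri.ha_state is read
    # without first verifying that the attribute exists.
    if ri and current_state != TRANSLATION_MAP[ri.ha_state]:
        return "state mismatch"
    return "state ok"


router = {"id": "router-1", "ha": True}  # made-up HA router dict
is_dvr_only_agent = False                # agent runs in 'dvr_snat' mode
ri = DvrEdgeRouter()                     # object actually present on the node

error = None
# The condition from the bug description is true for this combination ...
if router.get("ha") and not is_dvr_only_agent:
    try:
        check_ha_state_for_router(ri, "active")
    except AttributeError as exc:
        # ... so the attribute access inside the check raises.
        error = str(exc)

print(error)
```

The condition only inspects the router dict and the agent mode, so it
cannot tell which kind of router object is running locally.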
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1755243/+subscriptions