[Bug 1755243] [NEW] AttributeError when updating DvrEdgeRouter objects running on network nodes
Public bug reported:
In a configuration with L3 HA, DVR and neutron-lbaasv2, updating a
router that has a load balancer connected to it can crash with the
following stack trace (line numbers may be slightly outdated):
Failed to process compatible router: 192c77b2-1487-4bc4-af40-26563e959989
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 543, in _process_router_update
    self._process_router_if_compatible(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 464, in _process_router_if_compatible
    self._process_updated_router(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 480, in _process_updated_router
    router['id'], router.get(l3_constants.HA_ROUTER_STATE_KEY))
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha.py", line 132, in check_ha_state_for_router
    if ri and current_state != TRANSLATION_MAP[ri.ha_state]:
AttributeError: 'DvrEdgeRouter' object has no attribute 'ha_state'
The problem arises in a deployment with more network nodes than
'max_l3_agents_per_router', e.g. 6 network nodes and
max_l3_agents_per_router = 3. In that setup a load balancer may be
scheduled on a network node that does not have the LB's router
deployed on it. In such a case, neutron deploys a DvrEdgeRouter on that
network node to serve the LB. Every time neutron updates that router,
e.g. to assign a floating IP to the LB, the update crashes with the
above stack trace, because the HA state check expects to find a
DvrEdgeHaRouter on the network node.
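The placement described above can be pictured with a toy model. This is only a sketch: the node names are hypothetical, the assumption that the HA router lands on the first three nodes is made up for illustration, and router_class_on() is not a neutron function.

```python
# Toy model of the scenario: 6 network nodes, max_l3_agents_per_router = 3.
MAX_L3_AGENTS_PER_ROUTER = 3

# Hypothetical node names.
network_nodes = ["net-%d" % i for i in range(1, 7)]

# Assumption for illustration: the HA router is hosted on the first three nodes.
ha_hosts = set(network_nodes[:MAX_L3_AGENTS_PER_ROUTER])


def router_class_on(node):
    """Name of the router object a node would hold in this toy model."""
    return "DvrEdgeHaRouter" if node in ha_hosts else "DvrEdgeRouter"


# Nodes on which an LB would end up being served by a plain DvrEdgeRouter.
uncovered = [n for n in network_nodes if n not in ha_hosts]

for node in network_nodes:
    print(node, router_class_on(node))
```

With these numbers, half of the network nodes fall into the uncovered group, so an LB scheduled without regard to router placement has a good chance of hitting the failing case.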
To decide whether it has to check the HA state of a router object,
neutron evaluates the following condition:
    if router.get('ha') and not is_dvr_only_agent:
In our case the condition is true, because the agent runs in
'dvr_snat' mode and the router is HA. But the router object actually
running on the network node is of type DvrEdgeRouter, which has no
ha_state attribute, so the update fails.
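The failure can be reproduced outside neutron with stand-in classes. The sketch below is not neutron code: the two classes only mimic the attribute difference described above, the TRANSLATION_MAP values are illustrative, and needs_state_report_guarded() is a hypothetical helper showing one possible guard, not necessarily the fix adopted upstream.

```python
# Stand-ins for the two router classes; only the ha_state difference matters.
class DvrEdgeRouter(object):
    def __init__(self, router_id):
        self.router_id = router_id


class DvrEdgeHaRouter(DvrEdgeRouter):
    def __init__(self, router_id, ha_state="backup"):
        super(DvrEdgeHaRouter, self).__init__(router_id)
        self.ha_state = ha_state


# Illustrative mapping from local keepalived-style state to reported state.
TRANSLATION_MAP = {"master": "active", "backup": "standby", "fault": "standby"}


def needs_state_report(ri, current_state):
    """Mirrors the comparison from the traceback; assumes ri has ha_state."""
    return bool(ri) and current_state != TRANSLATION_MAP[ri.ha_state]


def needs_state_report_guarded(ri, current_state):
    """Hypothetical guard: skip router objects that carry no ha_state."""
    ha_state = getattr(ri, "ha_state", None)
    if ha_state is None:
        return False
    return current_state != TRANSLATION_MAP[ha_state]


edge = DvrEdgeRouter("192c77b2-1487-4bc4-af40-26563e959989")
try:
    needs_state_report(edge, "active")
    raised = False
except AttributeError:
    raised = True  # same error class as in the reported stack trace
```

Checking the attribute on the router object, rather than relying only on the router's 'ha' flag and the agent mode, keeps the decision tied to what is actually deployed on the node.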
** Affects: neutron
Importance: Undecided
Assignee: Daniel Gonzalez Nothnagel (dgonzalez)
Status: In Progress
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1755243