yahoo-eng-team team mailing list archive

[Bug 1755243] [NEW] AttributeError when updating DvrEdgeRouter objects running on network nodes

 

Public bug reported:

In a configuration with L3 HA, DVR and neutron-lbaasv2, updating a
router that has a load balancer connected to it can fail with the
following stack trace (line numbers may be slightly outdated):

Failed to process compatible router: 192c77b2-1487-4bc4-af40-26563e959989
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 543, in _process_router_update
    self._process_router_if_compatible(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 464, in _process_router_if_compatible
    self._process_updated_router(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 480, in _process_updated_router
    router['id'], router.get(l3_constants.HA_ROUTER_STATE_KEY))
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha.py", line 132, in check_ha_state_for_router
    if ri and current_state != TRANSLATION_MAP[ri.ha_state]:
AttributeError: 'DvrEdgeRouter' object has no attribute 'ha_state'

The problem occurs in a deployment with more network nodes than
'max_l3_agents_per_router', e.g. 6 network nodes and
max_l3_agents_per_router = 3. A load balancer may then be scheduled on
a network node that does not have the router in question deployed on
it. In that case, neutron deploys a DvrEdgeRouter on that network node
to serve the LB. Every time neutron updates that router, e.g. to assign
a floating IP to the LB, the update fails with the above stack trace,
because the agent expects to find a DvrEdgeHaRouter on the network node
when it checks the HA state.
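
The scheduling situation can be modelled with a small sketch. This is
not neutron code: the node names, the choice of which three nodes host
the HA router, and the helper router_class_on() are assumptions made
for illustration. Only the class names DvrEdgeRouter and
DvrEdgeHaRouter and the numbers 6 and 3 come from the description above.

```python
# Hypothetical model of the scenario described above; not neutron code.
MAX_L3_AGENTS_PER_ROUTER = 3
network_nodes = ["net1", "net2", "net3", "net4", "net5", "net6"]

# Assumption: the HA router ends up on the first three network nodes.
ha_router_nodes = set(network_nodes[:MAX_L3_AGENTS_PER_ROUTER])


def router_class_on(node):
    """Return the name of the router class assumed to run on `node`."""
    if node in ha_router_nodes:
        return "DvrEdgeHaRouter"  # HA variant, exposes ha_state
    return "DvrEdgeRouter"        # deployed to serve the LB, no ha_state


# A load balancer scheduled outside the HA set meets the non-HA class.
lb_node = "net5"
lb_router_class = router_class_on(lb_node)
```

With these assumptions, three of the six network nodes can end up with
a DvrEdgeRouter for this router, and any of them can trigger the
failure on update.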

To decide whether it has to check the HA state of a router object,
neutron evaluates the following condition:

if router.get('ha') and not is_dvr_only_agent

In our case the condition is true, because the agent runs in
'dvr_snat' mode and the router is HA. However, the router object
actually running on the network node is of type DvrEdgeRouter and
therefore has no ha_state attribute, so the update fails.
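
The mismatch can be reproduced in isolation with stand-in classes. The
sketch below is not the neutron implementation: the two classes are
empty placeholders, the ha_state value is a placeholder, and
should_check_ha_state_guarded() is a hypothetical guard shown only as
one possible direction, not necessarily the fix that ends up upstream.

```python
# Stand-in classes; the real ones live in neutron's L3 agent.
class DvrEdgeRouter(object):
    """Placeholder without ha_state, like the object in the traceback."""


class DvrEdgeHaRouter(DvrEdgeRouter):
    """Placeholder for the HA variant, which does expose ha_state."""
    ha_state = "master"  # placeholder value


def should_check_ha_state(router, is_dvr_only_agent):
    # The condition quoted in the report.
    return bool(router.get("ha")) and not is_dvr_only_agent


def should_check_ha_state_guarded(router, is_dvr_only_agent, ri):
    # Hypothetical guard: also require that the local router object
    # actually has an ha_state attribute.
    return (should_check_ha_state(router, is_dvr_only_agent)
            and hasattr(ri, "ha_state"))


router = {"id": "192c77b2-1487-4bc4-af40-26563e959989", "ha": True}
ri = DvrEdgeRouter()        # what actually runs on this network node
is_dvr_only_agent = False   # the agent runs in 'dvr_snat' mode

unguarded = should_check_ha_state(router, is_dvr_only_agent)
guarded = should_check_ha_state_guarded(router, is_dvr_only_agent, ri)
```

Here the quoted condition alone says the HA state should be checked,
although reading ri.ha_state would raise AttributeError. The guarded
variant skips the check for the DvrEdgeRouter and still performs it
for a DvrEdgeHaRouter.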

** Affects: neutron
     Importance: Undecided
     Assignee: Daniel Gonzalez Nothnagel (dgonzalez)
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1755243

Title:
  AttributeError when updating DvrEdgeRouter objects running on network
  nodes

Status in neutron:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1755243/+subscriptions

