
yahoo-eng-team team mailing list archive

[Bug 2083237] Re: Initial router state is not set correctly

 

The fix has been proposed here:
https://review.opendev.org/c/openstack/neutron/+/937758


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2083237

Title:
  Initial router state is not set correctly

Status in neutron:
  Fix Released

Bug description:
  Context
  =======

  OpenStack Antelope (master appears to be affected as well).

  When a router is created in HA mode, multiple L3 agents (3 by default)
  each spawn a keepalived process to monitor the state of the router.

  The initial state of the router is supposed to be saved in the
  'initial_state' variable when the initial_state_change() function is
  called.

  This initial_state is kept to prevent false bounces while keepalived
  is transitioning.

  Problem
  =======

  The initial_state is set only when the state of the router is primary.
  So in a scenario with 3 L3 agents, we could have:

  t0:
  agent-1 initial state: primary
  agent-2 initial state: (unset)
  agent-3 initial state: (unset)

  t1:
  agent-1 failure
  agent-2 transition to primary
  agent-3 transition to primary

  Both agent-2 and agent-3 are transitioning to primary, and neutron will send a port binding update to the server for each.
  The last one to send the request wins the binding.
  Let's imagine the binding is now on agent-3.

  t2:
  agent-1 failure
  agent-2 primary
  agent-3 transition to backup

  agent-2 wins and stays primary; agent-3 transitions to backup.

  
  So now the port binding is recorded on agent-3, but agent-2 is actually primary.
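
  The timeline above can be replayed with a small toy model. This is only
  an illustrative sketch, not Neutron code: the Server class,
  report_primary() and run_timeline() are hypothetical names, and the
  model assumes that a transition to backup does not move the binding.

```python
# Toy model of the t0/t1/t2 timeline; not Neutron code.
# Server, report_primary and run_timeline are hypothetical names.

class Server:
    """Records the port binding from the most recent 'primary' report."""

    def __init__(self):
        self.binding = None

    def report_primary(self, agent):
        # Last writer wins: a later report overwrites an earlier one.
        self.binding = agent


def run_timeline():
    server = Server()
    states = {"agent-1": "primary", "agent-2": "backup", "agent-3": "backup"}
    server.report_primary("agent-1")  # t0: agent-1 holds the binding

    # t1: agent-1 fails; both survivors become primary and report it.
    states["agent-1"] = "failed"
    for agent in ("agent-2", "agent-3"):  # agent-3 happens to report last
        states[agent] = "primary"
        server.report_primary(agent)

    # t2: keepalived settles and agent-3 falls back to backup.
    # Assumption of this model: going to backup does not move the binding.
    states["agent-3"] = "backup"

    actual_primary = [a for a, s in states.items() if s == "primary"]
    return server.binding, actual_primary


if __name__ == "__main__":
    binding, primary = run_timeline()
    print("binding recorded on:", binding)
    print("actually primary:", primary)
```

  In this model the recorded binding and the actual primary end up on
  different agents, which is the mismatch described above.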

  
  Solution
  ========

  Neutron code is supposed to handle false bounces by setting the initial state correctly.
  The code then sleeps (eventlet.sleep(self.conf.ha_vrrp_advert_int)) until the keepalived state has stabilized,
  so that only one agent grabs the binding.

  To make sure this code works, the initial state needs to be set
  correctly from the beginning.
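
  As a rough sketch of the idea (not the actual patch), the snippet below
  records the initial state whatever its value, then waits one advert
  interval before confirming a transition to primary. RouterStateTracker,
  observe() and confirm_primary() are hypothetical names, and time.sleep
  stands in for eventlet.sleep so the example stays self-contained.

```python
import time


class RouterStateTracker:
    """Hypothetical helper illustrating the idea; not part of Neutron."""

    def __init__(self, advert_int=2, sleep=time.sleep):
        self.advert_int = advert_int  # stands in for ha_vrrp_advert_int
        self._sleep = sleep           # injectable so tests need not wait
        self.initial_state = None
        self.current_state = None

    def initial_state_change(self, state):
        # Record the first observed state whatever it is,
        # not only when it is "primary".
        if self.initial_state is None:
            self.initial_state = state
        self.current_state = state

    def observe(self, state):
        # Later keepalived transitions only update the current state.
        self.current_state = state

    def confirm_primary(self):
        """Wait one advert interval, then say whether we are still primary."""
        self._sleep(self.advert_int)
        return self.current_state == "primary"
```

  With this shape, an agent that bounces from primary back to backup
  during the wait would not report itself as primary.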

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2083237/+subscriptions


