yahoo-eng-team team mailing list archive

[Bug 1920065] [NEW] Automatic rescheduling of BGP speakers on DrAgents

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Renat Nurgaliyev <1920065@xxxxxxxxxxxxxxxxxx>
Date: Thu, 18 Mar 2021 20:02:06 -0000
Reply-to: Bug 1920065 <1920065@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Public bug reported:

When a dynamic routing agent (DrAgent) becomes unreachable, neutron
takes these actions:

1. Remove all BGP speakers from unreachable agents
2. Schedule all unassigned BGP speakers on available DrAgents
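
The two steps above can be modelled with a small sketch. This is not the
neutron-dynamic-routing code: the function name, the dict-based bindings
and the least-loaded placement policy are assumptions made only for
illustration.

```python
def reschedule_speakers(bindings, alive_agents):
    """Toy model of the two steps described above.

    bindings: dict mapping speaker name -> agent name, or None if unassigned
    alive_agents: list of agent names that are currently reachable
    Returns a new dict; the input is left untouched.
    """
    result = dict(bindings)

    # Step 1: remove speakers from unreachable agents.
    for speaker, agent in list(result.items()):
        if agent is not None and agent not in alive_agents:
            result[speaker] = None

    # Step 2: schedule unassigned speakers on available agents.
    # Choosing the least-loaded agent is an assumption of this sketch.
    for speaker in sorted(result):
        if result[speaker] is None and alive_agents:
            load = {agent: 0 for agent in alive_agents}
            for agent in result.values():
                if agent in load:
                    load[agent] += 1
            result[speaker] = min(alive_agents, key=lambda a: load[a])

    return result
```

With no reachable agent left, step 1 still runs and step 2 has nowhere to
place the speaker, so it ends up bound to nothing.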

This behavior can be undesirable in the following cases:

1. Speakers are removed from a DrAgent even if no other alive agent is
running. Sometimes I would prefer them to stay configured exactly where
they are and to come back once the DrAgent is online again, for example
after the server is restarted. As a result, especially when there is
only one active DrAgent, speakers can end up not configured on any
DrAgent at all.

2. Sometimes it is desirable to let the operator control which
components run where. For example, not every node running a DrAgent can
reach all iBGP peers, and the network designer places route reflectors,
DrAgents and BGP speakers in appropriate places, keeping high
availability and other concerns in mind. In these setups it could be
better to let the speaker fail together with the DrAgent that is down.
Moving a speaker to another DrAgent also changes the source IP address
of the BGP session, which is not predictable and may require
reconfiguration on the other side of the BGP peering.

These situations can occur since the following change was introduced:
https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/478455

My proposal is to add a configuration flag to control this behavior:
https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/780675
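
A rough sketch of what such a flag could gate is shown here. The
parameter name auto_reschedule, the function name and the "first
reachable agent" placement are hypothetical and are not taken from the
proposed change; the real option name and wiring are in the review
linked above.

```python
def handle_unreachable_agents(bindings, alive_agents, auto_reschedule=True):
    """Return the speaker -> agent bindings after an agent health check.

    bindings: dict mapping speaker name -> agent name
    alive_agents: list of agent names that are currently reachable
    auto_reschedule: hypothetical flag; False keeps bindings as they are
    """
    if not auto_reschedule:
        # Operator opted out: keep every binding, even on a dead agent,
        # so the speaker comes back in the same place when the agent does.
        return dict(bindings)

    result = {}
    for speaker, agent in bindings.items():
        if agent in alive_agents:
            result[speaker] = agent
        else:
            # Hypothetical placement policy: first reachable agent, if any.
            result[speaker] = alive_agents[0] if alive_agents else None
    return result
```

With the flag disabled, a speaker bound to a DrAgent that is down stays
bound to it, so its placement and BGP source address do not change
behind the operator's back.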

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1920065

Follow ups

[Bug 1920065] Re: Automatic rescheduling of BGP speakers on DrAgents
From: OpenStack Infra, 2021-11-11