
yahoo-eng-team team mailing list archive

[Bug 1862932] [NEW] [neutron-bgp-dragent] passive peers send wrong number of routes

 

Public bug reported:

So, we have the following setup:

3 neutron-bgp-dragents connected to the same peers, all agents assigned to the same speaker (HA).
Remote hardware is: Cisco Nexus 93180YC-EX, NXOS: version 7.0(3)I7(5a).

The problem:
Whenever the set of routes changes, only one of the controllers actually sends the withdrawal requests to the peers and advertises the correct number of routes. The other two keep advertising the old number. Normally this is not a problem, because the peers use the routes sent by the "active" controller by default. It becomes a problem when something happens to the "active" controller: another controller becomes "active", but unless we restart its agent, it keeps sending the old number of routes.

Here's some info that might help:

# speaker
openstack bgp speaker list
+--------------------------------------+---------+----------+------------+
| ID                                   | Name    | Local AS | IP Version |
+--------------------------------------+---------+----------+------------+
| 486cd041-fbe0-4f0a-b12f-d630923fac58 | speaker |    65007 |          4 |
+--------------------------------------+---------+----------+------------+
# number of routes
openstack bgp speaker list advertised routes speaker -f value | wc -l
93
# agents attached to the speaker
+--------------------------------------+--------------+-------+-------+
| ID                                   | Host         | State | Alive |
+--------------------------------------+--------------+-------+-------+
| 4258f572-adfe-42dc-bcd5-7bb1a380503e | controller-2 | True  | :-)   |
| 6b08e6f3-728c-4cd6-a2f8-0247b55ba49b | controller-3 | True  | :-)   |
| 7b7b2db1-5c0d-4aa3-8260-e398109ba727 | controller-1 | True  | :-)   |
+--------------------------------------+--------------+-------+-------+
# speaker peers
openstack bgp peer list
+--------------------------------------+-------+-----------------+-----------+
| ID                                   | Name  | Peer IP         | Remote AS |
+--------------------------------------+-------+-----------------+-----------+
| 99e9d413-98cf-4985-aaf4-4d920bc72678 | sw1   | XXX.XXX.XXX.XXX |     65005 |
| b10c1ab0-ce8e-47cd-a036-a14425b9a917 | sw2   | XXX.XXX.XXX.XXX |     65005 |
+--------------------------------------+-------+-----------------+-----------+
# show speaker
openstack bgp speaker show speaker
+-----------------------------------+------------------------------------------------------------------------------------+
| Field                             | Value                                                                              |
+-----------------------------------+------------------------------------------------------------------------------------+
| advertise_floating_ip_host_routes | True                                                                               |
| advertise_tenant_networks         | True                                                                               |
| id                                | 486cd041-fbe0-4f0a-b12f-d630923fac58                                               |
| ip_version                        | 4                                                                                  |
| local_as                          | 65007                                                                              |
| name                              | speaker                                                                            |
| networks                          | [u'97c53e69-89f5-4cd8-ac97-1ea536797f5c']                                          |
| peers                             | [u'99e9d413-98cf-4985-aaf4-4d920bc72678', u'b10c1ab0-ce8e-47cd-a036-a14425b9a917'] |
| project_id                        | 4cdb825ea20f43cb9cde3b3686188b5a                                                   |
| tenant_id                         | 4cdb825ea20f43cb9cde3b3686188b5a                                                   |
+-----------------------------------+------------------------------------------------------------------------------------+

# route count on the other side:
XXX.XXX.XXX.XXX 4 65007 2288383 2285772   181451    0    0    5d08h 92
XXX.XXX.XXX.XXX 4 65007 1494121 1490739   181451    0    0    5d08h 92
XXX.XXX.XXX.XXX 4 65007 1531378 1528974   181451    0    0    5d08h 93
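The mismatch in the rows above can be checked mechanically. The following is a minimal sketch, assuming that the last column of each row is the number of prefixes received from that neighbor (the report does not name the switch command that produced the output) and that the expected count is the 93 routes listed by the speaker. The function name `stale_neighbors` is hypothetical.

```python
def stale_neighbors(summary_text, expected):
    """Return (row_index, prefix_count) for rows whose last column != expected.

    Assumption: the last whitespace-separated field of each row is the
    received-prefix count, as it appears to be in the output quoted above.
    """
    stale = []
    for idx, line in enumerate(summary_text.splitlines()):
        fields = line.split()
        # Skip blank lines and rows that do not end in a number.
        if len(fields) < 2 or not fields[-1].isdigit():
            continue
        count = int(fields[-1])
        if count != expected:
            stale.append((idx, count))
    return stale


# Rows copied from the report; the neighbor addresses are redacted there too.
SUMMARY = """\
XXX.XXX.XXX.XXX 4 65007 2288383 2285772   181451    0    0    5d08h 92
XXX.XXX.XXX.XXX 4 65007 1494121 1490739   181451    0    0    5d08h 92
XXX.XXX.XXX.XXX 4 65007 1531378 1528974   181451    0    0    5d08h 93
"""

print(stale_neighbors(SUMMARY, expected=93))  # -> [(0, 92), (1, 92)]
```

With the figures from this report, the first two neighbors are flagged as advertising 92 routes instead of 93.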

# for the moment, controller-2 is the "active" one, sending messages like:
    2020-02-11 06:44:08.639 50425 DEBUG bgpspeaker.info_base.base [-] Sending withdrawal to Peer(ip: <IP_OF_PEER1>, asn: 65005) for OutgoingRoute(path: Path(source: None, nlri: IPAddrPrefix(addr='XX.XX.XX.XX',length=32), source ver#: 1, path attrs.: OrderedDict(), nexthop: XX.XX.XX.XX, is_withdraw: True), for_route_refresh: False) _best_path_lost /usr/lib/python2.7/dist-packages/ryu/services/protocols/bgp/info_base/base.py:243
    2020-02-11 06:44:08.640 50425 DEBUG bgpspeaker.info_base.base [-] Sending withdrawal to Peer(ip: <IP_OF_PEER2>, asn: 65005) for OutgoingRoute(path: Path(source: None, nlri: IPAddrPrefix(addr='XX.XX.XX.XX',length=32), source ver#: 1, path attrs.: OrderedDict(), nexthop: XX.XX.XX.XX, is_withdraw: True), for_route_refresh: False) _best_path_lost /usr/lib/python2.7/dist-packages/ryu/services/protocols/bgp/info_base/base.py:243
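To compare controllers, one could count these withdrawal messages per peer in each agent's debug log. The sketch below assumes the log lines keep the "Sending withdrawal to Peer(ip: ..., asn: ...)" shape quoted above. The helper name `count_withdrawals` is hypothetical, and the sample lines are abbreviated copies of the two messages in this report. A controller whose log shows no such lines over a period in which another controller logs them would match the behaviour described here.

```python
import re
from collections import Counter

# Matches the "Sending withdrawal to Peer(ip: ..., asn: ...)" fragment
# as it appears in the DEBUG lines quoted in this report.
WITHDRAWAL_RE = re.compile(
    r"Sending withdrawal to Peer\(ip: (?P<ip>[^,]+), asn: (?P<asn>\d+)\)"
)


def count_withdrawals(log_lines):
    """Count withdrawal messages per (peer_ip, peer_asn) pair."""
    counts = Counter()
    for line in log_lines:
        match = WITHDRAWAL_RE.search(line)
        if match:
            counts[(match.group("ip"), int(match.group("asn")))] += 1
    return counts


# Abbreviated copies of the two log lines above (the tail of each is cut).
SAMPLE = [
    "2020-02-11 06:44:08.639 50425 DEBUG bgpspeaker.info_base.base [-] "
    "Sending withdrawal to Peer(ip: <IP_OF_PEER1>, asn: 65005) for OutgoingRoute(",
    "2020-02-11 06:44:08.640 50425 DEBUG bgpspeaker.info_base.base [-] "
    "Sending withdrawal to Peer(ip: <IP_OF_PEER2>, asn: 65005) for OutgoingRoute(",
]

for peer, n in sorted(count_withdrawals(SAMPLE).items()):
    print(peer, n)
```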

# versions:
OpenStack version: Queens (charmed)
OS version: Ubuntu 18.04.1 LTS
neutron-bgp-dragent --version
neutron-bgp-dragent 12.0.5

# step-by-step reproduction:
- add a speaker
- add multiple agents to a speaker (HA)
- add/remove multiple VMs with FIPs and networks from address pools
- check remote peers for current number of routes received from each agent
- turn off the "active" agent

# expected result:
- the new agent takes over advertising the correct number of routes

# actual result:
- the new agent takes over, but does not refresh the actual number of routes and sends the routes it had when it first started up

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: neutron-bgp-dragent

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1862932

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1862932/+subscriptions

