yahoo-eng-team team mailing list archive

[Bug 2089250] [NEW] neutron-vpnaas: Automatic VPN Agent failover not working with OVN driver

 

Public bug reported:

This issue has been observed in RDO 2024.1/OVN 23.03/AlmaLinux 9.

This has also been reproduced in Devstack.

1. Ensure that allow_automatic_vpnagent_failover is set to True
2. Have two VPN agents
3. Set up a VPN connection successfully
4. Disable the agent running the VPN connection

Expected results: The VPN connection is successfully rescheduled to a
different agent

Actual results: Nothing happens


I have been troubleshooting this for a fair bit, and I have found the following:

In neutron-vpnaas/neutron_vpnaas/db/vpn/vpn_agentschedulers_db.py, the
function get_down_router_bindings tries to fetch the VPN agents that are
down (get_vpn_agents). If I add a line at the beginning of the function
that fetches ALL agents, like this:

```python
LOG.info("troubleshooting: " + str(self.core_plugin.get_agents(context)))
```

... and then restart neutron-server and wait for it to run the function,
it prints 'troubleshooting: []', indicating that it finds no agents at
all.

If you do this on a system that has most agents in OVN, but some in the
neutron database (case in point: BGP Dragents), you will see the BGP
agents listed.  This makes me believe that the core problem is that
vpnaas never checks for agents in the OVN singleton.
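
The suspected failure mode can be illustrated with a toy model. This is
only a sketch of the hypothesis, not neutron or neutron-vpnaas code: the
Agent and AgentStore classes, both down_agents_* helpers, the host names
and the agent type strings are all made up for illustration. It assumes
agent records can live in two separate places and that the failover
check only consults one of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    host: str
    agent_type: str
    alive: bool


class AgentStore:
    """Hypothetical stand-in for one place where agent records live."""

    def __init__(self, agents):
        self._agents = list(agents)

    def get_agents(self, agent_type=None):
        # Return all agents, or only those of the requested type.
        return [a for a in self._agents
                if agent_type is None or a.agent_type == agent_type]


def down_agents_db_only(db, agent_type):
    """Look for dead agents in the database-backed store only."""
    return [a for a in db.get_agents(agent_type) if not a.alive]


def down_agents_merged(db, ovn, agent_type):
    """Look for dead agents in both stores."""
    candidates = db.get_agents(agent_type) + ovn.get_agents(agent_type)
    return [a for a in candidates if not a.alive]


# Assumed layout: BGP agents are in the database, VPN agents only in the
# OVN-side store. The type strings are placeholders.
db = AgentStore([Agent("bgp-1", "BGP dynamic routing agent", True)])
ovn = AgentStore([
    Agent("net-1", "VPN Agent", False),  # the agent that went down
    Agent("net-2", "VPN Agent", True),
])

print(down_agents_db_only(db, "VPN Agent"))
print([a.host for a in down_agents_merged(db, ovn, "VPN Agent")])
```

In this model the database-only lookup never sees the dead VPN agent, so
nothing would be rescheduled, while the merged lookup finds it. Whether
the real code path behaves this way still needs to be confirmed against
the vpnaas source.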

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2089250
