yahoo-eng-team team mailing list archive

[Bug 2095566] [NEW] [OVN] Router creation timeouts randomly

Public bug reported:

The router creation operation randomly times out.

This happens with the Neutron API running under both eventlet (grenade
jobs) and WSGI.

Logs:
* https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_2cd/932601/16/check/neutron-ovn-tempest-ipv6-only-ovs-release-wsgi-9/2cdda5f/controller/logs/screen-neutron-api.txt
* https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_154/939347/4/check/neutron-ovn-tempest-ipv6-only-ovs-release-wsgi-4/15432e0/controller/logs/screen-neutron-api.txt
* https://f4b7745b1a52a3f1e3d3-57fb9c0e693945c47fbc306535c3cca3.ssl.cf5.rackcdn.com/939347/4/check/neutron-ovn-tempest-ipv6-only-ovs-release-wsgi-3/3eb33de/controller/logs/screen-neutron-api.txt
* https://e7998ba09581f0a85d02-c128921194a2b01210c7e01142fc6470.ssl.cf1.rackcdn.com/939451/4/check/neutron-ovn-grenade-multinode-skip-level/46d4ff9/controller/logs/screen-q-svc.txt

Snippet: https://paste.opendev.org/show/bqEeKFs7CfBJHJvn5UQh/

There are 7 OVN commands issued in the same transaction (as seen in the provided snippet or other logs):
* LrAddCommand
* AddLRouterPortCommand
* ScheduleNewGatewayCommand
* SetLRouterPortInLSwitchPortCommand
* AddStaticRouteCommand
* QoSDelCommand (x2)

I suspect that ``ScheduleNewGatewayCommand`` is causing this random
timeout because of the operations done inside its ``run`` method [1].
My recommendation is to move this operation out of the router creation
command.

[1] https://github.com/openstack/neutron/blob/ee913133a6f6ed034578c00a4e140f9053179929/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/commands.py#L408-L419
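
To illustrate the recommendation, here is a toy model of one possible
reading of it: running the gateway scheduling in its own follow-up
transaction. This is only a sketch. ``Command``, ``Transaction``,
``create_router``, the work-unit costs and the budget are all
hypothetical and are not the Neutron or ovsdbapp API; only the command
names come from the logs above.

```python
# Toy model only: hypothetical classes, NOT the Neutron/ovsdbapp API.
from dataclasses import dataclass, field


@dataclass
class Command:
    name: str
    cost: int = 1  # simulated work units spent inside run()

    def run(self, txn):
        txn.spent += self.cost


@dataclass
class Transaction:
    budget: int  # simulated timeout, in work units
    commands: list = field(default_factory=list)
    spent: int = 0

    def add(self, command):
        self.commands.append(command)
        return self

    def commit(self):
        # Run every command, then fail if the whole batch was too slow.
        for command in self.commands:
            command.run(self)
        if self.spent > self.budget:
            raise TimeoutError("transaction exceeded its time budget")
        return [c.name for c in self.commands]


# The six cheap commands seen in the router creation transaction.
LIGHT = [
    "LrAddCommand",
    "AddLRouterPortCommand",
    "SetLRouterPortInLSwitchPortCommand",
    "AddStaticRouteCommand",
    "QoSDelCommand",
    "QoSDelCommand",
]


def create_router(schedule_cost, budget, split):
    """Return the command names committed, one list per transaction."""
    main = Transaction(budget)
    for name in LIGHT:
        main.add(Command(name))
    schedule = Command("ScheduleNewGatewayCommand", cost=schedule_cost)
    if not split:
        # Current behaviour: scheduling shares the transaction (3rd command).
        main.commands.insert(2, schedule)
        return [main.commit()]
    # Assumed alternative: scheduling runs in a separate transaction.
    follow_up = Transaction(budget).add(schedule)
    return [main.commit(), follow_up.commit()]
```

In this model, ``create_router(schedule_cost=8, budget=10, split=False)``
raises ``TimeoutError`` because the seven commands together cost 14
units, while ``split=True`` commits both transactions within the same
budget. Whether the real timeout behaves this way still needs to be
confirmed against the logs.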

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2095566
