yahoo-eng-team team mailing list archive

[Bug 1897095] [NEW] [OVN] ARP/MAC handling for routers connected to external network is scaling poorly

 

Public bug reported:

With the current router configuration set by neutron, the number of
logical flows in lr_in_arp_resolve seems to scale as O(n^2), where n is
the number of routers connected to the external network. For example,
this is from our test where we created 800 routers (I believe it was
800, and not 400 as stated in the linked discussion):

--8<--8<--8<--
# cat lflows.txt |grep -v Datapath |cut -d'(' -f 2 | cut -d ')' -f1 |sort | uniq -c |sort -n | tail -10
   3264 lr_in_learn_neighbor
   3386 ls_out_port_sec_l2
   4112 lr_in_admission
   4202 ls_in_port_sec_l2
   4898 lr_in_lookup_neighbor
   4900 lr_in_ip_routing
   9144 ls_in_l2_lkup
   9160 ls_in_arp_rsp
  22136 lr_in_ip_input
 671656 lr_in_arp_resolve
#
--8<--8<--8<--
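
As a rough sanity check on the quadratic estimate, here is the arithmetic
under a simplifying assumption of mine (not something confirmed against
the ovn-northd code): every router gets one lr_in_arp_resolve flow for
each other router on the external network, so the count is about
n * (n - 1). The variable names are only for illustration.

```shell
# Assumption: one lr_in_arp_resolve flow per ordered pair of routers.
n=800                       # routers created in the test
pairs=$(( n * (n - 1) ))    # ordered pairs of distinct routers
observed=671656             # lr_in_arp_resolve count from the output above

echo "estimated from pairs: $pairs"      # prints 639200
echo "observed in lflows.txt: $observed"
```

The estimate and the observed count are of the same order of magnitude,
which fits the O(n^2) reading; the remainder would come from flows this
simplification does not account for.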

I've opened a review where we set `always_learn_from_arp_request=false`
and `dynamic_neigh_routers=true` on all routers, which significantly
reduces the number of logical flows:

--8<--8<--8<--
# cat lflows-new.txt |grep -v Datapath |cut -d'(' -f 2 | cut -d ')' -f1 |sort | uniq -c |sort -n | tail -10
   2170 ls_out_port_sec_l2
   2172 lr_in_learn_neighbor
   2666 lr_in_admission
   2690 ls_in_port_sec_l2
   3190 lr_in_ip_routing
   4276 lr_in_lookup_neighbor
   4873 lr_in_arp_resolve
   5864 ls_in_arp_rsp
   5873 ls_in_l2_lkup
  14343 lr_in_ip_input
# ovn-sbctl --timeout=120 lflow-list > lflows-new.txt
--8<--8<--8<--
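
For anyone who wants to try the same two options by hand on a single
router, a minimal sketch could look like the following. It assumes the
options live in the `options` column of the `Logical_Router` table in the
OVN northbound database; the router name is a placeholder, and the
`DRY_RUN` switch and function name are my own additions for illustration.
This is not the change proposed in the review, which sets the options from
neutron.

```shell
# Sketch: set both options on one logical router via ovn-nbctl.
# By default the command is only printed; run with DRY_RUN= (empty)
# to actually execute it against the northbound database.
apply_arp_options() {
    # $1: logical router name as it appears in the northbound DB
    ${DRY_RUN-echo} ovn-nbctl set Logical_Router "$1" \
        options:always_learn_from_arp_request=false \
        options:dynamic_neigh_routers=true
}

# "neutron-ROUTER-UUID" is a placeholder, not a real router name.
apply_arp_options "neutron-ROUTER-UUID"
```

Note that neutron may overwrite manual changes to the northbound database
when it next syncs the router, so this is only useful for a quick
experiment.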

There is, however, some performance penalty, which from my understanding
affects east-west traffic between routers. I'm not sure how large the
effect is, so it may be a good idea to make this change optional, as
mentioned in the mailing list discussion.

See
https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049994.html
and
http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017370.html
for related discussions.

** Affects: neutron
     Importance: Undecided
     Assignee: Krzysztof Klimonda (kklimonda)
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1897095

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1897095/+subscriptions