yahoo-eng-team team mailing list archive
Message #84009
[Bug 1897095] [NEW] [OVN] ARP/MAC handling for routers connected to external network is scaling poorly
Public bug reported:
With the router configuration currently set by neutron, the number of logical
flows in lr_in_arp_resolve seems to scale as O(n^2), where n is the number of
routers connected to the external network. For example, this is our test in
which we created 800 routers (I believe it was 800, and not 400 as stated in
the linked discussion):
--8<--8<--8<--
# cat lflows.txt |grep -v Datapath |cut -d'(' -f 2 | cut -d ')' -f1 |sort | uniq -c |sort -n | tail -10
3264 lr_in_learn_neighbor
3386 ls_out_port_sec_l2
4112 lr_in_admission
4202 ls_in_port_sec_l2
4898 lr_in_lookup_neighbor
4900 lr_in_ip_routing
9144 ls_in_l2_lkup
9160 ls_in_arp_rsp
22136 lr_in_ip_input
671656 lr_in_arp_resolve
#
--8<--8<--8<--
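As a rough sanity check of the quadratic estimate, the lr_in_arp_resolve count from the listing above can be divided by n^2. This is only a sketch: it assumes n = 800 routers, as stated above, and the helper name `flows_per_router_pair` is made up for illustration.

```python
# Sanity check of the O(n^2) hypothesis using the count quoted above.
# ASSUMPTION: n = 800 routers, as the report believes (not independently verified).

def flows_per_router_pair(flow_count: int, routers: int) -> float:
    """Return flow_count / routers^2; near-constant values suggest quadratic growth."""
    return flow_count / (routers * routers)

ARP_RESOLVE_FLOWS = 671656  # lr_in_arp_resolve count from the listing above
ROUTERS = 800               # assumed router count

ratio = flows_per_router_pair(ARP_RESOLVE_FLOWS, ROUTERS)
per_router = ARP_RESOLVE_FLOWS / ROUTERS

print(f"flows per router pair: {ratio:.2f}")
print(f"flows per router:      {per_router:.1f}")
```

With these inputs the ratio comes out close to 1, i.e. roughly one lr_in_arp_resolve flow per ordered pair of routers, which is consistent with (but does not prove) O(n^2) growth. A single data point cannot distinguish scaling laws; repeating the measurement at several router counts would be needed for that.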
I've opened a review where we set `always_learn_from_arp_request=false`
and `dynamic_neigh_routers=true` on all routers, which significantly
reduces the number of logical flows:
--8<--8<--8<--
# cat lflows-new.txt |grep -v Datapath |cut -d'(' -f 2 | cut -d ')' -f1 |sort | uniq -c |sort -n | tail -10
2170 ls_out_port_sec_l2
2172 lr_in_learn_neighbor
2666 lr_in_admission
2690 ls_in_port_sec_l2
3190 lr_in_ip_routing
4276 lr_in_lookup_neighbor
4873 lr_in_arp_resolve
5864 ls_in_arp_rsp
5873 ls_in_l2_lkup
14343 lr_in_ip_input
# ovn-sbctl --timeout=120 lflow-list > lflows-new.txt
--8<--8<--8<--
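To put the two listings side by side, the per-stage counts can be compared directly. The sketch below only uses the numbers quoted in the two listings; the dictionary and function names are illustrative, and it assumes both listings come from comparable deployments.

```python
# Per-stage logical flow counts copied from the two listings above.
BEFORE = {
    "lr_in_learn_neighbor": 3264, "ls_out_port_sec_l2": 3386,
    "lr_in_admission": 4112, "ls_in_port_sec_l2": 4202,
    "lr_in_lookup_neighbor": 4898, "lr_in_ip_routing": 4900,
    "ls_in_l2_lkup": 9144, "ls_in_arp_rsp": 9160,
    "lr_in_ip_input": 22136, "lr_in_arp_resolve": 671656,
}
AFTER = {
    "ls_out_port_sec_l2": 2170, "lr_in_learn_neighbor": 2172,
    "lr_in_admission": 2666, "ls_in_port_sec_l2": 2690,
    "lr_in_ip_routing": 3190, "lr_in_lookup_neighbor": 4276,
    "lr_in_arp_resolve": 4873, "ls_in_arp_rsp": 5864,
    "ls_in_l2_lkup": 5873, "lr_in_ip_input": 14343,
}

def reduction_factors(before: dict, after: dict) -> dict:
    """Return before/after ratio for every stage present in both listings."""
    return {stage: before[stage] / after[stage] for stage in before if stage in after}

factors = reduction_factors(BEFORE, AFTER)

# Print stages from the largest to the smallest reduction.
for stage, factor in sorted(factors.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stage:24s} {BEFORE[stage]:>7d} -> {AFTER[stage]:>6d}  ({factor:.1f}x)")
```

Only the ten largest stages of each listing are included, so the totals here understate the full flow count. The comparison still shows that lr_in_arp_resolve accounts for nearly all of the reduction.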
There is, however, some performance penalty, which from my understanding affects east-west traffic between routers. I'm not sure how large the effect is, so it may be a good idea to make the change optional, as mentioned in the mailing list discussion.
See https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049994.html and
http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017370.html
for related discussions.
** Affects: neutron
Importance: Undecided
Assignee: Krzysztof Klimonda (kklimonda)
Status: In Progress
https://bugs.launchpad.net/bugs/1897095