[Bug 1775797] [NEW] The mac table size of neutron bridges (br-tun, br-int, br-*) is too small by default and eventually makes openvswitch explode
Public bug reported:
Description of problem:
The CPU utilization of ovs-vswitchd is high without DPDK enabled:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1512 root 10 -10 4352840 793864 12008 R 1101 0.3 15810:26 ovs-vswitchd
At the same time, we were observing failures to send packets (ICMP) over
the VXLAN tunnel; we think this might be related to the high CPU usage.
--- Reproducer and analysis on ovs side done by Jiri Benc:
Reproducer:
Create an ovs bridge:
------
ovs-vsctl add-br ovs0
ip l s ovs0 up
------
Save this to a file named "reproducer.py":
------
#!/usr/bin/python
import sys
from scapy.all import *

# Build N (MAC, IP) pairs, one unique random source MAC per packet.
data = [(str(RandMAC()), str(RandIP())) for i in range(int(sys.argv[1]))]

# Replay the same set of packets over the ovs0 bridge forever.
s = conf.L2socket(iface="ovs0")
while True:
    for mac, ip in data:
        p = Ether(src=mac, dst=mac)/IP(src=ip, dst=ip)
        s.send(p)
------
Run the reproducer:
./reproducer.py 5000
------
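While the reproducer is running, the overflow is easy to confirm by
watching the number of learned MAC entries on the bridge (a quick check;
stock OVS defaults to a 2048-entry MAC table, and the exact fdb/show
output format may vary by version):
------
# Count learned MAC entries on ovs0 (the first output line is a header).
# With 5000 distinct source MACs this should sit pinned at the limit.
ovs-appctl fdb/show ovs0 | tail -n +2 | wc -l
# Show the configured limit, if one has been set (unset by default).
ovs-vsctl --if-exists get bridge ovs0 other-config:mac-table-size
------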
The problem is how flow revalidation works in ovs. Several 'revalidator' threads are launched. They normally sleep (modulo waking every 0.5 seconds just to do nothing) and are woken when anything of interest happens (udpif_revalidator => poll_block). On every wake-up, each revalidator thread checks whether flow revalidation is needed and, if it is, performs the revalidation.
The revalidation is very costly with a high number of flows. I also
suspect there's a lot of contention between the revalidator threads.
The flow revalidation is triggered by many things. What is of interest
for us is that any eviction of a MAC learning table entry triggers
revalidation.
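The busy-looping is visible directly on the host (a sketch; thread names
and the exact upcall/show fields differ between OVS versions):
------
# Per-thread CPU usage of ovs-vswitchd; the revalidator threads
# should be the ones burning CPU.
top -H -p $(pidof ovs-vswitchd)
# Datapath flow counts and revalidator statistics.
ovs-appctl upcall/show
------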
The reproducer script repeatedly sends the same 5000 packets, each with
a different source MAC address. This causes constant overflows of the
MAC learning table and thus constant revalidation. The revalidator
threads are woken up immediately and busy-loop doing revalidation.
This is exactly the pattern from the customers' data: there are 16000+
flows, and the packet capture shows that the packets repeat every
second.
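The repetition is visible with an ordinary capture on the bridge (a
sketch; the interface name follows the reproducer setup above):
------
# Print link-level headers; with the reproducer running, the same set
# of random source MACs cycles past roughly once per second.
tcpdump -ni ovs0 -e -c 20
------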
A quick fix is to increase the MAC learning table size:
ovs-vsctl set bridge <bridge> other-config:mac-table-size=50000
This should lower the CPU usage substantially; allow a few seconds
for things to settle down.
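A node running the neutron OVS agent typically has several bridges
(br-int, br-tun, and the br-* physical bridges), so the same setting can
be applied across all of them (a sketch; bridge names depend on the
deployment):
------
# Raise the MAC learning table limit on every ovs bridge on this host.
for br in $(ovs-vsctl list-br); do
    ovs-vsctl set bridge "$br" other-config:mac-table-size=50000
done
------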
** Affects: neutron
Importance: Medium
Assignee: Slawek Kaplonski (slaweq)
Status: Confirmed
** Tags: ovs
--
https://bugs.launchpad.net/bugs/1775797