
yahoo-eng-team team mailing list archive

[Bug 1419069] [NEW] Network Performance Problem with GRE using Openvswitch

 

Public bug reported:

We are hitting GRE performance issues with a Juno installation. From a VM
to the network node, we can only get 3Gbit on a 10Gbit interface. I
eventually tracked down and solved the issue, but the fix requires
patches to nova and neutron-plugin-openvswitch. I am reporting this bug
to find a clean solution instead of a hack.

The issue is caused by the MTU setting and the lack of multiqueue (MQ)
net support in KVM. As the official OpenStack documentation suggests,
the MTU is 1500 by default. This creates a bottleneck in VMs: with a
1500 MTU and without MQ support enabled in KVM, a VM can only process
about 3Gbit of network traffic.

What I did to solve the issue:
1- Set the physical interface (em1) MTU to 9000
2- Set network_device_mtu = 8950 in nova.conf and neutron.conf (on both compute and network nodes)
3- Set the br-int MTU to 8950 manually
4- Set the br-tun MTU to 8976 manually
5- Set the VM MTU to 8950 via dnsmasq-neutron.conf
6- Patch the nova config code to add a <device driver='vhost' queue='4'> element to libvirt.xml
7- Run "ethtool -L eth0 combined 4" in the VMs
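Sketched as a script, the MTU-related steps above look roughly like the
following. The interface names (em1, br-int, br-tun, eth0) are from my
setup; the script only echoes the commands, since they need root on the
right hosts. The arithmetic shows where 8976 comes from: GRE over IPv4
adds a 20-byte outer IP header plus a 4-byte GRE header.

```shell
#!/bin/sh
# MTU budget for GRE over IPv4: the tunnel adds a 20-byte outer IP
# header and a 4-byte GRE header on top of the inner frame.
PHYS_MTU=9000
TUN_MTU=$((PHYS_MTU - 20 - 4))   # 8976: what br-tun can carry
VM_MTU=8950                      # conservative: 8950 + 14 + 4 + 20 <= 9000

# Echo rather than execute: these need root, on the nodes indicated.
echo "ip link set em1 mtu $PHYS_MTU"    # compute and network nodes
echo "ip link set br-int mtu $VM_MTU"
echo "ip link set br-tun mtu $TUN_MTU"
echo "ethtool -L eth0 combined 4"       # inside each VM, once MQ is enabled
```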

With the network_device_mtu setting, the tap/qvo/qvb devices on the
compute nodes and the internal legs of the router/dhcp namespaces on
the network node get their MTU set automatically. However, that only
solves half of the problem: I still need to set the MTU on the br-int
and br-tun interfaces by hand.

To enable MQ support in KVM, I needed to patch nova: there is currently
no way to set queues in libvirt.xml. Without MQ support, even with
jumbo frames enabled, VMs are limited to about 5Gbit, because the
[vhost-xxxx] process is bound to a single CPU and the network load
cannot be spread across other CPUs. With MQ enabled, the [vhost-xxxx]
work can be distributed across cores, which gives 9.3Gbit.
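For reference, on libvirt versions that support vhost multiqueue
(1.0.6 or later, if I remember correctly), the standard way to express
this in the domain XML is a queues attribute on the interface's driver
element, roughly as below; the bridge name qbrXXXX is a placeholder.

```xml
<interface type='bridge'>
  <source bridge='qbrXXXX'/>
  <model type='virtio'/>
  <!-- vhost backend with 4 queues; pairs with
       "ethtool -L eth0 combined 4" inside the guest -->
  <driver name='vhost' queues='4'/>
</interface>
```

This is what a nova-side fix would presumably need to emit into
libvirt.xml.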

I am attaching my ugly hacks just to give some idea of the code change.
I know this is not the right way to do it; let's discuss how to
properly address the issue.

Should I open a separate bug against nova, since this issue needs a
change in nova code as well?

Note: this is a different bug from
https://bugs.launchpad.net/bugs/1252900; this one affects the Juno
release.

** Affects: neutron
     Importance: Undecided
         Status: New

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: neutron-core

** Patch added: "neutron-add-tunnel-integration-bridge-mtu-setting.patch"
   https://bugs.launchpad.net/bugs/1419069/+attachment/4313936/+files/neutron-add-tunnel-integration-bridge-mtu-setting.patch

** Also affects: nova
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1419069


To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1419069/+subscriptions

