yahoo-eng-team team mailing list archive

[Bug 1252900] Re: Directional network performance issues with Neutron + OpenvSwitch

 

Reviewed:  https://review.openstack.org/58606
Committed: http://github.com/openstack/openstack-manuals/commit/3cc8efdf5466750334e912ae8efa4cc8c0354edb
Submitter: Jenkins
Branch:    master

commit 3cc8efdf5466750334e912ae8efa4cc8c0354edb
Author: Darragh O'Reilly <dara2002-openstack@xxxxxxxxx>
Date:   Tue Nov 26 20:00:17 2013 +0000

    Add warning about GRO and Neutron routers
    
    Generic Receive Offload appears to be enabled by default on recent Ubuntu
    kernels. It can have a significant impact on download performance when
    enabled on a Neutron router interface. This patch warns users about that.
    
    Change-Id: I3d3a560b1db55aabd901f27ad5c7bd5777b300da
    Closes-bug: 1252900
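
A minimal sketch of how the GRO setting mentioned in the commit message could be inspected and switched off with ethtool. The helper name gro_state and the interface name eth0 are assumptions for illustration only; use whichever interface carries the router traffic on your network node.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k` output on stdin and print the
# state of Generic Receive Offload ("on" or "off").
gro_state() {
    awk '$1 == "generic-receive-offload:" { print $2; exit }'
}

# Usage (eth0 is an assumed interface name, not taken from the bug report):
#   ethtool -k eth0 | gro_state      # show the current GRO state
#   sudo ethtool -K eth0 gro off     # turn GRO off on that interface
```

A change made with `ethtool -K` does not survive a reboot, so it would need to be reapplied from the host's network configuration if it turns out to help.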


** Changed in: openstack-manuals
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1252900

Title:
  Directional network performance issues with Neutron + OpenvSwitch

Status in OpenStack Neutron (virtual network service):
  Confirmed
Status in OpenStack Manuals:
  Fix Released
Status in Open vSwitch:
  New
Status in Ubuntu:
  Confirmed

Bug description:
  Hello!

  Currently, the Havana L3 router has a serious issue that makes it
  almost useless (sorry, I do not want to be rude; I am just trying to
  bring more attention to this problem).

  When tenant network traffic passes through the L3 router (the
  namespace on the network node), it becomes very slow and intermittent.
  The issue also affects traffic that hits a "Floating IP" on its way
  into the tenant subnet.

  The affected topology is: "Per-Tenant Router with Private Networks".

  As a reference, I'm using the following Grizzly guide for my Havana
  deployment:

  https://github.com/mseknibilel/OpenStack-Grizzly-Install-Guide/blob/OVS_MultiNode/OpenStack_Grizzly_Install_Guide.rst

  Extra info:

  http://docs.openstack.org/havana/install-guide/install/apt/content/section_networking-routers-with-private-networks.html

  The symptoms are:

  1- "Slow connection to Canonical or when browsing the web from within
  a tenant subnet"

  aptitude update ; aptitude safe-upgrade

  From within a tenant instance, this takes about 1 hour to finish on
  a link capable of finishing it in 2~3 minutes.

  2- SSH connections using Floating IPs freeze about 10 times per minute.

  Connecting from the outside world to an instance using its Floating
  IP address is a pain.

  We're discussing this issue on the OpenStack mailing list; here is the
  related thread:
  http://lists.openstack.org/pipermail/openstack/2013-November/002705.html

  Also, I made a video about it; watch it here:
  http://www.youtube.com/watch?v=jVjiphMuuzM

  Tested versions:

  * OpenStack Havana on top of Ubuntu 12.04.3 using Ubuntu Cloud Archive

  * Tested with Open vSwitch versions (none of them works):

  1.10.2 from UCA
  1.11.0 compiled for Ubuntu 12.04.3 using "dpkg-buildpackage"
  1.9.0 from Ubuntu package "openvswitch-datapath-lts-raring-dkms"

  * Not tested (maybe it will work):

  Havana with Ubuntu 12.04.1 + OVS 1.4.0 (does not support VXLAN).

  * Tenant subnet tested types:

  VXLAN
  GRE
  VLAN

  No matter which subnet type you choose, it will always be slow.

  Apparently, if you upgrade your Grizzly deployment from Ubuntu 12.04.1
  + OVS 1.4.0 to Ubuntu 12.04.3 with OVS 1.9.0, it will trigger this
  problem with Grizzly too. So I think this problem might be related to
  Open vSwitch itself, but I need more time to check this.

  My Havana-based private cloud is open for you guys to debug; just ask
  for access!   =)

  My current plan is to test Havana with OVS 1.4.0, but I don't have
  much time this week to do this job.

  I'm not sure whether the problem is with OVS or not; I'll try to test
  it this week.

  Also, in my video, you guys can see how I "fixed" it by starting a
  Squid proxy-cache server within the tenant router namespace, showing
  that the problem appears ONLY when you try to establish a connection
  from a tenant subnet directly to the External network.

  I mean, the connection between a tenant and its router is okay, and
  the connection from its router to the Internet is also okay, but the
  connection from a tenant to the Internet is not. So Squid was a
  perfect choice to verify this theory at the namespace router... And
  voilà! "There, I fixed it"!   =P
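
The leg-by-leg comparison described above (router to Internet versus tenant to Internet) could be reproduced without Squid by running the same download from inside the router namespace. This is only a sketch: the helper name in_router_ns, the DRY_RUN switch, and the router ID and URL in the usage lines are assumptions for illustration. It relies on the Neutron L3 agent naming router namespaces qrouter-<router ID>.

```shell
#!/bin/sh
# Hypothetical helper: run a command inside a Neutron router namespace.
# With DRY_RUN set, it only prints the command it would run.
in_router_ns() {
    router_id="$1"; shift
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "ip netns exec qrouter-${router_id} $*"
    else
        ip netns exec "qrouter-${router_id}" "$@"
    fi
}

# Usage (ROUTER_ID and the URL are placeholders, not from the bug report):
#   ip netns list                                             # find the qrouter-... namespace
#   in_router_ns "$ROUTER_ID" wget -O /dev/null "$TEST_URL"   # router -> Internet leg
#   wget -O /dev/null "$TEST_URL"                             # run inside a tenant instance
```

If the download is fast from the namespace but slow from the tenant instance, that matches the behaviour reported here.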

  Please let me know which configuration files you guys will need to be
  able to reproduce this problem.

  Best!
  Thiago

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1252900/+subscriptions