yahoo-eng-team team mailing list archive
Message #06160
[Bug 1252900] Re: Directional network performance issues with Neutron + OpenvSwitch
Reviewed: https://review.openstack.org/58606
Committed: http://github.com/openstack/openstack-manuals/commit/3cc8efdf5466750334e912ae8efa4cc8c0354edb
Submitter: Jenkins
Branch: master
commit 3cc8efdf5466750334e912ae8efa4cc8c0354edb
Author: Darragh O'Reilly <dara2002-openstack@xxxxxxxxx>
Date: Tue Nov 26 20:00:17 2013 +0000
Add warning about GRO and Neutron routers
Generic Receive Offload appears to be enabled by default on recent Ubuntu
kernels. It can have a significant impact on download performance when
enabled on a Neutron router interface. This patch warns users about that.
Change-Id: I3d3a560b1db55aabd901f27ad5c7bd5777b300da
Closes-bug: 1252900
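As a rough illustration of the check this warning implies: on Linux, offload settings are usually listed with `ethtool -k <interface>` and GRO can be turned off with `ethtool -K <interface> gro off`. The sketch below is not part of the committed patch. The helper names and the interface name passed to `gro_enabled` are hypothetical, and it assumes `ethtool` is installed on the network node.

```python
import subprocess


def parse_offload_features(ethtool_output: str) -> dict:
    """Parse the text printed by `ethtool -k <iface>` into {feature: bool}.

    Lines look like "generic-receive-offload: on" or
    "large-receive-offload: off [fixed]"; anything else is skipped.
    """
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        words = value.split()
        # The header line ("Features for eth0:") has no on/off value.
        if not words or words[0] not in ("on", "off"):
            continue
        features[name.strip()] = words[0] == "on"
    return features


def gro_enabled(interface: str) -> bool:
    """Return True if GRO is reported as on for the given interface.

    Hypothetical helper: runs `ethtool -k`, so it needs ethtool installed
    and an interface name that exists on this host.
    """
    result = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_offload_features(result.stdout).get(
        "generic-receive-offload", False
    )
```

If `gro_enabled("eth1")` returns True for an interface that carries router traffic (the name `eth1` is only a placeholder), running `ethtool -K eth1 gro off` and repeating the download test is one way to see whether GRO is involved.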
** Changed in: openstack-manuals
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1252900
Title:
Directional network performance issues with Neutron + OpenvSwitch
Status in OpenStack Neutron (virtual network service):
Confirmed
Status in OpenStack Manuals:
Fix Released
Status in Open vSwitch:
New
Status in Ubuntu:
Confirmed
Bug description:
Hello!
Currently, the Havana L3 Router has a serious issue that makes it
almost useless (sorry, I do not want to be rude; I am trying to bring
more attention to this problem).
When tenant network traffic passes through the L3 Router (the Namespace
on the Network Node), it becomes very, very slow and intermittent. The
issue also affects traffic that hits a "Floating IP" on its way into
the Tenant subnet.
The affected topology is: "Per-Tenant Router with Private Networks".
As a reference, I'm using the following Grizzly guide for my Havana
deployment:
https://github.com/mseknibilel/OpenStack-Grizzly-Install-Guide/blob/OVS_MultiNode/OpenStack_Grizzly_Install_Guide.rst
Extra info:
http://docs.openstack.org/havana/install-guide/install/apt/content/section_networking-routers-with-private-networks.html
The symptoms are:
1- Slow connections to Canonical, or when browsing the web, from within
a tenant subnet:
aptitude update ; aptitude safe-upgrade
From within a Tenant instance, this takes about 1 hour to finish on
a link capable of finishing it in 2~3 minutes.
2- SSH connections using Floating IPs freeze 10 times per minute.
Connecting from the outside world into an Instance using its Floating
IP address is a pain.
We're discussing this issue on the OpenStack mailing list; here is the
related thread:
http://lists.openstack.org/pipermail/openstack/2013-November/002705.html
Also, I made a video about it; watch it here:
http://www.youtube.com/watch?v=jVjiphMuuzM
Tested versions:
* OpenStack Havana on top of Ubuntu 12.04.3 using Ubuntu Cloud Archive
* Tested with Open vSwitch versions (none of them works):
1.10.2 from UCA
1.11.0 compiled for Ubuntu 12.04.3 using "dpkg-buildpackage"
1.9.0 from Ubuntu package "openvswitch-datapath-lts-raring-dkms"
* Not tested (maybe it will work):
Havana with Ubuntu 12.04.1 + OVS 1.4.0 (does not support VXLAN).
* Tenant subnet tested types:
VXLAN
GRE
VLAN
Whichever subnet type you choose, it is always slow.
Apparently, if you upgrade your Grizzly from Ubuntu 12.04.1 + OVS
1.4.0 to Ubuntu 12.04.3 with OVS 1.9.0, it triggers this problem
with Grizzly too. So, I think this problem might be related
to Open vSwitch itself, but I need more time to check this.
My Havana-based private cloud is open for you guys to debug; just ask
for access! =)
My current plan is to test Havana with OVS 1.4.0, but I don't have
much time this week to do this job.
I'm not sure whether the problem is with OVS or not; I'll try to test
it this week.
Also, in my video, you guys can see how I "fixed" it by starting a
Squid proxy-cache server within the Tenant Namespace Router, proving
that the problem appears ONLY when you try to establish a connection
from a tenant subnet directly to the External network.
I mean, the connection between a tenant and its router is okay, and
from its router to the Internet is also okay, but from a tenant to the
Internet is not. So, Squid was a perfect choice to verify this theory
at the Namespace router... And voilà! "There I fixed it"! =P
Please let me know which configuration files you guys will need to
be able to reproduce this problem.
Best!
Thiago