yahoo-eng-team team mailing list archive
Message #48711
[Bug 1551288] Re: Fullstack native tests sometimes fail with an OVS agent failing to start with 'Address already in use' error
Reviewed: https://review.openstack.org/298056
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=03999961ac620249950d8bca628719e9c14c4382
Submitter: Jenkins
Branch: master
commit 03999961ac620249950d8bca628719e9c14c4382
Author: Assaf Muller <amuller@xxxxxxxxxx>
Date: Thu Mar 24 22:14:07 2016 -0400
Add fullstack cross-process port/ip address fixtures
We've had a series of bugs with resources that need
to be unique on the system across test runner
processes. Ports are used by neutron-server and the
OVS agent when run in native OpenFlow mode. The function
that generates ports looks up a random unused port and
starts the service. However, it is racy: between the time
the port is found to be unused and the time the service is
started, another test runner can pick the same random port.
With close to 65536 ports to choose from, the chance
of collision is low, but given enough test runs, it has
happened a non-trivial number of times, and given that
a voting job needs a very low false-negative rate, we
need a more robust solution. The same applies to IP
addresses that are used by the OVS agent in tunneling
mode, and for the LB agent in all modes. With IP addresses,
we don't check if the IP address is used, we simply
pick a random address from a large pool, and again
we've seen a non-trivial number of test failures.
The bugs referenced below had simple, short term solutions
applied but the bugs remain. This patch is a correct,
long term solution that doesn't rely on chance.
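The check-then-use race described above can be reproduced in a few lines. This is only an illustrative sketch, not the neutron fullstack code: the helper names get_unused_port and try_to_serve are hypothetical, and the "other test runner" is simulated inside the same process by binding the port during the gap.

```python
import errno
import socket


def get_unused_port():
    """Check step: ask the kernel for a free port, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


def try_to_serve(port):
    """Use step: bind and listen; return (socket, None) or (None, errno)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


port = get_unused_port()             # runner A: the port looks unused
rival, _ = try_to_serve(port)        # runner B: takes the port in the gap
service, error = try_to_serve(port)  # runner A: its service cannot start
if rival is not None:
    rival.close()
```

Here error is expected to be errno.EADDRINUSE, the numeric form of the 'Address already in use' message in the bug title. Nothing in the check step reserves the port, which is why checking first does not help.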
This patch adds a resource allocator that uses the disk
to persist allocations. Access to the disk is guarded
via a file lock. The IP address, network and port fixtures
use an allocator internally.
Closes-Bug: #1551288
Closes-Bug: #1561248
Closes-Bug: #1560277
Change-Id: I46c0ca138b806759128462f8d44a5fab96a106d3
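As a rough illustration of the approach the commit message describes (allocations persisted on disk, with access guarded by a file lock), a minimal sketch might look like the following. It is not the code from the patch: the class name FileBackedAllocator, the JSON state file and the first-free selection policy are all assumptions made for this example, and it relies on fcntl.flock, so it is POSIX only.

```python
import fcntl
import json
import os
import tempfile


class FileBackedAllocator:
    """Hypothetical allocator that hands out unique values across processes."""

    def __init__(self, path, candidates):
        self.path = path                    # state file shared by all runners
        self.candidates = list(candidates)  # pool of ports, IPs, etc.

    def _locked_update(self, update):
        # Hold an exclusive lock for the whole read-modify-write cycle so
        # that two processes can never observe the same free value.
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o600)
        with os.fdopen(fd, "r+") as state:
            fcntl.flock(state, fcntl.LOCK_EX)
            raw = state.read()
            allocated = set(json.loads(raw)) if raw else set()
            result = update(allocated)
            state.seek(0)
            state.truncate()
            json.dump(sorted(allocated), state)
            state.flush()
            return result  # closing the file releases the lock

    def allocate(self):
        def pick(allocated):
            for value in self.candidates:
                if value not in allocated:
                    allocated.add(value)
                    return value
            raise RuntimeError("no free resources left")
        return self._locked_update(pick)

    def release(self, value):
        self._locked_update(lambda allocated: allocated.discard(value))


# Two allocator objects stand in for two test runner processes.
state_file = os.path.join(tempfile.mkdtemp(), "ports.json")
runner_a = FileBackedAllocator(state_file, range(10000, 10003))
runner_b = FileBackedAllocator(state_file, range(10000, 10003))
first = runner_a.allocate()
second = runner_b.allocate()
runner_a.release(first)
reused = runner_b.allocate()
```

Because the second runner reads the state file only after taking the lock, it sees the first allocation and picks a different value. A released value becomes available again, which is what a fixture's cleanup would be expected to do.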
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1551288
Title:
Fullstack native tests sometimes fail with an OVS agent failing to
start with 'Address already in use' error
Status in neutron:
Fix Released
Bug description:
Example failure:
test_connectivity(VLANs,Native) fails with this error:
http://paste.openstack.org/show/488585/
wait_until_env_is_up is timing out, which typically means that the
expected number of agents failed to start. Indeed in this particular
example I saw this line being output repeatedly in neutron-server.log:
[29/Feb/2016 04:16:31] "GET /v2.0/agents.json HTTP/1.1" 200 1870 0.005458
Fullstack calls GET on agents to determine whether the expected number of
agents were started and are successfully reporting back to
neutron-server.
We then see that one of the three OVS agents crashed with this TRACE:
http://paste.openstack.org/show/488586/
This happens only with the native tests using the Ryu library.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1551288/+subscriptions