
yahoo-eng-team team mailing list archive

[Bug 1551288] Re: Fullstack native tests sometimes fail with an OVS agent failing to start with 'Address already in use' error

 

Reviewed:  https://review.openstack.org/298056
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=03999961ac620249950d8bca628719e9c14c4382
Submitter: Jenkins
Branch:    master

commit 03999961ac620249950d8bca628719e9c14c4382
Author: Assaf Muller <amuller@xxxxxxxxxx>
Date:   Thu Mar 24 22:14:07 2016 -0400

    Add fullstack cross-process port/ip address fixtures
    
    We've had a series of bugs with resources that need
    to be unique on the system across test runner
    processes. Ports are used by neutron-server and the
    OVS agent when run in native openflow mode. The function
    that generates ports looks up random unused ports and
    starts the service. However, it is racy: between the time the
    port is found to be unused and the time the service is started,
    another test runner can pick the same random port.
    With close to 65536 ports to choose from, the chance
    of a collision is low, but given enough test runs, it
    has happened a non-trivial number of times. Since
    a voting job needs a very low false-negative rate, we
    need a more robust solution. The same applies to IP
    addresses that are used by the OVS agent in tunneling
    mode, and for the LB agent in all modes. With IP addresses,
    we don't check if the IP address is used, we simply
    pick a random address from a large pool, and again
    we've seen a non-trivial number of test failures.
    
    The bugs referenced below had simple, short term solutions
    applied but the bugs remain. This patch is a correct,
    long term solution that doesn't rely on chance.
    
    This patch adds a resource allocator that uses the disk
    to persist allocations. Access to the disk is guarded
    via a file lock. The IP address, network and port fixtures
    use an allocator internally.
    
    Closes-Bug: #1551288
    Closes-Bug: #1561248
    Closes-Bug: #1560277
    Change-Id: I46c0ca138b806759128462f8d44a5fab96a106d3
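
The allocator described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: the class name ResourceAllocator, its methods, the JSON state file and the ".lock" suffix are all hypothetical, chosen only to show the idea of persisting allocations on disk and guarding them with a file lock so that separate test runner processes never hand out the same value.

```python
import contextlib
import fcntl
import json
import os
import random


class ResourceAllocator:
    """Hypothetical sketch of a cross-process allocator (not the patch's code).

    Allocations are persisted in a JSON file. Every read-modify-write
    cycle runs under an exclusive flock on a separate lock file.
    """

    def __init__(self, state_path, candidates):
        self.state_path = state_path
        self.lock_path = state_path + ".lock"  # assumed naming scheme
        self.candidates = list(candidates)

    @contextlib.contextmanager
    def _locked(self):
        # flock blocks until no other process holds the lock.
        with open(self.lock_path, "w") as lock_file:
            fcntl.flock(lock_file, fcntl.LOCK_EX)
            try:
                yield
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)

    def _read(self):
        if not os.path.exists(self.state_path):
            return []
        with open(self.state_path) as f:
            return json.load(f)

    def _write(self, allocated):
        with open(self.state_path, "w") as f:
            json.dump(allocated, f)

    def allocate(self):
        """Pick a random value that no process has allocated yet."""
        with self._locked():
            allocated = self._read()
            free = [c for c in self.candidates if c not in allocated]
            if not free:
                raise RuntimeError("no free resources left")
            choice = random.choice(free)
            allocated.append(choice)
            self._write(allocated)
            return choice

    def release(self, value):
        """Return a value to the pool so it can be allocated again."""
        with self._locked():
            allocated = self._read()
            if value in allocated:
                allocated.remove(value)
                self._write(allocated)
```

A port fixture could then call allocate() in its setup and release() in its cleanup. Because the choice and the bookkeeping happen under the same lock, the window between "found unused" and "in use" that caused the original race no longer exists between cooperating processes.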


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1551288

Title:
  Fullstack native tests sometimes fail with an OVS agent failing to
  start with 'Address already in use' error

Status in neutron:
  Fix Released

Bug description:
  Example failure:
  test_connectivity(VLANs,Native) fails with this error:

  http://paste.openstack.org/show/488585/

  wait_until_env_is_up is timing out, which typically means that the
  expected number of agents failed to start. Indeed in this particular
  example I saw this line being output repeatedly in neutron-server.log:

  [29/Feb/2016 04:16:31] "GET /v2.0/agents.json HTTP/1.1" 200 1870 0.005458

  Fullstack calls GET on agents to determine whether the expected number
  of agents were started and are successfully reporting back to
  neutron-server.
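
The polling described above can be sketched roughly as follows. The helper names wait_for and agents_are_up, the list_agents callable, and the use of an "alive" key on each agent record are assumptions made for illustration; this is not the fullstack implementation of wait_until_env_is_up.

```python
import time


def wait_for(predicate, timeout=60, sleep=1):
    """Poll predicate until it returns True or the timeout expires.

    Hypothetical helper: raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(sleep)


def agents_are_up(list_agents, expected_count):
    """Return True once the expected number of agents report as alive.

    list_agents is an assumed callable that returns a list of dicts,
    one per agent, standing in for a GET on the agents resource.
    """
    agents = list_agents()
    return (len(agents) == expected_count
            and all(agent.get("alive") for agent in agents))
```

If one agent crashes on startup, the predicate never becomes true, the agents list is fetched repeatedly (matching the repeated GET lines in the server log), and the wait ends in a timeout.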

  We then see that one of the three OVS agents crashed with this TRACE:
  http://paste.openstack.org/show/488586/

  This happens only with the native tests using the Ryu library.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1551288/+subscriptions

