
yahoo-eng-team team mailing list archive

[Bug 1558819] Re: Fullstack linux bridge agent sometimes refuses to die during test clean up, failing the test

 

Reviewed:  https://review.openstack.org/294798
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fd93e19f2a415b3803700fc491749daba01a4390
Submitter: Jenkins
Branch:    master

commit fd93e19f2a415b3803700fc491749daba01a4390
Author: Assaf Muller <amuller@xxxxxxxxxx>
Date:   Fri Mar 18 16:29:26 2016 -0400

    Change get_root_helper_child_pid to stop when it finds cmd
    
    get_root_helper_child_pid recursively finds the child of pid,
    until it can no longer find a child. However, the intention is
    not to find the deepest child, but to strip away root helpers.
    For example 'sudo neutron-rootwrap x' is supposed to find the
    pid of x. However, in cases where 'x' spawned short-lived children
    of its own (for example: ip / brctl / ovs invocations),
    get_root_helper_child_pid returned those pids if called at
    the wrong time.
    
    Change-Id: I582aa5c931c8bfe57f49df6899445698270bb33e
    Closes-Bug: #1558819
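
The behavior change described in the commit message can be illustrated with a minimal sketch. This is not the neutron code: the in-memory process table, the pids, and the names PROCESS_TABLE, find_child_pids, deepest_child and strip_root_helpers are all hypothetical, chosen only to contrast "descend to the deepest child" with "stop at the first child whose command line matches".

```python
# Hypothetical process table: pid -> (parent pid, command line).
# It models 'sudo neutron-rootwrap x' at a moment when x has just
# spawned a short-lived 'ip' child of its own.
PROCESS_TABLE = {
    100: (1, ["sudo", "neutron-rootwrap", "x"]),
    101: (100, ["neutron-rootwrap", "x"]),
    102: (101, ["x"]),
    103: (102, ["ip", "link", "show"]),  # transient child of x
}


def find_child_pids(pid, table):
    """Return the pids whose parent is `pid`, in ascending order."""
    return sorted(child for child, (parent, _) in table.items()
                  if parent == pid)


def deepest_child(pid, table):
    """Old approach: keep descending until no child is left."""
    while True:
        children = find_child_pids(pid, table)
        if not children:
            return pid
        pid = children[0]


def strip_root_helpers(pid, expected_cmd, table):
    """New approach: descend only until a child runs `expected_cmd`.

    Returns None if no descendant with that command line is found.
    """
    while True:
        children = find_child_pids(pid, table)
        if not children:
            return None
        pid = children[0]
        if table[pid][1] == expected_cmd:
            return pid


if __name__ == "__main__":
    print(deepest_child(100, PROCESS_TABLE))              # transient 'ip' pid
    print(strip_root_helpers(100, ["x"], PROCESS_TABLE))  # pid of x
```

In this sketch the old approach returns the pid of the transient 'ip' process whenever one happens to exist, so a later kill would target the wrong process and leave x running, which matches the symptom reported in the bug.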


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1558819

Title:
  Fullstack linux bridge agent sometimes refuses to die during test
  clean up, failing the test

Status in neutron:
  Fix Released

Bug description:
  Paste of failure:
  http://paste.openstack.org/show/491014/

  The LB agent logs show RPC errors because neutron-server is unable to
  access the DB. What's happening is that fullstack times out trying to
  kill the LB agent and moves on to other cleanups. It deletes the DB
  for the test, but the agents and neutron-server live on, resulting in
  errors trying to access the DB. The DB errors are essentially a side
  effect: the root cause is that the agent refuses to die for an unknown
  reason.

  The code that tries to stop the agent is AsyncProcess.stop(block=True, signal=9).
  Another detail that might be relevant is that the agent lives in a namespace.

  To reproduce locally, go to the VM running the fullstack tests and load all CPUs to 100%, then run:
  tox -e dsvm-fullstack TestLinuxBridgeConnectivitySameNetwork

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1558819/+subscriptions

