launchpad-dev team mailing list archive

Re: make check fails with "A test appears to be hung. There has been no output for 600 seconds."

To: launchpad-dev@xxxxxxxxxxxxxxxxxxx
From: Maris Fogels <maris.fogels@xxxxxxxxxxxxx>
Date: Fri, 03 Sep 2010 15:20:56 -0400
In-reply-to: <201009031700.01557.julian.edwards@canonical.com>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.23) Gecko/20090817 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.6.666

On 09/03/2010 12:00 PM, Julian Edwards wrote:

> On Friday 03 September 2010 16:29:14 Aaron Bentley wrote:
> > On 09/03/2010 09:59 AM, Aaron Bentley wrote:
> > > I am unable to do a full test suite run locally. I tried twice.
> > >
> > > I am testing lp:~abentley/launchpad/permit-commands and I'll test
> > > another branch to be sure, but it doesn't look like something I changed
> > > caused this.
> >
> > Confirmed, another branch fails the same way.
>
> I've seen this for a week or two as well.
>
> I get around it by running "bin/test -vv" instead. The last test shown
> before it hangs works fine, and since bin/test works OK I presume the
> subsequent one does too. I've no idea why it's hanging.
>
> How can we debug this?

I'm looking into it. The test log that Aaron posted should be enough to get me started. I've filed a bug for this here:


  https://bugs.edge.launchpad.net/launchpad-foundations/+bug/629746


As for debugging these problems, here is what I plan to do:

 * Reproduce the hang anywhere
 * See if the smallest suite hangs (lp.archiveuploader.tests.test_ppauploadprocessor)
 * Figure out if it hangs running both 'make check' and bin/test, or just one or the other


To do this I will try re-running the suite under a clean environment:

  $ bzr switch ../devel
  $ make clean
  $ make
  $ make schema
  # ... everything is now roughly as test_on_merge.py does

  $ bin/test -vv |& tee test.log
  # ... wait for a hang
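
A note on the '|&' in that last command: it is shell shorthand (bash 4 and csh-style shells) for sending both stdout and stderr down the pipe. Under a plain POSIX sh the same thing is spelled '2>&1 |'. As a minimal sketch, a small wrapper could look like the following; run_logged is a hypothetical helper name and is not part of the Launchpad tree:

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of the Launchpad tree): run COMMAND with
# stdout and stderr merged, show the output, and keep a copy in LOGFILE.
run_logged() {
    logfile=$1
    shift
    # "2>&1 |" is the portable spelling of the "|&" shorthand.
    "$@" 2>&1 | tee "$logfile"
}
```

With that defined, 'run_logged test.log bin/test -vv' should behave like the command above. Note that the exit status you get back is tee's, not the test runner's, so do not rely on it to tell you whether the run passed.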

If it hangs, great! Mashing Ctrl-c a few times should dump the stack trace, or even better, you will have an exception in the console. You should also have a rough idea which suite is hanging (like lp.foo.bar), so you can run that suite on its own to try and narrow in on it.
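
To find where the run stalled, the tail of test.log is usually the place to look. A rough sketch, assuming only that the log is plain text with the most recent output at the end; last_lines is a hypothetical helper name, and nothing here depends on the exact format the test runner writes:

```shell
# last_lines LOGFILE [COUNT]
# Hypothetical helper: print the last COUNT non-blank lines of LOGFILE
# (default 5), which should be the output just before the hang.
last_lines() {
    grep -v '^[[:space:]]*$' "$1" | tail -n "${2:-5}"
}
```

For example, 'last_lines test.log 5' shows the last five non-blank lines. Once you have the suite name from there, something like 'bin/test -vv lp.foo.bar' should run it on its own, assuming bin/test takes the same arguments that TESTOPTS passes through to it.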

If you know roughly where it hangs, and bin/test passes without issue, you might also want to try running the sub-suite on its own under 'make check', like so:

  $ make check TESTOPTS='-vv lp.foo.bar'

FWIW, running 'make check' creates a stack of processes, and buffering in those processes can eat useful program output, such as the original error that caused the hang. The first thing you want to do is remove the layers of indirection between you and the test suite by running bin/test directly in a terminal. Always try to reproduce the problem with bin/test first.



--
Māris Fogels -- https://launchpad.net/~mars
Launchpad.net -- cross-project collaboration and hosting
