yellow team mailing list archive

Re: Python 2.7 and parallel testing

 

On 06/26/2012 03:03 PM, Graham Binns wrote:
> Hi chaps - this email is going to ~yellow and all the good men of the
> Blue Squad. Please remember to reply to all :).

Hi everybody.

> We here sprinting in Eindhoven have started working today on getting
> Launchpad into a python 2.7 / Precise-compatible state. 

Great!

> To work out
> what we actually needed to do, we ran the test suite on Precise.
> Martin ran it linearly in Canonistack and I updated our buildbot-slave
> charm to tell lpsetup to use precise.

(Gary muttering to himself:) right, you told it to use Precise in the
containers.  The slave host was already Precise.

> The Parallel testing results were interesting to say the least.
> Everything sets up correctly, and it was perfectly happy for me to
> start a build. However, the build ran for ~40 minutes and then died

When you say "the build ran", what do you mean? When looking at the
buildbot master, was there anything on the stdout log of the test step?
If so, what was it?  Did you also look at the build log (the log of the
previous step) to make sure it looked ok?

> with a Twisted timeout error. It's as though the workers were
> semi-communicating with the master but just not reporting any tests.

It would help to agree on terminology.  "workers" are what the buildbot
web interface currently calls ephemeral LXC containers.  They report to
a central testr process running in the LXC host.  This testr process in
turn is controlled by, and reports to, the buildbot slave process, also
running in the LXC host.  The buildbot slave process reports to the
buildbot master, running on the other juju machine.
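
To make the chain concrete, here is a minimal sketch in Python of who
reports to whom.  The class and function names are hypothetical and only
illustrate the terminology above; they are not buildbot, testr, or juju
APIs.

```python
# Hypothetical model of the reporting chain described above; the names
# are illustrative and are not buildbot, testr, or juju APIs.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    name: str
    location: str
    reports_to: Optional["Component"] = None


def reporting_chain(component):
    """Return the component names from `component` up to the top of the chain."""
    chain = []
    current = component
    while current is not None:
        chain.append(current.name)
        current = current.reports_to
    return chain


master = Component("buildbot master", "other juju machine")
slave = Component("buildbot slave", "LXC host", reports_to=master)
testr = Component("testr", "LXC host", reports_to=slave)
worker = Component("worker", "ephemeral LXC container", reports_to=testr)

print(" -> ".join(reporting_chain(worker)))
```

Reading the chain from the worker upward shows each hop at which test
results could be getting lost.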

With that terminology, what was it you saw?

> Does anyone have any ideas as to why this might be?

Not yet. :-) If you give me a branch or directions or something I'll be
happy to poke at it.  Alternatively, you could poke yourself, but I'll
hold off on giving poking ideas until I hear more about the symptoms.

> We might not need
> worry too much about it just yet but once production is running on
> Precise our LXC containers will need to as well, and it will become an
> issue then.

I'll talk with Francis today about this.  I would personally hope that
Precise would not be considered usable in production until it works with
the parallel tests.  One way or another, it is a big deal for us
collectively.

Gary

