launchpad-dev team mailing list archive

Re: ec2 failures

 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/30/2011 6:02 PM, Gary Poster wrote:
> 
> On Aug 30, 2011, at 11:14 AM, Jonathan Lange wrote:
> 
>> On Fri, Aug 26, 2011 at 9:27 PM, Gary Poster
>> <gary.poster@xxxxxxxxxxxxx> wrote:
>>> I believe this is affecting other ec2 users.  I dug into it
>>> with abentley's help.  It's a bug in bzrlib (symptom details
>>> are in http://pastebin.ubuntu.com/675476/ , problem details are
>>> in http://pastebin.ubuntu.com/675505/).
>>> 
>>> https://bugs.launchpad.net/bzr/+bug/835035
>>> 
>>> I'm going to write a test and fix for bzr, make an MP, and make
>>> a Launchpad egg.
>>> 
>> 
>> This looks a lot like 
>> https://bugs.launchpad.net/launchpad/+bug/721166, which Martin
>> Pool started fixing in June.
> 
> Yes, it does!  Martin said on 835035 that he thought it was not a
> dupe.  Martin, maybe you could clarify what looks different?
> 
> I am pretty sure I have a fix for 835035, landed in bzr 2.3 thanks
> to jam.  Aaron advised me that if I wanted it in Launchpad with 2.4
> in the next couple of weeks, I should roll my own branch with a fix
> and add that to db-devel (where we have 2.4 ready for deployment
> Friday AIUI), so I plan to do that.
> 
> Gary

Technically, they are 2 different things.

You had a test that was triggering a failure (what happens when the
fallback repository is incompatible), which explicitly left a
repository locked. The fact that LockWarner was telling us "you left a
repository locked" meant that we actually were discovering there was a
bug. Without LockWarner, we would have silently garbage collected the
otherwise locked repository.
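
To illustrate the idea (this is only a sketch, not the bzrlib code), a
finalizer-based warning can look roughly like the following. FakeRepo,
its lock counter and the log stream are all hypothetical names made up
for the example; the real LockWarner is structured differently.

```python
import gc
import io


class FakeRepo:
    """Hypothetical stand-in for a lockable repository (not bzrlib)."""

    def __init__(self, name, log):
        self.name = name
        self._log = log
        self._lock_count = 0

    def lock_read(self):
        self._lock_count += 1

    def unlock(self):
        self._lock_count -= 1

    def __del__(self):
        # Finalizer: complain if the object dies while still locked.
        if self._lock_count > 0:
            self._log.write("%s was gc'd while locked\n" % self.name)


def demo():
    log = io.StringIO()

    leaked = FakeRepo("leaked", log)
    leaked.lock_read()          # never unlocked: this is the bug

    clean = FakeRepo("clean", log)
    clean.lock_read()
    clean.unlock()              # balanced, so no warning is expected

    del leaked, clean
    gc.collect()
    return log.getvalue()
```

Only the object that was left locked produces a message, which is why
the warning is useful as a bug detector and not just noise.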

The proposed fix for bug #721166 is to get rid of LockWarner because
__del__ is triggering at a time that is causing confusion. (It may not
trigger in the middle of the actual test, etc.)

I'm not entirely sure whether the tests are buggy (doing
repo.lock_read() without .unlock()), whether they are testing
something which is slightly buggy in bzr (such as what you found for
bug #835035), or whether it is something else entirely.

You can argue that a test that just doesn't unlock its repository
isn't worth failing.

Now, it sounds like disabled_test_sphinxdocs.py wasn't actually where
the bug was. It was just a test that happened to check stderr where
LockWarner content was being written. So it is possible that fixing
bug #835035 (properly unlock the repository) will have already fixed
the sphinx tests.


It at least sounds like a case where Martin's fix for bug #721166 was
wrong. The __del__ methods were hard to debug (they don't trigger
deterministically), but they were actually a symptom of a real bug in
bzrlib.
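
A small sketch of why __del__ timing is hard to reason about, assuming
CPython 3.4 or later. Noisy and events are hypothetical names for the
example. An object caught in a reference cycle is not finalized when
the last name goes away, only when the cycle collector happens to run:

```python
import gc

events = []


class Noisy:
    def __init__(self):
        self.me = self          # reference cycle: refcount never hits 0

    def __del__(self):
        events.append("finalized")


def demo():
    gc.disable()                # keep the collector from running early
    try:
        events.clear()
        obj = Noisy()
        del obj                 # the cycle keeps the object alive
        before = list(events)
        gc.collect()            # the finalizer only runs here
        after = list(events)
    finally:
        gc.enable()
    return before, after
```

In a real test run nobody calls gc.collect() at a known point, so the
warning can land in the output of whichever test happens to be running
when the collector kicks in.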

Anyway, without upgrading to bzr-2.4, can you enable the disabled
sphinx tests?

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk5dEBQACgkQJdeBCYSNAAPptwCdEczD02O+GS7k98lW/tyPivYK
DlgAoJ0VfyHyFoI1vZ1c62NgCQpsmoe/
=7VXB
-----END PGP SIGNATURE-----