launchpad-dev team mailing list archive

RFC: Stopping apport-driven +filebug requests from timing out

To: Launchpad <launchpad-dev@xxxxxxxxxxxxxxxxxxx>
From: Graham Binns <graham@xxxxxxxxxxxxx>
Date: Thu, 22 Oct 2009 10:55:00 +0100
Sender: graham.binns@xxxxxxxxx

Hi folks,

We've been seeing repeated timeouts in the +filebug page where users are
coming from apport (so +filebug/$token, to be precise). These timeouts
often look like they're happening in the LoginStatus code, but some
investigation shows this to be a red herring.

My current theory is that:

 1. A user comes to +filebug/$token
 2. FileBugViewBase.publishTraverse() handles the token, fetches the
    LibraryFileAlias to which it points and then passes that to a
    FileBugDataParser.
 3. FileBugViewBase.publishTraverse() calls FileBugDataParser.parse()
 4. Time passes.
 5. More time passes.
 6. FileBugDataParser.parse() completes and the request continues, but
    parse() has taken so long to run (~30 seconds) that by the time the
    LoginStatus code is being run the timeout limit kicks in and the
    request is given the nuclear boot of doom.
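
To illustrate why the timeout might get pinned on LoginStatus, here is a
minimal simulation of the theory above. It assumes (and this is only an
assumption) that the deadline is checked when a step starts rather than
while one is running. The Request class, the step costs and the
15-second budget are all made up for the example.

```python
class RequestTimeout(Exception):
    """Raised when a request has already exceeded its time budget."""


class Request:
    def __init__(self, budget_seconds):
        self.budget = budget_seconds
        self.elapsed = 0.0

    def run_step(self, name, cost_seconds):
        # Assumption: the deadline is only checked when a step starts,
        # so an over-long step is blamed on whichever step follows it.
        if self.elapsed > self.budget:
            raise RequestTimeout(name)
        self.elapsed += cost_seconds


def serve(steps, budget_seconds):
    """Run (name, cost) steps in order and report where a timeout fires."""
    request = Request(budget_seconds)
    try:
        for name, cost in steps:
            request.run_step(name, cost)
    except RequestTimeout as exc:
        return "timeout in %s" % exc.args[0]
    return "ok"


steps = [
    ("publishTraverse", 0.1),
    ("FileBugDataParser.parse", 30.0),  # the ~30 second parse
    ("LoginStatus", 0.05),
]
print(serve(steps, budget_seconds=15))  # hypothetical 15s budget
```

Under those assumptions the slow step is parse(), but the error is
reported against LoginStatus, which is the pattern we see in the
timeout reports.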

FileBugDataParser.parse() is in fact pretty much one big while loop
(lib/lp/bugs/browser/bugtarget.py:187), looping over the contents of the
file that apport has attached and dealing with them appropriately. I'm
pretty certain that the problem we're having is just one of too much
data; the files that were being uploaded by apport in the cases I looked
at were circa 90MB in size, and they're going to take a while to parse,
whichever way you look at it.

Now, as far as I can tell - without studying the loop in detail and
trying to find ways to slim it down - the only real way to fix this is
to move the processing of the apport data elsewhere, so that it doesn't
impact on the user's session. As I see it, the options are:

 1. Create a script that processes apport data and make it possible for
    the +filebug process to tell it "Hey, this LibraryFileAlias is mine,
    please process it and update this bug appropriately" after the bug
    has been filed.
 2. Make it so that the apport data get processed before the user is
    pointed at +filebug, so that the requisite data are available to
    +filebug via a series of queries instead of locked away in a
    BLOB.
 3. A variation on option 1, whereby +filebug will only use the
    asynchronous method for files over a certain size (e.g. 25MB or so).
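
For the sake of discussion, option 3 could be sketched roughly like
this. It is not Launchpad code: ApportBlobDispatcher, parse_now and
on_parsed are hypothetical names, and the in-memory deque stands in for
whatever job queue or script a real implementation would use.

```python
from collections import deque

ASYNC_THRESHOLD_BYTES = 25 * 1024 * 1024  # "25MB or so", as in option 3


class ApportBlobDispatcher:
    """Parse small blobs in-request and defer large ones (hypothetical)."""

    def __init__(self, parse_now, threshold_bytes=ASYNC_THRESHOLD_BYTES):
        self.parse_now = parse_now          # callable: blob_id -> parsed data
        self.threshold_bytes = threshold_bytes
        self.pending = deque()              # stand-in for a real job queue

    def handle(self, blob_id, size_bytes):
        """Return parsed data, or None if the blob was deferred."""
        if size_bytes > self.threshold_bytes:
            self.pending.append(blob_id)
            return None
        return self.parse_now(blob_id)

    def run_pending(self, on_parsed):
        """Process deferred blobs outside the request (option 1's script)."""
        processed = 0
        while self.pending:
            blob_id = self.pending.popleft()
            on_parsed(blob_id, self.parse_now(blob_id))
            processed += 1
        return processed
```

The request path would call handle() and carry on without the apport
data when it returns None; run_pending() is the part that would live in
a separate script and update the bug afterwards.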

The problem with options 1 and 3 is that we need the apport data before
filing the bug, as far as I can tell. The docstring of
FileBugDataParser.parse states that the following items are gleaned from
apport:

  * The initial bug summary.
  * The initial bug tags.
  * The visibility of the bug.
  * Additional initial subscribers.

In addition:

  * The first inline part will be added to the description.
  * All other inline parts will be added as separate comments.
  * All attachment parts will be added as attachments.
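
To make the shape of that data concrete, here is a rough sketch of the
mapping described in the docstring, written against Python 3's stdlib
email package rather than the real FileBugDataParser. It assumes the
blob is a MIME multipart message, and the header names (Subject, Tags,
Private, Subscribers) and the "yes" value for privacy are assumptions
for illustration, not something I've checked against the parser.

```python
import email


def extract_bug_data(raw_bytes):
    """Split a MIME blob into the items listed above (illustrative only)."""
    message = email.message_from_bytes(raw_bytes)
    data = {
        # Header names below are assumed, not taken from the real parser.
        "summary": message.get("Subject"),
        "tags": (message.get("Tags") or "").split(),
        "private": (message.get("Private") or "").strip().lower() == "yes",
        "subscribers": (message.get("Subscribers") or "").split(),
        "description": None,
        "comments": [],
        "attachments": [],
    }
    for part in message.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        disposition = part.get_content_disposition()
        payload = part.get_payload(decode=True)
        if disposition == "attachment":
            data["attachments"].append((part.get_filename(), payload))
        elif disposition == "inline":
            charset = part.get_content_charset() or "utf-8"
            text = payload.decode(charset, "replace")
            if data["description"] is None:
                # The first inline part becomes the description...
                data["description"] = text
            else:
                # ...and every later inline part becomes a comment.
                data["comments"].append(text)
    return data
```

Note that this reads the whole blob into memory in one go, which is
itself a cost for ~90MB uploads; it is only meant to show which pieces
+filebug needs up front.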

So at this point, as far as I can tell, only option 2 is actually
viable, though it may require changes to apport, too (probably not, but
I'm just tossing it in there for the sake of being paranoid). That is,
unless there's some other way of fixing this that I've not thought
about at this point (as I said, I haven't had time yet to properly
profile the offending while loop to find out if there are savings to be
made).
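
For anyone who wants to try the profiling, something along these lines
should do as a starting point. It uses the stdlib cProfile and pstats
modules; profile_call is a hypothetical helper and parse_stand_in is
just a placeholder loop, not the real parse().

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


def parse_stand_in(lines):
    """Placeholder for the real while loop: counts non-empty lines."""
    count = 0
    index = 0
    while index < len(lines):
        if lines[index].strip():
            count += 1
        index += 1
    return count


result, report = profile_call(parse_stand_in, ["a", "", "b"] * 1000)
print(report)
```

Pointing profile_call at the real parse() with one of the large apport
blobs would show whether the time goes on the loop itself or on
something it calls.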

I'd appreciate any comments or suggestions you might have.

Regards,

Graham

-- 
Graham Binns | PGP Key: EC66FA7D
