Re: RFC: Stopping apport-driven +filebug requests from timing out

On Thu, 22 Oct 2009 10:55:00 +0100
Graham Binns <graham@xxxxxxxxxxxxx> wrote:

> Hi folks,
>
> We've been seeing repeated timeouts in the +filebug page where users are
> coming from apport (so +filebug/$token, to be precise). These timeouts
> often look like they're happening in the LoginStatus code, but some
> investigation proves this to be a red herring.
>
> My current theory is that:
>
>  1. A user comes to +filebug/$token
>  2. FileBugViewBase.publishTraverse() handles the token, fetches the
>     LibraryFileAlias to which it points and then passes that to a
>     FileBugDataParser.
>  3. FileBugViewBase.publishTraverse() calls FileBugDataParser.parse()
>  4. Time passes.
>  5. More time passes.
>  6. FileBugDataParser.parse() completes and the request continues, but
>     parse() has taken so long to run (~30 seconds) that by the time the
>     LoginStatus code is being run the timeout limit kicks in and the
>     request is given the nuclear boot of doom.

IIRC, this version of the parser has already been Bjornified and is
about a tillenion times faster than it was previously. I guess we're
hitting a new limit now.

>
> FileBugDataParser.parse() is in fact pretty much one big while loop
> (lib/lp/bugs/browser/bugtarget.py:187), looping over the contents of the
> file that apport has attached and dealing with them appropriately. I'm
> pretty certain that the problem we're having is just one of too much
> data; the files that were being uploaded by apport in the cases I looked
> at were circa 90MB in size, and they're going to take a while to parse,
> whichever way you look at it.
>
> Now, as far as I can tell - without studying the loop in detail and
> trying to find ways to slim it down - the only real way to fix this is
> to move the processing of the apport data elsewhere, so that it doesn't
> impact on the user's session. As I see it, the options are:
>
>  1. Create a script that processes apport data and make it possible for
>     the +filebug process to tell it "Hey, this LibraryFileAlias is mine,
>     please process it and update this bug appropriately" after the bug
>     has been filed.
>  2. Make it so that the apport data get processed before the user is
>     pointed at +filebug, so that the requisite data are available to
>     +filebug via a series of queries instead of locked away in a
>     BLOB.
>  3. A variation on option 1, whereby +filebug will only use the
>     asynchronous method for files over a certain size (e.g. 25MB or so).
>
> The problem with options 1 and 3 is that we need the apport data before
> filing the bug, as far as I can tell. The docstring of
> FileBugDataParser.parse states that the following items are gleaned from
> apport:
>
>   * The initial bug summary.
>   * The initial bug tags.
>   * The visibility of the bug.
>   * Additional initial subscribers.
>
> In addition:
>
>   * The first inline part will be added to the description.
>   * All other inline parts will be added as separate comments.
>   * All attachment parts will be added as attachments.
>
> So at this point, as far as I can tell, only option 2 is actually
> viable, though it may require changes to apport, too (probably not, but
> I'm just tossing it in there for the sake of being paranoid). Unless
> there's some other way of fixing this that I've not thought about at
> this point (as I said, I haven't had time yet to properly profile the
> offending while loop to find out if there are savings to be made).

It should be possible to stop parsing once we have:

  * The initial bug summary.
  * The initial bug tags.
  * The visibility of the bug.
  * Additional initial subscribers.
  * The first inline part.

These are all early on in the apport blob.

Then, later, parse the remainder of the blob:

  * All other inline parts.
  * All attachment parts.

(We could also add a notification to the response saying that this
kind of stuff is happening.)
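
In code, the split might look something like the sketch below. This is
only to illustrate the idea: the boundary handling, the names and the
return values are invented, and the real FileBugDataParser is rather
more careful about encodings and malformed input.

    BOUNDARY = '--boundary'  # in reality taken from the blob itself

    def parse_initial(blob_file):
        # Phase 1, done during the request: read the header block
        # (summary, tags, visibility, extra subscribers) and the first
        # inline part, then stop, so the bulk of a 90MB blob is never
        # read at all.
        headers = {}
        for line in iter(blob_file.readline, ''):
            stripped = line.rstrip('\n')
            if not stripped:
                break  # a blank line ends the header block
            name, _, value = stripped.partition(':')
            headers[name.strip()] = value.strip()

        first_part = []
        for line in iter(blob_file.readline, ''):
            if line.startswith(BOUNDARY):
                if first_part:
                    break  # next part reached; leave it for phase 2
                continue  # opening boundary of the first part
            first_part.append(line)
        return headers, ''.join(first_part)

    def parse_remainder(blob_file):
        # Phase 2, run later (from a script, or once the bug has been
        # filed): collect the remaining inline parts and attachments
        # from wherever phase 1 stopped.
        parts, current = [], []
        for line in iter(blob_file.readline, ''):
            if line.startswith(BOUNDARY):
                if current:
                    parts.append(''.join(current))
                current = []
            else:
                current.append(line)
        if current:
            parts.append(''.join(current))
        return parts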

A problem with parsing ahead of time is that we then have to figure
out how and where to store the results, which may involve some
additional serialisation and parsing.
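
As a strawman, the serialised form might be nothing more exotic than
this (field names invented; json versus simplejson depending on the
Python in use):

    import json  # or simplejson, whichever is available

    # Illustrative only: stash the pre-parsed fields somewhere +filebug
    # can read them back cheaply, instead of re-parsing the blob on
    # every request.
    early_fields = {
        'summary': 'example crash summary from apport',
        'tags': ['apport-crash'],
        'private': True,
        'subscribers': ['example-subscriber'],
        'description': 'first inline part goes here',
    }
    stored = json.dumps(early_fields)

    # ...and later, in +filebug:
    early_fields = json.loads(stored)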


