launchpad-dev team mailing list archive

Re: Request for brainjuice: bug 194558

 

On Wed, Oct 5, 2011 at 10:34 AM, Francis J. Lacoste
<francis.lacoste@xxxxxxxxxxxxx> wrote:
> Hi Graham,
>
> tl;dr  Thanks for looking in this issue.
> I think the let's upload to twisted is a big hand wave at the moment and
> we should probably investigate why apache is failing.
>
> I don't understand how allowing direct upload to the librarian would
> solve this bug. It might, but I think there are quite a few loose ends
> that need to be connected first.

We have a cluster of issues around the current architecture: an upload
via zope buffers on the appserver, and only *then* spools to the
librarian. This bug -may- be unrelated, but it's more likely related
than not (IMNSHO). My theory is that this is happening:
client -> apache -> zope (streaming)
(wait for the entire upload to buffer in zope)
apache starts its timeout counter
zope -> worker thread
worker thread begins uploads to the librarian
apache times out
worker thread completes upload to the librarian
worker thread sends 302 (the upload completed)
... but the user already saw the apache timeout
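
A toy model of that timeline, purely to make the theory concrete. The
function name and all the numbers are made up for illustration; the only
assumption carried over from the flow above is that apache's timeout
counter covers the period where the appserver is silent while a worker
thread re-sends the buffered upload to the librarian.

```python
def user_sees(respool_secs, proxy_timeout_secs, direct_to_librarian=False):
    """Return what the browser ends up with under the theorised flow.

    respool_secs: hypothetical time the worker thread spends re-sending
        the buffered upload from the appserver to the librarian.
    proxy_timeout_secs: hypothetical apache timeout, counted from the
        moment the upload has finished buffering.
    direct_to_librarian: if True, there is no internal retransmission,
        so the silent period is modelled as zero.
    """
    silent_secs = 0.0 if direct_to_librarian else respool_secs
    if silent_secs > proxy_timeout_secs:
        # The 302 is still produced later, but nobody is left to see it.
        return "proxy timeout"
    return "302 redirect"


if __name__ == "__main__":
    print(user_sees(respool_secs=90, proxy_timeout_secs=60))
    print(user_sees(respool_secs=10, proxy_timeout_secs=60))
    print(user_sees(respool_secs=90, proxy_timeout_secs=60,
                    direct_to_librarian=True))
```

If the theory holds, the failure should track file size and librarian
load rather than anything about the browser, which is something a
reproduction in a test environment could check.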

> For reminder, the problem is that the file upload form fails at the
> apache level in some nebulous but - not so uncommon - cases. That form
> handles upload both the content (which ends up in the librarian) and the
> metadata (creating the release and linking it to the file content).

> So loose ends:
>
>  * As a general deployment aren't we always running apache in front of
>    twisted? So uploading to the librarian would fail the same way it
>    does now.

No, because there wouldn't be a long internal retransmission to the librarian.

>  * It's interesting that users report that creating the release using
>    the API script doesn't suffer from this bug. In which case, this
>    might point to a bad apache <-> browser interaction.

That is interesting, but the bug seems inconsistent anyway - it may
well be tied to librarian load, for instance. Or the LP API may be
streaming directly to the librarian, which would avoid the issue I
theorise is happening. (And if so, that's yet another good reason to
move uploads to the librarian, freeing up interactive threads in the
appservers).

>  * How are we going to handle the cross domain issue (loads from
>    launchpad.net, posts to launchpadlibrarian.net - without going
>    through apache?)

Going through apache hasn't been shown to be causative rather than
merely correlated.

>  * Are we going to write a "specific" view in twisted to take care of
>     all the data handled by this form. Or more specifically, devise a
>     generic protocol where once the content is stored, the librarian
>     redirect to the real view with the file alias reference + the
>     other metadata originally passed in)?

I proposed that the librarian create the LFC and do a backend call
passing on the form variables from the browser to LP, which would
create the LFA and return the next_url which the librarian then passes
to the client.
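
A minimal sketch of that handoff, to show the shape of the protocol
rather than any real code. Every class, method, form field and URL below
is hypothetical, and I'm reading LFC and LFA as the library file content
and library file alias records; nothing here is the actual librarian or
appserver API.

```python
class Launchpad:
    """Stand-in for the appserver's backend endpoint (hypothetical)."""

    def __init__(self):
        self.aliases = {}  # LFA id -> (LFC id, filename)

    def register_upload(self, content_id, form):
        # Create the LFA, linking the stored content to the form's
        # metadata, and tell the librarian where to send the browser.
        alias_id = len(self.aliases) + 1
        self.aliases[alias_id] = (content_id, form["filename"])
        return "/releases/%s" % form["release"]  # made-up next_url


class Librarian:
    """Stand-in for the librarian's upload view (hypothetical)."""

    def __init__(self, launchpad):
        self.launchpad = launchpad
        self.contents = {}  # LFC id -> bytes

    def handle_upload(self, body, form):
        # 1. Store the content directly (the LFC); no appserver buffering.
        content_id = len(self.contents) + 1
        self.contents[content_id] = body
        # 2. Backend call passing the browser's form variables to LP.
        next_url = self.launchpad.register_upload(content_id, form)
        # 3. Pass next_url back so the client can be redirected.
        return next_url


if __name__ == "__main__":
    librarian = Librarian(Launchpad())
    form = {"filename": "foo-1.0.tar.gz", "release": "1.0"}
    print(librarian.handle_upload(b"tarball bytes", form))
```

The point of this shape is that the only long-running transfer is
browser -> librarian; the backend call carries form variables, not the
file, so it should be short.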

> And I'm pretty sure that "solutions" to any of these problems will open
> up a bunch of other issues. Making that path really too much for what we
> are trying to achieve (work-around a weird apache bug).

That assumes it's a weird apache bug. I believe this is a classic
architectural failure, and it's one I noted about a year back.

> Simpler fix:
>
>  * See if playing with timeouts on apache changes the behavior.

Meep - let's make sure we can reproduce the fault, in a test
environment, reliably, before we go fiddling with knobs.

>  * Change (or make an alternative yui-based upload form) that uses the
>    yui flash-based upload widget (maybe the flash http doesn't
>    trigger the apache bug, like the launchpadlib script does).

As a data gathering point, perhaps, but HTTP is HTTP; we've no concrete
reason to think that this is apache specifically, especially as some
users with browsers don't encounter the problem.

>  * Just add warnings to users that they use the launchpadlib script if
>    they encounter repeated upload errors.

This might work, but I suggest we apply science first.

-Rob

