launchpad-dev team mailing list archive

Re: [rfc] more branch content in the main ui / loggerhead service

To: John Meinel <john@xxxxxxxxxxxxxxxxx>
From: Robert Collins <robertc@xxxxxxxxxxxxxxxxx>
Date: Tue, 28 Jun 2011 23:49:22 +1200
Cc: Martin Pool <mbp@xxxxxxxxxxxxx>, Launchpad Community Development Team <launchpad-dev@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <BANLkTinMC6KWis6GQpsy=xprSikPaG=zfA@mail.gmail.com>

On Tue, Jun 28, 2011 at 10:43 PM, John Meinel <john@xxxxxxxxxxxxxxxxx> wrote:
> I personally have a pretty good feeling about what requests are slow (revno
> and annotations). We could somehow clarify "safe" requests that could be
> made to loggerhead.

I hesitate to agree with this because I think the number and depth of
issues we'll run into mean the whole thing is effectively unsafe -
the issues you list below are only the known ones.

> We'll still run into issues, like loggerhead runs in multithreaded mode
> inside python, so theoretically one request can block an independent
> request. And we have 16 appserver instances, but only 2 loggerhead ones.
> Though certainly we expect a smaller fraction hitting loggerhead, though if
> we start pulling more bits into the main lp...

*cough*, we have 64 appserver instances - 16 per machine, 4 machines.
Of course we aren't showing code pages on every one simultaneously, but
we should certainly expect additional load coming onto loggerhead,
and a need to scale that service.
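
To put rough numbers on that ratio, here is a back-of-envelope sketch.
The instance counts are the ones quoted in this thread; the request
rate and the fraction of pages needing branch data are purely
hypothetical inputs, not measurements.

```python
# Instance counts as quoted in this thread.
APPSERVERS_PER_MACHINE = 16
MACHINES = 4
LOGGERHEAD_INSTANCES = 2


def appservers_per_loggerhead(per_machine=APPSERVERS_PER_MACHINE,
                              machines=MACHINES,
                              loggerheads=LOGGERHEAD_INSTANCES):
    """Return how many appserver instances sit in front of each loggerhead."""
    return (per_machine * machines) / loggerheads


def loggerhead_load(appserver_rps, fraction_needing_branch_data,
                    loggerheads=LOGGERHEAD_INSTANCES):
    """Return requests per second arriving at each loggerhead instance.

    Both appserver_rps and fraction_needing_branch_data are hypothetical
    inputs chosen by the caller; assumes load is spread evenly.
    """
    return appserver_rps * fraction_needing_branch_data / loggerheads


if __name__ == "__main__":
    print(appservers_per_loggerhead())
    print(loggerhead_load(appserver_rps=100, fraction_needing_branch_data=0.1))
```

With the counts above, each loggerhead instance backs 32 appserver
instances, so even a small fraction of pages pulling branch data adds
up quickly.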

> Certainly you could write tests for specific rpcs that checks the "scale"
> appropriately. The same as you do today that you don't issue extra db
> queries when you add items to a page.
>
> Diff is fast, getting the commit message is fast, getting file content is
> fast. It does seem like there is a lot we could be getting from bzr that
> wouldn't negatively impact render times.

Are they really? Pick a random, cold cache Launchpad (itself) branch.
Get me the diff from tip to -1. Is that subsecond for the whole
operation? Yes? Then it might be fast; now try it on a Linux branch.

Our *hot* cache operations may feel fast subjectively, but that doesn't
mean we're at all ready to claim it's fast when thrown random data
sets. Our data store massively exceeds main memory on either the
storage server or web servers for loggerhead: we have a significant
optimise-and-iterate job ahead of us to make it fast, and it's one we
should do.
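
One way to check this rather than trust a feeling is to time the same
operation repeatedly and judge it by the first run. The sketch below
is a minimal illustration: `operation` is a hypothetical callable
(for example, something that produces the tip diff for a branch), and
the one-second budget mirrors the "subsecond" question above. Note
that the first call in a process only approximates a cold cache; a
truly cold run also needs the OS page cache to be empty.

```python
import time


def time_calls(operation, repeats=3):
    """Call `operation` `repeats` times and return each duration in seconds.

    The first entry is the coldest run this process can observe; later
    entries benefit from whatever the earlier calls left cached.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return timings


def within_budget(timings, budget_seconds=1.0):
    """Judge by the first (coldest) run only, not by the warm repeats."""
    return timings[0] <= budget_seconds


if __name__ == "__main__":
    # Placeholder workload; swap in the real operation under test.
    timings = time_calls(lambda: sum(range(100000)))
    print(timings, within_budget(timings))
```

Judging by the first run is deliberate: averaging over warm repeats
would hide exactly the case being argued about here.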

bzr and loggerhead have made huge improvements in performance, but web
services are looking at a different scale again, and I'm not at all
convinced we're there yet (OTOH I don't think git or any of the other
DVCSs are there *either* - cold cache performance is hard).

For instance, getting the commit message for a revision from
postgresql is (at the moment) 8ms *cold*, 0.8ms *hot*.
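
As a worked example of what those figures mean for a page, the sketch
below multiplies them out. The 8ms and 0.8ms values are the ones
above; the revision count and the assumption of one sequential lookup
per revision are illustrative only.

```python
# Per-lookup figures quoted above for fetching a commit message from postgresql.
COLD_MS = 8.0
HOT_MS = 0.8


def page_cost_ms(revisions, per_lookup_ms):
    """Total time for one sequential lookup per revision, in milliseconds."""
    return revisions * per_lookup_ms


if __name__ == "__main__":
    revisions = 50  # hypothetical number of revisions shown on one page
    print("cold:", page_cost_ms(revisions, COLD_MS), "ms")
    print("hot:", page_cost_ms(revisions, HOT_MS), "ms")
```

That is the baseline any bzr-backed replacement would need to match
on a cold cache, not just a hot one.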

-Rob

Follow ups

Re: [rfc] more branch content in the main ui / loggerhead service
From: Jeroen Vermeulen, 2011-07-01
Re: [rfc] more branch content in the main ui / loggerhead service
From: John Arbash Meinel, 2011-06-28

References

Re: [rfc] more branch content in the main ui / loggerhead service
From: John Meinel, 2011-06-28