launchpad-dev team mailing list archive
Message #04519
Re: result sets and iteration... unsafe idioms - seeking thoughts on how we can avoid them
To: launchpad-dev@xxxxxxxxxxxxxxxxxxx
From: Jeroen Vermeulen <jtv@xxxxxxxxxxxxx>
Date: Mon, 30 Aug 2010 12:12:15 +0700
In-reply-to: <1283023963.9910.29.camel@babaroga>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.11) Gecko/20100713 Thunderbird/3.0.6
On 2010-08-29 02:32, Danilo Šegan wrote:
> On Sat, 28 Aug 2010 at 06:24 +1200, Robert Collins wrote:
>> Well, is_empty does a *separate query*. It's not avoiding the work,
>> it's doing some of it again.
>
> That's true. But you said "iterate". is_empty doesn't iterate; it
> executes a much faster (on average) "limit 1" query (it's roughly as
> slow only when the result is "false", but those are much faster
> anyhow). I ain't saying it couldn't be done even better, but is_empty
> is a worthy improvement in itself.
Agree with Danilo here: is_empty isn't particularly costly (even if the
underlying Storm implementation could be faster--see bug 525825). In
Robert's example scenario Storm should easily be able to optimize the
is_empty away, but the common case is a very different one.
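
To make the contrast concrete, here is a rough sketch of the two idioms.
The Bug model and queries below are made up for illustration, not
Launchpad's real classes; the point is just that is_empty() costs one
cheap "LIMIT 1" round trip, while checking emptiness by materialising
the result pays for every row:

# Illustrative only: a made-up Storm model, not Launchpad's Bug class.
from storm.locals import Int, Storm, Unicode

class Bug(Storm):
    __storm_table__ = "bug"
    id = Int(primary=True)
    owner = Int()
    title = Unicode()

def has_bugs_expensive(store, owner_id):
    # Unsafe idiom: drags every matching row over the wire just to answer
    # a yes/no question.
    return len(list(store.find(Bug, Bug.owner == owner_id))) > 0

def has_bugs_cheap(store, owner_id):
    # is_empty() issues its own query, but that query is a "SELECT ...
    # LIMIT 1", so it can stop at the first matching row.  It duplicates
    # a little work if the caller iterates afterwards (Robert's point),
    # but the duplicate is cheap (Danilo's point).
    return not store.find(Bug, Bug.owner == owner_id).is_empty()
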
> I think we should aim to fix those 300ms once we've gotten
> everything else sorted out first.
Indeed. Those 300ms are also below my ping time to launchpad.net. As
long as we have bigger fish to fry, I'd be happy to eliminate a query
like this if possible--but I wouldn't pick it as an SQL optimization
target if it runs just once per request on one page.
>> We don't currently *see* that in our OOPS traces.
>>
>> It's also tricky to get data for because it's such a hot code path;
>> Gustavo has serious concerns that instrumentation in that area will
>> savagely hurt performance.
>
> We can always cowboy stuff on staging and test directly. A poor man's
> instrumentation. ;)
We ran one of our app servers under gdb for some time. It did cause
some pain. In retrospect, though, I think the real problem was that we
rarely had a clear indication of whether a particular performance
incident happened on the instrumented server or not. If the oopses etc.
had stood out clearly, it might have been fine.
Perhaps we should dedicate one of the production appservers to
performance experiments. We could use it for profiling, but also for
other controlled experiments, such as trying different tradeoffs between
processes and threads. For the fine-grained systemic optimizations we
may not even care much about timeouts, but more about enabling the
experimental server to handle more than its fair share of requests.
If we had something like this to give us authoritative answers to
performance questions, it wouldn't take us long to fill a few pages with
worthwhile experiments. Just off the top of my head:
* Does threading help throughput, or harm it because of the GIL?
* How does threading affect consistency of our performance numbers?
* Is BranchRevision really faster than talking directly to bzr?
* What numbers does a particular timeout oops correlate with?
* Will a particular prejoin improve things overall or make them worse?
* Should we defer deserialization of multi-line strings in Storm?
* Are we optimizing a rare pathology at the cost of the common case?
* Can we win big from some simple caching in canonical_url?
We could get much better answers to these by comparing stub's
performance graphs for two alternatives over the same time period than
we could by making the change and trying to pick its effect out of the
noise next month!
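
On the canonical_url item, the "simple caching" I have in mind is nothing
cleverer than a request-scoped memo in front of the URL traversal. The
real canonical_url lives in our webapp layer and takes more than just the
content object, so treat this purely as a sketch of the shape, with
made-up names:

# Hypothetical sketch, not Launchpad code: build_canonical_url stands in
# for the real (and comparatively expensive) URL traversal.

class RequestUrlCache:
    """Memoise URL lookups for the lifetime of a single request."""

    def __init__(self, build_canonical_url):
        self._build = build_canonical_url
        self._cache = {}

    def url_for(self, obj):
        # Key on class name plus database id rather than the object
        # itself, so two ORM instances loaded for the same row share one
        # cache entry.
        key = (type(obj).__name__, obj.id)
        if key not in self._cache:
            self._cache[key] = self._build(obj)
        return self._cache[key]

The experiment would then be whether the hit rate on a typical page is
high enough to show up on the performance graphs.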
Why is threading at the top of my list? Because some research suggests
that it's possible to lose a few seconds (!!!!!) to bungled GIL
contention sometimes. Our oops reports would probably show that as
variable delays scattered among SQL and non-SQL time. Arbitrary
sub-millisecond queries could sometimes be reported as taking half a
second, without offering any clear optimization target. We do actually
see those symptoms, but we have no clue where they really come from.
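
A crude way to see that effect on an experimental box, without touching
any Launchpad code, is to time a trivial operation in one thread while
another thread burns CPU, and compare the worst-case latency with and
without the competing thread. This is only the shape of the experiment,
not a Launchpad measurement:

# Standalone experiment, nothing Launchpad-specific: shows how a
# CPU-bound thread holding the GIL inflates the apparent latency of
# otherwise sub-millisecond work in another thread.
import threading
import time

def cpu_hog(stop_event):
    # Pure-Python busy loop; it only yields the GIL at the interpreter's
    # periodic check intervals.
    x = 0
    while not stop_event.is_set():
        x += 1

def worst_latency(samples=2000):
    worst = 0.0
    for _ in range(samples):
        start = time.time()
        sum(range(100))                 # stands in for a tiny query
        worst = max(worst, time.time() - start)
    return worst

if __name__ == "__main__":
    print("worst latency, idle:      %.6fs" % worst_latency())

    stop = threading.Event()
    hog = threading.Thread(target=cpu_hog, args=(stop,))
    hog.start()
    print("worst latency, contended: %.6fs" % worst_latency())
    stop.set()
    hog.join()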
Jeroen