launchpad-dev team mailing list archive

Re: Performance question

To: Ian Booth <ian.booth@xxxxxxxxxxxxx>
From: Robert Collins <robert.collins@xxxxxxxxxxxxx>
Date: Tue, 21 Sep 2010 12:24:20 +1200
Cc: launchpad-dev@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4C97F461.2030506@canonical.com>
Sender: robertc@xxxxxxxxxxxxxxxxx

On Tue, Sep 21, 2010 at 11:55 AM, Ian Booth <ian.booth@xxxxxxxxxxxxx> wrote:
> Hi Robert,
>
> Thanks for the information.
>
>>
>> I don't think we've ever had Launchpad tuned to the point that this
>> would be a significant win: we're looking at multi second queries
>> needing improving, not 10's of ms ones.
>>
>
> It's not so much the 10's of ms for each query, but the cumulative
> effect of the resources (mainly CPU load) required on the database
> server to parse the SQL over and over again. There are two main multiplier
> effects at work: SQL volume and concurrent access load, i.e. the number of
> concurrent users. I know there's tuning occurring right now, but some
> views seem to result in 1000's of SQL queries and even once these are
> eliminated, there will be other similar inefficiencies introduced later
> for sure. Any complex system with potential bottlenecks like CPU
> contention or even seemingly innocuous db throughput issues can very
> quickly become unstable since effectively a positive feedback loop is
> created. This can be exacerbated also by the effects of external factors
> such as other applications on the same hardware "stealing" core
> resources like I/O bandwidth etc.

Indeed. We don't have the last issue you list, but it's certainly all
connected. We have a resource mismatch at the moment: the backend has
16 read/write threads and 16 more readonly threads across three
machines (1x16 core, 2x8 core, only the 16 can write). We have nearly
300 things submitting queries (64 lpnet appserver threads, 20 edge
appserver threads, 4 from xmlrpc, and 270 odd cron jobs). If they all
went berserk we'd be sunk.
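
As a rough back-of-envelope check on that mismatch, the sketch below
simply totals the figures quoted above. The variable names are
illustrative, "270 odd" is taken as exactly 270, and the itemised
clients sum to somewhat more than the "nearly 300" headline, so treat
the ratios as approximate.

```python
# Figures quoted in the paragraph above; "270 odd" is assumed to be 270.
clients = {
    "lpnet appserver threads": 64,
    "edge appserver threads": 20,
    "xmlrpc": 4,
    "cron jobs": 270,
}
read_write_threads = 16  # the threads that can write
read_only_threads = 16   # the additional readonly threads

total_clients = sum(clients.values())
total_backend = read_write_threads + read_only_threads

# Worst case: every client issues a query at the same moment.
clients_per_backend_thread = total_clients / total_backend
clients_per_writer_thread = total_clients / read_write_threads

print(f"{total_clients} potential clients for {total_backend} backend threads")
print(f"about {clients_per_backend_thread:.1f} clients per backend thread")
print(f"about {clients_per_writer_thread:.1f} clients per write-capable thread")
```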

We're reasonably far back from the knee though, which is a comfort.
That said, prepared queries in postgresql only last for the session,
so they don't provide a global optimisation, and they are only a small
win: an easy win, but a small one. I say a small one because the
incremental benefit on load is bounded by the fraction of query time
spent planning: unless that fraction is large (say 10%), the overall
maximum benefit is only whatever small percentage it is.
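
To make that bound concrete, here is a minimal sketch. The function
name and the example planning fractions are illustrative assumptions,
not measurements from Launchpad; the only point is that the saving can
never exceed the share of time that planning takes in the first place.

```python
def max_saving(planning_fraction, planning_eliminated=1.0):
    """Upper bound on the fraction of total query time that can be saved.

    planning_fraction   -- share of query time spent planning (0.0 to 1.0)
    planning_eliminated -- share of that planning work a prepared statement
                           avoids (1.0 means planning is skipped entirely)
    """
    for value in (planning_fraction, planning_eliminated):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must be between 0.0 and 1.0")
    return planning_fraction * planning_eliminated


# Hypothetical planning shares, chosen only to show how the bound scales.
for fraction in (0.001, 0.01, 0.10):
    saving = max_saving(fraction)
    print(f"planning = {fraction:.1%} of query time: at most {saving:.1%} saved")
```

Even in the best case, where a prepared statement removes planning
entirely, a query that spends 1% of its time planning gets at most 1%
faster.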

I lack data to say for sure, but my impression is that the time spent
planning is -very small- (because EXPLAIN is ~ instant for any query,
and that does the plan). So today, for me, choosing to spend the day
on prepared statements, or on (say) bugtask:+index tuning, is a choice
between a fraction of a % win across the system, or a 30%-40% win on
one of the most popular pages in the system.

Like I say though:

>> I agree that they are a no brainer win, but I suspect we'd need some
>> infrastructure work in storm to bring them in.
>>
>> If you, or someone else wants to work on them in the near term, brilliant.
>>
>
> It's on my todo list to look at the storm codebase and persistence layer
> in lp in more detail :-) I'm happy to get stuck in once I've got a
> better understanding of this aspect of the system architecture.

Cool!

>> In the medium to long term, there are a few considerations to bear in mind.
>>
>> Firstly, once we have some breathing space on performance, we're going
>> to be reevaluating our DB stack : we need things like automatic
>> failover and write scaling that are absent today. One possible outcome
>> there is a move away from postgresql (or even SQL per se) in which
>> case investment in storm & stored procedures may be a dead end. I
>> don't have any feeling for how likely that is yet: it would be a huge
>> effort to move and there are lots of things to consider in evaluations in
>> that area.
>>
>
> One of the mandatory things IMHO is getting persistence infrastructure
> out of the core domain model :-D

I completely agree. I'd like to have a strong idea about what
persistence model we're targeting too though - I think it will inform
the design choices in moving stuff around.

-Rob

References

Performance question
From: Ian Booth, 2010-09-20
Re: Performance question
From: Robert Collins, 2010-09-20
Re: Performance question
From: Ian Booth, 2010-09-20