← Back to team overview

launchpad-dev team mailing list archive

Re: Performance Tuesday: enhancement; analysis; questions

 

On Wed, Oct 6, 2010 at 1:52 AM, Ian Booth <ian.booth@xxxxxxxxxxxxx> wrote:
> Hi
>
> I took the opportunity to look at one of the code team top 10's (both
> for duration and statement count). I thought I'd share what I found - it

\o/
> *Question:* does the "Repeated SQL Statements" section on the oops page
> collate the statements with or without taking into account the
> parameters being substituted? If without, should it?

OOPs discard the bind parameters, so they detect queries with the same
template and different parameters as repeated (which is desirable -
exact repeats are rare, the usual repeat we see is an associated table
being looked up just-in-time rather than en masse).

> *Analysis*
>
> 1. Duplicate sql in populating view items (which btw are count(*) which
> may be expensive in postgres I've gathered from others' posts).

Yes. The best way to think of count(*) cost is 'the same as doing the query'.

> Here's one particular issue which consumes, for this particular oops,
> 20% of the sql time. The root cause leads to a design issue which I'll
> pose as a question at the end. Very briefly, the
> PersonSubscribedBranchesView causes the instantiation of 2 copies of a
> PersonBranchesMenu object - one for the menu links under the page title
> and one for the links on the right hand side of the page. Each of these
> uses a property cache, but a new, different cache instance is used for
> each PersonBranchesMenu object, even though the underlying domain object
> whose properties are being cached represents the same real world entity.
> Hence as each PersonBranchesMenu attribute is populated, there is one
> cache miss per attribute per cache instance - it should be only one miss
> per attribute IMHO.

Can you put the cache on the domain object?

> The other issue with the way this page is populated is that the
> construction of both PersonBranchesMenu instances results in every
> single one of the defined "extra_attributes" being retrieved, even
> though one of those instances is only used to render the "Add Branch"
> link on the right of the page and none of the extra_attributes from that
> instance are actually used.

Thats certainly an issue too; asking for a narrower report from the
context may help?

> So we're executing potentially expensive sql to get attributes which
> have already been fetched in a different instance and which aren't in
> fact required for the instance in question. For this page, one of the
> PersonBranchesMenu instances comes from a reference directly in the page
> template, the other comes from an included macro.
>
> *Questions*
>
> a. Should the property caching infrastructure be made smarter so that
> for a given view rendering cycle, each domain object representation only
> uses a single cache instance?

It already does; you probably have two domain object instances, not
one instance with two caches. There are a few ways this could occur;
what particular object and cached attribute are you looking at? (so
that I can look at the code).

> b. Can anything be done to configure the tales/zope infrastructure so
> that entities involved in rendering the view be accessed as singletons
> where appropriate?

So the basic problem is that we have multiple 'templates' involved in
rendering, and each template has a 'view' and the view has a
'context'.

If you provide a single context for all the views you can get a single
instance; 'model' objects will have this behaviour intrinsically due
to the storm cache, other objects won't.

Alternatively, juidicious caching on the model objects can let things
traverse more sensibly - I did that two weeks back on bugtask:+index,
if you want an example.

> 2. Other duplicate sqls
>
> According to the oops report, there's a large number of duplicate sqls
> which don't seem to be used to directly populate any view items (but I
> haven't done a complete analysis yet).
>
> 17 x select ... from Person

LP_DEBUG_SQL_EXTRA=1 is your friend: run interactively with this, and
you can see whats triggering the call.

My money is on permission checking.

> 17 x select ... from ValidPersonCache

This is checking 'is_valid_person' - see Person._validity_clauses for
eager-loading support for this. Examples clients are in blueprints and
person.py.

> 14 x select .... FROM LibraryFileAlias

Probably team branding metadata lookups, a prime candidate for eager
loading - just get the tupleset and use DRS on it; these three sets of
queries probably should all be deleted and be either:
a single result set for the main object driving their retrieval
or
three queries done in the pre_iter_hook of a DRS for that main object

which makes most sense will depend on the number of NULLs and the
width of the resulting query.

> These don't take up too much sql time but to me the high execution
> counts (> 1/2 the total) suggest there's something somewhere that should
> be cached. Besides, these will add additional I/O load to the database
> server which all adds up if the system is under load. Perhaps there will
> be a similar sert of sqls executed for other similar views and a
> common/shared optimisation approach can be used? I don't yet know enough
> about the origin of these to make a call. Anyone else have any ideas?
>
> If you've managed to read this far without hitting delete, well done and
> go get a life :-)

Can't, sorry... I'm lifeless :)

-Rob



Follow ups

References