launchpad-dev team mailing list archive
-
launchpad-dev team
-
Mailing list archive
-
Message #06444
Re: performance tuesday - bugmessage progress and bug searches
-
To:
Launchpad Community Development Team <launchpad-dev@xxxxxxxxxxxxxxxxxxx>
-
From:
Julian Edwards <julian.edwards@xxxxxxxxxxxxx>
-
Date:
Wed, 9 Feb 2011 09:20:33 +0000
-
In-reply-to:
<AANLkTikvfiNJn-b5j9Boj-7Tp=qV0a-p4yDrZyB7o8L=@mail.gmail.com>
-
Organization:
Canonical Ltd
-
User-agent:
KMail/1.13.5 (Linux/2.6.35-25-generic; KDE/4.5.1; x86_64; ; )
On Wednesday 09 February 2011 00:12:51 Robert Collins wrote:
> Secondly, there is some confusion about what makes queries fast in
> SQL. Basically the goal has to be to reduce the amount of data being
> examined as rapidly as possible - to use the most selective index
> possible; our schemas need to be compatible with that goal, or we
> can't query efficiently.
I've made some queries over 50% faster by just removing unnecessary joins to
very large tables.
> Because of this, when dealing with two separate but highly selective
> indices on joined tables, it can be *faster* to do two queries and
> union them, than to do a single query with an OR clause: in principle
> the query planner can figure this all out, but in practice - well in
> practice we need to measure /everything/ we do :). So when we see a
> UNION query, don't assume its a bad query builder artifact - it may
> be, *or* it may be a solution to querying quickly with our schema. bug
> 714383 has details on a case where we should add a union in bug search
> because of this.
One thing I've noticed during my own optimising hack sessions is that it's
very hard to work out *why* a query was written in a particular way.
So, I'd love it if everyone could add some comments in the code to explain why
a query is constructed in a particular way.
For example,
* why is it pre-joining, what needs that pre-join?
* why are you using a union when an OR would suffice (apropos Rob above)
or anything else that's noteworthy. We should also be a bit more diligent
about asking about queries and their performance in reviews.
> For the same root causes - constraining the intermediary tables that
> are created, and the amount of data being processed, I'm growing more
> and more concerned about the impact of wide queries - 'prejoins' on
> our system; I'm starting to think we should have a default of set
> based eager loading rather than wide-query eager loading : what do you
> think?
Can you give a concrete example?
J
Follow ups
References