← Back to team overview

launchpad-dev team mailing list archive

performance tuesday - bug search timeouts, timeouts, and with queries

 

We had a bad day today with bug search. Something is causing our
bug_fti index to get fat faster than expected - we don't know why yet,
but this is the second outbreak we've had since I became TA. The
symptoms (hindsight being wonderful) are pretty clear: we have had
slowly increasing volumes of timeouts on fti queries such as +filebug
+filebug-find-similar which search only bug_fti, with inexplicable
slow plans.

What happened today though was that my patch to drop the (slow,
useless) bugtask_fti index from regular bug searches appeared to make
many bug searches slower. In actual fact, the index was massively
bloated (and this happened fairly rapidly):
17:43 < stub> We can repack on the slaves. Tricky on the master.
17:43 < stub> It wasn't particularly bloated last I looked...
17:45 < stub> Nope, bug_fti is pretty majorly bloated.
17:45 < wgrant> stub: How badly?
17:46 < stub> Free space at 93%. 20% would be normalish.

Stuart has repacked the bug_fti index on prod and the db slaves, and
we're now searching massively faster than we were yesterday (because
the patch dropping bugtask_fti is live, *and* we have shiny tight
indices.

Todays OOPS summary, even with 8 hours of poor search performance, had
a total of 307 timeouts, and 1816 pages that took over 10 seconds to
render. This is awesome - its half the timeout volume we had when we
last dropped the timeout limit : so I asked spm and its been lowered
again (after Stuart resolved the bug search performance issue). As
before, if things go suddenly bad, raise the timeout  : but I don't
expect things to go suddenly bad :).

Lastly I'd like to talk about an awesome feature we have in
postgresql8.4 - the WITH statement.  I have a patch up for storm which
adds support for using WITH - its subject to change without notice,
but we have it in lp (and I'll update it as I get further feedback on
the storm patch). There was a patch on storm trunk that was going to
cause a bit of disruption, so I've also got a branch with that backed
out - thats the one we're running today.

A simple WITH clause:
store.with_(SQL("some_products AS (SELECT id from product where name
like '%foo')")).find(Person, ...)

The point of this noddy example would be to get what is essentially a
variable - we can then use some_products in multiple parts of the main
query - without it being reevaluated. (Postgresql could in principle
share identical subclauses, but it doesn't yet).

WITH clauses can do much more than that though - they can be used to
implement recursion - graph traversal for instance - inside the DB.
Edwin landed a patch using this to eliminate timeouts on team
membership changes; Tim has a patch making blueprint dependencies (in
both directions) be done with a single query. Gary has a subscription
query which performs most snappily using a WITH clause. Lastly I've
landed two patches now using WITH clauses  - one to get the
sourcepackagenames in a component, and one to get the pots that are
relevant for suggestions : all of these patches have made dramatic
performance improvements.

Read the docs, look at the current queries in the code base - WITH can
be a significant win.

lp:~lifeless/storm/with
lp:~lifeless/storm/with-without-datetime
http://www.postgresql.org/docs/8.4/static/sql-select.html#SQL-WITH
http://www.postgresql.org/docs/8.4/static/queries-with.html



Follow ups