← Back to team overview

launchpad-dev team mailing list archive

decoupling bugmessage and message in the schema

 

So, I've been analyzing a bunch of bug performance, and the
Bug->BugMessage->Message split is a significant burden.

We end up doing a 4 table *minimum* query to answer things like 'what
open bugs has user X commented on'.

This is a problem because:
 - the openness (or not) is contained in bugtask
 - the message commenting status is contained in message
 - the bugmessage link is separate - but we need that.

Take pitti for example: 75000 messages
33K bugmessage links

Just enumerating those two fields takes 4.5 seconds:

 select count(bugmessage.id) from message,bugmessage where owner=100
and bugmessage.message=message.id;
 count
-------
 33149
(1 row)

Time: 4531.640 ms



QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=256803.87..256803.88 rows=1 width=4) (actual
time=5472.919..5472.920 rows=1 loops=1)
   ->  Hash Join  (cost=142973.14..256615.52 rows=75337 width=4)
(actual time=116.964..5465.813 rows=33149 loops=1)
         Hash Cond: (bugmessage.message = message.id)
         ->  Seq Scan on bugmessage  (cost=0.00..61445.48 rows=3741348
width=8) (actual time=0.010..1088.603 rows=3737977 loops=1)
         ->  Hash  (cost=142031.96..142031.96 rows=75294 width=4)
(actual time=116.874..116.874 rows=75175 loops=1)
               ->  Bitmap Heap Scan on message
(cost=1205.51..142031.96 rows=75294 width=4) (actual
time=20.798..87.278 rows=75175 loops=1)
                     Recheck Cond: (owner = 100)
                     ->  Bitmap Index Scan on message_owner_idx
(cost=0.00..1186.69 rows=75294 width=0) (actual time=14.215..14.215
rows=75175 loops=1)
                           Index Cond: (owner = 100)
 Total runtime: 5474.103 ms


As you can see here:
                                 Hash Cond: (bugmessage.message = message.id)
                                 -> Seq Scan on bugmessage
(cost=0.00..70798.85 rows=3027910 width=8) (actual
time=0.021..1354.923 rows=3026226 loops=1)
                                       Filter: (index > 0)
                                 -> Hash (cost=142031.96..142031.96
rows=75294 width=4) (actual time=122.059..122.059 rows=75175 loops=1)
                                       -> Bitmap Heap Scan on message
(cost=1205.51..142031.96 rows=75294 width=4) (actual
time=22.913..92.674 rows=75175 loops=1)
                                             Recheck Cond: (owner = 100)

this is precisely the subplan that turns up in the page for 'bugs this
user has commented on' for high volume users.

Now, I may be wrong, but I think its worth duplicating message rather
than sharing them between bugs and questions and merge proposals: we
have /very/ few shared messages today (640 bug messages have shared
ids). This doesn't need to mean duplicate model code, but we may need
to do something a little fancy to reuse model code with different db
tables.

Having a single table with the abillty to answer all three constraints
(bug, message owner, message index) at once would let postgresql do a
bitmap join between the bugtask.bug and bugmessage.bug fields (given
an appropriate index on bugmessage - and then bring in the bug itself
for privacy and product visibility.

We can move incrementally to this - for instance, we can start with
denormalised trigger-maintained fields on bugmessage, and if we can
get great queries out of that, do the larger shuffle.

Are there any reasons we /shouldn't/ do this?

Relatedly, BuildFarmJob and its related tables suffers nearly
identical poor or pathological query behaviour, but one thing at a
time.

-Rob



Follow ups