launchpad-dev team mailing list archive
-
launchpad-dev team
-
Mailing list archive
-
Message #02725
SQL/DDL help wanted to fix a distrosseries timeouts
There are frequent timeouts when we want to know about the bugs and
translations of source packages in a distro series. There is no source package
of course, the real information is scattered throughout the db. We always do
a query to workout which source packages are in a series then join to other
tables or subselects to learn about the packages.
I think the best answer is to modify sourcepackagename to store the
aggregations for fast queries. bug heat and total PO messages are what
our queries use.
Barring that, I am trying to make this subselect faster by many orders of
magnitude. There is an index on bugtask.sourcepackagename, but the query
does not use it. 3/5 of all bugtasks has a sourcepackgename, so an index
may be irrelevant.
explain
SELECT BugTask.sourcepackagename,
sum(Bug.heat) AS total_heat,
count(Bug.id) AS total_bugs
FROM
BugTask
JOIN Bug ON bugtask.bug = Bug.id
WHERE
BugTask.sourcepackagename is not NULL
AND BugTask.distribution = 1
AND BugTask.status in (10, 15, 20, 21, 22, 25)
GROUP BY BugTask.sourcepackagename;
HashAggregate (cost=131735.63..131743.54 rows=527 width=12)
-> Hash Join (cost=97898.98..131115.61 rows=82670 width=12)
Hash Cond: (bugtask.bug = bug.id)
-> Bitmap Heap Scan on bugtask (cost=3032.33..32043.23 rows=82670 width=8)
Recheck Cond: (status = ANY ('{10,15,20,21,22,25}'::integer[]))
Filter: ((sourcepackagename IS NOT NULL) AND (distribution = 1))
-> Bitmap Index Scan on bugtask__status__idx (cost=0.00..3011.67 rows=206145 width=0)
Index Cond: (status = ANY ('{10,15,20,21,22,25}'::integer[]))
-> Hash (cost=85994.40..85994.40 rows=540740 width=8)
-> Seq Scan on bug (cost=0.00..85994.40 rows=540740 width=8)
--
__Curtis C. Hovey_________
http://launchpad.net/
Attachment:
signature.asc
Description: This is a digitally signed message part
Follow ups