← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~cjwatson/launchpad/optimise-milestone-xref-query into lp:launchpad

 

Colin Watson has proposed merging lp:~cjwatson/launchpad/optimise-milestone-xref-query into lp:launchpad.

Commit message:
Make XRefSet.findFromMany more efficient for large numbers of IDs.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #1520281 in Launchpad itself: "Milestone:+index times out with enormous XRef query"
  https://bugs.launchpad.net/launchpad/+bug/1520281

For more details, see:
https://code.launchpad.net/~cjwatson/launchpad/optimise-milestone-xref-query/+merge/278732

Make XRefSet.findFromMany more efficient for large numbers of IDs.

Here's a before-and-after comparison of query plans with ten IDs (rather fewer than in the OOPS so there's negligible timing difference here, but the plans give an idea; also the plans report different row counts but those are lies):

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            QUERY PLAN                                                                                                                                                                                                                                                                                                                                                                                                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on xref  (cost=44.30..48.34 rows=1 width=68) (actual time=0.154..0.154 rows=1 loops=1)
   Recheck Cond: (((from_type = 'bug'::text) AND (from_id = '1433600'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1490949'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1286151'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1515529'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1515530'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1515547'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1507361'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1425442'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1515556'::text) AND (to_type = 'specification'::text)) OR ((from_type = 'bug'::text) AND (from_id = '1474597'::text) AND (to_type = 'specification'::text)))
   Buffers: shared hit=31
   ->  BitmapOr  (cost=44.30..44.30 rows=1 width=0) (actual time=0.138..0.138 rows=0 loops=1)
         Buffers: shared hit=30
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.053..0.053 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1433600'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.016..0.016 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1490949'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1286151'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.011..0.011 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1515529'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.006..0.006 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1515530'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.006..0.006 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1515547'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.006..0.006 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1507361'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.006..0.006 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1425442'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.005..0.005 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1515556'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
         ->  Bitmap Index Scan on xref_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.006..0.006 rows=0 loops=1)
               Index Cond: ((from_type = 'bug'::text) AND (from_id = '1474597'::text) AND (to_type = 'specification'::text))
               Buffers: shared hit=3
 Total runtime: 0.352 ms
(36 rows)

                                                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using xref_pkey on xref  (cost=0.43..48.32 rows=1 width=68) (actual time=0.058..0.147 rows=1 loops=1)
   Index Cond: ((from_type = 'bug'::text) AND (from_id = ANY ('{1433600,1490949,1286151,1515529,1515530,1515547,1507361,1425442,1515556,1474597}'::text[])) AND (to_type = 'specification'::text))
   Buffers: shared hit=31
 Total runtime: 0.247 ms
(4 rows)

On dogfood, this takes the relevant query on https://dogfood.paddev.net/fuel/+milestone/6.0 from ~800ms to ~25ms.
-- 
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~cjwatson/launchpad/optimise-milestone-xref-query into lp:launchpad.
=== modified file 'lib/lp/services/xref/model.py'
--- lib/lp/services/xref/model.py	2015-10-01 01:42:13 +0000
+++ lib/lp/services/xref/model.py	2015-11-26 17:15:01 +0000
@@ -9,6 +9,8 @@
     "XRefSet",
     ]
 
+from itertools import groupby
+
 import pytz
 from storm.expr import (
     And,
@@ -115,12 +117,15 @@
             return {}
 
         store = IStore(XRef)
+        extract_type = lambda id: id[0]
         rows = list(store.using(XRef).find(
             (XRef.from_type, XRef.from_id, XRef.to_type, XRef.to_id,
              XRef.creator_id, XRef.date_created, XRef.metadata),
             Or(*[
-                And(XRef.from_type == id[0], XRef.from_id == id[1])
-                for id in object_ids]),
+                And(XRef.from_type == from_type,
+                    XRef.from_id.is_in([id[1] for id in group]))
+                for from_type, group in groupby(
+                    sorted(object_ids, key=extract_type), extract_type)]),
             XRef.to_type.is_in(types) if types is not None else True))
         bulk.load(Person, [row[4] for row in rows])
         result = {}

=== modified file 'lib/lp/services/xref/tests/test_model.py'
--- lib/lp/services/xref/tests/test_model.py	2015-09-28 23:37:26 +0000
+++ lib/lp/services/xref/tests/test_model.py	2015-11-26 17:15:01 +0000
@@ -76,6 +76,8 @@
                 ('b', 'foo'): {'creator': creator, 'metadata': {'test': 1}}},
             ('b', 'foo'): {
                 ('a', 'baz'): {'creator': creator, 'metadata': {'test': 2}}},
+            ('b', 'baz'): {
+                ('a', 'quux'): {'creator': creator, 'metadata': {'test': 3}}},
             })
 
         with StormStatementRecorder() as recorder:
@@ -123,6 +125,21 @@
                     'metadata': {'test': 2}}}},
              bar_baz_refs)
 
+        with StormStatementRecorder() as recorder:
+            bar_foo_refs = getUtility(IXRefSet).findFromMany(
+                [('a', 'bar'), ('a', 'nonexistent'), ('b', 'baz')])
+        self.assertThat(recorder, HasQueryCount(Equals(2)))
+        self.assertEqual(
+            {('a', 'bar'): {
+                ('b', 'foo'): {
+                    'creator': creator, 'date_created': now,
+                    'metadata': {'test': 1}}},
+             ('b', 'baz'): {
+                ('a', 'quux'): {
+                    'creator': creator, 'date_created': now,
+                    'metadata': {'test': 3}}}},
+             bar_foo_refs)
+
     def test_findFrom_creator(self):
         # findFrom issues a single query to get all of the people.
         people = [self.factory.makePerson() for i in range(3)]


Follow ups