← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~adeuring/launchpad/bug-594247 into lp:launchpad/devel

 

Abel Deuring has proposed merging lp:~adeuring/launchpad/bug-594247 into lp:launchpad/devel.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  #594247 searchTasks with structural_subscriber times out regularly
  https://bugs.launchpad.net/bugs/594247


This branch fixes bug 594247: searchTasks with structural_subscriber
times out regularly.

The fix uses stub's suggestion from a comment in the bug report
to (LEFT) JOIN the tables Product and StructuralSubscription and
to define the filter clause "StructuralSubscriber=somebody"
directly in the main query.

This required some larger modification of the method
BugTaskSet.buildQuery(): Up to now, it returned four variables,
query, clauseTables, orderby_arg, decorator. query contains a string
with the WHERE clause, clauseTables contains a list of tables
used in the WHERE clause.

Simply adding Storm Join instances for

  LEFT OUTER JOIN Product ON BugTask.product=Product.id

and

  JOIN StructuralSubscription ON (some_longer_expression)

to clauseTables has at least one problem: The method BugTaskSet.search()
optionally joins some tables, including Product, in order to pre-load
some objects that will be needed later to process a request.

So it could happen that the table Product is JOINed twice: The first time
in clauseTables because it is needed in the WHERE expression for
structural subscription, the second time to pre-load Storm Product
instances.

Hence I did not add the Storm Join instances to clauseTables; instead,
BugTaskSet.buildQuery() now returns another parameter join_tables,
which contains a sequence of tuples (storm_table, join_expression).

A new method BugTaskSet.buildOrigin() now processes join_expression and
clauseTables together with a sequence of Join instances needed to
pre-load Product, Bug, SourcePackageName instances,  so that each
table appears exactly once in the WHERE clause of the SQL query.

The new way to find bugtasks related to a structural subscription
can return bug tasks twice, for example, if somebody has
a structural subscription to a distribution and to a distro source
package. So BugTaskSet.buildQuery() now returns one more parameter,
has_duplicate_results.

BugTaskSet.buildQuery() was called by BugTaskSet.search() and
by BugTaskSet.searchBugIds(). Processing the data returned by
buildQuery() became a bit more complicated, and BugTaskSet.search()
could already create two variants of the SQL query: one to retrieve
only BugTask instances and another to retrieve related Product,
Bug, SourcePackageName instances. We can consider searchBugIds()
to be variant of search() which just defines another type of data to
retrieve, so the SQL query is now built in a new method _search()
which is called by search() and by searchBugIds().

This required another tweak: BugTaskSet.buildQuery() returns a
decorator for the result set whichs avoids later SQL queries
ensuring that the current user is allowed to view a bug. (see
lp.bugs.model.bugtask.get_bug_privacy_filter_with_decorator())

This decorator assumes that the Storm query returns BugTask
instances, which is not the case for BugTaskSet.searchBugIds().
So BugTaskSet._search() has an optional parameter "undecorated".
If it is True, _search() returns a plain Storm result set,
otherwise, it returns a DecoratedResultSet.

Treatment of queries which may have duplicate result rows
("if has_duplicate_result_rows:" in _search()): We cannot simply
call resultset.config(distinct=True) because we may get two rows
for one bugtask if somebody is subscribed to a distribution and
to a distro source package and if we preload SourcePackageName.

Using a DISTINCT ON BugTask.id query would require sorting by
BugTask.id, so I simply used a sub-query.

The part of BugTaskSet.search() (now in _search()) which handles
queries with more that one BugTaskSearchParams instance now
preloads/prejoins Bug, Product etc only if _noprejoins is False.

Finally, I added some more tests to test_bugtask_search.py

no lint.

test: ./bin/test -vvt test_bugtask_search

It might make sense to change BugTaskSet.getAssignedMilestonesFromSearch()
in a follow-up brnach so that it calls _search() directly,
instead of calling search(), extracting milestone IDs
from the result set and running another SQL query to retrieve
these milestones.

-- 
https://code.launchpad.net/~adeuring/launchpad/bug-594247/+merge/40713
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~adeuring/launchpad/bug-594247 into lp:launchpad/devel.
=== modified file 'lib/lp/bugs/model/bugtask.py'
--- lib/lp/bugs/model/bugtask.py	2010-11-03 18:43:07 +0000
+++ lib/lp/bugs/model/bugtask.py	2010-11-12 12:17:06 +0000
@@ -35,12 +35,12 @@
 from storm.expr import (
     Alias,
     And,
-    AutoTables,
     Desc,
     In,
     Join,
     LeftJoin,
     Or,
+    Select,
     SQL,
     )
 from storm.store import (
@@ -161,6 +161,7 @@
     )
 from lp.registry.model.pillar import pillar_sort_key
 from lp.registry.model.sourcepackagename import SourcePackageName
+from lp.registry.model.structuralsubscription import StructuralSubscription
 from lp.services.propertycache import get_property_cache
 from lp.soyuz.enums import PackagePublishingStatus
 from lp.soyuz.model.publishing import SourcePackagePublishingHistory
@@ -1322,6 +1323,7 @@
     :seealso: get_bug_privacy_filter_with_decorator
     """
     userid = user.id
+
     def cache_user_can_view_bug(bugtask):
         get_property_cache(bugtask.bug)._known_viewers = set([userid])
         return bugtask
@@ -1610,7 +1612,9 @@
         from lp.bugs.model.bug import Bug
         extra_clauses = ['Bug.id = BugTask.bug']
         clauseTables = [BugTask, Bug]
+        join_tables = []
         decorators = []
+        has_duplicate_results = False
 
         # These arguments can be processed in a loop without any other
         # special handling.
@@ -1717,46 +1721,41 @@
                     sqlvalues(personid=params.subscriber.id))
 
         if params.structural_subscriber is not None:
-            structural_subscriber_clause = ("""BugTask.id IN (
-                SELECT BugTask.id FROM BugTask, StructuralSubscription
-                WHERE BugTask.product = StructuralSubscription.product
-                  AND StructuralSubscription.subscriber = %(personid)s
-                UNION ALL
-                SELECT BugTask.id FROM BugTask, StructuralSubscription
-                WHERE
-                  BugTask.distribution = StructuralSubscription.distribution
-                  AND BugTask.sourcepackagename =
-                      StructuralSubscription.sourcepackagename
-                  AND StructuralSubscription.subscriber = %(personid)s
-                UNION ALL
-                SELECT BugTask.id FROM BugTask, StructuralSubscription
-                WHERE
-                  BugTask.distroseries = StructuralSubscription.distroseries
-                  AND StructuralSubscription.subscriber = %(personid)s
-                UNION ALL
-                SELECT BugTask.id FROM BugTask, StructuralSubscription
-                WHERE
-                  BugTask.milestone = StructuralSubscription.milestone
-                  AND StructuralSubscription.subscriber = %(personid)s
-                UNION ALL
-                SELECT BugTask.id FROM BugTask, StructuralSubscription
-                WHERE
-                  BugTask.productseries = StructuralSubscription.productseries
-                  AND StructuralSubscription.subscriber = %(personid)s
-                UNION ALL
-                SELECT BugTask.id FROM BugTask, StructuralSubscription, Product
-                WHERE
-                  BugTask.product = Product.id
-                  AND Product.project = StructuralSubscription.project
-                  AND StructuralSubscription.subscriber = %(personid)s
-                UNION ALL
-                SELECT BugTask.id FROM BugTask, StructuralSubscription
-                WHERE
-                  BugTask.distribution = StructuralSubscription.distribution
-                  AND StructuralSubscription.sourcepackagename is NULL
-                  AND StructuralSubscription.subscriber = %(personid)s)""" %
-                sqlvalues(personid=params.structural_subscriber))
-            extra_clauses.append(structural_subscriber_clause)
+            join_clause = (
+                BugTask.productID == StructuralSubscription.productID)
+            join_clause = Or(
+                join_clause,
+                (BugTask.productseriesID ==
+                 StructuralSubscription.productseriesID))
+            # Circular.
+            from lp.registry.model.product import Product
+            join_clause = Or(
+                join_clause,
+                And(Product.projectID == StructuralSubscription.projectID,
+                    BugTask.product == Product.id))
+            join_clause = Or(
+                join_clause,
+                And((BugTask.distributionID ==
+                     StructuralSubscription.distributionID),
+                    Or(BugTask.sourcepackagenameID ==
+                       StructuralSubscription.sourcepackagenameID,
+                    StructuralSubscription.sourcepackagename == None)))
+            join_clause = Or(
+                join_clause,
+                (BugTask.distroseriesID ==
+                 StructuralSubscription.distroseriesID))
+            join_clause = Or(
+                join_clause,
+                BugTask.milestoneID == StructuralSubscription.milestoneID)
+            join_tables.append(
+                (Product, LeftJoin(Product, BugTask.productID == Product.id)))
+            join_tables.append(
+                (StructuralSubscription,
+                 Join(StructuralSubscription, join_clause)))
+            extra_clauses.append(
+                'StructuralSubscription.subscriber = %s'
+                % sqlvalues(params.structural_subscriber))
+            has_duplicate_results = True
 
         if params.component:
             clauseTables += [SourcePackagePublishingHistory,
@@ -1930,7 +1929,9 @@
                 for decor in decorators:
                     obj = decor(obj)
                 return obj
-        return query, clauseTables, orderby_arg, decorator
+        return (
+            query, clauseTables, orderby_arg, decorator, join_tables,
+            has_duplicate_results)
 
     def _buildUpstreamClause(self, params):
         """Return an clause for returning upstream data if the data exists.
@@ -2158,100 +2159,142 @@
             ', '.join(tables), ' AND '.join(clauses))
         return clause
 
-    def search(self, params, *args, **kwargs):
-        """See `IBugTaskSet`.
-
-        :param _noprejoins: Private internal parameter to BugTaskSet which
-            disables all use of prejoins : consolidated from code paths that
-            claim they were inefficient and unwanted.
-        """
-        # Circular.
-        from lp.registry.model.product import Product
-        from lp.bugs.model.bug import Bug
-        _noprejoins = kwargs.get('_noprejoins', False)
+    def buildOrigin(self, join_tables, prejoin_tables, clauseTables):
+        """Build the parameter list for Store.using().
+
+        :param join_tables: A sequence of tables that should be joined
+            as returned by buildQuery(). Each element has the form
+            (table, join), where table is the table to join and join
+            is a Storm Join or LeftJoin instance.
+        :param prejoin_tables: A sequence of tables that should additionally
+            be joined. Each element has the form (table, join),
+            where table is the table to join and join is a Storm Join
+            or LeftJoin instance.
+        :param clauseTables: A sequence of tables that should appear in
+            the FROM clause of a query. The join condition is defined in
+            the WHERE clause.
+
+        Tables may appear simultaneously in join_tables, prejoin_tables
+        and in clauseTables. This method ensures that each table
+        appears exactly once in the returned sequence.
+        """
+        origin = [BugTask]
+        already_joined = set(origin)
+        for table, join in join_tables:
+            origin.append(join)
+            already_joined.add(table)
+        for table, join in prejoin_tables:
+            if table not in already_joined:
+                origin.append(join)
+                already_joined.add(table)
+        for table in clauseTables:
+            if table not in already_joined:
+                origin.append(table)
+        return origin
+
+    def _search(self, resultrow, prejoins, params, *args, **kw):
+        """Return a Storm result set for the given search parameters.
+
+        :param resultrow: The type of data returned by the query.
+        :param prejoins: A sequence of Storm SQL row instances which are
+            pre-joined.
+        :param params: A BugTaskSearchParams instance.
+        :param args: optional additional BugTaskSearchParams instances,
+        """
+        undecorated = kw.get('undecorated', False)
         store = IStore(BugTask)
-        query, clauseTables, orderby, bugtask_decorator = self.buildQuery(
-            params)
+        (query, clauseTables, orderby, bugtask_decorator, join_tables,
+        has_duplicate_results) = self.buildQuery(params)
         if len(args) == 0:
-            if _noprejoins:
-                resultset = store.find(BugTask,
-                    AutoTables(SQL("1=1"), clauseTables),
-                    query)
+            if has_duplicate_results:
+                origin = self.buildOrigin(join_tables, [], clauseTables)
+                outer_origin = self.buildOrigin([], prejoins, [])
+                subquery = Select(BugTask.id, where=SQL(query), tables=origin)
+                resultset = store.using(*outer_origin).find(
+                    resultrow, In(BugTask.id, subquery))
+            else:
+                origin = self.buildOrigin(join_tables, prejoins, clauseTables)
+                resultset = store.using(*origin).find(resultrow, query)
+            if prejoins:
+                decorator=lambda row: bugtask_decorator(row[0])
+            else:
                 decorator = bugtask_decorator
-            else:
-                tables = clauseTables + [Product, SourcePackageName]
-                origin = [
-                    BugTask,
-                    LeftJoin(Bug, BugTask.bug == Bug.id),
-                    LeftJoin(Product, BugTask.product == Product.id),
-                    LeftJoin(
-                        SourcePackageName,
-                        BugTask.sourcepackagename == SourcePackageName.id),
-                    ]
-                # NB: these may work with AutoTables, but its hard to tell,
-                # this way is known to work.
-                if BugNomination in tables:
-                    # The relation is already in query.
-                    origin.append(BugNomination)
-                if BugSubscription in tables:
-                    # The relation is already in query.
-                    origin.append(BugSubscription)
-                if SourcePackageRelease in tables:
-                    origin.append(SourcePackageRelease)
-                if SourcePackagePublishingHistory in tables:
-                    origin.append(SourcePackagePublishingHistory)
-                resultset = store.using(*origin).find(
-                    (BugTask, Product, SourcePackageName, Bug),
-                    AutoTables(SQL("1=1"), tables),
-                    query)
-                decorator=lambda row: bugtask_decorator(row[0])
+
             resultset.order_by(orderby)
+            if undecorated:
+                return resultset
             return DecoratedResultSet(resultset, result_decorator=decorator)
 
         bugtask_fti = SQL('BugTask.fti')
-        result = store.find((BugTask, bugtask_fti), query,
-                            AutoTables(SQL("1=1"), clauseTables))
+        inner_resultrow = (BugTask, bugtask_fti)
+        origin = self.buildOrigin(join_tables, [], clauseTables)
+        resultset = store.using(*origin).find(inner_resultrow, query)
+
         decorators = [bugtask_decorator]
         for arg in args:
-            query, clauseTables, dummy, decorator = self.buildQuery(arg)
-            result = result.union(
-                store.find((BugTask, bugtask_fti), query,
-                           AutoTables(SQL("1=1"), clauseTables)))
+            (query, clauseTables, ignore, decorator, join_tables,
+             has_duplicate_results) = self.buildQuery(arg)
+            origin = self.buildOrigin(join_tables, [], clauseTables)
+            next_result = store.using(*origin).find(inner_resultrow, query)
+            resultset = resultset.union(next_result)
             # NB: assumes the decorators are all compatible.
             # This may need revisiting if e.g. searches on behalf of different
             # users are combined.
             decorators.append(decorator)
-        def decorator(row):
+
+        def prejoin_decorator(row):
             bugtask = row[0]
             for decorator in decorators:
                 bugtask = decorator(bugtask)
             return bugtask
 
-        # Build up the joins.
-        # TODO: implement _noprejoins for this code path: as of 20100818 it
-        # has been silently disabled because clients of the API were setting
-        # prejoins=[] which had no effect; this TODO simply notes the reality
-        # already existing when it was added.
-        joins = Alias(result._get_select(), "BugTask")
-        joins = Join(joins, Bug, BugTask.bug == Bug.id)
-        joins = LeftJoin(joins, Product, BugTask.product == Product.id)
-        joins = LeftJoin(joins, SourcePackageName,
-                         BugTask.sourcepackagename == SourcePackageName.id)
-
-        result = store.using(joins).find(
-            (BugTask, Bug, Product, SourcePackageName))
+        def simple_decorator(bugtask):
+            for decorator in decorators:
+                bugtask = decorator(bugtask)
+            return bugtask
+
+        origin = [Alias(resultset._get_select(), "BugTask")]
+        if prejoins:
+            origin += [join for table, join in prejoins]
+            decorator = prejoin_decorator
+        else:
+            decorator = simple_decorator
+
+        result = store.using(*origin).find(resultrow)
         result.order_by(orderby)
+        if undecorated:
+            return result
         return DecoratedResultSet(result, result_decorator=decorator)
 
+    def search(self, params, *args, **kwargs):
+        """See `IBugTaskSet`.
+
+        :param _noprejoins: Private internal parameter to BugTaskSet which
+            disables all use of prejoins : consolidated from code paths that
+            claim they were inefficient and unwanted.
+        """
+        # Circular.
+        from lp.registry.model.product import Product
+        from lp.bugs.model.bug import Bug
+        _noprejoins = kwargs.get('_noprejoins', False)
+        if _noprejoins:
+            prejoins = []
+            resultrow = BugTask
+        else:
+            prejoins = [
+                (Bug, LeftJoin(Bug, BugTask.bug == Bug.id)),
+                (Product, LeftJoin(Product, BugTask.product == Product.id)),
+                (SourcePackageName,
+                 LeftJoin(
+                     SourcePackageName,
+                     BugTask.sourcepackagename == SourcePackageName.id)),
+                ]
+            resultrow = (BugTask, Bug, Product, SourcePackageName, )
+        return self._search(resultrow, prejoins, params, *args)
+
     def searchBugIds(self, params):
         """See `IBugTaskSet`."""
-        query, clauseTables, orderby, decorator = self.buildQuery(
-            params)
-        store = IStore(BugTask)
-        resultset = store.find(BugTask.bugID,
-            AutoTables(SQL("1=1"), clauseTables), query)
-        resultset.order_by(orderby)
-        return resultset
+        return self._search(BugTask.bugID, [], params, undecorated=True)
 
     def getAssignedMilestonesFromSearch(self, search_results):
         """See `IBugTaskSet`."""
@@ -2854,8 +2897,9 @@
                 for subscription in subscriptions))
 
         if recipients is not None:
-            # We need to process subscriptions, so pull all the subscribes into
-            # the cache, then update recipients with the subscriptions.
+            # We need to process subscriptions, so pull all the
+            # subscribes into the cache, then update recipients
+            # with the subscriptions.
             subscribers = list(subscribers)
             for subscription in subscriptions:
                 recipients.addStructuralSubscriber(

=== modified file 'lib/lp/bugs/tests/test_bugtask_search.py'
--- lib/lp/bugs/tests/test_bugtask_search.py	2010-11-04 15:53:15 +0000
+++ lib/lp/bugs/tests/test_bugtask_search.py	2010-11-12 12:17:06 +0000
@@ -507,6 +507,23 @@
         self.assertSearchFinds(params, self.bugtasks[:1])
 
 
+class ProjectGroupAndDistributionTests:
+    """Tests which are useful for project groups and distributions."""
+
+    def setUpStructuralSubscriptions(self):
+        # Subscribe a user to the search target of this test and to
+        # another target.
+        raise NotImplementedError
+
+    def test_unique_results_for_multiple_structural_subscriptions(self):
+        # Searching for subscriber who is more than once subscribed to a
+        # bug task returns this bug task only once.
+        subscriber = self.setUpStructuralSubscriptions()
+        params = self.getBugTaskSearchParams(
+            user=None, structural_subscriber=subscriber)
+        self.assertSearchFinds(params, self.bugtasks)
+
+
 class BugTargetTestBase:
     """A base class for the bug target mixin classes."""
 
@@ -628,7 +645,8 @@
             bugtask, self.searchtarget.product)
 
 
-class ProjectGroupTarget(BugTargetTestBase, BugTargetWithBugSuperVisor):
+class ProjectGroupTarget(BugTargetTestBase, BugTargetWithBugSuperVisor,
+                         ProjectGroupAndDistributionTests):
     """Use a project group as the bug target."""
 
     def setUp(self):
@@ -698,6 +716,15 @@
             'No bug task found for a product that is not the target of '
             'the main test bugtask.')
 
+    def setUpStructuralSubscriptions(self):
+        # See `ProjectGroupAndDistributionTests`.
+        subscriber = self.factory.makePerson()
+        self.subscribeToTarget(subscriber)
+        with person_logged_in(subscriber):
+            self.bugtasks[0].target.addSubscription(
+                subscriber, subscribed_by=subscriber)
+        return subscriber
+
 
 class MilestoneTarget(BugTargetTestBase):
     """Use a milestone as the bug target."""
@@ -731,7 +758,8 @@
 
 
 class DistributionTarget(BugTargetTestBase, ProductAndDistributionTests,
-                         BugTargetWithBugSuperVisor):
+                         BugTargetWithBugSuperVisor,
+                         ProjectGroupAndDistributionTests):
     """Use a distribution as the bug target."""
 
     def setUp(self):
@@ -753,6 +781,18 @@
         """See `ProductAndDistributionTests`."""
         return self.factory.makeDistroSeries(distribution=self.searchtarget)
 
+    def setUpStructuralSubscriptions(self):
+        # See `ProjectGroupAndDistributionTests`.
+        subscriber = self.factory.makePerson()
+        sourcepackage = self.factory.makeDistributionSourcePackage(
+            distribution=self.searchtarget)
+        self.bugtasks.append(self.factory.makeBugTask(target=sourcepackage))
+        self.subscribeToTarget(subscriber)
+        with person_logged_in(subscriber):
+            sourcepackage.addSubscription(
+                subscriber, subscribed_by=subscriber)
+        return subscriber
+
 
 class DistroseriesTarget(BugTargetTestBase):
     """Use a distro series as the bug target."""
@@ -838,7 +878,30 @@
     )
 
 
-class PreloadBugtaskTargets:
+class MultipleParams:
+    """A mixin class for tests with more than one search parameter object.
+
+    BugTaskSet.search() can be called with more than one
+    BugTaskSearchParams instancs, while BugTaskSet.searchBugIds()
+    accepts exactly one instance.
+    """
+
+    def test_two_param_objects(self):
+        # We can pass more than one BugTaskSearchParams instance to
+        # BugTaskSet.search().
+        params1 = self.getBugTaskSearchParams(
+            user=None, status=BugTaskStatus.FIXCOMMITTED)
+        subscriber = self.factory.makePerson()
+        self.subscribeToTarget(subscriber)
+        params2 = self.getBugTaskSearchParams(
+            user=None, status=BugTaskStatus.NEW,
+            structural_subscriber=subscriber)
+        search_result = self.runSearch(params1, params2)
+        expected = self.resultValuesForBugtasks(self.bugtasks[1:])
+        self.assertEqual(expected, search_result)
+
+
+class PreloadBugtaskTargets(MultipleParams):
     """Preload bug targets during a BugTaskSet.search() query."""
 
     def setUp(self):
@@ -852,7 +915,7 @@
         return expected_bugtasks
 
 
-class NoPreloadBugtaskTargets:
+class NoPreloadBugtaskTargets(MultipleParams):
     """Do not preload bug targets during a BugTaskSet.search() query."""
 
     def setUp(self):