← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~gary/launchpad/bug723999 into lp:launchpad

 

Gary Poster has proposed merging lp:~gary/launchpad/bug723999 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  #723999 BugNomination:+edit-form structural subscriptions taking 4.8 seconds during nomination editing POST
  https://bugs.launchpad.net/bugs/723999

For more details, see:
https://code.launchpad.net/~gary/launchpad/bug723999/+merge/52133

This branch makes significant changes to how we calculate the structural subscribers for a given notification.  The previous set-based approach was elegant and may have had similar or even slightly better performance characteristics in some simple cases; however it created gigantic SQL queries and performed poorly when there were a few bugtasks for a bug, and filters for every subscription.

This branch takes a more classic approach to building a SQL query.  In tests, this brought a query that was taking 1.2-1.4 seconds to taking 300 milliseconds or less.

It also significantly reduced the SQL needed.  SQL is 2/3 size in the simplest case with a single bugtask and no tags; 1/2 size in a more complex case with a single bugtask with tags.  Once you have two tasks and on a bug with tags, the new approach's SQL is 1/3 the size, and as you add bugtasks, the ratio gets smaller and smaller.

It's worth noting that Storm support for the WITH statement would really help things.  Then I would do one WITH table for the first query, and in the case of the UNION for the tags, I might do another WITH query for the shared bits.  As it is, I'm trying to compromise between the cost of duplicating work in the database and the cost of making a separate Storm query.  The WITH statement support would also shorten the size of the SQL.  I'll be tempted to try and add that to Storm later, now that Postgres supports it.

I ended up having to use some somewhat esoteric SQL.  In addition to having a pre-implementation call with Danilo about this, I also ran it past him after I was finished.  He seemed pleased with explanations I gave verbally.  I have copiously commented the core function, so I hope that you will be able to follow along there as well.

This is a big branch--noticeably bigger than our 800 line goal. "make lint" forced some of the size on me--it's happy, but at a cost.  However, even before linting, I was still over the limit.  I tried to keep it as small as possible, and have left out some clean-ups that need to happen.  In particular, I tried to make minimal changes to the tests, so some methods that no longer serve any purpose in the codebase are left because they are exercised by valuable tests.  I will follow up with one or more later branches that clean these things up and move tests around appropriately.

Danilo and I agreed that we might want to remove the features of "include_any_tags" (the filter only accepts any bug that has one or more tag) and "exclude_any_tags" (the filter only accepts bugs that have no tags); and we felt that it might be worth reconsidering how the excluded tags work.  However, I kept those out of this branch, as out of scope.

That said, I did make one semantic change to an edge case.  Before, NOT find_all_tags meant (in part) that excluded tags (==NOT include) must *all* match.  I thought that was opposite what I would expect (and in fact initially implemented the other approach without realizing I was contradicting the tests).  After my changes, NOT find_all_tags means that *any* excluded tag must match.  For example, if a bug has the tags "foo" and "bar", a filter with find_all_tags = False and "-foo" (NOT include "foo") will now match because "bar" is not "foo".  (If find_all_tags = True, the filter will not match.  If find_all_tags = False and the filters tags are "-foo" and "-bar", it will not match).  Again, I might like to reconsider the broader approach to this feature in the future, but this is not the branch, or the time for that.

I won't say more here, in the hopes that the comments will guide you.

Thank you for looking at this over-sized branch.
-- 
https://code.launchpad.net/~gary/launchpad/bug723999/+merge/52133
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~gary/launchpad/bug723999 into lp:launchpad.
=== modified file 'lib/lp/bugs/model/bug.py'
--- lib/lp/bugs/model/bug.py	2011-03-01 05:05:26 +0000
+++ lib/lp/bugs/model/bug.py	2011-03-03 21:04:20 +0000
@@ -143,7 +143,6 @@
     )
 from lp.bugs.interfaces.bugnotification import IBugNotificationSet
 from lp.bugs.interfaces.bugtask import (
-    BugTaskSearchParams,
     BugTaskStatus,
     IBugTaskSet,
     UNRESOLVED_BUGTASK_STATUSES,
@@ -151,9 +150,6 @@
 from lp.bugs.interfaces.bugtracker import BugTrackerType
 from lp.bugs.interfaces.bugwatch import IBugWatchSet
 from lp.bugs.interfaces.cve import ICveSet
-from lp.bugs.interfaces.structuralsubscription import (
-    IStructuralSubscriptionTarget,
-    )
 from lp.bugs.mail.bugnotificationrecipients import BugNotificationRecipients
 from lp.bugs.model.bugattachment import BugAttachment
 from lp.bugs.model.bugbranch import BugBranch
@@ -170,6 +166,9 @@
     NullBugTask,
     )
 from lp.bugs.model.bugwatch import BugWatch
+from lp.bugs.model.structuralsubscription import (
+    get_all_structural_subscriptions,
+    )
 from lp.hardwaredb.interfaces.hwdb import IHWSubmissionBugSet
 from lp.registry.interfaces.distribution import IDistribution
 from lp.registry.interfaces.distributionsourcepackage import (
@@ -441,8 +440,8 @@
     @property
     def indexed_messages(self):
         """See `IMessageTarget`."""
-        # Note that this is a decorated result set, so will cache its value (in
-        # the absence of slices)
+        # Note that this is a decorated result set, so will cache its
+        # value (in the absence of slices)
         return self._indexed_messages(include_content=True)
 
     def _indexed_messages(self, include_content=False, include_parents=True):
@@ -554,7 +553,6 @@
     def bugtasks(self):
         """See `IBug`."""
         # \o/ circular imports.
-        from lp.bugs.model.bugwatch import BugWatch
         from lp.registry.model.distribution import Distribution
         from lp.registry.model.distroseries import DistroSeries
         from lp.registry.model.product import Product
@@ -562,17 +560,18 @@
         from lp.registry.model.sourcepackagename import SourcePackageName
         store = Store.of(self)
         tasks = list(store.find(BugTask, BugTask.bugID == self.id))
-        # The bugtasks attribute is iterated in the API and web services, so it
-        # needs to preload all related data otherwise late evaluation is
-        # triggered in both places. Separately, bugtask_sort_key requires
-        # the related products, series, distros, distroseries and source
-        # package names to be loaded.
+        # The bugtasks attribute is iterated in the API and web
+        # services, so it needs to preload all related data otherwise
+        # late evaluation is triggered in both places. Separately,
+        # bugtask_sort_key requires the related products, series,
+        # distros, distroseries and source package names to be loaded.
         ids = set(map(operator.attrgetter('assigneeID'), tasks))
         ids.update(map(operator.attrgetter('ownerID'), tasks))
         ids.discard(None)
         if ids:
             list(getUtility(IPersonSet).getPrecachedPersonsFromIDs(
                 ids, need_validity=True))
+
         def load_something(attrname, klass):
             ids = set(map(operator.attrgetter(attrname), tasks))
             ids.discard(None)
@@ -1401,8 +1400,8 @@
 
     def getMessagesForView(self, slice_info):
         """See `IBug`."""
-        # Note that this function and indexed_messages have significant overlap
-        # and could stand to be refactored.
+        # Note that this function and indexed_messages have significant
+        # overlap and could stand to be refactored.
         slices = []
         if slice_info is not None:
             # NB: This isn't a full implementation of the slice protocol,
@@ -1438,6 +1437,7 @@
             BugMessage.bug==self.id,
             *ranges)
         result.order_by(BugMessage.index, MessageChunk.sequence)
+
         def eager_load_owners(rows):
             owners = set()
             for row in rows:
@@ -2183,29 +2183,7 @@
     @freeze(StructuralSubscriptionSet)
     def structural_subscriptions(self):
         """Structural subscriptions to the bug's targets."""
-        query_arguments = []
-        for bugtask in self.bug.bugtasks:
-            if IStructuralSubscriptionTarget.providedBy(bugtask.target):
-                query_arguments.append((bugtask.target, bugtask))
-                if bugtask.target.parent_subscription_target is not None:
-                    query_arguments.append(
-                        (bugtask.target.parent_subscription_target, bugtask))
-            if ISourcePackage.providedBy(bugtask.target):
-                # Distribution series bug tasks with a package have the source
-                # package set as their target, so we add the distroseries
-                # explicitly to the set of subscription targets.
-                query_arguments.append((bugtask.distroseries, bugtask))
-            if bugtask.milestone is not None:
-                query_arguments.append((bugtask.milestone, bugtask))
-        # Build the query.
-        empty = EmptyResultSet()
-        union = lambda left, right: (
-            removeSecurityProxy(left).union(
-                removeSecurityProxy(right)))
-        queries = (
-            target.getSubscriptionsForBugTask(bugtask, self.level)
-            for target, bugtask in query_arguments)
-        return reduce(union, queries, empty)
+        return get_all_structural_subscriptions(self.bug.bugtasks)
 
     @cachedproperty
     @freeze(BugSubscriberSet)

=== modified file 'lib/lp/bugs/model/bugtask.py'
--- lib/lp/bugs/model/bugtask.py	2011-02-27 19:45:44 +0000
+++ lib/lp/bugs/model/bugtask.py	2011-03-03 21:04:20 +0000
@@ -100,7 +100,6 @@
     )
 from lp.app.enums import ServiceUsage
 from lp.app.errors import NotFoundError
-from lp.bugs.enum import BugNotificationLevel
 from lp.bugs.interfaces.bug import IBugSet
 from lp.bugs.interfaces.bugattachment import BugAttachmentType
 from lp.bugs.interfaces.bugnomination import BugNominationStatus
@@ -130,12 +129,14 @@
     UserCannotEditBugTaskMilestone,
     UserCannotEditBugTaskStatus,
     )
-from lp.bugs.interfaces.structuralsubscription import (
-    IStructuralSubscriptionTarget,
-    )
 from lp.bugs.model.bugnomination import BugNomination
 from lp.bugs.model.bugsubscription import BugSubscription
-from lp.bugs.model.structuralsubscription import StructuralSubscription
+from lp.bugs.model.structuralsubscription import (
+    get_all_structural_subscriptions,
+    get_structural_subscribers_for_bugtasks,
+    get_structural_subscription_targets,
+    StructuralSubscription,
+    )
 from lp.registry.interfaces.distribution import (
     IDistribution,
     IDistributionSet,
@@ -2096,7 +2097,8 @@
                     """EXISTS (
                         SELECT id FROM BugBranch WHERE BugBranch.bug=Bug.id)
                     """)
-            elif params.linked_branches == BugBranchSearch.BUGS_WITHOUT_BRANCHES:
+            elif (params.linked_branches ==
+                  BugBranchSearch.BUGS_WITHOUT_BRANCHES):
                 extra_clauses.append(
                     """NOT EXISTS (
                         SELECT id FROM BugBranch WHERE BugBranch.bug=Bug.id)
@@ -2130,6 +2132,7 @@
         if not decorators:
             decorator = lambda x: x
         else:
+
             def decorator(obj):
                 for decor in decorators:
                     obj = decor(obj)
@@ -3108,71 +3111,13 @@
         Each bugtask may be responsible theoretically for 0 or more targets.
         In practice, each generates one, two or three.
         """
-        for bugtask in bugtasks:
-            if IStructuralSubscriptionTarget.providedBy(bugtask.target):
-                yield (bugtask, bugtask.target)
-                if bugtask.target.parent_subscription_target is not None:
-                    yield (bugtask, bugtask.target.parent_subscription_target)
-            if ISourcePackage.providedBy(bugtask.target):
-                # Distribution series bug tasks with a package have the source
-                # package set as their target, so we add the distroseries
-                # explicitly to the set of subscription targets.
-                yield (bugtask, bugtask.distroseries)
-            if bugtask.milestone is not None:
-                yield (bugtask, bugtask.milestone)
+        return get_structural_subscription_targets(bugtasks)
 
-    def getAllStructuralSubscriptions(self, bugtasks, recipient):
+    def getAllStructuralSubscriptions(self, bugtasks, recipient=None):
         """See `IBugTaskSet`."""
-        targets = [target for bugtask, target
-                   in self._getStructuralSubscriptionTargets(bugtasks)]
-        if len(targets) == 0:
-            return EmptyResultSet()
-        union = lambda left, right: (
-            removeSecurityProxy(left).union(
-                removeSecurityProxy(right)))
-        queries = (
-            target.getSubscriptions(recipient) for target in targets)
-        return reduce(union, queries)
+        return get_all_structural_subscriptions(bugtasks, recipient)
 
     def getStructuralSubscribers(self, bugtasks, recipients=None, level=None):
         """See `IBugTaskSet`."""
-        query_arguments = list(
-            self._getStructuralSubscriptionTargets(bugtasks))
-
-        if len(query_arguments) == 0:
-            return EmptyResultSet()
-
-        if level is None:
-            # If level is not specified, default to NOTHING so that all
-            # subscriptions are found.
-            level = BugNotificationLevel.NOTHING
-
-        # Build the query.
-        union = lambda left, right: (
-            removeSecurityProxy(left).union(
-                removeSecurityProxy(right)))
-        queries = (
-            target.getSubscriptionsForBugTask(bugtask, level)
-            for bugtask, target in query_arguments)
-        subscriptions = reduce(union, queries)
-
-        # Pull all the subscriptions in.
-        subscriptions = list(subscriptions)
-
-        # Prepare a query for the subscribers.
-        from lp.registry.model.person import Person
-        subscribers = IStore(Person).find(
-            Person, Person.id.is_in(
-                removeSecurityProxy(subscription).subscriberID
-                for subscription in subscriptions))
-
-        if recipients is not None:
-            # We need to process subscriptions, so pull all the
-            # subscribers into the cache, then update recipients
-            # with the subscriptions.
-            subscribers = list(subscribers)
-            for subscription in subscriptions:
-                recipients.addStructuralSubscriber(
-                    subscription.subscriber, subscription.target)
-
-        return subscribers
+        return get_structural_subscribers_for_bugtasks(
+            bugtasks, recipients, level)

=== modified file 'lib/lp/bugs/model/structuralsubscription.py'
--- lib/lp/bugs/model/structuralsubscription.py	2011-02-21 15:55:48 +0000
+++ lib/lp/bugs/model/structuralsubscription.py	2011-03-03 21:04:20 +0000
@@ -3,6 +3,9 @@
 
 __metaclass__ = type
 __all__ = [
+    'get_all_structural_subscriptions',
+    'get_structural_subscribers_for_bugtasks',
+    'get_structural_subscription_targets',
     'StructuralSubscription',
     'StructuralSubscriptionTargetMixin',
     ]
@@ -17,12 +20,11 @@
 
 from storm.base import Storm
 from storm.expr import (
-    Alias,
     And,
     CompoundOper,
-    Except,
+    Count,
     In,
-    Intersect,
+    Join,
     LeftJoin,
     NamedFunc,
     Not,
@@ -31,7 +33,10 @@
     SQL,
     Union,
     )
-from storm.store import Store
+from storm.store import (
+    Store,
+    EmptyResultSet,
+    )
 from zope.component import (
     adapts,
     getUtility,
@@ -42,6 +47,12 @@
 from canonical.database.sqlbase import quote
 from canonical.launchpad.interfaces.launchpad import ILaunchpadCelebrities
 from canonical.launchpad.interfaces.lpstorm import IStore
+from lp.bugs.interfaces.structuralsubscription import (
+    IStructuralSubscription,
+    IStructuralSubscriptionTarget,
+    IStructuralSubscriptionTargetHelper,
+    )
+from lp.bugs.model.bugsubscription import BugSubscription
 from lp.bugs.model.bugsubscriptionfilter import BugSubscriptionFilter
 from lp.bugs.model.bugsubscriptionfilterimportance import (
     BugSubscriptionFilterImportance,
@@ -67,11 +78,7 @@
 from lp.registry.interfaces.product import IProduct
 from lp.registry.interfaces.productseries import IProductSeries
 from lp.registry.interfaces.projectgroup import IProjectGroup
-from lp.bugs.interfaces.structuralsubscription import (
-    IStructuralSubscription,
-    IStructuralSubscriptionTarget,
-    IStructuralSubscriptionTargetHelper,
-    )
+from lp.registry.interfaces.sourcepackage import ISourcePackage
 from lp.services.propertycache import cachedproperty
 
 
@@ -471,243 +478,412 @@
 
     def getSubscriptionsForBugTask(self, bugtask, level):
         """See `IStructuralSubscriptionTarget`."""
-        set_builder = BugFilterSetBuilder(
-            bugtask, level, self.__helper.join)
-        return Store.of(self.__helper.pillar).find(
-            StructuralSubscription, In(
-                StructuralSubscription.id,
-                set_builder.subscriptions))
+        # Note that this method does not take into account
+        # structural subscriptions without filters.  Since it is only
+        # used for tests at this point, that's not a problem; moreover,
+        # we intend all structural subscriptions to have filters.
+        candidates, filter_id_query = (
+            _get_structural_subscription_filter_id_query(
+                bugtask.bug, [bugtask], level))
+        if not candidates:
+            return EmptyResultSet()
+        return IStore(StructuralSubscription).find(
+            StructuralSubscription,
+            BugSubscriptionFilter.structural_subscription_id ==
+            StructuralSubscription.id,
+            In(BugSubscriptionFilter.id,
+               filter_id_query)).config(distinct=True)
+
+
+def get_structural_subscription_targets(bugtasks):
+    """Return (bugtask, target) pairs for each target of the bugtasks.
+
+    Each bugtask may be responsible theoretically for 0 or more targets.
+    In practice, each generates one, two or three.
+    """
+    for bugtask in bugtasks:
+        if IStructuralSubscriptionTarget.providedBy(bugtask.target):
+            yield (bugtask, bugtask.target)
+            if bugtask.target.parent_subscription_target is not None:
+                yield (bugtask, bugtask.target.parent_subscription_target)
+        if ISourcePackage.providedBy(bugtask.target):
+            # Distribution series bug tasks with a package have the source
+            # package set as their target, so we add the distroseries
+            # explicitly to the set of subscription targets.
+            yield (bugtask, bugtask.distroseries)
+        if bugtask.milestone is not None:
+            yield (bugtask, bugtask.milestone)
+
+
+def _get_all_structural_subscriptions(find, targets, *conditions):
+    """Find the structural subscriptions for the given targets.
+
+    "find" should be what to find (typically StructuralSubscription or
+    StructuralSubscription.id).  "targets" is an iterable of (bugtask,
+    target) pairs, as returned by get_structural_subscription_targets.
+    You can add in zero or more additional conditions to filter the
+    results.
+    """
+    return list(
+        IStore(StructuralSubscription).find(
+            find,
+            Or(*[IStructuralSubscriptionTargetHelper(target).join
+                 for target
+                 in set(target for bugtask, target
+                        in targets)]),
+            *conditions))
+
+
+def get_all_structural_subscriptions(bugtasks, person=None):
+    if not bugtasks:
+        return EmptyResultSet()
+    conditions = []
+    if person is not None:
+        conditions.append(
+            StructuralSubscription.subscriber == person)
+    return _get_all_structural_subscriptions(
+        StructuralSubscription,
+        get_structural_subscription_targets(bugtasks),
+        *conditions)
+
+
+def _get_structural_subscribers(candidates, filter_id_query, recipients):
+    if not candidates:
+        return EmptyResultSet()
+    # This is here because of a circular import.
+    from lp.registry.model.person import Person
+    source = IStore(StructuralSubscription).using(
+        StructuralSubscription,
+        # We need to do this LeftJoin because we still have structural
+        # subscriptions without filters in qastaging and production.
+        # Once we do not, we can just use a Join.  Also see constraints
+        # below.
+        LeftJoin(BugSubscriptionFilter,
+             BugSubscriptionFilter.structural_subscription_id ==
+             StructuralSubscription.id),
+        Join(Person,
+             Person.id == StructuralSubscription.subscriberID),
+        )
+    constraints = [
+        # We need to do this Or because we still have structural
+        # subscriptions without filters in qastaging and production.
+        # Once we do not, we can simplify this to just
+        # "In(BugSubscriptionFilter.id, filter_id_query)".  Also see
+        # LeftJoin above.
+        Or(In(BugSubscriptionFilter.id, filter_id_query),
+           And(In(StructuralSubscription.id, candidates),
+               BugSubscriptionFilter.id == None))]
+    if recipients is None:
+        return source.find(
+            Person, *constraints).config(distinct=True).order_by()
+    else:
+        subscribers = []
+        query_results = source.find(
+            (Person, StructuralSubscription),
+            *constraints).config(distinct=True)
+        for person, subscription in query_results:
+            # Set up results.
+            if person not in recipients:
+                subscribers.append(person)
+                recipients.addStructuralSubscriber(
+                    person, subscription.target)
+        return subscribers
+
+
+def get_structural_subscribers_for_bugtasks(bugtasks,
+                                            recipients=None,
+                                            level=None):
+    """Return subscribers for structural filters for the bugtasks at "level".
+
+    All bugtasks must be for the same bug.
+
+    Returns None if we already know that the result is empty.
+
+    Excludes structural subscriptions for people who are directly subscribed
+    to the bug.
+
+    Populates "recipients" (a
+    lp.bugs.mail.bugnotificationrecipients.BugNotificationRecipients) if
+    given."""
+    if not bugtasks:
+        return EmptyResultSet()
+    bugs = set(bugtask.bug for bugtask in bugtasks)
+    if len(bugs) > 1:
+        raise NotImplementedError('Each bugtask must be from the same bug.')
+    bug = bugs.pop()
+    candidates, query = _get_structural_subscription_filter_id_query(
+        bug, bugtasks, level)
+    return _get_structural_subscribers(candidates, query, recipients)
 
 
 class ArrayAgg(NamedFunc):
+    "Aggregate values (within a GROUP BY) into an array."
     __slots__ = ()
     name = "ARRAY_AGG"
 
 
 class ArrayContains(CompoundOper):
+    "True iff the left side is a superset of the right side."
     __slots__ = ()
     oper = "@>"
 
 
-class BugFilterSetBuilder:
-    """A convenience class to build queries for getSubscriptionsForBugTask."""
-
-    def __init__(self, bugtask, level, join_condition):
-        """Initialize a new set builder for bug filters.
-
-        :param bugtask: The `IBugTask` to match against.
-        :param level: A member of `BugNotificationLevel`.
-        :param join_condition: A condition for selecting structural
-            subscriptions. Generally this should limit the subscriptions to a
-            particular target (i.e. project or distribution).
-        """
-        self.status = bugtask.status
-        self.importance = bugtask.importance
-        # The list() gets around some weirdness with security proxies; Storm
-        # does not know how to compile an expression with a proxied list.
-        self.tags = list(bugtask.bug.tags)
-        # Set up common conditions.
-        self.base_conditions = join_condition
-        # Set up common filter conditions.
-        self.filter_conditions = And(
-            BugSubscriptionFilter.bug_notification_level >= level,
-            self.base_conditions)
-        if len(self.tags) == 0:
-            self.filter_conditions = And(
-                # When the bug has no tags, filters with include_any_tags set
-                # can never match.
-                Not(BugSubscriptionFilter.include_any_tags),
-                self.filter_conditions)
-        else:
-            self.filter_conditions = And(
-                # When the bug has tags, filters with exclude_any_tags set can
-                # never match.
-                Not(BugSubscriptionFilter.exclude_any_tags),
-                self.filter_conditions)
-
-    @property
-    def subscriptions_without_filters(self):
-        """Subscriptions without filters."""
-        return Select(
-            StructuralSubscription.id,
-            tables=(
-                StructuralSubscription,
-                LeftJoin(
-                    BugSubscriptionFilter,
-                    BugSubscriptionFilter.structural_subscription_id == (
-                        StructuralSubscription.id))),
-            where=And(
-                BugSubscriptionFilter.id == None,
-                self.base_conditions))
-
-    def _filters_matching_x(self, join, where_condition, **extra):
-        """Return an expression yielding `(subscription_id, filter_id)` rows.
-
-        The expressions returned by this function are used in set (union,
-        intersect, except) operations at the *filter* level. However, the
-        interesting result of these set operations is the structural
-        subscription, hence both columns are included in the expressions
-        generated. Since a structural subscription can have zero or more
-        filters, and a filter can never be associated with more than one
-        subscription, the set operations are unaffected.
-        """
-        return Select(
-            columns=(
-                # Alias this column so it can be selected in
-                # subscriptions_matching.
-                Alias(
-                    BugSubscriptionFilter.structural_subscription_id,
-                    "structural_subscription_id"),
-                BugSubscriptionFilter.id),
-            tables=(
-                StructuralSubscription, BugSubscriptionFilter, join),
-            where=And(
-                BugSubscriptionFilter.structural_subscription_id == (
-                    StructuralSubscription.id),
-                self.filter_conditions,
-                where_condition),
-            **extra)
-
-    @property
-    def filters_matching_status(self):
-        """Filters with the given bugtask's status."""
-        join = LeftJoin(
-            BugSubscriptionFilterStatus,
-            BugSubscriptionFilterStatus.filter_id == (
-                BugSubscriptionFilter.id))
-        condition = Or(
-            BugSubscriptionFilterStatus.id == None,
-            BugSubscriptionFilterStatus.status == self.status)
-        return self._filters_matching_x(join, condition)
-
-    @property
-    def filters_matching_importance(self):
-        """Filters with the given bugtask's importance."""
-        join = LeftJoin(
-            BugSubscriptionFilterImportance,
-            BugSubscriptionFilterImportance.filter_id == (
-                BugSubscriptionFilter.id))
-        condition = Or(
-            BugSubscriptionFilterImportance.id == None,
-            BugSubscriptionFilterImportance.importance == self.importance)
-        return self._filters_matching_x(join, condition)
-
-    @property
-    def filters_without_include_tags(self):
-        """Filters with no tags required."""
-        join = LeftJoin(
-            BugSubscriptionFilterTag,
-            And(BugSubscriptionFilterTag.filter_id == (
-                    BugSubscriptionFilter.id),
-                BugSubscriptionFilterTag.include))
-        return self._filters_matching_x(
-            join, BugSubscriptionFilterTag.id == None)
-
-    @property
-    def filters_matching_any_include_tags(self):
-        """Filters including any of the bug's tags."""
-        condition = And(
-            BugSubscriptionFilterTag.filter_id == (
-                BugSubscriptionFilter.id),
-            BugSubscriptionFilterTag.include,
-            Not(BugSubscriptionFilter.find_all_tags),
-            In(BugSubscriptionFilterTag.tag, self.tags))
-        return self._filters_matching_x(
-            BugSubscriptionFilterTag, condition)
-
-    @property
-    def filters_matching_any_exclude_tags(self):
-        """Filters excluding any of the bug's tags."""
-        condition = And(
-            BugSubscriptionFilterTag.filter_id == (
-                BugSubscriptionFilter.id),
-            Not(BugSubscriptionFilterTag.include),
-            Not(BugSubscriptionFilter.find_all_tags),
-            In(BugSubscriptionFilterTag.tag, self.tags))
-        return self._filters_matching_x(
-            BugSubscriptionFilterTag, condition)
-
-    def _filters_matching_all_x_tags(self, where_condition):
-        """Return an expression yielding `(subscription_id, filter_id)` rows.
-
-        This joins to `BugSubscriptionFilterTag` and calls up to
-        `_filters_matching_x`, and groups by filter. Conditions are added to
-        ensure that all rows in each group are a subset of the bug's tags.
-        """
-        tags_array = "ARRAY[%s]::TEXT[]" % ",".join(
-            quote(tag) for tag in self.tags)
-        return self._filters_matching_x(
-            BugSubscriptionFilterTag,
-            And(
-                BugSubscriptionFilterTag.filter_id == (
-                    BugSubscriptionFilter.id),
-                BugSubscriptionFilter.find_all_tags,
-                self.filter_conditions,
-                where_condition),
-            group_by=(
-                BugSubscriptionFilter.structural_subscription_id,
-                BugSubscriptionFilter.id),
-            having=ArrayContains(
-                SQL(tags_array), ArrayAgg(
-                    BugSubscriptionFilterTag.tag)))
-
-    @property
-    def filters_matching_all_include_tags(self):
-        """Filters including the bug's tags."""
-        return self._filters_matching_all_x_tags(
-            BugSubscriptionFilterTag.include)
-
-    @property
-    def filters_matching_all_exclude_tags(self):
-        """Filters excluding the bug's tags."""
-        return self._filters_matching_all_x_tags(
-            Not(BugSubscriptionFilterTag.include))
-
-    @property
-    def filters_matching_include_tags(self):
-        """Filters with tag filters including the bug."""
-        return Union(
-            self.filters_matching_any_include_tags,
-            self.filters_matching_all_include_tags)
-
-    @property
-    def filters_matching_exclude_tags(self):
-        """Filters with tag filters excluding the bug."""
-        return Union(
-            self.filters_matching_any_exclude_tags,
-            self.filters_matching_all_exclude_tags)
-
-    @property
-    def filters_matching_tags(self):
-        """Filters with tag filters matching the bug."""
-        if len(self.tags) == 0:
-            # The filter's required tags must be an empty set. The filter's
-            # excluded tags can be anything so no condition is needed.
-            return self.filters_without_include_tags
-        else:
-            return Except(
-                Union(self.filters_without_include_tags,
-                      self.filters_matching_include_tags),
-                self.filters_matching_exclude_tags)
-
-    @property
-    def filters_matching(self):
-        """Filters matching the bug."""
-        return Intersect(
-            self.filters_matching_status,
-            self.filters_matching_importance,
-            self.filters_matching_tags)
-
-    @property
-    def subscriptions_with_matching_filters(self):
-        """Subscriptions with one or more filters matching the bug."""
-        return Select(
-            # I don't know of a more Storm-like way of doing this.
-            SQL("filters_matching.structural_subscription_id"),
-            tables=Alias(self.filters_matching, "filters_matching"))
-
-    @property
-    def subscriptions(self):
-        return Union(
-            self.subscriptions_without_filters,
-            self.subscriptions_with_matching_filters)
+class ArrayIntersects(CompoundOper):
+    "True iff the left side shares at least one element with the right side."
+    __slots__ = ()
+    oper = "&&"
+
+
+def _get_structural_subscription_filter_id_query(bug, bugtasks, level):
+    """Helper function.
+
+    This provides the core implementation for
+    get_structural_subscribers_for_bug and
+    get_structural_subscribers_for_bugtask.  "bug" should
+    ba a bug.  "bugtasks" is an iterable of one or more bugtasks of the bug.
+    "level" is a notification level.
+    """
+    # We get the ids because we need to use group by in order to
+    # look at the filters' tags in aggregate.  Once we have the ids,
+    # we can get the full set of what we need in subsuming or
+    # subsequent SQL calls.
+    # (Aside 1: We could in theory get all the fields we wanted with
+    # a hack--we could use an aggregrate function like max to get
+    # fields that we know will be unique--but Storm would not like
+    # it.)
+    # (Aside 2: IMO Postgres should allow getting other fields if
+    # the group-by key is a primary key and the other fields desired
+    # are other values from the same table as the group-by key, or
+    # values of a table linked by a foreign key from the same table
+    # as the group-by key...but that's dreaming.)
+    # See the docstring of get_structural_subscription_targets.
+    query_arguments = list(
+        get_structural_subscription_targets(bugtasks))
+    assert len(query_arguments) > 0, (
+        'Programmer error: expected query arguments')
+    # With large numbers of filters in the system, it's fastest in our
+    # tests if we get a set of structural subscriptions pertinent to the
+    # given targets, and then work with that.  It also comes in handy
+    # when we have to do a union, because we can share the work across
+    # the two queries.
+    # We will exclude people who have a direct subscription to the bug.
+    filters = [
+        Not(In(StructuralSubscription.subscriberID,
+               Select(BugSubscription.person_id,
+                      BugSubscription.bug == bug)))]
+    candidates = _get_all_structural_subscriptions(
+        StructuralSubscription.id, query_arguments, *filters)
+    if not candidates:
+        # If there are no structural subscriptions for these targets,
+        # then we don't need to look at the importance, status, and
+        # tags.  We're done.
+        return None, None
+    # The "select_args" dictionary holds the arguments that we will
+    # pass to one or more SELECT clauses.  We start with what will
+    # become the FROM clause.  We always want the following Joins,
+    # so we can add them here at the beginning.
+    select_args = {
+        'tables': [
+            StructuralSubscription,
+            Join(BugSubscriptionFilter,
+                 BugSubscriptionFilter.structural_subscription_id ==
+                 StructuralSubscription.id),
+            LeftJoin(BugSubscriptionFilterStatus,
+                     BugSubscriptionFilterStatus.filter_id ==
+                     BugSubscriptionFilter.id),
+            LeftJoin(BugSubscriptionFilterImportance,
+                     BugSubscriptionFilterImportance.filter_id ==
+                     BugSubscriptionFilter.id),
+            LeftJoin(BugSubscriptionFilterTag,
+                     BugSubscriptionFilterTag.filter_id ==
+                     BugSubscriptionFilter.id)]}
+    # The "conditions" list will eventually be passed to a Storm
+    # "And" function, and then become the WHERE clause of our SELECT.
+    conditions = [In(StructuralSubscription.id, candidates)]
+    # Handling notification level is trivial, so we include that first.
+    if level is not None:
+        conditions.append(
+            BugSubscriptionFilter.bug_notification_level >= level)
+    # Now we handle importance and status, which are per bugtask.
+    # What we do is loop through the collection of bugtask, target
+    # in query_arguments.  Each bugtask will have one or more
+    # targets that we have to check.  We figure out how to describe each
+    # target using the useful IStructuralSubscriptionTargetHelper
+    # adapter, which has a "join" attribute on it that tells us how
+    # to distinguish that target.  Once we have all of the target
+    # descriptins, we OR those together, and say that the filters
+    # for those targets must either have no importance or match the
+    # associated bugtask's importance; and have no status or match
+    # the bugtask's status.  Once we have looked at all of the
+    # bugtasks, we OR all of those per-bugtask target comparisons
+    # together, and we are done with the status and importance.
+    # The "outer_or_conditions" list holds the full clauses for each
+    # bugtask.
+    outer_or_conditions = []
+    # The "or_target_conditions" list holds the clauses for each target,
+    # and is reset for each new bugtask.
+    or_target_conditions = []
+
+    def handle_bugtask_conditions(bugtask):
+        """Helper function for building status and importance clauses.
+
+        Call with the previous bugtask when the bugtask changes in
+        the iteration of query_arguments, and call with the last
+        bugtask when the iteration is complete."""
+        if or_target_conditions:
+            outer_or_conditions.append(
+                And(Or(*or_target_conditions),
+                    Or(BugSubscriptionFilterImportance.importance ==
+                       bugtask.importance,
+                       BugSubscriptionFilterImportance.importance == None),
+                    Or(BugSubscriptionFilterStatus.status == bugtask.status,
+                       BugSubscriptionFilterStatus.status == None)))
+            del or_target_conditions[:]
+    last_bugtask = None
+    for bugtask, target in query_arguments:
+        if last_bugtask is not bugtask:
+            handle_bugtask_conditions(last_bugtask)
+        last_bugtask = bugtask
+        or_target_conditions.append(
+            IStructuralSubscriptionTargetHelper(target).join)
+    # We know there will be at least one bugtask, because we already
+    # escaped early "if len(query_arguments) == 0".
+    handle_bugtask_conditions(bugtask)
+    conditions.append(Or(*outer_or_conditions))
+    # Now we handle tags.  If the bug has no tags, this is
+    # relatively easy. Otherwise, not so much.
+    tags = list(bug.tags) # This subtly removes the security proxy on
+    # the list.  Strings are never security-proxied, so we don't have
+    # to worry about them.
+    if len(tags) == 0:
+        # The bug has no tags.  We should leave out filters that
+        # require any generic non-empty set of tags
+        # (BugSubscriptionFilter.include_any_tags), which we do with
+        # the conditions.  Then we can finish up the WHERE clause.
+        # Then we have to make sure that the filter does not require
+        # any *specific* tags. We do that with a GROUP BY on the
+        # filters, and then a HAVING clause that aggregates the
+        # BugSubscriptionFilterTags that are set to "include" the
+        # tag.  (If it is not an include, that is an exclude, and a
+        # bug without tags will not have a particular tag, so we can
+        # ignore those in this case.)  This requires a CASE
+        # statement within the COUNT.  After this, we are done, and
+        # we return the fully formed SELECT query object.
+        conditions.append(Not(BugSubscriptionFilter.include_any_tags))
+        select_args['where'] = And(*conditions)
+        select_args['group_by'] = (BugSubscriptionFilter.id,)
+        select_args['having'] = Count(
+            SQL('CASE WHEN BugSubscriptionFilterTag.include '
+                'THEN BugSubscriptionFilterTag.tag END'))==0
+        return candidates, Select(BugSubscriptionFilter.id, **select_args)
+    else:
+        # The bug has some tags.  This will require a bit of fancy
+        # footwork. First, though, we will simply want to leave out
+        # filters that should only match bugs without tags.
+        conditions.append(Not(BugSubscriptionFilter.exclude_any_tags))
+        # We're going to have to do a union with another query.  One
+        # query will handle filters that are marked to include *any*
+        # of the filter's selected tags, and the other query will
+        # handle filters that include *all* of the filter's selected
+        # tags (as determined by BugSubscriptionFilter.find_all_tags).
+        # Every aspect of the unioned queries' WHERE clauses *other
+        # than tags* will need to be the same. We could try making a
+        # temporary table for the shared results, but that would
+        # involve another separate Postgres call, and I think that
+        # we've already gotten the big win by separating out the
+        # structural subscriptions into "candidates," above.
+        #
+        # So, up to now we've been assembling the things that are shared
+        # between the two queries, but now we start working on the
+        # differences between the two unioned queries. "first_select"
+        # will hold one set of arguments, and select_args will hold the
+        # other.
+        first_select = select_args.copy()
+        # As mentioned, in this first SELECT we handle filters that
+        # match any of the filter's tags.  This can be a relatively
+        # straightforward query--we just need a bit more added to
+        # our WHERE clause, and we don't need a GROUP BY/HAVING.
+        first_select['where'] = And(
+            Or(# We want filters that proclaim they simply want any tags.
+               BugSubscriptionFilter.include_any_tags,
+               # Also include filters that match any tag...
+               And(Not(BugSubscriptionFilter.find_all_tags),
+                   Or(# ...with a positive match...
+                      And(BugSubscriptionFilterTag.include,
+                          In(BugSubscriptionFilterTag.tag, tags)),
+                      # ...or with a negative match...
+                      And(Not(BugSubscriptionFilterTag.include),
+                          Not(In(BugSubscriptionFilterTag.tag, tags))),
+                      # ...or if the filter does not specify any tags.
+                      BugSubscriptionFilterTag.tag == None))),
+            *conditions)
+        first_select = Select(BugSubscriptionFilter.id, **first_select)
+        # We have our first clause.  Now we start on the second one:
+        # handling filters that match *all* tags. Our WHERE clause
+        # is straightforward and, it should be clear that we are
+        # simply focusing on BugSubscriptionFilter.find_all_tags,
+        # when the first SELECT did not consider it.
+        select_args['where'] = And(
+            BugSubscriptionFilter.find_all_tags, *conditions)
+        # The GROUP BY collects the filters together.
+        select_args['group_by'] = (BugSubscriptionFilter.id,)
+        # Now it is time for the HAVING clause, which is where some
+        # tricky bits happen. We first make a SQL snippet that
+        # represents the tags on this bug.  It is straightforward
+        # except for one subtle hack: the addition of the empty
+        # space in the array.  This is because we are going to be
+        # aggregating the tags on the filters using ARRAY_AGG, which
+        # includes NULLs (unlike most other aggregators).  That
+        # is an issue here because we use CASE statements to divide
+        # up the set of tags that are supposed to be included and
+        # supposed to be excluded.  This means that if we aggregate
+        # "CASE WHEN BugSubscriptionFilterTag.include THEN
+        # BugSubscriptionFilterTag.tag END" then that array will
+        # include NULL.  SQL treats NULLs as unknowns that can never
+        # be matched, so the array of ['foo', 'bar', NULL] does not
+        # contain the array of ['foo', NULL] ("SELECT
+        # ARRAY['foo','bar',NULL]::TEXT[] @>
+        # ARRAY['foo',NULL]::TEXT[];" is false).  Therefore, so we
+        # can make the HAVING statement we want to make without
+        # defining a custom Postgres aggregator, we use a single
+        # space as, effectively, NULL.  This is safe because a
+        # single space is not an acceptable tag.  Again, the
+        # clearest alternative is defining a custom Postgres aggregator.
+        tags_array = "ARRAY[%s,' ']::TEXT[]" % ",".join(
+            quote(tag) for tag in tags)
+        # Now comes the HAVING clause itself.
+        select_args['having'] = And(
+            # The list of tags should be a superset of the filter tags to
+            # be included.
+            ArrayContains(
+                SQL(tags_array),
+                # This next line gives us an array of the tags that the
+                # filter wants to include.  Notice that it includes the
+                # empty string when the condition does not match, per the
+                # discussion above.
+                ArrayAgg(
+                   SQL("CASE WHEN BugSubscriptionFilterTag.include "
+                       "THEN BugSubscriptionFilterTag.tag "
+                       "ELSE ' '::TEXT END"))),
+            # The list of tags should also not intersect with the
+            # tags that the filter wants to exclude.
+            Not(
+                ArrayIntersects(
+                    SQL(tags_array),
+                    # This next line gives us an array of the tags
+                    # that the filter wants to exclude.  We do not bother
+                    # with the empty string, and therefore allow NULLs
+                    # into the array, because in this case we are
+                    # determining whether the sets intersect, not if the
+                    # first set subsumes the second.
+                    ArrayAgg(
+                       SQL('CASE WHEN '
+                           'NOT BugSubscriptionFilterTag.include '
+                           'THEN BugSubscriptionFilterTag.tag END')))))
+        # Everything is ready.  Make our second SELECT statement, UNION
+        # it, and return it.
+        return candidates, Union(
+            first_select,
+            Select(
+                BugSubscriptionFilter.id,
+                **select_args))

=== modified file 'lib/lp/bugs/model/tests/test_bugtask.py'
--- lib/lp/bugs/model/tests/test_bugtask.py	2011-02-22 17:11:10 +0000
+++ lib/lp/bugs/model/tests/test_bugtask.py	2011-03-03 21:04:20 +0000
@@ -8,7 +8,10 @@
 import unittest
 
 from lazr.lifecycle.snapshot import Snapshot
-from storm.store import ResultSet
+from storm.store import (
+    EmptyResultSet,
+    ResultSet,
+    )
 from testtools.matchers import (
     Equals,
     StartsWith,
@@ -1429,7 +1432,6 @@
     def test_no_subscriptions(self):
         subscriptions = self.getAllStructuralSubscriptions(
             self.bug.bugtasks, self.subscriber)
-        self.assertIsInstance(subscriptions, ResultSet)
         self.assertEqual([], list(subscriptions))
 
     def test_one_subscription(self):
@@ -1505,7 +1507,7 @@
         # subscribers will be returned by getStructuralSubscribers().
         product, bug = self.make_product_with_bug()
         subscribers = self.getStructuralSubscribers(bug.bugtasks)
-        self.assertIsInstance(subscribers, ResultSet)
+        self.assertIsInstance(subscribers, (ResultSet, EmptyResultSet))
         self.assertEqual([], list(subscribers))
 
     def test_getStructuralSubscribers_single_target(self):

=== modified file 'lib/lp/bugs/tests/test_structuralsubscriptiontarget.py'
--- lib/lp/bugs/tests/test_structuralsubscriptiontarget.py	2011-02-18 11:59:44 +0000
+++ lib/lp/bugs/tests/test_structuralsubscriptiontarget.py	2011-03-03 21:04:20 +0000
@@ -331,8 +331,21 @@
         # Without either tag the subscription is found.
         self.assertSubscriptions([self.subscription])
 
-        # With either tag the subscription is no longer found.
+        # With both tags, the subscription is omitted.
+        self.bug.tags = ["foo", "bar"]
+        self.assertSubscriptions([])
+
+        # With only one tag, the subscription is found again.
         self.bug.tags = ["foo"]
+        self.assertSubscriptions([self.subscription])
+
+        # However, if find_all_tags is True, even a single excluded tag
+        # causes the subscription to be skipped.
+        self.initial_filter.find_all_tags = True
+        self.assertSubscriptions([])
+
+        # This is also true, of course, if the bug has both tags.
+        self.bug.tags = ["foo", "bar"]
         self.assertSubscriptions([])
 
     def test_getSubscriptionsForBugTask_with_filter_for_not_all_tags(self):
@@ -347,11 +360,13 @@
         # Without either tag the subscription is found.
         self.assertSubscriptions([self.subscription])
 
-        # With only one of the excluded tags the subscription is found.
+        # With only one of the excluded tags the subscription is not
+        # found--we are saying that we want to find both an absence of foo
+        # and an absence of bar, and yet foo exists.
         self.bug.tags = ["foo"]
-        self.assertSubscriptions([self.subscription])
+        self.assertSubscriptions([])
 
-        # With both tags the subscription is no longer found.
+        # With both tags the subscription is also not found.
         self.bug.tags = ["foo", "bar"]
         self.assertSubscriptions([])
 
@@ -361,6 +376,7 @@
 
         # Add the "foo" tag to the bug.
         self.bug.tags = ["foo"]
+        self.assertSubscriptions([self.subscription])
 
         # Filter the subscription to bugs in the CRITICAL state.
         self.initial_filter.statuses = [BugTaskStatus.CONFIRMED]
@@ -553,7 +569,8 @@
         return self.factory.makeMilestone()
 
     def makeBugTask(self):
-        return self.factory.makeBugTask(target=self.target.series_target)
+        bug = self.factory.makeBug(milestone=self.target)
+        return bug.bugtasks[0]
 
 
 class TestStructuralSubscriptionForDistroSeries(


Follow ups