← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~gary/launchpad/bug741684 into lp:launchpad

 

Gary Poster has proposed merging lp:~gary/launchpad/bug741684 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #741684 in Launchpad itself: "bug notification script needs to be optimized"
  https://bugs.launchpad.net/launchpad/+bug/741684
  Bug #742230 in Launchpad itself: "Too much repeated SQL in send-bug-notifications job"
  https://bugs.launchpad.net/launchpad/+bug/742230

For more details, see:
https://code.launchpad.net/~gary/launchpad/bug741684/+merge/55656

This branch makes the SQL optimizations I could find for the bug notification script, with one or two exceptions.  The first exception is that I felt we could do better with how we are finding the transitive emails for teams.  I make that change in another branch, lp:~gary/launchpad/bug741684-2.  The other possible exception is that Robert and Stuart seem to disagree on what kind of Storm cache we have in place.  In order for these optimizations to be most effective, we may need to prevent Storm cache exhaustion in an additional subsequent change of some sort.

Now I'll describe what actually *is* in this branch.

This branch focuses on two changes to the asynchronous sending of bug notifications, both of which were deployed at the last downtime deployment.

[Feature 1: quickly undone actions should not send email notifications]
  The first was to fix bug 164196: quickly undone actions should not send email notifications.  In order to determine if an action is undone, the new code connected our notification records with our log records for a given change.  For each collection of notifications, each notification's activity was consulted to determine what was changed, and if the end state was the same as the beginning state.  This change meant that the script now looks at BugActivity records, and it did not before.

[Feature 2: emails can include notes on what structural subscription filter(s) caused the email to be sent to a recipient, to aid in organizing and highlighting email]
  I initially thought that this was the primary change in how we sent our emails.  However, perhaps the bigger one from the perspective of database usage was that we looked for structural subscription filters for every notification to see if we need to tell the recipient about what filter caused them to receive the mail.  This caused significant database churn in a much tighter loop.

I divide the changes below into sections that first list the file or files that changed, and then discuss the changes.

= Get filters for multiple bug notifications at once =

 * lib/lp/bugs/interfaces/bugnotification.py
 * lib/lp/bugs/model/bugnotification.py
 * lib/lp/bugs/scripts/tests/test_bugnotification.py
 * lib/lp/bugs/tests/test_bugnotification.py

Rather than getting the filters for a particular bug notification, the code now gets the filters for a set of bug notifications when possible.  This reduces the number of queries within the tightest loop.

= Cache Bug.initial_message =

 * The code was getting every bug's initial_message twice, and each time was firing a SQL query.  I tried cacheing it.  If that causes problems, I'll get the value once in the bugnotification script code instead.

= Pre-load database values for BugNotificationSet.getNotificationsToSend =

* lib/lp/bugs/model/bugnotification.py

This change loads as much as possible of what we will need later into the Storm cache.  This is therefore the change that may need tweaks of our Storm cache, as discussed above.  It is also a revised version of the code that was attempted earlier as a cowboy, to no great improvement.  This version may be better.

We load the direct (and smaller) dependencies immediately, simultaneously as we load the notifications; then, after we have gotten the notifications and determined what will be used, we preload the bugs and the people (using the IPersonSet method that also loads the email information that we will need later).

= Only get the filters for the "top level" recipients, not for the transitively exploded full recipient list =

 * lib/lp/bugs/scripts/bugnotification.py

I am hopeful that this will be the most important optimization.  It completely eliminates the tightest loop whenever a team that does not have a preferred email is set.  It then also takes advantage of the change described above in the section titled "Get filters for multiple bug notifications at once."  This change should be transparent, while generating a lot less work in certain cases.

I investigated many other potential optimizations, but these were the ones I found that kept the desired behavior and actually reduced the SQL when I attempted them.

Thank you for the review.
-- 
https://code.launchpad.net/~gary/launchpad/bug741684/+merge/55656
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~gary/launchpad/bug741684 into lp:launchpad.
=== modified file 'lib/lp/bugs/interfaces/bugnotification.py'
--- lib/lp/bugs/interfaces/bugnotification.py	2011-02-25 16:19:58 +0000
+++ lib/lp/bugs/interfaces/bugnotification.py	2011-03-31 00:47:42 +0000
@@ -67,9 +67,6 @@
     bug_filters = Attribute(
         "List of bug filters that caused this notification.")
 
-    def getFiltersByRecipient(person):
-        """Return filters for a particular recipient."""
-
 
 class IBugNotificationSet(Interface):
     """The set of bug notifications."""
@@ -84,6 +81,9 @@
         `BugNotificationRecipient` objects.
         """
 
+    def getFiltersByRecipient(notifications, person):
+        """Return filters for a particular recipient."""
+
 
 class IBugNotificationRecipient(Interface):
     """A recipient of a bug notification."""

=== modified file 'lib/lp/bugs/model/bug.py'
--- lib/lp/bugs/model/bug.py	2011-03-29 00:11:57 +0000
+++ lib/lp/bugs/model/bug.py	2011-03-31 00:47:42 +0000
@@ -703,7 +703,7 @@
             days_old, getUtility(ILaunchpadCelebrities).janitor, bug=self)
         return bugtasks.count() > 0
 
-    @property
+    @cachedproperty
     def initial_message(self):
         """See `IBug`."""
         store = Store.of(self)

=== modified file 'lib/lp/bugs/model/bugnotification.py'
--- lib/lp/bugs/model/bugnotification.py	2011-02-25 16:19:58 +0000
+++ lib/lp/bugs/model/bugnotification.py	2011-03-31 00:47:42 +0000
@@ -24,16 +24,23 @@
     ForeignKey,
     StringCol,
     )
+from storm.expr import (
+    In,
+    Join,
+    LeftJoin,
+    )
 from storm.store import Store
 from storm.locals import (
     Int,
     Reference,
     )
+from zope.component import getUtility
 from zope.interface import implements
 
 from canonical.config import config
 from canonical.database.datetimecol import UtcDateTimeCol
 from canonical.database.enumcol import EnumCol
+from canonical.launchpad.database.message import Message
 from canonical.database.sqlbase import (
     SQLBase,
     sqlvalues,
@@ -46,8 +53,11 @@
     IBugNotificationRecipient,
     IBugNotificationSet,
     )
+from lp.bugs.model.bugactivity import BugActivity
 from lp.bugs.model.bugsubscriptionfilter import BugSubscriptionFilter
 from lp.bugs.model.structuralsubscription import StructuralSubscription
+from lp.registry.interfaces.person import IPersonSet
+from lp.registry.model.person import Person
 from lp.registry.model.teammembership import TeamParticipation
 from lp.services.database.stormbase import StormBase
 
@@ -82,18 +92,6 @@
              BugNotificationFilter.bug_subscription_filter_id),
             BugNotificationFilter.bug_notification == self)
 
-    def getFiltersByRecipient(self, recipient):
-        """See `IBugNotification`."""
-        return IStore(BugSubscriptionFilter).find(
-            BugSubscriptionFilter,
-            (BugSubscriptionFilter.id ==
-             BugNotificationFilter.bug_subscription_filter_id),
-            BugNotificationFilter.bug_notification == self,
-            (BugSubscriptionFilter.structural_subscription_id ==
-             StructuralSubscription.id),
-            TeamParticipation.personID == recipient.id,
-            TeamParticipation.teamID == StructuralSubscription.subscriberID)
-
 
 class BugNotificationSet:
     """A set of bug notifications."""
@@ -101,36 +99,57 @@
 
     def getNotificationsToSend(self):
         """See IBugNotificationSet."""
-        notifications = BugNotification.select(
-            """date_emailed IS NULL""", orderBy=['bug', '-id'])
-        pending_notifications = list(notifications)
-        omitted_notifications = []
+        # We preload the bug activity and the message in order to
+        # try to reduce subsequent database calls: try to get direct
+        # dependencies at once.  We then also pre-load the pertinent bugs,
+        # people (with their own dependencies), and message chunks before
+        # returning the notifications that should be processed.
+        # Sidestep circular reference.
+        from lp.bugs.model.bug import Bug
+        store = IStore(BugNotification)
+        source = store.using(BugNotification,
+                             Join(Message,
+                                  BugNotification.message==Message.id),
+                             LeftJoin(
+                                BugActivity,
+                                BugNotification.activity==BugActivity.id))
+        results = list(source.find(
+            (BugNotification, BugActivity, Message),
+            BugNotification.date_emailed == None).order_by(
+            'BugNotification.bug', '-BugNotification.id'))
         interval = timedelta(
             minutes=int(config.malone.bugnotification_interval))
         time_limit = (
             datetime.now(pytz.timezone('UTC')) - interval)
-
         last_omitted_notification = None
-        for notification in pending_notifications:
+        pending_notifications = []
+        people_ids = set()
+        bug_ids = set()
+        for notification, ignore, ignore in results:
             if notification.message.datecreated > time_limit:
-                omitted_notifications.append(notification)
-                last_omitted_notification = notification
-            elif last_omitted_notification is not None:
-                if (notification.message.owner ==
-                       last_omitted_notification.message.owner and
-                    notification.bug == last_omitted_notification.bug and
-                    last_omitted_notification.message.datecreated -
-                    notification.message.datecreated < interval):
-                    omitted_notifications.append(notification)
-                    last_omitted_notification = notification
+                last_omitted_notification = notification
+            elif (last_omitted_notification is not None and
+                notification.message.ownerID ==
+                   last_omitted_notification.message.ownerID and
+                notification.bugID == last_omitted_notification.bugID and
+                last_omitted_notification.message.datecreated -
+                notification.message.datecreated < interval):
+                last_omitted_notification = notification
             if last_omitted_notification != notification:
                 last_omitted_notification = None
-
-        pending_notifications = [
-            notification
-            for notification in pending_notifications
-            if notification not in omitted_notifications
-            ]
+                pending_notifications.append(notification)
+                people_ids.add(notification.message.ownerID)
+                bug_ids.add(notification.bugID)
+        # Now we do some calls that are purely for cacheing.
+        # Converting these into lists forces the queries to execute.
+        if pending_notifications:
+            cached_people = list(
+                getUtility(IPersonSet).getPrecachedPersonsFromIDs(
+                    list(people_ids),
+                    need_validity=True,
+                    need_preferred_email=True))
+            cached_bugs = list(
+                IStore(Bug).find(Bug, In(Bug.id, list(bug_ids))))
         pending_notifications.reverse()
         return pending_notifications
 
@@ -158,6 +177,26 @@
             VALUES %s;""" % ', '.join(sql_values))
         return bug_notification
 
+    def getFiltersByRecipient(self, notifications, recipient):
+        """See `IBugNotificationSet`."""
+        store = IStore(BugSubscriptionFilter)
+        source = store.using(
+            BugSubscriptionFilter,
+            Join(BugNotificationFilter,
+                 BugSubscriptionFilter.id ==
+                    BugNotificationFilter.bug_subscription_filter_id),
+            Join(StructuralSubscription,
+                 BugSubscriptionFilter.structural_subscription_id ==
+                    StructuralSubscription.id),
+            Join(TeamParticipation,
+                 TeamParticipation.teamID ==
+                    StructuralSubscription.subscriberID))
+        return source.find(
+            BugSubscriptionFilter,
+            In(BugNotificationFilter.bug_notification_id,
+               [notification.id for notification in notifications]),
+            TeamParticipation.personID == recipient.id)
+
 
 class BugNotificationRecipient(SQLBase):
     """A recipient of a bug notification."""

=== modified file 'lib/lp/bugs/scripts/bugnotification.py'
--- lib/lp/bugs/scripts/bugnotification.py	2011-03-23 15:55:44 +0000
+++ lib/lp/bugs/scripts/bugnotification.py	2011-03-31 00:47:42 +0000
@@ -16,6 +16,7 @@
 from operator import itemgetter
 
 import transaction
+from zope.component import getUtility
 
 from canonical.launchpad.helpers import (
     emailPeople,
@@ -28,6 +29,7 @@
     get_bugmail_from_address,
     )
 from lp.bugs.mail.newbug import generate_bug_add_email
+from lp.bugs.interfaces.bugnotification import IBugNotificationSet
 from lp.services.mail.mailwrapper import MailWrapper
 
 
@@ -107,14 +109,20 @@
             filtered_notifications.append(notification)
             for recipient in notification.recipients:
                 email_people = emailPeople(recipient.person)
-                if (not actor.selfgenerated_bugnotifications and
-                    actor in email_people):
-                    email_people.remove(actor)
                 for email_person in email_people:
-                    recipients[email_person] = recipient
+                    recipients_for_person = recipients.get(email_person)
+                    if recipients_for_person is None:
+                        recipients_for_person = []
+                        recipients[email_person] = recipients_for_person
+                    recipients_for_person.append(recipient)
         else:
             omitted_notifications.append(notification)
 
+    # If the actor does not want self-generated bug notifications, remove the
+    # actor now.
+    if not actor.selfgenerated_bugnotifications:
+        recipients.pop(actor, None)
+
     if bug.duplicateof is not None:
         text_notifications.append(
             '*** This bug is a duplicate of bug %d ***\n    %s' %
@@ -158,18 +166,38 @@
     bug_notification_builder = BugNotificationBuilder(bug, actor)
     sorted_recipients = sorted(
         recipients.items(), key=lambda t: t[0].preferredemail.email)
-    for email_person, recipient in sorted_recipients:
+    cached_filters = {}
+
+    bug_notification_set = getUtility(IBugNotificationSet)
+    def get_filter_descriptions(recipients):
+        "Given a list of recipients, return the filter descriptions for them."
+        descriptions = set()
+        people_seen = set()
+        for recipient in recipients:
+            person = recipient.person
+            if person in people_seen:
+                continue
+            people_seen.add(person)
+            recipient_filters = cached_filters.get(person)
+            if recipient_filters is None:
+                recipient_filters = set(
+                    filter.description for filter
+                    in bug_notification_set.getFiltersByRecipient(
+                        filtered_notifications, person)
+                    if filter.description)
+                cached_filters[recipient] = recipient_filters
+            descriptions.update(recipient_filters)
+        return descriptions
+
+    for email_person, recipients in sorted_recipients:
         address = str(email_person.preferredemail.email)
-        reason = recipient.reason_body
-        rationale = recipient.reason_header
+        # Choosing the first recipient is a bit arbitrary, but it
+        # is simple for the user to understand.  We may want to reconsider
+        # this in the future.
+        reason = recipients[0].reason_body
+        rationale = recipients[0].reason_header
 
-        filters = set()
-        for notification in filtered_notifications:
-            notification_filters = notification.getFiltersByRecipient(
-                email_person)
-            for notification_filter in notification_filters:
-                if notification_filter.description is not None:
-                    filters.add(notification_filter.description)
+        filters = get_filter_descriptions(recipients)
         if filters:
             # There are some filters as well, add it to the email body.
             filters_text = u"\nMatching subscriptions: %s" % ", ".join(

=== modified file 'lib/lp/bugs/scripts/tests/test_bugnotification.py'
--- lib/lp/bugs/scripts/tests/test_bugnotification.py	2011-03-23 15:55:44 +0000
+++ lib/lp/bugs/scripts/tests/test_bugnotification.py	2011-03-31 00:47:42 +0000
@@ -8,9 +8,10 @@
 import unittest
 
 import pytz
+from storm.store import Store
 from testtools.matchers import Not
 from transaction import commit
-from zope.component import getUtility
+from zope.component import getUtility, getSiteManager
 from zope.interface import implements
 
 from canonical.config import config
@@ -43,8 +44,14 @@
 from lp.bugs.interfaces.bugnotification import IBugNotificationSet
 from lp.bugs.interfaces.bugtask import BugTaskStatus
 from lp.bugs.mail.bugnotificationrecipients import BugNotificationRecipients
+from lp.bugs.model.bugnotification import (
+    BugNotification,
+    BugNotificationFilter,
+#    BugNotificationRecipient,
+    )
 from lp.bugs.model.bugtask import BugTask
 from lp.bugs.scripts.bugnotification import (
+    construct_email_notifications,
     get_email_notifications,
     get_activity_key,
     notification_batches,
@@ -136,9 +143,6 @@
         self.recipients = [MockBugNotificationRecipient()]
         self.activity = None
 
-    def getFiltersByRecipient(self, recipient):
-        return []
-
 
 class FakeNotification:
     """An even simpler fake notification.
@@ -157,6 +161,15 @@
         self.activity = None
 
 
+class FakeBugNotificationSetUtility:
+    """A notification utility used for testing."""
+
+    implements(IBugNotificationSet)
+
+    def getFiltersByRecipient(self, notifications, recipient):
+        return []
+
+
 class MockBugActivity:
     """A mock BugActivity used for testing."""
 
@@ -244,6 +257,18 @@
         # will abort the current transaction.
         commit()
 
+        sm = getSiteManager()
+        self._original_utility = sm.getUtility(IBugNotificationSet)
+        sm.unregisterUtility(self._original_utility)
+        self._fake_utility = FakeBugNotificationSetUtility()
+        sm.registerUtility(self._fake_utility)
+
+    def tearDown(self):
+        sm = getSiteManager()
+        sm.unregisterUtility(self._fake_utility)
+        sm.registerUtility(self._original_utility)
+        
+
     def _getAndCheckSentNotifications(self, notifications_to_send):
         """Return the notifications that were successfully sent.
 
@@ -876,14 +901,6 @@
                     self.ten_minutes_ago, self.person, 'attachment',
                     item, None))
 
-from lp.bugs.model.bugnotification import (
-    BugNotification,
-    BugNotificationFilter,
-#    BugNotificationRecipient,
-    )
-from lp.bugs.scripts.bugnotification import construct_email_notifications
-from storm.store import Store
-
 
 class TestEmailNotificationsWithFilters(TestCaseWithFactory):
     """Ensure outgoing mails have corresponding mail headers.

=== modified file 'lib/lp/bugs/tests/test_bugnotification.py'
--- lib/lp/bugs/tests/test_bugnotification.py	2011-02-24 18:24:10 +0000
+++ lib/lp/bugs/tests/test_bugnotification.py	2011-03-31 00:47:42 +0000
@@ -191,7 +191,8 @@
         # with no entries.
         subscriber = self.factory.makePerson()
         self.assertTrue(
-            self.notification.getFiltersByRecipient(subscriber).is_empty())
+            BugNotificationSet().getFiltersByRecipient(
+                [self.notification], subscriber).is_empty())
 
     def test_getFiltersByRecipient_other_persons(self):
         # When there is no bug filter for the recipient,
@@ -205,7 +206,8 @@
             bug_notification=self.notification,
             bug_subscription_filter=bug_filter)
         self.assertTrue(
-            self.notification.getFiltersByRecipient(recipient).is_empty())
+            BugNotificationSet().getFiltersByRecipient(
+                [self.notification], recipient).is_empty())
 
     def test_getFiltersByRecipient_match(self):
         # When there are bug filters for the recipient,
@@ -219,7 +221,37 @@
             bug_subscription_filter=bug_filter)
         self.assertContentEqual(
             [bug_filter],
-            self.notification.getFiltersByRecipient(subscriber))
+            BugNotificationSet().getFiltersByRecipient(
+                [self.notification], subscriber))
+
+    def test_getFiltersByRecipients_multiple_notifications_match(self):
+        # When there are bug filters for the recipient for multiple
+        # notifications, return filters for all the notifications.
+        # Set up first notification and filter.
+        subscriber = self.factory.makePerson()
+        subscription = self.bug.default_bugtask.target.addSubscription(
+            subscriber, subscriber)
+        bug_filter = subscription.newBugFilter()
+        BugNotificationFilter(
+            bug_notification=self.notification,
+            bug_subscription_filter=bug_filter)
+        # set up second notification and filter.
+        bug2 = self.factory.makeBug()
+        message2 = self.factory.makeMessage()
+        notification2 = BugNotification(
+            message=message2, activity=None, bug=bug2, is_comment=False,
+            date_emailed=None)
+        subscription2 = bug2.default_bugtask.target.addSubscription(
+            subscriber, subscriber)
+        bug_filter2 = subscription2.newBugFilter()
+        BugNotificationFilter(
+            bug_notification=notification2,
+            bug_subscription_filter=bug_filter2)
+        # Perform the test.
+        self.assertContentEqual(
+            set([bug_filter, bug_filter2]),
+            set(BugNotificationSet().getFiltersByRecipient(
+                [self.notification, notification2], subscriber)))
 
 
 class TestNotificationProcessingWithoutRecipients(TestCaseWithFactory):