← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~adeuring/launchpad/bug-769293 into lp:launchpad

 

Abel Deuring has proposed merging lp:~adeuring/launchpad/bug-769293 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #769293 in Launchpad itself: "Timeout trying to set bunch of dupes"
  https://bugs.launchpad.net/launchpad/+bug/769293

For more details, see:
https://code.launchpad.net/~adeuring/launchpad/bug-769293/+merge/59496

This branch fixes bug 769293:  Timeout trying to set bunch of dupes.

A glance at the OOPS reports OOPS-1939DX14 and OOPS-1939BA20 shows that
a relatively slow query (~100 msec) is executed quite often (29 times
in OOPS-1939BA20, 26 times in 1939DX14):

    SELECT MAX(Bug.heat), SUM(Bug.heat), COUNT(Bug.id)
    FROM Bug,
         BugTask
    WHERE BugTask.bug = Bug.id
      AND BugTask.distribution = $INT
      AND BugTask.sourcepackagename = $INT
      AND Bug.duplicateof IS NULL
      AND BugTask.status IN ($INT ... $INT)

This query is issued in DistributionSourcePackage.recalculateBugHeatCache(),
which is called by Bug.setHeat() and by Bug.updateHeat().
These methods are in turn called by Bug.markAsDuplicate():
setHeat() for the duplicate and updateHeat() for the master bug.

If we assume that the duplicate bug and the master bug have the
same target (which should be the most common case), this results
in two calls of recalculateBugHeatCache() for one target.

If a bug being marked as a duplicates has itself duplicates,
markAsDuplicate(new_master_bug) is called for them too, adding
two more calls of recalculateBugHeatCache().

These repeated calls are pointless, so I added an optional
parameter affected_targets to Bug.setHeat() and Bug.updateHeat(),
which, if specified, should be a set. If it is given, the methods
do not call recalculateBugHeatCache() but add the target of their
bugtasks to affected_targets.

I renamed Bug.markAsDuplicate() to _markAsDuplicate(); the method
now calls setHeat() and updateHeat() with the paremeter
affected_targets, and it returns the set of targets.

Targets found in recursive calls are collected too.

Finally, The new method Bug.markAsDuplicate() calls
_markAsDuplicate() and then calls recalculateBugHeatCache()
for all targets returned by _markAsDuplicate().

As a "drive-by optimisation", I also moved the call of
duplicate_of.updateHeat() from _markAsDuplicate() to
markAsDuplicate(): Calling this method repeatedly
in the recursive calls for duplicates of the bug that is
being marked as a dupe is also unnecessary. This removes
another 100 msec of "SQL time" as seen in OOPS-1939BA20.

tests:

for the change itself:

./bin/test bugs -vvt test_duplicate_handling

that the "bug heat stuff" still works:

./bin/test bugs -vvt doc.bug-heat.txt
./bin/test bugs -vvt test_bugheat

no lint.

-- 
https://code.launchpad.net/~adeuring/launchpad/bug-769293/+merge/59496
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~adeuring/launchpad/bug-769293 into lp:launchpad.
=== modified file 'lib/lp/bugs/interfaces/bug.py'
--- lib/lp/bugs/interfaces/bug.py	2011-04-13 20:55:31 +0000
+++ lib/lp/bugs/interfaces/bug.py	2011-04-29 11:59:25 +0000
@@ -922,10 +922,10 @@
         if the user is the owner or an admin.
         """
 
-    def setHeat(heat, timestamp=None):
+    def setHeat(heat, timestamp=None, affected_targets=None):
         """Set the heat for the bug."""
 
-    def updateHeat():
+    def updateHeat(affected_targets=None):
         """Update the heat for the bug."""
 
     @operation_parameters(

=== modified file 'lib/lp/bugs/model/bug.py'
--- lib/lp/bugs/model/bug.py	2011-04-26 11:58:26 +0000
+++ lib/lp/bugs/model/bug.py	2011-04-29 11:59:25 +0000
@@ -1761,8 +1761,18 @@
 
         self.updateHeat()
 
-    def markAsDuplicate(self, duplicate_of):
-        """See `IBug`."""
+    def _markAsDuplicate(self, duplicate_of):
+        """Mark this bug as a duplicate of another.
+
+        Marking a bug as a duplicate requires a recalculation
+        of the heat of this bug and of the master bug, and it
+        requires a recalulation of the heat cache of the
+        affected bug targets. None of this is done here in order
+        to avoid unnecessary repetitions in recursive calls
+        for duplicates of this bug, which also become duplicates
+        of the new master bug.
+        """
+        affected_targets = set()
         field = DuplicateBug()
         field.context = self
         current_duplicateof = self.duplicateof
@@ -1776,7 +1786,8 @@
                     # event.
                     dupe_before = Snapshot(
                         duplicate, providing=providedBy(duplicate))
-                    duplicate.markAsDuplicate(duplicate_of)
+                    affected_targets.update(
+                        duplicate._markAsDuplicate(duplicate_of))
                     notify(ObjectModifiedEvent(
                             duplicate, dupe_before, 'duplicateof'))
             self.duplicateof = duplicate_of
@@ -1787,14 +1798,22 @@
             # Update the heat of the master bug and set this bug's heat
             # to 0 (since it's a duplicate, it shouldn't have any heat
             # at all).
-            self.setHeat(0)
-            duplicate_of.updateHeat()
+            self.setHeat(0, affected_targets=affected_targets)
         else:
             # Otherwise, recalculate this bug's heat, since it will be 0
             # from having been a duplicate. We also update the bug that
             # was previously duplicated.
-            self.updateHeat()
-            current_duplicateof.updateHeat()
+            self.updateHeat(affected_targets)
+            current_duplicateof.updateHeat(affected_targets)
+        return affected_targets
+
+    def markAsDuplicate(self, duplicate_of):
+        """See `IBug`."""
+        affected_targets = self._markAsDuplicate(duplicate_of)
+        if duplicate_of is not None:
+            duplicate_of.updateHeat(affected_targets)
+        for target in affected_targets:
+            target.recalculateBugHeatCache()
 
     def setCommentVisibility(self, user, comment_number, visible):
         """See `IBug`."""
@@ -1920,7 +1939,8 @@
 
         return not subscriptions_from_dupes.is_empty()
 
-    def setHeat(self, heat, timestamp=None):
+    def setHeat(self, heat, timestamp=None, affected_targets=None):
+        """See `IBug`."""
         """See `IBug`."""
         if timestamp is None:
             timestamp = UTC_NOW
@@ -1930,10 +1950,13 @@
 
         self.heat = heat
         self.heat_last_updated = timestamp
-        for task in self.bugtasks:
-            task.target.recalculateBugHeatCache()
+        if affected_targets is None:
+            for task in self.bugtasks:
+                task.target.recalculateBugHeatCache()
+        else:
+            affected_targets.update(task.target for task in self.bugtasks)
 
-    def updateHeat(self):
+    def updateHeat(self, affected_targets=None):
         """See `IBug`."""
         if self.duplicateof is not None:
             # If this bug is a duplicate we don't try to calculate its
@@ -1947,8 +1970,11 @@
 
         self.heat = SQL("calculate_bug_heat(%s)" % sqlvalues(self))
         self.heat_last_updated = UTC_NOW
-        for task in self.bugtasks:
-            task.target.recalculateBugHeatCache()
+        if affected_targets is None:
+            for task in self.bugtasks:
+                task.target.recalculateBugHeatCache()
+        else:
+            affected_targets.update(task.target for task in self.bugtasks)
         store.flush()
 
     def _attachments_query(self):

=== modified file 'lib/lp/bugs/tests/test_duplicate_handling.py'
--- lib/lp/bugs/tests/test_duplicate_handling.py	2011-01-21 21:42:40 +0000
+++ lib/lp/bugs/tests/test_duplicate_handling.py	2011-04-29 11:59:25 +0000
@@ -10,7 +10,10 @@
 
 from canonical.testing.layers import DatabaseFunctionalLayer
 from lp.bugs.errors import InvalidDuplicateValue
-from lp.testing import TestCaseWithFactory
+from lp.testing import (
+    StormStatementRecorder,
+    TestCaseWithFactory,
+    )
 
 
 class TestDuplicateAttributes(TestCaseWithFactory):
@@ -96,6 +99,70 @@
         self.assertEqual(dupe_one.duplicateof, new_bug)
         self.assertEqual(dupe_two.duplicateof, new_bug)
 
+    def makeBugForDistributionSourcePackage(self, sourcepackage,
+                                            with_random_target):
+        bug = self.factory.makeBug(
+            distribution=sourcepackage.distribution,
+            sourcepackagename=sourcepackage.sourcepackagename)
+        if with_random_target:
+            bug.addTask(
+                self.factory.makePerson(),
+                self.factory.makeDistributionSourcePackage())
+        return bug
+
+    def moveDuplicates(self, number_of_dupes, with_random_target):
+        # Create a bug with the given number of duplicates and
+        # then mark the bug as a duplicate of another bug.
+        # Return the number of SQL statements executed to
+        # update the target's bug heat cache
+        # (IBugTarget.recalculateBugHeatCache())
+        #
+        # We use a distributionsourcepackage as the bug target
+        # because we filter the recorded SQL statements by
+        # string.startswith(...), and the implementation of
+        # DistributionSourcePackage.recalculateBugHeatCache()
+        # is the only one that issues a "SELECT MAX(Bug.heat)..."
+        # query, making it more reliable to detect in the
+        # numerous recorded statements compared with the
+        # statements issued by BugTarget.recalculateBugHeatCache().
+        dsp = self.factory.makeDistributionSourcePackage()
+        master_bug = self.makeBugForDistributionSourcePackage(
+            dsp, with_random_target)
+        for count in xrange(number_of_dupes):
+            dupe = self.makeBugForDistributionSourcePackage(
+                dsp, with_random_target)
+            dupe.markAsDuplicate(master_bug)
+        new_master_bug = self.makeBugForDistributionSourcePackage(
+            dsp, with_random_target)
+        with StormStatementRecorder() as recorder:
+            master_bug.markAsDuplicate(new_master_bug)
+        target_heat_cache_statements = [
+            statement for statement in recorder.statements
+            if statement.startswith(
+                "'SELECT MAX(Bug.heat), SUM(Bug.heat), COUNT(Bug.id)")]
+        return len(target_heat_cache_statements)
+
+    def test_move_duplicates_efficient_target_heat_cache_calculation(self):
+        # When bug A is marked as a duplicate of bug B, bug A's
+        # duplicates become duplicates of bug B too. This requires
+        # to set the heat of the duplicates to 0, and to recalculate
+        # the heat cache of each target. Ensure that the heat cache
+        # is computed only once per target.
+        #
+        # The query to retrieve the hottest bug for a target is quite
+        # slow (ca 200 msec) and should be executed exactly once per
+        # target.
+        self.assertEqual(1, self.moveDuplicates(2, with_random_target=False))
+        self.assertEqual(1, self.moveDuplicates(4, with_random_target=False))
+
+        # If each bug has two targets, one of them common, the other
+        # distinct for each bug, we still get one call for each target.
+        # For N duplicates, we have N distinct targets, we have
+        # the targets for old master bug and for the new master bug,
+        # and one common target, i.e., 2*N+3 targets for N duplicates.
+        self.assertEqual(5, self.moveDuplicates(2, with_random_target=True))
+        self.assertEqual(7, self.moveDuplicates(4, with_random_target=True))
+
 
 def test_suite():
     return unittest.TestLoader().loadTestsFromName(__name__)