← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~wgrant/launchpad/rargh-smash-dupe-heat into lp:launchpad

 

William Grant has proposed merging lp:~wgrant/launchpad/rargh-smash-dupe-heat into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #936884 in Launchpad itself: "BugHeatUpdater is slow"
  https://bugs.launchpad.net/launchpad/+bug/936884

For more details, see:
https://code.launchpad.net/~wgrant/launchpad/rargh-smash-dupe-heat/+merge/93801

BugHeatUpdater has a goal iteration length of 2 seconds, but the initial query takes around 2.5s due to the large number of duplicate bugs with outdated heat. This causes it to degrade to single-bug ~3s iterations, which makes it take forever and places excessive load on the DB.

This branch adjusts our definition of heat to make things simpler and faster. Since duplicate bugs are hidden from default listings, the current rule that they have their heat overridden to 0 provides little benefit to search results. I've removed the override from updateHeat and changed BugHeatUpdater to no longer skip duplicate bugs -- this lets it use the heat_last_updated index to satisfy the query in milliseconds rather than seconds.

I've also thrown in a change to the heat outdatedness check. We've traditionally updated every bug's heat weekly in order to implement the recently removed aging functionality, but we now only need to do it when we change the algorithm. The new BugHeatUpdater parses a feature flag to determine the cutoff -- if it's not set then it doesn't update anything.

20:11:17 < wgrant> So, I'll remove the dupe => 0, and remove the non-dupe filter?
20:11:23 < lifeless> mrevell: I've been meaning to catchup with you, but ETERRIBLETIMING
20:11:42 < lifeless> wgrant: that works for me; see if you can convince yourrandomreviewer of the sanity of it ;)
20:12:39 < lifeless> wgrant: you should also land either a knob, or a hard coded value, to replace the '1 week old' thing, unless you've done that already.
-- 
https://code.launchpad.net/~wgrant/launchpad/rargh-smash-dupe-heat/+merge/93801
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~wgrant/launchpad/rargh-smash-dupe-heat into lp:launchpad.
=== modified file 'lib/lp/bugs/doc/bug-heat.txt'
--- lib/lp/bugs/doc/bug-heat.txt	2012-02-07 06:48:22 +0000
+++ lib/lp/bugs/doc/bug-heat.txt	2012-02-20 11:18:19 +0000
@@ -143,10 +143,9 @@
     >>> bug.heat
     8
 
-Marking the bug as a duplicate will set its heat to zero, whilst also
-adding 10 points of heat to the bug it duplicates, 6 points for the
-duplication and 4 points for the subscribers that the duplicated bug
-inherits.
+Marking the bug as a duplicate won't change its heat, but it will add 10
+points of heat to the bug it duplicates: 6 points for the duplication
+and 4 points for the subscribers that the duplicated bug inherits.
 
     >>> duplicated_bug = factory.makeBug()
     >>> duplicated_bug.heat
@@ -154,7 +153,7 @@
 
     >>> bug.markAsDuplicate(duplicated_bug)
     >>> bug.heat
-    0
+    8
 
     >>> duplicated_bug.heat
     16
@@ -216,16 +215,16 @@
 
 If we call getBugsWithOutdatedHeat() now, the set that is returned will
 be empty because all the bugs have been recently updated.
-getBugsWithOutdatedHeat() takes a single parameter, max_heat_age, which
-is the maximum age, in days, that a bug's heat can be before it gets
-included in the returned set.
+getBugsWithOutdatedHeat() takes a single parameter, cutoff, which is the
+oldest a bug's heat can be before it gets included in the returned set.
 
-    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(1).count()
+    >>> yesterday = datetime.now(timezone('UTC')) - timedelta(days=1)
+    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(yesterday).count()
     0
 
 IBug.setHeat() takes a timestamp parameter so that we can set the
 heat_last_updated date manually for the purposes of testing. If we make
-a bug's heat older than the max_heat_age that we pass to
+a bug's heat older than the cutoff that we pass to
 getBugsWithOutdatedHeat() it will appear in the set returned by
 getBugsWithOutdatedHeat().
 
@@ -233,7 +232,8 @@
     >>> old_heat_bug.setHeat(
     ...     0, datetime.now(timezone('UTC')) - timedelta(days=2))
 
-    >>> outdated_bugs = getUtility(IBugSet).getBugsWithOutdatedHeat(1)
+    >>> outdated_bugs = getUtility(IBugSet).getBugsWithOutdatedHeat(
+    ...     yesterday)
     >>> outdated_bugs.count()
     1
 
@@ -249,7 +249,8 @@
 
     >>> new_bug.heat_last_updated = None
 
-    >>> outdated_bugs = getUtility(IBugSet).getBugsWithOutdatedHeat(1)
+    >>> outdated_bugs = getUtility(IBugSet).getBugsWithOutdatedHeat(
+    ...     yesterday)
     >>> outdated_bugs.count()
     2
 
@@ -270,33 +271,49 @@
 right bugs.
 
     >>> transaction.commit()
-    >>> update_bug_heat = BugHeatUpdater(FakeLogger(), max_heat_age=1)
+    >>> update_bug_heat = BugHeatUpdater(FakeLogger())
 
 BugHeatUpdater implements ITunableLoop and as such is callable. Calling
 it as a method will recalculate the heat for all the out-of-date bugs.
 
 There are two bugs with heat more than a day old:
 
-    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(1).count()
+    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(yesterday).count()
     2
 
-Calling our BugHeatUpdater will update the heat of those bugs.
+The updater gets the cutoff from a feature flag. By default no bugs are
+considered outdated.
 
     >>> update_bug_heat(chunk_size=1)
+    DEBUG Updating heat for 0 bugs
+
+    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(yesterday).count()
+    2
+
+If we set the cutoff to a day ago, calling our BugHeatUpdater will update the
+heat of those bugs.
+
+    >>> from lp.services.features.testing import FeatureFixture
+    >>> flag = FeatureFixture(
+    ...     {'bugs.heat_updates.cutoff': yesterday.isoformat()})
+
+    >>> with flag:
+    ...     update_bug_heat(chunk_size=1)
     DEBUG Updating heat for 1 bugs
 
 IBugSet.getBugsWithOutdatedHeat() will now return 1 item.
 
-    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(1).count()
+    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(yesterday).count()
     1
 
 Update the rest in one big chunk.
 
-    >>> update_bug_heat(chunk_size=1000)
+    >>> with flag:
+    ...     update_bug_heat(chunk_size=1000)
     DEBUG Updating heat for 1 bugs
 
 IBugSet.getBugsWithOutdatedHeat() will now return an empty set since all
 the bugs have been updated.
 
-    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(1).count()
+    >>> getUtility(IBugSet).getBugsWithOutdatedHeat(yesterday).count()
     0

=== modified file 'lib/lp/bugs/interfaces/bug.py'
--- lib/lp/bugs/interfaces/bug.py	2012-02-15 08:13:51 +0000
+++ lib/lp/bugs/interfaces/bug.py	2012-02-20 11:18:19 +0000
@@ -1263,12 +1263,11 @@
         USE THIS ANYWAY.
         """
 
-    def getBugsWithOutdatedHeat(max_heat_age):
+    def getBugsWithOutdatedHeat(cutoff):
         """Return the set of bugs whose heat is out of date.
 
-        :param max_heat_age: The maximum age, in days, that a bug's heat
-                             can be before it is included in the
-                             returned set.
+        :param cutoff: the oldest that a bug's heat can be before it is
+            considered outdated.
         """
 
 

=== modified file 'lib/lp/bugs/model/bug.py'
--- lib/lp/bugs/model/bug.py	2012-02-15 08:13:51 +0000
+++ lib/lp/bugs/model/bug.py	2012-02-20 11:18:19 +0000
@@ -22,10 +22,6 @@
 
 
 from cStringIO import StringIO
-from datetime import (
-    datetime,
-    timedelta,
-    )
 from email.Utils import make_msgid
 from functools import wraps
 from itertools import chain
@@ -39,7 +35,6 @@
     )
 from lazr.lifecycle.snapshot import Snapshot
 import pytz
-from pytz import timezone
 from sqlobject import (
     BoolCol,
     ForeignKey,
@@ -2070,20 +2065,14 @@
         except LaunchpadValidationError, validation_error:
             raise InvalidDuplicateValue(validation_error)
         if duplicate_of is not None:
-            # Update the heat of the master bug and set this bug's heat
-            # to 0 (since it's a duplicate, it shouldn't have any heat
-            # at all).
-            self.setHeat(0)
             # Maybe confirm bug tasks, now that more people might be affected
             # by this bug from the duplicates.
             duplicate_of.maybeConfirmBugtasks()
-        else:
-            # Otherwise, recalculate this bug's heat, since it will be 0
-            # from having been a duplicate. We also update the bug that
-            # was previously duplicated.
-            self.updateHeat()
-            if current_duplicateof is not None:
-                current_duplicateof.updateHeat()
+
+        # Update the former duplicateof's heat, as it will have been
+        # reduced by the unduping.
+        if current_duplicateof is not None:
+            current_duplicateof.updateHeat()
 
     def markAsDuplicate(self, duplicate_of):
         """See `IBug`."""
@@ -2256,11 +2245,6 @@
 
     def updateHeat(self):
         """See `IBug`."""
-        if self.duplicateof is not None:
-            # If this bug is a duplicate we don't try to calculate its
-            # heat.
-            return
-
         # We need to flush the store first to ensure that changes are
         # reflected in the new bug heat total.
         store = Store.of(self)
@@ -2991,18 +2975,15 @@
         result_set = store.find(Bug)
         return result_set.order_by('id')
 
-    def getBugsWithOutdatedHeat(self, max_heat_age):
+    def getBugsWithOutdatedHeat(self, cutoff):
         """See `IBugSet`."""
         store = IStore(Bug)
-        last_updated_cutoff = (
-            datetime.now(timezone('UTC')) -
-            timedelta(days=max_heat_age))
         last_updated_clause = Or(
-            Bug.heat_last_updated < last_updated_cutoff,
+            Bug.heat_last_updated < cutoff,
             Bug.heat_last_updated == None)
 
-        return store.find(
-            Bug, Bug.duplicateof == None, last_updated_clause).order_by('id')
+        return store.find(Bug, last_updated_clause).order_by(
+            Bug.heat_last_updated)
 
 
 class BugAffectsPerson(SQLBase):

=== modified file 'lib/lp/scripts/garbo.py'
--- lib/lp/scripts/garbo.py	2012-01-18 05:41:43 +0000
+++ lib/lp/scripts/garbo.py	2012-02-20 11:18:19 +0000
@@ -13,6 +13,7 @@
     datetime,
     timedelta,
     )
+import iso8601
 import logging
 import multiprocessing
 import os
@@ -25,12 +26,14 @@
     )
 from psycopg2 import IntegrityError
 import pytz
+from pytz import timezone
 from storm.expr import In
 from storm.locals import (
     Max,
     Min,
     SQL,
     )
+from storm.store import EmptyResultSet
 import transaction
 from zope.component import getUtility
 from zope.security.proxy import removeSecurityProxy
@@ -63,6 +66,7 @@
     session_store,
     sqlvalues,
     )
+from lp.services.features import getFeatureFlag
 from lp.services.identity.interfaces.account import AccountStatus
 from lp.services.identity.interfaces.emailaddress import EmailAddressStatus
 from lp.services.identity.model.account import Account
@@ -776,22 +780,24 @@
 
     maximum_chunk_size = 5000
 
-    def __init__(self, log, abort_time=None, max_heat_age=None):
+    def __init__(self, log, abort_time=None):
         super(BugHeatUpdater, self).__init__(log, abort_time)
         self.transaction = transaction
         self.total_processed = 0
         self.is_done = False
         self.offset = 0
-        if max_heat_age is None:
-            max_heat_age = config.calculate_bug_heat.max_heat_age
-        self.max_heat_age = max_heat_age
 
         self.store = IMasterStore(Bug)
 
     @property
     def _outdated_bugs(self):
+        try:
+            last_updated_cutoff = iso8601.parse_date(
+                getFeatureFlag('bugs.heat_updates.cutoff'))
+        except iso8601.ParseError:
+            return EmptyResultSet()
         outdated_bugs = getUtility(IBugSet).getBugsWithOutdatedHeat(
-            self.max_heat_age)
+            last_updated_cutoff)
         # We remove the security proxy so that we can access the set()
         # method of the result set.
         return removeSecurityProxy(outdated_bugs)

=== modified file 'lib/lp/services/features/flags.py'
--- lib/lp/services/features/flags.py	2012-02-17 01:23:31 +0000
+++ lib/lp/services/features/flags.py	2012-02-20 11:18:19 +0000
@@ -31,7 +31,9 @@
     ('int',
      "An integer."),
     ('space delimited',
-     'Space-delimited strings.')
+     'Space-delimited strings.'),
+    ('datetime',
+     'ISO 8601 datetime'),
     ])
 
 # Data for generating web-visible feature flag documentation.
@@ -79,6 +81,13 @@
      '',
      'Listing pre-fetching',
      'https://bugs.launchpad.net/launchpad/+bug/888756'),
+    ('bugs.heat_updates.cutoff',
+     'timestamp',
+     ('Set the oldest that a bug\'s heat can be before it is '
+      'considered outdated.'),
+     '',
+     '',
+     ''),
     ('code.ajax_revision_diffs.enabled',
      'boolean',
      ("Offer expandable inline diffs for branch revisions."),