← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~danilo/launchpad/bug-826692 into lp:launchpad

 

Данило Шеган has proposed merging lp:~danilo/launchpad/bug-826692 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #826692 in Launchpad itself: "+activereviews times out with a lot of merge proposals"
  https://bugs.launchpad.net/launchpad/+bug/826692

For more details, see:
https://code.launchpad.net/~danilo/launchpad/bug-826692/+merge/71836

= Bug 826692 =

Branch:+activereviews page lists all "active" merge proposals, but fetches them one-by-one along with a single query for each page.  A page with 300 of them can emit a total of over 2000 queries.

That is slow, so we pre-load a bunch of the relevant things.  For preloading branch information, we re-use the preloading logic in BranchCollection.getBranches() and move it out to _preloadDataForBranches(): nothing is changed there except s/rows/branches/.

At the same time, I move the landing_candidates to using getMergeProposals().

Since I want to avoid having to rewrite any of the API callsites, the changes are all done as "side-effects", and no new tests are provided.

Providing new tests that limit the number of queries issued would be hard because this is dependent on the storm_cache_size which is set to a low 100 on development boxes.  For local testing, I've used a value of 1000 (http://paste.ubuntu.com/668090/) instead, and default value (and production value) is 10000.

== Tests ==

bin/test -cvvm lp.code

== Demo and Q/A ==

Make sure +activereviews page keeps working, and especially that

https://code.qastaging.launchpad.net/~openerp/openobject-addons/trunk/+activereviews

doesn't time out anymore.

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/code/model/branchcollection.py
  lib/lp/code/model/branch.py
-- 
https://code.launchpad.net/~danilo/launchpad/bug-826692/+merge/71836
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~danilo/launchpad/bug-826692 into lp:launchpad.
=== modified file 'lib/lp/code/model/branch.py'
--- lib/lp/code/model/branch.py	2011-08-09 14:53:07 +0000
+++ lib/lp/code/model/branch.py	2011-08-17 09:52:43 +0000
@@ -358,10 +358,10 @@
     @property
     def landing_candidates(self):
         """See `IBranch`."""
-        return BranchMergeProposal.select("""
-            BranchMergeProposal.target_branch = %s AND
-            BranchMergeProposal.queue_status NOT IN %s
-            """ % sqlvalues(self, BRANCH_MERGE_PROPOSAL_FINAL_STATES))
+        non_final = tuple(
+            set(BranchMergeProposalStatus.items) -
+            set(BRANCH_MERGE_PROPOSAL_FINAL_STATES))
+        return self.getMergeProposals(status=non_final)
 
     @property
     def dependent_branches(self):

=== modified file 'lib/lp/code/model/branchcollection.py'
--- lib/lp/code/model/branchcollection.py	2011-07-08 10:47:21 +0000
+++ lib/lp/code/model/branchcollection.py	2011-08-17 09:52:43 +0000
@@ -63,12 +63,17 @@
 from lp.code.model.codeimport import CodeImport
 from lp.code.model.codereviewcomment import CodeReviewComment
 from lp.code.model.codereviewvote import CodeReviewVoteReference
+from lp.code.model.diff import (
+    Diff,
+    PreviewDiff,
+    )
 from lp.code.model.seriessourcepackagebranch import SeriesSourcePackageBranch
 from lp.registry.model.distribution import Distribution
 from lp.registry.model.distroseries import DistroSeries
 from lp.registry.model.person import (
     Owner,
     Person,
+    ValidPersonCache,
     )
 from lp.registry.model.product import Product
 from lp.registry.model.sourcepackagename import SourcePackageName
@@ -152,9 +157,9 @@
                   exclude_from_search=None, symmetric=True):
         """Return a subset of this collection, filtered by 'expressions'.
 
-        :param symmetric: If True this filter will apply to both sides of merge
-            proposal lookups and any other lookups that join Branch back onto
-            Branch.
+        :param symmetric: If True this filter will apply to both sides
+            of merge proposal lookups and any other lookups that join
+            Branch back onto Branch.
         """
         # NOTE: JonathanLange 2009-02-17: We might be able to avoid the need
         # for explicit 'tables' by harnessing Storm's table inference system.
@@ -175,7 +180,8 @@
             if table is not None:
                 asymmetric_tables[table] = join
             symmetric_expr = list(self._branch_filter_expressions)
-            asymmetric_expr = self._asymmetric_filter_expressions + expressions
+            asymmetric_expr = (
+                self._asymmetric_filter_expressions + expressions)
         if exclude_from_search is None:
             exclude_from_search = []
         return self.__class__(
@@ -205,12 +211,48 @@
     def _getCandidateBranchesWith(self):
         """Return WITH clauses defining candidate branches.
 
-        These are defined in terms of scope_branches which should be separately
-        calculated.
+        These are defined in terms of scope_branches which should be
+        separately calculated.
         """
         return [
             With("candidate_branches", SQL("SELECT id from scope_branches"))]
 
+    def _preloadDataForBranches(self, branches):
+        """Preload branches cached associated product series and
+        suite source packages."""
+        caches = dict((branch.id, get_property_cache(branch))
+            for branch in branches)
+        branch_ids = caches.keys()
+        for cache in caches.values():
+            if not safe_hasattr(cache, '_associatedProductSeries'):
+                cache._associatedProductSeries = []
+            if not safe_hasattr(cache, '_associatedSuiteSourcePackages'):
+                cache._associatedSuiteSourcePackages = []
+            if not safe_hasattr(cache, 'code_import'):
+                cache.code_import = None
+        # associatedProductSeries
+        # Imported here to avoid circular import.
+        from lp.registry.model.productseries import ProductSeries
+        for productseries in self.store.find(
+            ProductSeries,
+            ProductSeries.branchID.is_in(branch_ids)):
+            cache = caches[productseries.branchID]
+            cache._associatedProductSeries.append(productseries)
+        # associatedSuiteSourcePackages
+        series_set = getUtility(IFindOfficialBranchLinks)
+        # Order by the pocket to get the release one first. If changing
+        # this be sure to also change BranchCollection.getBranches.
+        links = series_set.findForBranches(branches).order_by(
+            SeriesSourcePackageBranch.pocket)
+        for link in links:
+            cache = caches[link.branchID]
+            cache._associatedSuiteSourcePackages.append(
+                link.suite_sourcepackage)
+        for code_import in IStore(CodeImport).find(
+            CodeImport, CodeImport.branchID.is_in(branch_ids)):
+            cache = caches[code_import.branchID]
+            cache.code_import = code_import
+
     def getBranches(self, eager_load=False):
         """See `IBranchCollection`."""
         all_tables = set(
@@ -225,43 +267,12 @@
             branch_ids = set(branch.id for branch in rows)
             if not branch_ids:
                 return
-            branches = dict((branch.id, branch) for branch in rows)
-            caches = dict((branch.id, get_property_cache(branch))
-                for branch in rows)
-            for cache in caches.values():
-                if not safe_hasattr(cache, '_associatedProductSeries'):
-                    cache._associatedProductSeries = []
-                if not safe_hasattr(cache, '_associatedSuiteSourcePackages'):
-                    cache._associatedSuiteSourcePackages = []
-                if not safe_hasattr(cache, 'code_import'):
-                    cache.code_import = None
-            # associatedProductSeries
-            # Imported here to avoid circular import.
-            from lp.registry.model.productseries import ProductSeries
-            for productseries in self.store.find(
-                ProductSeries,
-                ProductSeries.branchID.is_in(branch_ids)):
-                cache = caches[productseries.branchID]
-                cache._associatedProductSeries.append(productseries)
-            # associatedSuiteSourcePackages
-            series_set = getUtility(IFindOfficialBranchLinks)
-            # Order by the pocket to get the release one first. If changing
-            # this be sure to also change BranchCollection.getBranches.
-            links = series_set.findForBranches(rows).order_by(
-                SeriesSourcePackageBranch.pocket)
-            for link in links:
-                cache = caches[link.branchID]
-                cache._associatedSuiteSourcePackages.append(
-                    link.suite_sourcepackage)
+            self._preloadDataForBranches(rows)
             load_related(Product, rows, ['productID'])
             # So far have only needed the persons for their canonical_url - no
             # need for validity etc in the /branches API call.
             load_related(Person, rows,
                 ['ownerID', 'registrantID', 'reviewerID'])
-            for code_import in IStore(CodeImport).find(
-                CodeImport, CodeImport.branchID.is_in(branch_ids)):
-                cache = caches[code_import.branchID]
-                cache.code_import = code_import
             load_referencing(BugBranch, rows, ['branchID'])
         return DecoratedResultSet(resultset, pre_iter_hook=do_eager_load)
 
@@ -280,15 +291,46 @@
 
     def _naiveGetMergeProposals(self, statuses=None, for_branches=None,
         target_branch=None, merged_revnos=None):
+
+        def do_eager_load(rows):
+            branch_ids = set()
+            person_ids = set()
+            diff_ids = set()
+            for mp in rows:
+                branch_ids.add(mp.target_branchID)
+                branch_ids.add(mp.source_branchID)
+                person_ids.add(mp.registrantID)
+                person_ids.add(mp.merge_reporterID)
+                diff_ids.add(mp.preview_diff_id)
+            if not branch_ids:
+                return
+
+            # Pre-load Person and ValidPersonCache.
+            list(self.store.find(
+                (Person, ValidPersonCache),
+                ValidPersonCache.id == Person.id,
+                Person.id.is_in(person_ids),
+                ))
+
+            # Pre-load PreviewDiffs and Diffs.
+            list(self.store.find(
+                (PreviewDiff, Diff),
+                PreviewDiff.id.is_in(diff_ids),
+                Diff.id == PreviewDiff.diff_id))
+
+            branches = set(
+                self.store.find(Branch, Branch.id.is_in(branch_ids)))
+            self._preloadDataForBranches(branches)
+
         Target = ClassAlias(Branch, "target")
         extra_tables = list(set(
             self._tables.values() + self._asymmetric_tables.values()))
         tables = [Branch] + extra_tables + [
             Join(BranchMergeProposal, And(
-                Branch.id==BranchMergeProposal.source_branchID,
+                Branch.id == BranchMergeProposal.source_branchID,
                 *(self._branch_filter_expressions +
                   self._asymmetric_filter_expressions))),
-            Join(Target, Target.id==BranchMergeProposal.target_branchID)
+            Join(Target, Target.id == BranchMergeProposal.target_branchID),
             ]
         expressions = self._getBranchVisibilityExpression()
         expressions.extend(self._getBranchVisibilityExpression(Target))
@@ -305,7 +347,9 @@
         if statuses is not None:
             expressions.append(
                 BranchMergeProposal.queue_status.is_in(statuses))
-        return self.store.using(*tables).find(BranchMergeProposal, *expressions)
+        resultset = self.store.using(*tables).find(
+            BranchMergeProposal, *expressions)
+        return DecoratedResultSet(resultset, pre_iter_hook=do_eager_load)
 
     def _scopedGetMergeProposals(self, statuses):
         scope_tables = [Branch] + self._tables.values()
@@ -513,7 +557,7 @@
         """See `IBranchCollection`."""
         subquery = Select(
             TeamParticipation.teamID,
-            where=TeamParticipation.personID==person.id)
+            where=TeamParticipation.personID == person.id)
         filter = [In(Branch.ownerID, subquery)]
 
         return self._filterBy(filter, symmetric=False)
@@ -666,8 +710,8 @@
     def _getCandidateBranchesWith(self):
         """Return WITH clauses defining candidate branches.
 
-        These are defined in terms of scope_branches which should be separately
-        calculated.
+        These are defined in terms of scope_branches which should be
+        separately calculated.
         """
         # Anonymous users get public branches only.
         return [
@@ -694,9 +738,9 @@
                   exclude_from_search=None, symmetric=True):
         """Return a subset of this collection, filtered by 'expressions'.
 
-        :param symmetric: If True this filter will apply to both sides of merge
-            proposal lookups and any other lookups that join Branch back onto
-            Branch.
+        :param symmetric: If True this filter will apply to both sides
+            of merge proposal lookups and any other lookups that join
+            Branch back onto Branch.
         """
         # NOTE: JonathanLange 2009-02-17: We might be able to avoid the need
         # for explicit 'tables' by harnessing Storm's table inference system.
@@ -717,7 +761,8 @@
             if table is not None:
                 asymmetric_tables[table] = join
             symmetric_expr = list(self._branch_filter_expressions)
-            asymmetric_expr = self._asymmetric_filter_expressions + expressions
+            asymmetric_expr = (
+                self._asymmetric_filter_expressions + expressions)
         if exclude_from_search is None:
             exclude_from_search = []
         return self.__class__(
@@ -781,8 +826,8 @@
     def _getCandidateBranchesWith(self):
         """Return WITH clauses defining candidate branches.
 
-        These are defined in terms of scope_branches which should be separately
-        calculated.
+        These are defined in terms of scope_branches which should be
+        separately calculated.
         """
         person = self._user
         if person is None:
@@ -796,10 +841,11 @@
                 TeamParticipation.personID == person.id)._get_select()),
             With("private_branches", SQL("""
                 SELECT scope_branches.id FROM scope_branches WHERE
-                scope_branches.private AND ((scope_branches.owner in (select team from teams) OR
-                    EXISTS(SELECT true from BranchSubscription, teams WHERE
-                        branchsubscription.branch = scope_branches.id AND
-                        branchsubscription.person = teams.team)))""")),
+                scope_branches.private AND (
+                    (scope_branches.owner in (select team from teams) OR
+                     EXISTS(SELECT true from BranchSubscription, teams WHERE
+                         branchsubscription.branch = scope_branches.id AND
+                         branchsubscription.person = teams.team)))""")),
             With("candidate_branches", SQL("""
                 (SELECT id FROM private_branches) UNION
                 (select id FROM scope_branches WHERE not private)"""))


Follow ups