← Back to team overview

launchpad-reviewers team mailing list archive

lp:~wallyworld/launchpad/person-related-software-timeout-735972 into lp:launchpad

 

Ian Booth has proposed merging lp:~wallyworld/launchpad/person-related-software-timeout-735972 into lp:launchpad.

Commit message:
Rework key person related package queries to use Storm and be more efficient.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #735972 in Launchpad itself: "Person:+related-software timeouts"
  https://bugs.launchpad.net/launchpad/+bug/735972

For more details, see:
https://code.launchpad.net/~wallyworld/launchpad/person-related-software-timeout-735972/+merge/130924

== Implementation ==

There are several person related software views. Essentially, each displays some different data and has menu links to each of the other views. There are severasl issues:

1. the queries to populate the tables use old style sql object queries with many left joins to pull in all the related data, and use multi-column distinct and order by clauses to get only the latest packages

2. the same queries as above are run just to get a true/false as to whether to display the menu links, and the expensive distinct and order by stuff is not required

3. superfluous information is shown (bug counts, question counts etc - see bug 206917) and this add to the render time

This branch covers items 1 and 2.

New person APIs are introduced:
- hasMaintainedPackages()
- hasUploadedButNotMaintainedPackages()
- hasUploadedPPAPackages()
- hasSynchronisedPublishings()

The above are much simplified versions of the queries used to populate the view, and are used to toggle menu links. A quick check on prod indicates a factor of 4 improvement for each query.

The query count is slightly increased since the batching/preloading pattern used with Storm runs separate subqueries for each related data set, but the overall queries are smaller and easier for Postgres to optimise.

== Tests ==

The view changes are internal, so existing tests will work there.
I updated the registry person.txt doc tests to exercise the newly added APIs.

== Lint ==

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/registry/browser/person.py
  lib/lp/registry/doc/person.txt
  lib/lp/registry/interfaces/person.py
  lib/lp/registry/model/person.py
-- 
https://code.launchpad.net/~wallyworld/launchpad/person-related-software-timeout-735972/+merge/130924
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~wallyworld/launchpad/person-related-software-timeout-735972 into lp:launchpad.
=== modified file 'lib/lp/registry/browser/person.py'
--- lib/lp/registry/browser/person.py	2012-10-22 02:30:44 +0000
+++ lib/lp/registry/browser/person.py	2012-10-23 02:37:20 +0000
@@ -609,27 +609,25 @@
     def maintained(self):
         target = '+maintained-packages'
         text = 'Maintained packages'
-        enabled = bool(self.person.getLatestMaintainedPackages())
+        enabled = self.person.hasMaintainedPackages()
         return Link(target, text, enabled=enabled, icon='info')
 
     def uploaded(self):
         target = '+uploaded-packages'
         text = 'Uploaded packages'
-        enabled = bool(
-            self.person.getLatestUploadedButNotMaintainedPackages())
+        enabled = self.person.hasUploadedButNotMaintainedPackages()
         return Link(target, text, enabled=enabled, icon='info')
 
     def ppa(self):
         target = '+ppa-packages'
         text = 'Related PPA packages'
-        enabled = bool(self.person.getLatestUploadedPPAPackages())
+        enabled = self.person.hasUploadedPPAPackages()
         return Link(target, text, enabled=enabled, icon='info')
 
     def synchronised(self):
         target = '+synchronised-packages'
         text = 'Synchronised packages'
-        enabled = bool(
-            self.person.getLatestSynchronisedPublishings())
+        enabled = self.person.hasSynchronisedPublishings()
         return Link(target, text, enabled=enabled, icon='info')
 
     def projects(self):

=== modified file 'lib/lp/registry/doc/person.txt'
--- lib/lp/registry/doc/person.txt	2012-09-28 14:35:35 +0000
+++ lib/lp/registry/doc/person.txt	2012-10-23 02:37:20 +0000
@@ -954,7 +954,15 @@
 The 3rd method returns SourcePackageReleases uploaded by the person in
 question to any PPA.
 
+There are also analogous methods to see if a person has any of the above
+related packages:
+ 1. hasMaintainedPackages(),
+ 2. hasUploadedButNotMaintainedPackages(),
+ 3. hasUploadedPPAPackages
+
     >>> mark = personset.getByName('mark')
+    >>> mark.hasMaintainedPackages()
+    True
     >>> for sprelease in mark.getLatestMaintainedPackages():
     ...     print (sprelease.name,
     ...            sprelease.upload_distroseries.fullseriesname,
@@ -967,6 +975,8 @@
     (u'mozilla-firefox', u'Ubuntu Warty', u'0.9')
     (u'evolution', u'Ubuntu Hoary', u'1.0')
 
+    >>> mark.hasUploadedButNotMaintainedPackages()
+    True
     >>> for sprelease in mark.getLatestUploadedButNotMaintainedPackages():
     ...     print (sprelease.name,
     ...            sprelease.upload_distroseries.fullseriesname,
@@ -978,6 +988,8 @@
     (u'linux-source-2.6.15', u'Ubuntu Hoary', u'2.6.15.3')
     (u'alsa-utils', u'Ubuntu Hoary', u'1.0.9a-4ubuntu1')
 
+    >>> mark.hasUploadedPPAPackages()
+    True
     >>> mark_spreleases = mark.getLatestUploadedPPAPackages()
     >>> for sprelease in mark_spreleases:
     ...     print (sprelease.name,
@@ -1007,6 +1019,16 @@
     ...            sprelease.upload_distroseries.fullseriesname)
     (u'iceweasel', u'1.0', u'mark', u'mark', u'mark', u'Ubuntu Warty')
 
+Unlike Mark, this next person is very lazy and has no related packages:
+
+    >>> lazy = personset.getByName('name12')
+    >>> lazy.hasMaintainedPackages()
+    False
+    >>> lazy.hasUploadedButNotMaintainedPackages()
+    False
+    >>> lazy.hasUploadedPPAPackages()
+    False
+
 
 Packages a Person is subscribed to
 ----------------------------------

=== modified file 'lib/lp/registry/interfaces/person.py'
--- lib/lp/registry/interfaces/person.py	2012-10-16 00:48:55 +0000
+++ lib/lp/registry/interfaces/person.py	2012-10-23 02:37:20 +0000
@@ -1220,13 +1220,6 @@
         used between TeamMembership and Person objects.
         """
 
-    def getLatestMaintainedPackages():
-        """Return `SourcePackageRelease`s maintained by this person.
-
-        This method will only include the latest source package release
-        for each source package name, distribution series combination.
-        """
-
     def getLatestSynchronisedPublishings():
         """Return `SourcePackagePublishingHistory`s synchronised by this
         person.
@@ -1235,6 +1228,13 @@
         package name, distribution series combination.
         """
 
+    def getLatestMaintainedPackages():
+        """Return `SourcePackageRelease`s maintained by this person.
+
+        This method will only include the latest source package release
+        for each source package name, distribution series combination.
+        """
+
     def getLatestUploadedButNotMaintainedPackages():
         """Return `SourcePackageRelease`s created by this person but
         not maintained by him.
@@ -1250,6 +1250,24 @@
         for each source package name, distribution series combination.
         """
 
+    def hasSynchronisedPublishings():
+        """Are there `SourcePackagePublishingHistory`s synchronised by this
+        person.
+        """
+
+    def hasMaintainedPackages():
+        """Are there `SourcePackageRelease`s maintained by this person."""
+
+    def hasUploadedButNotMaintainedPackages():
+        """Are there `SourcePackageRelease`s created by this person but
+        not maintained by him.
+        """
+
+    def hasUploadedPPAPackages():
+        """Are there `SourcePackageRelease`s uploaded by this person to any
+        PPA.
+        """
+
     def isUploader(distribution):
         """Return whether this person is an uploader for distribution.
 

=== modified file 'lib/lp/registry/model/person.py'
--- lib/lp/registry/model/person.py	2012-10-22 02:30:44 +0000
+++ lib/lp/registry/model/person.py	2012-10-23 02:37:20 +0000
@@ -2681,55 +2681,33 @@
         gpgkeyset = getUtility(IGPGKeySet)
         return gpgkeyset.getGPGKeys(ownerid=self.id)
 
+    def hasMaintainedPackages(self):
+        """See `IPerson`."""
+        return self._hasReleasesQuery()
+
+    def hasUploadedButNotMaintainedPackages(self):
+        """See `IPerson`."""
+        return self._hasReleasesQuery(uploader_only=True)
+
+    def hasUploadedPPAPackages(self):
+        """See `IPerson`."""
+        return self._hasReleasesQuery(uploader_only=True, ppa_only=True)
+
     def getLatestMaintainedPackages(self):
         """See `IPerson`."""
-        return self._latestSeriesQuery()
-
-    def getLatestSynchronisedPublishings(self):
-        """See `IPerson`."""
-        query = """
-            SourcePackagePublishingHistory.id IN (
-                SELECT DISTINCT ON (spph.distroseries,
-                                    spr.sourcepackagename)
-                    spph.id
-                FROM
-                    SourcePackagePublishingHistory as spph, archive,
-                    SourcePackagePublishingHistory as ancestor_spph,
-                    SourcePackageRelease as spr
-                WHERE
-                    spph.sourcepackagerelease = spr.id AND
-                    spph.creator = %(creator)s AND
-                    spph.ancestor = ancestor_spph.id AND
-                    spph.archive = archive.id AND
-                    ancestor_spph.archive != spph.archive AND
-                    archive.purpose = %(archive_purpose)s
-                ORDER BY spph.distroseries,
-                    spr.sourcepackagename,
-                    spph.datecreated DESC,
-                    spph.id DESC
-            )
-            """ % dict(
-                   creator=quote(self.id),
-                   archive_purpose=quote(ArchivePurpose.PRIMARY),
-                   )
-
-        return SourcePackagePublishingHistory.select(
-            query,
-            orderBy=['-SourcePackagePublishingHistory.datecreated',
-                     '-SourcePackagePublishingHistory.id'],
-            prejoins=['sourcepackagerelease', 'archive'])
+        return self._latestReleasesQuery()
 
     def getLatestUploadedButNotMaintainedPackages(self):
         """See `IPerson`."""
-        return self._latestSeriesQuery(uploader_only=True)
+        return self._latestReleasesQuery(uploader_only=True)
 
     def getLatestUploadedPPAPackages(self):
         """See `IPerson`."""
-        return self._latestSeriesQuery(
-            uploader_only=True, ppa_only=True)
+        return self._latestReleasesQuery(uploader_only=True, ppa_only=True)
 
-    def _latestSeriesQuery(self, uploader_only=False, ppa_only=False):
-        """Return the sourcepackagereleases (SPRs) related to this person.
+    def _releasesQueryFilter(self, uploader_only=False, ppa_only=False):
+        """Return the filter used to find sourcepackagereleases (SPRs)
+        related to this person.
 
         :param uploader_only: controls if we are interested in SPRs where
             the person in question is only the uploader (creator) and not the
@@ -2745,55 +2723,136 @@
         'uploader_only' because there shouldn't be any sense of maintainership
         for packages uploaded to PPAs by someone else than the user himself.
         """
-        clauses = ['sourcepackagerelease.upload_archive = archive.id']
+        clauses = [SourcePackageRelease.upload_archive == Archive.id]
 
         if uploader_only:
-            clauses.append(
-                'sourcepackagerelease.creator = %s' % quote(self.id))
+            clauses.append(SourcePackageRelease.creator == self)
 
         if ppa_only:
             # Source maintainer is irrelevant for PPA uploads.
             pass
         elif uploader_only:
-            clauses.append(
-                'sourcepackagerelease.maintainer != %s' % quote(self.id))
+            clauses.append(SourcePackageRelease.maintainer != self)
         else:
-            clauses.append(
-                'sourcepackagerelease.maintainer = %s' % quote(self.id))
+            clauses.append(SourcePackageRelease.maintainer == self)
 
         if ppa_only:
-            clauses.append(
-                'archive.purpose = %s' % quote(ArchivePurpose.PPA))
+            clauses.append(Archive.purpose == ArchivePurpose.PPA)
         else:
-            clauses.append(
-                'archive.purpose != %s' % quote(ArchivePurpose.PPA))
-
-        query_clauses = " AND ".join(clauses)
-        query = """
-            SourcePackageRelease.id IN (
-                SELECT DISTINCT ON (upload_distroseries,
-                                    sourcepackagerelease.sourcepackagename,
-                                    upload_archive)
-                    sourcepackagerelease.id
-                FROM sourcepackagerelease, archive,
-                    sourcepackagepublishinghistory as spph
-                WHERE
-                    spph.sourcepackagerelease = sourcepackagerelease.id AND
-                    spph.archive = archive.id AND
-                    %(more_query_clauses)s
-                ORDER BY upload_distroseries,
-                    sourcepackagerelease.sourcepackagename,
-                    upload_archive, dateuploaded DESC
-              )
-              """ % dict(more_query_clauses=query_clauses)
-
-        rset = SourcePackageRelease.select(
-            query,
-            orderBy=['-SourcePackageRelease.dateuploaded',
-                     'SourcePackageRelease.id'],
-            prejoins=['sourcepackagename', 'maintainer', 'upload_archive'])
-
-        return rset
+            clauses.append(Archive.purpose != ArchivePurpose.PPA)
+
+        return clauses
+
+    def _hasReleasesQuery(self, uploader_only=False, ppa_only=False):
+        """Are there sourcepackagereleases (SPRs) related to this person.
+        See `_releasesQueryFilter` for details on the criteria used.
+        """
+        clauses = self._releasesQueryFilter(uploader_only, ppa_only)
+        spph = ClassAlias(SourcePackagePublishingHistory, "spph")
+        tables = (
+            SourcePackageRelease,
+            Join(
+                spph, spph.sourcepackagereleaseID == SourcePackageRelease.id),
+            Join(Archive, Archive.id == spph.archiveID))
+        rs = Store.of(self).using(*tables).find(
+            SourcePackageRelease.id, clauses)
+        return not rs.is_empty()
+
+    def _latestReleasesQuery(self, uploader_only=False, ppa_only=False):
+        """Return the sourcepackagereleases (SPRs) related to this person.
+        See `_releasesQueryFilter` for details on the criteria used."""
+        clauses = self._releasesQueryFilter(uploader_only, ppa_only)
+        spph = ClassAlias(SourcePackagePublishingHistory, "spph")
+        rs = Store.of(self).find(
+            SourcePackageRelease,
+            SourcePackageRelease.id.is_in(
+                Select(
+                    SourcePackageRelease.id,
+                    tables=[
+                        SourcePackageRelease,
+                        Join(
+                            spph,
+                            spph.sourcepackagereleaseID ==
+                            SourcePackageRelease.id),
+                        Join(Archive, Archive.id == spph.archiveID)],
+                    where=And(*clauses),
+                    order_by=[SourcePackageRelease.upload_distroseriesID,
+                              SourcePackageRelease.sourcepackagenameID,
+                              SourcePackageRelease.upload_archiveID,
+                              Desc(SourcePackageRelease.dateuploaded)],
+                    distinct=(
+                        SourcePackageRelease.upload_distroseriesID,
+                        SourcePackageRelease.sourcepackagenameID,
+                        SourcePackageRelease.upload_archiveID)))
+        ).order_by(
+            Desc(SourcePackageRelease.dateuploaded), SourcePackageRelease.id)
+
+        def load_related_objects(rows):
+            list(getUtility(IPersonSet).getPrecachedPersonsFromIDs(
+                set(map(attrgetter("maintainerID"), rows))))
+            bulk.load_related(SourcePackageName, rows, ['sourcepackagenameID'])
+            bulk.load_related(Archive, rows, ['upload_archiveID'])
+
+        return DecoratedResultSet(rs, pre_iter_hook=load_related_objects)
+
+    def hasSynchronisedPublishings(self):
+        """See `IPerson`."""
+        spph = ClassAlias(SourcePackagePublishingHistory, "spph")
+        ancestor_spph = ClassAlias(
+            SourcePackagePublishingHistory, "ancestor_spph")
+        tables = (
+            SourcePackageRelease,
+            Join(
+                spph,
+                spph.sourcepackagereleaseID ==
+                SourcePackageRelease.id),
+            Join(Archive, Archive.id == spph.archiveID),
+            Join(ancestor_spph, ancestor_spph.id == spph.ancestorID))
+        rs = Store.of(self).using(*tables).find(
+            spph.id,
+            spph.creatorID == self.id,
+            ancestor_spph.archiveID != spph.archiveID,
+            Archive.purpose == ArchivePurpose.PRIMARY)
+        return not rs.is_empty()
+
+    def getLatestSynchronisedPublishings(self):
+        """See `IPerson`."""
+        spph = ClassAlias(SourcePackagePublishingHistory, "spph")
+        ancestor_spph = ClassAlias(
+            SourcePackagePublishingHistory, "ancestor_spph")
+        rs = Store.of(self).find(
+            SourcePackagePublishingHistory,
+            SourcePackagePublishingHistory.id.is_in(
+                Select(
+                    spph.id,
+                    tables=[
+                        SourcePackageRelease,
+                        Join(
+                            spph, spph.sourcepackagereleaseID ==
+                            SourcePackageRelease.id),
+                        Join(Archive, Archive.id == spph.archiveID),
+                        Join(
+                            ancestor_spph,
+                            ancestor_spph.id == spph.ancestorID)],
+                    where=And(
+                        spph.creatorID == self.id,
+                        ancestor_spph.archiveID != spph.archiveID,
+                        Archive.purpose == ArchivePurpose.PRIMARY),
+                    order_by=[spph.distroseriesID,
+                              SourcePackageRelease.sourcepackagenameID,
+                              spph.datecreated, spph.id],
+                    distinct=(
+                        spph.distroseriesID,
+                        SourcePackageRelease.sourcepackagenameID)
+                    ))).order_by(
+            Desc(SourcePackageRelease.dateuploaded), SourcePackageRelease.id)
+
+        def load_related_objects(rows):
+            bulk.load_related(
+                SourcePackageRelease, rows, ['sourcepackagereleaseID'])
+            bulk.load_related(Archive, rows, ['upload_archiveID'])
+
+        return DecoratedResultSet(rs, pre_iter_hook=load_related_objects)
 
     def createRecipe(self, name, description, recipe_text, distroseries,
                      registrant, daily_build_archive=None, build_daily=False):


Follow ups