← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/bug-891493 into lp:launchpad

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/bug-891493 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #891493 in Launchpad itself: "Dogfood timeout while approving Ubuntu uploads"
  https://bugs.launchpad.net/launchpad/+bug/891493

For more details, see:
https://code.launchpad.net/~jtv/launchpad/bug-891493/+merge/82514

= Summary =

My Q/A on dogfood was being obstructed by timeouts.  The culprit was SPR.getBuildByArch, which contained a query that took 0.8 seconds on dogfood — repeated in a loop over all of a distroseries' architectures.


== Proposed fix ==

The slow query looks for builds for a single architecture.  As it turns out, the version that gets them for all relevant architectures was massively faster: 0.2—0.3ms, or 3 to 4 orders of magnitude.

And so I extracted the query into a method, and made it retrieve builds for all architectures in one go.  It's really designed to be hoisted out of the architectures loop and reused across every iteration, but at this speed, why bother?


== Pre-implementation notes ==

William notes that the code has been in need of love for some time.


== Implementation details ==

Two architectures can be associated with one binary-package publication: the one that its binary package release was built for, and the one it's published for.

Normally these are the same, but architecture-independent packages are built in one architecture (the "nominated" one, in practice i386) and get published in all relevant architectures.

To deal with that situation in the same way as the old code, I required that the BPR and the BPPH be for the same architecture — but removed the restriction on what architecture that is, since the new query retrieves information for all architectures.  Instead of restricting the DistroArchSeries for a BinaryPackagePublication to the given one, I restricted only its distroseries to a constant.  Note that (distroseries, architecturetag) are a key of DistroArchSeries.


== Tests ==

Many tests cover or rely on getBuildByArch:
{{{
./bin/test -vvc lp.soyuz.scripts.tests.test_add_missing_builds
./bin/test -vvc lp.soyuz.scripts.tests.test_initialize_distroseries
./bin/test -vvc lp.soyuz.scripts.tests.test_sourcepackagerelease -t GetBuildByArch
}}}

I also added one unit test for my extracted method.  It's a simple case, but covers so much of the functionality that I didn't see much point in adding more:
{{{
./bin/test -vvc lp.soyuz.scripts.tests.test_sourcepackagerelease -t FindBuildsByArchitecture
}}}


== Demo and Q/A ==

On dogfood, approve half a pageful of Ubuntu package uploads at once.  It should not time out, or at least not repeatedly as it did endlessly for me.


= Launchpad lint =

One bit of lint that's really due to the linter getting confused by decorator cleverness:

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/soyuz/tests/test_sourcepackagerelease.py
  lib/lp/soyuz/model/sourcepackagerelease.py

./lib/lp/soyuz/model/sourcepackagerelease.py
     205: redefinition of function 'copyright' from line 196
-- 
https://code.launchpad.net/~jtv/launchpad/bug-891493/+merge/82514
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/bug-891493 into lp:launchpad.
=== modified file 'lib/lp/soyuz/model/sourcepackagerelease.py'
--- lib/lp/soyuz/model/sourcepackagerelease.py	2011-11-04 08:03:58 +0000
+++ lib/lp/soyuz/model/sourcepackagerelease.py	2011-11-17 10:56:16 +0000
@@ -29,6 +29,7 @@
     StringCol,
     )
 from storm.expr import Join
+from storm.info import ClassAlias
 from storm.locals import (
     Int,
     Reference,
@@ -417,55 +418,76 @@
             pocket=pocket,
             archive=archive)
 
+    def findBuildsByArchitecture(self, distroseries, archive):
+        """Find associated builds, by architecture.
+
+        Looks for `BinaryPackageBuild` records for this source package
+        release, with publication records in the distroseries associated with
+        `distroarchseries`.  There should be at most one of these per
+        architecture.
+
+        :param distroarchseries: `DistroArchSeries` to look for.
+        :return: A dict mapping architecture tags (in string form,
+            e.g. 'i386') to `BinaryPackageBuild`s for that build.
+        """
+        # Avoid circular imports.
+        from lp.soyuz.model.binarypackagerelease import BinaryPackageRelease
+        from lp.soyuz.model.distroarchseries import DistroArchSeries
+
+        BuildDAS = ClassAlias(DistroArchSeries, 'BuildDAS')
+        PublishDAS = ClassAlias(DistroArchSeries, 'PublishDAS')
+
+        query = Store.of(self).find(
+            (BuildDAS.architecturetag, BinaryPackageBuild),
+            BinaryPackageBuild.source_package_release == self,
+            BinaryPackageRelease.buildID == BinaryPackageBuild.id,
+            BuildDAS.id == BinaryPackageBuild.distro_arch_series_id,
+            BinaryPackagePublishingHistory.binarypackagereleaseID ==
+                BinaryPackageRelease.id,
+            PublishDAS.id ==
+                BinaryPackagePublishingHistory.distroarchseriesID,
+            PublishDAS.distroseriesID == distroseries.id,
+            # Architecture-independent binary package releases are built
+            # in the nominated arch-indep architecture but published in
+            # all architectures.  This condition makes sure we consider
+            # only builds that have been published in their own
+            # architecture.
+            PublishDAS.architecturetag == BuildDAS.architecturetag,
+            BinaryPackagePublishingHistory.archiveID == archive.id)
+        results = list(query.config(distinct=True))
+        mapped_results = dict(results)
+        assert len(mapped_results) == len(results), (
+            "Found multiple build candidates per architecture: %s.  "
+            "This may mean that we have a serious problem in our DB model.  "
+            "Further investigation is required."
+            % [(tag, build.id) for tag, build in results])
+        return mapped_results
+
     def getBuildByArch(self, distroarchseries, archive):
         """See ISourcePackageRelease."""
         # First we try to follow any binaries built from the given source
         # in a distroarchseries with the given architecturetag and published
         # in the given (distroarchseries, archive) location.
-        clauseTables = [
-            'BinaryPackagePublishingHistory', 'BinaryPackageRelease',
-            'DistroArchSeries']
-
-        query = """
-            BinaryPackageBuild.source_package_release = %s AND
-            BinaryPackageRelease.build = BinaryPackageBuild.id AND
-            DistroArchSeries.id = BinaryPackageBuild.distro_arch_series AND
-            DistroArchSeries.architecturetag = %s AND
-            BinaryPackagePublishingHistory.binarypackagerelease =
-                BinaryPackageRelease.id AND
-            BinaryPackagePublishingHistory.distroarchseries = %s AND
-            BinaryPackagePublishingHistory.archive = %s
-        """ % sqlvalues(self, distroarchseries.architecturetag,
-                        distroarchseries, archive)
-
-        select_results = BinaryPackageBuild.select(
-            query, clauseTables=clauseTables, distinct=True,
-            orderBy='-BinaryPackageBuild.id')
-
-        # XXX cprov 20080216: this if/elif/else block could be avoided or,
-        # at least, simplified if SelectOne accepts 'distinct' argument.
-        # The query above results in multiple identical builds for ..
-        results = list(select_results)
-        if len(results) == 1:
+        # (Querying all architectures and then picking the right one out
+        # of the result turns out to be much faster than querying for
+        # just the architecture we want).
+        builds_by_arch = self.findBuildsByArchitecture(
+            distroarchseries.distroseries, archive)
+        build = builds_by_arch.get(distroarchseries.architecturetag)
+        if build is not None:
             # If there was any published binary we can use its original build.
-            # This case covers the situations when both, source and binaries
+            # This case covers the situations when both source and binaries
             # got copied from another location.
-            return results[0]
-        elif len(results) > 1:
-            # If more than one distinct build was found we have a problem.
-            # A build was created when it shouldn't, possible due to bug
-            # #181736. The broken build should be manually removed.
-            raise AssertionError(
-                    "Found more than one build candidate: %s. It possibly "
-                    "means we have a serious problem in out DB model, "
-                    "further investigation is required." %
-                    [build.id for build in results])
-        else:
-            # If there was no published binary we have to try to find a
-            # suitable build in all possible location across the distroseries
-            # inheritance tree. See bellow.
-            pass
+            return build
 
+        # If there was no published binary we have to try to find a
+        # suitable build in all possible location across the distroseries
+        # inheritance tree. See below.
+        clause_tables = [
+            'BuildFarmJob',
+            'PackageBuild',
+            'DistroArchSeries',
+            ]
         queries = [
             "BinaryPackageBuild.package_build = PackageBuild.id AND "
             "PackageBuild.build_farm_job = BuildFarmJob.id AND "
@@ -480,8 +502,7 @@
         query = " AND ".join(queries)
 
         return BinaryPackageBuild.selectFirst(
-            query, clauseTables=[
-                'BuildFarmJob', 'PackageBuild', 'DistroArchSeries'],
+            query, clauseTables=clause_tables,
             orderBy=['-BuildFarmJob.date_created'])
 
     def override(self, component=None, section=None, urgency=None):

=== modified file 'lib/lp/soyuz/tests/test_sourcepackagerelease.py'
--- lib/lp/soyuz/tests/test_sourcepackagerelease.py	2011-11-04 08:32:49 +0000
+++ lib/lp/soyuz/tests/test_sourcepackagerelease.py	2011-11-17 10:56:16 +0000
@@ -242,6 +242,42 @@
         self.assertEqual(orig_build, found_build)
 
 
+class TestFindBuildsByArchitecture(TestCaseWithFactory):
+    """Tests for SourcePackageRelease.findBuildsByArchitecture."""
+
+    layer = ZopelessDatabaseLayer
+
+    def test_finds_build_with_matching_pub(self):
+        # findBuildsByArchitecture finds builds for a source package
+        # release.  In particular, an arch-independent BPR is published in
+        # multiple architectures.  But findBuildsByArchitecture only counts
+        # the publication for the same architecture it was built in.
+        distroseries = self.factory.makeDistroSeries()
+        archive = distroseries.main_archive
+        # The series has a nominated arch-indep architecture.
+        distroseries.nominatedarchindep = self.factory.makeDistroArchSeries(
+            distroseries=distroseries)
+
+        bpb = self.factory.makeBinaryPackageBuild(
+            distroarchseries=distroseries.nominatedarchindep)
+        bpr = self.factory.makeBinaryPackageRelease(
+            build=bpb, architecturespecific=False)
+        spr = bpr.build.source_package_release
+
+        # The series also has other architectures.
+        self.factory.makeDistroArchSeries(distroseries=distroseries)
+
+        for das in distroseries.architectures:
+            self.factory.makeBinaryPackagePublishingHistory(
+                binarypackagerelease=bpr, distroarchseries=das,
+                archive=archive)
+
+        naked_spr = removeSecurityProxy(spr)
+        self.assertEqual(
+            {distroseries.nominatedarchindep.architecturetag: bpr.build},
+            naked_spr.findBuildsByArchitecture(distroseries, archive))
+
+
 class TestSourcePackageReleaseTranslationFiles(TestCaseWithFactory):
     """Tests for attachTranslationFiles on a different layer."""