← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/bug-884649-branch-2 into lp:launchpad

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/bug-884649-branch-2 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #884649 in Launchpad itself: "Ubuntu publisher is taking more than an hour to complete"
  https://bugs.launchpad.net/launchpad/+bug/884649

For more details, see:
https://code.launchpad.net/~jtv/launchpad/bug-884649-branch-2/+merge/81025

= Summary =

This is part of the Dominator optimization work needed for bug 884649.

Binary domination needs 2 passes.  This is because publications for the "all" architecture, which are copied into each architecture, can't be superseded until all architecture-specific publications coming from the same source package release have been superseded.  Only then can we decide whether arch-all publications can be superseded.  This is because a binary package from the same source package release may still depend on the arch-all binary package, and may not have finished building yet.

The code for dealing with this subtlety is a bit hard to place, because it's deep down in the inner loop as a special case.  The special case is applied to both binary-domination passes, although there's really no point: the first pass may conceivably supersede a few arch-all publications but it won't deal with all of them.  Meanwhile we pay a hefty price for the extra check.


== Proposed fix ==

Hoist the decision of which publications must stay live out of dominatePackage.  It's a policy decision for the caller; the API lets the caller specify which versions stay live.

With that, we can specialize the passes more, and cut out complexity that is slowing things down.

You'll see three new functions explaining exactly what liveness criteria each domination pass uses.  The source pass is easy.  The first binary pass deals with architecture-specific publications; arch-all ones must be considered, but they are simply kept alive.  And once all architectures have gone through first-pass binary domination, superseding as many arch-specific publications as possible, the second binary-domination pass considers just what needs to happen to those arch-all publications.


== Pre-implementation notes ==

Discussed in detail with Julian; see bug 884649 for the plan we're following in addressing this.  We should also be able to re-use liveness decisions across architectures (and probably binary packages building from the same source package).

Raphaël suggested another clever improvement: Steve K. has just initialized a new, denormalized column on BinaryPackagePublishingHistory that contains the package name.  That should enable a few shortcuts as well.

By the way, note that we can't just dominate arch-all and arch-specific publications separately.  Packages may vacillate between the two, producing successive releases that need to be considered together even if domination of some publications may need to be postponed.


== Implementation details ==

I had to ditch _dominatePublications, because the discovery of live versions went into the middle of it.  I reasoned that if removing it introduced too much code duplication, it could return with a callback for live-version discovery.  But the loss turned out to be pretty small: it's really just a 3-line loop, including log output.


== Tests ==

{{{
./bin/test -vvc lp.archivepublisher.tests.test_dominator
}}}


== Demo and Q/A ==

Dominate Ubuntu and you'll see.


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/archivepublisher/domination.py
  lib/lp/soyuz/model/publishing.py
  lib/lp/archivepublisher/tests/test_dominator.py
-- 
https://code.launchpad.net/~jtv/launchpad/bug-884649-branch-2/+merge/81025
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/bug-884649-branch-2 into lp:launchpad.
=== modified file 'lib/lp/archivepublisher/domination.py'
--- lib/lp/archivepublisher/domination.py	2011-10-18 11:56:09 +0000
+++ lib/lp/archivepublisher/domination.py	2011-11-02 14:39:30 +0000
@@ -52,7 +52,12 @@
 
 __all__ = ['Dominator']
 
+from collections import defaultdict
 from datetime import timedelta
+from operator import (
+    attrgetter,
+    itemgetter,
+    )
 
 import apt_pkg
 from storm.expr import (
@@ -67,7 +72,11 @@
     flush_database_updates,
     sqlvalues,
     )
+from canonical.launchpad.components.decoratedresultset import (
+    DecoratedResultSet,
+    )
 from canonical.launchpad.interfaces.lpstorm import IStore
+from canonical.launchpad.utilities.orderingcheck import OrderingCheck
 from lp.registry.model.sourcepackagename import SourcePackageName
 from lp.services.database.bulk import load_related
 from lp.soyuz.enums import (
@@ -188,6 +197,104 @@
         else:
             return version_comparison
 
+    def sortPublications(self, publications):
+        """Sort publications from most to least current versions."""
+        # Listify; we want to iterate this twice, which won't do for a
+        # non-persistent sequence.
+        sorted_publications = list(publications)
+        # Batch-load associated package releases; we'll be needing them
+        # to compare versions.
+        self.load_releases(sorted_publications)
+        # Now sort.  This is that second iteration.  An in-place sort
+        # won't hurt the original, because we're working on a copy of
+        # the original iterable.
+        sorted_publications.sort(cmp=self.compare, reverse=True)
+        return sorted_publications
+
+
+def find_live_source_versions(publications):
+    """Find versions out of Published `publications` that should stay live.
+
+    This particular notion of liveness applies to source domination: the
+    latest version stays live, and that's it.
+
+    :param publications: An iterable of `SourcePackagePublishingHistory`
+        sorted by descending package version.
+    :return: A list of live versions.
+    """
+    # Given the required sort order, the latest version is at the head
+    # of the list.
+    return [publications[0].sourcepackagerelease.version]
+
+
+def get_binary_versions(binary_publications):
+    """List versions for sequence of `BinaryPackagePublishingHistory`."""
+    return [pub.binarypackagerelease.version for pub in binary_publications]
+
+
+def find_live_binary_versions_first_pass(publications):
+    """Find versions out of Published `publications` that should stay live.
+
+    This particular notion of liveness applies to first-pass binary
+    domination: the latest version stays live, and so do publications of
+    binary packages for the "all" architecture.
+
+    :param publications: An iterable of `BinaryPackagePublishingHistory`,
+        sorted by descending package version.
+    :return: A list of live versions.
+    """
+    publications = list(publications)
+    latest = publications.pop(0)
+    return get_binary_versions(
+        [latest] + [
+            pub for pub in publications if not pub.architecture_specific])
+
+
+def find_live_binary_versions_second_pass(publications):
+    """Find versions out of Published `publications` that should stay live.
+
+    This particular notion of liveness applies to second-pass binary
+    domination: the latest version stays live, and architecture-specific
+    publications stay live (i.e, ones that are not for the "all"
+    architecture).
+
+    More importantly, any publication for binary packages of the "all"
+    architecture stay live if any of the non-"all" binary packages from
+    the same source package release are still active -- even if they are
+    for other architectures.
+
+    This is the raison d'etre for the two-pass binary domination algorithm:
+    to let us see which architecture-independent binary publications can be
+    superseded without rendering any architecture-specific binaries from the
+    same source package release uninstallable.
+
+    (Note that here, "active" includes Published publications but also
+    Pending ones.  This is standard nomenclature in Soyuz.  Some of the
+    domination code confuses matters by using the term "active" to mean only
+    Published publications).
+
+    :param publications: An iterable of `BinaryPackagePublishingHistory`,
+        sorted by descending package version.
+    :return: A list of live versions.
+    """
+    publications = list(publications)
+    latest = publications.pop(0)
+    is_arch_specific = attrgetter('architecture_specific')
+    arch_specific_pubs = filter(is_arch_specific, publications)
+    arch_indep_pubs = filter(
+        lambda pub: not is_arch_specific(pub),
+        publications)
+
+    # XXX JeroenVermeulen 2011-11-01 bug=884649: This is likely to be
+    # costly, and the result could be reused for all builds of the same
+    # source package release to all architectures.
+    reprieved_pubs = [
+        pub
+        for pub in arch_indep_pubs
+            if pub.getOtherPublicationsForSameSource().any()]
+
+    return get_binary_versions([latest] + arch_specific_pubs + reprieved_pubs)
+
 
 class Dominator:
     """Manage the process of marking packages as superseded.
@@ -205,27 +312,6 @@
         self.logger = logger
         self.archive = archive
 
-    def _checkArchIndep(self, publication):
-        """Return True if the binary publication can be superseded.
-
-        If the publication is an arch-indep binary, we can only supersede
-        it if all the binaries from the same source are also superseded,
-        else those binaries may become uninstallable.
-        See bug 34086.
-        """
-        binary = publication.binarypackagerelease
-        if not binary.architecturespecific:
-            # getOtherPublicationsForSameSource returns PENDING in
-            # addition to PUBLISHED binaries, and we rely on this since
-            # they must also block domination.
-            others = publication.getOtherPublicationsForSameSource()
-            if others.any():
-                # Don't dominate this arch:all binary as there are
-                # other arch-specific binaries from the same build
-                # that are still active.
-                return False
-        return True
-
     def dominatePackage(self, publications, live_versions, generalization):
         """Dominate publications for a single package.
 
@@ -243,48 +329,47 @@
 
         :param publications: Iterable of publications for the same package,
             in the same archive, series, and pocket, all with status
-            `PackagePublishingStatus.PUBLISHED`.
-        :param live_versions: Iterable of version strings that are still
-            considered live for this package.  The given publications will
-            remain active insofar as they represent any of these versions;
-            older publications will be marked as superseded and newer ones
-            as deleted.
+            `PackagePublishingStatus.PUBLISHED`.  They must be sorted from
+            most current to least current, as would be the result of
+            `generalization.sortPublications`.
+        :param live_versions: Iterable of versions that are still considered
+            "live" for this package.  For any of these, the latest publication
+            among `publications` will remain Published.  Publications for
+            older releases, as well as older publications of live versions,
+            will be marked as Superseded.  Publications of newer versions than
+            are listed in `live_versions` are marked as Deleted.
         :param generalization: A `GeneralizedPublication` helper representing
-            the kind of publications these are--source or binary.
+            the kind of publications these are: source or binary.
         """
-        publications = list(publications)
-        generalization.load_releases(publications)
-
-        # Go through publications from latest version to oldest.  This
-        # makes it easy to figure out which release superseded which:
-        # the dominant is always the oldest live release that is newer
-        # than the one being superseded.
-        publications = sorted(
-            publications, cmp=generalization.compare, reverse=True)
+        live_versions = frozenset(live_versions)
 
         self.logger.debug(
             "Package has %d live publication(s).  Live versions: %s",
             len(publications), live_versions)
 
+        # Verify that the publications are really sorted properly.
+        check_order = OrderingCheck(cmp=generalization.compare, reverse=True)
+
         current_dominant = None
         dominant_version = None
 
         for pub in publications:
+            check_order.check(pub)
+
             version = generalization.getPackageVersion(pub)
             # There should never be two published releases with the same
-            # version.  So this comparison is really a string
-            # comparison, not a version comparison: if the versions are
-            # equal by either measure, they're from the same release.
-            if dominant_version is not None and version == dominant_version:
+            # version.  So it doesn't matter whether this comparison is
+            # really a string comparison or a version comparison: if the
+            # versions are equal by either measure, they're from the same
+            # release.
+            if version == dominant_version:
                 # This publication is for a live version, but has been
                 # superseded by a newer publication of the same version.
                 # Supersede it.
                 pub.supersede(current_dominant, logger=self.logger)
                 self.logger.debug2(
                     "Superseding older publication for version %s.", version)
-            elif (version in live_versions or
-                  (not generalization.is_source and
-                   not self._checkArchIndep(pub))):
+            elif version in live_versions:
                 # This publication stays active; if any publications
                 # that follow right after this are to be superseded,
                 # this is the release that they are superseded by.
@@ -303,50 +388,32 @@
                 pub.supersede(current_dominant, logger=self.logger)
                 self.logger.debug2("Superseding version %s.", version)
 
-    def _dominatePublications(self, pubs, generalization):
-        """Perform dominations for the given publications.
-
-        Keep the latest published version for each package active,
-        superseding older versions.
-
-        :param pubs: A dict mapping names to a list of publications. Every
-            publication must be PUBLISHED or PENDING, and the first in each
-            list will be treated as dominant (so should be the latest).
-        :param generalization: A `GeneralizedPublication` helper representing
-            the kind of publications these are--source or binary.
-        """
-        self.logger.debug("Dominating packages...")
-        for name, publications in pubs.iteritems():
-            assert publications, "Empty list of publications for %s." % name
-            # Since this always picks the latest version as the live
-            # one, this dominatePackage call will never result in a
-            # deletion.
-            latest_version = generalization.getPackageVersion(publications[0])
-            self.logger.debug2("Dominating %s" % name)
-            self.dominatePackage(
-                publications, [latest_version], generalization)
-
-    def _sortPackages(self, pkglist, generalization):
-        """Map out packages by name, and sort by descending version.
-
-        :param pkglist: An iterable of `SourcePackagePublishingHistory` or
-            `BinaryPackagePublishingHistory`.
-        :param generalization: A `GeneralizedPublication` helper representing
-            the kind of publications these are--source or binary.
-        :return: A dict mapping each package name to a list of publications
-            from `pkglist`, newest first.
-        """
-        self.logger.debug("Sorting packages...")
-
-        outpkgs = {}
-        for inpkg in pkglist:
-            key = generalization.getPackageName(inpkg)
-            outpkgs.setdefault(key, []).append(inpkg)
-
-        for package_pubs in outpkgs.itervalues():
-            package_pubs.sort(cmp=generalization.compare, reverse=True)
-
-        return outpkgs
+    def _sortPackages(self, publications, generalization):
+        """Partition publications by package name, and sort them.
+
+        The publications are sorted from most current to least current,
+        as required by `dominatePackage` etc.
+
+        :param publications: An iterable of `SourcePackagePublishingHistory`
+            or of `BinaryPackagePublishingHistory`.
+        :param generalization: A `GeneralizedPublication` helper representing
+            the kind of publications these are: source or binary.
+        :return: A dict mapping each package name to a sorted list of
+            publications from `publications`.
+        """
+        pubs_by_package = defaultdict(list)
+        for pub in publications:
+            pubs_by_package[generalization.getPackageName(pub)].append(pub)
+
+        # Sort the publication lists.  This is not an in-place sort, so
+        # it involves altering the dict while we iterate it.  Listify
+        # the keys so that we can be sure that we're not altering the
+        # iteration order while iteration is underway.
+        for package in list(pubs_by_package.keys()):
+            pubs_by_package[package] = generalization.sortPublications(
+                pubs_by_package[package])
+
+        return pubs_by_package
 
     def _setScheduledDeletionDate(self, pub_record):
         """Set the scheduleddeletiondate on a publishing record.
@@ -442,72 +509,98 @@
             # always equals to "scheduleddeletiondate - quarantine".
             pub_record.datemadepending = UTC_NOW
 
+    def findBinariesForDomination(self, distroarchseries, pocket):
+        """Find binary publications that need dominating.
+
+        This is only for traditional domination, where the latest published
+        publication is always kept published.  It will ignore publications
+        that have no other publications competing for the same binary package.
+        """
+        # Avoid circular imports.
+        from lp.soyuz.model.publishing import BinaryPackagePublishingHistory
+
+        bpph_location_clauses = [
+            BinaryPackagePublishingHistory.status ==
+                PackagePublishingStatus.PUBLISHED,
+            BinaryPackagePublishingHistory.distroarchseries ==
+                distroarchseries,
+            BinaryPackagePublishingHistory.archive == self.archive,
+            BinaryPackagePublishingHistory.pocket == pocket,
+            ]
+        candidate_binary_names = Select(
+            BinaryPackageName.id,
+            And(
+                BinaryPackageRelease.binarypackagenameID ==
+                    BinaryPackageName.id,
+                BinaryPackagePublishingHistory.binarypackagereleaseID ==
+                    BinaryPackageRelease.id,
+                bpph_location_clauses,
+            ),
+            group_by=BinaryPackageName.id,
+            having=Count(BinaryPackagePublishingHistory.id) > 1)
+        main_clauses = [
+            BinaryPackageRelease.id ==
+                BinaryPackagePublishingHistory.binarypackagereleaseID,
+            BinaryPackageRelease.binarypackagenameID.is_in(
+                candidate_binary_names),
+            BinaryPackageRelease.binpackageformat !=
+                BinaryPackageFormat.DDEB,
+            ]
+        main_clauses.extend(bpph_location_clauses)
+
+        store = IStore(BinaryPackagePublishingHistory)
+
+        # We're going to access the BPRs as well.  Since we make the
+        # database look them up anyway, and since there won't be many
+        # duplications among them, load them alongside the publications.
+        # We'll also want their BinaryPackageNames, but adding those to
+        # the join would complicate the query.
+        query = store.find(
+            (BinaryPackagePublishingHistory, BinaryPackageRelease),
+            *main_clauses)
+        return DecoratedResultSet(query, itemgetter(0))
+
     def dominateBinaries(self, distroseries, pocket):
         """Perform domination on binary package publications.
 
         Dominates binaries, restricted to `distroseries`, `pocket`, and
         `self.archive`.
         """
-        # Avoid circular imports.
-        from lp.soyuz.model.publishing import BinaryPackagePublishingHistory
-
         generalization = GeneralizedPublication(is_source=False)
 
         for distroarchseries in distroseries.architectures:
             self.logger.info(
                 "Performing domination across %s/%s (%s)",
-                distroseries.name, pocket.title,
+                distroarchseries.distroseries.name, pocket.title,
                 distroarchseries.architecturetag)
 
-            bpph_location_clauses = [
-                BinaryPackagePublishingHistory.status ==
-                    PackagePublishingStatus.PUBLISHED,
-                BinaryPackagePublishingHistory.distroarchseries ==
-                    distroarchseries,
-                BinaryPackagePublishingHistory.archive == self.archive,
-                BinaryPackagePublishingHistory.pocket == pocket,
-                ]
-            candidate_binary_names = Select(
-                BinaryPackageName.id,
-                And(
-                    BinaryPackageRelease.binarypackagenameID ==
-                        BinaryPackageName.id,
-                    BinaryPackagePublishingHistory.binarypackagereleaseID ==
-                        BinaryPackageRelease.id,
-                    bpph_location_clauses,
-                ),
-                group_by=BinaryPackageName.id,
-                having=Count(BinaryPackagePublishingHistory.id) > 1)
-            main_clauses = [
-                BinaryPackagePublishingHistory,
-                BinaryPackageRelease.id ==
-                    BinaryPackagePublishingHistory.binarypackagereleaseID,
-                BinaryPackageRelease.binarypackagenameID.is_in(
-                    candidate_binary_names),
-                BinaryPackageRelease.binpackageformat !=
-                    BinaryPackageFormat.DDEB,
-                ]
-            main_clauses.extend(bpph_location_clauses)
-
-            def do_domination(pass2_msg=""):
-                msg = "binaries..." + pass2_msg
-                self.logger.info("Finding %s" % msg)
-                bins = IStore(
-                    BinaryPackagePublishingHistory).find(*main_clauses)
-                self.logger.info("Dominating %s" % msg)
-                sorted_packages = self._sortPackages(bins, generalization)
-                self._dominatePublications(sorted_packages, generalization)
-
-            do_domination()
-
-            # We need to make a second pass to cover the cases where:
-            #  * arch-specific binaries were not all dominated before arch-all
-            #    ones that depend on them
-            #  * An arch-all turned into an arch-specific, or vice-versa
-            #  * A package is completely schizophrenic and changes all of
-            #    its binaries between arch-all and arch-any (apparently
-            #    occurs sometimes!)
-            do_domination("(2nd pass)")
+            self.logger.info("Finding binaries...")
+            bins = self.findBinariesForDomination(distroarchseries, pocket)
+            sorted_packages = self._sortPackages(bins, generalization)
+            self.logger.info("Dominating binaries...")
+            for name, pubs in sorted_packages.iteritems():
+                self.logger.debug("Dominating %s" % name)
+                assert len(pubs) > 0, "Dominating zero binaries!"
+                live_versions = find_live_binary_versions_first_pass(pubs)
+                self.dominatePackage(pubs, live_versions, generalization)
+
+        # We need to make a second pass to cover the cases where:
+        #  * arch-specific binaries were not all dominated before arch-all
+        #    ones that depend on them
+        #  * An arch-all turned into an arch-specific, or vice-versa
+        #  * A package is completely schizophrenic and changes all of
+        #    its binaries between arch-all and arch-any (apparently
+        #    occurs sometimes!)
+        for distroarchseries in distroseries.architectures:
+            self.logger.info("Finding binaries...(2nd pass)")
+            bins = self.findBinariesForDomination(distroarchseries, pocket)
+            sorted_packages = self._sortPackages(bins, generalization)
+            self.logger.info("Dominating binaries...(2nd pass)")
+            for name, pubs in sorted_packages.iteritems():
+                self.logger.debug("Dominating %s" % name)
+                assert len(pubs) > 0, "Dominating zero binaries in 2nd pass!"
+                live_versions = find_live_binary_versions_second_pass(pubs)
+                self.dominatePackage(pubs, live_versions, generalization)
 
     def _composeActiveSourcePubsCondition(self, distroseries, pocket):
         """Compose ORM condition for restricting relevant source pubs."""
@@ -522,21 +615,16 @@
             SourcePackagePublishingHistory.pocket == pocket,
             )
 
-    def dominateSources(self, distroseries, pocket):
-        """Perform domination on source package publications.
+    def findSourcesForDomination(self, distroseries, pocket):
+        """Find binary publications that need dominating.
 
-        Dominates sources, restricted to `distroseries`, `pocket`, and
-        `self.archive`.
+        This is only for traditional domination, where the latest published
+        publication is always kept published.  It will ignore publications
+        that have no other publications competing for the same binary package.
         """
         # Avoid circular imports.
         from lp.soyuz.model.publishing import SourcePackagePublishingHistory
 
-        generalization = GeneralizedPublication(is_source=True)
-
-        self.logger.debug(
-            "Performing domination across %s/%s (Source)",
-            distroseries.name, pocket.title)
-
         spph_location_clauses = self._composeActiveSourcePubsCondition(
             distroseries, pocket)
         having_multiple_active_publications = (
@@ -546,16 +634,44 @@
             And(join_spph_spr(), join_spr_spn(), spph_location_clauses),
             group_by=SourcePackageName.id,
             having=having_multiple_active_publications)
-        sources = IStore(SourcePackagePublishingHistory).find(
-            SourcePackagePublishingHistory,
+
+        # We'll also access the SourcePackageReleases associated with
+        # the publications we find.  Since they're in the join anyway,
+        # load them alongside the publications.
+        # Actually we'll also want the SourcePackageNames, but adding
+        # those to the (outer) query would complicate it, and
+        # potentially slow it down.
+        query = IStore(SourcePackagePublishingHistory).find(
+            (SourcePackagePublishingHistory, SourcePackageRelease),
             join_spph_spr(),
             SourcePackageRelease.sourcepackagenameID.is_in(
                 candidate_source_names),
             spph_location_clauses)
+        return DecoratedResultSet(query, itemgetter(0))
+
+    def dominateSources(self, distroseries, pocket):
+        """Perform domination on source package publications.
+
+        Dominates sources, restricted to `distroseries`, `pocket`, and
+        `self.archive`.
+        """
+        self.logger.debug(
+            "Performing domination across %s/%s (Source)",
+            distroseries.name, pocket.title)
+
+        generalization = GeneralizedPublication(is_source=True)
+
+        self.logger.debug("Finding sources...")
+        sources = self.findSourcesForDomination(distroseries, pocket)
+        sorted_packages = self._sortPackages(sources, generalization)
 
         self.logger.debug("Dominating sources...")
-        self._dominatePublications(
-            self._sortPackages(sources, generalization), generalization)
+        for name, pubs in sorted_packages.iteritems():
+            self.logger.debug("Dominating %s" % name)
+            assert len(pubs) > 0, "Dominating zero binaries!"
+            live_versions = find_live_source_versions(pubs)
+            self.dominatePackage(pubs, live_versions, generalization)
+
         flush_database_updates()
 
     def findPublishedSourcePackageNames(self, distroseries, pocket):
@@ -615,6 +731,7 @@
         """
         generalization = GeneralizedPublication(is_source=True)
         pubs = self.findPublishedSPPHs(distroseries, pocket, package_name)
+        pubs = generalization.sortPublications(pubs)
         self.dominatePackage(pubs, live_versions, generalization)
 
     def judge(self, distroseries, pocket):

=== modified file 'lib/lp/archivepublisher/tests/test_dominator.py'
--- lib/lp/archivepublisher/tests/test_dominator.py	2011-10-27 11:36:13 +0000
+++ lib/lp/archivepublisher/tests/test_dominator.py	2011-11-02 14:39:30 +0000
@@ -30,6 +30,7 @@
     StormStatementRecorder,
     TestCaseWithFactory,
     )
+from lp.testing.fakemethod import FakeMethod
 from lp.testing.matchers import HasQueryCount
 
 
@@ -72,13 +73,9 @@
             is_source=ISourcePackagePublishingHistory.providedBy(dominant))
         dominator = Dominator(self.logger, self.ubuntutest.main_archive)
 
-        # The _dominate* test methods require a dictionary where the
-        # package name is the key. The key's value is a list of
-        # source or binary packages representing dominant, the first element
-        # and dominated, the subsequents.
-        pubs = {'foo': [dominant, dominated]}
-
-        dominator._dominatePublications(pubs, generalization)
+        pubs = [dominant, dominated]
+        live_versions = [generalization.getPackageVersion(dominant)]
+        dominator.dominatePackage(pubs, live_versions, generalization)
         flush_database_updates()
 
         # The dominant version remains correctly published.
@@ -158,16 +155,30 @@
             [foo_10_source] + foo_10_binaries,
             PackagePublishingStatus.SUPERSEDED)
 
-    def testEmptyDomination(self):
-        """Domination asserts for not empty input list."""
-        dominator = Dominator(self.logger, self.ubuntutest.main_archive)
-        pubs = {'foo': []}
-        # This isn't a really good exception. It should probably be
-        # something more indicative of bad input.
-        self.assertRaises(
-            AssertionError,
-            dominator._dominatePublications,
-            pubs, GeneralizedPublication(True))
+    def test_dominateBinaries_rejects_empty_publication_list(self):
+        """Domination asserts for non-empty input list."""
+        package = self.factory.makeBinaryPackageName()
+        dominator = Dominator(self.logger, self.ubuntutest.main_archive)
+        dominator.mapPackages = FakeMethod({package.name: []})
+        # This isn't a really good exception. It should probably be
+        # something more indicative of bad input.
+        self.assertRaises(
+            AssertionError,
+            dominator.dominateBinaries,
+            self.factory.makeDistroArchSeries().distroseries,
+            self.factory.getAnyPocket())
+
+    def test_dominateSources_rejects_empty_publication_list(self):
+        """Domination asserts for non-empty input list."""
+        package = self.factory.makeSourcePackageName()
+        dominator = Dominator(self.logger, self.ubuntutest.main_archive)
+        dominator.mapPackages = FakeMethod({package.name: []})
+        # This isn't a really good exception. It should probably be
+        # something more indicative of bad input.
+        self.assertRaises(
+            AssertionError,
+            dominator.dominateSources,
+            self.factory.makeDistroSeries(), self.factory.getAnyPocket())
 
     def test_archall_domination(self):
         # Arch-all binaries should not be dominated when a new source
@@ -361,7 +372,10 @@
 def make_spphs_for_versions(factory, versions):
     """Create publication records for each of `versions`.
 
-    They records are created in the same order in which they are specified.
+    All these publications will be in the same source package, archive,
+    distroseries, and pocket.  They will all be in Published status.
+
+    The records are created in the same order in which they are specified.
     Make the order irregular to prove that version ordering is not a
     coincidence of object creation order etc.
 
@@ -371,6 +385,7 @@
     spn = factory.makeSourcePackageName()
     distroseries = factory.makeDistroSeries()
     pocket = factory.getAnyPocket()
+    archive = distroseries.main_archive
     sprs = [
         factory.makeSourcePackageRelease(
             sourcepackagename=spn, version=version)
@@ -378,11 +393,34 @@
     return [
         factory.makeSourcePackagePublishingHistory(
             distroseries=distroseries, pocket=pocket,
-            sourcepackagerelease=spr,
+            sourcepackagerelease=spr, archive=archive,
             status=PackagePublishingStatus.PUBLISHED)
         for spr in sprs]
 
 
+def make_bpphs_for_versions(factory, versions):
+    """Create publication records for each of `versions`.
+
+    All these publications will be in the same binary package, source
+    package, archive, distroarchseries, and pocket.  They will all be in
+    Published status.
+    """
+    bpn = factory.makeBinaryPackageName()
+    spn = factory.makeSourcePackageName()
+    das = factory.makeDistroArchSeries()
+    archive = das.distroseries.main_archive
+    pocket = factory.getAnyPocket()
+    bprs = [
+        factory.makeBinaryPackageRelease(binarypackagename=bpn)
+        for version in versions]
+    return [
+        factory.makeBinaryPackagePublishingHistory(
+            binarypackagerelease=bpr, binarypackagename=bpn,
+            distroarchseries=das, pocket=pocket, archive=archive,
+            sourcepackagename=spn, status=PackagePublishingStatus.PUBLISHED)
+        for bpr in bprs]
+
+
 def list_source_versions(spphs):
     """Extract the versions from `spphs` as a list, in the same order."""
     return [spph.sourcepackagerelease.version for spph in spphs]
@@ -564,9 +602,10 @@
     def test_dominatePackage_supersedes_older_pub_with_newer_live_pub(self):
         # When marking a package as superseded, dominatePackage
         # designates a newer live version as the superseding version.
+        generalization = GeneralizedPublication(True)
         pubs = make_spphs_for_versions(self.factory, ['1.0', '1.1'])
         self.makeDominator(pubs).dominatePackage(
-            pubs, ['1.1'], GeneralizedPublication(True))
+            generalization.sortPublications(pubs), ['1.1'], generalization)
         self.assertEqual(PackagePublishingStatus.SUPERSEDED, pubs[0].status)
         self.assertEqual(pubs[1].sourcepackagerelease, pubs[0].supersededby)
         self.assertEqual(PackagePublishingStatus.PUBLISHED, pubs[1].status)
@@ -574,10 +613,11 @@
     def test_dominatePackage_only_supersedes_with_live_pub(self):
         # When marking a package as superseded, dominatePackage will
         # only pick a live version as the superseding one.
+        generalization = GeneralizedPublication(True)
         pubs = make_spphs_for_versions(
             self.factory, ['1.0', '2.0', '3.0', '4.0'])
         self.makeDominator(pubs).dominatePackage(
-            pubs, ['3.0'], GeneralizedPublication(True))
+            generalization.sortPublications(pubs), ['3.0'], generalization)
         self.assertEqual([
                 pubs[2].sourcepackagerelease,
                 pubs[2].sourcepackagerelease,
@@ -589,23 +629,27 @@
     def test_dominatePackage_supersedes_with_oldest_newer_live_pub(self):
         # When marking a package as superseded, dominatePackage picks
         # the oldest of the newer, live versions as the superseding one.
+        generalization = GeneralizedPublication(True)
         pubs = make_spphs_for_versions(self.factory, ['2.7', '2.8', '2.9'])
         self.makeDominator(pubs).dominatePackage(
-            pubs, ['2.8', '2.9'], GeneralizedPublication(True))
+            generalization.sortPublications(pubs), ['2.8', '2.9'],
+            generalization)
         self.assertEqual(pubs[1].sourcepackagerelease, pubs[0].supersededby)
 
     def test_dominatePackage_only_supersedes_with_newer_live_pub(self):
         # When marking a package as superseded, dominatePackage only
         # considers a newer version as the superseding one.
+        generalization = GeneralizedPublication(True)
         pubs = make_spphs_for_versions(self.factory, ['0.1', '0.2'])
         self.makeDominator(pubs).dominatePackage(
-            pubs, ['0.1'], GeneralizedPublication(True))
+            generalization.sortPublications(pubs), ['0.1'], generalization)
         self.assertEqual(None, pubs[1].supersededby)
         self.assertEqual(PackagePublishingStatus.DELETED, pubs[1].status)
 
     def test_dominatePackage_supersedes_replaced_pub_for_live_version(self):
         # Even if a publication record is for a live version, a newer
         # one for the same version supersedes it.
+        generalization = GeneralizedPublication(True)
         spr = self.factory.makeSourcePackageRelease()
         series = self.factory.makeDistroSeries()
         pocket = PackagePublishingPocket.RELEASE
@@ -622,7 +666,8 @@
             ])
 
         self.makeDominator(pubs).dominatePackage(
-            pubs, [spr.version], GeneralizedPublication(True))
+            generalization.sortPublications(pubs), [spr.version],
+            generalization)
         self.assertEqual([
             PackagePublishingStatus.SUPERSEDED,
             PackagePublishingStatus.SUPERSEDED,
@@ -634,12 +679,13 @@
 
     def test_dominatePackage_is_efficient(self):
         # dominatePackage avoids issuing too many queries.
+        generalization = GeneralizedPublication(True)
         versions = ["1.%s" % revision for revision in xrange(5)]
         pubs = make_spphs_for_versions(self.factory, versions)
         with StormStatementRecorder() as recorder:
             self.makeDominator(pubs).dominatePackage(
-                pubs, versions[2:-1],
-                GeneralizedPublication(True))
+                generalization.sortPublications(pubs), versions[2:-1],
+                generalization)
         self.assertThat(recorder, HasQueryCount(LessThan(5)))
 
     def test_dominatePackage_advanced_scenario(self):
@@ -650,6 +696,7 @@
         # don't just patch up the code or this test.  Create unit tests
         # that specifically cover the difference, then change the code
         # and/or adapt this test to return to harmony.
+        generalization = GeneralizedPublication(True)
         series = self.factory.makeDistroSeries()
         package = self.factory.makeSourcePackageName()
         pocket = PackagePublishingPocket.RELEASE
@@ -696,7 +743,8 @@
 
         all_pubs = sum(pubs_by_version.itervalues(), [])
         Dominator(DevNullLogger(), series.main_archive).dominatePackage(
-            all_pubs, live_versions, GeneralizedPublication(True))
+            generalization.sortPublications(all_pubs), live_versions,
+            generalization)
 
         for version in reversed(versions):
             pubs = pubs_by_version[version]
@@ -921,3 +969,144 @@
             [],
             dominator.findPublishedSPPHs(
                 spph.distroseries, spph.pocket, other_package.name))
+
+    def test_findBinariesForDomination_finds_published_publications(self):
+        bpphs = make_bpphs_for_versions(self.factory, ['1.0', '1.1'])
+        dominator = self.makeDominator(bpphs)
+        self.assertContentEqual(
+            bpphs, dominator.findBinariesForDomination(
+                bpphs[0].distroarchseries, bpphs[0].pocket))
+
+    def test_findBinariesForDomination_skips_single_pub_packages(self):
+        # The domination algorithm that uses findBinariesForDomination
+        # always keeps the latest version live.  Thus, a single
+        # publication isn't worth dominating.  findBinariesForDomination
+        # won't return it.
+        bpphs = make_bpphs_for_versions(self.factory, ['1.0'])
+        dominator = self.makeDominator(bpphs)
+        self.assertContentEqual(
+            [], dominator.findBinariesForDomination(
+                bpphs[0].distroarchseries, bpphs[0].pocket))
+
+    def test_findBinariesForDomination_ignores_other_distroseries(self):
+        bpphs = make_bpphs_for_versions(self.factory, ['1.0', '1.1'])
+        dominator = self.makeDominator(bpphs)
+        das = bpphs[0].distroarchseries
+        other_series = self.factory.makeDistroSeries(
+            distribution=das.distroseries.distribution)
+        other_das = self.factory.makeDistroArchSeries(
+            distroseries=other_series, architecturetag=das.architecturetag,
+            processorfamily=das.processorfamily)
+        self.assertContentEqual(
+            [], dominator.findBinariesForDomination(
+                other_das, bpphs[0].pocket))
+
+    def test_findBinariesForDomination_ignores_other_architectures(self):
+        bpphs = make_bpphs_for_versions(self.factory, ['1.0', '1.1'])
+        dominator = self.makeDominator(bpphs)
+        other_das = self.factory.makeDistroArchSeries(
+            distroseries=bpphs[0].distroseries)
+        self.assertContentEqual(
+            [], dominator.findBinariesForDomination(
+                other_das, bpphs[0].pocket))
+
+    def test_findBinariesForDomination_ignores_other_archive(self):
+        bpphs = make_bpphs_for_versions(self.factory, ['1.0', '1.1'])
+        dominator = self.makeDominator(bpphs)
+        dominator.archive = self.factory.makeArchive()
+        self.assertContentEqual(
+            [], dominator.findBinariesForDomination(
+                bpphs[0].distroarchseries, bpphs[0].pocket))
+
+    def test_findBinariesForDomination_ignores_other_pocket(self):
+        bpphs = make_bpphs_for_versions(self.factory, ['1.0', '1.1'])
+        dominator = self.makeDominator(bpphs)
+        for bpph in bpphs:
+            removeSecurityProxy(bpph).pocket = PackagePublishingPocket.UPDATES
+        self.assertContentEqual(
+            [], dominator.findBinariesForDomination(
+                bpphs[0].distroarchseries, PackagePublishingPocket.SECURITY))
+
+    def test_findBinariesForDomination_ignores_other_status(self):
+        # If we have one BPPH for each possible status, plus one
+        # Published one to stop findBinariesForDomination from skipping
+        # the package, findBinariesForDomination returns only the
+        # Published ones.
+        versions = [
+            '1.%d' % self.factory.getUniqueInteger()
+            for status in PackagePublishingStatus.items] + ['0.9']
+        bpphs = make_bpphs_for_versions(self.factory, versions)
+        dominator = self.makeDominator(bpphs)
+
+        for bpph, status in zip(bpphs, PackagePublishingStatus.items):
+            bpph.status = status
+
+        # These are the Published publications.  The other ones will all
+        # be ignored.
+        published_bpphs = [
+            bpph
+            for bpph in bpphs
+                if bpph.status == PackagePublishingStatus.PUBLISHED]
+
+        self.assertContentEqual(
+            published_bpphs,
+            dominator.findBinariesForDomination(
+                bpphs[0].distroarchseries, bpphs[0].pocket))
+
+    def test_findSourcesForDomination_finds_published_publications(self):
+        spphs = make_spphs_for_versions(self.factory, ['2.0', '2.1'])
+        dominator = self.makeDominator(spphs)
+        self.assertContentEqual(
+            spphs, dominator.findSourcesForDomination(
+                spphs[0].distroseries, spphs[0].pocket))
+
+    def test_findSourcesForDomination_skips_single_pub_packages(self):
+        # The domination algorithm that uses findSourcesForDomination
+        # always keeps the latest version live.  Thus, a single
+        # publication isn't worth dominating.  findSourcesForDomination
+        # won't return it.
+        spphs = make_spphs_for_versions(self.factory, ['2.0'])
+        dominator = self.makeDominator(spphs)
+        self.assertContentEqual(
+            [], dominator.findSourcesForDomination(
+                spphs[0].distroseries, spphs[0].pocket))
+
+    def test_findSourcesForDomination_ignores_other_distroseries(self):
+        spphs = make_spphs_for_versions(self.factory, ['2.0', '2.1'])
+        dominator = self.makeDominator(spphs)
+        other_series = self.factory.makeDistroSeries(
+            distribution=spphs[0].distroseries.distribution)
+        self.assertContentEqual(
+            [], dominator.findSourcesForDomination(
+                other_series, spphs[0].pocket))
+
+    def test_findSourcesForDomination_ignores_other_pocket(self):
+        spphs = make_spphs_for_versions(self.factory, ['2.0', '2.1'])
+        dominator = self.makeDominator(spphs)
+        for spph in spphs:
+            removeSecurityProxy(spph).pocket = PackagePublishingPocket.UPDATES
+        self.assertContentEqual(
+            [], dominator.findSourcesForDomination(
+                spphs[0].distroseries, PackagePublishingPocket.SECURITY))
+
+    def test_findSourcesForDomination_ignores_other_status(self):
+        versions = [
+            '1.%d' % self.factory.getUniqueInteger()
+            for status in PackagePublishingStatus.items] + ['0.9']
+        spphs = make_spphs_for_versions(self.factory, versions)
+        dominator = self.makeDominator(spphs)
+
+        for spph, status in zip(spphs, PackagePublishingStatus.items):
+            spph.status = status
+
+        # These are the Published publications.  The other ones will all
+        # be ignored.
+        published_spphs = [
+            spph
+            for spph in spphs
+                if spph.status == PackagePublishingStatus.PUBLISHED]
+
+        self.assertContentEqual(
+            published_spphs,
+            dominator.findSourcesForDomination(
+                spphs[0].distroseries, spphs[0].pocket))

=== modified file 'lib/lp/soyuz/model/publishing.py'
--- lib/lp/soyuz/model/publishing.py	2011-10-23 02:58:56 +0000
+++ lib/lp/soyuz/model/publishing.py	2011-11-02 14:39:30 +0000
@@ -33,7 +33,6 @@
     Desc,
     LeftJoin,
     Or,
-    Select,
     Sum,
     )
 from storm.store import Store
@@ -1154,19 +1153,10 @@
         # Avoid circular wotsits.
         from lp.soyuz.model.binarypackagebuild import BinaryPackageBuild
         from lp.soyuz.model.distroarchseries import DistroArchSeries
-        from lp.soyuz.model.sourcepackagerelease import SourcePackageRelease
-        source_select = Select(
-            SourcePackageRelease.id,
-            And(
-                BinaryPackageBuild.source_package_release_id ==
-                    SourcePackageRelease.id,
-                BinaryPackageRelease.build == BinaryPackageBuild.id,
-                self.binarypackagereleaseID == BinaryPackageRelease.id,
-            ))
+
         pubs = [
             BinaryPackageBuild.source_package_release_id ==
-                SourcePackageRelease.id,
-            SourcePackageRelease.id.is_in(source_select),
+                self.binarypackagerelease.build.source_package_release_id,
             BinaryPackageRelease.build == BinaryPackageBuild.id,
             BinaryPackagePublishingHistory.binarypackagereleaseID ==
                 BinaryPackageRelease.id,


Follow ups