← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/pre-836743 into lp:launchpad

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/pre-836743 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~jtv/launchpad/pre-836743/+merge/73338

= Summary =

I'll need to make Gina perform some form of domination on Debian.  But the Dominator does far more than we need for that.


== Proposed fix ==

Before I make those changes, I'm extracting some methods from an excessively long and intimidating stretch of code.  This exposes the underlying simplicity (and opportunity for optimization) and also allows us to invoke just the relevant part — to wit dominateSources — of what the dominator does.

The "judging" part of domination, where superseded SPPHs and BPPHs are scheduled for deletion, is not relevant for Debian as far as I can see; it would erase SPPHs from the historical record, which is not likely to be desirable.


== Pre-implementation notes ==

How Gina will dominate Debian is still open for debate.  William suggested not using the dominator at all, but iterating over all active SPPHs for Debian and superseding ones that are listed in Sources, but whose specific package versions aren't.  However that would still leave all SPPHs for packages that are no longer listed open for deletion, including ones that were actually superseded before their packages were deleted.  The given reason for running domination first was to prevent this in the first place.  So it would require a custom solution plus a transitional measure that did effectively the same as a dominator run.

The reason for a custom solution is that Debian uses different rules for superseding packages than Ubuntu currently does: multiple versions of the same source package can stay active in the same archive, component, and pocket.  But, William tells me, in the future we want to have the same ability in Ubuntu too, and then the dominator will have to support it.  It may turn out to make more sense to run the dominator for Debian as for Ubuntu, and once we get around to supporting multiple active package versions, to update gina to provide the dominator with enough information.

Julian feels that package versions should not be considered superseded until they're out of (or ready to be taken out of) the Sources lists, but is not too bothered if we don't support this in Debian for the time being.


== Implementation details ==

I ditched ELIGIBLE_DOMINATION_STATES.  It was identical to inactive_publishing_status, and it was already being used for purposes other than just domination.  It was also in an iffy place.

You may wonder why variables named "dr" become "distroseries."  The "dr" was an abbreviation of DistroSeries' former name, DistroRelease.

Oh, and copying the logger's debug method into Dominator and then passing the Dominator instance to other functions as a logger?  That's just sick.  I won't stand for it.


== Tests ==

{{{
./bin/test -vvc lp.archivepublisher.tests.test_dominator
}}}


== Demo and Q/A ==

Run publish-ftpmaster on Ubuntu or derivatives.  It should dominate successfully.


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/archivepublisher/deathrow.py
  lib/lp/archivepublisher/domination.py
  lib/lp/archivepublisher/__init__.py
  lib/lp/soyuz/model/publishing.py
-- 
https://code.launchpad.net/~jtv/launchpad/pre-836743/+merge/73338
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/pre-836743 into lp:launchpad.
=== modified file 'lib/lp/archivepublisher/__init__.py'
--- lib/lp/archivepublisher/__init__.py	2010-10-03 15:30:06 +0000
+++ lib/lp/archivepublisher/__init__.py	2011-08-30 07:54:25 +0000
@@ -1,20 +1,10 @@
-# Copyright 2009 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2011 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
-__all__ = ['HARDCODED_COMPONENT_ORDER', 'ELIGIBLE_DOMINATION_STATES']
-
-from lp.soyuz.enums import PackagePublishingStatus
+__all__ = ['HARDCODED_COMPONENT_ORDER']
 
 # XXX: kiko 2006-08-23: if people actually start seriously using
 # ComponentSelections this will need to be revisited. For instance, adding
 # new components will break places which use this list.
 HARDCODED_COMPONENT_ORDER = [
     'main', 'restricted', 'universe', 'multiverse', 'partner']
-
-# This list contains the states that are eligible for domination and death
-# row processing.
-ELIGIBLE_DOMINATION_STATES = [
-    PackagePublishingStatus.SUPERSEDED,
-    PackagePublishingStatus.DELETED,
-    PackagePublishingStatus.OBSOLETE,
-    ]

=== modified file 'lib/lp/archivepublisher/deathrow.py'
--- lib/lp/archivepublisher/deathrow.py	2011-04-04 11:41:52 +0000
+++ lib/lp/archivepublisher/deathrow.py	2011-08-30 07:54:25 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2011 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 """
@@ -14,14 +14,12 @@
 
 from canonical.database.constants import UTC_NOW
 from canonical.database.sqlbase import sqlvalues
-from lp.archivepublisher import ELIGIBLE_DOMINATION_STATES
-from lp.archivepublisher.config import (
-    getPubConfig,
-    )
+from lp.archivepublisher.config import getPubConfig
 from lp.archivepublisher.diskpool import DiskPool
 from lp.soyuz.enums import ArchivePurpose
 from lp.soyuz.interfaces.publishing import (
     IBinaryPackagePublishingHistory,
+    inactive_publishing_status,
     ISourcePackagePublishingHistory,
     MissingSymlinkInPool,
     NotInPool,
@@ -122,7 +120,7 @@
                   spph.archive = %s AND
                   spph.status NOT IN %s)
         """ % sqlvalues(self.archive, UTC_NOW, self.archive,
-                        ELIGIBLE_DOMINATION_STATES), orderBy="id")
+                        inactive_publishing_status), orderBy="id")
         self.logger.debug("%d Sources" % sources.count())
 
         binaries = BinaryPackagePublishingHistory.select("""
@@ -137,7 +135,7 @@
                   bpph.archive = %s AND
                   bpph.status NOT IN %s)
         """ % sqlvalues(self.archive, UTC_NOW, self.archive,
-                        ELIGIBLE_DOMINATION_STATES), orderBy="id")
+                        inactive_publishing_status), orderBy="id")
         self.logger.debug("%d Binaries" % binaries.count())
 
         return (sources, binaries)
@@ -191,7 +189,7 @@
         right_now = datetime.datetime.now(pytz.timezone('UTC'))
         for pub in all_publications:
             # Deny removal if any reference is still active.
-            if pub.status not in ELIGIBLE_DOMINATION_STATES:
+            if pub.status not in inactive_publishing_status:
                 return False
             # Deny removal if any reference wasn't dominated yet.
             if pub.scheduleddeletiondate is None:

=== modified file 'lib/lp/archivepublisher/domination.py'
--- lib/lp/archivepublisher/domination.py	2011-06-09 10:50:25 +0000
+++ lib/lp/archivepublisher/domination.py	2011-08-30 07:54:25 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2011 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 """Archive Domination class.
@@ -57,7 +57,11 @@
 import operator
 
 import apt_pkg
-from storm.expr import And, Count, Select
+from storm.expr import (
+    And,
+    Count,
+    Select,
+    )
 
 from canonical.database.constants import UTC_NOW
 from canonical.database.sqlbase import (
@@ -65,17 +69,16 @@
     sqlvalues,
     )
 from canonical.launchpad.interfaces.lpstorm import IMasterStore
-from lp.archivepublisher import ELIGIBLE_DOMINATION_STATES
 from lp.registry.model.sourcepackagename import SourcePackageName
 from lp.soyuz.enums import (
     BinaryPackageFormat,
     PackagePublishingStatus,
     )
+from lp.soyuz.interfaces.publishing import inactive_publishing_status
 from lp.soyuz.model.binarypackagename import BinaryPackageName
 from lp.soyuz.model.binarypackagerelease import BinaryPackageRelease
 from lp.soyuz.model.sourcepackagerelease import SourcePackageRelease
 
-
 # Days before a package will be removed from disk.
 STAY_OF_EXECUTION = 1
 
@@ -98,7 +101,7 @@
 
 
 class Dominator:
-    """ Manage the process of marking packages as superseded.
+    """Manage the process of marking packages as superseded.
 
     Packages are marked as superseded when they become obsolete.
     """
@@ -110,9 +113,8 @@
         new stuff into the distribution but before the publisher
         creates the file lists for apt-ftparchive.
         """
-        self._logger = logger
+        self.logger = logger
         self.archive = archive
-        self.debug = self._logger.debug
 
     def _dominatePublications(self, pubs):
         """Perform dominations for the given publications.
@@ -121,39 +123,43 @@
             publication must be PUBLISHED or PENDING, and the first in each
             list will be treated as dominant (so should be the latest).
         """
-        self.debug("Dominating packages...")
+        self.logger.debug("Dominating packages...")
 
         for name in pubs.keys():
             assert pubs[name], (
                 "Empty list of publications for %s" % name)
             for pubrec in pubs[name][1:]:
-                pubrec.supersede(pubs[name][0], self)
+                pubrec.supersede(pubs[name][0], logger=self.logger)
 
     def _sortPackages(self, pkglist, is_source=True):
-        # pkglist is a list of packages with the following
-        #  * sourcepackagename or packagename as appropriate
-        #  * version
-        #  * status
-        # Don't care about any other attributes
+        """Map out packages by name, and sort by descending version.
+
+        :param pkglist: An iterable of `SourcePackagePublishingHistory` or
+            `BinaryPackagePublishingHistory`.
+        :param is_source: Whether this call involves source package
+            publications.  If so, work with `SourcePackagePublishingHistory`.
+            If not, work with `BinaryPackagepublishingHistory`.
+        :return: A dict mapping each package name (as UTF-8 encoded string)
+            to a list of publications from `pkglist`, newest first.
+        """
+        self.logger.debug("Sorting packages...")
+
+        if is_source:
+            get_release = operator.attrgetter("sourcepackagerelease")
+            get_name = operator.attrgetter("sourcepackagename")
+        else:
+            get_release = operator.attrgetter("binarypackagerelease")
+            get_name = operator.attrgetter("binarypackagename")
+
         outpkgs = {}
-
-        self.debug("Sorting packages...")
-
-        attr_prefix = 'source' if is_source else 'binary'
-        get_release = operator.attrgetter(attr_prefix + 'packagerelease')
-        get_name = operator.attrgetter(attr_prefix + 'packagename')
-
         for inpkg in pkglist:
-            L = outpkgs.setdefault(
-                get_name(get_release(inpkg)).name.encode('utf-8'), [])
-            L.append(inpkg)
+            key = get_name(get_release(inpkg)).name.encode('utf-8')
+            outpkgs.setdefault(key, []).append(inpkg)
 
-        for pkgname in outpkgs:
-            if len(outpkgs[pkgname]) > 1:
-                outpkgs[pkgname].sort(
-                    functools.partial(
-                        _compare_packages_by_version_and_date, get_release))
-                outpkgs[pkgname].reverse()
+        sort_order = functools.partial(
+            _compare_packages_by_version_and_date, get_release)
+        for package_pubs in outpkgs.itervalues():
+            package_pubs.sort(cmp=sort_order, reverse=True)
 
         return outpkgs
 
@@ -188,9 +194,10 @@
         # Avoid circular imports.
         from lp.soyuz.model.publishing import (
             BinaryPackagePublishingHistory,
-            SourcePackagePublishingHistory)
+            SourcePackagePublishingHistory,
+            )
 
-        self.debug("Beginning superseded processing...")
+        self.logger.debug("Beginning superseded processing...")
 
         # XXX: dsilvers 2005-09-22 bug=55030:
         # Need to make binaries go in groups but for now this'll do.
@@ -215,10 +222,10 @@
         # then we can consider them eligible for removal.
         for pub_record in binary_records:
             binpkg_release = pub_record.binarypackagerelease
-            self.debug("%s/%s (%s) has been judged eligible for removal" %
-                       (binpkg_release.binarypackagename.name,
-                        binpkg_release.version,
-                        pub_record.distroarchseries.architecturetag))
+            self.logger.debug(
+                "%s/%s (%s) has been judged eligible for removal",
+                binpkg_release.binarypackagename.name, binpkg_release.version,
+                pub_record.distroarchseries.architecturetag)
             self._setScheduledDeletionDate(pub_record)
             # XXX cprov 20070820: 'datemadepending' is useless, since it's
             # always equals to "scheduleddeletiondate - quarantine".
@@ -262,28 +269,29 @@
                     continue
 
             # Okay, so there's no unremoved binaries, let's go for it...
-            self.debug(
-                "%s/%s (%s) source has been judged eligible for removal" %
-                (srcpkg_release.sourcepackagename.name,
-                 srcpkg_release.version, pub_record.id))
+            self.logger.debug(
+                "%s/%s (%s) source has been judged eligible for removal",
+                srcpkg_release.sourcepackagename.name, srcpkg_release.version,
+                pub_record.id)
             self._setScheduledDeletionDate(pub_record)
             # XXX cprov 20070820: 'datemadepending' is pointless, since it's
             # always equals to "scheduleddeletiondate - quarantine".
             pub_record.datemadepending = UTC_NOW
 
-    def judgeAndDominate(self, dr, pocket):
-        """Perform the domination and superseding calculations
+    def dominateBinaries(self, distroseries, pocket):
+        """Perform domination on binary package publications.
 
-        It only works across the distroseries and pocket specified.
+        Dominates binaries, restricted to `distroseries`, `pocket`, and
+        `self.archive`.
         """
         # Avoid circular imports.
-        from lp.soyuz.model.publishing import (
-             BinaryPackagePublishingHistory,
-             SourcePackagePublishingHistory)
+        from lp.soyuz.model.publishing import BinaryPackagePublishingHistory
 
-        for distroarchseries in dr.architectures:
-            self.debug("Performing domination across %s/%s (%s)" % (
-                dr.name, pocket.title, distroarchseries.architecturetag))
+        for distroarchseries in distroseries.architectures:
+            self.logger.debug(
+                "Performing domination across %s/%s (%s)",
+                distroseries.name, pocket.title,
+                distroarchseries.architecturetag)
 
             bpph_location_clauses = And(
                 BinaryPackagePublishingHistory.status ==
@@ -313,15 +321,24 @@
                 BinaryPackageRelease.binpackageformat !=
                     BinaryPackageFormat.DDEB,
                 bpph_location_clauses)
-            self.debug("Dominating binaries...")
+            self.logger.debug("Dominating binaries...")
             self._dominatePublications(self._sortPackages(binaries, False))
 
-        self.debug("Performing domination across %s/%s (Source)" %
-                   (dr.name, pocket.title))
+    def dominateSources(self, distroseries, pocket):
+        """Perform domination on source package publications.
+
+        Dominates sources, restricted to `distroseries`, `pocket`, and
+        `self.archive`.
+        """
+        # Avoid circular imports.
+        from lp.soyuz.model.publishing import SourcePackagePublishingHistory
+        self.logger.debug(
+            "Performing domination across %s/%s (Source)",
+            distroseries.name, pocket.title)
         spph_location_clauses = And(
             SourcePackagePublishingHistory.status ==
                 PackagePublishingStatus.PUBLISHED,
-            SourcePackagePublishingHistory.distroseries == dr,
+            SourcePackagePublishingHistory.distroseries == distroseries,
             SourcePackagePublishingHistory.archive == self.archive,
             SourcePackagePublishingHistory.pocket == pocket,
             )
@@ -343,18 +360,27 @@
             SourcePackageRelease.sourcepackagenameID.is_in(
                 candidate_source_names),
             spph_location_clauses)
-        self.debug("Dominating sources...")
+        self.logger.debug("Dominating sources...")
         self._dominatePublications(self._sortPackages(sources))
         flush_database_updates()
 
+    def judge(self, distroseries, pocket):
+        """Judge superseded sources and binaries."""
+        # Avoid circular imports.
+        from lp.soyuz.model.publishing import (
+             BinaryPackagePublishingHistory,
+             SourcePackagePublishingHistory,
+             )
+
         sources = SourcePackagePublishingHistory.select("""
             sourcepackagepublishinghistory.distroseries = %s AND
             sourcepackagepublishinghistory.archive = %s AND
             sourcepackagepublishinghistory.pocket = %s AND
             sourcepackagepublishinghistory.status IN %s AND
             sourcepackagepublishinghistory.scheduleddeletiondate is NULL
-            """ % sqlvalues(dr, self.archive, pocket,
-                            ELIGIBLE_DOMINATION_STATES))
+            """ % sqlvalues(
+                distroseries, self.archive, pocket,
+                inactive_publishing_status))
 
         binaries = BinaryPackagePublishingHistory.select("""
             binarypackagepublishinghistory.distroarchseries =
@@ -364,11 +390,22 @@
             binarypackagepublishinghistory.pocket = %s AND
             binarypackagepublishinghistory.status IN %s AND
             binarypackagepublishinghistory.scheduleddeletiondate is NULL
-            """ % sqlvalues(dr, self.archive, pocket,
-                            ELIGIBLE_DOMINATION_STATES),
+            """ % sqlvalues(
+                distroseries, self.archive, pocket,
+                inactive_publishing_status),
             clauseTables=['DistroArchSeries'])
 
         self._judgeSuperseded(sources, binaries)
 
-        self.debug("Domination for %s/%s finished" %
-                   (dr.name, pocket.title))
+    def judgeAndDominate(self, distroseries, pocket):
+        """Perform the domination and superseding calculations
+
+        It only works across the distroseries and pocket specified.
+        """
+
+        self.dominateBinaries(distroseries, pocket)
+        self.dominateSources(distroseries, pocket)
+        self.judge(distroseries, pocket)
+
+        self.logger.debug(
+            "Domination for %s/%s finished", distroseries.name, pocket.title)

=== modified file 'lib/lp/soyuz/model/publishing.py'
--- lib/lp/soyuz/model/publishing.py	2011-08-26 09:17:08 +0000
+++ lib/lp/soyuz/model/publishing.py	2011-08-30 07:54:25 +0000
@@ -740,7 +740,7 @@
 
     def supersede(self, dominant=None, logger=None):
         """See `ISourcePackagePublishingHistory`."""
-        assert self.status in [PUBLISHED, PENDING], (
+        assert self.status in active_publishing_status, (
             "Should not dominate unpublished source %s" %
             self.sourcepackagerelease.title)
 
@@ -1120,7 +1120,7 @@
         # tolerate SUPERSEDED architecture-independent binaries, because
         # they are dominated automatically once the first publication is
         # processed.
-        if self.status not in [PUBLISHED, PENDING]:
+        if self.status not in active_publishing_status:
             assert not self.binarypackagerelease.architecturespecific, (
                 "Should not dominate unpublished architecture specific "
                 "binary %s (%s)" % (


Follow ups