← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/bug-844550 into lp:launchpad

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/bug-844550 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #844550 in Launchpad itself: "Retire gina transitional domination"
  https://bugs.launchpad.net/launchpad/+bug/844550

For more details, see:
https://code.launchpad.net/~jtv/launchpad/bug-844550/+merge/75150

= Summary =

We're rolling out “transitional Gina domination.”  This means that Gina will perform source domination on the imported Debian packages: publications of package versions that are no longer in Debian's Sources lists will become Superseded by their newer versions.  Where the very newest versions are deleted from the Sources lists for whatever reason, the publication records in our database will be marked Deleted.  (Also, Debian source publications are no longer nonsensically kept Pending; they will be Published immediately upon import.)

But what happens when a package no longer occurs in the Debian Sources list at all?  We have tons of these still lying around for old packages.  For the historical cases, we'll assume that the last known release for a package superseded all others.  So the last known release should end up Deleted and all its older ones Superseded.  In the future, when a package disappears from Debian's Sources list, its remaining active publications should be considered Deleted.

To make a smooth transition, transitional domination does the one-off jobs: mark any remaining Pending publications for Debian as Published (the bulk of the work was done in custom SQL), and for any package that isn't in the Sources lists, pick the latest version as “live.”  That last version stays Published; the older ones become Deleted.

And then there's this branch.  Once transitional domination has run, disposing of all the historical data, we can start marking publications for packages that aren't in the Sources at all as Deleted.  That includes, for the historical data, that one last release that transitional domination had kept Published.


== Pre-implementation notes ==

Discussed in great detail with wgrant.  Unfortunately the transition to making Debian source package publications Published (rather than Pending) broke some scripts the Ubuntu people were running, which explicitly looked for the doubtful Pending status.  No multi-status filtering is available, so the easiest fix was to steam ahead to the more desirable situation and assume that all new publications are Published instead.


== Implementation details ==

Mostly this branch consists of removals of clearly marked transitional code.  Things get a bit simpler, as planned.  But I extended a test for the gina code, to cover a more complete scenario.  There is already an extensive scenario test for the underlying functionality, but the new test code shows in one place how permanent domination behaves over a realistic "chain" of package publications.


== Tests ==

{{{
./bin/test -vvc lp.soyuz.scripts.tests.test_gina
}}}


== Demo and Q/A ==

We've found we can run Gina on dogfood's copy of sid to compare:

    ./scripts/gina.py -v sid

This closely matches what will happen in production, since we also ran transitional Gina domination there.  The only slightly unusual thing there is that the Sources list we have on dogfood is a bit older than the publications we have in the database, so some publications are marked Deleted because they're newer than any listed ones.

A useful thing to do before running domination is to copy all (id, status) from SourcePackagePublishingHistory into a table of its own.  We currently have such a snapshot from before we ran transitional domination on dogfood.

Then, to get an overview of all changes made by gina:

    http://paste.ubuntu.com/687616/

Results after setting all SPPHs for archive 3 (Debian primary) to Published, for a really thorough domination run:

    http://paste.ubuntu.com/687584/

With this new code, a portion of those 18668 Published records should become Deleted.  These should be one per SourcePackageName, and all for source packages that are no longer in the Sources list we have in /src/launchpad.net/gina-mirror/dists/sid/main/source/Sources.gz, or possibly that are in there but only with versions that we don't have.


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/soyuz/scripts/gina/dominate.py
  lib/lp/soyuz/scripts/tests/test_gina.py
-- 
https://code.launchpad.net/~jtv/launchpad/bug-844550/+merge/75150
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/bug-844550 into lp:launchpad.
=== modified file 'lib/lp/soyuz/scripts/gina/dominate.py'
--- lib/lp/soyuz/scripts/gina/dominate.py	2011-09-09 04:57:18 +0000
+++ lib/lp/soyuz/scripts/gina/dominate.py	2011-09-13 10:32:28 +0000
@@ -10,14 +10,7 @@
 
 from zope.component import getUtility
 
-# XXX JeroenVermeulen 2011-09-08, bug=844550: The GeneralizedPublication
-# import violates import policy and elicits a warning from the test
-# suite.  The warning helps remind us to retire this code as soon as
-# possible.
-from lp.archivepublisher.domination import (
-    Dominator,
-    GeneralizedPublication,
-    )
+from lp.archivepublisher.domination import Dominator
 from lp.registry.interfaces.distribution import IDistributionSet
 
 
@@ -48,35 +41,9 @@
     # Dominate packages found in the Sources list we're importing.
     package_names = dominator.findPublishedSourcePackageNames(series, pocket)
     for package_name in package_names:
-        entries = packages_map.src_map.get(package_name)
-
-        if entries is None:
-            # XXX JeroenVermeulen 2011-09-08, bug=844550: This is a
-            # transitional hack.  The database is full of "Published"
-            # Debian SPPHs whose packages have actually been deleted.
-            # In the future such publications should simply be marked
-            # Deleted, but for the legacy baggage we currently carry
-            # around we'll just do traditional domination first: pick
-            # the latest Published version, and mark the rest of the
-            # SPPHs as superseded by that version.  The latest version
-            # will then, finally, be marked appropriately Deleted once
-            # we remove this transitional hack.
-            # To remove the transitional hack, just let live_versions
-            # default to the empty list instead of doing this:
-            pubs = dominator.findPublishedSPPHs(series, pocket, package_name)
-            generalization = GeneralizedPublication(is_source=True)
-            pubs_dict = dominator._sortPackages(pubs, generalization)
-            sorted_pubs = pubs_dict[package_name]
-            if len(sorted_pubs) <= 1:
-                # If there's only one published SPPH, the transitional
-                # code will just leave it Published.  Don't bother; the
-                # migration will be costly enough as it is.
-                continue
-            live_versions = [sorted_pubs[0].sourcepackagerelease.version]
-        else:
-            live_versions = [
-                entry['Version']
-                for entry in entries if 'Version' in entry]
+        entries = packages_map.src_map.get(package_name, [])
+        live_versions = [
+            entry['Version'] for entry in entries if 'Version' in entry]
 
         dominator.dominateRemovedSourceVersions(
             series, pocket, package_name, live_versions)

=== modified file 'lib/lp/soyuz/scripts/tests/test_gina.py'
--- lib/lp/soyuz/scripts/tests/test_gina.py	2011-09-09 04:57:18 +0000
+++ lib/lp/soyuz/scripts/tests/test_gina.py	2011-09-13 10:32:28 +0000
@@ -34,15 +34,51 @@
         # dominate_imported_source_packages dominates the source
         # packages that Gina imports.
         logger = DevNullLogger()
-        pub = self.factory.makeSourcePackagePublishingHistory(
-            status=PackagePublishingStatus.PUBLISHED)
-        series = pub.distroseries
-        spr = pub.sourcepackagerelease
-        package = spr.sourcepackagename
+        series = self.factory.makeDistroSeries()
+        pocket = PackagePublishingPocket.RELEASE
+        package = self.factory.makeSourcePackageName()
+
+        # Realistic situation: there's an older, superseded publication;
+        # a series of active ones; and a newer, pending publication
+        # that's not in the Sources lists yet.
+        # Gina dominates the Published ones and leaves the rest alone.
+        old_spph = self.factory.makeSourcePackagePublishingHistory(
+            distroseries=series, archive=series.main_archive,
+            pocket=pocket, status=PackagePublishingStatus.SUPERSEDED,
+            sourcepackagerelease=self.factory.makeSourcePackageRelease(
+                sourcepackagename=package, version='1.0'))
+
+        active_spphs = [
+            self.factory.makeSourcePackagePublishingHistory(
+                distroseries=series, archive=series.main_archive,
+                pocket=pocket, status=PackagePublishingStatus.PUBLISHED,
+                sourcepackagerelease=self.factory.makeSourcePackageRelease(
+                    sourcepackagename=package, version=version))
+            for version in ['1.1', '1.1.1', '1.1.1.1']]
+
+        new_spph = self.factory.makeSourcePackagePublishingHistory(
+            distroseries=series, archive=series.main_archive,
+            pocket=pocket, status=PackagePublishingStatus.PENDING,
+            sourcepackagerelease=self.factory.makeSourcePackageRelease(
+                sourcepackagename=package, version='1.2'))
+
+        spphs = [old_spph] + active_spphs + [new_spph]
+
+        # Of the active publications, in this scenario, only one version
+        # matches what Gina finds in the Sources list.  It stays
+        # published; older active publications are superseded, newer
+        # ones deleted.
         dominate_imported_source_packages(
-            logger, series.distribution.name, series.name, pub.pocket,
-            FakePackagesMap({package.name: []}))
-        self.assertEqual(PackagePublishingStatus.DELETED, pub.status)
+            logger, series.distribution.name, series.name, pocket,
+            FakePackagesMap({package.name: [{'Version': '1.1.1'}]}))
+        self.assertEqual([
+            PackagePublishingStatus.SUPERSEDED,
+            PackagePublishingStatus.SUPERSEDED,
+            PackagePublishingStatus.PUBLISHED,
+            PackagePublishingStatus.DELETED,
+            PackagePublishingStatus.PENDING,
+            ],
+            [pub.status for pub in spphs])
 
     def test_dominate_imported_source_packages_dominates_deletions(self):
         # dominate_imported_source_packages dominates the source
@@ -58,78 +94,24 @@
                 sourcepackagerelease=self.factory.makeSourcePackageRelease(
                     sourcepackagename=package, version=version))
             for version in ['1.0', '1.1', '1.1a']]
+
+        # In this scenario, 1.0 is a superseded release.
+        pubs[0].supersede()
         logger = DevNullLogger()
         dominate_imported_source_packages(
             logger, series.distribution.name, series.name, pocket,
             FakePackagesMap({}))
-        # XXX JeroenVermeulen 2011-09-08, bug=844550: This is
-        # "transitional" domination which supersedes older versions of
-        # deleted packages with the last known version.  Permanent
-        # domination will then mark the last known version as deleted.
-        # For permanent domination, the expected outcome is that all
-        # these publications will be Deleted (but any pre-existing
-        # Superseded publications for older versions will remain
-        # Superseded).
+
+        # The older, superseded release stays superseded; but the
+        # releases that dropped out of the imported Sources list without
+        # known successors are marked deleted.
         self.assertEqual([
             PackagePublishingStatus.SUPERSEDED,
-            PackagePublishingStatus.SUPERSEDED,
-            PackagePublishingStatus.PUBLISHED,
+            PackagePublishingStatus.DELETED,
+            PackagePublishingStatus.DELETED,
             ],
             [pub.status for pub in pubs])
 
-    def test_dominate_imported_source_packages_cleans_up_pending_spphs(self):
-        # XXX JeroenVermeulen 2011-09-08, bug=844550: For transition to
-        # Gina domination, dominate_imported_source_packages turns any
-        # remaining Pending SPPHS into Published ones.
-        series = self.factory.makeDistroSeries()
-        spph = self.factory.makeSourcePackagePublishingHistory(
-            distroseries=series, archive=series.main_archive,
-            status=PackagePublishingStatus.PENDING)
-        spr = spph.sourcepackagerelease
-        package_name = spr.sourcepackagename.name
-        logger = DevNullLogger()
-        dominate_imported_source_packages(
-            logger, series.distribution.name, series.name, spph.pocket,
-            FakePackagesMap({package_name: [{"Version": spr.version}]}))
-        self.assertEqual(PackagePublishingStatus.PUBLISHED, spph.status)
-
-    def test_dominate_imported_source_packages_cleans_up_first(self):
-        # XXX JeroenVermeulen 2011-09-08, bug=844550: For transition to
-        # Gina domination, dominate_imported_source_packages turns any
-        # remaining Pending SPPHS into Published ones.  It does this
-        # *before* dominating, so no domination happens while some of
-        # the SPPHs are still mistakenly Pending (which would result in
-        # mistaken deletions).
-        series = self.factory.makeDistroSeries()
-        package = self.factory.makeSourcePackageName()
-        pocket = PackagePublishingPocket.RELEASE
-        versions = ['1.0', '1.1']
-        statuses_before = [
-            PackagePublishingStatus.PUBLISHED,
-            PackagePublishingStatus.PENDING,
-            ]
-        statuses_after = [
-            PackagePublishingStatus.SUPERSEDED,
-            PackagePublishingStatus.PUBLISHED,
-            ]
-        live_version = versions[-1]
-        sprs = [
-            self.factory.makeSourcePackageRelease(
-                sourcepackagename=package, version=version)
-            for version in versions]
-        spphs = [
-            self.factory.makeSourcePackagePublishingHistory(
-                archive=series.main_archive, distroseries=series,
-                sourcepackagerelease=spr, pocket=pocket, status=status)
-            for spr, status in zip(sprs, statuses_before)]
-
-        logger = DevNullLogger()
-        dominate_imported_source_packages(
-            logger, series.distribution.name, series.name, pocket,
-            FakePackagesMap({package.name: [{"Version": live_version}]}))
-
-        self.assertEqual(statuses_after, [spph.status for spph in spphs])
-
 
 class TestSourcePackagePublisher(TestCaseWithFactory):