← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/bug-851684 into lp:launchpad

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/bug-851684 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #851684 in Launchpad itself: "Long-running Gina transactions"
  https://bugs.launchpad.net/launchpad/+bug/851684

For more details, see:
https://code.launchpad.net/~jtv/launchpad/bug-851684/+merge/75721

= Summary =

Having production problems with the initial run of transitional gina domination: https://wiki.canonical.com/IncidentReports/2011-09-16-LP-Gina%20hogs%20iron%20during%20initial%20debian%20domination


== Proposed fix ==

Commit much more regularly, and try to perform long client-side sorting (where gina's database connection got idle-killed) without a database open.


== Pre-implementation notes ==

In conjunction with William.  This probably won't fix the transition's humungous memory usage, but it'll make life easier for the ORM, make it more likely that there is at least _some_ progress on each run, and reduce the risk of idle-kill.


== Implementation details ==

This is ugly code.  But it's to be retired immediately after the first runs succeed, so I didn't bother putting imports in the right place etc.

One _sortPackages run is replaced with an O(n) find-the-maximum pass.  And I made the database sort by descending version & creation date, which should make the python-side sort a noop (barring small differences in version interpretation).  The version field is declared as a Debian version string in the database, so that won't be a naïve textual sort.


== Tests ==

{{{
./bin/test -vvc lp.soyuz -t gina
}}}

I'm getting a failure in gina.txt if I run this, which I don't get when I run that test in isolation.  Internal server error; may be a Librarian problem.  Been getting those on other branches since yesterday.


== Demo and Q/A ==

On dogfood: UPDATE SourcePackagePublishingHistory SET status = 2 WHERE archive = 3;

Then run gina and check statuses as usual.  Watch out for long "idle in transaction" and for high memory usage.


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/archivepublisher/domination.py
  lib/lp/soyuz/scripts/gina/dominate.py
  scripts/gina.py
  lib/lp/soyuz/scripts/tests/test_gina.py

./scripts/gina.py
      25: '_pythonpath' imported but unused
-- 
https://code.launchpad.net/~jtv/launchpad/bug-851684/+merge/75721
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/bug-851684 into lp:launchpad.
=== modified file 'lib/lp/archivepublisher/domination.py'
--- lib/lp/archivepublisher/domination.py	2011-09-09 14:30:03 +0000
+++ lib/lp/archivepublisher/domination.py	2011-09-16 11:49:36 +0000
@@ -58,6 +58,7 @@
 from storm.expr import (
     And,
     Count,
+    Desc,
     Select,
     )
 
@@ -524,12 +525,18 @@
         # Avoid circular imports.
         from lp.soyuz.model.publishing import SourcePackagePublishingHistory
 
-        return IStore(SourcePackagePublishingHistory).find(
+        query = IStore(SourcePackagePublishingHistory).find(
             SourcePackagePublishingHistory,
             join_spph_spr(),
             join_spr_spn(),
             SourcePackageName.name == package_name,
             self._composeActiveSourcePubsCondition(distroseries, pocket))
+        # Sort by descending version (SPR.version has type debversion in
+        # the database, so this should be a real proper comparison) so
+        # that _sortPackage will have slightly less work to do later.
+        return query.order_by(
+            Desc(SourcePackageRelease.version),
+            Desc(SourcePackagePublishingHistory.datecreated))
 
     def dominateRemovedSourceVersions(self, distroseries, pocket,
                                       package_name, live_versions):

=== modified file 'lib/lp/soyuz/scripts/gina/dominate.py'
--- lib/lp/soyuz/scripts/gina/dominate.py	2011-09-09 04:57:18 +0000
+++ lib/lp/soyuz/scripts/gina/dominate.py	2011-09-16 11:49:36 +0000
@@ -10,18 +10,11 @@
 
 from zope.component import getUtility
 
-# XXX JeroenVermeulen 2011-09-08, bug=844550: The GeneralizedPublication
-# import violates import policy and elicits a warning from the test
-# suite.  The warning helps remind us to retire this code as soon as
-# possible.
-from lp.archivepublisher.domination import (
-    Dominator,
-    GeneralizedPublication,
-    )
+from lp.archivepublisher.domination import Dominator
 from lp.registry.interfaces.distribution import IDistributionSet
 
 
-def dominate_imported_source_packages(logger, distro_name, series_name,
+def dominate_imported_source_packages(txn, logger, distro_name, series_name,
                                       pocket, packages_map):
     """Perform domination."""
     series = getUtility(IDistributionSet)[distro_name].getSeries(series_name)
@@ -63,16 +56,46 @@
             # we remove this transitional hack.
             # To remove the transitional hack, just let live_versions
             # default to the empty list instead of doing this:
-            pubs = dominator.findPublishedSPPHs(series, pocket, package_name)
-            generalization = GeneralizedPublication(is_source=True)
-            pubs_dict = dominator._sortPackages(pubs, generalization)
-            sorted_pubs = pubs_dict[package_name]
-            if len(sorted_pubs) <= 1:
-                # If there's only one published SPPH, the transitional
-                # code will just leave it Published.  Don't bother; the
-                # migration will be costly enough as it is.
+            import apt_pkg
+            from lp.services.database.bulk import load_related
+            from lp.soyuz.model.sourcepackagerelease import (
+                SourcePackageRelease,
+                )
+            pubs = list(
+                dominator.findPublishedSPPHs(series, pocket, package_name))
+            if len(pubs) <= 1:
+                # Without at least two published SPPHs, the transitional
+                # code will make no changes.  Skip, and leave the
+                # algorithm free to assume there's a pubs[0].
                 continue
-            live_versions = [sorted_pubs[0].sourcepackagerelease.version]
+            load_related(
+                SourcePackageRelease, pubs, ['sourcepackagereleaseID'])
+
+            # Close off the transaction to avoid being idle-killed.
+            # Nothing else is going to supersede or delete these
+            # publications in the meantime, so our data stays valid; and
+            # if new ones become published, they still won't be
+            # considered anyway.
+            txn.commit()
+
+            # Find the "latest" publication.  A purely in-memory
+            # operation; won't open a new transaction.
+            def is_newer(candidate, reference):
+                comparison = apt_pkg.VersionCompare(
+                    candidate.sourcepackagerelease.version,
+                    reference.sourcepackagerelease.version)
+                if comparison > 0:
+                    return True
+                elif comparison < 0:
+                    return False
+                else:
+                    return candidate.datecreated > reference.datecreated
+
+            latest_pub = pubs[0]
+            for pub in pubs[1:]:
+                if is_newer(pub, latest_pub):
+                    latest_pub = pub
+            live_versions = [latest_pub.sourcepackagerelease.version]
         else:
             live_versions = [
                 entry['Version']
@@ -80,3 +103,5 @@
 
         dominator.dominateRemovedSourceVersions(
             series, pocket, package_name, live_versions)
+
+        txn.commit()

=== modified file 'lib/lp/soyuz/scripts/tests/test_gina.py'
--- lib/lp/soyuz/scripts/tests/test_gina.py	2011-09-09 04:57:18 +0000
+++ lib/lp/soyuz/scripts/tests/test_gina.py	2011-09-16 11:49:36 +0000
@@ -19,6 +19,7 @@
     SourcePackageData,
     )
 from lp.testing import TestCaseWithFactory
+from lp.testing.faketransaction import FakeTransaction
 
 
 class FakePackagesMap:
@@ -34,13 +35,14 @@
         # dominate_imported_source_packages dominates the source
         # packages that Gina imports.
         logger = DevNullLogger()
+        txn = FakeTransaction()
         pub = self.factory.makeSourcePackagePublishingHistory(
             status=PackagePublishingStatus.PUBLISHED)
         series = pub.distroseries
         spr = pub.sourcepackagerelease
         package = spr.sourcepackagename
         dominate_imported_source_packages(
-            logger, series.distribution.name, series.name, pub.pocket,
+            txn, logger, series.distribution.name, series.name, pub.pocket,
             FakePackagesMap({package.name: []}))
         self.assertEqual(PackagePublishingStatus.DELETED, pub.status)
 
@@ -59,8 +61,9 @@
                     sourcepackagename=package, version=version))
             for version in ['1.0', '1.1', '1.1a']]
         logger = DevNullLogger()
+        txn = FakeTransaction()
         dominate_imported_source_packages(
-            logger, series.distribution.name, series.name, pocket,
+            txn, logger, series.distribution.name, series.name, pocket,
             FakePackagesMap({}))
         # XXX JeroenVermeulen 2011-09-08, bug=844550: This is
         # "transitional" domination which supersedes older versions of
@@ -88,8 +91,9 @@
         spr = spph.sourcepackagerelease
         package_name = spr.sourcepackagename.name
         logger = DevNullLogger()
+        txn = FakeTransaction()
         dominate_imported_source_packages(
-            logger, series.distribution.name, series.name, spph.pocket,
+            txn, logger, series.distribution.name, series.name, spph.pocket,
             FakePackagesMap({package_name: [{"Version": spr.version}]}))
         self.assertEqual(PackagePublishingStatus.PUBLISHED, spph.status)
 
@@ -124,8 +128,9 @@
             for spr, status in zip(sprs, statuses_before)]
 
         logger = DevNullLogger()
+        txn = FakeTransaction()
         dominate_imported_source_packages(
-            logger, series.distribution.name, series.name, pocket,
+            txn, logger, series.distribution.name, series.name, pocket,
             FakePackagesMap({package.name: [{"Version": live_version}]}))
 
         self.assertEqual(statuses_after, [spph.status for spph in spphs])

=== modified file 'scripts/gina.py'
--- scripts/gina.py	2011-09-15 04:30:02 +0000
+++ scripts/gina.py	2011-09-16 11:49:36 +0000
@@ -118,7 +118,7 @@
 
     # XXX JeroenVermeulen 2011-09-07 bug=843728: Dominate binaries as well.
     dominate_imported_source_packages(
-        log, distro, distroseries, pocket, packages_map)
+        ztm, log, distro, distroseries, pocket, packages_map)
     ztm.commit()
 
     if source_only: