← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/bug-733245 into lp:launchpad

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/bug-733245 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~jtv/launchpad/bug-733245/+merge/53230

= Summary =

The DistroSeriesDifference population script creates DSDs, but leaves base_version uninitialized.  That part needs to be done from python.


== Proposed fix ==

Fire off a DBLoopTuner to fix up any DSDs for a freshly-populated DistroSeries.  The work is likely to take long enough to warrant a DBLoopTuner; that way we know it won't cause undue stress on the backend.

If we enable the feature before the fixup is done, we'll start out with DSDs where it's impossible to request a diff.  But that may be better than leaving the feature disabled until all the work is finished.


== Pre-implementation notes ==

This work is subject to more or less continuous discussion.  In particular in this case, I discussed with StevenK.


== Implementation details ==

A new option --ignore-base-version lets the script ignore base_version.  If desired, we can run the script with this option to create all required DSDs very quickly, and then run it again without the option to fix up the base_versions.

Conversely, there is no mode to skip creation of DSDs; that part is fast and unintrusive enough to run multiple times — especially since it does not update or re-create existing DSDs.  Just detecting which DSDs there should be takes a few seconds at worst.


== Tests ==

{{{
./bin/test -vvc lp.registry.scripts.tests.test_populate_distroseriesdiff
}}}


== Demo and Q/A ==

We'll run this on a full derived distroseries and check that some of the base_versions get updated.  It won't be all, since it's not always possible to determine a usable base_version.


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/registry/scripts/populate_distroseriesdiff.py
  lib/lp/registry/model/distroseriesdifference.py
  lib/lp/registry/scripts/tests/test_populate_distroseriesdiff.py
-- 
https://code.launchpad.net/~jtv/launchpad/bug-733245/+merge/53230
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/bug-733245 into lp:launchpad.
=== modified file 'lib/lp/registry/model/distroseriesdifference.py'
--- lib/lp/registry/model/distroseriesdifference.py	2011-03-10 04:00:08 +0000
+++ lib/lp/registry/model/distroseriesdifference.py	2011-03-14 12:23:49 +0000
@@ -1,4 +1,4 @@
-# Copyright 2010 Canonical Ltd.  This software is licensed under the
+# Copyright 2010-2011 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 """Database classes for a difference between two distribution series."""
@@ -29,7 +29,6 @@
     IMasterStore,
     IStore,
     )
-from lp.archivepublisher.debversion import Version
 from lp.registry.enum import (
     DistroSeriesDifferenceStatus,
     DistroSeriesDifferenceType,

=== modified file 'lib/lp/registry/scripts/populate_distroseriesdiff.py'
--- lib/lp/registry/scripts/populate_distroseriesdiff.py	2011-03-04 05:50:01 +0000
+++ lib/lp/registry/scripts/populate_distroseriesdiff.py	2011-03-14 12:23:49 +0000
@@ -33,9 +33,11 @@
     DistroSeriesDifferenceStatus,
     DistroSeriesDifferenceType,
     )
+from canonical.launchpad.utilities.looptuner import TunableLoop
 from lp.registry.interfaces.distribution import IDistributionSet
 from lp.registry.interfaces.pocket import PackagePublishingPocket
 from lp.registry.model.distroseries import DistroSeries
+from lp.registry.model.distroseriesdifference import DistroSeriesDifference
 from lp.services.scripts.base import LaunchpadScript
 from lp.soyuz.interfaces.publishing import active_publishing_status
 
@@ -188,7 +190,7 @@
     store.execute("DROP TABLE IF EXISTS %s" % quote_identifier(table))
 
 
-def populate_distroseriesdiff(derived_distroseries):
+def populate_distroseriesdiff(logger, derived_distroseries):
     """Compare `derived_distroseries` to parent, and register differences.
 
     The differences are registered by creating `DistroSeriesDifference`
@@ -201,6 +203,9 @@
     store.execute("CREATE TEMP TABLE %s AS %s" % (
         quote_identifier(temp_table),
         compose_sql_find_differences(derived_distroseries)))
+    logger.info(
+        "Found %d potential difference(s).",
+        store.execute("SELECT count(*) FROM %s" % temp_table).get_one()[0])
     store.execute(
         compose_sql_populate_distroseriesdiff(
             derived_distroseries, temp_table))
@@ -221,6 +226,59 @@
             (DistroSeries.parent_seriesID, DistroSeries.id))
 
 
+class BaseVersionFixer(TunableLoop):
+    """Fix up `DistroSeriesDifference.base_version` in the database.
+
+    The code that creates `DistroSeriesDifference`s does not set the
+    `base_version`.  In cases where there may actually be a real base
+    version, this needs to be fixed up.
+
+    Since this is likely to be much, much slower than the rest of the
+    work of creating and initializing `DistroSeriesDifference`s, it is
+    done in a `DBLoopTuner`.
+    """
+
+    def __init__(self, log, store, commit, ids):
+        """See `TunableLoop`.
+
+        :param log: A logger.
+        :param store: Database store to work on.
+        :param commit: A commit function to call after each batch.
+        :param ids: Sequence of `DistroSeriesDifference` ids to fix.
+        """
+        super(BaseVersionFixer, self).__init__(log)
+        self.minimum_chunk_size = 2
+        self.maximum_chunk_size = 1000
+        self.store = store
+        self.commit = commit
+        self.ids = sorted(ids, reverse=True)
+
+    def isDone(self):
+        """See `ITunableLoop`."""
+        return len(self.ids) == 0
+
+    def _cutChunk(self, chunk_size):
+        """Cut a chunk of up to `chunk_size` items from the remaining work.
+
+        Removes the items to be processed in this chunk from the list of
+        remaining work, and returns those.
+        """
+        todo = self.ids[-chunk_size:]
+        self.ids = self.ids[:-chunk_size]
+        return todo
+
+    def _getBatch(self, ids):
+        """Retrieve a batch of `DistroSeriesDifference`s with given ids."""
+        return self.store.find(
+            DistroSeriesDifference, DistroSeriesDifference.id.is_in(ids))
+
+    def __call__(self, chunk_size):
+        """See `ITunableLoop`."""
+        for dsd in self._getBatch(self._cutChunk(int(chunk_size))):
+            dsd._updateBaseVersion()
+        self.commit()
+
+
 class PopulateDistroSeriesDiff(LaunchpadScript):
     """Populate `DistroSeriesDifference` for pre-existing differences."""
 
@@ -264,7 +322,10 @@
     def processDistroSeries(self, distroseries):
         """Generate `DistroSeriesDifference`s for `distroseries`."""
         self.logger.info("Looking for differences in %s.", distroseries)
-        populate_distroseriesdiff(distroseries)
+        populate_distroseriesdiff(self.logger, distroseries)
+        self.commit()
+        self.logger.info("Updating base_versions.")
+        self.fixBaseVersions(distroseries)
         self.commit()
         self.logger.info("Done with %s.", distroseries)
 
@@ -288,10 +349,10 @@
         specified_series = (self.options.series is not None)
         if specified_distro != specified_series:
             raise OptionValueError(
-                "Specify neither a distribution or a series, or both.")
+                "Specify both a distribution and a series, or use --all.")
         if specified_distro == self.options.all:
             raise OptionValueError(
-                "Either specify a distribution series, or use --all.")
+                "Either specify a distribution and series, or use --all.")
 
     def main(self):
         """Do the script's work."""
@@ -307,3 +368,26 @@
 
         for series in self.getDistroSeries():
             self.processDistroSeries(series)
+
+    def fixBaseVersions(self, distroseries):
+        """Fix up `DistroSeriesDifference.base_version` where appropriate.
+
+        The `DistroSeriesDifference` records we create don't have their
+        `base_version` fields set yet.  This is a shame because it's the
+        only thing we need to figure out python-side.
+
+        Only instances where the source package is published in both the
+        parent series and the derived series need to have this done.
+        """
+        self.logger.info(
+            "Fixing up base_versions for %s.", distroseries.title)
+        store = IStore(distroseries)
+        dsd_ids = store.find(
+            DistroSeriesDifference.id,
+            DistroSeriesDifference.derived_series == distroseries,
+            DistroSeriesDifference.status ==
+                DistroSeriesDifferenceStatus.NEEDS_ATTENTION,
+            DistroSeriesDifference.difference_type ==
+                DistroSeriesDifferenceType.DIFFERENT_VERSIONS,
+            DistroSeriesDifference.base_version == None)
+        BaseVersionFixer(self.logger, store, self.commit, dsd_ids).run()

=== modified file 'lib/lp/registry/scripts/tests/test_populate_distroseriesdiff.py'
--- lib/lp/registry/scripts/tests/test_populate_distroseriesdiff.py	2011-03-11 00:29:38 +0000
+++ lib/lp/registry/scripts/tests/test_populate_distroseriesdiff.py	2011-03-14 12:23:49 +0000
@@ -24,6 +24,7 @@
 from lp.registry.interfaces.pocket import PackagePublishingPocket
 from lp.registry.model.distroseriesdifference import DistroSeriesDifference
 from lp.registry.scripts.populate_distroseriesdiff import (
+    BaseVersionFixer,
     compose_sql_difference_type,
     compose_sql_find_latest_source_package_releases,
     compose_sql_find_differences,
@@ -42,7 +43,10 @@
     )
 from lp.soyuz.model.archive import Archive
 from lp.soyuz.enums import ArchivePurpose
-from lp.testing import TestCaseWithFactory
+from lp.testing import (
+    TestCase,
+    TestCaseWithFactory,
+    )
 from lp.testing.fakemethod import FakeMethod
 
 
@@ -409,7 +413,7 @@
     def test_creates_distroseriesdifference(self):
         distroseries = self.makeDerivedDistroSeries()
         spph = self.makeSPPH(distroseries=distroseries)
-        populate_distroseriesdiff(distroseries)
+        populate_distroseriesdiff(DevNullLogger(), distroseries)
         store = Store.of(distroseries)
         dsd = self.getDistroSeriesDiff(distroseries).one()
         spr = spph.sourcepackagerelease
@@ -445,6 +449,50 @@
         self.assertEqual(existing_versions['derived'], dsd.source_version)
 
 
+class FakeDSD:
+    _updateBaseVersion = FakeMethod()
+
+
+class TestBaseVersionFixer(TestCase):
+    """Test the poignant parts of `BaseVersionFixer`."""
+
+    def makeFixer(self, ids):
+        fixer = BaseVersionFixer(DevNullLogger(), None, FakeMethod(), ids)
+        fixer._getBatch = FakeMethod()
+        return fixer
+
+    def test_isDone_is_done_when_ids_is_empty(self):
+        self.assertTrue(self.makeFixer([]).isDone())
+
+    def test_isDone_is_not_done_until_ids_is_empty(self):
+        self.assertFalse(self.makeFixer([1]).isDone())
+
+    def test_cutChunk_one_cuts_exactly_one(self):
+        fixer = self.makeFixer(range(3))
+        chunk = fixer._cutChunk(1)
+        self.assertEqual([0], chunk)
+        self.assertEqual(3 - 1, len(fixer.ids))
+
+    def test_cutChunk_over_remaining_size_completes_loop(self):
+        fixer = self.makeFixer(range(3))
+        chunk = fixer._cutChunk(100)
+        self.assertContentEqual(range(3), chunk)
+        self.assertEqual([], fixer.ids)
+
+    def test_updatesBaseVersion(self):
+        fake_dsd = FakeDSD()
+        fixer = self.makeFixer([fake_dsd])
+        fixer._getBatch.result = fixer.ids
+        fixer(1)
+        self.assertNotEqual(0, fake_dsd._updateBaseVersion.call_count)
+
+    def test_loop_commits(self):
+        fixer = self.makeFixer([FakeDSD()])
+        fixer._getBatch = FakeMethod(result=fixer.ids)
+        fixer(1)
+        self.assertNotEqual(0, fixer.commit.call_count)
+
+
 class TestPopulateDistroSeriesDiffScript(TestCaseWithFactory, FactoryHelper):
     """Test the `populate-distroseriesdiff` script."""
 
@@ -522,3 +570,39 @@
         expected_series_name = "%s %s" % (
             spph.distroseries.distribution.name, spph.distroseries.name)
         self.assertIn(expected_series_name, script.logger.getLogBuffer())
+
+    def test_calls_fixBaseVersions(self):
+        distroseries = self.makeDerivedDistroSeries()
+        self.makeSPPH(distroseries=distroseries)
+        script = self.makeScript([
+            '--distribution', distroseries.distribution.name,
+            '--series', distroseries.name,
+            ])
+        script.fixBaseVersions = FakeMethod()
+        script.main()
+        self.assertEqual(
+            [((distroseries,), {})], script.fixBaseVersions.calls)
+
+    def test_fixes_base_versions(self):
+        distroseries = self.makeDerivedDistroSeries()
+        package = self.factory.makeSourcePackageName()
+        derived_spr = self.factory.makeSourcePackageRelease(
+            distroseries=distroseries, sourcepackagename=package)
+        self.makeSPPH(
+            distroseries=distroseries,
+            sourcepackagerelease=derived_spr)
+        parent_spr = self.factory.makeSourcePackageRelease(
+            distroseries=distroseries.parent_series,
+            sourcepackagename=package)
+        self.makeSPPH(
+            distroseries=distroseries.parent_series,
+            sourcepackagerelease=parent_spr)
+        script = self.makeScript([
+            '--distribution', distroseries.distribution.name,
+            '--series', distroseries.name,
+            ])
+        script.main()
+        dsd = self.getDistroSeriesDiff(distroseries)[0]
+        dsd._updateBaseVersion = FakeMethod()
+        script.fixBaseVersions(distroseries)
+        self.assertEqual(1, dsd._updateBaseVersion.call_count)


Follow ups