← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/bug-623391 into lp:launchpad

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/bug-623391 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers): code
Related bugs:
  #623391 Create DistroSeriesDifferences via script
  https://bugs.launchpad.net/bugs/623391

For more details, see:
https://code.launchpad.net/~jtv/launchpad/bug-623391/+merge/52114

= Bug 623391: Populate DistroSeriesDifference =

A DistroSeriesDifference is a record of the fact that between a derived distroseries in one distribution, and its parent distroseries in another distribution, a source package differs.  It can exist in the parent series but not yet in the derived series, or it can be specific (or "unique" as the enum calls it, but that may be a bit misleading) to the derived series.  Or it can exist in both, but in different versions.

Soyuz will soon automatically create DistroSeriesDifferenceJob records as distributions change, and those jobs will be processed by a cron job to create DistroSeriesDifference records.  All DistroSeriesDifference records get created in an initial "Needs Attention" state, and a second cron job will update them with full, detailed difference information that's harder to compute.  Finally, the derived distribution's owners will review the differences and act on them.  Julian has drawn up a diagram of this:

    http://people.canonical.com/~ed/DerivedDistrosDataFlow.jpeg

What the branch you're looking at does is create DistroSeriesDifferences for differences that already existed before the code starts producing DistroSeriesDifferenceJobs for new changes.  You'll find a script that creates DistroSeriesDifferences directly based on SourcePackageReleases, without going through the job system.

So, what if this runs when the job system has already been active and some DistroSeriesDifferences already exist?  When the script finds those, it will assume that the job system is taking care of changes as they come in, and leave them unchanged.  Thus it won't interfere with ongoing processing of new changes.

In principle, it's an easy job: find the latest SourcePackagePublishingHistory for each SourcePackageName in a given derived DistroSeries.  Do the same for its parent.  For each of the SourcePackageNames found in either, compare versions.  If there is a difference (including one or the other side not having a SourcePackagePublishingHistory for the SourcePackageName at all), create a DistroSeriesDifference.  There is no need to track transitive differences when distroseries are derived from other derived distroseries.

Doing all that in python loops could be tortuously slow, but it turned out to be easy enough to do completely in SQL without even sending any of the data to the client.  I don't recall ever using a full outer join before, but I've always sort of wanted to.  It came in very handy in this case.  On a warm cache, I've seen it handle a sizable difference (between Debian Squeeze and Ubuntu Maverick Meerkat, released 4 months apart) in just one or two seconds.  It found about twenty thousand differences in that time.  If we wanted, though we probably don't, we could afford to re-run this query frequently instead of registering changes from the application code.

The overall query was not hugely large or complex, but it broke down naturally into several components.  In my branch I have separate functions to generate the SQL for those components, and unit-test the query components separately.

One other concern was write traffic to the database.  I asked Stuart: producing batches of 20,000 or so records is no problem as long as nothing else is working on the same table.  We can achieve that by stopping the cron jobs for the duration of the populating script.  We'll also want to try out how this performs first; a dry-run option would help get useful measurements and per-distroseries intermediate commits would help ensure progress.

I added a tests module to lp.registry.scripts.  To test,
{{{
./bin/test -vvc lp.registry.scripts.tests
}}}


The only lint was the usual warning about _pythonpath in the script file itself.


Jeroen
-- 
https://code.launchpad.net/~jtv/launchpad/bug-623391/+merge/52114
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/bug-623391 into lp:launchpad.
=== added file 'lib/lp/registry/scripts/populate_distroseriesdiff.py'
--- lib/lp/registry/scripts/populate_distroseriesdiff.py	1970-01-01 00:00:00 +0000
+++ lib/lp/registry/scripts/populate_distroseriesdiff.py	2011-03-03 19:56:39 +0000
@@ -0,0 +1,256 @@
+# Copyright 2011 Canonical Ltd.  This software is licensed under the
+# GNU Affero General Public License version 3 (see the file LICENSE).
+
+"""Populate `DistroSeriesDifference` table.
+
+This script creates `DistroSeriesDifference` entries for the package
+version differences between a derived `DistroSeries` and its parent.
+
+The entries will still need to be processed by the cron job that works
+out the exact differences.  Any pre-existing `DistroSeriesDifference`
+entries remain untouched.
+"""
+
+__metaclass__ = type
+__all__ = [
+    'PopulateDistroSeriesDiff',
+    ]
+
+from optparse import (
+    Option,
+    OptionValueError,
+    )
+from storm.info import ClassAlias
+from zope.component import getUtility
+
+from canonical.database.sqlbase import (
+    quote,
+    quote_identifier,
+    )
+from canonical.launchpad.interfaces.lpstorm import IStore
+from lp.registry.enum import (
+    DistroSeriesDifferenceStatus,
+    DistroSeriesDifferenceType,
+    )
+from lp.registry.interfaces.distribution import IDistributionSet
+from lp.registry.interfaces.pocket import PackagePublishingPocket
+from lp.registry.model.distroseries import DistroSeries
+from lp.services.scripts.base import LaunchpadScript
+from lp.soyuz.interfaces.publishing import active_publishing_status
+
+
+def compose_sql_find_latest_source_package_releases(distroseries):
+    """Produce SQL that gets the last-published `SourcePackageRelease`s.
+
+    Within `distroseries`, looks for the `SourcePackageRelease`
+    belonging to each respective `SourcePackageName`'s respective latest
+    `SourcePackagePublishingHistory`.
+
+    For each of those, it produces a tuple consisting of:
+     * `SourcePackageName` id: sourcepackagename
+     * `SourcePackageRelease` id: sourcepackagerelease
+     * Source package version: version.
+
+    :return: SQL query, as a string.
+    """
+    parameters = {
+        'active_status': quote(active_publishing_status),
+        'distroseries': quote(distroseries),
+        'main_archive': quote(distroseries.distribution.main_archive),
+        'release_pocket': quote(PackagePublishingPocket.RELEASE),
+    }
+    return """
+        SELECT DISTINCT ON (SPR.sourcepackagename)
+            SPR.sourcepackagename,
+            SPR.id As sourcepackagerelease,
+            SPR.version
+        FROM SourcePackagePublishingHistory AS SPPH
+        JOIN SourcePackageRelease AS SPR ON SPR.id = SPPH.sourcepackagerelease
+        WHERE
+            SPPH.distroseries = %(distroseries)s AND
+            SPPH.archive = %(main_archive)s AND
+            SPPH.pocket = %(release_pocket)s AND
+            SPPH.status IN %(active_status)s
+        ORDER BY SPR.sourcepackagename, SPPH.id DESC
+        """ % parameters
+
+
+def compose_sql_find_differences(derived_distroseries):
+    """Produce SQL that finds differences for a `DistroSeries`.
+
+    The query compares `derived_distroseries` and its `parent_series`
+    and for each package whose latest `SourcePackageRelease`s in the
+    respective series differ, produces a tuple of:
+     * `SourcePackageName` id: sourcepackagename
+     * Source package version in derived series: source_version
+     * Source package version in parent series: parent_source_version.
+
+    :return: SQL query, as a string.
+    """
+    parameters = {
+        'derived_query': compose_sql_find_latest_source_package_releases(
+            derived_distroseries),
+        'parent_query': compose_sql_find_latest_source_package_releases(
+            derived_distroseries.parent_series),
+    }
+    return """
+        SELECT DISTINCT
+            COALESCE(
+                parent.sourcepackagename,
+                derived.sourcepackagename) AS sourcepackagename,
+            derived.version AS source_version,
+            parent.version AS parent_source_version
+        FROM (%(parent_query)s) AS parent
+        FULL OUTER JOIN (%(derived_query)s) AS derived
+        ON derived.sourcepackagename = parent.sourcepackagename
+        WHERE
+            derived.sourcepackagerelease IS DISTINCT FROM
+                parent.sourcepackagerelease
+        """ % parameters
+
+
+def compose_sql_difference_type():
+    """Produce SQL to compute a difference's `DistroSeriesDifferenceType`.
+
+    Works with the parent_source_version and source_version fields as
+    produced by the SQL from `compose_sql_find_differences`.
+
+    :return: SQL query, as a string.
+    """
+    parameters = {
+        'unique_to_derived_series': quote(
+            DistroSeriesDifferenceType.UNIQUE_TO_DERIVED_SERIES),
+        'missing_from_derived_series': quote(
+            DistroSeriesDifferenceType.MISSING_FROM_DERIVED_SERIES),
+        'different_versions': quote(
+            DistroSeriesDifferenceType.DIFFERENT_VERSIONS),
+    }
+    return """
+        CASE
+            WHEN parent_source_version IS NULL THEN
+                %(unique_to_derived_series)s
+            WHEN source_version IS NULL THEN
+                %(missing_from_derived_series)s
+            ELSE %(different_versions)s
+        END
+        """ % parameters
+
+
+def compose_sql_populate_distroseriesdiff(derived_distroseries, temp_table):
+    """Create `DistroSeriesDifference` rows based on found differences.
+
+    Uses field values that describe the difference, as produced by the
+    SQL from `compose_sql_find_differences`:
+     * sourcepackagename
+     * source_version
+     * parent_source_version
+
+    Existing `DistroSeriesDifference` rows are not affected.
+
+    :param derived_distroseries: A derived `DistroSeries`.
+    :param temp_table: The name of a table to select the input fields
+        from.
+    :return: SQL query, as a string.
+    """
+    parameters = {
+        'derived_series': quote(derived_distroseries),
+        'difference_type_expression': compose_sql_difference_type(),
+        'needs_attention': quote(
+            DistroSeriesDifferenceStatus.NEEDS_ATTENTION),
+        'temp_table': quote_identifier(temp_table),
+    }
+    return """
+        INSERT INTO DistroSeriesDifference (
+            derived_series,
+            source_package_name,
+            status,
+            difference_type,
+            source_version,
+            parent_source_version)
+        SELECT
+            %(derived_series)s,
+            sourcepackagename,
+            %(needs_attention)s,
+            %(difference_type_expression)s,
+            source_version,
+            parent_source_version
+        FROM %(temp_table)s
+        WHERE sourcepackagename NOT IN (
+            SELECT source_package_name
+            FROM DistroSeriesDifference
+            WHERE derived_series = %(derived_series)s)
+        """ % parameters
+
+
+def drop_table(store, table):
+    """Drop `table`, if it exists."""
+    store.execute("DROP TABLE IF EXISTS %s" % quote_identifier(table))
+
+
+def populate_distroseriesdiff(derived_distroseries):
+    """Compare `derived_distroseries` to parent, and register differences.
+
+    The differences are registered by creating `DistroSeriesDifference`
+    records, insofar as they do not yet exist.
+    """
+    temp_table = "temp_potentialdistroseriesdiff"
+
+    store = IStore(derived_distroseries)
+    drop_table(store, temp_table)
+    store.execute("CREATE TEMP TABLE %s AS %s" % (
+        quote_identifier(temp_table),
+        compose_sql_find_differences(derived_distroseries)))
+    store.execute(
+        compose_sql_populate_distroseriesdiff(
+            derived_distroseries, temp_table))
+    drop_table(store, temp_table)
+
+
+def find_derived_series():
+    """Find all derived `DistroSeries`.
+
+    Derived `DistroSeries` are ones that have a `parent_series`, but
+    where the `parent_series` is not in the same distribution.
+    """
+    Parent = ClassAlias(DistroSeries, "Parent")
+    return IStore(DistroSeries).find(
+        DistroSeries,
+        Parent.id == DistroSeries.parent_seriesID,
+        Parent.distributionID != DistroSeries.distributionID)
+
+
+class PopulateDistroSeriesDiff(LaunchpadScript):
+
+    def add_my_options(self):
+        self.parser.add_options([
+            Option(
+                '-a', '--all', dest='all', action='store_true', default=False,
+                help="Populate all derived distribution series."),
+            Option(
+                '-d', '--distribution', dest='distribution', default=None,
+                help="Derived distribution."),
+            Option('-s', '--series', dest='series', default=None,
+                help="Derived distribution series.")])
+
+    def getDistroSeries(self):
+        if self.options.all:
+            return list(find_derived_series())
+        else:
+            distro = getUtility(IDistributionSet).getByName(
+                self.options.distribution)
+            return [distro.getSeries(self.options.series)]
+
+    def main(self):
+        specified_distro = (self.options.distribution is not None)
+        specified_series = (self.options.series is not None)
+        if specified_distro != specified_series:
+            raise OptionValueError(
+                "Specify neither a distribution or a series, or both.")
+        if specified_distro == self.options.all:
+            raise OptionValueError(
+                "Either specify a distribution series, or use --all.")
+
+        self.distroseries = self.getDistroSeries()
+        for series in self.distroseries:
+            self.logger.info("Looking for differences in %s.", series)
+            populate_distroseriesdiff(series)

=== added directory 'lib/lp/registry/scripts/tests'
=== added file 'lib/lp/registry/scripts/tests/__init__.py'
=== added file 'lib/lp/registry/scripts/tests/test_populate_distroseriesdiff.py'
--- lib/lp/registry/scripts/tests/test_populate_distroseriesdiff.py	1970-01-01 00:00:00 +0000
+++ lib/lp/registry/scripts/tests/test_populate_distroseriesdiff.py	2011-03-03 19:56:39 +0000
@@ -0,0 +1,468 @@
+# Copyright 2011 Canonical Ltd.  This software is licensed under the
+# GNU Affero General Public License version 3 (see the file LICENSE).
+
+"""Test the populate-distroseriesdiff script."""
+
+__metaclass__ = type
+
+from storm.store import Store
+
+from canonical.database.sqlbase import (
+    cursor,
+    quote,
+    )
+from canonical.testing.layers import (
+    DatabaseFunctionalLayer,
+    ZopelessDatabaseLayer,
+    )
+from lp.registry.enum import (
+    DistroSeriesDifferenceStatus,
+    DistroSeriesDifferenceType,
+    )
+from lp.registry.interfaces.pocket import PackagePublishingPocket
+from lp.registry.model.distroseriesdifference import DistroSeriesDifference
+from lp.registry.scripts.populate_distroseriesdiff import (
+    compose_sql_difference_type,
+    compose_sql_find_latest_source_package_releases,
+    compose_sql_find_differences,
+    compose_sql_populate_distroseriesdiff,
+    find_derived_series,
+    populate_distroseriesdiff,
+    PopulateDistroSeriesDiff,
+    )
+from lp.services.log.logger import DevNullLogger
+from lp.soyuz.interfaces.publishing import (
+    active_publishing_status,
+    inactive_publishing_status,
+    )
+from lp.soyuz.model.archive import Archive
+from lp.soyuz.enums import ArchivePurpose
+from lp.testing import TestCaseWithFactory
+
+
+class FactoryHelper:
+    """Some helper methods for making stuff that only make sense here."""
+
+    def getArchive(self, distribution, purpose):
+        """Get an existing `Archive`, or create one."""
+        archive = Store.of(distribution).find(
+            Archive,
+            Archive.distribution == distribution,
+            Archive.purpose == purpose).any()
+        if archive is not None:
+            return archive
+        return self.factory.makeArchive(
+            distribution=distribution, purpose=purpose)
+
+    def makeSPPH(self, distroseries=None, archive_purpose=None,
+                 pocket=PackagePublishingPocket.RELEASE, status=None,
+                 sourcepackagerelease=None):
+        """Create a `SourcePackagePublishingHistory` for derivation.
+
+        Has slightly different defaults from the `LaunchpadObjectFactory`
+        method for this, so that the SPPH will be picked up as a
+        `DistroSeriesDifference`.
+        """
+        if distroseries is None:
+            distroseries = self.factory.makeDistroSeries()
+
+        if archive_purpose is None:
+            archive = None
+        else:
+            archive = self.getArchive(
+                distroseries.distribution, archive_purpose)
+
+        return self.factory.makeSourcePackagePublishingHistory(
+            pocket=pocket, distroseries=distroseries, archive=archive,
+            status=status, sourcepackagerelease=sourcepackagerelease)
+
+    def makeDerivedDistroSeries(self):
+        """Create a `DistroSeries` that's derived from another distro."""
+        return self.factory.makeDistroSeries(
+            parent_series=self.factory.makeDistroSeries())
+
+    def getDistroSeriesDiff(self, distroseries):
+        """Find the `DistroSeriesDifference` records for `distroseries`."""
+        return Store.of(distroseries).find(
+            DistroSeriesDifference,
+            DistroSeriesDifference.derived_series == distroseries)
+
+
+class TestFindLatestSourcePackageReleases(TestCaseWithFactory, FactoryHelper):
+    """Test finding of latest `SourcePackageRelease`s for a series' packages.
+    """
+
+    layer = ZopelessDatabaseLayer
+
+    def getExpectedResultFor(self, spph):
+        """Compose what the query should return for `spph`.
+
+        :param spph: A `SourcePackagePublishingHistory`.
+        :return: The tuple of data that we'd expect the latest-spr query
+            to return for `spph`.
+        """
+        spr = spph.sourcepackagerelease
+        return (spr.sourcepackagenameID, spr.id, spr.version)
+
+    def test_baseline(self):
+        distroseries = self.factory.makeDistroSeries()
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertIsInstance(query, basestring)
+
+    def test_finds_nothing_for_empty_distroseries(self):
+        distroseries = self.factory.makeDistroSeries()
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertContentEqual([], Store.of(distroseries).execute(query))
+
+    def test_finds_published_sourcepackagerelease(self):
+        spph = self.makeSPPH()
+        query = compose_sql_find_latest_source_package_releases(
+            spph.distroseries)
+        self.assertEqual(1, Store.of(spph).execute(query).rowcount)
+
+    def test_selects_sourcepackagename_sourcepackagerelease_version(self):
+        spph = self.makeSPPH()
+        spr = spph.sourcepackagerelease
+        query = compose_sql_find_latest_source_package_releases(
+            spph.distroseries)
+        self.assertContentEqual(
+            [self.getExpectedResultFor(spph)], Store.of(spph).execute(query))
+
+    def test_does_not_find_publication_from_other_series(self):
+        spph = self.makeSPPH()
+        query = compose_sql_find_latest_source_package_releases(
+            self.factory.makeDistroSeries())
+        self.assertEqual(0, Store.of(spph).execute(query).rowcount)
+
+    def test_does_not_find_publication_outside_primary_archive(self):
+        distroseries = self.factory.makeDistroSeries()
+        spphs = dict(
+            (purpose, self.makeSPPH(
+                distroseries=distroseries, archive_purpose=purpose))
+            for purpose in ArchivePurpose.items)
+        primary_spr = spphs[ArchivePurpose.PRIMARY]
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertContentEqual(
+            [self.getExpectedResultFor(spphs[ArchivePurpose.PRIMARY])],
+            Store.of(distroseries).execute(query))
+
+    def test_does_not_find_publication_outside_release_pocket(self):
+        distroseries = self.factory.makeDistroSeries()
+        spphs = dict(
+            (pocket, self.makeSPPH(distroseries=distroseries, pocket=pocket))
+            for pocket in PackagePublishingPocket.items)
+        release_spph = spphs[PackagePublishingPocket.RELEASE]
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertContentEqual(
+            [self.getExpectedResultFor(release_spph)],
+            Store.of(distroseries).execute(query))
+
+    def test_finds_active_publication(self):
+        distroseries = self.factory.makeDistroSeries()
+        spphs = dict(
+            (status, self.makeSPPH(distroseries=distroseries, status=status))
+            for status in active_publishing_status)
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertContentEqual(
+            [self.getExpectedResultFor(spph) for spph in spphs.itervalues()],
+            Store.of(distroseries).execute(query))
+
+    def test_does_not_find_inactive_publication(self):
+        distroseries = self.factory.makeDistroSeries()
+        spphs = dict(
+            (status, self.makeSPPH(distroseries=distroseries, status=status))
+            for status in inactive_publishing_status)
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertContentEqual([], Store.of(distroseries).execute(query))
+
+    def test_finds_only_latest_publication_for_release(self):
+        distroseries = self.factory.makeDistroSeries()
+        spr = self.factory.makeSourcePackageRelease(distroseries=distroseries)
+        spphs = [
+            self.makeSPPH(distroseries=distroseries, sourcepackagerelease=spr)
+            for counter in xrange(5)]
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertContentEqual(
+            [self.getExpectedResultFor(spphs[-1])],
+            Store.of(distroseries).execute(query))
+
+    def test_finds_only_last_published_release_for_package(self):
+        distroseries = self.factory.makeDistroSeries()
+        spn = self.factory.makeSourcePackageName()
+        sprs = [
+            self.factory.makeSourcePackageRelease(
+                sourcepackagename=spn, distroseries=distroseries)
+            for counter in xrange(5)]
+        spphs = [
+            self.makeSPPH(distroseries=distroseries, sourcepackagerelease=spr)
+            for spr in reversed(sprs)]
+        query = compose_sql_find_latest_source_package_releases(distroseries)
+        self.assertContentEqual(
+            [self.getExpectedResultFor(spphs[-1])],
+            Store.of(distroseries).execute(query))
+
+
+class TestFindDifferences(TestCaseWithFactory, FactoryHelper):
+    """Test the finding of differences between a distroseries and parent."""
+
+    layer = ZopelessDatabaseLayer
+
+    def test_baseline(self):
+        query = compose_sql_find_differences(self.makeDerivedDistroSeries())
+        self.assertIsInstance(query, basestring)
+
+    def test_finds_nothing_for_empty_distroseries(self):
+        distroseries = self.makeDerivedDistroSeries()
+        query = compose_sql_find_differences(distroseries)
+        self.assertContentEqual([], Store.of(distroseries).execute(query))
+
+    def test_does_not_find_grandparents_packages(self):
+        parent = self.makeDerivedDistroSeries()
+        distroseries = self.factory.makeDistroSeries(parent_series=parent)
+        self.makeSPPH(distroseries=parent.parent_series)
+        query = compose_sql_find_differences(distroseries)
+        self.assertContentEqual([], Store.of(distroseries).execute(query))
+
+    def test_does_not_find_identical_releases(self):
+        distroseries = self.makeDerivedDistroSeries()
+        spr = self.factory.makeSourcePackageRelease()
+        self.makeSPPH(
+            distroseries=distroseries.parent_series, sourcepackagerelease=spr)
+        self.makeSPPH(
+            distroseries=distroseries, sourcepackagerelease=spr)
+        query = compose_sql_find_differences(distroseries)
+        self.assertContentEqual([], Store.of(distroseries).execute(query))
+
+    def test_finds_release_missing_in_derived_series(self):
+        distroseries = self.makeDerivedDistroSeries()
+        spph = self.makeSPPH(distroseries=distroseries.parent_series)
+        query = compose_sql_find_differences(distroseries)
+        self.assertContentEqual(
+            [(
+                spph.sourcepackagerelease.sourcepackagenameID,
+                None,
+                spph.sourcepackagerelease.version,
+            )],
+            Store.of(distroseries).execute(query))
+
+    def test_finds_release_unique_to_derived_series(self):
+        distroseries = self.makeDerivedDistroSeries()
+        spph = self.makeSPPH(distroseries=distroseries)
+        query = compose_sql_find_differences(distroseries)
+        self.assertContentEqual(
+            [(
+                spph.sourcepackagerelease.sourcepackagenameID,
+                spph.sourcepackagerelease.version,
+                None,
+            )],
+            Store.of(distroseries).execute(query))
+
+    def test_does_not_conflate_releases_of_different_packages(self):
+        distroseries = self.makeDerivedDistroSeries()
+        parent_spph = self.makeSPPH(distroseries=distroseries.parent_series)
+        derived_spph = self.makeSPPH(distroseries=distroseries)
+        query = compose_sql_find_differences(distroseries)
+        self.assertEqual(2, Store.of(distroseries).execute(query).rowcount)
+        self.assertContentEqual([(
+                parent_spph.sourcepackagerelease.sourcepackagenameID,
+                None,
+                parent_spph.sourcepackagerelease.version,
+            ), (
+                derived_spph.sourcepackagerelease.sourcepackagenameID,
+                derived_spph.sourcepackagerelease.version,
+                None,
+            )],
+            Store.of(distroseries).execute(query))
+
+    def test_finds_different_releases_of_same_package(self):
+        distroseries = self.makeDerivedDistroSeries()
+        parent_series = distroseries.parent_series
+        spn = self.factory.makeSourcePackageName()
+        parent_spph = self.makeSPPH(
+            distroseries=parent_series,
+            sourcepackagerelease=self.factory.makeSourcePackageRelease(
+                distroseries=parent_series, sourcepackagename=spn))
+        derived_spph = self.makeSPPH(
+            distroseries=distroseries,
+            sourcepackagerelease=self.factory.makeSourcePackageRelease(
+                distroseries=distroseries, sourcepackagename=spn))
+        query = compose_sql_find_differences(distroseries)
+        self.assertContentEqual(
+            [(
+                parent_spph.sourcepackagerelease.sourcepackagenameID,
+                derived_spph.sourcepackagerelease.version,
+                parent_spph.sourcepackagerelease.version,
+            )],
+            Store.of(distroseries).execute(query))
+
+    def test_finds_newer_release_even_when_same_release_also_exists(self):
+        derived_series = self.makeDerivedDistroSeries()
+        parent_series = derived_series.parent_series
+        spn = self.factory.makeSourcePackageName()
+        shared_spr = self.factory.makeSourcePackageRelease(
+            distroseries=parent_series, sourcepackagename=spn)
+        parent_spph = self.makeSPPH(
+            distroseries=parent_series,
+            sourcepackagerelease=shared_spr)
+        derived_spph = self.makeSPPH(
+            distroseries=derived_series,
+            sourcepackagerelease=shared_spr)
+        newer_spr = self.factory.makeSourcePackageRelease(
+            distroseries=derived_series, sourcepackagename=spn)
+        self.makeSPPH(
+            distroseries=derived_series, sourcepackagerelease=newer_spr)
+        query = compose_sql_find_differences(derived_series)
+        self.assertContentEqual(
+            [(
+                parent_spph.sourcepackagerelease.sourcepackagenameID,
+                newer_spr.version,
+                shared_spr.version,
+            )],
+            Store.of(derived_series).execute(query))
+
+
+class TestDifferenceTypeExpression(TestCaseWithFactory):
+
+    layer = ZopelessDatabaseLayer
+
+    def selectDifferenceType(self, parent_version=None, derived_version=None):
+        """Execute the SQL expression to compute `DistroSeriesDifferenceType`.
+
+        :param parent_version: The parent series' last released version
+            of a package, if any.
+        :param derived_version: The derived series' last released
+            version of the same package, if any.
+        :return: A numeric `DistroSeriesDifferenceType` value.
+        """
+        query = """
+            SELECT %s FROM (
+                SELECT %s AS source_version, %s AS parent_source_version
+            ) AS input""" % (
+            compose_sql_difference_type(),
+            quote(derived_version),
+            quote(parent_version),
+            )
+        cur = cursor()
+        cur.execute(query)
+        result = cur.fetchall()
+        self.assertEqual(1, len(result))
+        self.assertEqual(1, len(result[0]))
+        return result[0][0]
+
+    def test_baseline(self):
+        query = compose_sql_difference_type()
+        self.assertIsInstance(query, basestring)
+
+    def test_no_parent_version_means_unique_to_derived_series(self):
+        expected = DistroSeriesDifferenceType.UNIQUE_TO_DERIVED_SERIES
+        self.assertEqual(
+            expected.value, self.selectDifferenceType(derived_version=1))
+
+    def test_no_derived_version_means_missing_in_derived_series(self):
+        expected = DistroSeriesDifferenceType.MISSING_FROM_DERIVED_SERIES
+        self.assertEqual(
+            expected.value, self.selectDifferenceType(parent_version=1))
+
+    def test_two_versions_means_different_versions(self):
+        expected = DistroSeriesDifferenceType.DIFFERENT_VERSIONS
+        self.assertEqual(
+            expected.value,
+            self.selectDifferenceType(parent_version=1, derived_version=2))
+
+
+class TestFindDerivedSeries(TestCaseWithFactory, FactoryHelper):
+    """Test finding of all derived `DistroSeries`."""
+
+    layer = ZopelessDatabaseLayer
+
+    def test_does_not_find_underived_distroseries(self):
+        distroseries = self.factory.makeDistroSeries()
+        self.assertNotIn(distroseries, find_derived_series())
+
+    def test_finds_derived_distroseries(self):
+        self.assertIn(self.makeDerivedDistroSeries(), find_derived_series())
+
+    def test_ignores_parent_within_same_distro(self):
+        parent_series = self.factory.makeDistroSeries()
+        derived_series = self.factory.makeDistroSeries(
+            distribution=parent_series.distribution,
+            parent_series=parent_series)
+        self.assertNotIn(derived_series, find_derived_series())
+
+
+class TestPopulateDistroSeriesDiff(TestCaseWithFactory, FactoryHelper):
+    """Test `populate_distroseriesdiff`."""
+
+    layer = ZopelessDatabaseLayer
+
+    def test_baseline(self):
+        distroseries = self.factory.makeDistroSeries()
+        query = compose_sql_populate_distroseriesdiff(distroseries, "tmp")
+        self.assertIsInstance(query, basestring)
+
+    def test_creates_distroseriesdifference(self):
+        distroseries = self.makeDerivedDistroSeries()
+        spph = self.makeSPPH(distroseries=distroseries)
+        populate_distroseriesdiff(distroseries)
+        store = Store.of(distroseries)
+        dsd = self.getDistroSeriesDiff(distroseries).one()
+        spr = spph.sourcepackagerelease
+        self.assertEqual(spr.sourcepackagename, dsd.source_package_name)
+        self.assertEqual(
+            DistroSeriesDifferenceStatus.NEEDS_ATTENTION, dsd.status)
+
+    def test_does_not_overwrite_distroseriesdifference(self):
+        distroseries = self.makeDerivedDistroSeries()
+        existing_versions = {
+            'base': '3.1',
+            'parent': '3.14',
+            'derived': '3.141',
+        }
+        spph = self.makeSPPH(distroseries=distroseries)
+        spr = spph.sourcepackagerelease
+        self.factory.makeDistroSeriesDifference(
+            derived_series=distroseries,
+            source_package_name_str=spr.sourcepackagename.name,
+            versions=existing_versions)
+        dsd = self.getDistroSeriesDiff(distroseries).one()
+        self.assertEqual(existing_versions['base'], dsd.base_version)
+        self.assertEqual(
+            existing_versions['parent'], dsd.parent_source_version)
+        self.assertEqual(existing_versions['derived'], dsd.source_version)
+
+
+class TestPopulateDistroSeriesDiffScript(TestCaseWithFactory, FactoryHelper):
+    """Test the `populate-distroseriesdiff` script."""
+
+    layer = DatabaseFunctionalLayer
+
+    def makeScript(self, test_args):
+        script = PopulateDistroSeriesDiff(test_args=test_args)
+        script.logger = DevNullLogger()
+        return script
+
+    def test_script_populates_for_distroseries(self):
+        distroseries = self.makeDerivedDistroSeries()
+        self.makeSPPH(distroseries=distroseries)
+        self.makeSPPH(distroseries=distroseries.parent_series)
+        shared_spph = self.makeSPPH(distroseries=distroseries)
+        self.makeSPPH(
+            distroseries=distroseries.parent_series,
+            sourcepackagerelease=shared_spph.sourcepackagerelease)
+
+        script = self.makeScript([
+            '--distribution', distroseries.distribution.name,
+            '--series', distroseries.name,
+            ])
+        script.main()
+
+        self.assertEqual([distroseries], script.distroseries)
+        self.assertNotEqual(0, self.getDistroSeriesDiff(distroseries).count())
+
+    def test_script_populates_for_all_distroseries(self):
+        distroseries = [
+            self.makeDerivedDistroSeries() for counter in xrange(2)]
+        script = self.makeScript(['--all'])
+        script.main()
+        for series in distroseries:
+            self.assertIn(series, script.distroseries)

=== added file 'scripts/populate-distroseriesdiff.py'
--- scripts/populate-distroseriesdiff.py	1970-01-01 00:00:00 +0000
+++ scripts/populate-distroseriesdiff.py	2011-03-03 19:56:39 +0000
@@ -0,0 +1,16 @@
+#!/usr/bin/python -S
+#
+# Copyright 2011 Canonical Ltd.  This software is licensed under the
+# GNU Affero General Public License version 3 (see the file LICENSE).
+
+__metaclass__ = type
+
+import _pythonpath
+
+from lp.registry.scripts.populate_distroseriesdiff import (
+    PopulateDistroSeriesDiff,
+    )
+
+
+if __name__ == '__main__':
+    PopulateDistroSeriesDiff('populate-distroseriesdiff').run()


Follow ups