← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~wgrant/launchpad/publisher-release-cleanup into lp:launchpad/devel

 

William Grant has proposed merging lp:~wgrant/launchpad/publisher-release-cleanup into lp:launchpad/devel with lp:~wgrant/launchpad/better-publisher-index-tests as a prerequisite.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  #655690 release_files_needed shouldn't live on FTPArchiveHandler
  https://bugs.launchpad.net/bugs/655690


= Summary =
lp.archivepublisher has two index generation mechanisms: native DB publication for partner, debug and PPA archives; and apt-ftparchive publication for primary and copy archives. The apt-ftparchive handler is deprecated and to be phased out, but its release_files_needed attribute is still used by the native publisher (bug #655690).

== Proposed fix ==
Move release_files_needed from FTPArchiveHandler to Publisher, and only instantiate FTPArchiveHandler when it's actually needed.

== Implementation details ==
While moving release_files_needed I realised that it wasn't very sensible: it consisted of a map from suite to component to a list of architectures. While the set of suites changes depending on what needs regeneration, the component and architecture lists must contain the full set each time -- calculating them in a roundabout manner and storing them in the dict is pointless.

So I changed release_files_needed into a set of suites, and adjusted its users to retrieve the component and architecture lists directly from the publisher configuration. This mainly involved the removal of hacks from _writeDistroArchSeries, which I've split and renamed.

I also removed some duplicated code from lib/lp/archivepublisher/ftparchive.py. It was XXX'd by kiko four years ago as probably being redundant, and investigation confirms that proposition.

== Tests ==
test_publisher.py has had complex release_files_needed verification removed. The new simpler data structure removes what they are testing, and their role is fulfilled by testAllIndicesArePublished as added in the prereq.

== Demo and Q/A ==
This is just a refactoring. It's fine as long as the tests work.

= Launchpad lint =
There is plenty of inherited lint that could be fixed, but this branch is heavy enough, and there are five following it fixing most of it.
-- 
https://code.launchpad.net/~wgrant/launchpad/publisher-release-cleanup/+merge/38530
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~wgrant/launchpad/publisher-release-cleanup into lp:launchpad/devel.
=== modified file 'lib/lp/archivepublisher/ftparchive.py'
--- lib/lp/archivepublisher/ftparchive.py	2010-10-06 13:50:50 +0000
+++ lib/lp/archivepublisher/ftparchive.py	2010-10-15 11:41:18 +0000
@@ -157,7 +157,6 @@
             else:
                 self.distroseries.append(distroseries)
         self.publisher = publisher
-        self.release_files_needed = {}
 
         # We need somewhere to note down where the debian-installer
         # components came from. in _di_release_components we store
@@ -199,13 +198,9 @@
         We do this to have Packages or Sources for them even if we lack
         anything in them currently.
         """
-        # XXX: kiko 2006-08-24: suffix is completely unnecessary here. Just
-        # iterate over the pockets, and do the suffix check inside
-        # createEmptyPocketRequest; that would also allow us to replace
-        # the == "" check we do there by a RELEASE match
         for distroseries in self.distroseries:
             components = self._config.componentsForSeries(distroseries.name)
-            for pocket, suffix in pocketsuffix.items():
+            for pocket in PackagePublishingPocket.items:
                 if not fullpublish:
                     if not self.publisher.isDirty(distroseries, pocket):
                         continue
@@ -213,29 +208,15 @@
                     if not self.publisher.isAllowed(distroseries, pocket):
                         continue
 
+                self.publisher.release_files_needed.add(
+                    (distroseries.name, pocket))
+
                 for comp in components:
-                    self.createEmptyPocketRequest(distroseries, suffix, comp)
-
-    def requestReleaseFile(self, suite_name, component_name, arch_name):
-        """Request Release file generation for given context.
-
-        'suite_name', 'component_name' and 'arch_name' will be organised as
-        a dictionary (self.release_files_needed) keyed by 'suite_name' which
-        value will be another dictionary keyed by 'component_name' and
-        containing a set of 'arch_name's as value.
-        """
-        suite_special = self.release_files_needed.setdefault(
-            suite_name, {})
-        suite_component_special = suite_special.setdefault(
-            component_name, set())
-        suite_component_special.add(arch_name)
-
-    def createEmptyPocketRequest(self, distroseries, suffix, comp):
+                    self.createEmptyPocketRequest(distroseries, pocket, comp)
+
+    def createEmptyPocketRequest(self, distroseries, pocket, comp):
         """Creates empty files for a release pocket and distroseries"""
-        full_distroseries_name = distroseries.name + suffix
-        arch_tags = self._config.archTagsForSeries(distroseries.name)
-
-        if suffix == "":
+        if pocket == PackagePublishingPocket.RELEASE:
             # organize distroseries and component pair as
             # debian-installer -> distroseries_component
             # internal map. Only the main pocket actually
@@ -246,6 +227,8 @@
                     ".".join(["override", distroseries.name, comp,
                               "debian-installer"]))
 
+        full_distroseries_name = distroseries.name + pocketsuffix[pocket]
+
         # Touch the source file lists and override files
         f_touch(self._config.overrideroot,
                 ".".join(["override", full_distroseries_name, comp]))
@@ -254,20 +237,10 @@
         f_touch(self._config.overrideroot,
                 ".".join(["override", full_distroseries_name, comp, "src"]))
 
-        dr_comps = self.release_files_needed.setdefault(
-            full_distroseries_name, {})
-
         f_touch(self._config.overrideroot,
                 "_".join([full_distroseries_name, comp, "source"]))
-        dr_comps.setdefault(comp, set()).add("source")
-
-        for arch in arch_tags:
-            # organize dr/comp/arch into temporary binary
-            # archive map for the architecture in question.
-            dr_special = self.release_files_needed.setdefault(
-                full_distroseries_name, {})
-            dr_special.setdefault(comp, set()).add("binary-"+arch)
-
+
+        for arch in self._config.archTagsForSeries(distroseries.name):
             # Touch more file lists for the archs.
             f_touch(self._config.overrideroot,
                     "_".join([full_distroseries_name, comp, "binary-"+arch]))
@@ -714,9 +687,6 @@
 
         Also outputs a debian-installer file list if necessary.
         """
-        self.release_files_needed.setdefault(
-            dr_pocketed, {}).setdefault(component, set()).add(arch)
-
         files = []
         di_files = []
         f_path = os.path.join(self._config.overrideroot,
@@ -807,37 +777,8 @@
         """Generates the config stanza for an individual pocket."""
         dr_pocketed = distroseries_name + pocketsuffix[pocket]
 
-        # XXX kiko 2006-08-24: I have no idea what the code below is meant
-        # to do -- it appears to be a rehash of createEmptyPocketRequests.
         archs = self._config.archTagsForSeries(distroseries_name)
         comps = self._config.componentsForSeries(distroseries_name)
-        for comp in comps:
-            comp_path = os.path.join(self._config.overrideroot,
-                                     "_".join([dr_pocketed, comp, "source"]))
-            if not os.path.exists(comp_path):
-                # Create empty files so that even if we don't output
-                # anything here apt-ftparchive will DTRT
-                f_touch(comp_path)
-                f_touch(self._config.overrideroot,
-                        ".".join(["override", dr_pocketed, comp]))
-                f_touch(self._config.overrideroot,
-                        ".".join(["override", dr_pocketed, comp, "src"]))
-
-        if len(comps) == 0:
-            self.log.debug("Did not find any components to create config "
-                           "for %s" % dr_pocketed)
-            return
-
-        # Second up, pare archs down as appropriate
-        for arch in archs:
-            # XXX: kiko 2006-08-24: why is it comps[0] here?
-            arch_path = os.path.join(self._config.overrideroot,
-                "_".join([dr_pocketed, comps[0], "binary-"+arch]))
-            if not os.path.exists(arch_path):
-                # Create an empty file if we don't have one so that
-                # apt-ftparchive will dtrt.
-                f_touch(arch_path)
-        # XXX kiko 2006-08-24: End uncomprehensible code.
 
         self.log.debug("Generating apt config for %s" % dr_pocketed)
         apt_config.write(STANZA_TEMPLATE % {

=== modified file 'lib/lp/archivepublisher/publishing.py'
--- lib/lp/archivepublisher/publishing.py	2010-10-06 13:49:38 +0000
+++ lib/lp/archivepublisher/publishing.py	2010-10-15 11:41:18 +0000
@@ -58,13 +58,19 @@
     Over time this method needs to be removed and replaced by having
     component ordering codified in the database.
     """
-    ret = []
+    remaining = list(components)
+    ordered = []
     for comp in HARDCODED_COMPONENT_ORDER:
-        if comp in components:
-            ret.append(comp)
-            components.remove(comp)
-    ret.extend(components)
-    return ret
+        if comp in remaining:
+            ordered.append(comp)
+            remaining.remove(comp)
+    ordered.extend(remaining)
+    return ordered
+
+
+def get_suffixed_indices(path):
+    """Return a set of paths to compressed copies of the given index."""
+    return set([path + suffix for suffix in ('', '.gz', '.bz2')])
 
 
 def _getDiskPool(pubconf, log):
@@ -147,21 +153,19 @@
         else:
             self._library = library
 
-        # Grab a reference to an apt_handler as we use it later to
-        # probe which components need releases files generated.
-        self.apt_handler = FTPArchiveHandler(self.log, self._config,
-                                             self._diskpool, self.distro,
-                                             self)
         # Track which distroseries pockets have been dirtied by a
         # change, and therefore need domination/apt-ftparchive work.
         # This is a set of tuples in the form (distroseries.name, pocket)
         self.dirty_pockets = set()
 
+        # Track which pockets need release files. This will contain more
+        # than dirty_pockets in the case of a careful index run.
+        # This is a set of tuples in the form (distroseries.name, pocket)
+        self.release_files_needed = set()
+
     def isDirty(self, distroseries, pocket):
         """True if a publication has happened in this release and pocket."""
-        if not (distroseries.name, pocket) in self.dirty_pockets:
-            return False
-        return True
+        return (distroseries.name, pocket) in self.dirty_pockets
 
     def markPocketDirty(self, distroseries, pocket):
         """Mark a pocket dirty only if it's allowed."""
@@ -176,10 +180,8 @@
 
         Otherwise, return False.
         """
-        if (self.allowed_suites and
-            (distroseries.name, pocket) not in self.allowed_suites):
-            return False
-        return True
+        return (not self.allowed_suites or
+                (distroseries.name, pocket) in self.allowed_suites)
 
     def A_publish(self, force_publishing):
         """First step in publishing: actual package publishing.
@@ -281,7 +283,10 @@
     def C_doFTPArchive(self, is_careful):
         """Does the ftp-archive step: generates Sources and Packages."""
         self.log.debug("* Step C: Set apt-ftparchive up and run it")
-        self.apt_handler.run(is_careful)
+        apt_handler = FTPArchiveHandler(self.log, self._config,
+                                        self._diskpool, self.distro,
+                                        self)
+        apt_handler.run(is_careful)
 
     def C_writeIndexes(self, is_careful):
         """Write Index files (Packages & Sources) using LP information.
@@ -297,6 +302,9 @@
                                        (distroseries.name, pocket.name))
                         continue
                     self.checkDirtySuiteBeforePublishing(distroseries, pocket)
+
+                self.release_files_needed.add((distroseries.name, pocket))
+
                 # Retrieve components from the publisher config because
                 # it gets overridden in getPubConfig to set the
                 # correct components for the archive being used.
@@ -323,7 +331,7 @@
                                        (distroseries.name, pocket.name))
                         continue
                     self.checkDirtySuiteBeforePublishing(distroseries, pocket)
-                self._writeDistroSeries(distroseries, pocket)
+                self._writeSuite(distroseries, pocket)
 
     def _writeComponentIndexes(self, distroseries, pocket, component):
         """Write Index files for single distroseries + pocket + component.
@@ -358,11 +366,6 @@
 
             arch_path = 'binary-%s' % arch.architecturetag
 
-            # XXX wgrant 2010-10-06 bug=655690: Using FTPArchiveHandler
-            # for NMAF is wrong.
-            self.apt_handler.requestReleaseFile(
-                suite_name, component.name, arch_path)
-
             self.log.debug("Generating Packages for %s" % arch_path)
 
             package_index_root = os.path.join(
@@ -394,11 +397,6 @@
             package_index.close()
             di_index.close()
 
-        # XXX wgrant 2010-10-06 bug=655690: Using FTPArchiveHandler
-        # is wrong here too.
-        self.apt_handler.requestReleaseFile(
-            suite_name, component.name, 'source')
-
     def cannotModifySuite(self, distroseries, pocket):
         """Return True if the distroseries is stable and pocket is release."""
         return (not distroseries.isUnstable() and
@@ -447,34 +445,28 @@
             return self.distro.displayname
         return "LP-PPA-%s" % get_ppa_reference(self.archive)
 
-    def _writeDistroSeries(self, distroseries, pocket):
-        """Write out the Release files for the provided distroseries."""
+    def _writeSuite(self, distroseries, pocket):
+        """Write out the Release files for the provided suite."""
         # XXX: kiko 2006-08-24: Untested method.
 
         # As we generate file lists for apt-ftparchive we record which
         # distroseriess and so on we need to generate Release files for.
         # We store this in release_files_needed and consume the information
         # when writeReleaseFiles is called.
-        full_name = distroseries.name + pocketsuffix[pocket]
-        release_files_needed = self.apt_handler.release_files_needed
-        if full_name not in release_files_needed:
+        if (distroseries.name, pocket) not in self.release_files_needed:
             # If we don't need to generate a release for this release
             # and pocket, don't!
             return
 
-        all_components = set()
-        all_architectures = set()
+        all_components = self._config.componentsForSeries(distroseries.name)
+        all_architectures = self._config.archTagsForSeries(distroseries.name)
         all_files = set()
-        release_files_needed_items = release_files_needed[full_name].items()
-        for component, architectures in release_files_needed_items:
-            all_components.add(component)
-            for architecture in architectures:
-                # XXX malcc 2006-09-20: We don't like the way we build this
-                # all_architectures list. Make this better code.
-                clean_architecture = self._writeDistroArchSeries(
+        for component in all_components:
+            self._writeSuiteSource(
+                distroseries, pocket, component, all_files)
+            for architecture in all_architectures:
+                self._writeSuiteArch(
                     distroseries, pocket, component, architecture, all_files)
-                if clean_architecture != "source":
-                    all_architectures.add(clean_architecture)
 
         drsummary = "%s %s " % (self.distro.displayname,
                                 distroseries.displayname)
@@ -483,6 +475,7 @@
         else:
             drsummary += pocket.name.capitalize()
 
+        full_name = distroseries.getSuite(pocket)
         release_file = Release()
         release_file["Origin"] = self._getOrigin()
         release_file["Label"] = self._getLabel()
@@ -530,60 +523,61 @@
         archive_signer = IArchiveSigningKey(self.archive)
         archive_signer.signRepository(full_name)
 
-    def _writeDistroArchSeries(self, distroseries, pocket, component,
-                                architecture, all_files):
-        """Write out a Release file for a DAR."""
+    def _writeSuiteArchOrSource(self, distroseries, pocket, component,
+                                file_stub, arch_name, arch_path,
+                                all_series_files):
+        """Write out a Release file for an architecture or source."""
         # XXX kiko 2006-08-24: Untested method.
 
-        full_name = distroseries.name + pocketsuffix[pocket]
-        index_suffixes = ('', '.gz', '.bz2')
-
+        suite = distroseries.getSuite(pocket)
         self.log.debug("Writing Release file for %s/%s/%s" % (
-            full_name, component, architecture))
-
-        if architecture != "source":
-            # Strip "binary-" off the front of the architecture
-            clean_architecture = architecture[7:]
-            file_stub = "Packages"
-
-            # Only the primary and PPA archives have debian-installer.
-            if self.archive.purpose != ArchivePurpose.PARTNER:
-                # Set up the debian-installer paths for main_archive.
-                # d-i paths are nested inside the component.
-                di_path = os.path.join(
-                    component, "debian-installer", architecture)
-                di_file_stub = os.path.join(di_path, file_stub)
-                for suffix in index_suffixes:
-                    all_files.add(di_file_stub + suffix)
-        else:
-            file_stub = "Sources"
-            clean_architecture = architecture
+            suite, component, arch_path))
 
         # Now, grab the actual (non-di) files inside each of
         # the suite's architectures
-        file_stub = os.path.join(component, architecture, file_stub)
-
-        for suffix in index_suffixes:
-            all_files.add(file_stub + suffix)
-
-        all_files.add(os.path.join(component, architecture, "Release"))
+        file_stub = os.path.join(component, arch_path, file_stub)
+
+        all_series_files.update(get_suffixed_indices(file_stub))
+        all_series_files.add(os.path.join(component, arch_path, "Release"))
 
         release_file = Release()
-        release_file["Archive"] = full_name
+        release_file["Archive"] = suite
         release_file["Version"] = distroseries.version
         release_file["Component"] = component
         release_file["Origin"] = self._getOrigin()
         release_file["Label"] = self._getLabel()
-        release_file["Architecture"] = clean_architecture
+        release_file["Architecture"] = arch_name
 
-        f = open(os.path.join(self._config.distsroot, full_name,
-                              component, architecture, "Release"), "w")
+        f = open(os.path.join(self._config.distsroot, suite,
+                              component, arch_path, "Release"), "w")
         try:
             release_file.dump(f, "utf-8")
         finally:
             f.close()
 
-        return clean_architecture
+    def _writeSuiteSource(self, distroseries, pocket, component,
+                          all_series_files):
+        """Write out a Release file for a suite's sources."""
+        self._writeSuiteArchOrSource(
+            distroseries, pocket, component, 'Sources', 'source', 'source',
+            all_series_files)
+
+    def _writeSuiteArch(self, distroseries, pocket, component,
+                        arch_name, all_series_files):
+        """Write out a Release file for an architecture in a suite."""
+        file_stub = 'Packages'
+        arch_path = 'binary-' + arch_name
+        # Only the primary and PPA archives have debian-installer.
+        if self.archive.purpose != ArchivePurpose.PARTNER:
+            # Set up the debian-installer paths for main_archive.
+            # d-i paths are nested inside the component.
+            di_path = os.path.join(
+                component, "debian-installer", arch_path)
+            di_file_stub = os.path.join(di_path, file_stub)
+            all_series_files.update(get_suffixed_indices(di_file_stub))
+        self._writeSuiteArchOrSource(
+            distroseries, pocket, component, 'Packages', arch_name, arch_path,
+            all_series_files)
 
     def _readIndexFileContents(self, distroseries_name, file_name):
         """Read an index files' contents.

=== modified file 'lib/lp/archivepublisher/tests/test_publisher.py'
--- lib/lp/archivepublisher/tests/test_publisher.py	2010-10-15 11:41:16 +0000
+++ lib/lp/archivepublisher/tests/test_publisher.py	2010-10-15 11:41:18 +0000
@@ -639,13 +639,10 @@
              ''],
             index_contents)
 
-        # Check if apt_handler.release_files_needed has the right requests.
-        # 'source' & 'binary-i386' Release files should be regenerated
-        # for all breezy-autotest components. Note that 'debian-installer'
-        # indexes do not need Release files.
-
         # We always regenerate all Releases file for a given suite.
-        self.checkAllRequestedReleaseFiles(archive_publisher)
+        self.assertTrue(
+            ('breezy-autotest', PackagePublishingPocket.RELEASE) in
+            archive_publisher.release_files_needed)
 
         # remove PPA root
         shutil.rmtree(config.personalpackagearchive.root)
@@ -769,45 +766,6 @@
         # are marked as dirty.
         self.checkDirtyPockets(publisher, expected=allowed_suites)
 
-    def assertReleaseFileRequested(self, publisher, suite_name,
-                                   component_name, archs):
-        """Assert the given context will have it's Release file regenerated.
-
-        Check if a request for the given context is correctly stored in:
-           publisher.apt_handler.release_files_needed
-        """
-        suite = publisher.apt_handler.release_files_needed.get(suite_name)
-        self.assertTrue(
-            suite is not None, 'Suite %s not requested' % suite_name)
-        self.assertTrue(
-            component_name in suite,
-            'Component %s/%s not requested' % (suite_name, component_name))
-        self.assertEquals(archs, suite[component_name])
-
-    def checkAllRequestedReleaseFiles(self, publisher,
-                                      architecturetags=('hppa', 'i386')):
-        """Check if all expected Release files are going to be regenerated.
-
-        'source', 'binary-i386' and 'binary-hppa' Release Files should be
-        requested for regeneration in all breezy-autotest components.
-        """
-        available_components = sorted([
-            c.name for c in self.breezy_autotest.components])
-        self.assertEqual(available_components,
-                         ['main', 'multiverse', 'restricted', 'universe'])
-
-        available_archs = ['binary-%s' % arch for arch in architecturetags]
-
-        # XXX cprov 20071210: Include the artificial component 'source' as a
-        # location to check for generated indexes. Ideally we should
-        # encapsulate this task in publishing.py and this common method
-        # in tests as well.
-        all_archs = set(available_archs)
-        all_archs.add('source')
-        for component in available_components:
-            self.assertReleaseFileRequested(
-                publisher, 'breezy-autotest', component, all_archs)
-
     def _getReleaseFileOrigin(self, contents):
         origin_header = 'Origin: '
         [origin_line] = [
@@ -831,8 +789,9 @@
         publisher.A_publish(False)
         publisher.C_doFTPArchive(False)
 
-        # We always regenerate all Releases file for a given suite.
-        self.checkAllRequestedReleaseFiles(publisher)
+        self.assertTrue(
+            ('breezy-autotest', PackagePublishingPocket.RELEASE) in
+            publisher.release_files_needed)
 
         publisher.D_writeReleaseFiles(False)
 
@@ -1109,8 +1068,7 @@
         """
 
         self.assertTrue(
-            series.getSuite(pocket) in
-            publisher.apt_handler.release_files_needed)
+            (series.name, pocket) in publisher.release_files_needed)
 
         arch_template = os.path.join(
             publisher._config.distsroot, series.getSuite(pocket), '%s/%s')