← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~cjwatson/launchpad/gina-deb-vendor-debian into lp:launchpad

 

Colin Watson has proposed merging lp:~cjwatson/launchpad/gina-deb-vendor-debian into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #901334 in Launchpad itself: "ExecutionError: Error 2 unpacking source raised by gina.py"
  https://bugs.launchpad.net/launchpad/+bug/901334

For more details, see:
https://code.launchpad.net/~cjwatson/launchpad/gina-deb-vendor-debian/+merge/111120

== Summary ==

Bug 901334 reports gina raising ExecutionErrors while attempting to extract a few source packages in Debian.  The logs indicate that the affected packages are currently accountsservice, qjackctl, and xfce4-smartbookmark-plugin.

== Proposed fix ==

The affected packages can all be fixed by extracting with DEB_VENDOR=debian set in the environment, which tells dpkg-source to use the Debian patch series rather than the (broken) Ubuntu one.  This is correct anyway when extracting Debian source packages, as opposed to source packages that have been synced from Debian into Ubuntu.

Not many packages make use of the facility to have different Debian and Ubuntu patch series, never mind leaving one of them broken, which explains why this failure only occurs on three packages.  We'll still have to fix this up to make them build on Ubuntu, but this should be handled in Ubuntu (or Debian) rather than having gina fail.

== Implementation details ==

The only annoyance is that we have to pass the distribution name down through quite a number of layers of code.

== LOC Rationale ==

+84.  I have 2074 lines of credit and just submitted a branch which gains me another 816, so I'd like to use some of that credit for this critical bug fix.

== Tests ==

bin/test -vvct gina

== Demo and Q/A ==

This I'm unsure about.  https://code.launchpad.net/~jtv/launchpad/katie-and-gina-are-bad-bad-girls/+merge/75472 suggests that I might want to use a scratch table to compare gina results before and after the upgrade, and presumably I could inject one of the broken packages into the local mirror used for this.  However, I'm unsure about how well gina would deal with the local mirror being several months older than dogfood's last database import, and thus several months older than its current /debian/+source/*.  Any advice?

== Lint ==

The usual false positive for scripts:

./scripts/gina.py
      22: '_pythonpath' imported but unused
-- 
https://code.launchpad.net/~cjwatson/launchpad/gina-deb-vendor-debian/+merge/111120
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~cjwatson/launchpad/gina-deb-vendor-debian into lp:launchpad.
=== modified file 'lib/lp/archiveuploader/utils.py'
--- lib/lp/archiveuploader/utils.py	2011-06-14 20:20:13 +0000
+++ lib/lp/archiveuploader/utils.py	2012-06-19 23:13:26 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009-2010 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2012 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 """Archive uploader utilities."""
@@ -28,6 +28,7 @@
 
 
 import email.Header
+import os
 import re
 import signal
 import subprocess
@@ -267,7 +268,7 @@
     return fix_maintainer(content, fieldname)
 
 
-def extract_dpkg_source(dsc_filepath, target):
+def extract_dpkg_source(dsc_filepath, target, vendor=None):
     """Extract a source package by dsc file path.
 
     :param dsc_filepath: Path of the DSC file
@@ -281,9 +282,12 @@
         #   blosxom/2009-07-02-python-sigpipe.html
         signal.signal(signal.SIGPIPE, signal.SIG_DFL)
     args = ["dpkg-source", "-sn", "-x", dsc_filepath]
+    env = dict(os.environ)
+    if vendor is not None:
+        env["DEB_VENDOR"] = vendor
     dpkg_source = subprocess.Popen(
         args, stdout=subprocess.PIPE, cwd=target, stderr=subprocess.PIPE,
-        preexec_fn=subprocess_setup)
+        preexec_fn=subprocess_setup, env=env)
     output, unused = dpkg_source.communicate()
     result = dpkg_source.wait()
     if result != 0:

=== modified file 'lib/lp/soyuz/scripts/gina/handlers.py'
--- lib/lp/soyuz/scripts/gina/handlers.py	2012-01-05 14:23:33 +0000
+++ lib/lp/soyuz/scripts/gina/handlers.py	2012-06-19 23:13:26 +0000
@@ -179,7 +179,7 @@
         self.imported_bins = {}
 
         self.sphandler = SourcePackageHandler(
-            archive_root, pocket, component_override)
+            distro_name, archive_root, pocket, component_override)
         self.bphandler = BinaryPackageHandler(
             self.sphandler, archive_root, pocket)
 
@@ -454,8 +454,9 @@
     on the launchpad db a little easier.
     """
 
-    def __init__(self, archive_root, pocket, component_override):
+    def __init__(self, distro_name, archive_root, pocket, component_override):
         self.distro_handler = DistroHandler()
+        self.distro_name = distro_name
         self.archive_root = archive_root
         self.pocket = pocket
         self.component_override = component_override
@@ -492,7 +493,7 @@
             return None
 
         # Process the package
-        sp_data.process_package(self.archive_root)
+        sp_data.process_package(self.distro_name, self.archive_root)
         sp_data.ensure_complete()
 
         spr = self.createSourcePackageRelease(sp_data, distroseries)

=== modified file 'lib/lp/soyuz/scripts/gina/packages.py'
--- lib/lp/soyuz/scripts/gina/packages.py	2011-12-30 06:14:56 +0000
+++ lib/lp/soyuz/scripts/gina/packages.py	2012-06-19 23:13:26 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009-2011 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2012 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 # pylint: disable-msg=W0631
@@ -100,11 +100,11 @@
     raise PoolFileNotFound("File %s not in archive" % filename)
 
 
-def unpack_dsc(package, version, component, archive_root):
+def unpack_dsc(package, version, component, distro_name, archive_root):
     dsc_name, dsc_path, component = get_dsc_path(package, version,
                                                  component, archive_root)
     try:
-        extract_dpkg_source(dsc_path, ".")
+        extract_dpkg_source(dsc_path, ".", vendor=distro_name)
     except DpkgSourceError, e:
         raise ExecutionError("Error %d unpacking source" % e.result)
 
@@ -114,9 +114,9 @@
     return source_dir, dsc_path
 
 
-def read_dsc(package, version, component, archive_root):
+def read_dsc(package, version, component, distro_name, archive_root):
     source_dir, dsc_path = unpack_dsc(package, version, component,
-                                      archive_root)
+                                      distro_name, archive_root)
 
     dsc = open(dsc_path).read().strip()
 
@@ -229,7 +229,7 @@
         if missing:
             raise MissingRequiredArguments(missing)
 
-    def process_package(self, archive_root):
+    def process_package(self, distro_name, archive_root):
         """Process the package using the files located in the archive.
 
         Raises PoolFileNotFound if a file is not found in the pool.
@@ -240,7 +240,7 @@
         cwd = os.getcwd()
         os.chdir(tempdir)
         try:
-            self.do_package(archive_root)
+            self.do_package(distro_name, archive_root)
         finally:
             os.chdir(cwd)
         # We only rmtree if everything worked as expected; otherwise,
@@ -250,7 +250,7 @@
         self.date_uploaded = UTC_NOW
         return True
 
-    def do_package(self, archive_root):
+    def do_package(self, distro_name, archive_root):
         """To be provided by derived class."""
         raise NotImplementedError
 
@@ -332,14 +332,15 @@
 
         AbstractPackageData.__init__(self)
 
-    def do_package(self, archive_root):
+    def do_package(self, distro_name, archive_root):
         """Get the Changelog and urgency from the package on archive.
 
         If successful processing of the package occurs, this method
         sets the changelog and urgency attributes.
         """
         dsc, changelog, copyright = read_dsc(
-            self.package, self.version, self.component, archive_root)
+            self.package, self.version, self.component, distro_name,
+            archive_root)
 
         self.dsc = encoding.guess(dsc)
         self.copyright = encoding.guess(copyright)
@@ -497,7 +498,7 @@
 
         AbstractPackageData.__init__(self)
 
-    def do_package(self, archive_root):
+    def do_package(self, distro_name, archive_root):
         """Grab shared library info from .deb."""
         fullpath = os.path.join(archive_root, self.filename)
         if not os.path.exists(fullpath):

=== modified file 'lib/lp/soyuz/scripts/tests/test_gina.py'
--- lib/lp/soyuz/scripts/tests/test_gina.py	2012-01-01 02:58:52 +0000
+++ lib/lp/soyuz/scripts/tests/test_gina.py	2012-06-19 23:13:26 +0000
@@ -1,12 +1,18 @@
-# Copyright 2009-2011 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2012 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 from doctest import DocTestSuite
+import hashlib
+import os
+from textwrap import dedent
 from unittest import TestLoader
 
+from lp.archiveuploader.tagfiles import parse_tagfile
 from lp.registry.interfaces.pocket import PackagePublishingPocket
 from lp.services.log.logger import DevNullLogger
+from lp.services.tarfile_helpers import LaunchpadWriteTarFile
 from lp.soyuz.enums import PackagePublishingStatus
+from lp.soyuz.scripts.gina import ExecutionError
 from lp.soyuz.scripts.gina.dominate import dominate_imported_source_packages
 import lp.soyuz.scripts.gina.handlers
 from lp.soyuz.scripts.gina.handlers import (
@@ -116,6 +122,70 @@
             [pub.status for pub in pubs])
 
 
+class TestSourcePackageData(TestCaseWithFactory):
+
+    layer = ZopelessDatabaseLayer
+
+    def test_unpack_dsc_with_vendor(self):
+        # Some source packages unpack differently depending on dpkg's idea
+        # of the "vendor", and in extreme cases may even fail with some
+        # vendors.  gina always sets the vendor to "debian" to ensure that
+        # it unpacks packages as if unpacking on Debian.
+        archive_root = self.useTempDir()
+        pool_dir = os.path.join(archive_root, "pool/main/f/foo")
+        os.makedirs(pool_dir)
+
+        # Synthesise a package that can be unpacked with DEB_VENDOR=debian
+        # but not with DEB_VENDOR=ubuntu.
+        with open(os.path.join(
+            pool_dir, "foo_1.0.orig.tar.gz"), "wb+") as buffer:
+            orig_tar = LaunchpadWriteTarFile(buffer)
+            orig_tar.add_directory("foo-1.0")
+            orig_tar.close()
+            buffer.seek(0)
+            orig_tar_contents = buffer.read()
+        with open(os.path.join(
+            pool_dir, "foo_1.0-1.debian.tar.gz"), "wb+") as buffer:
+            debian_tar = LaunchpadWriteTarFile(buffer)
+            debian_tar.add_file("debian/source/format", "3.0 (quilt)\n")
+            debian_tar.add_file(
+                "debian/patches/ubuntu.series", "--- corrupt patch\n")
+            debian_tar.add_file("debian/rules", "")
+            debian_tar.close()
+            buffer.seek(0)
+            debian_tar_contents = buffer.read()
+        dsc_path = os.path.join(pool_dir, "foo_1.0-1.dsc")
+        with open(dsc_path, "w") as dsc:
+            dsc.write(dedent("""\
+                Format: 3.0 (quilt)
+                Source: foo
+                Binary: foo
+                Architecture: all
+                Version: 1.0-1
+                Maintainer: Foo Bar <foo.bar@xxxxxxxxxxxxx>
+                Files:
+                 %s %s foo_1.0.orig.tar.gz
+                 %s %s foo_1.0-1.debian.tar.gz
+                """ % (
+                    hashlib.md5(orig_tar_contents).hexdigest(),
+                    len(orig_tar_contents),
+                    hashlib.md5(debian_tar_contents).hexdigest(),
+                    len(debian_tar_contents))))
+
+        dsc_contents = parse_tagfile(dsc_path)
+        dsc_contents["Directory"] = pool_dir
+        dsc_contents["Package"] = "foo"
+        dsc_contents["Component"] = "main"
+        dsc_contents["Section"] = "misc"
+
+        sp_data = SourcePackageData(**dsc_contents)
+        # Unpacking this in an Ubuntu context fails.
+        self.assertRaises(
+            ExecutionError, sp_data.do_package, "ubuntu", archive_root)
+        # But all is well in a Debian context.
+        sp_data.do_package("debian", archive_root)
+
+
 class TestSourcePackagePublisher(TestCaseWithFactory):
 
     layer = ZopelessDatabaseLayer

=== modified file 'scripts/gina.py'
--- scripts/gina.py	2012-01-01 03:13:08 +0000
+++ scripts/gina.py	2012-06-19 23:13:26 +0000
@@ -1,6 +1,6 @@
 #!/usr/bin/python -S
 #
-# Copyright 2009-2011 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2012 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 # This module uses relative imports.
@@ -114,7 +114,7 @@
     importer_handler = ImporterHandler(
         ztm, distro, distroseries, package_root, pocket, component_override)
 
-    import_sourcepackages(packages_map, package_root, importer_handler)
+    import_sourcepackages(distro, packages_map, package_root, importer_handler)
     importer_handler.commit()
 
     # XXX JeroenVermeulen 2011-09-07 bug=843728: Dominate binaries as well.
@@ -133,23 +133,26 @@
             log.exception("Database setup required for run on %s", archtag)
             sys.exit(1)
 
-    import_binarypackages(packages_map, package_root, importer_handler)
+    import_binarypackages(distro, packages_map, package_root, importer_handler)
     importer_handler.commit()
 
 
-def attempt_source_package_import(source, package_root, importer_handler):
+def attempt_source_package_import(distro, source, package_root,
+                                  importer_handler):
     """Attempt to import a source package, and handle typical errors."""
     package_name = source.get("Package", "unknown")
     try:
         try:
-            do_one_sourcepackage(source, package_root, importer_handler)
+            do_one_sourcepackage(
+                distro, source, package_root, importer_handler)
         except psycopg2.Error:
             log.exception(
                 "Database error: unable to create SourcePackage for %s. "
                 "Retrying once..", package_name)
             importer_handler.abort()
             time.sleep(15)
-            do_one_sourcepackage(source, package_root, importer_handler)
+            do_one_sourcepackage(
+                distro, source, package_root, importer_handler)
     except (
         InvalidVersionError, MissingRequiredArguments,
         DisplayNameDecodingError):
@@ -168,7 +171,8 @@
             "Database duplication processing %s", package_name)
 
 
-def import_sourcepackages(packages_map, package_root, importer_handler):
+def import_sourcepackages(distro, packages_map, package_root,
+                          importer_handler):
     # Goes over src_map importing the sourcepackages packages.
     count = 0
     npacks = len(packages_map.src_map)
@@ -178,25 +182,26 @@
         for source in packages_map.src_map[package]:
             count += 1
             attempt_source_package_import(
-                source, package_root, importer_handler)
+                distro, source, package_root, importer_handler)
             if COUNTDOWN and (count % COUNTDOWN == 0):
                 log.warn('%i/%i sourcepackages processed', count, npacks)
 
 
-def do_one_sourcepackage(source, package_root, importer_handler):
+def do_one_sourcepackage(distro, source, package_root, importer_handler):
     source_data = SourcePackageData(**source)
     if importer_handler.preimport_sourcecheck(source_data):
         # Don't bother reading package information if the source package
         # already exists in the database
         log.info('%s already exists in the archive', source_data.package)
         return
-    source_data.process_package(package_root)
+    source_data.process_package(distro, package_root)
     source_data.ensure_complete()
     importer_handler.import_sourcepackage(source_data)
     importer_handler.commit()
 
 
-def import_binarypackages(packages_map, package_root, importer_handler):
+def import_binarypackages(distro, packages_map, package_root,
+                          importer_handler):
     nosource = []
 
     # Run over all the architectures we have
@@ -212,7 +217,8 @@
             try:
                 try:
                     do_one_binarypackage(
-                        binary, archtag, package_root, importer_handler)
+                        distro, binary, archtag, package_root,
+                        importer_handler)
                 except psycopg2.Error:
                     log.exception(
                         "Database errors when importing a BinaryPackage "
@@ -220,7 +226,8 @@
                     importer_handler.abort()
                     time.sleep(15)
                     do_one_binarypackage(
-                        binary, archtag, package_root, importer_handler)
+                        distro, binary, archtag, package_root,
+                        importer_handler)
             except (InvalidVersionError, MissingRequiredArguments):
                 log.exception(
                     "Unable to create BinaryPackageData for %s", package_name)
@@ -257,12 +264,13 @@
                 log.warn(pkg)
 
 
-def do_one_binarypackage(binary, archtag, package_root, importer_handler):
+def do_one_binarypackage(distro, binary, archtag, package_root,
+                         importer_handler):
     binary_data = BinaryPackageData(**binary)
     if importer_handler.preimport_binarycheck(archtag, binary_data):
         log.info('%s already exists in the archive', binary_data.package)
         return
-    binary_data.process_package(package_root)
+    binary_data.process_package(distro, package_root)
     importer_handler.import_binarypackage(archtag, binary_data)
     importer_handler.commit()
 


Follow ups