← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/db-bug-752178 into lp:launchpad/db-devel

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/db-bug-752178 into lp:launchpad/db-devel.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #752178 in Launchpad itself: "PublishFTPMaster.runFinalizeParts creates a bad environment if there are multiple archives"
  https://bugs.launchpad.net/launchpad/+bug/752178

For more details, see:
https://code.launchpad.net/~jtv/launchpad/db-bug-752178/+merge/56719

= Summary =

The new python replacement for cron.publish-ftpmaster passes a variable $ARCHIVEROOTS to the run-parts scripts in finalize.d.  This is a list of values, but due to incorrect quoting, it would cause a second value (if any) in the list to be execute as a command instead of added to the variable.


== Proposed fix ==

Quote and escape variables where they are passed to the shell.


== Pre-implementation notes ==

Problem diagnosed by and discussed with William Grant.


== Implementation details ==

The quoting is still nowhere near perfect: what happens if there's a space in one of the configured archive roots?  Still, this should be an improvement over the old unquoted shell code.

A new end-to-end test makes sure that the ARCHIVEROOTS variable is actually usable in the run-parts scripts.  This should also serve as an example of how to deal with the quoting of these variables.  The test deliberately uses a filename with a space in it to guard against slipups and lucky escapes.


== Tests ==

{{{
./bin/test -vvc lp.archivepublisher.tests.test_publish_ftparchive
}}}


== Demo and Q/A ==

Run cronscripts/publish-ftparchive.py on dogfood, with the run_parts_location configuration item (under [archivepublisher]) set to "cronscripts/publishing/distro-parts".


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/30-copy-indices
  cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/20-remove-uncompressed-listings
  lib/lp/archivepublisher/scripts/publish_ftpmaster.py
  cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/10-sign-releases
  lib/lp/archivepublisher/tests/test_publish_ftpmaster.py
-- 
https://code.launchpad.net/~jtv/launchpad/db-bug-752178/+merge/56719
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/db-bug-752178 into lp:launchpad/db-devel.
=== modified file 'cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/10-sign-releases'
--- cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/10-sign-releases	2011-03-29 14:09:31 +0000
+++ cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/10-sign-releases	2011-04-07 10:11:43 +0000
@@ -1,8 +1,8 @@
 #!/bin/sh -e
 
-RELEASE_FILES=`find "$DISTSROOT".new -maxdepth 2 -name Release`
+RELEASE_FILES=`find $DISTSROOT.new -maxdepth 2 -name Release`
 DIST_UPGRADER_TARBALLS=`
-	find "$DISTSROOT".new"/*/*/dist-upgrader* -name "*.tar.gz" || true`
+	find $DISTSROOT.new/*/*/dist-upgrader* -name "*.tar.gz" || true`
 
 for CANDIDATE in $RELEASE_FILES $DIST_UPGRADER_TARBALLS
 do

=== modified file 'cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/20-remove-uncompressed-listings'
--- cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/20-remove-uncompressed-listings	2011-03-28 09:18:42 +0000
+++ cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/20-remove-uncompressed-listings	2011-04-07 10:11:43 +0000
@@ -8,4 +8,4 @@
 # It's safe to do this since the uncompressed MD5 hashes have already been
 # computed for inclusion in the Release files.
 
-find "$DISTSROOT".new \( -name -o -name Sources \) -exec rm -f -- "{}" \;
+find $DISTSROOT.new \( -name -o -name Sources \) -exec rm -f -- "{}" \;

=== modified file 'cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/30-copy-indices'
--- cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/30-copy-indices	2011-03-28 09:18:42 +0000
+++ cronscripts/publishing/distro-parts/ubuntu/publish-distro.d/30-copy-indices	2011-04-07 10:11:43 +0000
@@ -2,7 +2,7 @@
 
 echo "$(date -R): Copying the indices into place."
 
-INDICES="$ARCHIVEROOT/indices"
+INDICES=$ARCHIVEROOT/indices
 
-rm -f -- "$INDICES/override"
-cp -- "$OVERRIDEROOT"/override.* "$INDICES/"
+rm -f -- $INDICES/override
+cp -- $OVERRIDEROOT/override.* $INDICES/

=== modified file 'lib/lp/archivepublisher/scripts/publish_ftpmaster.py'
--- lib/lp/archivepublisher/scripts/publish_ftpmaster.py	2011-04-01 07:12:44 +0000
+++ lib/lp/archivepublisher/scripts/publish_ftpmaster.py	2011-04-07 10:11:43 +0000
@@ -43,11 +43,46 @@
     return boolean_text[boolean_value]
 
 
-def compose_env_string(env):
-    """Turn a dict into a series of shell parameter assignments."""
+def shell_quote(literal):
+    """Escape `literal` for use in a double-quoted shell string.
+
+    This is a pretty naive substitution: it doesn't deal well with
+    non-ASCII characters or special characters.
+    """
+    # Characters that need backslash-escaping.  Do the backslash itself
+    # first; any escapes we introduced *before* the backslash would have
+    # their own backslashes escaped afterwards when we got to the
+    # backslash itself.
+    special_characters = '\\"$`\n'
+    for escapee in special_characters:
+        literal = literal.replace(escapee, '\\' + escapee)
+    return '"%s"' % literal
+
+
+def compose_env_string(*env_dicts):
+    """Turn dict(s) into a series of shell parameter assignments.
+
+    Values in later dicts override any values with the same respective
+    keys in earlier dicts.
+    """
+    env = {}
+    for env_dict in env_dicts:
+        env.update(env_dict)
     return ' '.join(['='.join(pair) for pair in env.iteritems()])
 
 
+def extend_PATH():
+    """Produce env dict for extending $PATH.
+
+    Adds the Launchpad source tree's cronscripts/publishing to the
+    existing $PATH.
+
+    :return: a dict suitable for passing to compose_env_string.
+    """
+    scripts_dir = os.path.join(config.root, "cronscripts", "publishing")
+    return {"PATH": '"$PATH":%s' % shell_quote(scripts_dir)}
+
+
 def get_distscopyroot(archive_config):
     """Return the distscopy root directory for `archive_config`."""
     return archive_config.archiveroot + "-distscopy"
@@ -256,8 +291,8 @@
         """Execute the publish-distro hooks."""
         archive_config = self.configs[archive.purpose]
         env = {
-            'DISTSROOT': archive_config.distsroot,
-            'ARCHIVEROOT': archive_config.archiveroot,
+            'DISTSROOT': shell_quote(archive_config.distsroot),
+            'ARCHIVEROOT': shell_quote(archive_config.archiveroot),
             }
         self.runParts('publish-distro.d', env)
 
@@ -305,11 +340,10 @@
         if not config.archivepublisher.run_commercial_compat:
             return
 
-        self.executeShell("""
-            env PATH="$PATH:%s/cronscripts/publishing" \
-                LPCONFIG="%s" \
-                commercial-compat.sh
-            """ % (config.root, config.instance_name))
+        env = {"LPCONFIG": shell_quote(config.instance_name)}
+        self.executeShell(
+            "env %s commercial-compat.sh"
+            % compose_env_string(env, extend_PATH()))
 
     def generateListings(self):
         """Create ls-lR.gz listings."""
@@ -361,22 +395,21 @@
         if parts_dir is None:
             self.logger.debug("Skipping run-parts %s: not configured.", parts)
             return
-        total_env_string = ' '.join([
-            "PATH=\"$PATH:%s/cronscripts/publishing\"" % config.root,
-            compose_env_string(env),
-            ])
+        env_string = compose_env_string(env, extend_PATH())
         self.executeShell(
-            "env %s run-parts -- '%s'" % (total_env_string, parts_dir),
+            "env %s run-parts -- '%s'" % (env_string, parts_dir),
             failure=LaunchpadScriptFailure(
                 "Failure while executing run-parts %s." % parts_dir))
 
     def runFinalizeParts(self, security_only=False):
         """Run the finalize.d parts to finalize publication."""
+        archive_roots = shell_quote(' '.join([
+            archive_config.archiveroot
+            for archive_config in self.configs.itervalues()]))
+
         env = {
             'SECURITY_UPLOAD_ONLY': compose_shell_boolean(security_only),
-            'ARCHIVEROOTS': ' '.join([
-                archive_config.archiveroot
-                for archive_config in self.configs.itervalues()]),
+            'ARCHIVEROOTS': archive_roots,
         }
         self.runParts('finalize.d', env)
 

=== modified file 'lib/lp/archivepublisher/tests/test_publish_ftpmaster.py'
--- lib/lp/archivepublisher/tests/test_publish_ftpmaster.py	2011-03-31 11:25:53 +0000
+++ lib/lp/archivepublisher/tests/test_publish_ftpmaster.py	2011-04-07 10:11:43 +0000
@@ -16,6 +16,7 @@
     LaunchpadZopelessLayer,
     ZopelessDatabaseLayer,
     )
+from lp.archivepublisher.config import getPubConfig
 from lp.archivepublisher.interfaces.publisherconfig import IPublisherConfigSet
 from lp.registry.interfaces.pocket import (
     PackagePublishingPocket,
@@ -33,6 +34,7 @@
     compose_shell_boolean,
     find_run_parts_dir,
     PublishFTPMaster,
+    shell_quote,
     )
 from lp.soyuz.tests.test_publishing import SoyuzTestPublisher
 from lp.testing import (
@@ -101,7 +103,7 @@
 
 class TestPublishFTPMasterHelpers(TestCase):
 
-    def test_compose_env_string_iterates_env(self):
+    def test_compose_env_string_iterates_env_dict(self):
         env = {
             "A": "1",
             "B": "2",
@@ -109,6 +111,27 @@
         env_string = compose_env_string(env)
         self.assertIn(env_string, ["A=1 B=2", "B=2 A=1"])
 
+    def test_compose_env_string_combines_env_dicts(self):
+        env1 = {"A": "1"}
+        env2 = {"B": "2"}
+        env_string = compose_env_string(env1, env2)
+        self.assertIn(env_string, ["A=1 B=2", "B=2 A=1"])
+
+    def test_compose_env_string_overrides_repeated_keys(self):
+        self.assertEqual("A=2", compose_env_string({"A": "1"}, {"A": "2"}))
+
+    def test_shell_quote_quotes_string(self):
+        self.assertEqual('"x"', shell_quote("x"))
+
+    def test_shell_quote_escapes_string(self):
+        self.assertEqual('"\\\\"', shell_quote("\\"))
+
+    def test_shell_quote_does_not_escape_its_own_escapes(self):
+        self.assertEqual('"\\$"', shell_quote("$"))
+
+    def test_shell_quote_escapes_entire_string(self):
+        self.assertEqual('"\\$\\$\\$"', shell_quote("$$$"))
+
     def test_compose_shell_boolean_shows_True_as_yes(self):
         self.assertEqual("yes", compose_shell_boolean(True))
 
@@ -223,6 +246,26 @@
             """))
         self.addCleanup(config.pop, "commercial-compat")
 
+    def installRunPartsScript(self, distro, parts_dir, script_code):
+        """Set up a run-parts script, and configure it to run.
+
+        :param distro: The `Distribution` you're testing on.  Must have
+            a temporary directory as its publishing root directory.
+        :param parts_dir: The run-parts subdirectory to execute:
+            publish-distro.d or finalize.d.
+        :param script_code: The code to go into the script.
+        """
+        distro_config = get_pub_config(distro)
+        parts_base = os.path.join(distro_config.root_dir, "distro-parts")
+        self.enableRunParts(parts_base)
+        script_dir = os.path.join(parts_base, distro.name, parts_dir)
+        os.makedirs(script_dir)
+        script_path = os.path.join(script_dir, self.factory.getUniqueString())
+        script_file = file(script_path, "w")
+        script_file.write(script_code)
+        script_file.close()
+        os.chmod(script_path, 0755)
+
     def test_script_runs_successfully(self):
         ubuntu = self.prepareUbuntu()
         self.layer.txn.commit()
@@ -697,3 +740,49 @@
         script.runFinalizeParts = FakeMethod()
         script.publishAllUploads()
         self.assertEqual(1, script.runFinalizeParts.call_count)
+
+    def test_runFinalizeParts_quotes_archiveroots(self):
+        # Passing ARCHIVEROOTS to the finalize.d scripts is a bit
+        # difficult because the variable holds multiple values in a
+        # single, double-quoted string.  Escaping and quoting a sequence
+        # of escaped and quoted items won't work.
+        # This test establishes how a script can sanely deal with the
+        # list.  It'll probably go wrong if the configured archive root
+        # contains spaces and such, but should work with Unix-sensible
+        # paths.
+        distro = self.makeDistro()
+        self.factory.makeArchive(
+            distribution=distro, purpose=ArchivePurpose.PARTNER)
+        script = self.makeScript(distro)
+        script.setUp()
+        script.setUpDirs()
+
+        # Create a run-parts script that creates marker files in each of
+        # the archive roots, and writes an expected string to them.
+        # Doesn't write to a marker file that already exists, because it
+        # might be a sign that the path it received is ridiculously
+        # wrong.  Don't want to go overwriting random files now do we?
+        self.installRunPartsScript(distro, "finalize.d", dedent("""\
+            #!/bin/sh -e
+            MARKER_NAME="marker file"
+            for DIRECTORY in $ARCHIVEROOTS
+            do
+                MARKER="$DIRECTORY/$MARKER_NAME"
+                if [ -e "$MARKER" ]
+                then
+                    echo "Marker file $MARKER already exists." >&2
+                    exit 1
+                fi
+                echo "This is an archive root." >"$MARKER"
+            done
+            """))
+
+        script.runFinalizeParts()
+
+        for archive in [distro.main_archive, distro.getArchive("partner")]:
+            archive_root = getPubConfig(archive).archiveroot
+            self.assertEqual(
+                "This is an archive root.",
+                self.readMarkerFile([archive_root, "marker file"]).rstrip(),
+                "Did not find expected marker for %s."
+                % archive.purpose.title)


Follow ups