checkbox-dev team mailing list archive

[PATCH] plainbox:secure:rfc822: fix parsing multi-line values

 

Recent changes to the parser introduced this mistake:

              # Strip the whitespace from the right side
-            line = line.rstrip()
+            line = line.strip()

Instead of stripping whitespace only at the end of the line, we were stripping
it on both sides. This caused us to lose any information about nested
indentation, such as the indentation that matters when parsing local jobs.

This patch corrects this flaw.
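
To illustrate the difference, here is a minimal sketch (not part of the
patch; the sample lines are hypothetical continuation lines modeled after
the test data below):

```python
# Hypothetical continuation lines of a nested multi-line value.
lines = [" even:\n", "  more\n", "  text\n"]

# strip() removes whitespace on both sides, flattening the nesting.
both_sides = [line.strip() for line in lines]
# rstrip() removes only trailing whitespace, keeping the indentation.
right_only = [line.rstrip() for line in lines]

print(both_sides)  # ['even:', 'more', 'text']
print(right_only)  # [' even:', '  more', '  text']
```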

Recent changes to the parser introduced another mistake:

+            if line.startswith(" ."):
+                line = line[2:]

This effectively made the "newline escape sequence" applicable only to
the simple case of one leading space. It did *not* work when there were any
other spaces (for instance, when the escape sequence itself was
quoted in some other multi-line value).

That code is now replaced with:

+            if line.lstrip().startswith("."):
+                line = line.replace('.', '', 1)

Here any amount of leading whitespace is considered valid (earlier code,
not listed here, ensures that at least one leading space is present). Moreover,
the whitespace is preserved, so subsequent parsing of the (presumably) nested
multi-line value will work as intended.
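
As a rough comparison of the two variants, here is a standalone sketch. The
helper names unescape_old and unescape_new are hypothetical and exist only
for this illustration; in the parser the logic is inline in
gen_rfc822_records:

```python
def unescape_old(line):
    # Earlier behaviour: only an exact " ." prefix is recognised,
    # and both the space and the dot are dropped.
    if line.startswith(" ."):
        line = line[2:]
    return line


def unescape_new(line):
    # Patched behaviour: any leading whitespace followed by a dot matches;
    # only the first dot is removed and the whitespace is kept.
    if line.lstrip().startswith("."):
        line = line.replace('.', '', 1)
    return line


for sample in [" .", "   .", "   .text"]:
    print(repr(unescape_old(sample)), repr(unescape_new(sample)))
# ''         ' '
# '   .'     '   '
# '   .text' '   text'
```

With deeper indentation the old variant leaves the dot in place, while the
new one removes it. The rstrip() call that follows in the parser then turns
a whitespace-only result into an empty line.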

This patch adds a few test cases, modeled after the data that triggered
discovery of this bug.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@xxxxxxxxxxxxx>
---
 plainbox/plainbox/impl/secure/rfc822.py      | 13 +++++----
 plainbox/plainbox/impl/secure/test_rfc822.py | 42 ++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/plainbox/plainbox/impl/secure/rfc822.py b/plainbox/plainbox/impl/secure/rfc822.py
index 7e7f339..c06c66d 100644
--- a/plainbox/plainbox/impl/secure/rfc822.py
+++ b/plainbox/plainbox/impl/secure/rfc822.py
@@ -470,14 +470,15 @@ def gen_rfc822_records(stream, data_cls=dict, source=None):
             if key is None:
                 # If we have not seen any keys yet then this is a syntax error
                 raise _syntax_error("Unexpected multi-line value")
-            # If the line is is composed of a leading space and a dot the strip
-            # those away. This allows us to support a generic escape sequence
-            # after which any characters can be injected (until the end of the
+            # If the line is composed of leading spaces and a dot
+            # then remove the dot without touching the spaces.
+            # This allows us to support a generic escape sequence after
+            # which any characters can be injected (until the end of the
             # line), including empty lines, lines any number of dots.
-            if line.startswith(" ."):
-                line = line[2:]
+            if line.lstrip().startswith("."):
+                line = line.replace('.', '', 1)
             # Strip the whitespace from the right side
-            line = line.strip()
+            line = line.rstrip()
             # Append the current line to the list of values of the most recent
             # key. This prevents quadratic complexity of string concatenation
             value_list.append(line)
diff --git a/plainbox/plainbox/impl/secure/test_rfc822.py b/plainbox/plainbox/impl/secure/test_rfc822.py
index 807582c..90e57c7 100644
--- a/plainbox/plainbox/impl/secure/test_rfc822.py
+++ b/plainbox/plainbox/impl/secure/test_rfc822.py
@@ -414,6 +414,18 @@ class RFC822ParserTests(TestCase):
         self.assertEqual(len(records), 1)
         self.assertEqual(records[0].data, {'key': 'longer\n\nvalue'})
 
+    def test_multiline_value_with_space__deep_indent(self):
+        text = (
+            "key:\n"
+            "       longer\n"
+            "       .\n"
+            "       value\n"
+        )
+        with StringIO(text) as stream:
+            records = type(self).loader(stream)
+        self.assertEqual(len(records), 1)
+        self.assertEqual(records[0].data, {'key': 'longer\n\nvalue'})
+
     def test_multiline_value_with_period(self):
         text = (
             "key:\n"
@@ -442,6 +454,36 @@ class RFC822ParserTests(TestCase):
         self.assertEqual(records[0].data, {'key1': 'initial\nlonger\nvalue 1'})
         self.assertEqual(records[1].data, {'key2': 'longer\nvalue 2'})
 
+    def test_proper_parsing_nested_multiline(self):
+        text = (
+            "key:"
+            " nested: stuff\n"
+            " even:\n"
+            "  more\n"
+            "  text\n"
+        )
+        with StringIO(text) as stream:
+            records = type(self).loader(stream)
+        self.assertEqual(len(records), 1)
+        self.assertEqual(
+            records[0].data,
+            {'key': 'nested: stuff\neven:\n more\n text'})
+
+    def test_proper_parsing_nested_multiline__deep_indent(self):
+        text = (
+            "key:"
+            "        nested: stuff\n"
+            "        even:\n"
+            "           more\n"
+            "           text\n"
+        )
+        with StringIO(text) as stream:
+            records = type(self).loader(stream)
+        self.assertEqual(len(records), 1)
+        self.assertEqual(
+            records[0].data,
+            {'key': 'nested: stuff\neven:\n   more\n   text'})
+
     def test_irrelevant_whitespace(self):
         text = "key :  value  "
         with StringIO(text) as stream:
-- 
1.8.5.3