← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~stevenk/launchpad/handle-invalid-unicode-in-query-string into lp:launchpad

 

Steve Kowalik has proposed merging lp:~stevenk/launchpad/handle-invalid-unicode-in-query-string into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #897053 in Launchpad itself: "apport generated urls can trigger UnicodeDecodeError"
  https://bugs.launchpad.net/launchpad/+bug/897053

For more details, see:
https://code.launchpad.net/~stevenk/launchpad/handle-invalid-unicode-in-query-string/+merge/146029

If BrowserRequest._decode() fails to decode a string using the users charsets, it will leave it as a string. Which means that an invalid utf-8 bare string can be tossed into a TextWidget, which expects unicode, and blows up with a OOPS when it attempts to decode it.

Now, if we get back a string from BrowserRequest._decode(), we will forcibly decode it to utf-8 with replacements.
-- 
https://code.launchpad.net/~stevenk/launchpad/handle-invalid-unicode-in-query-string/+merge/146029
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~stevenk/launchpad/handle-invalid-unicode-in-query-string into lp:launchpad.
=== modified file 'lib/lp/services/webapp/servers.py'
--- lib/lp/services/webapp/servers.py	2013-01-07 02:40:55 +0000
+++ lib/lp/services/webapp/servers.py	2013-02-01 02:59:23 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009-2012 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2013 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 """Definition of the internet servers that Launchpad uses."""
@@ -657,6 +657,15 @@
         """As per zope.publisher.browser.BrowserRequest._createResponse"""
         return LaunchpadBrowserResponse()
 
+    def _decode(self, text):
+        text = super(LaunchpadBrowserRequest, self)._decode(text)
+        if isinstance(text, str):
+            # BrowserRequest._decode failed to do so with the user-specified
+            # charsets, so decode as UTF-8 with replacements, since we always
+            # want unicode.
+            text = unicode(text, 'utf-8', 'replace')
+        return text
+
     @cachedproperty
     def form_ng(self):
         """See ILaunchpadBrowserApplicationRequest."""

=== modified file 'lib/lp/services/webapp/tests/test_servers.py'
--- lib/lp/services/webapp/tests/test_servers.py	2013-01-07 02:40:55 +0000
+++ lib/lp/services/webapp/tests/test_servers.py	2013-02-01 02:59:23 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009-2012 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2013 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 __metaclass__ = type
@@ -50,6 +50,7 @@
     EventRecorder,
     TestCase,
     )
+from lp.testing.layers import FunctionalLayer
 
 
 class SetInWSGIEnvironmentTestCase(TestCase):
@@ -230,8 +231,7 @@
 
         for method in denied_methods:
             env = self.wsgi_env(self.non_api_path, method)
-            self.assert_(self.factory.canHandle(env),
-                "Sanity check")
+            self.assert_(self.factory.canHandle(env), "Sanity check")
             # Returns a tuple of (request_factory, publication_factory).
             rfactory, pfactory = self.factory.checkRequest(env)
             self.assert_(rfactory is not None,
@@ -352,6 +352,8 @@
 class TestBasicLaunchpadRequest(TestCase):
     """Tests for the base request class"""
 
+    layer = FunctionalLayer
+
     def test_baserequest_response_should_vary(self):
         """Test that our base response has a proper vary header."""
         request = LaunchpadBrowserRequest(StringIO.StringIO(''), {})
@@ -387,6 +389,14 @@
         request = LaunchpadBrowserRequest(StringIO.StringIO(''), env)
         self.assertEquals(u'fnord/trunk\ufffd', request.getHeader('PATH_INFO'))
 
+    def test_request_with_invalid_query_string_recovers(self):
+        # When the query string has invalid utf-8, it is decoded with
+        # replacement.
+        env = {'QUERY_STRING': 'field.title=subproc\xe9s '}
+        request = LaunchpadBrowserRequest(StringIO.StringIO(''), env)
+        self.assertEquals(
+            [u'subproc\ufffds '], request.query_string_params['field.title'])
+
 
 class TestFeedsBrowserRequest(TestCase):
     """Tests for `FeedsBrowserRequest`."""