launchpad-reviewers team mailing list archive

[Merge] ~cjwatson/launchpad:py3-sane-environment into launchpad:master

 

Colin Watson has proposed merging ~cjwatson/launchpad:py3-sane-environment into launchpad:master.

Commit message:
Adjust sane_environment countermeasures for Python 3

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/392140

The countermeasures in BasicLaunchpadRequest.__init__ against the recoding done by zope.publisher.http.sane_environment did not work on Python 3, where PATH_INFO arrives as text (str) rather than bytes.  Adjust them to decode PATH_INFO only when it is bytes, and to encode the result as UTF-8 in either case.  The behaviour on Python 2 remains unchanged.
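
For illustration only, the adjusted logic can be sketched as a standalone function.  This is a minimal sketch assuming Python 3; normalise_path_info is a hypothetical name that does not exist in Launchpad or Zope, and the real change lives inline in BasicLaunchpadRequest.__init__.

```python
def normalise_path_info(environ):
    """Return a copy of environ whose PATH_INFO is UTF-8 encoded bytes.

    Hypothetical helper mirroring the logic in the proposed change; it is
    not part of Launchpad or zope.publisher.
    """
    # Copy first so the caller's mapping is left untouched.
    environ = dict(environ)
    if 'PATH_INFO' not in environ:
        return environ
    pi = environ['PATH_INFO']
    if isinstance(pi, bytes):
        # Bytes input: replace undecodable sequences with U+FFFD instead
        # of raising UnicodeDecodeError.
        pi = pi.decode('utf-8', 'replace')
    # Text input (the Python 3 case) skips the decode step entirely.
    environ['PATH_INFO'] = pi.encode('utf-8')
    return environ


original = {'PATH_INFO': b'/foo\xff'}
from_bytes = normalise_path_info(original)
from_text = normalise_path_info({'PATH_INFO': '/caf\xe9'})
```

With bytes input the invalid byte is replaced by the UTF-8 encoding of U+FFFD; with text input the string is simply encoded.  The pre-change code called .decode() unconditionally, which is not available on str in Python 3.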
-- 
Your team Launchpad code reviewers is requested to review the proposed merge of ~cjwatson/launchpad:py3-sane-environment into launchpad:master.
diff --git a/lib/lp/services/webapp/servers.py b/lib/lp/services/webapp/servers.py
index f9a01b4..5c82ad0 100644
--- a/lib/lp/services/webapp/servers.py
+++ b/lib/lp/services/webapp/servers.py
@@ -575,14 +575,20 @@ class BasicLaunchpadRequest(LaunchpadBrowserRequestMixin):
         self.traversed_objects = []
         self._wsgi_keys = set()
         if 'PATH_INFO' in environ:
-            # Zope's sane_environment assumes that PATH_INFO is UTF-8 encoded.
-            # This next step replaces problems with U+FFFD to ensure
-            # a UnicodeDecodeError is not raised before OOPS error handling
-            # is available. This change will convert a 400 error to a 404
-            # because tranversal will raise NotFound when it encounters a
-            # non-ascii path part.
-            environ['PATH_INFO'] = environ['PATH_INFO'].decode(
-                'utf-8', 'replace').encode('utf-8')
+            # Zope's sane_environment (called by the superclass's __init__)
+            # takes PATH_INFO, which according to WSGI must be a native
+            # string containing only code points representable in
+            # ISO-8859-1, and recodes it to UTF-8.  However, we don't want
+            # it to raise UnicodeDecodeError before OOPS error handling is
+            # available, so replace problems with U+FFFD before it has a
+            # chance to recode anything.  This change will convert a 400
+            # error to a 404, because traversal will raise NotFound when it
+            # encounters a non-ASCII path part.
+            environ = dict(environ)
+            pi = environ['PATH_INFO']
+            if isinstance(pi, bytes):
+                pi = pi.decode('utf-8', 'replace')
+            environ['PATH_INFO'] = pi.encode('utf-8')
         super(BasicLaunchpadRequest, self).__init__(
             body_instream, environ, response)