← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~wallyworld/launchpad/codehost-error-trackback-leak into lp:launchpad

 

Ian Booth has proposed merging lp:~wallyworld/launchpad/codehost-error-trackback-leak into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)


= Summary =

Bugs 674305 and 675517 relate to a situation whereby the xmlrpc server was misbehaving and bzr users who were trying to push or commit branches saw tracebacks printed on their consoles. This branch is to stop the traceback from being printed in front of the user - instead they get a message with an oops id.

= Implementation =

The problem was that the branchChanged() method on class LaunchpadServer merely added Twisted's log.err as an error callback to the relevant deferred instance:

return deferred.addCallback(got_path_info).addErrback(log.err)

log.err writes traceback info etc to stderr which I believe the smart server just catches and passes back to the user.
***
Actually, the above line was true when this branch was created. Creating this merge proposal resulted in conflicts in the diff and so after merging from trunk it is apparent there have been changes to Twisted infrastructure. log.err no longer outputs to stderr (when running the tests) so an explicit write to stderr has been added as well as doing the logging via log.err. I assume this changed behaviour of log.err will also be reflected in production.
***

So, what I've done is:

- create a new OopsOccurred(LaunchpadFault) class in canonical.launchpad.xmlrpc.faults
- implement a new error handler which:
  * creates an oops
  * creates and returns an OopsOccurred instance
  * use log.err and also write to stderr to record repr(OopsOccurred) so the user sees a nicer error message

The new oops logged by LaunchpadServer is the one whose oopid is reported to the user. The error report for this oops will contain a traceback and error message from the root cause xmlrpc error.

There are also a number of drive by lint fixes, mainly inserting newlines above method defs.

= Tests =

A new test class was added to TestBranchChangedErrorHandling. The setup for this class installs a replacement codehosting_api.branchChanged endpoint which generates an oops and raises an xmlrpc fault. The test captures stderr and ensures the expected user friendly text is output. It also checks the contents of the error report to ensure the expected oopsid is included.

  bin/test -vvt test_branchfs

= Lint =

Linting changed files:
  lib/canonical/launchpad/xmlrpc/faults.py
  lib/lp/codehosting/vfs/branchfs.py
  lib/lp/codehosting/vfs/tests/test_branchfs.py

./lib/lp/codehosting/vfs/tests/test_branchfs.py
    1168: E501 line too long (81 characters)
    1168: Line exceeds 78 characters.


-- 
https://code.launchpad.net/~wallyworld/launchpad/codehost-error-trackback-leak/+merge/41685
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~wallyworld/launchpad/codehost-error-trackback-leak into lp:launchpad.
=== modified file 'lib/canonical/launchpad/xmlrpc/faults.py'
--- lib/canonical/launchpad/xmlrpc/faults.py	2010-07-02 12:10:21 +0000
+++ lib/canonical/launchpad/xmlrpc/faults.py	2010-11-27 05:11:16 +0000
@@ -22,6 +22,7 @@
     'InvalidBranchUniqueName',
     'InvalidProductIdentifier',
     'InvalidBranchUrl',
+    'OopsOccurred',
     'NoBranchWithID',
     'NoLinkedBranch',
     'NoSuchBranch',
@@ -461,3 +462,15 @@
 
     def __init__(self, openid_identifier):
         LaunchpadFault.__init__(self, openid_identifier=openid_identifier)
+
+
+class OopsOccurred(LaunchpadFault):
+    """An oops has occurred performing the requested operation."""
+
+    error_code = 380
+    msg_template = ('An unexpected error has occurred while %(server_op)s. '
+                    'Please report a Launchpad bug and quote: %(oopsid)s.')
+
+    def __init__(self, server_op, oopsid):
+        LaunchpadFault.__init__(self, server_op=server_op, oopsid=oopsid)
+        self.oopsid = oopsid

=== modified file 'lib/lp/codehosting/vfs/branchfs.py'
--- lib/lp/codehosting/vfs/branchfs.py	2010-09-22 18:32:22 +0000
+++ lib/lp/codehosting/vfs/branchfs.py	2010-11-27 05:11:16 +0000
@@ -61,6 +61,7 @@
     'LaunchpadServer',
     ]
 
+import sys
 import xmlrpclib
 
 from bzrlib import urlutils
@@ -80,7 +81,10 @@
 from bzrlib.transport import get_transport
 from bzrlib.transport.memory import MemoryServer
 from lazr.uri import URI
-from twisted.internet import defer
+from twisted.internet import (
+    defer,
+    error,
+    )
 from twisted.python import (
     failure,
     log,
@@ -92,6 +96,7 @@
     )
 
 from canonical.config import config
+from canonical.launchpad.webapp import errorlog
 from canonical.launchpad.xmlrpc import faults
 from lp.code.enums import BranchType
 from lp.code.interfaces.branchlookup import IBranchLookup
@@ -141,7 +146,7 @@
 # directories that Bazaar creates as part of regular operation. We support
 # only two numbered backups to avoid indefinite space usage.
 ALLOWED_DIRECTORIES = ('.bzr', '.bzr.backup', 'backup.bzr', 'backup.bzr.~1~',
-	'backup.bzr.~2~')
+    'backup.bzr.~2~')
 FORBIDDEN_DIRECTORY_ERROR = (
     "Cannot create '%s'. Only Bazaar branches are allowed.")
 
@@ -500,10 +505,12 @@
         # Launchpad branch.
         deferred = AsyncVirtualTransport._getUnderylingTransportAndPath(
             self, relpath)
+
         def maybe_make_branch_in_db(failure):
             # Looks like we are trying to make a branch.
             failure.trap(NoSuchFile)
             return self.server.createBranch(self._abspath(relpath))
+
         def real_mkdir((transport, path)):
             return getattr(transport, 'mkdir')(path, mode)
 
@@ -687,10 +694,42 @@
                 data['id'], stacked_on_url, last_revision,
                 control_string, branch_string, repository_string)
 
-        # It gets really confusing if we raise an exception from this method
-        # (the branch remains locked, but this isn't obvious to the client) so
-        # just log the error, which will result in an OOPS being logged.
-        return deferred.addCallback(got_path_info).addErrback(log.err)
+        def handle_error(failure=None, **kw):
+            # It gets really confusing if we raise an exception from this
+            # method (the branch remains locked, but this isn't obvious to
+            # the client). We could just log the failure using Twisted's
+            # log.err but this results in text containing traceback
+            # information etc being written to stderr. Since stderr is
+            # displayed to the user, if for example they arrive at this point
+            # via the smart server, we want to ensure that the message is
+            # sanitised. So what we will do is raise an oops and ask the user
+            # to log a bug with the oops information.
+            # See bugs 674305 and 675517 for details.
+
+            request = errorlog.ScriptRequest([
+                ('source', virtual_url_fragment),
+                ('error-explanation', failure.getErrorMessage())])
+            self.unexpectedError(failure, request)
+            fault = faults.OopsOccurred("updating a Launchpad branch",
+                                        request.oopsid)
+            # Twisted's log.err used to write to stderr but it doesn't now so
+            # we will write to stderr as well as log.err.
+            print >> sys.stderr, repr(fault)
+            log.err(repr(fault))
+            return fault
+        return deferred.addCallback(got_path_info).addErrback(handle_error)
+
+    def unexpectedError(self, failure, request=None, now=None):
+        # If the sub-process exited abnormally, the stderr it produced is
+        # probably a much more interesting traceback than the one attached to
+        # the Failure we've been passed.
+        tb = None
+        if failure.check(error.ProcessTerminated):
+            tb = getattr(failure, 'error', None)
+        if tb is None:
+            tb = failure.getTraceback()
+        errorlog.globalErrorUtility.raising(
+            (failure.type, failure.value, tb), request, now)
 
 
 def get_lp_server(user_id, codehosting_endpoint_url=None, branch_url=None,

=== modified file 'lib/lp/codehosting/vfs/tests/test_branchfs.py'
--- lib/lp/codehosting/vfs/tests/test_branchfs.py	2010-10-30 22:53:08 +0000
+++ lib/lp/codehosting/vfs/tests/test_branchfs.py	2010-11-27 05:11:16 +0000
@@ -7,7 +7,12 @@
 
 __metaclass__ = type
 
+import codecs
 import os
+import re
+import sys
+import xmlrpclib
+from StringIO import StringIO
 
 from bzrlib import errors
 from bzrlib.bzrdir import (
@@ -43,6 +48,8 @@
 
 from twisted.internet import defer
 
+from canonical.launchpad.webapp import errorlog
+from canonical.launchpad.webapp.errorlog import ErrorReportingUtility
 from canonical.testing.layers import (
     ZopelessDatabaseLayer,
     )
@@ -68,6 +75,7 @@
     UnknownTransportType,
     )
 from lp.codehosting.vfs.transport import AsyncVirtualTransport
+from lp.services.job.runner import TimeoutError
 from lp.testing import (
     TestCase,
     TestCaseWithFactory,
@@ -255,6 +263,7 @@
         deferred = self.server.translateVirtualPath(
             '~%s/%s/.bzr/control.conf'
             % (branch.owner.name, branch.product.name))
+
         def check_control_file((transport, path)):
             self.assertEqual(
                 'default_stack_on = /%s\n' % branch.unique_name,
@@ -273,6 +282,7 @@
 
         deferred = self.server.translateVirtualPath(
             '%s/.bzr/README' % (branch.unique_name,))
+
         def check_branch_transport((transport, path)):
             self.assertEqual(expected_path, path)
             # Can't test for equality of transports, since URLs and object
@@ -289,6 +299,7 @@
         branch = self.factory.makeAnyBranch(branch_type=BranchType.HOSTED)
         deferred = self.server.translateVirtualPath(
             '%s/.bzr/README' % (branch.unique_name,))
+
         def check_branch_transport((transport, path)):
             self.assertEqual('.bzr/README', path)
             self.assertEqual(True, transport.is_readonly())
@@ -300,6 +311,7 @@
         branch_url = '/~%s/no-such-product/new-branch' % (self.requester.name)
         deferred = self.server.createBranch(branch_url)
         deferred = assert_fails_with(deferred, errors.PermissionDenied)
+
         def check_exception(exception):
             self.assertEqual(branch_url, exception.path)
             self.assertEqual(
@@ -328,6 +340,7 @@
 
         deferred = self.server.translateVirtualPath(
             '/%s/.bzr/README' % (branch.unique_name,))
+
         def check_branch_transport((transport, path)):
             self.assertEqual(expected_path, path)
             # Can't test for equality of transports, since URLs and object
@@ -393,7 +406,6 @@
             'lp-test://', MemoryTransport())
 
 
-
 class TestAsyncVirtualTransport(TestCaseInTempDir):
     """Tests for `AsyncVirtualTransport`."""
 
@@ -607,6 +619,7 @@
         deferred = self._ensureDeferred(
             transport.readv, '%s/.bzr/hello.txt' % branch.unique_name,
             [(3, 2)])
+
         def get_chunk(generator):
             return generator.next()[1]
         deferred.addCallback(get_chunk)
@@ -622,6 +635,7 @@
         deferred = self._ensureDeferred(
             transport.put_bytes,
             '%s/.bzr/goodbye.txt' % branch.unique_name, "Goodbye")
+
         def check_bytes_written(ignored):
             self.assertEqual(
                 "Goodbye", backing_transport.get_bytes('goodbye.txt'))
@@ -795,6 +809,7 @@
     run_tests_with = AsynchronousDeferredRunTest
 
     def _ensureDeferred(self, function, *args, **kwargs):
+
         def call_function_and_check_not_deferred():
             ret = function(*args, **kwargs)
             self.assertFalse(
@@ -1004,6 +1019,109 @@
         self.assertFormatStringsPassed(branch)
 
 
+class TestBranchChangedErrorHandling(TestCaseWithTransport, TestCase):
+    """Test handling of errors when branchChange is called."""
+
+    def setUp(self):
+        TestCaseWithTransport.setUp(self)
+        self._server = None
+        frontend = InMemoryFrontend()
+        self.factory = frontend.getLaunchpadObjectFactory()
+        self.codehosting_api = frontend.getCodehostingEndpoint()
+        self.codehosting_api.branchChanged = self._replacement_branchChanged
+        self.requester = self.factory.makePerson()
+        self.backing_transport = MemoryTransport()
+        self.disable_directory_isolation()
+
+        # Trap stderr.
+        self._real_stderr = sys.stderr
+        sys.stderr = codecs.getwriter('utf8')(StringIO())
+
+        # To record generated oopsids
+        self.generated_oopsids = []
+
+    def tearDown(self):
+        sys.stderr = self._real_stderr
+        TestCaseWithTransport.tearDown(self)
+
+    def _replacement_branchChanged(self, user_id, branch_id, stacked_on_url,
+                                   last_revision, *format_strings):
+        """Log an oops and raise an xmlrpc fault."""
+
+        request = errorlog.ScriptRequest([
+                ('source', branch_id),
+                ('error-explanation', "An error occurred")])
+        try:
+            raise TimeoutError()
+        except TimeoutError:
+            f = sys.exc_info()
+            report = errorlog.globalErrorUtility.raising(f, request)
+            # Record the id for checking later.
+            self.generated_oopsids.append(report.id)
+            raise xmlrpclib.Fault(-1, report)
+
+    def get_server(self):
+        if self._server is None:
+            self._server = LaunchpadServer(
+                XMLRPCWrapper(self.codehosting_api), self.requester.id,
+                self.backing_transport)
+            self._server.start_server()
+            self.addCleanup(self._server.stop_server)
+        return self._server
+
+    def test_branchChanged_stderr_text(self):
+        # An unexpected error invoking branchChanged() results in a user
+        # friendly error printed to stderr (and not a traceback).
+
+        # Unlocking a branch calls branchChanged x 2 on the branch filesystem
+        # endpoint. We will then check the error handling.
+        db_branch = self.factory.makeAnyBranch(
+            branch_type=BranchType.HOSTED, owner=self.requester)
+        branch = self.make_branch(db_branch.unique_name)
+        branch.lock_write()
+        branch.unlock()
+        stderr_text = sys.stderr.getvalue()
+
+        # The text printed to stderr should be like this:
+        # (we need the prefix text later for extracting the oopsid)
+        expected_fault_text_prefix = """
+        <Fault 380: 'An unexpected error has occurred while updating a
+        Launchpad branch. Please report a Launchpad bug and quote:"""
+        expected_fault_text = expected_fault_text_prefix + " OOPS-.*'>"
+
+        # For our test case, branchChanged() is called twice, hence 2 errors.
+        expected_stderr = ' '.join([expected_fault_text for x in range(2)])
+        self.assertTextMatchesExpressionIgnoreWhitespace(
+            expected_stderr, stderr_text)
+
+        # Extract an oops id from the std error text.
+        # There will be 2 oops ids. The 2nd will be the oops for the last
+        # logged error report and the 1st will be in the error text from the
+        # error report.
+        oopsids = []
+        stderr_text = ' '.join(stderr_text.split())
+        expected_fault_text_prefix = ' '.join(
+            expected_fault_text_prefix.split())
+        parts = re.split(expected_fault_text_prefix, stderr_text)
+        for txt in parts:
+            if len(txt) == 0:
+                continue
+            txt = txt.strip()
+            # The oopsid ends with a '.'
+            oopsid = txt[:txt.find('.')]
+            oopsids.append(oopsid)
+
+        # Now check the error report - we just check the last one.
+        self.assertEqual(len(oopsids), 2)
+        error_utility = ErrorReportingUtility()
+        error_report = error_utility.getLastOopsReport()
+        # The error report oopsid should match what's print to stderr.
+        self.assertEqual(error_report.id, oopsids[1])
+        # The error report text should contain the root cause oopsid.
+        self.assertContainsString(
+            error_report.tb_text, self.generated_oopsids[1])
+
+
 class TestLaunchpadTransportReadOnly(BzrTestCase):
     """Tests for read-only operations on the LaunchpadTransport."""