← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~spiv/launchpad/codebrowse-oops-84838 into lp:launchpad/devel

 

Andrew Bennetts has proposed merging lp:~spiv/launchpad/codebrowse-oops-84838 into lp:launchpad/devel with lp:~spiv/launchpad/loggerhead-logging as a prerequisite.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  #84838 code browser should use oops system
  https://bugs.launchpad.net/bugs/84838


This fixes a 5-digit bug number. :)

This fixes codebrowse to log tracebacks from failed requests as OOPSes, rather than dumping them in debug.log.  This is a fairly basic implementation: it shows very elementary HTML to the user, and it doesn't capture any information in the OOPS beyond the traceback and the URL... but even those flaws are improvements over the status quo, and are clearly marked with XXXs.

The only outstanding issue I know of is finding out what the production config values for this should be, particularly the error_dir.  And of course arranging for the production OOPSes to get synced and reported on like OOPS reports from other systems.

This is based on an already-approved (but as yet unmerged) branch that makes a more modest tweak to codebrowse's logging.
-- 
https://code.launchpad.net/~spiv/launchpad/codebrowse-oops-84838/+merge/31007
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~spiv/launchpad/codebrowse-oops-84838 into lp:launchpad/devel.
=== modified file 'configs/development/launchpad-lazr.conf'
--- configs/development/launchpad-lazr.conf	2010-07-25 05:16:58 +0000
+++ configs/development/launchpad-lazr.conf	2010-07-27 03:02:58 +0000
@@ -52,6 +52,9 @@
 log_folder: /var/tmp/codebrowse.launchpad.dev/logs
 launchpad_root: https://code.launchpad.dev/
 secret_path: configs/development/codebrowse-secret
+error_dir: /var/tmp/codebrowse.launchpad.dev/errors
+oops_prefix: CB
+copy_to_zlog: false
 
 [codehosting]
 launch: True

=== modified file 'lib/canonical/config/schema-lazr.conf'
--- lib/canonical/config/schema-lazr.conf	2010-07-26 09:23:26 +0000
+++ lib/canonical/config/schema-lazr.conf	2010-07-27 03:02:58 +0000
@@ -283,6 +283,15 @@
 # datatype: string
 secret_path:
 
+# See [error_reports].
+copy_to_zlog: False
+
+# See [error_reports].
+error_dir: none
+
+# See [error_reports].
+oops_prefix: CB
+
 
 [codehosting]
 # datatype: string

=== modified file 'lib/launchpad_loggerhead/app.py'
--- lib/launchpad_loggerhead/app.py	2010-05-15 17:43:59 +0000
+++ lib/launchpad_loggerhead/app.py	2010-07-27 03:02:58 +0000
@@ -3,6 +3,7 @@
 
 import logging
 import os
+import sys
 import threading
 import urllib
 import urlparse
@@ -26,6 +27,8 @@
 from canonical.config import config
 from canonical.launchpad.xmlrpc import faults
 from canonical.launchpad.webapp.vhosts import allvhosts
+from canonical.launchpad.webapp.errorlog import (
+    ErrorReportingUtility, ScriptRequest)
 from lp.code.interfaces.codehosting import (
     BRANCH_TRANSPORT, LAUNCHPAD_ANONYMOUS)
 from lp.codehosting.vfs import get_lp_server
@@ -227,3 +230,90 @@
                 bzr_branch.unlock()
         finally:
             lp_server.stop_server()
+
+
+def make_oops_logging_exception_hook(error_utility, request):
+    """Make a hook for logging OOPSes."""
+    def log_oops():
+        error_utility.raising(sys.exc_info(), request)
+    return log_oops
+
+
+def make_error_utility():
+    """Make an error utility for logging errors from codebrowse."""
+    error_utility = ErrorReportingUtility()
+    error_utility.configure('codebrowse')
+    return error_utility
+
+
+# XXX: This HTML template should be replaced with the same one that lpnet uses
+# for reporting OOPSes to users, or at least something that looks similar.  But
+# even this is better than the "Internal Server Error" you'd get otherwise.
+#  - Andrew Bennetts, 2010-07-27.
+_oops_html_template = '''\
+<html>
+<head>Oops! %(oopsid)s</head>
+<body>
+<h1>Oops!</h1>
+<p>Something broke while generating the page.
+Please try again in a few minutes, and if the problem persists file a bug at
+<a href="https://bugs.launchpad.net/launchpad-code";
+>https://bugs.launchpad.net/launchpad-code</a>
+and quote OOPS-ID <strong>%(oopsid)s</strong>
+</p></body></html>'''
+
+
+def oops_middleware(app):
+    """Middleware to log an OOPS if the request fails.
+
+    If the request fails before the response body has started then this returns
+    a basic error page with the OOPS ID to the user (and status code 200).
+
+    Strictly speaking this isn't 100% correct WSGI middleware, because it
+    doesn't respect the return value from start_response (the 'write'
+    callable), but using that return value is deprecated and nothing in our
+    codebrowse stack uses that.
+    """
+    error_utility = make_error_utility()
+    def wrapped_app(environ, start_response):
+        response_start = [None]
+        def wrapped_start_response(status, headers, exc_info=None):
+            response_start[0] = (status, headers, exc_info)
+        def report_oops():
+            # XXX: We should capture more per-request information to include in
+            # the OOPS here, e.g. duration, user, etc.  But even an OOPS with
+            # just a traceback and URL is better than nothing.
+            #   - Andrew Bennetts, 2010-07-27.
+            request = ScriptRequest(
+                [], URL=construct_url(environ))
+            error_utility.raising(sys.exc_info(), request)
+            return request.oopsid
+        # Start processing this request
+        try:
+            app_iter = app(environ, wrapped_start_response)
+        except:
+            oopsid = report_oops()
+            start_response('200 OK', [('Content-Type:', 'text/html')])
+            yield _oops_html_template % {'oopsid': oopsid}
+            return
+        # Start yielding the response
+        body_started = False
+        while True:
+            try:
+                yield app_iter.next()
+            except StopIteration:
+                return
+            except:
+                oopsid = report_oops()
+                if body_started:
+                    # We've already started sending a response, so... just give
+                    # up.
+                    raise
+                start_response('200 OK', [('Content-Type:', 'text/html')])
+                yield _oops_html_template % {'oopsid': oopsid}
+                return
+            else:
+                if not body_started:
+                    start_response(*response_start[0])
+                    body_started = True
+    return wrapped_app

=== modified file 'scripts/start-loggerhead.py'
--- scripts/start-loggerhead.py	2010-06-08 15:43:13 +0000
+++ scripts/start-loggerhead.py	2010-07-27 03:02:58 +0000
@@ -119,7 +119,7 @@
 
 from launchpad_loggerhead.debug import (
     change_kill_thread_criteria, threadpool_debug)
-from launchpad_loggerhead.app import RootApp
+from launchpad_loggerhead.app import RootApp, oops_middleware
 from launchpad_loggerhead.session import SessionHandler
 
 SESSION_VAR = 'lh.session'
@@ -153,6 +153,7 @@
     return wrapped
 app = set_scheme(app)
 app = change_kill_thread_criteria(app)
+app = oops_middleware(app)
 
 try:
     httpserver.serve(