← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~sinzui/launchpad/suppress-url-hacker-oops into lp:launchpad

 

Curtis Hovey has proposed merging lp:~sinzui/launchpad/suppress-url-hacker-oops into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #640049 in Launchpad itself: "GoneError:  must not raise an OOPS"
  https://bugs.launchpad.net/launchpad/+bug/640049
  Bug #730393 in Launchpad itself: "when a user urls hacks we can get InvalidBatchSize oopses"
  https://bugs.launchpad.net/launchpad/+bug/730393

For more details, see:
https://code.launchpad.net/~sinzui/launchpad/suppress-url-hacker-oops/+merge/61617

Errors caused by hacked URLs or offsite referrers are not oopses.

    Launchpad bug:
        https://bugs.launchpad.net/bugs/730393
        https://bugs.launchpad.net/bugs/640049
    Pre-implementation: lifeless

While InvalidBatchSize oopses are valuable if our form machinery ever gets out
of sync with backend limits, a user hacking urls shouldn't trigger an oops.

What we should do is look at the referer: if its not from lp (or affiliated
sites per our 404 heuristics) just give them an error (e,g, 'invalid batch
size') and not log an oops.

This issue is broader than InvalidBatchSize. GoneError has the same issue.
We are also not interested in NotFound errors cause by offsite links; there
was some confusion about these because the UI says the oops is not reported,
but it really was.
--------------------------------------------------------------------

RULES

    * Update ErrorReportingUtility to ignore a class of errors if the
      referer is no  .*launchpad\.net
      * Add _ignored_exceptions_for_offsite_referer that is a list of
        exceptions that can be ignored when the referer is not lp.
      * _makeErrorReport currently checks the request and there are
        certain conditions for certain classes or error where it returns
        early. For the _ignored_exceptions_for_offsite_referer, the method
        should return early when lp is not the referer.

QA

    * Hack the batch size in a URL to 500.
    * Verify you see an error page.
    * Verify an oops was not reported.


LINT

    lib/canonical/launchpad/webapp/errorlog.py
    lib/canonical/launchpad/webapp/tests/test_errorlog.py


TEST

    ./bin/test -vv -m canonical.launchpad.webapp.tests.test_errorlog


IMPLEMENTATION

Extracted the rule "if self._isIgnoredException(strtype, request):" rule
to a new methods called _isIgnoredException() that could be extended.
Added a new set of error names called _ignored_exceptions_for_offsite_referer.
Updated _isIgnoredException() to return False if there error is a bad URL
that has a request without an lp referer.
I originally planned to use an re to check the referring domain, but I saw
other code in webapp use vhost and urlparse. I take advantage of of the fact
the urlparse returns a named tuple in python 2.6.
    lib/canonical/launchpad/webapp/errorlog.py
    lib/canonical/launchpad/webapp/tests/test_errorlog.py
-- 
https://code.launchpad.net/~sinzui/launchpad/suppress-url-hacker-oops/+merge/61617
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~sinzui/launchpad/suppress-url-hacker-oops into lp:launchpad.
=== modified file 'lib/canonical/launchpad/webapp/errorlog.py'
--- lib/canonical/launchpad/webapp/errorlog.py	2010-12-13 18:04:24 +0000
+++ lib/canonical/launchpad/webapp/errorlog.py	2011-05-19 17:06:18 +0000
@@ -17,6 +17,7 @@
 import rfc822
 import types
 import urllib
+import urlparse
 
 from lazr.restful.utils import get_current_browser_request
 import pytz
@@ -41,6 +42,7 @@
     IErrorReportRequest,
     )
 from canonical.launchpad.webapp.opstats import OpStats
+from canonical.launchpad.webapp.vhosts import allvhosts
 from canonical.lazr.utils import safe_hasattr
 from lp.services.log.uniquefileallocator import UniqueFileAllocator
 from lp.services.timeline.requesttimeline import get_request_timeline
@@ -238,7 +240,7 @@
                 "Unable to interpret oops line: %s" % line)
             start, end, db_id, statement = match.groups()
             if db_id is not None:
-                db_id = intern(db_id) # This string is repeated lots.
+                db_id = intern(db_id)  # This string is repeated lots.
             statements.append(
                 (int(start), int(end), db_id, statement))
 
@@ -257,6 +259,8 @@
         'ReadOnlyModeDisallowedStore', 'ReadOnlyModeViolation',
         'TranslationUnavailable', 'NoReferrerError'])
     _ignored_exceptions_for_unauthenticated_users = set(['Unauthorized'])
+    _ignored_exceptions_for_offsite_referer = set([
+        'GoneError', 'InvalidBatchSizeError', 'NotFound'])
     _default_config_section = 'error_reports'
 
     def __init__(self):
@@ -368,6 +372,19 @@
         notify(ErrorReportEvent(entry))
         return entry
 
+    def _isIgnoredException(self, strtype, request=None):
+        if strtype in self._ignored_exceptions:
+            return True
+        if strtype in self._ignored_exceptions_for_offsite_referer:
+            if request is not None:
+                referer = request.get('HTTP_REFERER', '')
+                referer_parts = urlparse.urlparse(referer)
+                root_parts = urlparse.urlparse(
+                    allvhosts.configs['mainsite'].rooturl)
+                if root_parts.netloc not in referer_parts.netloc:
+                    return True
+        return False
+
     def _makeErrorReport(self, info, request=None, now=None,
                          informational=False):
         """Return an ErrorReport for the supplied data.
@@ -387,7 +404,7 @@
         tb_text = None
 
         strtype = str(getattr(info[0], '__name__', info[0]))
-        if strtype in self._ignored_exceptions:
+        if self._isIgnoredException(strtype, request):
             return
 
         if not isinstance(info[2], basestring):
@@ -498,8 +515,8 @@
         distant_past = datetime.datetime(1970, 1, 1, 0, 0, 0, tzinfo=UTC)
         when = _rate_restrict_pool.get(strtype, distant_past)
         if now > when:
-            next_when = max(when,
-                            now - _rate_restrict_burst*_rate_restrict_period)
+            next_when = max(
+                when, now - _rate_restrict_burst * _rate_restrict_period)
             next_when += _rate_restrict_period
             _rate_restrict_pool[strtype] = next_when
             # Sometimes traceback information can be passed in as a string. In

=== modified file 'lib/canonical/launchpad/webapp/tests/test_errorlog.py'
--- lib/canonical/launchpad/webapp/tests/test_errorlog.py	2010-12-13 18:04:24 +0000
+++ lib/canonical/launchpad/webapp/tests/test_errorlog.py	2011-05-19 17:06:18 +0000
@@ -17,6 +17,7 @@
 import traceback
 import unittest
 
+from lazr.batchnavigator.interfaces import InvalidBatchSizeError
 from lazr.restful.declarations import webservice_error
 import pytz
 import testtools
@@ -25,6 +26,7 @@
     )
 from zope.interface import directlyProvides
 from zope.publisher.browser import TestRequest
+from zope.publisher.interfaces import NotFound
 from zope.publisher.interfaces.xmlrpc import IXMLRPCRequest
 from zope.security.interfaces import Unauthorized
 from zope.testing.loggingsupport import InstalledHandler
@@ -41,7 +43,10 @@
     )
 from canonical.launchpad.webapp.interfaces import NoReferrerError
 from canonical.testing import reset_logging
-from lp.app.errors import TranslationUnavailable
+from lp.app.errors import (
+    GoneError,
+    TranslationUnavailable,
+    )
 from lp.services.log.uniquefileallocator import UniqueFileAllocator
 from lp.services.osutils import remove_tree
 from lp.testing import TestCase
@@ -304,7 +309,7 @@
         file_permission = stat.S_IMODE(st.st_mode)
         self.assertEqual(file_permission, wanted_permission)
         # Restore the umask to the original value.
-        ignored = os.umask(old_umask)
+        os.umask(old_umask)
 
         lines = open(errorfile, 'r').readlines()
 
@@ -315,7 +320,7 @@
         self.assertEqual(lines[3], 'Date: 2006-04-01T00:30:00+00:00\n')
         self.assertEqual(lines[4], 'Page-Id: \n')
         self.assertEqual(lines[5], 'Branch: %s\n' % versioninfo.branch_nick)
-        self.assertEqual(lines[6], 'Revision: %s\n'% versioninfo.revno)
+        self.assertEqual(lines[6], 'Revision: %s\n' % versioninfo.revno)
         self.assertEqual(lines[7], 'User: None\n')
         self.assertEqual(lines[8], 'URL: None\n')
         self.assertEqual(lines[9], 'Duration: -1\n')
@@ -488,7 +493,7 @@
         self.assertEqual(lines[3], 'Date: 2006-04-01T00:30:00+00:00\n')
         self.assertEqual(lines[4], 'Page-Id: \n')
         self.assertEqual(lines[5], 'Branch: %s\n' % versioninfo.branch_nick)
-        self.assertEqual(lines[6], 'Revision: %s\n'% versioninfo.revno)
+        self.assertEqual(lines[6], 'Revision: %s\n' % versioninfo.revno)
         self.assertEqual(lines[7], 'User: None\n')
         self.assertEqual(lines[8], 'URL: https://launchpad.net/example\n')
         self.assertEqual(lines[9], 'Duration: -1\n')
@@ -636,6 +641,50 @@
         except TranslationUnavailable:
             utility.raising(sys.exc_info(), now=now)
 
+        self.assertTrue(
+            TranslationUnavailable.__name__ in utility._ignored_exceptions,
+            'TranslationUnavailable is not in _ignored_exceptions.')
+        errorfile = os.path.join(
+            utility.log_namer.output_dir(now), '01800.T1')
+        self.assertFalse(os.path.exists(errorfile))
+
+    def test_ignored_exceptions_for_offsite_referer(self):
+        # Exceptions caused by bad URLs that may not be an Lp code issue.
+        utility = ErrorReportingUtility()
+        errors = set([
+            GoneError.__name__, InvalidBatchSizeError.__name__,
+            NotFound.__name__])
+        self.assertEqual(
+            errors, utility._ignored_exceptions_for_offsite_referer)
+
+    def test_ignored_exceptions_for_offsite_referer_reported(self):
+        # Oopses are reported when Launchpad is the referer for a URL
+        # that caused an exception.
+        utility = ErrorReportingUtility()
+        now = datetime.datetime(2006, 04, 01, 00, 30, 00, tzinfo=UTC)
+        request = TestRequest(
+            environ={
+                'SERVER_URL': 'http://launchpad.dev/fnord',
+                'HTTP_REFERER': 'http://launchpad.dev/snarf'})
+        try:
+            raise GoneError('fnord')
+        except GoneError:
+            utility.raising(sys.exc_info(), request, now=now)
+        errorfile = os.path.join(
+            utility.log_namer.output_dir(now), '01800.T1')
+        self.assertTrue(os.path.exists(errorfile))
+
+    def test_ignored_exceptions_for_offsite_referer_not_reported(self):
+        # Oopses are not reported when Launchpad is not the referer.
+        utility = ErrorReportingUtility()
+        now = datetime.datetime(2006, 04, 01, 00, 30, 00, tzinfo=UTC)
+        # There is no HTTP_REFERER header in this request
+        request = TestRequest(
+            environ={'SERVER_URL': 'http://launchpad.dev/fnord'})
+        try:
+            raise GoneError('fnord')
+        except GoneError:
+            utility.raising(sys.exc_info(), request, now=now)
         errorfile = os.path.join(
             utility.log_namer.output_dir(now), '01800.T1')
         self.assertFalse(os.path.exists(errorfile))
@@ -689,7 +738,7 @@
         self.assertEqual(lines[3], 'Date: 2006-04-01T00:30:00+00:00\n')
         self.assertEqual(lines[4], 'Page-Id: \n')
         self.assertEqual(lines[5], 'Branch: %s\n' % versioninfo.branch_nick)
-        self.assertEqual(lines[6], 'Revision: %s\n'% versioninfo.revno)
+        self.assertEqual(lines[6], 'Revision: %s\n' % versioninfo.revno)
         self.assertEqual(lines[7], 'User: None\n')
         self.assertEqual(lines[8], 'URL: None\n')
         self.assertEqual(lines[9], 'Duration: -1\n')
@@ -727,7 +776,7 @@
         self.assertEqual(lines[3], 'Date: 2006-04-01T00:30:00+00:00\n')
         self.assertEqual(lines[4], 'Page-Id: \n')
         self.assertEqual(lines[5], 'Branch: %s\n' % versioninfo.branch_nick)
-        self.assertEqual(lines[6], 'Revision: %s\n'% versioninfo.revno)
+        self.assertEqual(lines[6], 'Revision: %s\n' % versioninfo.revno)
         self.assertEqual(lines[7], 'User: None\n')
         self.assertEqual(lines[8], 'URL: None\n')
         self.assertEqual(lines[9], 'Duration: -1\n')
@@ -865,7 +914,7 @@
         # logged will have OOPS reports generated for them.
         error_message = self.factory.getUniqueString()
         try:
-            ignored = 1/0
+            1 / 0
         except ZeroDivisionError:
             self.logger.exception(error_message)
         oops_report = self.error_utility.getLastOopsReport()