launchpad-reviewers team mailing list archive
Message #02358
[Merge] lp:~wgrant/launchpad/bug-702819-bad-uris into lp:launchpad
William Grant has proposed merging lp:~wgrant/launchpad/bug-702819-bad-uris into lp:launchpad.
Requested reviews:
Launchpad code reviewers (launchpad-reviewers)
Related bugs:
#702819 Log parser should skip lines raising InvalidURIError
https://bugs.launchpad.net/bugs/702819
For more details, see:
https://code.launchpad.net/~wgrant/launchpad/bug-702819-bad-uris/+merge/46683
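When the Apache log parser encounters a URI that lazr.uri cannot parse, the resulting InvalidURIError currently causes it to skip the remainder of the file. This branch catches the error in get_method_and_path and returns the unparsable URL unchanged as the path, so the specific parser implementation skips that one line and continues with the rest of the file.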
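The behaviour described above can be sketched roughly as follows. This is a simplified illustration rather than the branch's code: `InvalidURIError` and `extract_uri_path` here are local stand-ins for the lazr.uri class and `URI(path).path` (the stand-in assumes a URL containing whitespace is invalid), the request parsing only handles the common "METHOD target PROTOCOL" shape, and `collect_paths` is a hypothetical caller invented to show the skip-and-continue effect.

```python
from urllib.parse import urlsplit


class InvalidURIError(ValueError):
    """Hypothetical stand-in for lazr.uri.InvalidURIError."""


def extract_uri_path(url):
    """Hypothetical stand-in for URI(url).path.

    Assumption: a URL containing whitespace counts as invalid.
    """
    if any(ch.isspace() for ch in url):
        raise InvalidURIError(url)
    return urlsplit(url).path


def get_method_and_path(request):
    """Simplified: handles only the 'METHOD target PROTOCOL' shape."""
    method, _, rest = request.partition(' ')
    path, _, _protocol = rest.rpartition(' ')
    if path.startswith(('http://', 'https://')):
        try:
            path = extract_uri_path(path)
        except InvalidURIError:
            # Leave the raw string in place; the caller decides what to do.
            pass
    return method, path


def collect_paths(requests):
    """Hypothetical caller: skip anything that is not a plain path."""
    paths = []
    for request in requests:
        _method, path = get_method_and_path(request)
        if not path.startswith('/'):
            continue  # Unparsable line: skip it, keep reading the file.
        paths.append(path)
    return paths
```

With the exception caught inside `get_method_and_path`, one bad request line no longer aborts the loop over the remaining lines; whether a real parser skips such a path depends on that parser's own matching rules.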
=== modified file 'lib/lp/services/apachelogparser/base.py'
--- lib/lp/services/apachelogparser/base.py 2011-01-05 04:56:11 +0000
+++ lib/lp/services/apachelogparser/base.py 2011-01-18 21:41:57 +0000
@@ -6,7 +6,7 @@
import os
from contrib import apachelog
-from lazr.uri import URI
+from lazr.uri import InvalidURIError, URI
import pytz
from zope.component import getUtility
@@ -218,6 +218,12 @@
         # This is the common case.
         path = first
     if path.startswith('http://') or path.startswith('https://'):
-        uri = URI(path)
-        path = uri.path
+        try:
+            uri = URI(path)
+            path = uri.path
+        except InvalidURIError:
+            # The URL is not valid, so we can't extract a path. Let it
+            # pass through, where it will probably be skipped as
+            # unparsable.
+            pass
     return method, path
=== modified file 'lib/lp/services/apachelogparser/tests/test_apachelogparser.py'
--- lib/lp/services/apachelogparser/tests/test_apachelogparser.py 2011-01-05 04:56:11 +0000
+++ lib/lp/services/apachelogparser/tests/test_apachelogparser.py 2011-01-18 21:41:57 +0000
@@ -101,6 +101,15 @@
             path,
             r'/56222647/deluge-gtk_1.3.0-0ubuntu1_all.deb?N\x1f\x9b Z%7B...')
+    def test_parsing_invalid_url(self):
+        # See bug 702819.
+        request = r'GET http://blah/1234/fewfwfw GET http://blah HTTP/1.0'
+        method, path = get_method_and_path(request)
+        self.assertEqual(method, 'GET')
+        self.assertEqual(
+            path,
+            r'http://blah/1234/fewfwfw GET http://blah')
+
 class Test_get_fd_and_file_size(TestCase):