← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~gary/launchpad/bug683115 into lp:launchpad

 

Gary Poster has proposed merging lp:~gary/launchpad/bug683115 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)


This branch raises an error at the point that I believe it is entering our code base for bug 683115.  If I am right, then we will see the new error in the oops reports with the value that is causing the problem.  We can then decided what to do about it.

An alternate approach would be to set the total to 0 and make an informative OOPS. I decided against that approach because I don't know if the Google results are actually causing the problem, and I did not want to introduce unnecessary band-aids until it was proven that we needed them.  I don't feel very strongly about it.

This is a quick diagnostic branch, so I did not change the doctest (which is actually a decent one, I think).  

On the other hand, I did correct all of what ``make lint`` complained about because it was quick.  The downside there is that it adds a bunch of noise to the diff.  The most questionable thing I did for lint's sake is to make searchservice.py not support old spelling of the elementtree import. I think it's fine.

Thank you

Gary
-- 
https://code.launchpad.net/~gary/launchpad/bug683115/+merge/42322
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~gary/launchpad/bug683115 into lp:launchpad.
=== modified file 'lib/canonical/launchpad/doc/google-searchservice.txt'
--- lib/canonical/launchpad/doc/google-searchservice.txt	2010-10-25 13:16:10 +0000
+++ lib/canonical/launchpad/doc/google-searchservice.txt	2010-11-30 23:11:31 +0000
@@ -1,17 +1,21 @@
-= Google Search Service =
+=====================
+Google Search Service
+=====================
 
 The GoogleSearchService is a Google Custom Service Business Edition
 (cs-be) client. Given one or more terms, it will retrieve an XML
 summary of the matching launchpad.net pages.
 
 
-== GoogleSearchService ==
+GoogleSearchService
+===================
 
 The GoogleSearchService implements the ISearchService interface.
 
     >>> from zope.component import getUtility
     >>> from zope.interface.verify import verifyObject
-    >>> from canonical.launchpad.interfaces.searchservice import ISearchService
+    >>> from canonical.launchpad.interfaces.searchservice import (
+    ...     ISearchService)
 
     >>> google_search = getUtility(ISearchService)
     >>> verifyObject(ISearchService, google_search)
@@ -20,13 +24,16 @@
     <...GoogleSearchService ...>
 
 
-=== GoogleSearchService search() ===
+----------------------------
+GoogleSearchService search()
+----------------------------
 
 The search method accepts a string argument of terms and an optional int
 argument of start. The terms are the same as the text that would be
 entered in Google search form; the terms should not be escaped.
 
-    >>> from canonical.launchpad.interfaces.searchservice import ISearchResults
+    >>> from canonical.launchpad.interfaces.searchservice import (
+    ...     ISearchResults)
 
     >>> first_page_matches = google_search.search(terms='bug')
     >>> first_page_matches
@@ -41,7 +48,8 @@
     <...PageMatches ...>
 
 
-== PageMatches ==
+PageMatches
+===========
 
 The PageMatches object returned by GoogleSearchService.search()
 implements ISearchResults.
@@ -106,7 +114,8 @@
     0
 
 
-== PageMatch ==
+PageMatch
+=========
 
 The PageMatch object represents a single result from a search result
 set. It is created by passing a title, url, and a summary. It is
@@ -138,7 +147,8 @@
     'http://launchpad.dev/unicode-titles'
 
 
-== Search configuration ==
+Search configuration
+====================
 
 The google search service is configured by the google section in
 lazr.config. All requests are made to Google's site, but the
@@ -183,7 +193,8 @@
     start : 0
 
 
-== create_search_url() ==
+create_search_url()
+===================
 
 The search url used inside the search() method is created by
 create_search_url(). It accepts two optional arguments: terms and start.
@@ -229,7 +240,8 @@
     'http://launchpad.dev:.../...q=svg+%2Bbugs&start=20'
 
 
-== Google Search Protocol parsing ==
+Google Search Protocol parsing
+==============================
 
 The GoogleSearchService's _parse_google_search_protocol() requires a
 subset of the GSP 3.2 markup to create the PageMatch and PageMatches
@@ -283,6 +295,25 @@
     GoogleWrongGSPVersion: Could not get the 'total' from the
                            GSP XML response.
 
+Similarly, if the total were ever less than zero (which we are suspicious
+of because of bug 683115), we would see an error.
+
+    >>> gsp_xml_file_name = path.join(
+    ...     base_path, 'googlesearchservice-negative-total.xml')
+    >>> gsp_xml_file = open(gsp_xml_file_name, 'r')
+    >>> gsp_xml = gsp_xml_file.read()
+    >>> gsp_xml_file.close()
+    >>> print gsp_xml
+    <...
+    <RES SN="1" EN="1">
+    <M>-1</M>...
+
+    >>> google_search._parse_google_search_protocol(gsp_xml)
+    Traceback (most recent call last):
+     ...
+    GoogleResponseError: The reported total (-1, from '-1') was less than
+                zero. See bug 683115.
+
 A PageMatch requires a title, url, and a summary. If those elements
 ('<T>', '<U>', '<S>') cannot be found nested in an '<R>' a PageMatch
 cannot be made. A missing title (<T>) indicates a bad page on Launchpad,
@@ -343,7 +374,7 @@
     ...
     <R N="1">
     <U>https://blueprints.edge.launchpad.net/ubuntu/+spec/gobuntu-hardy</U>
-    <UE>https://blueprints.edge.launchpad.net/ubuntu/%2Bspec/gobuntu-hardy</UE>
+    <UE>https://blueprints.edge.launchpad.net.../%2Bspec/gobuntu-hardy</UE>
     <T>Blueprint: &lt;b&gt;gobuntu&lt;/b&gt; hardy</T>
     <RK>0</RK>
     <S></S>
@@ -504,7 +535,9 @@
       tracker allows collaboration'
 
 
-=== URL rewriting ===
+-------------
+URL rewriting
+-------------
 
 The URL scheme used in the rewritten URL is configured in
 config.google.url_rewrite_scheme. The hostname is set in the shared
@@ -557,7 +590,9 @@
     'https://help.launchpad.net/OpenID'
 
 
-=== Graceful handling of timeouts ===
+-----------------------------
+Graceful handling of timeouts
+-----------------------------
 
 The external service (Google Search Engine) may not be available, or
 is not responding quickly because there are network issues. In these

=== modified file 'lib/canonical/launchpad/utilities/searchservice.py'
--- lib/canonical/launchpad/utilities/searchservice.py	2010-09-09 21:49:53 +0000
+++ lib/canonical/launchpad/utilities/searchservice.py	2010-11-30 23:11:31 +0000
@@ -13,10 +13,7 @@
     'PageMatches',
     ]
 
-try:
-    import xml.etree.cElementTree as ET
-except ImportError:
-    import cElementTree as ET
+import xml.etree.cElementTree as ET
 import urllib
 from urlparse import urlunparse
 
@@ -70,7 +67,6 @@
         """
         return config.vhost.mainsite.hostname
 
-
     def __init__(self, title, url, summary):
         """initialize a PageMatch.
 
@@ -155,14 +151,14 @@
     implements(ISearchService)
 
     _default_values = {
-        'client' : 'google-csbe',
-        'cx' : None,
-        'ie' : 'utf8',
-        'num' : 20,
-        'oe' : 'utf8',
-        'output' : 'xml_no_dtd',
+        'client': 'google-csbe',
+        'cx': None,
+        'ie': 'utf8',
+        'num': 20,
+        'oe': 'utf8',
+        'output': 'xml_no_dtd',
         'start': 0,
-        'q' : None,
+        'q': None,
         }
 
     @property
@@ -293,6 +289,10 @@
             # The datatype is not what PageMatches requires.
             raise GoogleWrongGSPVersion(
                 "Could not get the 'total' from the GSP XML response.")
+        if total < 0:
+            raise GoogleResponseError(
+                ("The reported total (%d, from %r) was less than zero. "
+                 "See bug 683115.") % (total, results.find('M').text))
         for result in results.findall('R'):
             url_tag = result.find('U')
             title_tag = result.find('T')