launchpad-reviewers team mailing list archive

[Merge] ~cjwatson/launchpad:librarian-layer-retry into launchpad:master

Colin Watson has proposed merging ~cjwatson/launchpad:librarian-layer-retry into launchpad:master.

Commit message:
Retry LibrarianLayer._check_and_reset a few times

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/419974

We fairly often see unexplained `process-returncode` errors in buildbot runs that don't seem to correspond to a failed test on the same worker.  On closer inspection of the subunit stream, these seem to be caused by `LibrarianLayer._check_and_reset` getting `ECONNRESET` when it checks whether the librarian is still up.  I'm not sure exactly why this happens, but on general principles it seems reasonable to retry the are-you-still-there request a few times and see whether that makes things more resilient.
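
To illustrate the retry idea in isolation, here is a minimal standard-library sketch (Python 3 only). It is not the code in this branch, which relies on `requests` and `HTTPAdapter(max_retries=3)`. The names `call_with_retries` and `FlakyProbe` are hypothetical, and the probe is a stand-in for the real librarian request.

```python
import errno
import time


def call_with_retries(probe, attempts=3, delay=0.0):
    """Call probe() up to `attempts` times, retrying on connection resets.

    Hypothetical helper for illustration only; returns probe()'s result,
    or re-raises the last ConnectionResetError once attempts run out.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return probe()
        except ConnectionResetError as e:
            last_error = e
            # Pause before the next try, but not after the final failure.
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


class FlakyProbe:
    """Stand-in for the librarian check: fails `failures` times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionResetError(
                errno.ECONNRESET, "Connection reset by peer")
        return "ok"
```

With `FlakyProbe(2)` the third attempt succeeds and the caller never sees the resets. With `FlakyProbe(3)` the error still propagates, so a librarian that is really dead would still be reported.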
-- 
Your team Launchpad code reviewers is requested to review the proposed merge of ~cjwatson/launchpad:librarian-layer-retry into launchpad:master.
diff --git a/lib/lp/testing/layers.py b/lib/lp/testing/layers.py
index 1b6fcb0..94c5dcd 100644
--- a/lib/lp/testing/layers.py
+++ b/lib/lp/testing/layers.py
@@ -71,6 +71,8 @@ from fixtures import (
     MonkeyPatch,
     )
 import psycopg2
+from requests import Session
+from requests.adapters import HTTPAdapter
 from six.moves.urllib.error import (
     HTTPError,
     URLError,
@@ -822,8 +824,11 @@ class LibrarianLayer(DatabaseLayer):
     def _check_and_reset(cls):
         """Raise an exception if the Librarian has been killed, else reset."""
         try:
-            f = urlopen(config.librarian.download_url)
-            f.read()
+            session = Session()
+            session.mount(
+                config.librarian.download_url,
+                HTTPAdapter(max_retries=3))
+            session.get(config.librarian.download_url).content
         except Exception as e:
             raise LayerIsolationError(
                     "Librarian has been killed or has hung."