← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~wallyworld/launchpad/deleted-job-oops-602323 into lp:launchpad

 

Ian Booth has proposed merging lp:~wallyworld/launchpad/deleted-job-oops-602323 into lp:launchpad.

Commit message:
Ignore an LostObjectError processing a job oops if the job has been deleted.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #602323 in Launchpad itself: "branch scanner throws LostObjectError if the branch it is scanning is deleted"
  https://bugs.launchpad.net/launchpad/+bug/602323

For more details, see:
https://code.launchpad.net/~wallyworld/launchpad/deleted-job-oops-602323/+merge/133842

A quick fix to job error handling. If the job oopses, but is deleted before error processing can occur, a LostObjectError is raised. We don't care, so catch the exception and ignore it.

This branch comes about following a previous fix to BranchScanJob. That job raised a LostObjectError if the branch was deleted before the job ran. This was fixed. However QA appeared to indicate that a LostObjectError was still being raised. It is, but the root cause in unrelated. Celery logs for qas show that BrancScanJob has been raising "ImportError: cannot import name xmlconfig" for a while. This error is normally handled by the job runner infrastructure and logged. But if the job is deleted, the oops handling fails. This branch fixes that.


-- 
https://code.launchpad.net/~wallyworld/launchpad/deleted-job-oops-602323/+merge/133842
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~wallyworld/launchpad/deleted-job-oops-602323 into lp:launchpad.
=== modified file 'lib/lp/services/job/runner.py'
--- lib/lp/services/job/runner.py	2012-10-26 00:45:30 +0000
+++ lib/lp/services/job/runner.py	2012-11-12 00:32:20 +0000
@@ -35,6 +35,7 @@
     SIGHUP,
     signal,
     )
+from storm.exceptions import LostObjectError
 import sys
 from textwrap import dedent
 from uuid import uuid4
@@ -491,17 +492,22 @@
                 self._logOopsId(response['oops_id'])
 
         def job_raised(failure):
-            self.incomplete_jobs.append(job)
-            exit_code = getattr(failure.value, 'exitCode', None)
-            if exit_code == self.TIMEOUT_CODE:
-                # The process ended with the error code that we have
-                # arbitrarily chosen to indicate a timeout. Rather than log
-                # that error (ProcessDone), we log a TimeoutError instead.
-                self._logTimeout(job)
+            try:
+                exit_code = getattr(failure.value, 'exitCode', None)
+                if exit_code == self.TIMEOUT_CODE:
+                    # The process ended with the error code that we have
+                    # arbitrarily chosen to indicate a timeout. Rather than log
+                    # that error (ProcessDone), we log a TimeoutError instead.
+                    self._logTimeout(job)
+                else:
+                    info = (failure.type, failure.value, failure.tb)
+                    oops = self._doOops(job, info)
+                    self._logOopsId(oops['id'])
+            except LostObjectError:
+                # The job may have been deleted, so we can ignore this error.
+                pass
             else:
-                info = (failure.type, failure.value, failure.tb)
-                oops = self._doOops(job, info)
-                self._logOopsId(oops['id'])
+                self.incomplete_jobs.append(job)
         deferred.addCallbacks(update, job_raised)
         return deferred
 


Follow ups