launchpad-reviewers team mailing list archive

Thread
Date

[Merge] lp:~adeuring/launchpad/bug-1015667 into lp:launchpad

To: mp+113245@xxxxxxxxxxxxxxxxxx
From: Abel Deuring <abel.deuring@xxxxxxxxxxxxx>
Date: Tue, 03 Jul 2012 15:49:23 -0000
Reply-to: mp+113245@xxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

Abel Deuring has proposed merging lp:~adeuring/launchpad/bug-1015667 into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #1015667 in Launchpad itself: "Celery is leaving an ever increasing number of queues behind"
  https://bugs.launchpad.net/launchpad/+bug/1015667

For more details, see:
https://code.launchpad.net/~adeuring/launchpad/bug-1015667/+merge/113245

This branch changes the task ID assigned to Celery jobs in
BaseRunnableJob.runViaCelery(). This should help us fix bug
1015667.

As explained in the doc string of the new method taskId(), it
is difficult to get any clue why and where the result queues
were created. Adding the job class and the job ID gives us at
least a bit more information.

test: ./bin/test services -vvt test_taskId

no lint

-- 
https://code.launchpad.net/~adeuring/launchpad/bug-1015667/+merge/113245
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~adeuring/launchpad/bug-1015667 into lp:launchpad.

=== modified file 'lib/lp/services/job/runner.py'
--- lib/lp/services/job/runner.py	2012-06-14 05:18:22 +0000
+++ lib/lp/services/job/runner.py	2012-07-03 15:48:38 +0000
@@ -37,6 +37,7 @@
     )
 import sys
 from textwrap import dedent
+from uuid import uuid4
 
 from ampoule import (
     child,
@@ -196,6 +197,29 @@
         return oops_config.create(
             context=dict(exc_info=info))
 
+    def taskId(self):
+        """Return a task ID that gives a clue what this job is about.
+
+        Though we intend to drop the result return by a Celery job
+        (in the sense that we don't care what
+        lazr.jobrunner.celerytask.RunJob.run() returns), we might
+        accidentally create result queues, for example, when a job fails.
+        The messages stored in these queues are often not very specific,
+        the queues names are just the IDs of the task, which are by
+        default just strings returned by Celery's uuid() function.
+
+        If we put the job's class name and the job ID into the task ID,
+        we have better chances to figure out what went wrong than by just
+        look for example at a message like
+
+            {'status': 'FAILURE',
+            'traceback': None,
+            'result': SoftTimeLimitExceeded(1,),
+            'task_id': 'cba7d07b-37fe-4f1d-a5f6-79ad7c30222f'}
+        """
+        return '%s-%s-%s' % (
+            self.__class__.__name__, self.job_id, uuid4())
+
     def runViaCelery(self, ignore_result=False):
         """Request that this job be run via celery."""
         # Avoid importing from lp.services.job.celeryjob where not needed, to
@@ -213,7 +237,8 @@
         else:
             eta = None
         return cls.apply_async(
-            (ujob_id, self.config.dbuser), queue=self.task_queue, eta=eta)
+            (ujob_id, self.config.dbuser), queue=self.task_queue, eta=eta,
+            task_id=self.taskId())
 
     def getDBClass(self):
         return self.context.__class__

=== modified file 'lib/lp/services/job/tests/test_runner.py'
--- lib/lp/services/job/tests/test_runner.py	2012-05-09 14:17:41 +0000
+++ lib/lp/services/job/tests/test_runner.py	2012-07-03 15:48:38 +0000
@@ -4,6 +4,7 @@
 """Tests for job-running facilities."""
 
 import logging
+import re
 import sys
 from textwrap import dedent
 from time import sleep
@@ -376,6 +377,16 @@
         self.assertNotIn(job, runner.completed_jobs)
         self.assertIn(job, runner.incomplete_jobs)
 
+    def test_taskId(self):
+        # BaseRunnableJob.taskId() creates a task ID that consists
+        # of the Job's class name, the job ID and a UUID.
+        job = NullJob(completion_message="doesn't matter")
+        task_id = job.taskId()
+        uuid_expr = (
+            '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}')
+        mo = re.search('^NullJob-%s-%s$' % (job.job_id, uuid_expr), task_id)
+        self.assertIsNot(None, mo)
+
 
 class StaticJobSource(BaseRunnableJob):
 

=== modified file 'versions.cfg'
--- versions.cfg	2012-07-03 04:35:43 +0000
+++ versions.cfg	2012-07-03 15:48:38 +0000
@@ -49,7 +49,7 @@
 lazr.config = 1.1.3
 lazr.delegates = 1.2.0
 lazr.enum = 1.1.3
-lazr.jobrunner = 0.6
+lazr.jobrunner = 0.8
 lazr.lifecycle = 1.1
 lazr.restful = 0.19.6
 lazr.restfulclient = 0.12.2

Follow ups

[Merge] lp:~adeuring/launchpad/bug-1015667 into lp:launchpad
From: noreply, 2012-07-03
[Merge] lp:~adeuring/launchpad/bug-1015667 into lp:launchpad
From: Abel Deuring, 2012-07-03
[Merge] lp:~adeuring/launchpad/bug-1015667 into lp:launchpad
From: Abel Deuring, 2012-07-03
Re: [Merge] lp:~adeuring/launchpad/bug-1015667 into lp:launchpad
From: Richard Harding, 2012-07-03