launchpad-dev team mailing list archive

Re: Design pattern for interruptible processing of continuous data

 

On Wednesday 05 January 2011 00:30:35 Martin Pool wrote:
> On 5 January 2011 02:29, Julian Edwards <julian.edwards@xxxxxxxxxxxxx> wrote:
> > Dear all
> > 
> > I've seen this problem pop up in similar ways a few times now, where
> > we're processing a bunch of data in a cron job (whether externally on
> > the API, or internally) and it needs to do a batch of work, remember
> > where it left off (whether reaching a batch limit or the live data is
> > paused), and continue later.
> 
> I think I saw something like this in a bug last December about the branch
> scanner being killed when it runs out of memory.  This apparently
> doesn't happen very often and it wasn't totally clear to me or people
> I asked what would happen to jobs (using the word loosely) that were
> in progress at the moment it was killed.
> 
> It would be awfully nice if that could be handled by a common layer so
> that killing the batch-processing job (even without unwinding its
> python stack) would result in the jobs being retried a few times and
> then failed.  This seems to be a requirement mentioned in
> <https://dev.launchpad.net/Foundations/NewTaskSystem/Requirements>.
> 
> Maybe that page can turn into a LEP and get moved along.

This would be great.  I have already written something in the buildd-manager
that tries to retry jobs intelligently and handle their failures.  The next
step is to generalise that behaviour.
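
To make the discussion concrete, here is a rough sketch of what a common
"retry a few times, then fail" layer might look like.  It is not the
buildd-manager code; every name in it (run_batch, MAX_ATTEMPTS, the JSON state
file) is hypothetical.  The one idea it illustrates is that the attempt count
is written to disk before the job runs, so a process killed without unwinding
its Python stack still leaves a record of the attempt.

```python
import json
import os
import tempfile

MAX_ATTEMPTS = 3  # hypothetical limit; pick whatever suits the job type


def load_state(path):
    """Return the saved job state, or an empty dict on first run."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def save_state(path, state):
    """Write the state atomically so a kill mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_batch(job_ids, handler, state_path, batch_limit=10):
    """Process up to batch_limit unfinished jobs, resuming from saved state."""
    state = load_state(state_path)
    processed = 0
    for job_id in job_ids:
        if processed >= batch_limit:
            break
        key = str(job_id)  # JSON object keys are always strings
        record = state.setdefault(key, {"attempts": 0, "status": "pending"})
        if record["status"] in ("done", "failed"):
            continue
        if record["attempts"] >= MAX_ATTEMPTS:
            # Every earlier attempt died (possibly by a hard kill): give up.
            record["status"] = "failed"
            save_state(state_path, state)
            continue
        # Count the attempt BEFORE running the job, so that a kill during
        # handler() is still recorded as a used attempt.
        record["attempts"] += 1
        save_state(state_path, state)
        try:
            handler(job_id)
        except Exception:
            if record["attempts"] >= MAX_ATTEMPTS:
                record["status"] = "failed"
        else:
            record["status"] = "done"
        save_state(state_path, state)
        processed += 1
    return state
```

A cron job would call run_batch() on each run with the same state file; jobs
already marked done or failed are skipped, and anything left pending (because
of the batch limit, an exception, or a kill) is picked up again next time.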

When I get some time I'll start a LEP and further discussion.

Cheers
J


