launchpad-dev team mailing list archive

Re: performance tuesday - the rabbit has landed

 

On 2011-05-11 10:13, Robert Collins wrote:

> I suspect an easy migration target if folk want one would be to
> migrate all the fire-and-forget jobs to trigger via rabbit (leaving
> the payload in the db), by hooking a 'do it now' message into the
> post-transaction actions in zope.

It's exciting news. We'll want to be careful in migrating jobs though: IIRC rabbit is nontransactional. That means we'll still need some way for consumers of jobs to recognize cases where the producer transaction aborted after firing off the job.

In some of those cases, executing a job unnecessarily won't hurt -- ones that refresh statistics for example. In others, the job absolutely must not execute.

Without having looked into it properly, I think we'll need some kind of wrapper to support this distinction. Traditional transactional messaging uses two-phase commit; other products use database queues similar to our Job. Both are probably overweight to the point where our baby would go out with the bathwater. We could fake it by queuing up jobs in memory and sending them after commit, but that leaves open a window for message loss.
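To make the "queue in memory, send after commit" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the PostCommitOutbox class, its method names, and the publish callable are hypothetical stand-ins, not part of Launchpad, zope, or any rabbit client library. The after_commit and after_abort methods would have to be wired into whatever post-transaction hooks are available.

```python
from typing import Callable, List


class PostCommitOutbox:
    """Hold messages in memory and publish them only after commit.

    `publish` is a hypothetical callable standing in for a real
    broker client; it is not an API of any particular library.
    """

    def __init__(self, publish: Callable[[str], None]) -> None:
        self._publish = publish
        self._pending: List[str] = []

    def queue(self, message: str) -> None:
        # Nothing reaches the broker while the transaction is open.
        self._pending.append(message)

    def after_commit(self) -> None:
        # Call this once the database commit has succeeded.
        # If the process dies between the commit and this loop, the
        # buffered messages are gone: that is the message-loss window.
        pending, self._pending = self._pending, []
        for message in pending:
            self._publish(message)

    def after_abort(self) -> None:
        # The producer aborted, so these jobs must never be announced.
        self._pending.clear()
```

This covers the aborted-producer case, since an aborted transaction never sends anything. It does nothing about a crash between commit and publish, so jobs that must not be lost would still need their payload in the database and some sweep to pick up the stragglers.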

Another problem happens when things work too well. You create a database-backed object, fire off a job related to that object, and commit. But the consumer picks the job up before your commit has propagated, and boom! The job dies in flames because it accesses objects that aren't visible in the database yet.

I imagine both problems go away if every message carries a database transaction id, and the job runner keeps an eye on the database transaction log: the runner shouldn't consume a job until the producing transaction has committed, and it should drop jobs whose producers have aborted. Is something along those lines built in?
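As a rough sketch of what that runner logic might look like: each message carries the producer's transaction id, and the runner asks for that transaction's status before deciding what to do. The TxStatus enum, the dispatch function, and the tx_status lookup are all hypothetical names invented for this example; how the runner would actually learn a transaction's status from the database is left open here.

```python
from enum import Enum
from typing import Callable, List, Tuple

Message = Tuple[int, str]  # (producer transaction id, job name)


class TxStatus(Enum):
    IN_PROGRESS = "in progress"
    COMMITTED = "committed"
    ABORTED = "aborted"


def dispatch(
    messages: List[Message],
    tx_status: Callable[[int], TxStatus],
    run: Callable[[str], None],
) -> List[Message]:
    """Make one pass over the messages; return those to retry later.

    `tx_status` is a hypothetical lookup of a transaction's state.
    """
    deferred: List[Message] = []
    for txid, job in messages:
        status = tx_status(txid)
        if status is TxStatus.COMMITTED:
            # The producer's data is in the database; safe to run.
            run(job)
        elif status is TxStatus.IN_PROGRESS:
            # Commit has not happened yet; look again on a later pass.
            deferred.append((txid, job))
        # ABORTED: drop the job, its producer never committed.
    return deferred
```

The runner would call dispatch repeatedly, feeding the returned list back in alongside newly received messages, so a job whose producer is still in flight simply waits its turn.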


Jeroen

