maas-devel team mailing list archive

Re: Cluster download tasks completely block other jobs :(

 

On 12 June 2014 10:21, Julian Edwards <julian.edwards@xxxxxxxxxxxxx> wrote:
> On 12/06/14 17:11, Raphaël Badin wrote:
>> [...]
>>> I've filed a critical bug about this.
>>> https://bugs.launchpad.net/maas/+bug/1328351
>>
>> Even if other tasks might take a long time to execute (powering up a
>> machine might for instance), the import task is a bit of a special case
>> in the sense that it can take a really long time to execute *and* it
>> doesn't make any sense to have two instances of this task running at the
>> same time.  Plus it can be triggered both by a user and by a cron-like
>> mechanism.
>>
>> We could change the import task so that it grabs a file-based lock when
>> it starts.  Any import task started before the release of the lock would
>> just exit silently.  Celery has provision to help us deal gracefully
>> with failure modes (task crashing without releasing the lock, etc.).
>>
>
> I had refrained from jumping into solutions yet, but since you started
> it, here goes :)
>
>  - The job should not be in Celery, but in pserv which is Twisted-based
> and lets us do event-driven IO.  This means either hacking the
> simplestreams library to use Twisted or cheating with a deferToThread.
>
>  - We do need a lock of some sort (I don't care what kind).
>
>  - Ensure that downloads are serialised
>
> As we start to move away from Celery I hope this all falls into place by
> using the existing RPC mechanism in pserv.

Despite my desire to not look at a single document for 24 hours, I am
in fact writing up a short plan for migrating away from Celery today.
What you've suggested is a-okay (whatever that means). Also,
deferToThread will be fine too :)
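
For reference, the file-based lock idea quoted above could look roughly like the sketch below. It is only an illustration and not MAAS code: the lock path and the run_import_exclusively() helper are made-up names, and it uses a plain POSIX flock() rather than anything Celery or pserv provides. A second caller that finds the lock held returns without doing any work, which matches the "exit silently" behaviour suggested in the thread.

```python
import fcntl
import os
import tempfile

# Hypothetical lock path, chosen for this sketch only (not from MAAS).
LOCK_PATH = os.path.join(tempfile.gettempdir(), "maas-import-sketch.lock")


def run_import_exclusively(do_import, lock_path=LOCK_PATH):
    """Run do_import() unless another caller already holds the lock.

    Returns True if the import ran, False if it was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail at once instead of waiting.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Someone else is importing; exit silently.
            return False
        try:
            do_import()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return True
```

One reason to prefer flock() over a "lock file exists" check is that the kernel drops the lock when the holding process dies, so a crashed import cannot leave a stale lock behind. This only covers the mutual-exclusion part; it says nothing about where the job runs (Celery or pserv).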

