nova team mailing list archive

Re: Why not python threads?

To: Joshua McKenty <jmckenty@xxxxxxxxx>
From: Justin Santa Barbara <justin@xxxxxxxxxxxx>
Date: Wed, 4 Aug 2010 12:29:40 -0700
Cc: "nova@xxxxxxxxxxxxxxxxxxx" <nova@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <232ED98F-2409-4150-9798-62A12727C031@gmail.com>

If this is the primary issue, then I think that dealing with signals is
surely easier than dealing with Twisted or Eventlet.

I would propose that for the back-end services we write them 'simply' in
Python threads.  For the front end services (which are effectively proxies
to the data store and the back end services), there may be a performance
case to be made for async code.  I don't believe that the back-end services
have high performance requirements, but they do have the requirement to
be correct even when dealing with messy back-end APIs and things going badly
wrong.  That logic will end up twisted enough as it is :-)
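
As a rough illustration of what 'simply in Python threads' could look like,
here is a minimal sketch, not Nova code.  The names flaky_backend_call,
worker and run are hypothetical, and it assumes a modern Python 3 standard
library.  Each worker handles one request at a time in ordinary blocking
style, and a failure in one request is recorded rather than killing the
worker.

```python
import queue
import threading


def flaky_backend_call(request):
    """Hypothetical stand-in for a blocking call to a messy back-end API."""
    if request == "bad":
        raise RuntimeError("back end went wrong")
    return "done: " + request


def worker(tasks, results):
    """Process requests one at a time, in plain blocking style."""
    while True:
        request = tasks.get()
        if request is None:  # sentinel: no more work for this worker
            break
        try:
            results.put((request, flaky_backend_call(request)))
        except Exception as exc:
            # A failure in one request must not take the worker down.
            results.put((request, "failed: %s" % exc))


def run(requests, num_workers=4):
    """Run all requests on a small pool of threads and collect the results."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for request in requests:
        tasks.put(request)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return dict(results.get() for _ in requests)
```

The error handling sits in one place, around the blocking call, which is the
kind of straight-line logic the threaded approach is meant to allow.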

I believe the signal problems must be the same for Twisted as for simple
Python threads (in particular with threads.deferToThread), it's just that
Twisted (hopefully) handles signals.  Maybe we can look at how they make it
work.

What are the requirements for 'correct signal handling'?  Is it "the process
should exit in a timely way in response to SIGINT and SIGTERM, and
immediately for SIGKILL?"
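
One possible reading of that requirement, as a minimal sketch under stated
assumptions (modern Python 3 on a POSIX system; the names stop,
handle_signal, worker and serve are hypothetical, not Nova or Twisted code):
CPython runs Python-level signal handlers in the main thread only, so the
handler just sets a shared flag, and the worker threads poll that flag
between units of work.  SIGKILL cannot be caught at all, so it needs no code.

```python
import signal
import threading

# Shared flag that tells every thread to finish up and exit.
stop = threading.Event()


def handle_signal(signum, frame):
    # Runs in the main thread; keep it tiny and just record the request.
    stop.set()


def worker():
    while not stop.is_set():
        # One unit of (hypothetical) work would go here.  Waiting on the
        # event with a timeout lets the thread notice shutdown promptly.
        stop.wait(0.1)


def serve(num_workers=2):
    """Run workers until SIGINT or SIGTERM arrives, then exit cleanly."""
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGINT, handle_signal)
    signal.signal(signal.SIGTERM, handle_signal)
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    # The main thread waits in short intervals so it stays responsive.
    while not stop.is_set():
        stop.wait(0.5)
    for t in threads:
        t.join()
    return "stopped"
```

How 'timely' the exit is depends on how long a single unit of work can block;
a worker stuck in a long back-end call will only see the flag once that call
returns.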

Justin




On Wed, Aug 4, 2010 at 3:23 AM, Joshua McKenty <jmckenty@xxxxxxxxx> wrote:

> The biggest issue is the interaction with signals and the python threading
> model. Multiprocess certainly works (see the nova use of process pool), but
> you're making your code simpler at the cost of more complex process
> supervision (which I don't object to in this case).
>
> Signals come up in deployment a lot: how to roll out code changes, etc. If
> we fix live migration, this gets much easier.
>
> On 2010-08-04, at 5:05 AM, Justin Santa Barbara <justin@xxxxxxxxxxxx>
> wrote:
>
> Forgive a Python noob's question, but what's wrong with just using Python
> threads?  Why introduce multiple processes?
>
> It seems that Eric's benchmarks indicate that the overhead would be
> tolerable, and the code would definitely be much cleaner.
>
> The multiple process idea is another argument in favor of simple
> threading... if we figure out sharding, we could run multiple compute
> service processes to get around scaling limits that going with simple
> threading might introduce (e.g. GIL contention).
>
> Justin
>
>
>
> On Tue, Aug 3, 2010 at 7:56 PM, Vishvananda Ishaya <vishvananda@xxxxxxxxx>
> wrote:
>
>> If we want to go with the simplest possible approach, we could make the
>> compute workers synchronous and just run multiple copies on each host.  We
>> could make one of them 'read only' so it only answers simple/fast requests,
>> and a few (4?) others for other long/io intensive tasks.  The ultimate would
>> be to have each message actually have its own worker a la erlang, but that
>> might be a bit extreme.
>>
>> I've been doing a lot of the changes lately that require switching
>> everything to async.  It is a bit annoying to wrap your head around it, but
>> it really isn't all that bad.  That said, I'm all for making things as
>> simple as possible.
>>
>> Vish
>>
>> On Tue, Aug 3, 2010 at 6:30 PM, Justin Santa Barbara <justin@xxxxxxxxxxxx>
>> wrote:
>>
>>> Without meaning to make the twisted/eventlet flamewar any worse, can I
>>> just ask why we're not just using 'good old threads'?  I've asked Eric Day
>>> for his input based on his great benchmarks (http://oddments.org/?p=494).
>>> My background is from the Java world, where threads work wonderfully -
>>> possibly even better than async:
>>> http://www.thebuzzmedia.com/java-io-faster-than-nio-old-is-new-again/
>>>
>>> I feel like Nova is greatly complicated by the async code, and I'm
>>> starting to see some of the pain of Twisted: it seems that _everything_
>>> needs to be async in the long run, because if something calls a function
>>> that is (or could be) async, it must itself be async.  So yields and
>>> @defer.inlineCallbacks start cropping up everywhere.
>>>
>>> One of the project goals seems to be simplicity of the code, for fewer
>>> bugs and to reduce barriers to entry, and it seems that if we could use
>>> 'plain old Python' we would better achieve this goal than if we have to
>>> use an async framework.
>>>
>>> I know that Python has its issues here with the GIL, but I'm just
>>> wondering whether, in the case of nova, threads might be good enough, and
>>> produce much easier to understand code?  I'm guessing that maybe the project
>>> started with threads - what happened?
>>>
>>> Justin
>>>
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~nova
>>> Post to     : nova@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~nova
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>>
>>

Follow ups

Re: Why not python threads?
From: Eric Day, 2010-08-04

References

Why not python threads?
From: Justin Santa Barbara, 2010-08-04
Re: Why not python threads?
From: Vishvananda Ishaya, 2010-08-04
Re: Why not python threads?
From: Justin Santa Barbara, 2010-08-04
Re: Why not python threads?
From: Joshua McKenty, 2010-08-04