Hi Yun,
The point of the sleep(0) is to explicitly yield from a long-running eventlet so that other eventlets aren't blocked for a long period. Depending on how you look at it, that means either we're making an explicit judgement on priority, or we're trying to provide a more equal sharing of run-time across eventlets.
It's not that things are CPU bound as such - more just that eventlets have very few pre-emption points. Even an IO bound activity like creating a snapshot won't cause an eventlet switch.
So in terms of priority we're trying to get to the state where:
- Important periodic events (such as service status) run when expected (if these take a long time we're stuffed anyway)
- User initiated actions don't get blocked by background system eventlets (such as refreshing power-state)
- Slow actions from one user don't block actions from other users (the first user will expect their snapshot to take X seconds, the second one won't expect their VM creation to take X + Y seconds).
It almost feels like the right level of concurrency would be to have a task/process running for each VM, so that there is concurrency across un-related VMs, but serialisation for each VM.
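Purely as a sketch of what I mean (not something Nova does today): one lock per VM gives serialisation for each VM while unrelated VMs still interleave. It uses the standard library's asyncio as a stand-in for eventlet so it runs without extra packages, and the PerVmSerializer class, the VM ids and the action names are all hypothetical.

```python
import asyncio
from collections import defaultdict


class PerVmSerializer:
    """Hypothetical helper: at most one action at a time per VM id."""

    def __init__(self):
        # Locks are created lazily, the first time a VM id is seen.
        self._locks = defaultdict(asyncio.Lock)
        self.log = []

    async def run(self, vm_id, action, steps):
        # Actions on the same VM queue up behind this lock;
        # actions on other VMs use a different lock and are not held up.
        async with self._locks[vm_id]:
            self.log.append((vm_id, action, "start"))
            for _ in range(steps):
                # Stand-in for real work that yields cooperatively.
                await asyncio.sleep(0)
            self.log.append((vm_id, action, "end"))


async def demo():
    serializer = PerVmSerializer()
    await asyncio.gather(
        serializer.run("vm-a", "snapshot", 5),  # slow action on vm-a
        serializer.run("vm-a", "reboot", 1),    # same VM: waits for the snapshot
        serializer.run("vm-b", "create", 1),    # different VM: not blocked
    )
    return serializer.log


if __name__ == "__main__":
    for entry in asyncio.run(demo()):
        print(entry)
```

The second request against vm-a only starts once the snapshot has finished, while the request against vm-b completes in the meantime. That only holds because the slow action yields; without those yield points the per-VM locks on their own wouldn't buy any concurrency.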
Phil
-----Original Message-----
From: Yun Mao [mailto:yunmao@xxxxxxxxx]
Sent: 02 March 2012 20:32
To: Day, Phil
Cc: Chris Behrens; Joshua Harlow; openstack
Subject: Re: [Openstack] eventlet weirdness
Hi Phil, I'm a little confused. To what extent does sleep(0) help?
It only gives the greenlet scheduler a chance to switch to another green thread. If we are having a CPU bound issue, sleep(0) won't give us access to any more CPU cores. So the total time to finish should be the same no matter what. It may improve the fairness among different green threads but shouldn't help the throughput. I think the only apparent gain is in a situation where there is 1 green thread with long CPU time and many other green threads with small CPU time.
The total finish time will be the same with or without sleep(0), but with sleep(0) in the first thread, the others should be much more responsive.
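To make that concrete, here is a toy sketch. It uses the standard library's asyncio as a stand-in for eventlet so it runs without extra packages (asyncio.sleep(0) plays the role of eventlet's sleep(0)); the worker names, the unit counts and the logical "clock" are all made up for illustration and have nothing to do with Nova's actual code.

```python
import asyncio


async def worker(name, units, yields, clock, finished):
    """Do `units` of simulated CPU work, optionally yielding after each unit."""
    for _ in range(units):
        clock[0] += 1  # one unit of work on the single shared "CPU"
        if yields:
            # Cooperative yield: lets the scheduler run another task.
            await asyncio.sleep(0)
    finished[name] = clock[0]  # logical time at which this worker finished


async def run(long_worker_yields):
    clock, finished = [0], {}
    await asyncio.gather(
        worker("long", 100, long_worker_yields, clock, finished),
        worker("short-1", 2, True, clock, finished),
        worker("short-2", 2, True, clock, finished),
    )
    return clock[0], finished


def compare():
    """Return (total, finish times) without and with yields in the long worker."""
    return asyncio.run(run(False)), asyncio.run(run(True))


if __name__ == "__main__":
    for label, (total, finished) in zip(("no yield", "with yield"), compare()):
        print(label, "total units:", total, "finished at:", finished)
```

Both runs end after the same total number of work units, so throughput is unchanged. The difference is only in when the short workers finish: behind all 100 units of the long worker when it never yields, or after a handful of units when it does.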
However, it's unclear to me which part of Nova is very CPU intensive.
It seems that most work here is IO bound, including the snapshot. Do we have other blocking calls besides mysql access? I feel like I'm missing something but couldn't figure out what.
Thanks,
Yun
On Fri, Mar 2, 2012 at 2:08 PM, Day, Phil<philip.day@xxxxxx> wrote:
I didn't say it was pretty - given the choice I'd much rather have a threading model that really did concurrency and pre-emption in all the right places, and it would be really cool if something managed the threads that were started so that if a second conflicting request was received it did some proper tidy up or blocking rather than just leaving the race condition to work itself out (then we wouldn't have to try and control it by checking vm_state).
However ... in the current code base, where we only have user space based eventlets with no pre-emption, and some activities that need to be prioritised, forcing pre-emption with a sleep(0) seems a pretty small bit of untidiness. And it works now without a major code refactor.
Always open to other approaches ...
Phil
-----Original Message-----
From: openstack-bounces+philip.day=hp.com@xxxxxxxxxxxxxxxxxxx
[mailto:openstack-bounces+philip.day=hp.com@xxxxxxxxxxxxxxxxxxx] On
Behalf Of Chris Behrens
Sent: 02 March 2012 19:00
To: Joshua Harlow
Cc: openstack; Chris Behrens
Subject: Re: [Openstack] eventlet weirdness
It's not just you
On Mar 2, 2012, at 10:35 AM, Joshua Harlow wrote:
Does anyone else feel that the following seems really "dirty", or is it just me.
"adding a few sleep(0) calls in various places in the Nova codebase
(as was recently added in the _sync_power_states() periodic task) is
an easy and simple win with pretty much no ill side-effects. :)"
Dirty in that it feels like there is something wrong from a design point of view.
Sprinkling "sleep(0)" seems like it's a band-aid on a larger problem imho.
But that's just my gut feeling.
:-(
On 3/2/12 8:26 AM, "Armando Migliaccio"<Armando.Migliaccio@xxxxxxxxxxxxx> wrote:
I knew you'd say that :P
There you go: https://bugs.launchpad.net/nova/+bug/944145
Cheers,
Armando
-----Original Message-----
From: Jay Pipes [mailto:jaypipes@xxxxxxxxx]
Sent: 02 March 2012 16:22
To: Armando Migliaccio
Cc: openstack@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Openstack] eventlet weirdness
On 03/02/2012 10:52 AM, Armando Migliaccio wrote:
I'd be cautious to say that no ill side-effects were introduced. I found a race condition right in the middle of sync_power_states, which I assume was exposed by "breaking" the task deliberately.
Such a party-pooper! ;)
Got a link to the bug report for me?
Thanks!
-jay
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp