Re: Folsom nova-scheduler race condition?
Hi Jon,
I believe the retry is meant to occur not just if the spawn fails, but also if a host receives a request which it can't honour because it already has too many VMs running or in the process of being launched.
Maybe try reducing your filter set a bit ("standard_filters" means all filters, I think) in case there is some odd interaction within that full set?
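For example, a trimmed nova.conf might look something like the below (just a sketch; RetryFilter and scheduler_max_attempts are the Folsom retry knobs as I understand them, so do verify the names against your release):
scheduler_available_filters=nova.scheduler.filters.standard_filters
# Keep only the filters you actually need; RetryFilter skips hosts that
# have already failed an earlier attempt at this request
scheduler_default_filters=RetryFilter,RamFilter,CoreFilter,ComputeFilter
# Number of scheduling attempts before the request is failed for good
scheduler_max_attempts=3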
Phil
-----Original Message-----
From: openstack-bounces+philip.day=hp.com@xxxxxxxxxxxxxxxxxxx On Behalf Of Jonathan Proulx
Sent: 09 October 2012 15:53
To: openstack@xxxxxxxxxxxxxxxxxxx
Subject: [Openstack] Folsom nova-scheduler race condition?
Hi All,
Looking for a sanity check before I file a bug. I very recently upgraded my installation to Folsom (on top of Ubuntu 12.04/KVM). My scheduler settings in nova.conf are:
scheduler_available_filters=nova.scheduler.filters.standard_filters
scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
least_cost_functions=nova.scheduler.least_cost.compute_fill_first_cost_fn
compute_fill_first_cost_fn_weight=1.0
cpu_allocation_ratio=1.0
This had been working with Essex to fill systems based on available RAM and to not exceed a 1:1 allocation ratio of CPU resources. With Folsom, if I specify a moderately large number of instances to boot, or spin up single instances in a tight shell loop, they all get scheduled on the same compute node, well in excess of the number of available vCPUs. If I start them one at a time (using --poll in a shell loop so each instance is started before the next launches) then I get the expected allocation behaviour.
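For reference, the two launch patterns look roughly like this (illustrative only; the image and flavor names are placeholders):
# Tight loop: nova boot returns as soon as the API accepts the request,
# so all ten requests hit the scheduler almost simultaneously
for i in $(seq 1 10); do nova boot --image precise --flavor m1.small test-$i; done
# Serialized with --poll: each boot blocks until the instance is active,
# and the instances spread across hosts as expected
for i in $(seq 1 10); do nova boot --image precise --flavor m1.small --poll test-$i; done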
I see https://bugs.launchpad.net/nova/+bug/1011852, which seems to attempt to address this issue, but as I read it that "fix" is based on retrying failures. Since KVM is capable of overcommitting both CPU and memory, I don't seem to get a retryable failure, just really bad performance.
Am I missing something about this fix, or is there a reported bug I didn't find in my search, or is this really a bug no one has reported?
Thanks,
-Jon