openstack team mailing list archive

Re: Scheduler issues in folsom

 

Hi Vish,

I like the idea of keeping host states in memory (or in an external cache
such as memcached).  This should fix the root cause of why the core filter
doesn't work in Jonathan's case.  For memory, though, I think we still
need to find a way to handle hypervisors that don't allocate the entire
memory for a guest, like KVM.
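
To make the in-memory idea concrete (option a in the quoted message
below), here is a minimal sketch.  It is not the code from Vish's patch:
the HostState class, its fields, the schedule() helper and the default
ratio are all hypothetical names chosen for illustration.  The point is
only that the scheduler subtracts the RAM it has just promised from its
own cached copy, instead of waiting for the next state report from the
compute node.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical in-memory view of one compute node (not nova's real class)."""
    name: str
    total_ram_mb: int
    # RAM already promised to guests by the scheduler. This is tracked
    # separately from what the hypervisor reports as in use, because a
    # hypervisor such as KVM may not allocate a guest's full memory.
    used_ram_mb: int = 0

    def free_ram_mb(self, ram_allocation_ratio: float = 1.0) -> float:
        return self.total_ram_mb * ram_allocation_ratio - self.used_ram_mb

    def consume(self, ram_mb: int) -> None:
        # Update the cached state right away so the next request sees it.
        self.used_ram_mb += ram_mb


def schedule(hosts, ram_mb, ram_allocation_ratio=1.0):
    """Pick the host with the most free RAM (spread-first) and claim the RAM."""
    candidates = [h for h in hosts
                  if h.free_ram_mb(ram_allocation_ratio) >= ram_mb]
    if not candidates:
        raise RuntimeError("no valid host found")
    best = max(candidates, key=lambda h: h.free_ram_mb(ram_allocation_ratio))
    best.consume(ram_mb)
    return best.name
```

With two 4096 MB hosts and back-to-back requests for 2048 MB each, the
requests alternate between the hosts and a fifth request is rejected,
even though no compute node has reported new usage in the meantime.
With more than one scheduler process the same bookkeeping would have to
live in a shared store, as the quoted message notes.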

On Thu, Nov 1, 2012 at 10:54 AM, Vishvananda Ishaya
<vishvananda@xxxxxxxxx> wrote:
>
> On Oct 31, 2012, at 5:57 PM, Vishvananda Ishaya <vishvananda@xxxxxxxxx> wrote:
>
>>
>> Looking at the code, it appears that the relevant info is being sent down to the compute node. That said, I can't seem to repro your issue with even just the ram filter. I can't get it to overallocate on one node unless I specifically change the ratio, regardless of fill-first or spread-first.
>>
>> Vish
>>
>
> I was finally able to reproduce this. It appears that you are scheduling the ~300 instances very rapidly, one at a time. If you schedule the instances in one batch (using min_count and max_count), the scheduler schedules them all in a single loop, taking the previous allocations into account. Otherwise it has to wait for a new state report from the compute node before it can update the used resources.  Fortunately, I think we can also create a fix for the many-requests case by either:
>
> a) keeping the host states in memory and using the in-memory version (multiple schedulers would have to use something like a shared memcache)
>
> or
>
> b) updating the host state in the database when we launch the instance so the next request gets the new data.
>
> I think that b) still has a pretty large window where there can be a race condition, whereas the potential race in a) is almost impossible, so I attempted a version of a).
>
> My patch here seems to fix the issue in the one scheduler case:
>
> https://github.com/vishvananda/nova/commit/2eaf796e60bd35319fe6add6dd04359546a21682
>
> If you could give that a try on your scheduler node and see if it fixes it, that would be awesome. Also, it would be very helpful if you could report a bug for me to reference in my merge proposal. I will see what I can do to write a few tests and have a potential fix for multiple schedulers.
>
> Vish



-- 
Regards
Huang Zhiteng

