
openstack team mailing list archive

Capacity based scheduling: What updates free_ram_mb in Folsom?

 

Hi Folks,

I was reviewing a code change to add generic retries for build failures ( https://review.openstack.org/#/c/9540/2 ), and wanted to be sure that it wouldn't invalidate the capacity accounting used by the scheduler.


However, I've been sitting here for a while working through the Folsom scheduler code, trying to understand how the capacity based scheduling now works. I'm sure I'm missing something obvious, but I just can't work out where the free_ram_mb value in the compute_node table gets updated.



I can see the database API method to update the values, compute_node_utilization_update(), but it doesn't look as if anything in the code ever calls it?
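For reference, what I was expecting to find somewhere in the compute manager was something along these lines. This is purely my own sketch, not real calling code, and the keyword arguments are my guess at the compute_node_utilization_update() signature:

    # Illustrative only - not actual Nova code; the argument names are my
    # best guess at the compute_node_utilization_update() signature.
    from nova import context
    from nova import db

    def claim_ram_for_instance(host, instance):
        ctxt = context.get_admin_context()
        # Knock the instance's memory off the host's free_ram_mb so the
        # ram_filter sees the reduced capacity on the next request.
        db.compute_node_utilization_update(
            ctxt, host,
            free_ram_mb_delta=-instance['memory_mb'],
            vm_delta=1)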



From when I last looked at this, and from various discussions here and at the design summits, I thought the approach was that:

- The scheduler would make a call (rather than a cast) to the compute manager, which would then do some verification work, update the DB table whilst still in the context of that call, and then start a thread to complete the spawn. The point of going all the way to the compute node as a call was to avoid race conditions between multiple schedulers. (The change I'm looking at is part of a blueprint to avoid such a race, so maybe I imagined the change from cast to call?) There's a rough sketch of the flow I had in mind just after this list.



- On a delete, the capacity_notifier (which had to be configured into the list_notifier) would detect the delete message and decrement the database values.
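To make the first point concrete, this is the sort of flow I had in mind. It's a standalone toy sketch; none of these names are real Nova code:

    import threading

    # Stand-in for the compute_node table entry for one host.
    FAKE_DB = {'compute1': {'free_ram_mb': 4096}}

    def claim_and_spawn(host, memory_mb):
        """What I expected to run on the compute node via a blocking rpc.call."""
        node = FAKE_DB[host]
        # Verify and update free_ram_mb while still inside the call, so this
        # scheduler (and any competing scheduler) sees the new value before
        # placing the next instance.
        if node['free_ram_mb'] < memory_mb:
            raise RuntimeError('not enough RAM on %s' % host)
        node['free_ram_mb'] -= memory_mb
        # Only the long-running spawn happens asynchronously, after the
        # capacity accounting has already been committed.
        threading.Thread(target=do_spawn, args=(host, memory_mb)).start()

    def do_spawn(host, memory_mb):
        pass  # the actual instance build would happen here

    claim_and_spawn('compute1', 2048)
    print(FAKE_DB['compute1']['free_ram_mb'])   # now 2048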



But now that I look through the code, it looks as if the scheduler is still doing a cast (scheduler/driver), and, as above, although the database API call to update the values (compute_node_utilization_update()) is there, nothing ever seems to call it.



The ram_filter scheduler seems to use the free_ram_mb value, and that value comes from the host_manager in the scheduler, which reads it from the database, but I can't see where these values ever get updated in the database.
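My reading of the ram_filter, paraphrased from memory rather than quoted from the Folsom source, is basically just this:

    # Paraphrase of what I understand the ram_filter to check - not the
    # exact Folsom code (it ignores things like the RAM allocation ratio).
    from collections import namedtuple

    HostState = namedtuple('HostState', ['host', 'free_ram_mb'])

    def ram_filter_passes(host_state, requested_ram_mb):
        # host_state.free_ram_mb is whatever the host_manager read from the
        # compute_node table, so the filter is only as accurate as that value.
        return host_state.free_ram_mb >= requested_ram_mb

    print(ram_filter_passes(HostState('compute1', 4096), 2048))   # True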



The capacity_notifier, which used to decrement the values on a VM deletion only (according to the comments, the increment was done in the scheduler), seems to have now disappeared altogether in the move of the notifier code to openstack/common?
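What I remember the capacity_notifier doing, again reconstructed from memory rather than from the real source, and with the event type and keyword arguments very much guesses on my part, was roughly:

    # Rough reconstruction from memory of the old capacity_notifier - the
    # event type and the db call's keyword arguments are guesses.
    from nova import context
    from nova import db

    def notify(message):
        if message.get('event_type') != 'compute.instance.delete.end':
            return
        # publisher_id looks like 'compute.<hostname>' for compute messages.
        host = message.get('publisher_id', '').split('.', 1)[-1]
        payload = message.get('payload', {})
        # Give the deleted instance's RAM back and knock one off the VM count.
        db.compute_node_utilization_update(
            context.get_admin_context(), host,
            free_ram_mb_delta=payload.get('memory_mb', 0),
            vm_delta=-1)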



So I'm sure I'm missing some other, even more cunning, plan for keeping the values current, but I can't for the life of me work out what it is - can someone fill me in, please?



Thanks,

Phil

