openstack team mailing list archive

Deal with booting lots of instances simultaneously

 

Hi all,

When lots of instances are created simultaneously, many of them end up
in the ERROR state, and most of those failures are caused by network RPC
request timeouts. This is not a graceful result.

I think it would be better if the scheduler kept a queue of create
requests. When it finds that all the hosts are busy enough
(compute_node.current_workload reaches some value), it temporarily stops
casting requests to hosts until it finds a host that is free enough. This
way, booting lots of instances simultaneously results in ACTIVE instances
rather than lots of ERROR instances. There is one weak point, though: if
the limit on current_workload is set too low, instance creation will be
slow.
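
To make the idea concrete, here is a rough sketch in Python. It is not the real scheduler code: the class name, max_workload, the hosts dict and the cast callback are all made up for illustration, and current_workload is modelled as a plain per-host counter.

```python
from collections import deque


class ThrottledScheduler:
    """Hold create requests while every host is at or above max_workload.

    All names here are hypothetical; this only illustrates the queueing idea.
    """

    def __init__(self, hosts, max_workload, cast):
        self.hosts = hosts                # assumed: host name -> current_workload
        self.max_workload = max_workload  # assumed threshold for "busy enough"
        self.cast = cast                  # assumed callback: cast(host, request)
        self.pending = deque()            # FIFO queue of waiting requests

    def submit(self, request):
        # Queue the request first, then send as many as the hosts can take.
        self.pending.append(request)
        self.drain()

    def host_freed(self, host):
        # Called when a host reports that one of its builds has finished.
        self.hosts[host] = max(0, self.hosts[host] - 1)
        self.drain()

    def drain(self):
        # Cast queued requests until the queue is empty or all hosts are busy.
        while self.pending and self.hosts:
            host = min(self.hosts, key=self.hosts.get)  # least-loaded host
            if self.hosts[host] >= self.max_workload:
                return  # every host is busy; keep the rest queued
            self.hosts[host] += 1
            self.cast(host, self.pending.popleft())
```

With two idle hosts and max_workload=1, submitting three requests casts two of them right away and keeps the third queued until host_freed() is called for one of the hosts. A lower max_workload means fewer timeouts but a longer wait in the queue, which is the weak point mentioned above.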

Do you have another quick fix?

Thanks,

-- 
best regards,
gtt
