yahoo-eng-team team mailing list archive
Message #45197
[Bug 1454451] Re: simultaneous boot of multiple instances leads to cpu pinning overlap
** Changed in: nova/kilo
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1454451
Title:
simultaneous boot of multiple instances leads to cpu pinning overlap
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) kilo series:
Fix Released
Bug description:
I'm running into an issue with kilo-3 that I think is also present in
current trunk. Basically it results in multiple instances (with
dedicated cpus) being pinned to the same physical cpus.
I think there is a race between the claimed CPUs of an instance being
persisted to the DB, and the resource audit scanning the DB for
instances and subtracting pinned CPUs from the list of available CPUs.
The problem only shows up when the following sequence happens:
1) instance A (with dedicated cpus) boots on a compute node
2) resource audit runs on that compute node
3) instance B (with dedicated cpus) boots on the same compute node
So you need to either be booting many instances, or limiting the valid
compute nodes (host aggregate or server groups), or have a small
cluster in order to hit this.
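For illustration, here is a minimal toy model of that three-step sequence. It is not nova code: HOST_PCPUS, pick_cpus(), audit() and the db_pinned dict are hypothetical stand-ins, and the "lowest free pCPUs first" policy is an assumption made only to keep the result deterministic.

```python
# Toy model of the race; all names here are hypothetical, not nova code.
HOST_PCPUS = {0, 1, 2, 3}


def pick_cpus(free, count):
    """Pin to the lowest-numbered free pCPUs (assumed policy)."""
    return set(sorted(free)[:count])


def audit(db_pinned):
    """Recompute free pCPUs using only what is persisted in the DB."""
    used = set().union(*db_pinned.values())
    return HOST_PCPUS - used


db_pinned = {}           # instance name -> pinned pCPUs as stored in the DB
free = set(HOST_PCPUS)   # the tracker's in-memory view of free pCPUs

# 1) instance A boots: the claim updates the in-memory view,
#    but the pinning has not been written to the DB yet.
pins_a = pick_cpus(free, 2)
free -= pins_a

# 2) the resource audit runs and rebuilds the view from the DB,
#    which does not know about A's pinned pCPUs yet.
free = audit(db_pinned)

# A's pinning finally reaches the DB, too late for the audit above.
db_pinned["A"] = pins_a

# 3) instance B boots against the stale view.
pins_b = pick_cpus(free, 2)
free -= pins_b
db_pinned["B"] = pins_b

overlap = pins_a & pins_b
print("A:", sorted(pins_a), "B:", sorted(pins_b), "overlap:", sorted(overlap))
```

In this model both instances end up pinned to pCPUs 0 and 1, because the audit in step 2 wipes out the only record of A's claim.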
The nitty-gritty view looks like this:
When booting up an instance we hold the COMPUTE_RESOURCE_SEMAPHORE in
compute.resource_tracker.ResourceTracker.instance_claim() and that
covers updating the resource usage on the compute node. But we don't
persist the instance numa topology to the database until after
instance_claim() returns, in
compute.manager.ComputeManager._build_instance(). Note that this is
done *after* we've given up the semaphore, so there is no longer any
sort of ordering guarantee.
compute.resource_tracker.ResourceTracker.update_available_resource()
then acquires COMPUTE_RESOURCE_SEMAPHORE, queries the database for a
list of instances and uses that to calculate a new view of what
resources are available. If the numa topology of the most recent
instance hasn't been persisted yet, then the new view of resources
won't include any pCPUs pinned by that instance.
compute.manager.ComputeManager._build_instance() runs for the next
instance and based on the new view of available resources it allocates
the same pCPU(s) used by the earlier instance. Boom, overlapping
pinned pCPUs.
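The lock scope described above can be sketched with a small stand-in class. This is only a sketch under assumptions: ToyTracker, its db dict and the persist_under_lock flag are hypothetical, a plain threading.Lock stands in for COMPUTE_RESOURCE_SEMAPHORE, and the calls are made sequentially in the problematic order rather than from real threads. The flag exists only to show that the overlap depends on whether the topology is persisted before or after the lock is released; it is not meant to describe the fix that was actually merged.

```python
import threading


class ToyTracker:
    """Hypothetical stand-in for the resource tracker; not nova code."""

    def __init__(self, pcpus):
        self.all_pcpus = set(pcpus)
        self.free = set(pcpus)
        self.lock = threading.Lock()  # stand-in for COMPUTE_RESOURCE_SEMAPHORE
        self.db = {}                  # stand-in for persisted NUMA topology

    def instance_claim(self, name, count, persist_under_lock):
        with self.lock:
            pins = set(sorted(self.free)[:count])
            self.free -= pins
            if persist_under_lock:
                # Persist while still holding the lock.
                self.db[name] = pins
        return pins

    def update_available_resource(self):
        with self.lock:
            # The audit only sees pinning that is already in the DB.
            used = set().union(*self.db.values())
            self.free = self.all_pcpus - used


def run(persist_under_lock):
    tracker = ToyTracker(range(4))
    pins_a = tracker.instance_claim("A", 2, persist_under_lock)
    tracker.update_available_resource()  # audit lands in the window
    tracker.db["A"] = pins_a             # late persist (no-op if already stored)
    pins_b = tracker.instance_claim("B", 2, persist_under_lock)
    tracker.db["B"] = pins_b
    return pins_a & pins_b


print("persist after release:", sorted(run(False)))
print("persist under lock:   ", sorted(run(True)))
```

With the persist done after the lock is released, the model reproduces the overlap; with the persist done while the lock is held, the audit sees A's pinned pCPUs and B gets a disjoint set.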
Lastly, the same bug applies to the
compute.manager.ComputeManager.rebuild_instance() case. It uses the
same pattern of doing the claim and then updating the instance numa
topology after releasing the semaphore.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1454451/+subscriptions