yahoo-eng-team team mailing list archive

[Bug 1635674] Re: 'hw:cpu_thread_policy=isolate' is not accounted properly on non-HT hosts

 

Reviewed:  https://review.openstack.org/391416
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9f12b592d1d26a985699fefde2a7ce0164d0b5d3
Submitter: Jenkins
Branch:    master

commit 9f12b592d1d26a985699fefde2a7ce0164d0b5d3
Author: Sergey Nikitin <snikitin@xxxxxxxxxxxx>
Date:   Fri Dec 9 17:42:14 2016 +0400

    Mark sibling CPUs as 'used' for cpu_thread_policy = 'isolated'
    
    The 'isolate' CPU thread policy guarantees that no vCPUs
    from other guests can be placed on the cores used by a
    booted VM (here a core is a set of sibling CPUs).
    
    However, we are still able to boot VMs with the 'dedicated'
    CPU allocation policy on these cores. The problem occurs
    on hosts without HyperThreading. There the sets of sibling
    CPUs are empty for each core, but we still try to treat
    them as HyperThreading cores, so one "isolated" core can
    end up being used by several VMs.
    
    To fix it, we must use the method unpin_cpus_with_siblings()
    only if the NUMA cell has siblings (i.e. has HyperThreading).
    For cells without HyperThreading, CPU isolation is
    guaranteed by the 'dedicated' CPU allocation policy.
    
    Closes-Bug: #1635674
    
    Change-Id: I8f72187153c930cd941b7ee7e835a20ed0c0de03
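
The guard described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: update_pinning() and its parameters are hypothetical names, and plain Python sets stand in for nova's NUMA cell objects.

```python
def update_pinning(pinned, siblings, cpus, isolate, free=False):
    """Return the updated set of pinned pCPUs for one NUMA cell.

    Hypothetical helper for illustration only; not part of nova.
    """
    cpus = set(cpus)
    if isolate and siblings:
        # HT cell: start from the requested CPUs and add every thread
        # of each core that the request touches.
        widened = set(cpus)
        for sib in siblings:
            if cpus & sib:
                widened |= sib
        cpus = widened
    # Non-HT cell (empty siblings) or no 'isolate' policy: plain pinning.
    return pinned - cpus if free else pinned | cpus


# Non-HT cell: the requested pCPUs themselves are recorded as pinned.
non_ht = update_pinning(set(), [], {0, 1}, isolate=True)

# HT cell: pinning pCPU 0 also marks its sibling thread 2 as used.
ht_siblings = [{0, 2}, {1, 3}]
ht = update_pinning(set(), ht_siblings, {0}, isolate=True)

# Freeing the instance releases the whole core again.
freed = update_pinning(ht, ht_siblings, {0}, isolate=True, free=True)
```

Falling back to plain pinning when the sibling list is empty keeps the non-HT case accounted for without changing how the host topology is built.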


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1635674

Title:
  'hw:cpu_thread_policy=isolate' is not accounted properly on non-HT
  hosts

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  If an instance with 'hw:cpu_thread_policy=isolate' is scheduled on a
  non-HT host, the pinned pCPUs are not properly accounted for.  This
  can lead to multiple instances running on the same pCPUs.

  The problem is that LibvirtDriver._get_host_numa_topology() filters
  out single CPUs when calculating the NUMACell.siblings field.  On a
  non-HT host this means that NUMACell.siblings is an empty list.

  Later, when _update_usage() runs, it eventually calls
  NUMACell.pin_cpus_with_siblings(), which contains the following code:

      def pin_cpus_with_siblings(self, cpus):
          pin_siblings = set()
          for sib in self.siblings:
              if cpus & sib:
                  pin_siblings.update(sib)
          self.pin_cpus(pin_siblings)

  Since "self.siblings" is empty, we end up calling self.pin_cpus() with
  an empty set, which means that self.pinned_cpus is never updated.
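
  The behaviour can be reproduced outside nova with a small stand-in
  class.  This is only a sketch: FakeNUMACell is a hypothetical name and
  models just the fields mentioned in this report, not the real NUMACell.

```python
class FakeNUMACell:
    """Hypothetical stand-in for nova's NUMACell; not the real class."""

    def __init__(self, cpuset, siblings):
        self.cpuset = set(cpuset)
        self.siblings = siblings      # list of sets of thread siblings
        self.pinned_cpus = set()

    def pin_cpus(self, cpus):
        self.pinned_cpus |= cpus

    def pin_cpus_with_siblings(self, cpus):
        # Same logic as the method quoted in the bug description.
        pin_siblings = set()
        for sib in self.siblings:
            if cpus & sib:
                pin_siblings.update(sib)
        self.pin_cpus(pin_siblings)


# Non-HT host: single-CPU sibling sets were filtered out, so siblings == [].
non_ht = FakeNUMACell(cpuset={0, 1, 2, 3}, siblings=[])
non_ht.pin_cpus_with_siblings({0, 1})

# HT host: each core exposes two threads.
ht = FakeNUMACell(cpuset={0, 1, 2, 3}, siblings=[{0, 2}, {1, 3}])
ht.pin_cpus_with_siblings({0})
```

  On the non-HT cell the loop body never runs, so pinned_cpus stays
  empty and the same pCPUs look free to the next instance.  On the HT
  cell pCPU 0 and its sibling thread 2 are both recorded.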

  Stephen Finucane has suggested the correct fix might be to leave
  single pCPUs in the NUMACell.siblings field.  This needs to be
  verified to make sure that it doesn't cause other problems.
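
  To illustrate the suggestion, here is a sketch comparing the two ways
  of building the siblings list.  The helpers build_siblings() and
  cpus_to_pin() are hypothetical and do not exist in nova; they only
  mirror the filtering and pinning logic described in this report.

```python
def build_siblings(thread_groups, keep_single=False):
    """Build a siblings list from per-core groups of pCPU IDs.

    Hypothetical helper. With keep_single=False, single-CPU groups are
    dropped, as described in this report; with keep_single=True they
    are left in place.
    """
    return [set(g) for g in thread_groups if keep_single or len(g) > 1]


def cpus_to_pin(siblings, cpus):
    """Collect every sibling set that overlaps the requested pCPUs."""
    pinned = set()
    for sib in siblings:
        if cpus & sib:
            pinned.update(sib)
    return pinned


# Non-HT host: every core has exactly one thread.
non_ht_groups = [{0}, {1}, {2}, {3}]

filtered = build_siblings(non_ht_groups)
kept = build_siblings(non_ht_groups, keep_single=True)

pinned_filtered = cpus_to_pin(filtered, {0, 1})
pinned_kept = cpus_to_pin(kept, {0, 1})
```

  With single pCPUs kept, the requested pCPUs end up in the pinned set.
  Whether other users of NUMACell.siblings tolerate single-element sets
  is exactly what would still need to be verified.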

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1635674/+subscriptions

