yahoo-eng-team team mailing list archive

[Bug 1889633] Re: Pinned instance with thread policy can consume VCPU

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Elod Illes <1889633@xxxxxxxxxxxxxxxxxx>
Date: Tue, 24 Nov 2020 11:28:27 -0000
Reply-to: Bug 1889633 <1889633@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
** Changed in: nova/train
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1889633

Title:
  Pinned instance with thread policy can consume VCPU

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  Fix Released
Status in OpenStack Compute (nova) ussuri series:
  Fix Released

Bug description:
  In Train, we introduced the concept of the 'PCPU' resource type to
  track pinned instance CPU usage. The '[compute] cpu_dedicated_set'
  option indicates which host cores should be used by pinned instances
  and, once this config option was set, nova would start reporting
  'PCPU' resource types in addition to (or entirely instead of, if
  'cpu_shared_set' was unset) 'VCPU'. Requests for pinned instances (via
  the 'hw:cpu_policy=dedicated' flavor extra spec or equivalent image
  metadata property) would result in a query for 'PCPU' inventory rather
  than 'VCPU', as previously done.
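
  The mapping from CPU policy to requested resource class can be
  sketched as below. This is an illustrative model only, not nova's
  actual code: the function name 'cpu_resource_request' and the plain
  dict representation of extra specs are assumptions made for the
  example.

```python
# Illustrative sketch only; not nova's actual implementation.
# `cpu_resource_request` is a hypothetical helper name.
def cpu_resource_request(extra_specs: dict, vcpus: int) -> dict:
    """Return a placement-style resource request for a flavor's CPUs."""
    # Flavors without an explicit policy are treated as shared (unpinned).
    policy = extra_specs.get("hw:cpu_policy", "shared")
    # Pinned instances ask for PCPU inventory; everything else asks for VCPU.
    resource_class = "PCPU" if policy == "dedicated" else "VCPU"
    return {resource_class: vcpus}


pinned = cpu_resource_request({"hw:cpu_policy": "dedicated"}, 4)
shared = cpu_resource_request({}, 4)
print(pinned)  # {'PCPU': 4}
print(shared)  # {'VCPU': 4}
```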

  We anticipated some upgrade issues with this change, whereby there
  could be a period during an upgrade in which some hosts would have the
  new configuration, meaning they'd be reporting PCPU, but the remainder
  would still be on legacy config and therefore would continue reporting
  just VCPU. An instance could be reasonably expected to land on any
  host, but since only the hosts with the new configuration were
  reporting 'PCPU' inventory and the 'hw:cpu_policy=dedicated' extra
  spec was resulting in a request for 'PCPU', the hosts with legacy
  configuration would never be selected.

  We worked around this issue by adding support for a fallback placement
  query, enabled by default, which would make a second request using
  'VCPU' inventory instead of 'PCPU'. The idea behind this was that the
  hosts with 'PCPU' inventory would be preferred, meaning we'd only try
  the 'VCPU' allocation if the preferred path failed. Crucially, we
  anticipated that if a host with new style configuration was picked up
  by this second 'VCPU' query, an instance would never actually be able
  to build there. This is because the new-style configuration would be
  reflected in the 'numa_topology' blob of the 'ComputeNode' object,
  specifically via the 'cpuset' (for cores allocated to 'VCPU') and
  'pcpuset' (for cores allocated to 'PCPU') fields. With new-style
  configuration, both of these are set to unique values. If the
  scheduler had determined that there wasn't enough 'PCPU' inventory
  available for the instance, that would implicitly mean there weren't
  enough of the cores listed in the 'pcpuset' field still available.
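
  The intended behaviour of the fallback can be modelled roughly as
  follows. This is a simplified sketch under stated assumptions, not
  nova's scheduler code: the 'Host' class, its 'free' and
  'free_pcpuset' fields, and the helper names are all hypothetical
  stand-ins for placement inventory and the 'pcpuset' field of the
  host's 'numa_topology'.

```python
# Simplified model of the fallback query; all names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    free: dict  # free placement inventory, e.g. {"PCPU": 2, "VCPU": 8}
    free_pcpuset: set = field(default_factory=set)  # unpinned dedicated cores


def candidates(hosts, resource_class, amount):
    """Hosts with at least `amount` free units of `resource_class`."""
    return [h for h in hosts if h.free.get(resource_class, 0) >= amount]


def schedule_pinned(hosts, cpus):
    """Prefer PCPU inventory; fall back to VCPU only if nothing matched."""
    found = candidates(hosts, "PCPU", cpus)
    if not found:
        found = candidates(hosts, "VCPU", cpus)
    return found


def can_pin(host, cpus):
    """Host-side check: are enough dedicated cores still unpinned?"""
    return len(host.free_pcpuset) >= cpus


# A new-style host with only 2 free PCPU but plenty of VCPU.
new_style = Host("new-style", {"PCPU": 2, "VCPU": 8}, free_pcpuset={4, 5})
picked = schedule_pinned([new_style], 4)
print([h.name for h in picked])  # ['new-style'] (via the VCPU fallback)
print(can_pin(picked[0], 4))     # False: the build is expected to fail
```

  In this model the fallback does return the new-style host, but the
  host-side pinning check rejects the instance, which is the outcome
  the paragraph above describes.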

  Turns out there's a gap in this thinking: thread policies. The
  'isolate' CPU thread policy previously meant "give me a host with no
  hyperthreads, else a host with hyperthreads but mark the thread
  siblings of the cores used by the instance as reserved". This didn't
  translate to a new 'PCPU' world where we needed to know how many cores
  we were consuming up front before landing on the host. To work around
  this, we removed support for the latter case and instead relied on a
  trait, 'HW_CPU_HYPERTHREADING', to indicate whether a host had
  hyperthread support or not. Using the 'isolate' policy meant that
  the trait must not be present on the host, i.e. the trait was
  "forbidden" in the placement query.
  The gap comes via a combination of this trait request and the fallback
  query. If we request the 'isolate' thread policy, hosts with new-style
  configuration and sufficient 'PCPU' inventory would nonetheless be
  rejected if they reported the 'HW_CPU_HYPERTHREADING' trait. However,
  these could get picked up in the fallback query and, because the host
  still had enough free cores in its 'pcpuset', the instance would
  build successfully rather than fail for lack of 'PCPU' inventory.
  This means we end up with a pinned instance on a host using new-style
  configuration that is consuming 'VCPU' inventory. Boo.
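
  The gap can be reproduced in a small model. This is a sketch, not
  nova's code: hosts are plain dicts, the helper names are
  hypothetical, and it assumes (as the description above implies) that
  only the preferred 'PCPU' query carries the forbidden trait while
  the fallback 'VCPU' query does not.

```python
# Toy model of the bug; helper names and host layout are hypothetical.
HT = "HW_CPU_HYPERTHREADING"


def query(hosts, resource_class, amount, forbidden=()):
    """Hosts with enough free inventory and none of the forbidden traits."""
    return [
        h for h in hosts
        if h["free"].get(resource_class, 0) >= amount
        and not set(forbidden) & h["traits"]
    ]


def schedule_isolate(hosts, cpus):
    """Return (host name, resource class consumed), or None if no host fits."""
    # Preferred path: PCPU inventory on a host without hyperthreading.
    found = query(hosts, "PCPU", cpus, forbidden=[HT])
    if found:
        return found[0]["name"], "PCPU"
    # Fallback path (assumed to omit the forbidden trait): VCPU inventory.
    found = query(hosts, "VCPU", cpus)
    # Host-side check only looks at free dedicated cores, which are plentiful.
    if found and len(found[0]["free_pcpuset"]) >= cpus:
        return found[0]["name"], "VCPU"
    return None


host = {
    "name": "new-style-ht",
    "traits": {HT},
    "free": {"PCPU": 4, "VCPU": 4},
    "free_pcpuset": {4, 5, 6, 7},
}
result = schedule_isolate([host], 2)
print(result)  # ('new-style-ht', 'VCPU')
```

  The host is rejected by the 'PCPU' query because of the trait, not
  because of missing inventory, so the pinning check that was meant to
  stop the fallback still passes.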

  # Steps to reproduce

  1. Using a host with hyperthreading support enabled, configure both
  '[compute] cpu_dedicated_set' and '[compute] cpu_shared_set'.

  2. Boot an instance with the 'hw:cpu_policy=dedicated' and
  'hw:cpu_thread_policy=isolate' extra specs.

  # Expected result

  Instance should not boot since the host has hyperthreads.

  # Actual result

  Instance boots.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1889633/+subscriptions
References

[Bug 1889633] [NEW] Pinned instance with thread policy can consume VCPU
From: Stephen Finucane, 2020-07-30