yahoo-eng-team team mailing list archive

[Bug 1417144] [NEW] libvirt: instance schedule fails when the host has offlined CPUs

 

Public bug reported:

When a host system has CPUs that are offlined via CPU hotplug, nova
fails to start an instance on the host. On a test system, I've offlined
CPU 31 on an Intel blade server by running the following command:

echo 0 | sudo tee /sys/devices/system/cpu/cpu31/online

When starting an instance, I see the following error from libvirt:

TRACE nova.compute.manager [instance: 27c5aafc-a994-4a33-b23e-287cc5be8d8b]
libvirtError: Invalid value '8-15,24-31' for 'cpuset.cpus': Invalid argument

This is because CPU 31 is included in the cpuset passed to libvirt,
although the CPU is offline. Excerpt from the instance XML:

<vcpu placement='static' cpuset='8-15,24-31'>1</vcpu>
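
To illustrate the mismatch, here is a minimal sketch that expands the
cpuset from the XML above and compares it with a list of online CPUs.
The helper names are hypothetical, and the online list "0-30" is an
assumption (a 32-CPU host with only CPU 31 offlined); on a real host it
could be read from /sys/devices/system/cpu/online.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '8-15,24-31' into a set of ints."""
    cpus = set()
    for part in spec.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def offline_cpus_in_cpuset(cpuset_spec, online_spec):
    """Return the sorted CPUs named in cpuset_spec but absent from online_spec."""
    return sorted(parse_cpu_list(cpuset_spec) - parse_cpu_list(online_spec))


# Assumed example values: the cpuset from the instance XML and a host
# whose online CPU list is 0-30 (CPU 31 offlined).
print(offline_cpus_in_cpuset("8-15,24-31", "0-30"))
```

Any CPU this reports is one that the cpuset names but the host cannot
currently schedule on.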

As a fix, I suggest enhancing the libvirt nova driver to use libvirt's
getCPUMap() API to determine which host CPUs are offline. Any offline
CPUs should then be excluded from the cpuset in the XML definition
passed to libvirt. I'll attach a proposed fix.
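
The idea can be sketched roughly as follows. This is not the attached
fix, only an illustration under stated assumptions: conn is assumed to
be a libvirt-python connection whose getCPUMap() returns a tuple of
(total CPUs, list of per-CPU online flags, number online), and the
helper names and FakeConn class are hypothetical stand-ins so the
example runs without a hypervisor.

```python
def online_cpu_ids(conn):
    """Return the set of host CPU ids that libvirt reports as online."""
    # Assumption: getCPUMap() -> (total_cpus, [bool per CPU], online_count)
    _total, cpu_map, _online = conn.getCPUMap()
    return {cpu for cpu, is_online in enumerate(cpu_map) if is_online}


def format_cpu_list(cpus):
    """Collapse CPU ids into a range string such as '8-15,24-30'."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current run
        else:
            ranges.append([cpu, cpu])  # start a new run
    return ",".join(
        str(lo) if lo == hi else "%d-%d" % (lo, hi) for lo, hi in ranges
    )


def usable_cpuset(conn, wanted_cpus):
    """Drop offline CPUs from wanted_cpus and format the remainder."""
    return format_cpu_list(set(wanted_cpus) & online_cpu_ids(conn))


class FakeConn:
    """Hypothetical stand-in for a connection: 32 CPUs, CPU 31 offline."""

    def getCPUMap(self):
        return (32, [True] * 31 + [False], 31)


wanted = set(range(8, 16)) | set(range(24, 32))  # '8-15,24-31'
print(usable_cpuset(FakeConn(), wanted))
```

With the fake connection above, the resulting cpuset no longer names
CPU 31, so the value written to the XML would only contain online CPUs.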

Rationale: on server platforms such as s390, offlined CPUs are common
because the platform can run multiple host operating systems (e.g.
multiple KVM hypervisors / compute nodes) side by side. CPUs can be
reassigned dynamically among those host operating systems, so a given
host will often have some of its CPUs offline.

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: libvirt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1417144

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1417144/+subscriptions