yahoo-eng-team team mailing list archive
Message #27521
[Bug 1417144] [NEW] libvirt: instance schedule fails when the host has offlined CPUs
Public bug reported:
When a host system has CPUs that are offlined via CPU hotplug, nova
fails to start an instance on the host. On a test system, I've offlined
CPU 31 on an Intel blade server by running the following command:
echo 0 | sudo tee /sys/devices/system/cpu/cpu31/online
When starting an instance, I see the following error from libvirt:
TRACE nova.compute.manager [instance: 27c5aafc-a994-4a33-b23e-287cc5be8d8b] libvirtError: Invalid value '8-15,24-31' for 'cpuset.cpus': Invalid argument
This is because CPU 31 is included in the cpuset passed to libvirt,
although the CPU is offline. Excerpt from the instance XML:
<vcpu placement='static' cpuset='8-15,24-31'>1</vcpu>
As a fix, I suggest enhancing the libvirt nova driver to use the
getCPUMap() API in libvirt to determine whether CPUs on the host are
offline. Offline CPUs should not be included in the XML definition
passed to libvirt. I'll attach a proposed fix.
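The filtering step could look roughly like the sketch below. This is not
the attached patch, only an illustration of the idea. It assumes the
Python binding's getCPUMap() returns a tuple of (total CPU count, list
of per-CPU online flags, online CPU count). The helper names
parse_cpuset, format_cpuset, online_cpus and filter_cpuset are
hypothetical, and the parser does not handle libvirt's '^N' exclusion
syntax.

```python
def parse_cpuset(spec):
    """Expand a cpuset string such as '8-15,24-31' into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpuset(cpus):
    """Collapse a set of CPU ids back into a compact range string."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current range
        else:
            ranges.append([cpu, cpu])  # start a new range
    return ",".join(
        str(lo) if lo == hi else "%d-%d" % (lo, hi) for lo, hi in ranges
    )


def online_cpus(conn):
    """Return the ids of online host CPUs.

    Assumption: conn.getCPUMap() returns (total, online_flags, online_count),
    where online_flags[i] is True when CPU i is online.
    """
    _total, cpu_map, _online_count = conn.getCPUMap()
    return {cpu for cpu, is_online in enumerate(cpu_map) if is_online}


def filter_cpuset(conn, spec):
    """Drop offline CPUs from a cpuset string before it goes into the XML."""
    return format_cpuset(parse_cpuset(spec) & online_cpus(conn))
```

With CPU 31 offline on a 32-CPU host, filter_cpuset(conn, '8-15,24-31')
would yield '8-15,24-30'. Here conn is expected to be a libvirt
connection object, for example one obtained from
libvirt.openReadOnly('qemu:///system').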
Rationale: on server platforms like s390, it is common to have offlined
CPUs on a host, because the platform can run multiple host operating
systems (e.g. multiple KVM hypervisors / compute nodes) and CPUs can be
assigned dynamically among them.
** Affects: nova
Importance: Undecided
Status: New
** Tags: libvirt
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1417144
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1417144/+subscriptions