yahoo-eng-team team mailing list archive
Message #84080
[Bug 1898272] [NEW] "mixed" policy calculations don't account for host cells with no free shared CPUs
Public bug reported:
The 'mixed' CPU policy allows us to use both shared and dedicated CPUs
(VCPU and PCPU) in the same instance. The expectation is that both sets
of CPUs will use host cores from the same NUMA node(s). The current code
does appear to do this, at least for single NUMA nodes. However, it does
not account for NUMA nodes without any shared CPUs.
# Steps to reproduce
Configure a dual NUMA node host so that all cores from one node are
assigned to '[compute] cpu_shared_set', while all the cores from the
other node are assigned to '[compute] cpu_dedicated_set'. For example,
on a host where cores 0-5 are on node 0, while cores 6-11 are on node 1:
[compute]
cpu_shared_set = 0-5
cpu_dedicated_set = 6-11
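To make the resulting layout explicit, here is a minimal sketch that splits each host NUMA node's cores into shared and dedicated sets. The helpers 'parse_cpu_range' and 'cpus_per_node' are hypothetical names used only for illustration (they are not nova functions), and the parser only handles simple specs such as '0-5' or '0-1,4'.

```python
# Hypothetical helpers, not part of nova: show which CPUs each host NUMA
# node contributes to the shared and dedicated pools.

def parse_cpu_range(spec):
    """Parse a simple spec such as '0-5' or '0-1,4' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_per_node(node_cpus, shared_spec, dedicated_spec):
    """Intersect each node's cores with the shared and dedicated sets."""
    shared = parse_cpu_range(shared_spec)
    dedicated = parse_cpu_range(dedicated_spec)
    return {
        node: {"shared": cpus & shared, "dedicated": cpus & dedicated}
        for node, cpus in node_cpus.items()
    }


# Host layout from the example: cores 0-5 on node 0, cores 6-11 on node 1.
layout = {0: set(range(0, 6)), 1: set(range(6, 12))}
result = cpus_per_node(layout, "0-5", "6-11")
for node, sets in sorted(result.items()):
    print(node, sorted(sets["shared"]), sorted(sets["dedicated"]))
```

With this configuration node 0 ends up with no dedicated CPUs and node 1 with no shared CPUs, so neither node on its own can supply both kinds of CPU.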
Now attempt to boot a guest using the mixed policy, e.g.
$ openstack flavor create --vcpus 4 --ram 512 --disk 1 \
--property 'hw:cpu_policy=mixed' --property 'hw:cpu_dedicated_mask=^0' \
test.mixed
$ openstack server create --os-compute-api-version=2.latest \
--flavor test.mixed --image cirros-0.5.1-x86_64-disk --nic none --wait \
test-server
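For reference, the sketch below shows how the 'hw:cpu_dedicated_mask=^0' value is expected to divide the four guest CPUs. It assumes that a mask consisting only of '^' exclusions means "every guest CPU except these"; 'split_guest_cpus' is a hypothetical helper written for this report, not nova's own parser.

```python
# Hypothetical helper: split guest CPU IDs into shared and dedicated sets
# based on a dedicated mask such as '^0' or '1-3'.

def split_guest_cpus(vcpus, dedicated_mask):
    all_cpus = set(range(vcpus))
    included, excluded = set(), set()
    for part in dedicated_mask.split(","):
        part = part.strip()
        target = excluded if part.startswith("^") else included
        part = part.lstrip("^")
        if "-" in part:
            start, end = part.split("-")
            target.update(range(int(start), int(end) + 1))
        else:
            target.add(int(part))
    if not included:
        # Assumption: a mask made only of exclusions starts from all CPUs.
        included = set(all_cpus)
    dedicated = (included - excluded) & all_cpus
    return all_cpus - dedicated, dedicated


shared, dedicated = split_guest_cpus(4, "^0")
print(sorted(shared), sorted(dedicated))
```

Under that assumption guest CPU 0 is shared and guest CPUs 1-3 are dedicated, so the guest needs at least one shared host CPU and three dedicated host CPUs.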
# Expected result
The instance should fail to schedule as the 'NUMATopologyFilter' should
reject the host.
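A rough sketch of the kind of per-cell check that would produce this result is shown below. It is an illustration only, not the actual 'NUMATopologyFilter' logic: 'cell_fits_mixed_guest' is a hypothetical function, and it ignores memory, allocation ratios and guests that span multiple NUMA nodes.

```python
# Hypothetical check: can a single host cell hold a mixed-policy guest?

def cell_fits_mixed_guest(free_shared, free_dedicated,
                          guest_shared, guest_dedicated):
    # Each dedicated guest CPU needs its own free dedicated host CPU.
    if len(free_dedicated) < len(guest_dedicated):
        return False
    # Shared guest CPUs float, so one free shared host CPU is enough in
    # this simplified model, but zero is not.
    if guest_shared and not free_shared:
        return False
    return True


# Host cells from the configuration above, with nothing else running.
host_cells = {
    0: {"shared": set(range(0, 6)), "dedicated": set()},
    1: {"shared": set(), "dedicated": set(range(6, 12))},
}
# Guest from the flavor above, assuming '^0' leaves only vCPU 0 shared.
guest_shared, guest_dedicated = {0}, {1, 2, 3}

usable = [
    node for node, cell in sorted(host_cells.items())
    if cell_fits_mixed_guest(cell["shared"], cell["dedicated"],
                             guest_shared, guest_dedicated)
]
print(usable)
```

Neither cell passes: node 0 has no dedicated CPUs and node 1 has no shared CPUs, so under this model the host would be rejected.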
# Actual result
The instance is scheduled but fails to boot since the following invalid
XML snippet is generated:
<cputune>
  <shares>4096</shares>
  <emulatorpin cpuset="0-1,4"/>
  <vcpupin vcpu="0" cpuset=""/>  <!-- here: empty cpuset -->
  <vcpupin vcpu="1" cpuset="0"/>
  <vcpupin vcpu="2" cpuset="1"/>
  <vcpupin vcpu="3" cpuset="4"/>
</cputune>
The empty cpuset is rejected by libvirt, which results in the following
traceback in the nova-compute logs:
ERROR nova.compute.manager [instance: ...] Traceback (most recent call last):
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/compute/manager.py", line 2625, in _build_resources
ERROR nova.compute.manager [instance: ...] yield resources
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/compute/manager.py", line 2398, in _build_and_run_instance
ERROR nova.compute.manager [instance: ...] accel_info=accel_info)
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3752, in spawn
ERROR nova.compute.manager [instance: ...] cleanup_instance_disks=created_disks)
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6749, in _create_guest_with_network
ERROR nova.compute.manager [instance: ...] cleanup_instance_disks=cleanup_instance_disks)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
ERROR nova.compute.manager [instance: ...] self.force_reraise()
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, self.value, self.tb)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
ERROR nova.compute.manager [instance: ...] raise value
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6718, in _create_guest_with_network
ERROR nova.compute.manager [instance: ...] post_xml_callback=post_xml_callback)
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6643, in _create_guest
ERROR nova.compute.manager [instance: ...] guest = libvirt_guest.Guest.create(xml, self._host)
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 145, in create
ERROR nova.compute.manager [instance: ...] encodeutils.safe_decode(xml))
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
ERROR nova.compute.manager [instance: ...] self.force_reraise()
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, self.value, self.tb)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
ERROR nova.compute.manager [instance: ...] raise value
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 141, in create
ERROR nova.compute.manager [instance: ...] guest = host.write_instance_config(xml)
ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/host.py", line 1144, in write_instance_config
ERROR nova.compute.manager [instance: ...] domain = self.get_connection().defineXML(xml)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 190, in doit
ERROR nova.compute.manager [instance: ...] result = proxy_call(self._autowrap, f, *args, **kwargs)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 148, in proxy_call
ERROR nova.compute.manager [instance: ...] rv = execute(f, *args, **kwargs)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 129, in execute
ERROR nova.compute.manager [instance: ...] six.reraise(c, e, tb)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
ERROR nova.compute.manager [instance: ...] raise value
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 83, in tworker
ERROR nova.compute.manager [instance: ...] rv = meth(*args, **kwargs)
ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/libvirt.py", line 3703, in defineXML
ERROR nova.compute.manager [instance: ...] if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
ERROR nova.compute.manager [instance: ...] libvirt.libvirtError: invalid argument: Failed to parse bitmap ''
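The "Failed to parse bitmap ''" error corresponds to the empty cpuset attribute in the generated XML. As an illustration, a small standalone check like the following could flag such a snippet before it is handed to libvirt. It uses only the Python standard library; 'empty_pins' is a hypothetical helper and is not part of nova or the libvirt bindings.

```python
import xml.etree.ElementTree as ET

# The <cputune> snippet from this report, without the inline annotation.
CPUTUNE = """
<cputune>
  <shares>4096</shares>
  <emulatorpin cpuset="0-1,4"/>
  <vcpupin vcpu="0" cpuset=""/>
  <vcpupin vcpu="1" cpuset="0"/>
  <vcpupin vcpu="2" cpuset="1"/>
  <vcpupin vcpu="3" cpuset="4"/>
</cputune>
"""


def empty_pins(xml_text):
    """Return the vcpu IDs of <vcpupin> elements with an empty cpuset."""
    root = ET.fromstring(xml_text)
    return [
        el.get("vcpu")
        for el in root.iter("vcpupin")
        if not el.get("cpuset", "").strip()
    ]


print(empty_pins(CPUTUNE))
```

For the snippet above this reports guest CPU 0, the shared CPU for which no host CPUs were available.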
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1898272
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1898272/+subscriptions