yahoo-eng-team team mailing list archive
Message #87448
[Bug 1947396] [NEW] No logs if scheduling fails due to pages requirements
Public bug reported:
On a customer's environment with the NUMATopologyFilter enabled, when
trying to create an instance using a flavor which has these properties:
aggregate_instance_extra_specs:cloud_metadata='true'
aggregate_instance_extra_specs:cpu_allocation_ratio='1.0'
hw:cpu_max_sockets='1'
hw:cpu_policy='dedicated'
hw:cpu_sockets='1'
hw:cpu_thread_policy='require'
hw:emulator_threads_policy='isolate'
hw:mem_page_size='2MB'
hw:numa_nodes='1'
hw:pmu='False'
we eventually get a "No valid hosts found" fault in the openstack
server show output. Leaving aside why this is happening, I'd like to
suggest an improvement to the NUMATopologyFilter's logging.
With debug logging enabled, this is a piece of the scheduler's log:
2021-10-14 15:23:08.047 34021 DEBUG nova.virt.hardware [...] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='dedicated',cpu_thread_policy='require',cpu_topology=<?>,cpuset=set([...]),cpuset_reserved=None,id=0,memory=16384,pagesize=2048) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=0,memory=96415,memory_usage=32768,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([...]),pinned_cpus=set([...]),siblings=[...]) _numa_fit_instance_cell /usr/lib/python3/dist-packages/nova/virt/hardware.py:1078
2021-10-14 15:23:08.048 34021 DEBUG nova.virt.hardware [...] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='dedicated',cpu_thread_policy='require',cpu_topology=<?>,cpuset=set([...]),cpuset_reserved=None,id=0,memory=16384,pagesize=2048) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=1,memory=96733,memory_usage=22528,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([...]),pinned_cpus=set([...]),siblings=[...]) _numa_fit_instance_cell /usr/lib/python3/dist-packages/nova/virt/hardware.py:1078
2021-10-14 15:23:08.048 34021 DEBUG nova.scheduler.filters.numa_topology_filter [...] [instance: ...] ..., redacted fails NUMA topology requirements. The instance does not fit on this host. host_passes /usr/lib/python3/dist-packages/nova/scheduler/filters/numa_topology_filter.py:110
I've redacted some parts for privacy and others for clarity. These
messages are repeated for each compute host tested.
The issue is that there's no indication of why the VM doesn't fit on the
host.
Looking at the code, I narrowed it down to the numa_fit_instance_to_host
function in nova/virt/hardware.py. When the
_numa_cell_supports_pagesize_request function raises
exception.MemoryPageSizeNotSupported, no log message is generated.
I think it would be useful to log this information to make the filter
easier to debug, as is already done for the other reasons an instance
can fail the filter.
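To illustrate the kind of change I have in mind, here is a minimal,
self-contained sketch. It is not the actual Nova code: the exception
class is a stand-in for exception.MemoryPageSizeNotSupported, and the
function names, arguments and message text are hypothetical. It only
shows the pattern of logging the reason at debug level before reporting
that the cell does not fit.

```python
import logging

LOG = logging.getLogger(__name__)


class MemoryPageSizeNotSupported(Exception):
    """Stand-in for nova's exception of the same name (assumption)."""

    def __init__(self, pagesize):
        super().__init__("Page size %s is not supported by the host." % pagesize)
        self.pagesize = pagesize


def cell_supports_pagesize(host_pagesizes, requested_pagesize):
    """Hypothetical check: return the page size or raise if unavailable."""
    if requested_pagesize not in host_pagesizes:
        raise MemoryPageSizeNotSupported(requested_pagesize)
    return requested_pagesize


def fit_instance_cell(host_cell_id, host_pagesizes, requested_pagesize):
    """Return the usable page size, or None after logging why it failed."""
    try:
        return cell_supports_pagesize(host_pagesizes, requested_pagesize)
    except MemoryPageSizeNotSupported as exc:
        # The proposed improvement: record the reason instead of
        # silently treating the cell as "does not fit".
        LOG.debug("Instance cell does not fit on host cell %s: %s",
                  host_cell_id, exc)
        return None
```

With debug logging enabled, a call such as fit_instance_cell(0, {4}, 2048)
would return None and also leave a line in the log naming the host cell
and the unsupported page size, which is the information missing today.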
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1947396
Title:
No logs if scheduling fails due to pages requirements
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1947396/+subscriptions