yahoo-eng-team team mailing list archive - Message #76555
[Bug 1810977] [NEW] Oversubscription broken for instances with NUMA topologies
Public bug reported:
As described in [1], the fix for [2] appears to have inadvertently broken
memory oversubscription for instances that have a NUMA topology but no
hugepages.
Steps to reproduce:
1. Create a flavor that will consume more than 50% of the available memory
on your host(s) and that specifies an explicit NUMA topology. For example,
on my all-in-one deployment, where the host has 32GB of RAM, we request a
20GB instance:
$ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=2
2. Boot an instance using this flavor:
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test
3. Boot another instance using this flavor:
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2
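For reference, a quick sketch of the oversubscription arithmetic involved here (this assumes nova's default ram_allocation_ratio of 1.5; your deployment may override it):

```python
# Memory oversubscription arithmetic for the reproduction above.
host_ram_mb = 32768         # 32GB all-in-one host
flavor_ram_mb = 20480       # 20GB test.numa flavor
ram_allocation_ratio = 1.5  # nova's default; assumed, not from this report

# Two instances exceed physical RAM...
assert 2 * flavor_ram_mb > host_ram_mb
# ...but stay within the allowed oversubscription limit, so the second
# boot should be accepted by the scheduler.
assert 2 * flavor_ram_mb <= host_ram_mb * ram_allocation_ratio
```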
Expected result:
The second instance boots.
Actual result:
The second instance fails to boot. We see the following messages in the
scheduler logs.
nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}
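To illustrate why the check now fails, here is a minimal, hypothetical sketch of the two behaviours (the function names are illustrative, not nova's actual code; the numbers are taken from the log lines above):

```python
# Hypothetical sketch of the two memory checks; names are illustrative only.
def fits_strict(required_mb, available_mb):
    # Post-fix behaviour: compare against free memory in the NUMA cell,
    # which ignores the RAM allocation ratio entirely.
    return required_mb <= available_mb

def fits_oversubscribed(required_mb, used_mb, total_mb, ram_allocation_ratio=1.5):
    # Pre-fix behaviour for non-hugepage instances: allow oversubscription
    # up to total * ratio.
    return used_mb + required_mb <= total_mb * ram_allocation_ratio

# Per-cell values (MB) from the scheduler log above:
required, available, total = 10240, 5676, 15916
used = total - available
print(fits_strict(required, available))            # False -> boot fails
print(fits_oversubscribed(required, used, total))  # True  -> boot would succeed
```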
Reverting the patch that addressed the bug [3] restores the correct
behaviour and the second instance boots, though doing so obviously loses
whatever benefits that change gave us.
[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html
[2] https://bugs.launchpad.net/nova/+bug/1734204
[3] https://review.openstack.org/#/c/532168
** Affects: nova
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1810977