yahoo-eng-team team mailing list archive
Message #88943
[Bug 1950186] Re: Nova doesn't account for hugepages when scheduling VMs
This is not a bug; it is user error.
When using hugepages, if you want to have non-hugepage guests on the same host, you must set
hw:mem_page_size=small or hw:mem_page_size=4k on all non-hugepage guests.
We do not support memory oversubscription when using hw:mem_page_size, and
setting it also gives the guest one implicit NUMA node.
We intentionally do not support mixing NUMA and non-NUMA guests on the same
host, which is what happens if you do not use hw:mem_page_size=small.
When hw:mem_page_size is not set, we do not do page-size- or NUMA-node-aware
scheduling.
The reason you are hitting this issue is that you are
mixing NUMA and non-NUMA instances on the same host, which has never been
supported in nova.
We may eventually support this in the distant future, but we have no
plans to support it in Zed and no one has proposed a way to support it
upstream yet.
It is a very non-trivial feature and would require us to effectively make all instances NUMA instances.
We cannot support mixing floating instances and NUMA-affined instances on the same host today because of how we do NUMA affinity
and how that interacts with the kernel OOM reaper.
Basically, the OOM reaper operates per NUMA node, not globally. If the kernel needs memory on NUMA node 0 and cannot free any there, it will kill processes to free it, even if there is free memory on another NUMA node.
That will often result in NUMA-affined non-hugepage guests being killed if a floating guest is spawned and it triggers an OOM event.
That is not something we can allow to happen, as it is a multi-tenant issue, so we cannot support mixing NUMA and non-NUMA instances on the same host.
The workaround for running hugepage and non-hugepage guests on the same host is therefore to give all guests NUMA affinity by using hw:mem_page_size.
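A minimal sketch of that workaround, assuming the standard `openstack` CLI with admin credentials loaded. The flavor names and the `set_small_pages` helper are hypothetical placeholders; only the `hw:mem_page_size=small` extra spec comes from the explanation above.

```shell
#!/usr/bin/env bash
# Pin every non-hugepage flavor to small pages so that all guests on the
# host are scheduled with page-size and NUMA awareness.
# FLAVORS is a hypothetical list; replace it with your own flavor names.
FLAVORS=("m1.small" "m1.medium")

# Hypothetical helper: apply the extra spec to each flavor passed as an argument.
set_small_pages() {
    local flavor
    for flavor in "$@"; do
        openstack flavor set --property hw:mem_page_size=small "$flavor"
    done
}

# Usage (talks to the cloud, so it is left commented out here):
#   set_small_pages "${FLAVORS[@]}"
```

As far as I know, a changed extra spec only applies to instances booted or resized after the change, so guests that are already running would keep their old behaviour until then.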
This is a well-known limitation and not a bug, so I'm closing this as
Won't Fix.
** Changed in: nova
Status: Confirmed => Won't Fix
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1950186
Title:
Nova doesn't account for hugepages when scheduling VMs
Status in OpenStack Compute (nova):
Won't Fix
Bug description:
Description
===========
When hugepages are enabled on the host it's possible to schedule VMs
using more RAM than available.
On the node with the memory usage shown below, it was possible to
schedule 6 instances using a total of 140G of memory with a
non-hugepages-enabled flavor. The same machine has 188G of memory in
total, of which 64G were reserved for hugepages. An additional ~4G was
used for housekeeping, the OpenStack control plane, etc. This resulted in
an overcommitment of roughly 20G.
After running memory intensive operations on the VMs, some of them got
OOM killed.
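The overcommitment figure can be checked with a few lines of shell arithmetic. This is only a sketch using the rounded numbers quoted in this description; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Rounded figures from the description above, in GB.
total_gb=188          # physical memory on the node
hugepages_gb=64       # reserved for hugepages
housekeeping_gb=4     # control plane, OS, etc. (approximate)
scheduled_gb=140      # memory of the 6 non-hugepage instances

# Memory actually left for non-hugepage guests.
usable_gb=$(( total_gb - hugepages_gb - housekeeping_gb ))
# How far the scheduled guests exceed it.
overcommit_gb=$(( scheduled_gb - usable_gb ))

echo "usable=${usable_gb}G overcommit=${overcommit_gb}G"
# prints: usable=120G overcommit=20G
```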
$ cat /proc/meminfo | egrep "^(Mem|Huge)" # on the compute node
MemTotal: 197784792 kB
MemFree: 115005288 kB
MemAvailable: 116745612 kB
HugePages_Total: 64
HugePages_Free: 64
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 67108864 kB
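To read the same information programmatically, the sketch below subtracts the `Hugetlb` figure from `MemTotal`. The function name `non_hugepage_kb` is hypothetical, and it assumes the kernel exposes a `Hugetlb` line as in the output above. For the values shown, the result is 130675928 kB, roughly 124.6 GiB of memory not set aside for hugepages.

```shell
#!/usr/bin/env bash
# Print the memory (in kB) that is NOT set aside for hugepages.
# Reads /proc/meminfo by default; pass another file to check saved output.
non_hugepage_kb() {
    awk '/^MemTotal:/ { total = $2 }
         /^Hugetlb:/  { huge  = $2 }
         END          { printf "%d\n", total - huge }' "${1:-/proc/meminfo}"
}

# Usage on a compute node:
#   non_hugepage_kb
# Usage against a saved copy of the output:
#   non_hugepage_kb meminfo.txt
```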
$ os hypervisor show compute1 -c memory_mb -c memory_mb_used -c free_ram_mb
+----------------+--------+
| Field          | Value  |
+----------------+--------+
| free_ram_mb    | 29309  |
| memory_mb      | 193149 |
| memory_mb_used | 163840 |
+----------------+--------+
$ os host show compute1
+----------+----------------------------------+-----+-----------+---------+
| Host     | Project                          | CPU | Memory MB | Disk GB |
+----------+----------------------------------+-----+-----------+---------+
| compute1 | (total)                          | 0   | 193149    | 893     |
| compute1 | (used_now)                       | 72  | 163840    | 460     |
| compute1 | (used_max)                       | 72  | 147456    | 460     |
| compute1 | some_project_id_was_here         | 2   | 4096      | 40      |
| compute1 | another_anonymized_id_here       | 70  | 143360    | 420     |
+----------+----------------------------------+-----+-----------+---------+
$ os resource provider inventory list uuid_of_compute1_node
+----------------+------------------+----------+----------+----------+-----------+--------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total  |
+----------------+------------------+----------+----------+----------+-----------+--------+
| MEMORY_MB      | 1.0              | 1        | 193149   | 16384    | 1         | 193149 |
| DISK_GB        | 1.0              | 1        | 893      | 0        | 1         | 893    |
| PCPU           | 1.0              | 1        | 72       | 0        | 1         | 72     |
+----------------+------------------+----------+----------+----------+-----------+--------+
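For reference, placement derives the schedulable capacity of a resource class as (total - reserved) * allocation_ratio. The sketch below applies that formula to the MEMORY_MB row and subtracts the per-project memory from the `os host show` output. Variable names are illustrative. Note that the 193149 MB total matches MemTotal, so it still includes the 64 x 1G hugepages.

```shell
#!/usr/bin/env bash
# Values copied from the MEMORY_MB inventory row above.
total=193149
reserved=16384
ratio=1.0

# capacity = (total - reserved) * allocation_ratio
# awk is used because the ratio is a float and bash arithmetic is integer-only.
capacity=$(awk -v t="$total" -v r="$reserved" -v a="$ratio" \
    'BEGIN { printf "%d\n", (t - r) * a }')

# Memory of the two projects listed by "os host show compute1".
allocated=$(( 4096 + 143360 ))
remaining=$(( capacity - allocated ))

echo "capacity=${capacity} allocated=${allocated} remaining=${remaining}"
# prints: capacity=176765 allocated=147456 remaining=29309
```

The remaining 29309 MB matches the free_ram_mb value reported by `os hypervisor show`, even though much of that memory is tied up in hugepages.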
Steps to reproduce
==================
1. Reserve a large part of memory for hugepages on the hypervisor.
2. Create VMs using a flavor that uses a lot of memory that isn't backed by hugepages.
3. Start memory intensive operations on the VMs, e.g.:
stress-ng --vm-bytes $(awk '/MemAvailable/{printf "%d", $2 * 0.98;}' < /proc/meminfo)k --vm-keep -m 1
Expected result
===============
Nova should not allow overcommitment and should be able to
differentiate between hugepages and "normal" memory.
Actual result
=============
Overcommitment resulting in OOM kills.
Environment
===========
nova-api-metadata 2:21.2.1-0ubuntu1~cloud0
nova-common 2:21.2.1-0ubuntu1~cloud0
nova-compute 2:21.2.1-0ubuntu1~cloud0
nova-compute-kvm 2:21.2.1-0ubuntu1~cloud0
nova-compute-libvirt 2:21.2.1-0ubuntu1~cloud0
python3-nova 2:21.2.1-0ubuntu1~cloud0
python3-novaclient 2:17.0.0-0ubuntu1~cloud0
OS: Ubuntu 18.04.5 LTS
Hypervisor: libvirt + KVM
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1950186/+subscriptions