yahoo-eng-team mailing list archive - Message #87609
[Bug 1950186] [NEW] Nova doesn't account for hugepages when scheduling VMs
Public bug reported:
Description
===========
When hugepages are enabled on the host, it is possible to schedule VMs
using more RAM than is actually available.
On the node whose memory usage is shown below it was possible to
schedule 6 instances totalling 140G of memory using a flavor that is not
backed by hugepages. The machine has 188G of memory in total, of which
64G was reserved for hugepages, and a further ~4G was used for
housekeeping, the OpenStack control plane, etc. This resulted in an
overcommitment of roughly 20G.
After running memory-intensive operations on the VMs, some of them were
OOM-killed.
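Roughly, the arithmetic behind the ~20G figure (all numbers taken from the description above):
188G total - 64G hugepages - ~4G housekeeping/control plane ≈ 120G actually available for regular VM memory
140G scheduled for instances - 120G available ≈ 20G overcommitted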
$ cat /proc/meminfo | egrep "^(Mem|Huge)" # on the compute node
MemTotal: 197784792 kB
MemFree: 115005288 kB
MemAvailable: 116745612 kB
HugePages_Total: 64
HugePages_Free: 64
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 67108864 kB
$ os hypervisor show compute1 -c memory_mb -c memory_mb_used -c free_ram_mb
+----------------+--------+
| Field | Value |
+----------------+--------+
| free_ram_mb | 29309 |
| memory_mb | 193149 |
| memory_mb_used | 163840 |
+----------------+--------+
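The mismatch is visible by comparing the node's non-hugepage memory with what nova reports; a quick sketch (the awk expression simply subtracts Hugetlb from MemTotal):
$ awk '/^MemTotal/{t=$2} /^Hugetlb/{h=$2} END{printf "non-hugepage RAM: %d MB\n", (t-h)/1024}' /proc/meminfo
# -> non-hugepage RAM: 127613 MB with the numbers above
So only ~127613 MB is actually usable for non-hugepage-backed guests, while nova's memory_mb is 193149 MB, i.e. the hugepage pool is still counted as schedulable RAM.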
$ os host show compute1
+----------+----------------------------------+-----+-----------+---------+
| Host | Project | CPU | Memory MB | Disk GB |
+----------+----------------------------------+-----+-----------+---------+
| compute1 | (total) | 0 | 193149 | 893 |
| compute1 | (used_now) | 72 | 163840 | 460 |
| compute1 | (used_max) | 72 | 147456 | 460 |
| compute1 | some_project_id_was_here | 2 | 4096 | 40 |
| compute1 | another_anonymized_id_here | 70 | 143360 | 420 |
+----------+----------------------------------+-----+-----------+---------+
$ os resource provider inventory list uuid_of_compute1_node
+----------------+------------------+----------+----------+----------+-----------+--------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+----------------+------------------+----------+----------+----------+-----------+--------+
| MEMORY_MB | 1.0 | 1 | 193149 | 16384 | 1 | 193149 |
| DISK_GB | 1.0 | 1 | 893 | 0 | 1 | 893 |
| PCPU | 1.0 | 1 | 72 | 0 | 1 | 72 |
+----------------+------------------+----------+----------+----------+-----------+--------+
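As a stop-gap (not a fix for the accounting itself), the reserved memory on affected computes can be raised so that placement stops handing out the hugepage-backed portion; the value below is only an example sized to cover the 64G hugepage pool plus host overhead:
# /etc/nova/nova.conf on the compute node (illustrative value)
[DEFAULT]
reserved_host_memory_mb = 69632
This value is what shows up in the "reserved" column of the MEMORY_MB inventory above (currently 16384).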
Steps to reproduce
==================
1. Reserve a large part of memory for hugepages on the hypervisor.
2. Create VMs using a flavor that requests a large amount of memory not backed by hugepages (example commands below).
3. Start memory-intensive operations inside the VMs, e.g.:
stress-ng --vm-bytes $(awk '/MemAvailable/{printf "%d", $2 * 0.98;}' < /proc/meminfo)k --vm-keep -m 1
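For illustration, steps 1-2 could look roughly like this (flavor, image and network names as well as the sizes are made up; the hugepage reservation itself is done on the host, e.g. via kernel boot parameters, not through Nova):
# on the hypervisor: reserve 64 x 1G hugepages, e.g. kernel cmdline default_hugepagesz=1G hugepagesz=1G hugepages=64
$ openstack flavor create --ram 24576 --vcpus 12 --disk 70 m1.nonhuge   # no hw:mem_page_size property
$ openstack server create --flavor m1.nonhuge --image ubuntu-focal --network private vm1
The stress-ng command from step 3 is then run inside each guest.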
Expected result
===============
Nova should not allow overcommitment and should be able to differentiate
between hugepages and "normal" memory.
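For context, hugepage-backed guests are requested explicitly via the hw:mem_page_size flavor extra spec (the flavor name below is hypothetical); instances from flavors without this property take their memory from the regular, non-hugepage pool, which is the pool nova should be tracking here:
$ openstack flavor set m1.huge --property hw:mem_page_size=1GB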
Actual result
=============
Overcommitment resulting in OOM kills.
Environment
===========
nova-api-metadata 2:21.2.1-0ubuntu1~cloud0
nova-common 2:21.2.1-0ubuntu1~cloud0
nova-compute 2:21.2.1-0ubuntu1~cloud0
nova-compute-kvm 2:21.2.1-0ubuntu1~cloud0
nova-compute-libvirt 2:21.2.1-0ubuntu1~cloud0
python3-nova 2:21.2.1-0ubuntu1~cloud0
python3-novaclient 2:17.0.0-0ubuntu1~cloud0
OS: Ubuntu 18.04.5 LTS
Hypervisor: libvirt + KVM
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1950186
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1950186/+subscriptions