yahoo-eng-team team mailing list archive
Message #88301
[Bug 1950186] Re: Nova doesn't account for hugepages when scheduling VMs
This can be reproduced on Focal/ussuri:
############ Computes:
$ os resource provider list
+--------------------------------------+-----------------------------------------------------+------------+--------------------------------------+----------------------+
| uuid | name | generation | root_provider_uuid | parent_provider_uuid |
+--------------------------------------+-----------------------------------------------------+------------+--------------------------------------+----------------------+
| ca3fa736-7e60-4365-9cc8-7afc78b53005 | juju-98fb61-zaza-d6f2c7825043-9.project.serverstack | 5 | ca3fa736-7e60-4365-9cc8-7afc78b53005 | None |
| 0605bd29-71d5-40ed-ab8f-eceeaaac59b5 | juju-98fb61-zaza-d6f2c7825043-8.project.serverstack | 4 | 0605bd29-71d5-40ed-ab8f-eceeaaac59b5 | None |
+--------------------------------------+-----------------------------------------------------+------------+--------------------------------------+----------------------+
############ Mem Allocation ratio is 1:
$ openstack resource provider inventory list ca3fa736-7e60-4365-9cc8-7afc78b53005
+----------------+------------------+----------+----------+----------+-----------+-------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | used |
+----------------+------------------+----------+----------+----------+-----------+-------+-------+
| VCPU | 16.0 | 1 | 8 | 0 | 1 | 8 | 2 |
| MEMORY_MB | 1.0 | 1 | 16008 | 2048 | 1 | 16008 | 13960 |
| DISK_GB | 1.0 | 1 | 77 | 0 | 1 | 77 | 20 |
+----------------+------------------+----------+----------+----------+-----------+-------+-------+
$ openstack resource provider inventory list 0605bd29-71d5-40ed-ab8f-eceeaaac59b5
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | used |
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| VCPU | 16.0 | 1 | 8 | 0 | 1 | 8 | 0 |
| MEMORY_MB | 1.0 | 1 | 16008 | 2048 | 1 | 16008 | 0 |
| DISK_GB | 1.0 | 1 | 77 | 0 | 1 | 77 | 0 |
+----------------+------------------+----------+----------+----------+-----------+-------+------+
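As a rough illustration of how the MEMORY_MB rows above are read: Placement treats the capacity of an inventory as (total - reserved) * allocation_ratio. The sketch below applies that formula to the first provider. The name `placement_capacity` is illustrative only and is not a Nova or Placement API.

```python
def placement_capacity(total, reserved, allocation_ratio):
    """Capacity Placement allocates from: (total - reserved) * allocation_ratio."""
    return int((total - reserved) * allocation_ratio)


# MEMORY_MB row of provider ca3fa736-7e60-4365-9cc8-7afc78b53005, as listed above
capacity = placement_capacity(total=16008, reserved=2048, allocation_ratio=1.0)
used = 13960

print(capacity)         # 13960
print(capacity - used)  # 0
```

From Placement's point of view the first compute node is exactly full, and nothing in this inventory mentions the hugepage pool.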
######## Hugepages: 1000 * 2M
root@juju-98fb61-zaza-d6f2c7825043-9:~# cat /proc/meminfo | grep -i huge
AnonHugePages: 622592 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 1000
HugePages_Free: 1000
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 2048000 kB
root@juju-98fb61-zaza-d6f2c7825043-9:~# free -mh
total used free shared buff/cache available
Mem: 15Gi 3.5Gi 11Gi 1.0Mi 713Mi 11Gi
Swap: 0B 0B 0B
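To make the size of the hugepage pool explicit, here is a minimal Python sketch that parses the relevant /proc/meminfo fields shown above and converts the pool to MiB. The helper names `parse_meminfo` and `hugepage_pool_mib` are assumptions made for this example, and the sample text is copied from the output above rather than read from a live host.

```python
MEMINFO = """\
HugePages_Total: 1000
HugePages_Free: 1000
Hugepagesize: 2048 kB
Hugetlb: 2048000 kB
"""


def parse_meminfo(text):
    """Map each field name to its integer value (kB for sized fields, a count otherwise)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def hugepage_pool_mib(fields):
    """Size of the preallocated hugepage pool in MiB."""
    return fields["HugePages_Total"] * fields["Hugepagesize"] // 1024


info = parse_meminfo(MEMINFO)
print(hugepage_pool_mib(info))  # 2000
```

The result agrees with the Hugetlb line (2048000 kB = 2000 MiB), so roughly 2 GB of the host's RAM is set aside for hugepages.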
######## Host reserved memory is 2G:
$ juju config nova-compute reserved-host-memory
2048
######## Available Mem for general use (not hugepage)
I expect the memory available for VMs to be 16008 (total) - 2048 (reserved) - 2000 (hugepages, 1000 x 2M) = 11960
######## Flavor with mem 13960 (more than the expected available 11960)
$ os flavor show 14g-mem
+----------------------------+--------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| description | None |
| disk | 20 |
| id | 377de58b-7aa2-499d-9940-abf98aaa5a8a |
| name | 14g-mem |
| os-flavor-access:is_public | True |
| properties | |
| ram | 13960 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 2 |
+----------------------------+--------------------------------------+
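The comparison between this flavor and the expected general-purpose memory can be sketched as follows. This is only an illustration in Python: `general_purpose_mb` is a hypothetical helper, not a Nova function, and it takes the hugepage pool as 1000 x 2 MiB = 2000 MB (rounding the pool up to 2048 MB would give 11912 instead). Either way the flavor does not fit.

```python
def general_purpose_mb(total_mb, reserved_mb, hugepage_mb):
    """Memory left for non-hugepage guests once reservations are subtracted."""
    return total_mb - reserved_mb - hugepage_mb


TOTAL_MB = 16008       # MEMORY_MB total reported by Placement
RESERVED_MB = 2048     # reserved-host-memory
HUGEPAGE_MB = 1000 * 2  # 1000 pages of 2 MiB each
FLAVOR_RAM_MB = 13960  # ram of flavor 14g-mem

budget = general_purpose_mb(TOTAL_MB, RESERVED_MB, HUGEPAGE_MB)
print(budget)                  # 11960
print(FLAVOR_RAM_MB > budget)  # True
```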
###### VM with flavor 14g-mem is scheduled successfully (expected: No valid host)
$ os server list -c ID -c Name -c Status -c "Flavor"
+--------------------------------------+---------+--------+---------+
| ID | Name | Status | Flavor |
+--------------------------------------+---------+--------+---------+
| fd5c8bc9-22a6-4f5e-b745-026fa00e26ea | test-vm | ACTIVE | 14g-mem |
+--------------------------------------+---------+--------+---------+
** Project changed: nova => ubuntu
** Package changed: ubuntu => nova (Ubuntu)
https://bugs.launchpad.net/bugs/1950186
Title:
Nova doesn't account for hugepages when scheduling VMs
Status in nova package in Ubuntu:
New
Bug description:
Description
===========
When hugepages are enabled on the host, it's possible to schedule VMs
using more RAM than is available.
On the node with the memory usage shown below, it was possible to
schedule 6 instances using a total of 140G of memory with a flavor
that is not hugepages-enabled. The same machine has 188G of memory in
total, of which 64G were reserved for hugepages. An additional ~4G was
used for housekeeping, the OpenStack control plane, etc. This resulted
in an overcommitment of roughly 20G.
After running memory-intensive operations on the VMs, some of them
were OOM killed.
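The overcommitment figure follows directly from the numbers in the paragraph above. A small Python sketch of that arithmetic, with `overcommit_gb` being an illustrative name and the ~4G housekeeping figure being the approximate value stated above:

```python
def overcommit_gb(total, hugepages, housekeeping, instances):
    """How far instance memory exceeds what is left for non-hugepage use."""
    available = total - hugepages - housekeeping
    return instances - available


TOTAL_GB = 188        # RAM in the machine
HUGEPAGES_GB = 64     # reserved for hugepages
HOUSEKEEPING_GB = 4   # approximate: housekeeping, control plane, etc.
INSTANCES_GB = 140    # 6 instances, non-hugepage flavor

print(overcommit_gb(TOTAL_GB, HUGEPAGES_GB, HOUSEKEEPING_GB, INSTANCES_GB))  # 20
```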
$ cat /proc/meminfo | egrep "^(Mem|Huge)" # on the compute node
MemTotal: 197784792 kB
MemFree: 115005288 kB
MemAvailable: 116745612 kB
HugePages_Total: 64
HugePages_Free: 64
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 67108864 kB
$ os hypervisor show compute1 -c memory_mb -c memory_mb_used -c free_ram_mb
+----------------+--------+
| Field | Value |
+----------------+--------+
| free_ram_mb | 29309 |
| memory_mb | 193149 |
| memory_mb_used | 163840 |
+----------------+--------+
$ os host show compute1
+----------+----------------------------------+-----+-----------+---------+
| Host | Project | CPU | Memory MB | Disk GB |
+----------+----------------------------------+-----+-----------+---------+
| compute1 | (total) | 0 | 193149 | 893 |
| compute1 | (used_now) | 72 | 163840 | 460 |
| compute1 | (used_max) | 72 | 147456 | 460 |
| compute1 | some_project_id_was_here | 2 | 4096 | 40 |
| compute1 | another_anonymized_id_here | 70 | 143360 | 420 |
+----------+----------------------------------+-----+-----------+---------+
$ os resource provider inventory list uuid_of_compute1_node
+----------------+------------------+----------+----------+----------+-----------+--------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+----------------+------------------+----------+----------+----------+-----------+--------+
| MEMORY_MB | 1.0 | 1 | 193149 | 16384 | 1 | 193149 |
| DISK_GB | 1.0 | 1 | 893 | 0 | 1 | 893 |
| PCPU | 1.0 | 1 | 72 | 0 | 1 | 72 |
+----------------+------------------+----------+----------+----------+-----------+--------+
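To show where the gap comes from on this host, the sketch below compares the memory Placement considers schedulable with the memory that actually exists outside the hugepage pool. It uses only the /proc/meminfo and inventory values listed above. The variable names are chosen for this example, and the capacity formula (total - reserved) * allocation_ratio is applied with the ratio of 1.0 from the table.

```python
KB_PER_MB = 1024

# From /proc/meminfo on the compute node
mem_total_kb = 197784792
hugetlb_kb = 67108864

# From the MEMORY_MB inventory row (allocation_ratio is 1.0)
placement_total_mb = 193149
placement_reserved_mb = 16384

non_hugepage_mb = (mem_total_kb - hugetlb_kb) // KB_PER_MB
schedulable_mb = placement_total_mb - placement_reserved_mb

print(non_hugepage_mb)                   # 127613
print(schedulable_mb)                    # 176765
print(schedulable_mb - non_hugepage_mb)  # 49152
```

Under these assumptions Placement would accept about 48 GiB more guest memory than the host can back with ordinary pages, before any housekeeping overhead is counted.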
Steps to reproduce
==================
1. Reserve a large part of memory for hugepages on the hypervisor.
2. Create VMs using a flavor that uses a lot of memory that isn't backed by hugepages.
3. Start memory-intensive operations on the VMs, e.g.:
stress-ng --vm-bytes $(awk '/MemAvailable/{printf "%d", $2 * 0.98;}' < /proc/meminfo)k --vm-keep -m 1
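For readers less familiar with the awk expression in step 3, this is a Python sketch of the value it produces: 98% of MemAvailable, truncated to an integer number of kB and suffixed with `k`. The function name `vm_bytes_arg` is hypothetical, and the sample text reuses the compute node's meminfo values from this report purely as input data; inside a guest the numbers would differ.

```python
def vm_bytes_arg(meminfo_text, fraction=0.98):
    """Build the --vm-bytes argument: a fraction of MemAvailable, in kB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable"):
            available_kb = int(line.split()[1])
            # int() truncates, like awk's printf "%d"
            return f"{int(available_kb * fraction)}k"
    raise ValueError("MemAvailable not found")


sample = (
    "MemTotal: 197784792 kB\n"
    "MemFree: 115005288 kB\n"
    "MemAvailable: 116745612 kB\n"
)
print(vm_bytes_arg(sample))  # 114410699k
```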
Expected result
===============
Nova should not allow overcommitment and should be able to
differentiate between hugepages and "normal" memory.
Actual result
=============
Overcommitment resulting in OOM kills.
Environment
===========
nova-api-metadata 2:21.2.1-0ubuntu1~cloud0
nova-common 2:21.2.1-0ubuntu1~cloud0
nova-compute 2:21.2.1-0ubuntu1~cloud0
nova-compute-kvm 2:21.2.1-0ubuntu1~cloud0
nova-compute-libvirt 2:21.2.1-0ubuntu1~cloud0
python3-nova 2:21.2.1-0ubuntu1~cloud0
python3-novaclient 2:17.0.0-0ubuntu1~cloud0
OS: Ubuntu 18.04.5 LTS
Hypervisor: libvirt + KVM