[Bug 1792985] [NEW] strict NUMA memory allocation for 4K pages leads to OOM-killer
Public bug reported:
We've seen a case on a resource-constrained compute node where booting
multiple instances succeeded, but then led to the following error
messages from the host kernel:
[ 731.911731] Out of memory: Kill process 133047 (nova-api) score 4 or sacrifice child
[ 731.920377] Killed process 133047 (nova-api) total-vm:374456kB, anon-rss:144708kB, file-rss:1892kB, shmem-rss:0kB
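For anyone trying to reproduce or diagnose this, a quick way to see how
much memory is left on each host NUMA node is to read the standard Linux
sysfs meminfo files. A minimal sketch (plain Python, nothing
nova-specific):

    import glob
    import re

    # Each host NUMA node exposes its own meminfo file under sysfs.
    for path in sorted(glob.glob('/sys/devices/system/node/node*/meminfo')):
        node = re.search(r'node(\d+)', path).group(1)
        with open(path) as f:
            for line in f:
                # Lines look like "Node 0 MemFree:   123456 kB"
                if 'MemFree' in line:
                    print('node %s free: %s' % (node, line.split(':', 1)[1].strip()))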
The problem appears to be that, with libvirt, an instance which does not
specify a NUMA topology (which implies "shared" CPUs and the default
memory page size) is currently allowed to float across the whole compute
node. As such, we do not know which host NUMA node its memory will be
allocated from, and therefore we don't know how much memory remains on
each host NUMA node.
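To illustrate the floating case, here is a minimal sketch (using the
libvirt Python bindings, which nova wraps; the domain name
'instance-00000001' is only an example) that checks whether a running
guest has any NUMA memory pinning in its XML. When no NUMA topology was
requested there is no <numatune> element, so the kernel is free to
satisfy the guest's allocations from any host node:

    import libvirt
    from xml.etree import ElementTree

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')
    root = ElementTree.fromstring(dom.XMLDesc(0))

    # <numatune><memory mode='...' nodeset='...'/></numatune> is only
    # present for guests whose memory is pinned to host NUMA nodes.
    mem = root.find('./numatune/memory')
    if mem is None:
        print('no <numatune>: guest memory floats across all host NUMA nodes')
    else:
        print('mode=%s nodeset=%s' % (mem.get('mode'), mem.get('nodeset')))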
If we have a similar instance which *is* limited to a particular NUMA
node (due to adding a PCI device for example, or in the future by
specifying dedicated CPUs) then that allocation will currently use
"strict" NUMA affinity. This allocation can fail if there isn't enough
memory available on that NUMA node (due to being "stolen" by a floating
instance, for example).
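For reference, the memory pinning that ends up in the guest XML in that
case looks roughly like the element built below (nodeset "0" is just an
example). With mode="strict" the kernel will not fall back to another
host node, so the allocation either fits on that node or fails / ends in
the OOM killer:

    from xml.etree import ElementTree

    numatune = ElementTree.Element('numatune')
    ElementTree.SubElement(numatune, 'memory', mode='strict', nodeset='0')
    print(ElementTree.tostring(numatune).decode())
    # -> <numatune><memory mode="strict" nodeset="0" /></numatune>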
I think this means that we cannot use "strict" affinity for the default
page size even when we do have a numa_topology, because we cannot do
accurate per-NUMA-node accounting: we don't know which NUMA node
floating instances allocated their memory from.
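To make the idea concrete, one possible direction (purely a sketch, not
nova's actual code; the helper name and argument are hypothetical) would
be to only request strict kernel-level affinity when an explicit page
size was asked for, and use the kernel's "preferred" policy for the
default 4K case so a shortfall on one node spills over instead of
becoming fatal:

    # Hypothetical helper, for illustration only.
    def numa_memory_mode(wants_hugepages):
        if wants_hugepages:
            # Hugepage-backed guests are accounted per host NUMA node,
            # so strict affinity is both safe and required.
            return 'strict'
        # 4K-backed guests can't be accounted accurately per node while
        # floating instances exist, so don't let a shortfall be fatal.
        return 'preferred'

    print(numa_memory_mode(wants_hugepages=False))  # -> preferred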
** Affects: nova
Importance: Undecided
Status: New
** Tags: compute
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1792985
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1792985/+subscriptions