yahoo-eng-team team mailing list archive
Message #81918
[Bug 1863757] Re: Insufficient memory for guest pages when using NUMA
*** This bug is a duplicate of bug 1734204 ***
https://bugs.launchpad.net/bugs/1734204
Yes, this has been resolved since Stein, as noted in bug 1734204.
Unfortunately, Queens is in Extended Maintenance and we no longer release
new versions for it, so this is not likely to be fixed there.
** This bug has been marked a duplicate of bug 1734204
Insufficient free host memory pages available to allocate guest RAM with Open vSwitch DPDK in Newton
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1863757
Title:
Insufficient memory for guest pages when using NUMA
Status in OpenStack Compute (nova):
New
Bug description:
This is a Queens / Bionic openstack deploy.
Compute nodes are using hugepages for nova instances (reserved at boot
time):
root@compute1:~# cat /proc/meminfo | grep -i huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 332
HugePages_Free: 184
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
There are two numa nodes, as follows:
root@compute1:~# lscpu | grep -i numa
NUMA node(s): 2
NUMA node0 CPU(s): 0-19,40-59
NUMA node1 CPU(s): 20-39,60-79
Compute nodes are using DPDK, and memory for it has been reserved with
the following directive:
reserved-huge-pages: "node:0,size:1GB,count:8;node:1,size:1GB,count:8"
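For reference, that charm option appears to map onto nova's multi-valued
reserved_huge_pages setting in nova.conf (this mapping is an assumption based
on the matching syntax; verify against the rendered config on the compute node):
```
[DEFAULT]
# Reserve 8 x 1GB hugepages on each NUMA node for DPDK; nova will not
# hand these pages out to guests. The option may be repeated per node.
reserved_huge_pages = node:0,size:1GB,count:8
reserved_huge_pages = node:1,size:1GB,count:8
```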
A number of instances have already been created on host "compute1", to
the point that current memory usage is as follows:
root@compute1:~# cat /sys/devices/system/node/node*/meminfo | grep -i huge
Node 0 AnonHugePages: 0 kB
Node 0 ShmemHugePages: 0 kB
Node 0 HugePages_Total: 166
Node 0 HugePages_Free: 26
Node 0 HugePages_Surp: 0
Node 1 AnonHugePages: 0 kB
Node 1 ShmemHugePages: 0 kB
Node 1 HugePages_Total: 166
Node 1 HugePages_Free: 158
Node 1 HugePages_Surp: 0
Problem:
When a new instance is created (8 cores and 32 GB RAM), nova tries to
schedule it on NUMA node 0 and fails with "Insufficient free host
memory pages available to allocate guest RAM", even though there is
enough memory available on NUMA node 1.
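The arithmetic behind the failure can be sketched as follows (numbers taken
from the meminfo output above; the helper function is illustrative, not nova
code):

```python
# Per-NUMA-node hugepage fit check, using the values reported above.
# Hugepagesize is 1048576 kB, i.e. 1 GiB pages.

PAGE_SIZE_GIB = 1

free_pages = {0: 26, 1: 158}  # HugePages_Free per NUMA node
requested_gib = 32            # flavor RAM of the new instance

def nodes_that_fit(free, request_gib, page_gib=PAGE_SIZE_GIB):
    """Return the NUMA nodes with enough free hugepages for the guest."""
    needed = request_gib // page_gib
    return [node for node, pages in free.items() if pages >= needed]

# Node 0 has only 26 free 1 GiB pages (< 32), node 1 has 158 (>= 32),
# so only node 1 can hold the guest.
print(nodes_that_fit(free_pages, requested_gib))
```

If nova picks node 0 regardless, the libvirt allocation fails with the error
quoted above even though node 1 would fit.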
This behavior has also been seen by other users here (although the
solution in that bug seems to be more of a coincidence than a proper
fix -- it was then classified as not a bug, which I don't believe is
the case):
https://bugzilla.redhat.com/show_bug.cgi?id=1517004
The flavor being used has nothing special except the property
hw:mem_page_size='large'.
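For completeness, that property is set on a flavor like this (the flavor name
here is a placeholder, not the one actually in use):
```
openstack flavor set my-hugepage-flavor --property hw:mem_page_size=large
```
With hw:mem_page_size=large, nova requires the guest to be backed entirely by
hugepages, which is what triggers the per-NUMA-node hugepage accounting above.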
The instance is being forced onto "zone1::compute1"; otherwise there is
no pinning of CPUs or other resources. The placement of the VM on node0
appears to be entirely nova's decision when instantiating it.
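The host forcing is done with the zone::host form of --availability-zone,
roughly as below (flavor, image, and network names are placeholders):
```
openstack server create --flavor my-hugepage-flavor \
    --image bionic --network private \
    --availability-zone zone1::compute1 test-instance
```
Note that this only pins the compute host, not the NUMA node within it.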
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1863757/+subscriptions