yahoo-eng-team team mailing list archive

[Bug 1373159] [NEW] NUMA Topology cell memory in MiB units rather than KiB units

 

Public bug reported:

Currently, when NUMA cell memory is specified via flavor extra_specs or
image properties, the value is passed through in MiB. According to the
libvirt domain XML format documentation
(http://libvirt.org/formatdomain.html), cell memory should be specified
in KiB.

In this example, we use the following extra_specs:
"hw:numa_policy": "strict", "hw:numa_mem.1": "2048", "hw:numa_mem.0": "6144", "hw:numa_nodes": "2", "hw:numa_cpus.0": "0,1,2", "hw:numa_cpus.1": "3"

The flavor has 8192 MB of RAM and 4 vCPUs.
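
As a sanity check on this example, the per-node memory values should add
up to the flavor's RAM when both are read as MiB. A minimal sketch, using
a plain dict in place of a real flavor object; the helper name
numa_mem_total_mib is hypothetical and is not a nova function:

```python
# Extra specs copied from the example above; values are strings, as in a flavor.
extra_specs = {
    "hw:numa_policy": "strict",
    "hw:numa_mem.1": "2048",
    "hw:numa_mem.0": "6144",
    "hw:numa_nodes": "2",
    "hw:numa_cpus.0": "0,1,2",
    "hw:numa_cpus.1": "3",
}
FLAVOR_RAM_MIB = 8192


def numa_mem_total_mib(specs):
    """Sum the hw:numa_mem.N values (hypothetical helper, not nova code)."""
    nodes = int(specs["hw:numa_nodes"])
    return sum(int(specs["hw:numa_mem.%d" % n]) for n in range(nodes))


total_mib = numa_mem_total_mib(extra_specs)
# True when the NUMA node memory accounts for all of the flavor's RAM.
print(total_mib == FLAVOR_RAM_MIB)
```

So the specs themselves are consistent; the mismatch only appears once
the values reach libvirt without a unit conversion.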

When using qemu 2.1.0, the following will be seen in the n-cpu logs when
booting a machine with NUMA specs.

"libvirtError: internal error: process exited while connecting to
monitor: qemu-system-x86_64: total memory for NUMA nodes (8388608)
should equal RAM size (200000000)"

Please note that 200000000 is the RAM size in bytes, printed in hex
(0x200000000 bytes is 8388608 KiB, i.e. 8192 MiB); this is simply a quirk
of the qemu error message. The NUMA total of 8388608 corresponds to 8192
KiB, so the error shows that 8192 KiB is being requested rather than
8192 MiB. Because the RAM size does not equal the total NUMA node
memory, the machine fails to boot.
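
The arithmetic behind the two numbers in the error can be checked with a
short sketch. It assumes, as the reading above does, that the NUMA total
is printed in decimal bytes and the RAM size in hexadecimal bytes:

```python
KIB = 1024
MIB = 1024 * KIB

# RAM size from the error message, interpreted as a hex byte count.
ram_size_bytes = int("200000000", 16)
# NUMA total from the error message, assumed to be a decimal byte count.
numa_total_bytes = 8388608

ram_size_kib = ram_size_bytes // KIB
ram_size_mib = ram_size_bytes // MIB
numa_total_kib = numa_total_bytes // KIB

print("RAM size:   %d KiB (%d MiB)" % (ram_size_kib, ram_size_mib))
print("NUMA total: %d KiB" % numa_total_kib)
# The two sizes differ by exactly the MiB-to-KiB factor.
print("ratio:      %d" % (ram_size_bytes // numa_total_bytes))
```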

When using versions of qemu lower than 2.1.0 the issue is not obvious:
machines with NUMA specs do boot, but only because of a bug in qemu that
has since been resolved. The check that RAM size equals the total NUMA
node memory does not happen in versions lower than 2.1.0.

In short, we should be using KiB units for NUMA cell memory, or at least
be converting from MiB to KiB before creating the xml. Otherwise, NUMA
placement will not behave as intended.

To be fair, I haven't had the chance to look at the memory placement in
a guest booted using qemu 2.0.0 or lower, though I suspect the memory
placement would be incorrect. If anyone has the chance to look, it
would be greatly appreciated.

I am currently investigating the appropriate fix for this alongside
Tiago Mello. We made a quick fix in /nova/virt/libvirt/config.py on line
495:

                cell.set("memory", str(self.memory * 1024))

Multiplying by 1024 allowed the machine to boot properly, but it is
probably a bit too quick and dirty. Just thought it would be worth
mentioning.
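
To illustrate what the conversion does to the generated XML, here is a
minimal standalone sketch. It uses the standard library's ElementTree
rather than nova's own config classes, and the function name
format_numa_cell is hypothetical; only the id, cpus and memory attributes
of the libvirt <cell> element are taken from the libvirt documentation:

```python
import xml.etree.ElementTree as ET

KIB_PER_MIB = 1024


def format_numa_cell(cell_id, cpus, memory_mib):
    """Build a <cell> element, converting memory from MiB to KiB.

    Hypothetical helper for illustration; not the nova implementation.
    """
    cell = ET.Element("cell")
    cell.set("id", str(cell_id))
    cell.set("cpus", cpus)
    # libvirt expects cell memory in KiB, so convert before serializing.
    cell.set("memory", str(memory_mib * KIB_PER_MIB))
    return cell


# The two cells from the example extra_specs (memory given in MiB).
cells = [format_numa_cell(0, "0-2", 6144), format_numa_cell(1, "3", 2048)]
total_kib = sum(int(c.get("memory")) for c in cells)

for c in cells:
    print(ET.tostring(c).decode())
print("total: %d KiB" % total_kib)
```

With the conversion in place, the cell total comes to 8388608 KiB, which
matches the 8192 MiB of RAM in the flavor.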

** Affects: nova
     Importance: Undecided
     Assignee: Michael Turek (mjturek)
         Status: New


** Tags: libvirt libvirt-driver

** Changed in: nova
     Assignee: (unassigned) => Michael Turek (mjturek)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1373159

Title:
  NUMA Topology cell memory in MiB units rather than KiB units

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1373159/+subscriptions

