yahoo-eng-team team mailing list archive
Message #70315
[Bug 1594529] Re: VM creation failure due to Nova hugepage assumptions
** Changed in: nova
Status: Expired => In Progress
** Changed in: nova
Assignee: (unassigned) => sahid (sahid-ferdjaoui)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1594529
Title:
VM creation failure due to Nova hugepage assumptions
Status in OpenStack Compute (nova):
In Progress
Bug description:
Description:
In Liberty and Mitaka, Nova assumes that it has exclusive access to
the huge pages on the compute node. It keeps track of the total
pages per NUMA node on the compute node, and of the number of pages
used (by Nova VMs) on each NUMA node. This is done for each of the
three supported huge page sizes.
However, if third-party processes consume huge pages, there will be
a discrepancy between the pages actually available and what Nova
thinks is available. As a result, it is possible (depending on the
number of pages and the VM size) for Nova to think it has enough
pages when it does not. VM creation then fails, for example with
QEMU reporting insufficient memory.
Steps to reproduce:
1. Compute host with two NUMA nodes and 32768 2MB pages available, giving 16384 per NUMA node.
2. Third-party process that consumes 256 pages per NUMA node.
3. Create 15 small flavor (2GB = 1024 pages) VMs.
4. Create another small flavor VM.
Expected Result:
That the 16th VM would be created, without an error, and using huge
pages on the second NUMA node (and allow more VMs as well).
Actual Result:
After step 3, Nova thinks there are 1024 pages available on NUMA
node 0, but the compute host shows only 768 pages available there.
The scheduler thinks there is space for one more VM, so the host
passes the filter. The creation commences, as Nova thinks there is
enough space on NUMA node 0. QEMU then fails, indicating that there
is not enough memory.
In addition, there are 16128 pages available on NUMA node 1, but Nova
will not attempt using them, as it thinks there is still memory
available on NUMA node 0.
In my case, I had multiple compute hosts and ended up with a "No hosts
available" error, as it fails on each host when trying NUMA node 0.
If, at step 4, one creates a medium flavor VM, it will succeed, as
Nova will not see enough pages on NUMA node 0, and will try NUMA node
1, which has ample space.
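The arithmetic behind this result can be reproduced with a small
model. This is only a sketch of the accounting described above, not
Nova's actual placement code: the helper names are hypothetical, the
"first node that fits" rule is a simplification, and the medium flavor
is assumed to be 4GB (2048 pages), which the report does not state.

```python
# Simplified model of the page accounting in this report (not Nova code).
PAGES_PER_NODE = 16384      # 2MB pages per NUMA node (step 1)
THIRD_PARTY_PAGES = 256     # pages taken per node by the other process (step 2)
SMALL_VM_PAGES = 1024       # 2GB small flavor (step 3)
MEDIUM_VM_PAGES = 2048      # assumption: 4GB medium flavor
NOVA_VMS_ON_NODE0 = 15      # the 15 small VMs all landed on node 0

# Pages Nova has handed out per node, as Nova tracks them.
nova_used = [NOVA_VMS_ON_NODE0 * SMALL_VM_PAGES, 0]

# What Nova believes is free: total minus its own usage only.
nova_view = [PAGES_PER_NODE - used for used in nova_used]

# What the host really has free: third-party usage is also subtracted.
host_view = [free - THIRD_PARTY_PAGES for free in nova_view]


def first_fitting_node(request_pages, free_by_node):
    """Return the first node index with enough free pages, else None."""
    for node, free in enumerate(free_by_node):
        if free >= request_pages:
            return node
    return None


def try_create(request_pages):
    """Pick a node using Nova's view, then check it against the host's view.

    Returns (node, succeeded); node is None if Nova sees no fitting node.
    """
    node = first_fitting_node(request_pages, nova_view)
    if node is None:
        return None, False
    return node, host_view[node] >= request_pages


print("Nova view:", nova_view)
print("Host view:", host_view)
print("16th small VM:", try_create(SMALL_VM_PAGES))
print("Medium VM instead:", try_create(MEDIUM_VM_PAGES))
```

In this model the small VM is placed on node 0 and fails, while the
medium VM skips node 0 and succeeds on node 1, matching the behaviour
reported above.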
Commentary: Nova checks total huge pages, but not available huge
pages.
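To see the difference between total and available pages on a compute
host, the per-node counters that the Linux kernel exposes under sysfs
can be read directly. The following is a minimal sketch, assuming a
Linux host with the standard /sys/devices/system/node layout; the
function name and its parameters are hypothetical and this is not how
Nova itself collects the data.

```python
from pathlib import Path


def read_hugepage_counters(node, page_kb=2048,
                           sysfs_root="/sys/devices/system/node"):
    """Return (total, free) huge page counts for one NUMA node.

    Reads the kernel's nr_hugepages and free_hugepages files for the
    given node and page size (in kB). sysfs_root is a parameter only
    so the function can be pointed at a different directory tree.
    """
    base = (Path(sysfs_root) / f"node{node}" / "hugepages"
            / f"hugepages-{page_kb}kB")
    total = int((base / "nr_hugepages").read_text())
    free = int((base / "free_hugepages").read_text())
    return total, free
```

Calling read_hugepage_counters(0) on the host from the steps above
would be expected to report the total Nova tracks alongside the lower
free count that includes third-party usage.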
Note: A feature was added to master (for Newton), under bug 1543149,
that provides a config-based mechanism to reserve huge pages for
third-party applications. However, the Nova team indicated that this
change cannot be backported to Liberty.
Environment:
Liberty release (12.0.3), with LB, neutron networking, libvirt 1.2.17,
API QEMU 1.2.17, QEMU 2.3.0.
Config:
nova flavor-key m1.small set hw:numa_nodes=1
nova flavor-key m1.small set hw:mem_page_size=2048
network, subnet, and standard VM create commands.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1594529/+subscriptions