yahoo-eng-team team mailing list archive
Message #86948
[Bug 1940668] [NEW] Nova-compute fits instance to NUMA nodes non-optimally, resulting in instance creation failure
Public bug reported:
Description
===========
Reproduced on ussuri; master has the same code.
When nova-compute starts fitting an instance's NUMA topology onto the host's
NUMA topology, it uses the host cells list. This list contains cell objects
from cell 0 up to cell N, always sorted by cell id from 0 to N (N depends on
the number of host NUMA nodes). The only case in which the sort order of this
list changes is an instance without a PCI device requirement: if the instance
does not need a PCI device tied to a specific NUMA node, the host cells list
is reordered to place cells with PCI capabilities at the end of the list. If
all NUMA cells have PCI capabilities, the list order does not change.
This behaviour means the instance's first NUMA node is always tried against
host NUMA node id 0 first.
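To make the ordering concrete, here is a minimal sketch of the behaviour
described above. It is not the actual nova code; HostCell, has_pci and
instance_needs_pci are illustrative names only.

    from collections import namedtuple

    HostCell = namedtuple('HostCell', ['id', 'has_pci'])

    def order_host_cells(host_cells, instance_needs_pci):
        """Return host cells in the order they will be tried for fitting."""
        if instance_needs_pci:
            # The instance asks for NUMA-affine PCI devices: keep id order.
            return list(host_cells)
        # No PCI requirement: a stable sort pushes cells that own PCI
        # devices to the end while keeping id order inside each group.
        return sorted(host_cells, key=lambda cell: cell.has_pci)

    # When every cell has PCI devices the order is unchanged, so cell 0 is
    # always the first candidate:
    cells = [HostCell(0, True), HostCell(1, True)]
    print(order_host_cells(cells, instance_needs_pci=False))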
If huge pages are used and several instances with fewer NUMA nodes than the
host has are placed, NUMA node id 0 is exhausted completely. Instances with a
larger number of NUMA nodes (for example, an instance whose NUMA node count
equals the host's) then fail to fit on this host.
To mitigate this issue, it would be better to take NUMA node memory usage
into account when ordering the host cells.
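One possible shape for that mitigation, as a sketch only and not a proposed
patch; HostCell and its free_mem field are illustrative stand-ins for
whatever per-cell usage data nova tracks internally.

    from collections import namedtuple

    HostCell = namedtuple('HostCell', ['id', 'has_pci', 'free_mem'])

    def order_host_cells_by_usage(host_cells, instance_needs_pci):
        """Try the least-used cells first instead of always starting at id 0."""
        if instance_needs_pci:
            # PCI-affine instances: simply prefer the cells with the most
            # free memory.
            return sorted(host_cells, key=lambda cell: -cell.free_mem)
        # Otherwise keep "cells with PCI devices go last" as the primary
        # rule and use free memory as the tie-breaker, so single-NUMA-node
        # instances land on the emptiest cell instead of piling up on cell 0.
        return sorted(host_cells,
                      key=lambda cell: (cell.has_pci, -cell.free_mem))

    # With node 0 nearly full and node 1 empty, node 1 is now tried first:
    cells = [HostCell(0, False, 1024), HostCell(1, False, 12288)]
    print(order_host_cells_by_usage(cells, instance_needs_pci=False))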
Possibly also related to https://bugs.launchpad.net/nova/+bug/1738501
Steps to reproduce
==================
1. Configure OpenStack to use 2MB huge pages and allocate huge pages on a compute host (let's say compute 1) at boot.
For ussuri this is described here: https://docs.openstack.org/nova/ussuri/admin/huge-pages.html
2. Prepare two flavors to test the issue: one flavor with hw:mem_page_size='2MB', hw:numa_nodes='1',
and a second flavor with hw:mem_page_size='2MB', hw:numa_nodes='N', where N is the number of NUMA nodes on the compute host used for testing. The compute host should have more than one NUMA node.
The flavors' RAM should be large enough to exhaust the RAM of NUMA node 0 on the compute host with a small number of instances; let's say 6 instances of flavor 1 exhaust NUMA node 0 RAM. Flavor 2 RAM should equal flavor 1 RAM multiplied by N (the number of NUMA nodes on compute 1). See the worked example after step 4.
3. Start 6 instances with the first flavor (1 NUMA node defined) on compute 1 (with an availability zone hint pointing to compute 1). The RAM of NUMA node 0 on compute 1 will be exhausted.
4. Try to start an instance with the second flavor. The instance will fail to start with the error "...was re-scheduled: Insufficient compute resources: Requested instance NUMA topology cannot fit the given host NUMA topology".
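Worked example (illustrative numbers only): suppose compute 1 has 2 NUMA
nodes with 12 GB of 2MB huge pages each, flavor 1 has 2 GB RAM and flavor 2
has 4 GB RAM (N=2, so 2 GB per NUMA node). After the six flavor-1 instances
all land on node 0, node 0 has 0 GB of huge pages left while node 1 still has
12 GB free. The flavor-2 instance needs 2 GB on each of two different host
nodes, so it cannot fit even though the host as a whole still has 12 GB of
free huge pages.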
How it should work
==================
We should take the memory usage of NUMA nodes into account to reduce the
number of errors of this kind.
** Affects: nova
Importance: Undecided
Status: New