[Bug 1464286] Re: NumaTopololgyFilter Not behaving as expected (returns 0 hosts)
So I have a system with 40 logical cores (2 sockets, 10 cores per socket, hyperthreading enabled).
The NUMA topology is as follows:
$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 20 21 22 23 24 25 26 27 28 29
node 0 size: 32083 MB
node 0 free: 16652 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 30 31 32 33 34 35 36 37 38 39
node 1 size: 32237 MB
node 1 free: 25386 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
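(Note the interleaved numbering: each core's hyperthread sibling is offset by 20, so CPUs 0 and 20, for example, share a physical core on node 0. You can double-check that mapping via lscpu or sysfs; the sample output below is what I'd expect for this box rather than a verbatim capture.)
$ lscpu --extended=CPU,NODE,CORE    # one row per logical CPU with its node and core
$ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0,20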
I'm using OpenStack provisioned by DevStack on a Fedora 23 host:
$ cat /etc/*-release*
Fedora release 23 (Twenty Three)
...
$ uname -r
4.3.5-300.fc23.x86_64
$ cd /opt/stack/nova
$ git show --oneline
8bafc99 Merge "remove the unnecessary parem of set_vm_state_and_notify"
I defined a flavor similar to yours, but without the unnecessary swap and
disk space and with a smaller RAM allocation (KISS?).
$ openstack flavor create bug.1464286 --id 100 --ram 8192 --disk 0 \
--vcpus 12
$ openstack flavor set bug.1464286 \
--property "hw:cpu_policy=dedicated" \
--property "hw:numa_nodes=1"
$ openstack flavor show bug.1464286
+----------------------------+----------------------------------------------+
| Field | Value |
+----------------------------+----------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| disk | 0 |
| id | 100 |
| name | bug.1464286 |
| os-flavor-access:is_public | True |
| properties | hw:cpu_policy='dedicated', hw:numa_nodes='1' |
| ram | 8192 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 12 |
+----------------------------+----------------------------------------------+
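(One thing not shown above: the NUMATopologyFilter has to be in the scheduler's enabled filter list for these extra specs to be enforced at scheduling time. Whether it is can be checked with something like the following; the option name assumes the pre-Newton '[DEFAULT] scheduler_default_filters' layout.)
$ grep scheduler_default_filters /etc/nova/nova.conf    # NUMATopologyFilter should appear in the list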
I also modified the default quotas to allow allocation of more than 20
cores:
$ openstack quota set --cores 40 demo
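(The default cores quota is 20, so two 12-vCPU instances would exceed it without this change. The new limit can be sanity-checked with:)
$ openstack quota show demo | grep cores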
I boot one instance...
$ openstack server create --flavor=bug.1464286 \
--image=cirros-0.3.4-x86_64-uec --wait test1
$ sudo virsh list
Id Name State
----------------------------------------------------
20 instance-00000010 running
$ sudo virsh dumpxml 20
<domain type='kvm' id='20'>
<name>instance-00000010</name>
...
<vcpu placement='static'>12</vcpu>
<cputune>
<shares>12288</shares>
<vcpupin vcpu='0' cpuset='1'/>
<vcpupin vcpu='1' cpuset='21'/>
<vcpupin vcpu='2' cpuset='0'/>
<vcpupin vcpu='3' cpuset='20'/>
<vcpupin vcpu='4' cpuset='25'/>
<vcpupin vcpu='5' cpuset='5'/>
<vcpupin vcpu='6' cpuset='8'/>
<vcpupin vcpu='7' cpuset='28'/>
<vcpupin vcpu='8' cpuset='9'/>
<vcpupin vcpu='9' cpuset='29'/>
<vcpupin vcpu='10' cpuset='24'/>
<vcpupin vcpu='11' cpuset='4'/>
<emulatorpin cpuset='0-1,4-5,8-9,20-21,24-25,28-29'/>
</cputune>
<numatune>
<memory mode='strict' nodeset='0'/>
<memnode cellid='0' mode='strict' nodeset='0'/>
</numatune>
...
<cpu>
<topology sockets='6' cores='1' threads='2'/>
<numa>
<cell id='0' cpus='0-11' memory='8388608' unit='KiB'/>
</numa>
</cpu>
...
</domain>
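Every host CPU in those <vcpupin> elements (0-1, 4-5, 8-9, 20-21, 24-25, 28-29) is on NUMA node 0, which matches the strict nodeset='0' memory policy. If you'd rather not read the XML, virsh can show the same thing directly, e.g.:
$ sudo virsh vcpupin instance-00000010       # vCPU -> host CPU pinning table
$ numactl --hardware | grep 'node 0 cpus'    # the pinned CPUs should all appear here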
Then I boot another...
$ openstack server create --flavor=bug.1464286 \
--image=cirros-0.3.4-x86_64-uec --wait test2
$ sudo virsh list
Id Name State
----------------------------------------------------
20 instance-00000010 running
21 instance-00000011 running
$ sudo virsh dumpxml 21
<domain type='kvm' id='21'>
<name>instance-00000011</name>
...
<vcpu placement='static'>12</vcpu>
<cputune>
<shares>12288</shares>
<vcpupin vcpu='0' cpuset='35'/>
<vcpupin vcpu='1' cpuset='15'/>
<vcpupin vcpu='2' cpuset='10'/>
<vcpupin vcpu='3' cpuset='30'/>
<vcpupin vcpu='4' cpuset='16'/>
<vcpupin vcpu='5' cpuset='36'/>
<vcpupin vcpu='6' cpuset='11'/>
<vcpupin vcpu='7' cpuset='31'/>
<vcpupin vcpu='8' cpuset='32'/>
<vcpupin vcpu='9' cpuset='12'/>
<vcpupin vcpu='10' cpuset='17'/>
<vcpupin vcpu='11' cpuset='37'/>
<emulatorpin cpuset='10-12,15-17,30-32,35-37'/>
</cputune>
<numatune>
<memory mode='strict' nodeset='1'/>
<memnode cellid='0' mode='strict' nodeset='1'/>
</numatune>
...
<cpu>
<topology sockets='6' cores='1' threads='2'/>
<numa>
<cell id='0' cpus='0-11' memory='8388608' unit='KiB'/>
</numa>
</cpu>
...
</domain>
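So the second instance lands entirely on node 1, with no overlap with the first instance's pins. The memory placement can be compared without dumping XML too, e.g.:
$ sudo virsh numatune instance-00000010    # expect numa_nodeset : 0
$ sudo virsh numatune instance-00000011    # expect numa_nodeset : 1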
Just for laughs, I tried to boot a third instance to ensure bug #1438253 was still
in effect. It is:
$ openstack server create --flavor=bug.1464286 --image=cirros-0.3.4-x86_64-uec --wait test3
Error creating server: test3
Error creating server
$ openstack server delete test3
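(The scheduler's side of that failure should be visible in the nova-scheduler log, where the NUMA filter eliminates every host for the third request. The log path below assumes a typical DevStack LOGDIR setup, so adjust as needed.)
$ grep 'returned 0 hosts' /opt/stack/logs/n-sch.log | tail -1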
So, based on the above, it seems this bug has been resolved in Mitaka and no
longer applies. I have a rough idea which patches might have fixed it, and they
were backported to Liberty (and Kilo), so there's a good chance things are fixed
there too. For now, though, I'm going to close this as "fixed". We can track down
the exact fix later if necessary.
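(For the record, the NUMA fitting logic lives mostly in nova/virt/hardware.py, so a first pass at tracing the fix would be something along these lines, run from the checkout above; the date and search terms are just a starting guess.)
$ cd /opt/stack/nova
$ git log --oneline --since=2015-06-01 -- nova/virt/hardware.py | grep -iE 'numa|pin'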
** Changed in: nova
Status: Incomplete => Invalid
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1464286
Title:
NumaTopololgyFilter Not behaving as expected (returns 0 hosts)
Status in OpenStack Compute (nova):
Invalid
Bug description:
I have a system with 32 cores (2 sockets, 8 cores, hyperthreading enabled).
The NUMA topology is as follows:
numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65501 MB
node 0 free: 38562 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65535 MB
node 1 free: 63846 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
I have defined a flavor in OpenStack with 12 vCPUs as follows:
nova flavor-show c4.3xlarge
+----------------------------+------------------------------------------------------+
| Property | Value |
+----------------------------+------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| disk | 40 |
| extra_specs | {"hw:cpu_policy": "dedicated", "hw:numa_nodes": "1"} |
| id | 1d76a225-90c1-4f6f-a59b-000795c33e63 |
| name | c4.3xlarge |
| os-flavor-access:is_public | True |
| ram | 24576 |
| rxtx_factor | 1.0 |
| swap | 8192 |
| vcpus | 12 |
+----------------------------+------------------------------------------------------+
I expect to be able to launch two instances of this flavor on the 32
core host, one contained within each NUMA node.
When I launch two instances, the first succeeds, but the second fails.
The instance xml is attached, along with the system capabilities.
If I change hw:numa_nodes = 2, then I can launch two copies of the
instance.
N.B. for the purposes of testing I have disabled all vcpu_pin_set and
isolcpus settings.
This was tested on RDO Kilo running on CentOS 7.
I had to upgrade the hypervisor with packages from the ovirt master branch in order to support NUMA pinning.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1464286/+subscriptions