yahoo-eng-team team mailing list archive
Message #94629
[Bug 2081853] [NEW] Booting two VMs with anti-affinity in parallel to the same host results in both failing
Public bug reported:
The late anti-affinity policy check in the compute manager rejects both
parallel VM boot requests, even though the host could accept one of
them.
To reproduce:
* create server group with anti-affinity policy
* select a single compute and disable the rest of your computes
* boot two VMs in parallel
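The "in parallel" step matters because the failure depends on timing. A minimal sketch of a driver that releases both boot requests at nearly the same moment is shown below. `boot_in_parallel` and `boot_server` are hypothetical names, not part of nova or any OpenStack client; `boot_server` stands for whatever callable you use to create a server in the anti-affinity group.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def boot_in_parallel(boot_server, names):
    """Call boot_server(name) for every name at (nearly) the same time.

    boot_server is a hypothetical callable supplied by the caller; it should
    issue one boot request and return its result or raise on failure.
    Returns a dict mapping each name to the result or the raised exception.
    """
    # The barrier holds every worker until all of them are ready, so the
    # requests are sent as close together as the client allows.
    barrier = threading.Barrier(len(names))

    def worker(name):
        barrier.wait()
        try:
            return name, boot_server(name)
        except Exception as exc:  # keep the failure so both outcomes are visible
            return name, exc

    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return dict(pool.map(worker, names))
```

Because the race is timing dependent, a single run may not hit it; repeating the call in a loop with fresh server names is likely to be needed.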
Expected:
One of the two VMs boots successfully; the other fails with NoValidHost.
Actual:
If you are (un)lucky, both VMs fail with nova.exception.GroupAffinityViolation:
```
❯ journalctl -D sosreport-compute-1-2024-09-17-tzgxrpu/var/log/journal/730eba01f47f493698df59515d1c213a -u edpm_nova_compute | grep 9d115f6b-bb02-4390-a161-15fb8f83c0cc | grep nova.exception.GroupAffinityViolation:
Sep 17 02:05:36 compute-1 nova_compute[84038]: 2024-09-17 00:05:36.406 2 ERROR nova.compute.manager [None req-a5316266-aca0-4d11-90f9-631e26d058ab 188fff18565b4e46b0c04391ec532b3e b698d1d3bfeb4a75bf32b7a80d19dd46 - - default default] [instance: 9d115f6b-bb02-4390-a161-15fb8f83c0cc] Failed to build and run instance: nova.exception.GroupAffinityViolation: Anti-affinity instance group policy was violated
Sep 17 02:05:36 compute-1 nova_compute[84038]: 2024-09-17 00:05:36.406 2 ERROR nova.compute.manager [instance: 9d115f6b-bb02-4390-a161-15fb8f83c0cc] nova.exception.GroupAffinityViolation: Anti-affinity instance group policy was violated
❯ journalctl -D sosreport-compute-1-2024-09-17-tzgxrpu/var/log/journal/730eba01f47f493698df59515d1c213a -u edpm_nova_compute | grep ea192e6a-4685-45ae-839b-315dfd36697d | grep nova.exception.GroupAffinityViolation
Sep 17 02:05:36 compute-1 nova_compute[84038]: 2024-09-17 00:05:36.132 2 ERROR nova.compute.manager [None req-b37d5098-75bf-4a3c-a85d-6f2ccdf0104f 188fff18565b4e46b0c04391ec532b3e b698d1d3bfeb4a75bf32b7a80d19dd46 - - default default] [instance: ea192e6a-4685-45ae-839b-315dfd36697d] Failed to build and run instance: nova.exception.GroupAffinityViolation: Anti-affinity instance group policy was violated
Sep 17 02:05:36 compute-1 nova_compute[84038]: 2024-09-17 00:05:36.132 2 ERROR nova.compute.manager [instance: ea192e6a-4685-45ae-839b-315dfd36697d] nova.exception.GroupAffinityViolation: Anti-affinity instance group policy was violated
```
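The report does not say why both requests are rejected. The toy simulation below illustrates one way a late check can reject both. It assumes (this is a guess, not something the report or the logs confirm) that each instance already counts as a member on the host by the time the other instance's check runs. `FakeHost`, `boot`, and the `GroupAffinityViolation` class here are stand-ins written for this sketch; none of it is nova code.

```python
import threading


class GroupAffinityViolation(Exception):
    """Stand-in for nova.exception.GroupAffinityViolation (simulation only)."""


class FakeHost:
    """Toy model of a single compute host holding anti-affinity group members."""

    def __init__(self):
        self._lock = threading.Lock()
        self.members = set()

    def assign(self, instance):
        # Assumption: the instance is recorded on the host before the late check.
        with self._lock:
            self.members.add(instance)

    def late_check(self, instance):
        # Assumed rule: reject if any *other* group member is on this host.
        with self._lock:
            if self.members - {instance}:
                raise GroupAffinityViolation(instance)


def boot(host, instance, results, barrier=None):
    host.assign(instance)
    if barrier is not None:
        barrier.wait()  # force both assignments to land before either check
    try:
        host.late_check(instance)
        results[instance] = "ACTIVE"
    except GroupAffinityViolation:
        results[instance] = "ERROR"


def run_parallel():
    host, results = FakeHost(), {}
    barrier = threading.Barrier(2)
    threads = [
        threading.Thread(target=boot, args=(host, name, results, barrier))
        for name in ("vm-a", "vm-b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def run_sequential():
    host, results = FakeHost(), {}
    for name in ("vm-a", "vm-b"):
        boot(host, name, results)
    return results
```

In this model `run_sequential()` lets the first instance through and rejects the second, which matches the expected behaviour, while `run_parallel()` rejects both because each check sees the other instance. The toy marks the rejected instance as ERROR rather than rescheduling it, so it does not model the NoValidHost outcome.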
A functional reproducer has been pushed in
https://review.opendev.org/c/openstack/nova/+/930326
** Affects: nova
Importance: Undecided
Status: New
** Tags: compute scheduler
** Tags added: compute scheduler
** Description changed:
- If you are (un)lucky then both VM will fail with nova.exception.GroupAffinityViolation
+ If you are (un)lucky then both VMs will fail with nova.exception.GroupAffinityViolation
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2081853
Title:
Booting two VMs with anti-affinity in parallel to the same host
results in both failing
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2081853/+subscriptions