[Bug 1735407] Re: [Nova] Evacuation doesn't respect anti-affinity rules
Reviewed: https://review.openstack.org/525242
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=edeeaf9102eccb78f1a2555c7e18c3d706f07639
Submitter: Zuul
Branch: master
commit edeeaf9102eccb78f1a2555c7e18c3d706f07639
Author: Balazs Gibizer <balazs.gibizer@xxxxxxxxxxxx>
Date: Mon Dec 4 16:18:30 2017 +0100
Add late server group policy check to rebuild
The affinity and anti-affinity server group policies are enforced by the
scheduler, but two parallel scheduling requests can still violate such a
policy. During instance boot, a late policy check in the compute manager
prevents this. That check was missing from rebuild, so two parallel
evacuate commands could violate the server group policy. This patch adds
the late policy check to rebuild to prevent that situation. When the
violation is detected during boot, a re-schedule happens; the rebuild
action, however, has no re-scheduling implementation, so in this case the
rebuild fails and the user needs to retry the evacuation. This is still
better than allowing a parallel evacuation to break the server group
affinity policy.
To make the late policy check possible in the compute manager, the
rebuild_instance compute RPC call was extended with a request_spec
parameter.
Co-Authored-By: Richard Zsarnoczai <richard.zsarnoczai@xxxxxxxxxxxx>
Change-Id: I752617066bb2167b49239ab9d17b0c89754a3e12
Closes-Bug: #1735407
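As an illustration only, here is a minimal Python sketch of the kind of
late anti-affinity check the commit message describes; the helper
functions and the exception are hypothetical stand-ins, not the actual
nova patch:

class GroupPolicyViolation(Exception):
    """Raised when the late check finds the group policy violated."""

def validate_anti_affinity_late(context, instance, host):
    # Look up the server group the instance belongs to, if any
    # (hypothetical helper, not a real nova API).
    group = get_instance_group(context, instance)
    if group is None or 'anti-affinity' not in group.policies:
        return
    # Hosts already occupied by other members of the group, excluding
    # the instance being rebuilt (hypothetical helper).
    used_hosts = hosts_used_by_group(context, group, exclude=instance.uuid)
    if host in used_hosts:
        # Boot can re-schedule at this point; rebuild/evacuate has no
        # re-schedule path, so the operation fails and must be retried.
        raise GroupPolicyViolation(
            'host %s already hosts a member of group %s'
            % (host, group.uuid))

Per the commit message, rebuild_instance now receives the request_spec,
which is what lets the compute manager run a check like this against the
chosen destination host.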
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1735407
Title:
[Nova] Evacuation doesn't respect anti-affinity rules
Status in Mirantis OpenStack:
Won't Fix
Status in Mirantis OpenStack 9.x series:
Won't Fix
Status in OpenStack Compute (nova):
Fix Released
Bug description:
--- Environment ---
MOS: 9.2
Nova: 13.1.1-7~u14.04+mos20
3 compute nodes
--- Steps to reproduce ---
1. Create a new server group:
nova server-group-create anti anti-affinity
2. Launch 2 VMs in this server group:
nova boot --image TestVM --flavor m1.tiny --nic net-id=889e4e01-9b38-4007-829d-b69d53269874 --hint group=def58398-4a00-4066-a2aa-13f1b6e7e327 vm-1
nova boot --image TestVM --flavor m1.tiny --nic net-id=889e4e01-9b38-4007-829d-b69d53269874 --hint group=def58398-4a00-4066-a2aa-13f1b6e7e327 vm-2
3. Stop nova-compute on the nodes where these 2 VMs are running:
nova show vm-1 | grep "hypervisor"
OS-EXT-SRV-ATTR:hypervisor_hostname | node-12.domain.tld
nova show vm-2 | grep "hypervisor"
OS-EXT-SRV-ATTR:hypervisor_hostname | node-13.domain.tld
[root@node-12 ~]$ service nova-compute stop
nova-compute stop/waiting
[root@node-13 ~]$ service nova-compute stop
nova-compute stop/waiting
4. Evacuate both VMs almost at once (a parallel-invocation sketch follows these steps):
nova evacuate vm-1
nova evacuate vm-2
5. Check where these 2 VMs are running:
nova show vm-1 | grep "hypervisor"
nova show vm-2 | grep "hypervisor"
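The race in step 4 is easiest to hit by firing both evacuations
near-simultaneously, for example by backgrounding the CLI calls
(illustrative only; with no target host given, the scheduler picks one):
nova evacuate vm-1 &
nova evacuate vm-2 &
wait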
--- Actual behavior ---
Both VMs were evacuated to the same node:
[root@node-11 ~]$ virsh list
Id Name State
----------------------------------------------------
2 instance-00000001 running
3 instance-00000002 running
--- Expected behavior ---
According to the anti-affinity rule, only one VM should be evacuated;
the other should fail to evacuate with an appropriate error message.
To manage notifications about this bug go to:
https://bugs.launchpad.net/mos/+bug/1735407/+subscriptions