yahoo-eng-team team mailing list archive
Message #80204
[Bug 1845146] Re: NUMA aware live migration failed when vCPU pin set
Reviewed: https://review.opendev.org/684409
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6ec686c26b2c8b18bcff522633bfe9715e0feec3
Submitter: Zuul
Branch: master
commit 6ec686c26b2c8b18bcff522633bfe9715e0feec3
Author: Artom Lifshitz <alifshit@xxxxxxxxxx>
Date: Tue Sep 24 13:22:23 2019 -0400
Stop filtering out 'accepted' for in-progress migrations
Live migrations are created with an 'accepted' status. Resource claims
on the destination are done with the migration in 'accepted' status.
The status is set to 'preparing' a bit later, right before running
pre_live_migration(). Migrations with status 'accepted' are filtered
out by the database layer when getting in-progress migrations. Thus,
there's a time window after resource claims but before 'preparing'
during which resources have been claimed but the migration is not
considered in-progress by the database layer. During that window, the
instance's host is the source - that's only updated once the live
migration finishes. If the update available resources periodic task
runs during that window, it'll free the instance's resource from the
destination because neither the instance nor any of its in-progress
migrations are associated with the destination. This means that other
incoming instances are able to consume resources that should not be
available. This patch stops filtering out the 'accepted' status in the
database layer when retrieving in-progress migrations.
Change-Id: I4c56925ed35bc3275ca1ac6c30d7fd641ad84260
Closes-bug: 1845146
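The window described in the commit message can be reproduced with a small toy model. This is only a sketch under stated assumptions, not nova's actual code: the Migration class, the helper names and the exact contents of the filtered status sets are hypothetical. The one detail taken from the commit message is that 'accepted' was filtered out before the fix and is no longer filtered after it.

```python
from dataclasses import dataclass

# Statuses the (toy) database layer treats as "not in progress".
# Assumption: only the presence of 'accepted' in the pre-fix set comes from
# the commit message; the other entries are illustrative.
FILTERED_BEFORE_FIX = {"accepted", "completed", "error", "failed"}
FILTERED_AFTER_FIX = FILTERED_BEFORE_FIX - {"accepted"}


@dataclass
class Migration:
    """Hypothetical stand-in for a migration record."""
    instance: str
    source: str
    dest: str
    status: str


def in_progress_migrations(migrations, host, filtered_statuses):
    """Return migrations touching `host` whose status is not filtered out."""
    return [
        m for m in migrations
        if host in (m.source, m.dest) and m.status not in filtered_statuses
    ]


def tracked_instances(host, instance_hosts, migrations, filtered_statuses):
    """Instances whose resources a periodic update would keep claimed on `host`."""
    local = {name for name, h in instance_hosts.items() if h == host}
    migrating = {
        m.instance
        for m in in_progress_migrations(migrations, host, filtered_statuses)
    }
    return local | migrating


# During the window: resources are claimed on t2, the migration is still
# 'accepted', and the instance's host is still the source (t1).
instance_hosts = {"vm-a": "t1"}
migrations = [Migration("vm-a", "t1", "t2", "accepted")]

before = tracked_instances("t2", instance_hosts, migrations, FILTERED_BEFORE_FIX)
after = tracked_instances("t2", instance_hosts, migrations, FILTERED_AFTER_FIX)

print(sorted(before))  # [] : the destination forgets the claim
print(sorted(after))   # ['vm-a'] : the claim stays on the destination
```

With the pre-fix filter, nothing ties vm-a to the destination, so a periodic update running in that window would release its resources there.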
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1845146
Title:
NUMA aware live migration failed when vCPU pin set
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) train series:
In Progress
Bug description:
Description
===========
When the vCPU pin policy is 'dedicated', a NUMA-aware live migration
may fail.
Steps to reproduce
==================
1. Create two flavors: 2c2g.numa and 4c.4g.numa
(venv) [root@t1 ~]# openstack flavor show 2c2g.numa
+----------------------------+----------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+----------------------------+----------------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| disk | 1 |
| id | b4a2df98-82c5-4a53-8ba5-4372f20a98bd |
| name | 2c2g.numa |
| os-flavor-access:is_public | True |
| properties | hw:cpu_policy='dedicated', hw:numa_cpus.0='0', hw:numa_cpus.1='1', hw:numa_mem.0='1024', hw:numa_mem.1='1024', hw:numa_nodes='2' |
| ram | 2048 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 2 |
+----------------------------+----------------------------------------------------------------------------------------------------------------------------------+
(venv) [root@t1 ~]# openstack flavor show 4c.4g.numa
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| disk | 1 |
| id | cf53f5ea-c036-4a79-8183-6a2389212d02 |
| name | 4c.4g.numa |
| os-flavor-access:is_public | True |
| properties | hw:cpu_policy='dedicated', hw:numa_cpus.0='0', hw:numa_cpus.1='1,2,3', hw:numa_mem.0='3072', hw:numa_mem.1='1024', hw:numa_nodes='2' |
| ram | 4096 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 4 |
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
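As a quick sanity check on the two flavors above, the hw:numa_* properties can be compared against the vcpus and ram fields. The sketch below is a hypothetical helper, not part of nova or the openstack client. It assumes the properties string has the key='value' form shown in the tables and that hw:numa_cpus.N is a plain comma-separated list (ranges such as 0-3 are not handled).

```python
import re

# Properties strings copied from the 'openstack flavor show' output above.
PROPS_2C2G = (
    "hw:cpu_policy='dedicated', hw:numa_cpus.0='0', hw:numa_cpus.1='1', "
    "hw:numa_mem.0='1024', hw:numa_mem.1='1024', hw:numa_nodes='2'"
)
PROPS_4C4G = (
    "hw:cpu_policy='dedicated', hw:numa_cpus.0='0', hw:numa_cpus.1='1,2,3', "
    "hw:numa_mem.0='3072', hw:numa_mem.1='1024', hw:numa_nodes='2'"
)


def parse_properties(properties):
    """Split "key='value', key='value'" into a dict (values may contain commas)."""
    return dict(re.findall(r"([\w:.]+)='([^']*)'", properties))


def numa_layout(properties):
    """Return a list of (guest CPU set, memory in MB), one entry per guest NUMA node."""
    specs = parse_properties(properties)
    layout = []
    for node in range(int(specs["hw:numa_nodes"])):
        cpus = {int(c) for c in specs[f"hw:numa_cpus.{node}"].split(",")}
        mem_mb = int(specs[f"hw:numa_mem.{node}"])
        layout.append((cpus, mem_mb))
    return layout


def is_consistent(properties, vcpus, ram_mb):
    """True if the nodes cover each vCPU exactly once and the memory sums to ram."""
    layout = numa_layout(properties)
    all_cpus = sorted(c for cpus, _ in layout for c in cpus)
    total_mem = sum(mem for _, mem in layout)
    return all_cpus == list(range(vcpus)) and total_mem == ram_mb


print(is_consistent(PROPS_2C2G, vcpus=2, ram_mb=2048))  # True
print(is_consistent(PROPS_4C4G, vcpus=4, ram_mb=4096))  # True
```

Both flavors pass this check, so the guest NUMA topologies themselves are self-consistent.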
2. Create four instances (2c2g.numa * 2, 4c.4g.numa * 2)
3. Live migrate the instances one by one
4. After all four instances have been live migrated, check that the
vCPU pinning is correct (use 'virsh vcpupin [vm_id]')
5. If the vCPU pinning is correct, go back to step 3.
Expected result
===============
The vCPU pinning is correct.
Actual result
=============
The vCPU pinning is not correct on compute node t1: both instances are
pinned to the same host CPUs (0 and 15).
(nova-libvirt)[root@t1 /]# virsh list
Id Name State
----------------------------------------------------
138 instance-00000012 running
139 instance-00000011 running
(nova-libvirt)[root@t1 /]# virsh vcpupin 138
VCPU: CPU Affinity
----------------------------------
0: 0
1: 15
(nova-libvirt)[root@t1 /]# virsh vcpupin 139
VCPU: CPU Affinity
----------------------------------
0: 0
1: 15
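The manual check in step 4 could be scripted along these lines. This is a minimal sketch: the function names are hypothetical, and the parser assumes the 'N: cpulist' line format shown in the output above (other libvirt versions may print a different layout, and '^' exclusions in CPU lists are not handled). With the 'dedicated' policy, no host CPU should appear in the pinning of more than one instance.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as '0', '1,2,3' or '0-3,8' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_vcpupin(output):
    """Map each vCPU number to its set of host CPUs; header lines are skipped."""
    pins = {}
    for line in output.splitlines():
        left, sep, right = line.partition(":")
        left = left.strip()
        if not sep or not left.isdigit():
            continue  # 'VCPU: CPU Affinity' header or the dashed separator
        pins[int(left)] = expand_cpu_list(right)
    return pins


def shared_host_cpus(pins_by_domain):
    """Return {host CPU: [domains]} for host CPUs used by more than one domain."""
    owners = {}
    for domain, pins in pins_by_domain.items():
        for cpus in pins.values():
            for cpu in cpus:
                owners.setdefault(cpu, set()).add(domain)
    return {cpu: sorted(doms) for cpu, doms in owners.items() if len(doms) > 1}


# Output copied from the report for domains 138 and 139 (identical for both).
VCPUPIN_OUTPUT = """\
VCPU: CPU Affinity
----------------------------------
0: 0
1: 15
"""

pins_by_domain = {
    "138": parse_vcpupin(VCPUPIN_OUTPUT),
    "139": parse_vcpupin(VCPUPIN_OUTPUT),
}
conflicts = shared_host_cpus(pins_by_domain)
print(conflicts)  # {0: ['138', '139'], 15: ['138', '139']}
```

A non-empty result means two instances with dedicated CPUs share a host CPU, which is the symptom reported on t1.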
Environment
===========
Code version: master, 23 Sep
Three compute nodes:
t1: 16C, 24GB (2 NUMA nodes)
t2: 12C, 16GB (2 NUMA nodes)
t3: 8C, 12GB (2 NUMA nodes)
The image has no properties set.
Hypervisor: Libvirt + KVM
Storage: ceph
Networking_type: Neutron + OVS
Logs & Configs
==============
Please see the attachment for the log file.