yahoo-eng-team team mailing list archive

[Bug 1845146] Re: NUMA aware live migration failed when vCPU pin set

 

Reviewed:  https://review.opendev.org/684409
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6ec686c26b2c8b18bcff522633bfe9715e0feec3
Submitter: Zuul
Branch:    master

commit 6ec686c26b2c8b18bcff522633bfe9715e0feec3
Author: Artom Lifshitz <alifshit@xxxxxxxxxx>
Date:   Tue Sep 24 13:22:23 2019 -0400

    Stop filtering out 'accepted' for in-progress migrations
    
    Live migrations are created with an 'accepted' status. Resource claims
    on the destination are done with the migration in 'accepted' status.
    The status is set to 'preparing' a bit later, right before running
    pre_live_migration(). Migrations with status 'accepted' are filtered
    out by the database layer when getting in-progress migrations. Thus,
    there's a time window after resource claims but before 'preparing'
    during which resources have been claimed but the migration is not
    considered in-progress by the database layer. During that window, the
    instance's host is the source - that's only updated once the live
    migration finishes. If the update available resources periodic task
    runs during that window, it'll free the instance's resource from the
    destination because neither the instance nor any of its in-progress
    migrations are associated with the destination. This means that other
    incoming instances are able to consume resources that should not be
    available. This patch stops filtering out the 'accepted' status in the
    database layer when retrieving in-progress migrations.
    
    Change-Id: I4c56925ed35bc3275ca1ac6c30d7fd641ad84260
    Closes-bug: 1845146
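
The window described in the commit message can be illustrated with a small, self-contained model. This is a simplified sketch, not nova's actual code: the `Migration` class, the `in_progress` and `tracked_instances` helpers, and every status name other than 'accepted' are assumptions made for illustration. Host names t1 and t2 are borrowed from the bug's environment.

```python
from dataclasses import dataclass

# Statuses treated as "no longer in progress". Only the fact that 'accepted'
# was excluded before the patch comes from the commit message; the other
# status names here are illustrative assumptions.
FINISHED = {"completed", "failed", "error", "cancelled"}
EXCLUDED_BEFORE_PATCH = FINISHED | {"accepted"}
EXCLUDED_AFTER_PATCH = FINISHED


@dataclass
class Migration:
    """Hypothetical, minimal stand-in for a migration record."""
    instance: str
    source: str
    dest: str
    status: str


def in_progress(migrations, host, excluded):
    """Migrations touching `host` whose status is not in `excluded`."""
    return [
        m for m in migrations
        if m.status not in excluded and host in (m.source, m.dest)
    ]


def tracked_instances(host, instance_hosts, migrations, excluded):
    """Instances whose claims a periodic resource audit would keep on `host`."""
    local = {name for name, h in instance_hosts.items() if h == host}
    migrating = {m.instance for m in in_progress(migrations, host, excluded)}
    return local | migrating


# During the window, the instance still reports the source as its host and
# the migration is still in 'accepted' status.
instance_hosts = {"vm1": "t1"}
migrations = [Migration("vm1", source="t1", dest="t2", status="accepted")]

before = tracked_instances("t2", instance_hosts, migrations, EXCLUDED_BEFORE_PATCH)
after = tracked_instances("t2", instance_hosts, migrations, EXCLUDED_AFTER_PATCH)

print(before)  # set(): the destination forgets the claim
print(after)   # {'vm1'}: the claim on the destination is preserved
```

In this model, an audit of the destination that filters out 'accepted' finds nothing tying vm1 to t2, so the claimed resources would be treated as free.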


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1845146

Title:
  NUMA aware live migration failed when vCPU pin set

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  In Progress

Bug description:
  Description
  ===========

  When the vCPU pin policy is dedicated, NUMA-aware live migration may
  fail.

  
  Steps to reproduce
  ==================

  1. Create two flavors: 2c2g.numa and 4c.4g.numa
     (venv) [root@t1 ~]# openstack flavor show 2c2g.numa
  +----------------------------+----------------------------------------------------------------------------------------------------------------------------------+
  | Field                      | Value                                                                                                                            |
  +----------------------------+----------------------------------------------------------------------------------------------------------------------------------+
  | OS-FLV-DISABLED:disabled   | False                                                                                                                            |
  | OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                                |
  | access_project_ids         | None                                                                                                                             |
  | disk                       | 1                                                                                                                                |
  | id                         | b4a2df98-82c5-4a53-8ba5-4372f20a98bd                                                                                             |
  | name                       | 2c2g.numa                                                                                                                        |
  | os-flavor-access:is_public | True                                                                                                                             |
  | properties                 | hw:cpu_policy='dedicated', hw:numa_cpus.0='0', hw:numa_cpus.1='1', hw:numa_mem.0='1024', hw:numa_mem.1='1024', hw:numa_nodes='2' |
  | ram                        | 2048                                                                                                                             |
  | rxtx_factor                | 1.0                                                                                                                              |
  | swap                       |                                                                                                                                  |
  | vcpus                      | 2                                                                                                                                |
  +----------------------------+----------------------------------------------------------------------------------------------------------------------------------+
     (venv) [root@t1 ~]# openstack flavor show 4c.4g.numa
  +----------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
  | Field                      | Value                                                                                                                                |
  +----------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
  | OS-FLV-DISABLED:disabled   | False                                                                                                                                |
  | OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                                    |
  | access_project_ids         | None                                                                                                                                 |
  | disk                       | 1                                                                                                                                    |
  | id                         | cf53f5ea-c036-4a79-8183-6a2389212d02                                                                                                 |
  | name                       | 4c.4g.numa                                                                                                                           |
  | os-flavor-access:is_public | True                                                                                                                                 |
  | properties                 | hw:cpu_policy='dedicated', hw:numa_cpus.0='0', hw:numa_cpus.1='1,2,3', hw:numa_mem.0='3072', hw:numa_mem.1='1024', hw:numa_nodes='2' |
  | ram                        | 4096                                                                                                                                 |
  | rxtx_factor                | 1.0                                                                                                                                  |
  | swap                       |                                                                                                                                      |
  | vcpus                      | 4                                                                                                                                    |
  +----------------------------+--------------------------------------------------------------------------------------------------------------------------------------+

  2. Create four instances (2c2g.numa * 2, 4c.4g.numa * 2)

  3. Live migrate the instances one by one

  4. After all four instances have been live migrated, check that the
  vCPU pinning is correct (use 'virsh vcpupin [vm_id]')

  5. If the vCPU pinning is correct, go back to step 3 and repeat.

  
  Expected result
  ===============

  The vCPU pinning is correct.

  
  Actual result
  =============

  The vCPU pinning is not correct on compute node t1: both instances
  below are pinned to the same host CPUs (0 and 15).

  (nova-libvirt)[root@t1 /]# virsh list
   Id    Name                           State
  ----------------------------------------------------
   138   instance-00000012              running
   139   instance-00000011              running

  (nova-libvirt)[root@t1 /]# virsh vcpupin 138
  VCPU: CPU Affinity
  ----------------------------------
     0: 0
     1: 15

  (nova-libvirt)[root@t1 /]# virsh vcpupin 139
  VCPU: CPU Affinity
  ----------------------------------
     0: 0
     1: 15
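
The overlap in the output above can also be detected mechanically. The following is a minimal sketch that parses `virsh vcpupin` text in the format shown in this report and lists host CPUs pinned by more than one domain. The helper names (`expand_cpu_list`, `parse_vcpupin`, `find_conflicts`) are hypothetical, and the parser assumes the "N: cpulist" line format seen here; other libvirt versions may print the table differently, and exclusion syntax such as "^2" is not handled.

```python
import re
from collections import defaultdict


def expand_cpu_list(spec):
    """Expand a CPU list such as '0-3,5' into a set of integers."""
    cpus = set()
    for part in spec.split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_vcpupin(output):
    """Return {vcpu: set of host CPUs} from `virsh vcpupin <domain>` text."""
    pins = {}
    for line in output.splitlines():
        # Matches rows like '   1: 15'; header and separator lines are skipped.
        match = re.match(r"^\s*(\d+):\s*([\d,\-]+)\s*$", line)
        if match:
            pins[int(match.group(1))] = expand_cpu_list(match.group(2))
    return pins


def find_conflicts(domains):
    """Map each host CPU pinned by several domains to those domains.

    `domains` maps a domain id to its `virsh vcpupin` output. With a
    dedicated CPU policy, any host CPU shared between domains is a conflict.
    """
    owners = defaultdict(set)
    for dom, output in domains.items():
        for cpus in parse_vcpupin(output).values():
            for cpu in cpus:
                owners[cpu].add(dom)
    return {cpu: sorted(doms) for cpu, doms in owners.items() if len(doms) > 1}


# Output copied from this bug report; both domains show the same pinning.
VCPUPIN_OUTPUT = """VCPU: CPU Affinity
----------------------------------
   0: 0
   1: 15
"""

conflicts = find_conflicts({"138": VCPUPIN_OUTPUT, "139": VCPUPIN_OUTPUT})
print(conflicts)  # {0: ['138', '139'], 15: ['138', '139']}
```

An empty result from `find_conflicts` would correspond to the expected result in this report.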

  
  Environment
  ===========

  Code version: master, 23 Sep

  Three compute nodes:
      t1: 16C, 24GB (2 NUMA nodes)
      t2: 12C, 16GB (2 NUMA nodes)
      t3:  8C, 12GB (2 NUMA nodes)

  The image has no properties set.

  Hypervisor: Libvirt + KVM

  Storage: ceph

  Networking_type: Neutron + OVS

  
  Logs & Configs
  ==============

  Please see the attached log file.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1845146/+subscriptions

