yahoo-eng-team team mailing list archive

[Bug 1996995] Re: VMs inaccessible after live migration on certain Arista VXLAN Flood and Learn fabrics

 

Reviewed:  https://review.opendev.org/c/openstack/nova/+/867324
Committed: https://opendev.org/openstack/nova/commit/fba851bf3a34562db9cdb783ae539556b8b7a329
Submitter: "Zuul (22348)"
Branch:    master

commit fba851bf3a34562db9cdb783ae539556b8b7a329
Author: as0 <as3310@xxxxxxxxxxxxxx>
Date:   Tue Dec 13 09:43:38 2022 +0000

    Add further workaround features for qemu_monitor_announce_self
    
    In some cases on Arista VXLAN fabrics, VMs are inaccessible via network
    after live migration, despite garps being observed on the fabric itself.
    
    This patch builds on the
    ``[workarounds]/enable_qemu_monitor_announce_self`` feature as reported
    in `bug 1815989 <https://bugs.launchpad.net/nova/+bug/1815989>`
    
    This patch adds the ability to configure the number of times the QEMU
    announce_self monitor command is called, and adds a new configuration
    option to specify a delay between the repeated announce_self calls, as
    in some cases multiple announce_self monitor commands are required for
    the fabric to honour the garp packets and for the VM to become
    accessible via the network after live migration.
    
    Closes-Bug: #1996995
    Change-Id: I2f5bf7c9de621bb1dc7fae5b3374629a4fcc1f46


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1996995

Title:
  VMs inaccessible after live migration on certain Arista VXLAN Flood
  and Learn fabrics

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========
  This is not a Nova bug per se, but rather an issue with Arista and potentially other network fabrics.

  I have observed a case where VMs are inaccessible via the network
  after live migration on certain fabrics (in this case, Arista VXLAN),
  despite the hypervisor sending out a number of GARP packets following
  the migration.

  This was observed on an Arista VXLAN fabric, live migrating a VM
  between hypervisors on two different switches. A live migration
  between two hypervisors on the same switch is not affected.

  In both cases, I can see GARPs on the wire triggered by a VM being
  live migrated; these packets have been observed from other hypervisors
  and even from other VMs in the same VLAN on different hypervisors.
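
  For example, the GARPs can be confirmed on the destination hypervisor
  with a capture along these lines (INTERFACE and VM_MAC are
  placeholders for this environment):

  `tcpdump -nn -e -i INTERFACE arp and ether host VM_MAC`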

  The VM is accessible again after a period of time, at the point the
  switch ARP aging timer resets and the MAC is re-learnt on the correct
  switch.
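
  Whether the MAC has been re-learnt on the correct switch can be
  checked from the Arista EOS CLI with something along the lines of
  (VM_MAC is a placeholder):

  `show mac address-table address VM_MAC`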

  This occurs on any VM - even a simple c1.m1 with no active workload,
  backed by Ceph storage.

  Steps to Reproduce
  ===========

  To try to prevent this from happening, I have tested the "libvirt: Add
  announce-self post live-migration workaround" patch [0]; despite this,
  the issue was still observed.

  Create a VM: c1.m1 flavor or similar, CentOS 7 or CentOS 8, backed by
  Ceph storage, with no active or significant load on the VM.

  Run:
  `ping VM_IP | while read pong; do echo "$(date): $pong"; done`

  Then:
  `openstack server migrate --live TARGET_HOST VM_INSTANCE`

  Expected result
  ===============
  VM live migrates and is accessible within a reasonable (<10 second) timeframe

  Actual result
  =============
  VM live migrates successfully, but ping fails until the switch ARP timer resets (in our environment, 60-180 seconds)

  Despite efforts from us and our network team, we have been unable to
  determine why the VM is inaccessible. What has been noticed is that
  sending a further number of announce_self commands to the QEMU
  monitor, triggering more GARPs, gets the VM into an accessible state
  in an acceptable time of <5 seconds.
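
  The same announce_self behaviour can be triggered by hand against a
  running guest via libvirt's QEMU monitor passthrough, with something
  like (INSTANCE_DOMAIN is a placeholder for the libvirt domain name):

  `virsh qemu-monitor-command INSTANCE_DOMAIN --hmp announce_self`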

  Environment
  =============
  Arista EOS 4.26M VXLAN fabric
  OpenStack Nova Train, Ussuri, Victoria (with and without patch [0])
  Ceph Nautilus

  OpenStack provider networking, using VLANs

  Patch/Workaround
  =============
  I have prepared a follow-up workaround patch which builds on the announce-self patch [0], and which we have been running in our production deployment.

  This patch adds two configurable options and the associated code:

  `enable_qemu_monitor_announce_max_retries` - this will call
  announce_self a further n times, triggering more GARP packets to be
  sent.

  `enable_qemu_monitor_announce_retry_interval` - this is the delay
  which will be used between triggering the additional announce_self
  calls, as configured in the option above.

  My tests of nearly 5000 live migrations show that the optimal settings
  in our environment are 3 additional calls to qemu_announce_self with a
  1 second delay - this gets our VMs accessible within 2 or 3 seconds in
  the vast majority of cases, and 99% within 5 seconds after they stop
  responding to ping (the point at which we determine they are
  inaccessible).
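
  As a sketch only, using the option names proposed above (the names as
  finally merged may differ) and the values that worked best for us, the
  resulting nova.conf workarounds section would look something like:

    [workarounds]
    enable_qemu_monitor_announce_self = True
    enable_qemu_monitor_announce_max_retries = 3
    enable_qemu_monitor_announce_retry_interval = 1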

  I shall be submitting this patch for review by the Nova community in
  the next few days.

  0:
  https://opendev.org/openstack/nova/commit/9609ae0bab30675e184d1fc63aec849c1de020d0

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1996995/+subscriptions


