[Bug 1996995] [NEW] VMs inaccessible after live migration on certain Arista VXLAN Flood and Learn fabrics
Public bug reported:
Description
===========
This is not a Nova bug per se, but rather an issue with Arista and potentially other network fabrics.
I have observed a case where VMs are unreachable over the network after
live migrating on certain fabrics, in this case Arista VXLAN, despite
the hypervisor sending out a number of GARP packets following the live
migration.
This was observed on an Arista VXLAN fabric when live migrating a VM
between hypervisors attached to two different switches. A live migration
between two hypervisors on the same switch is not affected.
In both cases I can see GARPs on the wire triggered by the VM being live
migrated; these packets have been observed arriving at other hypervisors
and even at other VMs in the same VLAN on different hypervisors.
The VM becomes accessible again after a period of time, at the point where
the switch ARP aging timer expires and the MAC is re-learnt on the correct switch.
This occurs on any VM - even a simple c1.m1 flavor with no active workload,
backed by Ceph storage.
Steps to Reproduce
===========
To try to prevent this from happening, I have tested the "libvirt: Add
announce-self post live-migration workaround" patch [0]; despite this,
the issue was still observed.
Create a VM: c1.m1 or similar, CentOS 7 or CentOS 8, Ceph storage, no
active or significant load on the VM.
Run:
`ping VM_IP | while read pong; do echo "$(date): $pong"; done`
Then:
`openstack server migrate --live TARGET_HOST VM_INSTANCE`
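If it helps, the two steps can be run as below so each reply carries a timestamp and the outage window is easy to read off. This is a sketch only; VM_IP, TARGET_HOST and VM_INSTANCE are placeholders for your environment, not names from this report.

    # Terminal 1: timestamp each ping reply so the outage window is visible
    ping "$VM_IP" | while read -r pong; do echo "$(date): $pong"; done

    # Terminal 2: trigger the live migration to a hypervisor on a different switch
    openstack server migrate --live "$TARGET_HOST" "$VM_INSTANCE"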
Expected result
===============
VM live migrates and is accessible within a reasonable timeframe (<10 seconds)
Actual result
=============
VM live migrates successfully, but ping fails until the switch ARP aging timer expires (in our environment, 60-180 seconds)
Despite efforts from us and our network team, we have been unable to
determine why the VM is inaccessible. What we have noticed is that
sending a further number of announce_self commands to the QEMU monitor,
triggering more GARPs, gets the VM into an accessible state in an
acceptable time of <5 seconds.
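A minimal sketch of sending extra announce_self commands to the QEMU monitor by hand, assuming a QEMU new enough to provide the announce_self HMP command (QEMU >= 4.0); the domain name is an example, not one from this report.

    # Run on the destination hypervisor after the migration completes.
    # instance-0000abcd is an example; find the real name with `virsh list`.
    for i in 1 2 3; do
        virsh qemu-monitor-command instance-0000abcd --hmp announce_self
        sleep 1
    done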
Environment
=============
Arista EOS 4.26M VXLAN fabric
OpenStack Nova Train, Ussuri, Victoria (with and without the announce-self patch [0])
Ceph Nautilus
OpenStack provider networking, using VLANs
Patch/Workaround
=============
I have prepared a follow-up workaround patch, building on the announce-self patch, which we have been running in our production deployment.
This patch adds two configuration options and the associated code:
`enable_qemu_monitor_announce_max_retries` - calls announce_self a
further n times, triggering more GARP packets to be sent.
`enable_qemu_monitor_announce_retry_interval` - the delay between the
additional announce_self calls configured by the option above.
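As an illustration only, with the follow-up patch applied the options would sit in nova.conf next to the existing announce-self workaround, roughly as below. The [workarounds] group and enable_qemu_monitor_announce_self come from the announce-self patch [0]; the new option names and their placement may still change during review.

    [workarounds]
    # added by the announce-self patch [0]
    enable_qemu_monitor_announce_self = True
    # proposed follow-up options (values we settled on after testing, see below)
    enable_qemu_monitor_announce_max_retries = 3
    enable_qemu_monitor_announce_retry_interval = 1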
My tests of nearly 5000 live migrations show that the optimal settings
in our environment are 3 additional announce_self calls with a 1 second
delay - this gets our VMs accessible within 2 or 3 seconds in the vast
majority of cases, and within 5 seconds in 99% of cases, measured from
the point at which they stop responding to ping (the point at which we
consider them inaccessible).
I shall be submitting this patch for review by the Nova community.
[0] https://opendev.org/openstack/nova/commit/9609ae0bab30675e184d1fc63aec849c1de020d0
** Affects: nova
Importance: Undecided
Status: New
** Tags: live-migration