yahoo-eng-team team mailing list archive
Message #94135
[Bug 2069718] [NEW] [ovn] No connection to VM during live-migration
Public bug reported:
Problem: In environments with many hypervisors and VMs, a live migration
leaves VMs unreachable for several seconds (4-20s).
Description:
We run a large environment with many hypervisors and VMs, so northd reconcile cycles take some time.
During live migration, even though nova has live_migration_wait_for_vif_plug=true configured, the vif-plugged event from neutron is sent before northd has processed the change to have the VM's port added to the destination hypervisor and multi-chassis-feature is enabled.
Nova starts the live migration in libvirt, and it completes before the southbound database and the ovn-controller of the destination have the change.
So the VM is started on the destination hypervisor, but the port setup is not done yet.
From what I saw, the vif-plugged event is generated by neutron when the
transaction to the northbound ovsdb is finished [1].
Is there a way to wait until the change has propagated to the southbound
ovsdb?
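One possible shape for such a wait, as a rough sketch and not neutron code: OVN's NB_Global table carries the sequence counters nb_cfg and sb_cfg, and ovn-northd sets sb_cfg to the value of nb_cfg once it has applied the corresponding changes to the southbound database (this is the mechanism behind `ovn-nbctl --wait=sb sync`). The helper name `wait_for_southbound` and the `read_counters` callable are hypothetical; the callable stands in for whatever reads those two columns in a given deployment.

```python
import time
from typing import Callable, Tuple


def wait_for_southbound(
    read_counters: Callable[[], Tuple[int, int]],
    target_nb_cfg: int,
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until sb_cfg has caught up with target_nb_cfg, or time out.

    read_counters is a hypothetical callable returning (nb_cfg, sb_cfg)
    as read from the NB_Global table. Returns True if northd caught up
    before the deadline, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        _nb_cfg, sb_cfg = read_counters()
        if sb_cfg >= target_nb_cfg:
            return True  # northd has processed the change we care about
        if clock() >= deadline:
            return False  # northd did not catch up in time
        sleep(interval)
```

This only covers propagation to the southbound database. Whether the ovn-controller on the destination has applied the change is a separate question; NB_Global's hv_cfg counter tracks the hypervisors in a similar way.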
Version:
neutron-server 21.2.1 zed / unmaintained/zed
ml2 plugin: ovn
at neutron: ovsdb-client (Open vSwitch) 3.3.0
Nova zed / unmaintained/zed
nova.conf: live_migration_wait_for_vif_plug=true ([2])
Hypervisor OS: Ubuntu 22.04 with newer kernel (but that shouldn't be relevant here)
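For reference, the nova option lives in the [compute] group, as the anchor in [2] indicates. The snippet below is only an illustrative check using Python's standard configparser; nova itself reads its configuration through oslo.config, and the function name `waits_for_vif_plug` is made up for this example.

```python
import configparser
from typing import Optional

# Minimal nova.conf fragment matching the setting described above.
NOVA_CONF = """
[compute]
live_migration_wait_for_vif_plug = true
"""


def waits_for_vif_plug(conf_text: str) -> Optional[bool]:
    """Return the configured value, or None if the option is not set.

    Interpolation is disabled so that '%' characters elsewhere in a real
    nova.conf do not trip up configparser.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean(
        "compute", "live_migration_wait_for_vif_plug", fallback=None
    )
```

A return value of None only means the option is absent from the file; it says nothing about the default nova applies in that case.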
Steps to Reproduce:
1. Run neutron with ovn setup and create a VM that you can ping (via FIP or other VM in same private network)
2. Stop northd
3. Start live-migration
4. Wait till live-migration is done - VM is not reachable anymore
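To put a number on the outage while following the steps above, timestamped reachability probes (for example, one ping per second against the FIP) can be reduced to the longest unreachable window. This is a small sketch with a made-up function name and sample format, not part of any OpenStack tooling.

```python
from typing import Iterable, Optional, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest unreachable window in seconds.

    samples is a sequence of (timestamp, reachable) probes in time order.
    An outage runs from the first failed probe to the next successful one;
    an outage still open at the end is measured up to the last probe.
    """
    longest = 0.0
    outage_start: Optional[float] = None
    last_timestamp: Optional[float] = None
    for timestamp, reachable in samples:
        last_timestamp = timestamp
        if not reachable and outage_start is None:
            outage_start = timestamp  # outage begins
        elif reachable and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None  # outage ends
    if outage_start is not None and last_timestamp is not None:
        longest = max(longest, last_timestamp - outage_start)
    return longest
```

With northd stopped as in step 2, the outage would be expected to stay open until northd is started again, so the result then reflects only how long the probing ran.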
[1] https://opendev.org/openstack/neutron/src/branch/unmaintained/zed/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py#L836
[2] https://docs.openstack.org/nova/latest/configuration/config.html#compute.live_migration_wait_for_vif_plug
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2069718
Title:
[ovn] No connection to VM during live-migration
Status in neutron:
New