yahoo-eng-team team mailing list archive
Message #87292
[Bug 1944619] Re: Instances with SRIOV ports lose access after failed live migrations
Hello:
The reported problem relates to the Nova migration process. When an
error like this one occurs, Nova should rebind the port on the source
host. If that is not happening, as you describe, then the Nova process
should be reviewed.
Please provide the Neutron logs (SRIOV agent and Neutron server), in
DEBUG mode if possible.
Regards.
** Changed in: neutron
Status: New => Incomplete
** Also affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1944619
Title:
Instances with SRIOV ports lose access after failed live migrations
Status in neutron:
Incomplete
Status in OpenStack Compute (nova):
New
Bug description:
If a live migration fails for an instance with an SRIOV port during
the '_pre_live_migration' hook, the instance loses access to the
network and leaves behind duplicated port bindings in the database.
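One way to spot the leftover bindings is to compare the hosts a port is bound to against the source host. The following is only a minimal sketch: it assumes "duplicated" means the port still carries a binding for the destination host in addition to the source host, and that the bindings have already been fetched as a list of dicts with `host` and `status` keys. The helper name, host names, and sample data are hypothetical and not taken from the report.

```python
def extra_bindings(bindings, source_host):
    """Return the bindings that do not belong to source_host.

    `bindings` is assumed to be a list of dicts with at least the keys
    'host' and 'status' (an assumed shape, not confirmed by the report).
    """
    return [b for b in bindings if b["host"] != source_host]


# Hypothetical bindings for one port after a failed live migration.
sample_bindings = [
    {"host": "compute-src", "status": "ACTIVE"},
    {"host": "compute-dst", "status": "INACTIVE"},
]

leftovers = extra_bindings(sample_bindings, "compute-src")
for binding in leftovers:
    print("leftover binding on", binding["host"], "status", binding["status"])
```

An empty result would suggest the port is bound only to the source host, which is the state expected after a rollback.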
The instance regains connectivity on the source host after a reboot
(I don't know whether there is another way to restore connectivity).
As a side effect of this behavior, the pre-live migration cleanup hook
also fails with:
PCI device 0000:3b:10.0 is in use by driver QEMU
[How to reproduce]
- Create an environment with SRIOV (our case uses switchdev[1])
- Create 1 VM
- Provoke a failure in the _pre_live_migration process (for example, by creating a directory /var/lib/nova/instances/<instance id>)
- Check the VM's connectivity
- Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
Full stack trace[2]
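The log check in the last step can be scripted. The following is a rough sketch that scans log lines for the "PCI device ... is in use by driver" message and extracts the PCI address, driver, and domain. The regular expression is modeled only on the two error lines quoted in this report, and the function name is hypothetical.

```python
import re

# Pattern modeled on the error lines quoted in the bug report; the
# ", domain ..." suffix is optional because one quoted line omits it.
PCI_IN_USE = re.compile(
    r"PCI device (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"is in use by driver (?P<driver>\w+)"
    r"(?:, domain (?P<domain>[\w-]+))?"
)


def find_pci_in_use(lines):
    """Yield (pci_address, driver, domain) for each matching log line."""
    for line in lines:
        match = PCI_IN_USE.search(line)
        if match:
            yield match.group("pci"), match.group("driver"), match.group("domain")


# The two messages quoted in the report, used here as sample input.
sample = [
    "libvirt.libvirtError: Requested operation is not valid: PCI device "
    "0000:03:04.1 is in use by driver QEMU, domain instance-00000001",
    "PCI device 0000:3b:10.0 is in use by driver QEMU",
]

hits = list(find_pci_in_use(sample))
for pci, driver, domain in hits:
    print(pci, driver, domain)
```

To run it against a real log, pass an open file object instead of `sample`; the log file location depends on the deployment.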
[Expected]
VM connectivity is restored, even if there is a brief disconnection
[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power cycled
[Environment]
Focal Ussuri with Mellanox ConnectX-5 cards
[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1944619/+subscriptions