yahoo-eng-team team mailing list archive

[Bug 1967144] [NEW] [OVN] Live migration can fail due to wrong revision id during setting requested chassis in ovn

 

Public bug reported:

During live migration of a VM, Nova calls the /binding/activate API to activate the port binding on the destination node. Neutron then calls the mechanism drivers' port_update_postcommit() method, and at that point the OVN mechanism driver should update the "requested chassis" field of the LSP.
Unfortunately, we recently saw a race condition in our downstream CI: one worker was processing the port binding activate request while another worker was processing an OVN event related to the same port.
As a result, the revision numbers in the OVN DB and in Neutron did not match, and the requested chassis was not updated for the LSP. Because of that, the port was not claimed by OVN on the destination node, and connectivity to the VM was broken.
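
To illustrate the kind of race described above, here is a toy model in Python. It is only a sketch under stated assumptions, not the Neutron or OVN code: ToyLSP, guarded_update() and simulate_race() are hypothetical names, the revision values are made up, and the interleaving shown is just one possible ordering (the exact sequence seen in CI is in the Bugzilla report). It assumes that a write to the OVN DB is applied only when it carries a revision number newer than the one already stored.

```python
from dataclasses import dataclass


@dataclass
class ToyLSP:
    """Toy stand-in for a Logical Switch Port row (not the real OVN schema)."""
    revision: int = 0
    requested_chassis: str = "source-host"


def guarded_update(lsp, new_revision, **columns):
    """Apply `columns` only if `new_revision` is newer than the stored one.

    Assumption: this mimics a revision-guarded write; a stale or equal
    revision means the whole update is skipped.
    """
    if new_revision <= lsp.revision:
        return False
    for name, value in columns.items():
        setattr(lsp, name, value)
    lsp.revision = new_revision
    return True


def simulate_race():
    """One possible interleaving of two workers touching the same port."""
    lsp = ToyLSP(revision=5)

    # Worker B handles an OVN event for the port and writes first,
    # without touching requested_chassis.
    applied_b = guarded_update(lsp, 6)

    # Worker A handles the binding activate request. Its revision is no
    # longer newer than the stored one, so the update is skipped.
    applied_a = guarded_update(lsp, 6, requested_chassis="dest-host")

    return applied_b, applied_a, lsp


if __name__ == "__main__":
    applied_b, applied_a, lsp = simulate_race()
    print(applied_b, applied_a, lsp)
```

In this model the second write is dropped, so requested_chassis still points at the source host, which matches the symptom reported here: the port is never claimed on the destination node.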

More details can be found in our downstream Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=2068065

** Affects: neutron
     Importance: Medium
     Assignee: Slawek Kaplonski (slaweq)
         Status: Confirmed


** Tags: ovn

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1967144

Title:
  [OVN] Live migration can fail due to wrong revision id during setting
  requested chassis in ovn

Status in neutron:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1967144/+subscriptions


