yahoo-eng-team team mailing list archive

[Bug 1888395] Re: live migration of a vm using the single port binding work flow is broken in train as a result of the introduction of sriov live migration

 

** Description changed:

- it was working in queens but fails in train. nova compute at the target
- aborts with the exception:
+ [Impact]
+ 
+ Live migration of instances in an environment that uses neutron backends
+ that do not support multiple port bindings fails with a
+ NotImplementedError, effectively rendering live migration inoperable in
+ these environments.
+ 
+ This is fixed by first checking that the backend supports multiple port
+ bindings before providing the port bindings.
+ 
+ [Test Plan]
+ 
+ 1. Deploy a Train/Ussuri OpenStack cloud with at least 2 compute nodes
+ using an SDN that does not support multiple port bindings (e.g.
+ opencontrail).
+ 
+ 2. Attempt to perform a live migration of an instance.
+ 
+ 3. Observe that, without this fix, the live migration fails with the
+ trace below (NotImplementedError: Cannot load 'vif_type' in the base
+ class), and that it succeeds with this fix.
+ 
+ 
+ [Where problems could occur]
+ 
+ This affects the live migration code, so any problems would most likely
+ arise in this area. Specifically, the check introduced guards the
+ information provided for instances using SR-IOV indirect migration.
+ 
+ Regressions would most likely take the form of live migration errors in
+ features that rely on multiple port bindings (e.g. SR-IOV), rather than
+ in the more generic/common use case. Errors may be seen with standard
+ network providers that are included with distro packaging, and may also
+ be seen in scenarios where proprietary SDNs are used.
+ 
+ 
+ [Original Description]
+ It was working in Queens but fails in Train. The nova-compute service at the target aborts with the exception:
  
  Traceback (most recent call last):
-   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
-     res = self.dispatcher.dispatch(message)
-   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
-     return self._do_dispatch(endpoint, method, ctxt, args)    
-   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
-     result = func(ctxt, **new_args)
-   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 79, in wrapped
-     function_name, call_dict, binary, tb)
-   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
-     self.force_reraise()
-   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
-     six.reraise(self.type_, self.value, self.tb)
-   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 69, in wrapped
-     return f(self, context, *args, **kw)
-   File "/usr/lib/python2.7/site-packages/nova/compute/utils.py", line 1372, in decorated_function
-     return function(self, context, *args, **kwargs)
-   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 219, in decorated_function
-     kwargs['instance'], e, sys.exc_info())
-   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
-     self.force_reraise()
-   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
-     six.reraise(self.type_, self.value, self.tb)
-   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 207, in decorated_function
-     return function(self, context, *args, **kwargs)
-   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7007, in pre_live_migration
-     bdm.save()
-   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
-     self.force_reraise()
-   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
-     six.reraise(self.type_, self.value, self.tb)
-   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6972, in pre_live_migration
-     migrate_data)
-   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 9190, in pre_live_migration
-     instance, network_info, migrate_data)
-   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 9071, in _pre_live_migration_plug_vifs
-     vif_plug_nw_info.append(migrate_vif.get_dest_vif())
-   File "/usr/lib/python2.7/site-packages/nova/objects/migrate_data.py", line 90, in get_dest_vif
-     vif['type'] = self.vif_type
-   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 67, in getter
-     self.obj_load_attr(name)
-   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 603, in obj_load_attr
-     _("Cannot load '%s' in the base class") % attrname)
+   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
+     res = self.dispatcher.dispatch(message)
+   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
+     return self._do_dispatch(endpoint, method, ctxt, args)
+   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
+     result = func(ctxt, **new_args)
+   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 79, in wrapped
+     function_name, call_dict, binary, tb)
+   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
+     self.force_reraise()
+   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
+     six.reraise(self.type_, self.value, self.tb)
+   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 69, in wrapped
+     return f(self, context, *args, **kw)
+   File "/usr/lib/python2.7/site-packages/nova/compute/utils.py", line 1372, in decorated_function
+     return function(self, context, *args, **kwargs)
+   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 219, in decorated_function
+     kwargs['instance'], e, sys.exc_info())
+   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
+     self.force_reraise()
+   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
+     six.reraise(self.type_, self.value, self.tb)
+   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 207, in decorated_function
+     return function(self, context, *args, **kwargs)
+   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7007, in pre_live_migration
+     bdm.save()
+   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
+     self.force_reraise()
+   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
+     six.reraise(self.type_, self.value, self.tb)
+   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6972, in pre_live_migration
+     migrate_data)
+   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 9190, in pre_live_migration
+     instance, network_info, migrate_data)
+   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 9071, in _pre_live_migration_plug_vifs
+     vif_plug_nw_info.append(migrate_vif.get_dest_vif())
+   File "/usr/lib/python2.7/site-packages/nova/objects/migrate_data.py", line 90, in get_dest_vif
+     vif['type'] = self.vif_type
+   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 67, in getter
+     self.obj_load_attr(name)
+   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 603, in obj_load_attr
+     _("Cannot load '%s' in the base class") % attrname)
  NotImplementedError: Cannot load 'vif_type' in the base class
- 
  
  steps to reproduce:
  - train centos 7 based deployment: 1 controller, 2 computes, libvirt + qemu-kvm, ceph shared storage, neutron with contrail vrouter virtual network;
  - create and start a vm;
  - live migrate it between computes.
  
  expected result: vm migrates successfully.
  
- 
  rpm -qa | grep nova:
  
  python2-novaclient-15.1.1-1.el7.noarch
  openstack-nova-common-20.3.0-1.el7.noarch
  python2-nova-20.3.0-1.el7.noarch
  openstack-nova-compute-20.3.0-1.el7.noarch

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/victoria
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/train
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
       Status: New

** Changed in: cloud-archive/victoria
       Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1888395

Title:
  live migration of a vm using the single port binding work flow is
  broken in train as a result of the introduction of sriov live
  migration

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive train series:
  New
Status in Ubuntu Cloud Archive ussuri series:
  New
Status in Ubuntu Cloud Archive victoria series:
  Fix Released
Status in networking-opencontrail:
  New
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  In Progress
Status in OpenStack Compute (nova) ussuri series:
  Fix Committed
Status in nova package in Ubuntu:
  New
Status in nova source package in Focal:
  Triaged
Status in nova source package in Groovy:
  Fix Released

Bug description:
  [Impact]

  Live migration of instances in an environment that uses neutron
  backends that do not support multiple port bindings fails with a
  NotImplementedError, effectively rendering live migration inoperable
  in these environments.

  This is fixed by first checking that the backend supports multiple
  port bindings before providing the port bindings.
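
  The shape of that check can be illustrated with a minimal Python
  sketch. This is not the actual nova patch: NetworkAPIStub,
  supports_multiple_port_bindings() and build_migrate_vifs() are
  hypothetical names invented for illustration, and the
  'binding-extended' alias is an assumption about how the neutron
  backend advertises multiple port binding support.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NetworkAPIStub:
    """Hypothetical stand-in for the neutron API wrapper (not nova code)."""

    # Extension aliases advertised by the backend (assumed values).
    extensions: List[str] = field(default_factory=list)

    def supports_multiple_port_bindings(self) -> bool:
        # Assumption: the backend signals support via this alias.
        return "binding-extended" in self.extensions


def build_migrate_vifs(
    network_api: NetworkAPIStub, port_ids: List[str]
) -> Optional[List[Dict[str, str]]]:
    """Return per-port migration data only when the backend supports it."""
    if not network_api.supports_multiple_port_bindings():
        # Single port binding workflow: provide no per-VIF binding data,
        # so the destination never tries to read an unset 'vif_type'.
        return None
    # Multiple port bindings workflow: one entry per port.
    return [{"port_id": port_id} for port_id in port_ids]


if __name__ == "__main__":
    legacy = NetworkAPIStub(extensions=[])
    modern = NetworkAPIStub(extensions=["binding-extended"])
    print(build_migrate_vifs(legacy, ["port-1"]))
    print(build_migrate_vifs(modern, ["port-1"]))
```

  The point of the design is that the decision is made once, up front,
  rather than letting the destination host discover the missing data
  while plugging VIFs.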

  [Test Plan]

  1. Deploy a Train/Ussuri OpenStack cloud with at least 2 compute nodes
  using an SDN that does not support multiple port bindings (e.g.
  opencontrail).

  2. Attempt to perform a live migration of an instance.

  3. Observe that, without this fix, the live migration fails with the
  trace below (NotImplementedError: Cannot load 'vif_type' in the base
  class), and that it succeeds with this fix.

  
  [Where problems could occur]

  This affects the live migration code, so any problems would most
  likely arise in this area. Specifically, the check introduced guards
  the information provided for instances using SR-IOV indirect
  migration.

  Regressions would most likely take the form of live migration errors
  in features that rely on multiple port bindings (e.g. SR-IOV), rather
  than in the more generic/common use case. Errors may be seen with
  standard network providers that are included with distro packaging,
  and may also be seen in scenarios where proprietary SDNs are used.

  
  [Original Description]
  It was working in Queens but fails in Train. The nova-compute service at the target aborts with the exception:

  Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
      res = self.dispatcher.dispatch(message)
    File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
      return self._do_dispatch(endpoint, method, ctxt, args)
    File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
      result = func(ctxt, **new_args)
    File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 79, in wrapped
      function_name, call_dict, binary, tb)
    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 69, in wrapped
      return f(self, context, *args, **kw)
    File "/usr/lib/python2.7/site-packages/nova/compute/utils.py", line 1372, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 219, in decorated_function
      kwargs['instance'], e, sys.exc_info())
    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 207, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7007, in pre_live_migration
      bdm.save()
    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6972, in pre_live_migration
      migrate_data)
    File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 9190, in pre_live_migration
      instance, network_info, migrate_data)
    File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 9071, in _pre_live_migration_plug_vifs
      vif_plug_nw_info.append(migrate_vif.get_dest_vif())
    File "/usr/lib/python2.7/site-packages/nova/objects/migrate_data.py", line 90, in get_dest_vif
      vif['type'] = self.vif_type
    File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 67, in getter
      self.obj_load_attr(name)
    File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 603, in obj_load_attr
      _("Cannot load '%s' in the base class") % attrname)
  NotImplementedError: Cannot load 'vif_type' in the base class
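
  The last three frames of the trace can be mimicked in a few lines of
  self-contained Python. The sketch below does not use
  oslo.versionedobjects; LazyFieldObject and this get_dest_vif() are
  simplified, hypothetical stand-ins that only mirror the behaviour the
  trace shows: reading a declared field that was never set falls through
  to obj_load_attr(), whose base implementation raises
  NotImplementedError.

```python
class LazyFieldObject:
    """Simplified, hypothetical model of an object with lazy-loadable fields."""

    fields = ("port_id", "vif_type")

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. the field was never set.
        if name in type(self).fields:
            return self.obj_load_attr(name)
        raise AttributeError(name)

    def obj_load_attr(self, attrname):
        # The base class has no way to lazy-load the value.
        raise NotImplementedError(
            "Cannot load '%s' in the base class" % attrname)


def get_dest_vif(migrate_vif):
    """Build a VIF dict, reading 'vif_type' as the failing frame does."""
    vif = {"id": migrate_vif.port_id}
    vif["type"] = migrate_vif.vif_type
    return vif


if __name__ == "__main__":
    unbound = LazyFieldObject()
    unbound.port_id = "port-1"  # 'vif_type' is never populated
    try:
        get_dest_vif(unbound)
    except NotImplementedError as exc:
        print("NotImplementedError: %s" % exc)
```

  Under these assumptions, the error is a symptom of the destination
  being handed a VIF object without binding details, which is consistent
  with a backend that only supports the single port binding workflow.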

  steps to reproduce:
  - train centos 7 based deployment: 1 controller, 2 computes, libvirt + qemu-kvm, ceph shared storage, neutron with contrail vrouter virtual network;
  - create and start a vm;
  - live migrate it between computes.

  expected result: vm migrates successfully.

  rpm -qa | grep nova:

  python2-novaclient-15.1.1-1.el7.noarch
  openstack-nova-common-20.3.0-1.el7.noarch
  python2-nova-20.3.0-1.el7.noarch
  openstack-nova-compute-20.3.0-1.el7.noarch

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1888395/+subscriptions

