yahoo-eng-team team mailing list archive

[Bug 1672597] [NEW] [live migration] The instance directory on the destination host is not cleaned up

 

Public bug reported:

I understand there is code to clean up the instance directory on the
target host if a live migration fails, but the directory is not
cleaned up when the libvirt connection times out.

I haven't had a chance to root-cause the issue, but I feel the code
could be improved a little to avoid it.

Here are some trace logs from my side.

- Libvirt connection timed out
2017-03-07 02:34:37.540 ERROR nova.virt.libvirt.driver [req-35bc9ca8-c77b-481c-ae3e-93bbfed6187f admin admin] [instance: 6714b056-4950-4e63-83d3-fc383e977a53] Live Migration failure: unable to connect to server at 'ceph-dev:49152': Connection timed out
2017-03-07 02:34:37.541 DEBUG nova.virt.libvirt.driver [req-35bc9ca8-c77b-481c-ae3e-93bbfed6187f admin admin] [instance: 6714b056-4950-4e63-83d3-fc383e977a53] Migration operation thread notification from (pid=18073) thread_finished /opt/stack/nova/nova/virt/libvirt/driver.py:6361
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/utils.py", line 1066, in context_wrapper
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5962, in _live_migration_operation
    instance=instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5958, in _live_migration_operation
    bandwidth=CONF.libvirt.live_migration_bandwidth)
  File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 605, in migrate
    flags=flags, bandwidth=bandwidth)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1586, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)

libvirtError: unable to connect to server at 'ceph-dev:49152':
Connection timed out


- The instance directory has not been cleaned up, so the next migration fails.

Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
    res = self.dispatcher.dispatch(message)

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
    result = func(ctxt, **new_args)

  File "/opt/stack/nova/nova/exception_wrapper.py", line 75, in wrapped
    function_name, call_dict, binary)

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)

  File "/opt/stack/nova/nova/exception_wrapper.py", line 66, in wrapped
    return f(self, context, *args, **kw)

  File "/opt/stack/nova/nova/compute/utils.py", line 613, in decorated_function
    return function(self, context, *args, **kwargs)

  File "/opt/stack/nova/nova/compute/manager.py", line 216, in decorated_function
    kwargs['instance'], e, sys.exc_info())

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)

  File "/opt/stack/nova/nova/compute/manager.py", line 204, in decorated_function
    return function(self, context, *args, **kwargs)

  File "/opt/stack/nova/nova/compute/manager.py", line 5192, in pre_live_migration
    migrate_data)

  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6474, in pre_live_migration
    raise exception.DestinationDiskExists(path=instance_dir)

DestinationDiskExists: The supplied disk path
(/opt/stack/data/nova/instances/6714b056-4950-4e63-83d3-fc383e977a53)
already exists, it is expected not to exist.
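
The failure mode in the two tracebacks can be sketched in isolation. This is a minimal illustration under stated assumptions, not nova's code: `DestinationDiskExists` and `pre_live_migration` here are local stand-ins named after what appears in the traceback, while `MigrationTimeout`, `live_migrate`, `failing_migrate`, `attempt`, and `demo` are hypothetical helpers introduced only for this example. It shows that if the destination directory is left behind after a failed migration, the next attempt trips the existence check, whereas removing it on any failure lets the retry start clean.

```python
import os
import shutil
import tempfile


class DestinationDiskExists(Exception):
    """Local stand-in for the exception seen in the traceback."""


class MigrationTimeout(Exception):
    """Hypothetical stand-in for the libvirt connection timeout."""


def pre_live_migration(instance_dir):
    # Mirrors the check in the traceback: the destination directory
    # must not exist before it is created for the incoming instance.
    if os.path.exists(instance_dir):
        raise DestinationDiskExists(instance_dir)
    os.mkdir(instance_dir)


def live_migrate(instance_dir, migrate, cleanup_on_failure=True):
    """Prepare the destination, run migrate(), clean up if it raises."""
    pre_live_migration(instance_dir)
    try:
        migrate()
    except Exception:
        if cleanup_on_failure:
            # Remove the half-prepared directory so a retry starts clean.
            shutil.rmtree(instance_dir, ignore_errors=True)
        raise


def failing_migrate():
    # Simulates the connection timeout from the first traceback.
    raise MigrationTimeout("unable to connect to server")


def attempt(instance_dir, cleanup_on_failure):
    """Return the name of the exception raised by one migration attempt."""
    try:
        live_migrate(instance_dir, failing_migrate, cleanup_on_failure)
    except Exception as exc:
        return type(exc).__name__
    return None


def demo():
    base = tempfile.mkdtemp()
    try:
        leaky = os.path.join(base, "leaky")
        clean = os.path.join(base, "clean")
        # Without cleanup, the second attempt hits the stale directory.
        without_cleanup = [attempt(leaky, False), attempt(leaky, False)]
        # With cleanup, both attempts fail only on the simulated timeout.
        with_cleanup = [attempt(clean, True), attempt(clean, True)]
        return without_cleanup, with_cleanup
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

Calling `demo()` runs two consecutive failed attempts for each variant. Whether nova's actual rollback path skips cleanup for this particular error is not established by the logs above; the sketch only demonstrates why a leftover directory blocks the next attempt.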

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1672597

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1672597/+subscriptions