yahoo-eng-team team mailing list archive

[Bug 1507521] Re: Nova Resize is failing for shared storage between Compute node

 

@palbhan:

> [...]  we checked the share storage is true or not before migrating.
So, did you do a resize which results in a cold-migration (=rebuild) of
the instance to another server?

> I think checking the share storage using " shared_storage = 
> (dest == self.get_host_ip_addr())" is not the right way. 
What makes you think so? This comparison only checks whether the
resize/rebuild of the instance is happening on the very same host. If
it is not, the function goes on to test for shared storage via SSH.

> When I always return true from _is_storage_shared_with function, 
> the Nova resize works fine.
If your setup will never(!) change, this could work, but I wouldn't 
recommend doing it this way.

I assume that SSH is not set up correctly between your *compute nodes*;
at least the error "'Host key verification failed.\r\n'" suggests so.
The "cloud admin guide" states [1]:

    Ensure you can access SSH without a password and without 
    StrictHostKeyChecking between HostB and HostC as nova user 
    (set with the owner of nova-compute service). Direct access 
    from one compute host to another is needed to copy the VM file
    across. It is also needed to detect if the source and target 
    compute nodes share a storage subsystem.
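
A quick way to check this requirement is a non-interactive SSH probe run
as the nova user on the source compute node. The following is only a
sketch under assumptions: build_ssh_probe, classify_ssh_failure and probe
are hypothetical helper names, and the matched error strings are the
usual OpenSSH client messages rather than anything Nova defines.

```python
import subprocess


def build_ssh_probe(dest, timeout=5):
    """Build an SSH command that runs `true` on `dest` without prompting."""
    return [
        'ssh',
        '-o', 'BatchMode=yes',               # fail instead of asking questions
        '-o', 'ConnectTimeout=%d' % timeout,  # do not hang on unreachable hosts
        dest,
        'true',
    ]


def classify_ssh_failure(returncode, stderr):
    """Map an ssh exit code and stderr text to a coarse failure reason."""
    if returncode == 0:
        return 'ok'
    if 'Host key verification failed' in stderr:
        return 'host-key'
    if 'Permission denied' in stderr:
        return 'auth'
    return 'other'


def probe(dest):
    """Run the probe; meant to be called as the nova user."""
    proc = subprocess.run(build_ssh_probe(dest),
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE,
                          universal_newlines=True)
    return classify_ssh_failure(proc.returncode, proc.stderr)
```

Calling probe('20.20.20.3') on the source node should return 'ok' once
the setup is correct. A result of 'host-key' corresponds to the error in
your log.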

I think the issue will be gone as soon as you have set up SSH between
your compute nodes as described. Because of this I am closing this bug
report as invalid. If you think I misunderstood you and this is a valid
bug, please reopen it by setting the status back to "New".

References:
[1] http://docs.openstack.org/admin-guide-cloud/compute-configuring-migrations.html

** Changed in: nova
       Status: New => Invalid

** Tags added: libvirt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1507521

Title:
  Nova Resize is failing for shared storage between Compute node

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Nova Version: 2.22.0

  I have shared NFS storage mounted at /var/lib/nova on two compute
  nodes. When I tried to resize an instance using the nova resize
  command, it failed. Below is the log output:

  2015-10-19 05:13:15.582 14325 ERROR oslo_messaging.rpc.dispatcher [req-5cb16661-74ec-4faf-93cd-044e597cc9de d4209dcd86b84fc584f8b3b72bee0c64 da6c9fa9be0046dda47e9bd6caf3908a - - -] Exception during message handling: Resize error: not able to execute ssh command: Unexpected error while running command.
  Command: ssh 20.20.20.3 mkdir -p /var/lib/nova/instances/744d6341-023f-49cd-9d93-7bae7eb32653
  Exit code: 255
  Stdout: u''
  Stderr: u'Host key verification failed.\r\n'
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     executor_callback)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6748, in resize_instance
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     clean_shutdown=clean_shutdown)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 88, in wrapped
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     payload)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 71, in wrapped
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     return f(self, context, *args, **kw)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 327, in decorated_function
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     LOG.warning(msg, e, instance_uuid=instance_uuid)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 298, in decorated_function
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 377, in decorated_function
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 286, in decorated_function
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     migration.instance_uuid, exc_info=True)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 269, in decorated_function
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 355, in decorated_function
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     kwargs['instance'], e, sys.exc_info())
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 343, in decorated_function
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 4012, in resize_instance
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     self.instance_events.clear_events_for_instance(instance)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     self.gen.throw(type, value, traceback)
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6382, in _error_out_instance_on_exception
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher     raise error.inner_exception
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher ResizeError: Resize error: not able to execute ssh command: Unexpected error while running command.
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher Command: ssh 20.20.20.3 mkdir -p /var/lib/nova/instances/744d6341-023f-49cd-9d93-7bae7eb32653
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher Exit code: 255
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher Stdout: u''
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher Stderr: u'Host key verification failed.\r\n'
  2015-10-19 05:13:15.582 14325 TRACE oslo_messaging.rpc.dispatcher

  When I checked the code, I found that
  /usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py checks
  whether the storage is shared before migrating.

    def _is_storage_shared_with(self, dest, inst_base):
        # NOTE (rmk): There are two methods of determining whether we are
        #             on the same filesystem: the source and dest IP are the
        #             same, or we create a file on the dest system via SSH
        #             and check whether the source system can also see it.
        shared_storage = (dest == self.get_host_ip_addr())
        if not shared_storage:
            tmp_file = uuid.uuid4().hex + '.tmp'
            tmp_path = os.path.join(inst_base, tmp_file)
            LOG.debug("Temp path is")
            LOG.debug(tmp_path)
            try:
                utils.execute('ssh', dest, 'touch', tmp_path)
                if os.path.exists(tmp_path):
                    shared_storage = True
                    os.unlink(tmp_path)
                else:
                    utils.execute('ssh', dest, 'rm', tmp_path)
            except Exception:
                pass
        return shared_storage

  But with storage shared between compute nodes, "shared_storage =
  (dest == self.get_host_ip_addr())" always evaluates to false, the ssh
  mkdir command also fails, and the Nova resize fails. I think checking
  for shared storage using "shared_storage = (dest ==
  self.get_host_ip_addr())" is not the right way. When I always return
  true from the _is_storage_shared_with function, the Nova resize works
  fine.
