
yahoo-eng-team team mailing list archive

[Bug 1969971] Re: Live migrations failing due to remote host identification change

 

SSH known_hosts file handling is not in scope for nova. I am glad to see
that this is progressing in the charms. Closing this for nova.

** Changed in: nova
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1969971

Title:
  Live migrations failing due to remote host identification change

Status in OpenStack Nova Cloud Controller Charm:
  In Progress
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I've encountered a cloud where, for some reason (maybe a redeploy of a
  compute; I'm not sure), I'm hitting this error in nova-compute.log on
  the source node for an instance migration:

  2022-04-22 10:21:17.419 3776 ERROR nova.virt.libvirt.driver [-] [instance: <REDACTED INSTANCE UUID>] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+ssh://<REDACTED IP>/system: Cannot recv data: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
  @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
  @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
  IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
  Someone could be eavesdropping on you right now (man-in-the-middle attack)!
  It is also possible that a host key has just been changed.
  The fingerprint for the RSA key sent by the remote host is
  SHA256:<REDACTED FINGERPRINT>.
  Please contact your system administrator.
  Add correct host key in /root/.ssh/known_hosts to get rid of this message.
  Offending RSA key in /root/.ssh/known_hosts:97
    remove with:
    ssh-keygen -f "/root/.ssh/known_hosts" -R "<REDACTED IP>"
  RSA host key for <REDACTED IP> has changed and you have requested strict checking.
  Host key verification failed.: Connection reset by peer: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+ssh://<REDACTED IP>/system: Cannot recv data: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

  This causes live migration of the instance to fail.

  There is a workaround:
  * On the source node, manually SSH to the destination node as both the root and nova users.
  * Manually clear the offending known_hosts entries reported by the SSH command.
  * Verify that, once the entries are cleared, both the root and nova users can connect via SSH.

  Obviously, this is cumbersome for clouds with a large number of
  compute nodes. It would be better if the charm were able to avoid this
  issue.
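
  Until the charm handles this, the manual workaround above could be
  scripted per source node. The following is only a rough sketch, not
  something taken from the charm or from nova: it builds the same
  "ssh-keygen -f <file> -R <host>" command that the SSH warning
  suggests, once per known_hosts file and per destination host. The
  nova user's known_hosts path and the IP addresses are assumptions for
  illustration; adjust them to the deployment. It only removes stale
  entries, so the new host keys still have to be verified and accepted
  afterwards (the third workaround step).

```python
import subprocess
from pathlib import Path


def removal_command(known_hosts, host):
    """Build the ssh-keygen call that the SSH warning itself suggests."""
    return ["ssh-keygen", "-f", known_hosts, "-R", host]


def clear_stale_entries(known_hosts_files, hosts, dry_run=True):
    """Return (and optionally run) one removal command per file and host.

    With dry_run=True nothing is executed; the commands are only returned
    so they can be reviewed first.
    """
    commands = []
    for path in known_hosts_files:
        for host in hosts:
            cmd = removal_command(path, host)
            commands.append(cmd)
            # Only touch files that actually exist on this node.
            if not dry_run and Path(path).exists():
                subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical values: the root path comes from the error message,
    # the nova path and the IPs are placeholders (assumptions).
    files = ["/root/.ssh/known_hosts", "/var/lib/nova/.ssh/known_hosts"]
    destinations = ["192.0.2.10", "192.0.2.11"]
    for cmd in clear_stale_entries(files, destinations, dry_run=True):
        print(" ".join(cmd))
```

  Running it with dry_run=True only prints the commands; switching to
  dry_run=False executes them, and needs to be done as a user with
  write access to both known_hosts files.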

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1969971/+subscriptions