yahoo-eng-team team mailing list archive

[Bug 2000069] [NEW] Live-migration failure during post because of keystone unavailable

 

Public bug reported:

Description of problem:

A client attempted a live migration, and it failed during the
post-live-migration step.

The source of the problem seems to be related to Keystone:
1064:2022-11-17 15:49:15.296 7 INFO nova.compute.resource_tracker [req-1c837c0b-3ccd-4226-8340-af1b12c6fddb - - - - -] [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] Updating resource usage from migration 1976219d-a89d-48f2-893f-39d3e194a4f3
1067:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [req-ac4e513d-a39c-452c-8df7-319d2e764095 - - - - -] [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] Post live migration at destination icmlw-p1-r740-070.itpc.uk.pri.o2.com failed: oslo_messaging.rpc.client.RemoteError: Remote error: ServiceUnavailable The server is currently unavailable. Please try again at a later time.<br /><br />
1073:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] Traceback (most recent call last):
1074:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 7579, in _post_live_migration
1075:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     instance, block_migration, dest)
1076:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/nova/compute/rpcapi.py", line 796, in post_live_migration_at_destination
1077:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     instance=instance, block_migration=block_migration)
1078:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 181, in call
1079:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     transport_options=self.transport_options)
1080:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 129, in _send
1081:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     transport_options=transport_options)
1082:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 674, in send
1083:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     transport_options=transport_options)
1084:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 664, in _send
1085:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     raise result
1086:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] oslo_messaging.rpc.client.RemoteError: Remote error: ServiceUnavailable The server is currently unavailable. Please try again at a later time.<br /><br />
1087:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] The Keystone service is temporarily unavailable.

The log shows that Keystone was unavailable at that moment, so the
failure itself is expected. The issue is the state the instance was
left in:

The instance was stuck in the "MIGRATING" state, and nova.instances still pointed to the source compute.
The customer had to modify the database to point to the new compute.
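
To find which instances hit this failure, the nova-compute log on the source host can be scanned for the error line shown above. The following is a minimal sketch, assuming the log format matches the excerpt in this report; the function name `find_failed_post_migrations` and the regular expression are illustrative and are not part of nova.

```python
import re

# Pattern derived from the ERROR line quoted in this report. The exact
# log format may differ between deployments (assumption).
FAILED_POST_RE = re.compile(
    r"\[instance: (?P<uuid>[0-9a-f-]{36})\] "
    r"Post live migration at destination (?P<dest>\S+) failed"
)


def find_failed_post_migrations(lines):
    """Return (instance_uuid, destination_host) pairs found in log lines."""
    results = []
    for line in lines:
        match = FAILED_POST_RE.search(line)
        if match:
            results.append((match.group("uuid"), match.group("dest")))
    return results
```

Each returned pair names an instance whose database record may still reference the source compute and should be checked by hand.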

Version-Release number of selected component (if applicable):
Master (A) and probably Z, Y, X, W, V, U, T


How reproducible:
Happened once

Steps to Reproduce:
1. Start a live migration.
2. Keystone becomes unavailable during the post-live-migration step.

Actual results:
The live migration completed, but the instance remained in the MIGRATING state.

Expected results:
The live migration is reported as failed and the instance is put in the ERROR state, allowing it to be recovered.
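
The expected behaviour could look roughly like the sketch below. It is not nova's code: `Instance`, `RemoteError`, `post_live_migration`, and `call_destination` are hypothetical stand-ins, used only to illustrate catching the RPC failure and moving the instance out of the migrating task state into ERROR.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class RemoteError(Exception):
    """Stand-in for the RPC error seen in the traceback (hypothetical)."""


@dataclass
class Instance:
    """Simplified instance record; field names are illustrative."""
    uuid: str
    host: str
    vm_state: str = "active"
    task_state: Optional[str] = "migrating"


def post_live_migration(instance: Instance, dest: str,
                        call_destination: Callable[[Instance], None]) -> None:
    """Finish a live migration; mark the instance ERROR if the
    destination-side step fails."""
    try:
        call_destination(instance)
    except RemoteError:
        # Do not leave the instance stuck in the migrating task state.
        instance.vm_state = "error"
        instance.task_state = None
        raise
    # Success: the record now points at the destination host.
    instance.host = dest
    instance.task_state = None
```

Which host the record should reference after such a failure is left open here; the sketch only covers the state transition requested under "Expected results".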

** Affects: nova
     Importance: Undecided
         Status: Confirmed

** Changed in: nova
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2000069

Title:
  Live-migration failure during post because of keystone unavailable

Status in OpenStack Compute (nova):
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2000069/+subscriptions