yahoo-eng-team team mailing list archive

[Bug 1929446] [NEW] check_can_live_migrate_source taking > 60 seconds in CI

 

Public bug reported:

I've been seeing lots of failures caused by timeouts in
test_volume_backed_live_migration during the live-migration and
multinode grenade jobs, for example:

https://zuul.opendev.org/t/openstack/build/bb6fd21b5d8c471a89f4f6598aa84e5d/logs

During check_can_live_migrate_source I'm seeing the following gap of
roughly 80 seconds in the nova-compute logs that I can't explain:

12225 May 24 10:23:02.637600 ubuntu-focal-inap-mtl01-0024794054 nova-compute[107012]: DEBUG nova.virt.libvirt.driver [None req-b5288b85-d642-426f-a525-c64724fe4091 tempest-LiveMigrationTest-312230369 tempest-LiveMigrationTest-312230369-project-admin] [instance: 91a0e0ca-e6a8-43ab-8e68-a10a77ad615b] Check if temp file /opt/stack/data/nova/instances/tmp5lcmhuri exists to indicate shared storage is being used for migration. Exists? False {{(pid=107012) _check_shared_storage_test_file /opt/stack/nova/nova/virt/libvirt/driver.py:9367}}
[..]
12282 May 24 10:24:22.385187 ubuntu-focal-inap-mtl01-0024794054 nova-compute[107012]: DEBUG nova.virt.libvirt.driver [None req-b5288b85-d642-426f-a525-c64724fe4091 tempest-LiveMigrationTest-312230369 tempest-LiveMigrationTest-312230369-project-admin] skipping disk /dev/sdb (vda) as it is a volume {{(pid=107012) _get_instance_disk_info_from_config /opt/stack/nova/nova/virt/libvirt/driver.py:10458}}

^ This gap causes both the HTTP request to live migrate (still a
synchronous call at this point [1]) *and* the RPC call from the dest to
the source to time out.

[1] https://docs.openstack.org/nova/latest/reference/live-migration.html
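
The size of the gap can be checked directly from the two timestamps
quoted above. This is only a minimal sketch: the 60 second limit is
taken from the bug title and is assumed to be the timeout in question,
and the names parse_time and TIMEOUT_SECONDS are hypothetical helpers,
not part of nova.

```python
from datetime import datetime

# Timestamps copied from the two nova-compute log lines quoted above.
BEFORE_GAP = "May 24 10:23:02.637600"
AFTER_GAP = "May 24 10:24:22.385187"

# Assumption: 60 seconds, as stated in the bug title.
TIMEOUT_SECONDS = 60


def parse_time(stamp):
    """Parse only the time-of-day part of a log timestamp.

    Both lines were logged on the same day (May 24), so the date part
    can be ignored when computing the difference.
    """
    return datetime.strptime(stamp.split()[-1], "%H:%M:%S.%f")


gap = (parse_time(AFTER_GAP) - parse_time(BEFORE_GAP)).total_seconds()

print(f"gap: {gap:.3f}s, exceeds timeout: {gap > TIMEOUT_SECONDS}")
```

For these two lines the gap comes out at just under 80 seconds, which
is longer than the assumed 60 second limit on its own.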

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: gate-failure live-migration

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1929446

