yahoo-eng-team team mailing list archive
Message #74839
[Bug 1793419] Re: database online data migration fail due to missing request spec marker
OK, I think I see: _get_marker_for_migrate_instances returns the marker
because there is still a request_specs table entry with the marker
instance_uuid (we didn't used to clean up request specs on db
archive/purge, but now we do). So when listing instances we passed a
marker for an instance which no longer existed, and that raised
MarkerNotFound and failed.
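The failure mode described above can be sketched in a few lines. This is a toy model, not nova's code: the `is_marker` field, the two helper functions and the local `MarkerNotFound` class are all hypothetical stand-ins, and the "tables" are plain Python lists. It only illustrates that the marker lookup succeeds (the row is still in request_specs) while the instance listing fails (the instance itself has been purged).

```python
class MarkerNotFound(Exception):
    """Local stand-in for the exception named in the comment above."""


def get_marker(request_specs):
    # The marker row lives in request_specs, so it can outlive the
    # instance it points at. "is_marker" is a made-up field.
    for row in request_specs:
        if row.get("is_marker"):
            return row["instance_uuid"]
    return None


def list_instances(instances, marker=None, limit=50):
    # Pagination needs the marker instance to exist to know where to resume.
    uuids = [inst["uuid"] for inst in instances]
    if marker is None:
        start = 0
    elif marker in uuids:
        start = uuids.index(marker) + 1
    else:
        raise MarkerNotFound(marker)
    return instances[start:start + limit]


# The marker row survived, but the instance it names was purged.
request_specs = [{"instance_uuid": "uuid-purged", "is_marker": True}]
instances = [{"uuid": "uuid-1"}, {"uuid": "uuid-2"}]

marker = get_marker(request_specs)
try:
    list_instances(instances, marker=marker)
    failed = False
except MarkerNotFound:
    failed = True
```

In this sketch, listing without a marker still works; only the stale marker makes the call fail.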
** Changed in: nova
Importance: Undecided => Low
** Changed in: nova
Status: New => Triaged
** Also affects: nova/pike
Importance: Undecided
Status: New
** Also affects: nova/queens
Importance: Undecided
Status: New
** Also affects: nova/rocky
Importance: Undecided
Status: New
** Changed in: nova/pike
Status: New => Triaged
** Changed in: nova/queens
Status: New => Triaged
** Changed in: nova/rocky
Status: New => Triaged
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793419
Title:
database online data migration fail due to missing request spec marker
Status in OpenStack Compute (nova):
Triaged
Status in OpenStack Compute (nova) pike series:
Triaged
Status in OpenStack Compute (nova) queens series:
Triaged
Status in OpenStack Compute (nova) rocky series:
Triaged
Bug description:
Description
===========
During upgrade we run the nova online data migration, which goes through the list of instances and creates a request spec record in the db if one does not exist. As the online migrations are batched, the request spec migration leaves a marker record in the request_specs table to indicate the last instance uuid that was processed. It continues processing starting from that instance on the next batch.
In our upgrade test, we hit a scenario where the marker instance from
the online migration that was run during the Mitaka->Newton upgrade
had been deleted and purged from the db by the time we ran the
Newton->Pike upgrade. This caused the online migration to fail as the
marker instance couldn't be found.
Steps to reproduce
==================
- run the online data migration on an installed Newton load:
    nova-manage db online_data_migrations
- delete the instance referenced by the marker (instance_uuid 00000000-0000-0000-0000-000000000000)
- purge the db:
    nova-manage db purge
- upgrade to Pike.
Expected result
===============
Upgrade successful with no exceptions.
Actual result
=============
Exceptions occur during the upgrade because of the missing marker, and the upgrade fails.
Error attempting to run <function migrate_instances_add_request_spec at 0x5151050>
14 rows matched query service_uuids_online_data_migration, 14 migrated
13 rows matched query migrate_quota_limits_to_api_db, 13 migrated
Error attempting to run <function migrate_instances_add_request_spec at 0x5151050>
+---------------------------------------------+--------------+-----------+
|                  Migration                  | Total Needed | Completed |
+---------------------------------------------+--------------+-----------+
| delete_build_requests_with_no_instance_uuid |      0       |     0     |
| migrate_aggregate_reset_autoincrement       |      0       |     0     |
| migrate_aggregates                          |      0       |     0     |
| migrate_flavor_reset_autoincrement          |      0       |     0     |
| migrate_flavors                             |      0       |     0     |
| migrate_instance_groups_to_api_db           |      0       |     0     |
| migrate_instance_keypairs                   |      0       |     0     |
| migrate_instances_add_request_spec          |      0       |     0     |
| migrate_keypairs_to_api_db                  |      0       |     0     |
| migrate_quota_classes_to_api_db             |      0       |     0     |
| migrate_quota_limits_to_api_db              |      0       |     0     |
| service_uuids_online_data_migration         |      0       |     0     |
+---------------------------------------------+--------------+-----------+
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1793419/+subscriptions