yahoo-eng-team team mailing list archive
Message #88205
[Bug 1952941] Re: Migration fails with "NotImplementedError: Cannot load 'pcpuset' in the base class" when a pre Victoria instance with cpu pinning is migrated in Victoria
Reviewed: https://review.opendev.org/c/openstack/nova/+/820153
Committed: https://opendev.org/openstack/nova/commit/e853bb57181721725a89656b3cb3058636630a6e
Submitter: "Zuul (22348)"
Branch: master
commit e853bb57181721725a89656b3cb3058636630a6e
Author: Balazs Gibizer <balazs.gibizer@xxxxxxxx>
Date: Thu Dec 2 12:52:01 2021 +0100
Migrate RequestSpec.numa_topology to use pcpuset
When the InstanceNUMATopology OVO was changed in
I901fbd7df00e45196395ff4c69e7b8aa3359edf6 to track pcpus separately
from vcpus, a data migration was added. This data migration is
triggered when the InstanceNUMATopology object is loaded from the
instance_extra table. However, that patch missed the fact that the
InstanceNUMATopology object can be loaded from the request_spec table as
well, so InstanceNUMATopology objects in RequestSpec are not migrated.
This can lead to errors in the scheduler when such a RequestSpec object
is used for scheduling (e.g. during a migration of a pre-Victoria
instance with CPU pinning).
This patch adds the missing data migration.
Change-Id: I812d720555bdf008c83cae3d81541a37bd99e594
Closes-Bug: #1952941
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1952941
Title:
Migration fails with "NotImplementedError: Cannot load 'pcpuset' in
the base class" when a pre Victoria instance with cpu pinning is
migrated in Victoria
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) victoria series:
New
Status in OpenStack Compute (nova) wallaby series:
New
Status in OpenStack Compute (nova) xena series:
New
Bug description:
When the cpuset -> pcpuset data migration was added to
InstanceNUMATopology [1], it was missed that such an object is hydrated
not only via InstanceNUMATopology.get_by_instance_uuid() but also,
indirectly, via RequestSpec.get_by_instance_uuid(). The latter code
path does not call InstanceNUMATopology.obj_from_db_obj(), which
triggers the data migration via
InstanceNUMATopology._migrate_legacy_dedicated_instance_cpuset. As a
result, when the new nova code loads an old RequestSpec object from
the DB (e.g. during migration of an instance), the
InstanceNUMATopology in the RequestSpec is not migrated to the new
object version, which leads to errors when the pcpuset field is read
during scheduling.
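The gap described above can be illustrated with a small, self-contained sketch. It is not the nova code: cells are modelled as plain dicts, and the helper names (migrate_legacy_cell, load_from_instance_extra, load_from_request_spec_buggy, load_from_request_spec_fixed) are hypothetical. It only shows the shape of the problem, assuming a legacy pinned cell stores its CPUs in 'cpuset' and has no 'pcpuset' key.

```python
# Toy model only: plain dicts stand in for InstanceNUMACell objects.
# All function names here are hypothetical, not nova APIs.


def migrate_legacy_cell(cell):
    """Give a legacy cell a 'pcpuset'; return True if the cell changed."""
    if "pcpuset" in cell:
        return False  # already in the new format
    if cell.get("cpu_policy") == "dedicated":
        # Pinned CPUs used to live in 'cpuset'; move them to 'pcpuset'.
        cell["pcpuset"] = set(cell["cpuset"])
        cell["cpuset"] = set()
    else:
        cell["pcpuset"] = set()
    return True


def load_from_instance_extra(stored_cells):
    """Load path that runs the data migration."""
    cells = [dict(c) for c in stored_cells]
    for cell in cells:
        migrate_legacy_cell(cell)
    return cells


def load_from_request_spec_buggy(stored_cells):
    """Load path that skips the data migration (the reported gap)."""
    return [dict(c) for c in stored_cells]


def load_from_request_spec_fixed(stored_cells):
    """Same load path with the missing migration added."""
    cells = [dict(c) for c in stored_cells]
    for cell in cells:
        migrate_legacy_cell(cell)
    return cells


legacy = [{"cpuset": {0, 1}, "cpu_policy": "dedicated"}]
buggy = load_from_request_spec_buggy(legacy)
fixed = load_from_request_spec_fixed(legacy)
from_extra = load_from_instance_extra(legacy)
```

In this sketch the buggy path returns a cell without any 'pcpuset' entry, while the other two paths return the pinned CPUs under 'pcpuset' and an empty 'cpuset'.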
To reproduce:
* Install a pre Victoria cloud
* Create an instance with cpu pinning
* Upgrade to Victoria or newer
* Try to migrate / evacuate the instance
You will see the following stack trace in the nova-scheduler log:
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 241, in inner
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server return func(*args, **kwargs)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/manager.py", line 215, in select_destinations
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server allocation_request_version, return_alternates)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py", line 96, in select_destinations
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server allocation_request_version, return_alternates)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py", line 210, in _schedule
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server hosts = self._get_sorted_hosts(spec_obj, hosts, num)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py", line 441, in _get_sorted_hosts
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server spec_obj, index)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/host_manager.py", line 606, in get_filtered_hosts
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server hosts, spec_obj, index)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/filters.py", line 88, in get_filtered_objects
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server list_objs = list(objs)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/filters.py", line 43, in filter_all
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server if self._filter_one(obj, spec_obj):
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/filters/__init__.py", line 44, in _filter_one
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server return self.host_passes(obj, spec)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/filters/numa_topology_filter.py", line 104, in host_passes
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server pci_stats=host_state.pci_stats))
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/hardware.py", line 2294, in numa_fit_instance_to_host
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server host_cell, instance_cell, limits, cpuset_reserved)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/hardware.py", line 1109, in _numa_fit_instance_cell
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server required_cpus = len(instance_cell.pcpuset) + cpuset_reserved
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 67, in getter
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server self.obj_load_attr(name)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 601, in obj_load_attr
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server _("Cannot load '%s' in the base class") % attrname)
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server NotImplementedError: Cannot load 'pcpuset' in the base class
2021-11-30 17:36:38.963 48 ERROR oslo_messaging.rpc.server
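The last frames of the trace show a field getter calling obj_load_attr() for a field that was never set, and the base class refusing to load it. The following sketch mimics that behaviour with a plain Python descriptor. It does not use oslo.versionedobjects; LazyField, ToyBase and ToyCell are hypothetical names, and only the error message is taken from the trace above.

```python
# Toy re-creation of the lazy-load behaviour seen in the trace.
# LazyField, ToyBase and ToyCell are illustrative, not real library classes.


class LazyField:
    """Descriptor that asks the object to load a field that is not set."""

    def __init__(self, name):
        self.name = name
        self.attr = "_obj_" + name  # where the value is stored per instance

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if not hasattr(obj, self.attr):
            # Field was never set: give the object a chance to load it.
            obj.obj_load_attr(self.name)
        return getattr(obj, self.attr)

    def __set__(self, obj, value):
        setattr(obj, self.attr, value)


class ToyBase:
    def obj_load_attr(self, attrname):
        # The base class has no way to load anything lazily.
        raise NotImplementedError(
            "Cannot load '%s' in the base class" % attrname)


class ToyCell(ToyBase):
    cpuset = LazyField("cpuset")
    pcpuset = LazyField("pcpuset")


# A "legacy" cell: only cpuset was populated when it was stored.
cell = ToyCell()
cell.cpuset = {0, 1}

try:
    required_cpus = len(cell.pcpuset)
    message = None
except NotImplementedError as exc:
    message = str(exc)
```

Reading the unset field raises the error, and setting pcpuset beforehand (which is what a data migration would do) makes the same read succeed.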
[1] https://review.opendev.org/c/openstack/nova/+/714658
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1952941/+subscriptions