yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #90334
[Bug 1864665] Fix included in openstack/nova rocky-eol
This issue was fixed in the openstack/nova rocky-eol release.
** Changed in: nova/rocky
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1864665
Title:
Circular reference error during re-schedule
Status in OpenStack Compute (nova):
Invalid
Status in OpenStack Compute (nova) ocata series:
New
Status in OpenStack Compute (nova) pike series:
Fix Released
Status in OpenStack Compute (nova) queens series:
Fix Released
Status in OpenStack Compute (nova) rocky series:
Fix Released
Bug description:
Description
===========
Server cold migration fails after re-schedule.
Steps to reproduce
==================
* create a devstack with two compute hosts with libvirt driver
* set allow_resize_to_same_host=True on both computes
* set up cellsv2 without cell conductor and rabbit separation to allow re-schedule logic to call back to the super conductor / scheduler
* enable NUMATopologyFilter and make sure both computes has NUMA resources
* create a flavor with hw:cpu_policy='dedicated' extra spec
* boot a server with the flavor. Check which compute the server is placed (let's call it host1)
* boot enough servers on host2 so that the next scheduling request could still be fulfilled by both computes but host1 will be preferred by the weighers
* cold migrate the pinned server
Expected result
===============
* scheduler selects host1 first but that host fails with UnableToMigrateToSelf exception as libvirt does not have the capability
* re-schedule happens
* scheduler selects host2 where the server spawns successfully
Actual result
=============
* during the re-schedule when the conductor sends prep_resize RPC to host2 the json serialization of the request spec fails with Circural reference error.
Environment
===========
* two node devstack with libvirt driver
* stable/pike nova. But expected to be reproduced in newer branches but not since stein. See triage part
Triage
======
The json serialization blows up in the migrate conductor task. [1] After debugging I see that the infinit loop happens when jsonutils.to_primitive tries to serialize a VirtCPUTopology instance.
The problematic piece of code has been removed by
I4244f7dd8fe74565180f73684678027067b4506e in Stein.
[1]
https://github.com/openstack/nova/blob/4224a61b4f3a8b910dcaa498f9663479d61a6060/nova/conductor/tasks/migrate.py#L87
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1864665/+subscriptions
References