yahoo-eng-team team mailing list archive

[Bug 1498126] [NEW] Inconsistencies with resource tracking in the case of resize operation.

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Nikola Đipanov <ndipanov@xxxxxxxxxx>
Date: Mon, 21 Sep 2015 18:36:53 -0000
Reply-to: Bug 1498126 <1498126@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Public bug reported:

All of these were found by code inspection. I have yet to confirm
them, as they are edge cases and subtle race conditions:

* We update the instance.host field to the value of the destination_node
in resize_migration, which runs on the source host
(https://github.com/openstack/nova/blob/1df8248b6ad7982174c417abf80070107eac8909/nova/compute/manager.py#L3750).
This means that between that DB write and the point where the flavor is
changed and the migration context is applied (which happens in
finish_resize, run on the destination host), all resource tracking runs
on the destination host will be wrong (they will use the instance
record and thus use the wrong [...]).

* There is a very similar race in the revert_resize path, as described
in the following comment
(https://github.com/openstack/nova/blob/1df8248b6ad7982174c417abf80070107eac8909/nova/compute/manager.py#L3448).
We should fix that too.

* The drop_move_claim method makes sense only when called on the source
node, so its name should be changed to reflect that. It's really an
optimization where we free the resources sooner than the next RT pass,
which will no longer see the migration as in progress. This should be
documented better.

* drop_move_claim looks up the new_flavor to compare it with the flavor
that was used to track the migration, but on the source node that is
certain to be the old_flavor. Thus, as it stands now, drop_move_claim
(only run on source nodes) doesn't do anything. Not a big deal, but we
should probably fix it.
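
To make the last point concrete, here is a toy model of the flavor
mismatch. It is only a sketch under stated assumptions, not nova's code:
ToyTracker, Flavor, track and the vCPU-only accounting are hypothetical
stand-ins, and only the names drop_move_claim, old_flavor and new_flavor
come from the report above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flavor:
    """Hypothetical, minimal flavor: just a name and a vCPU count."""
    name: str
    vcpus: int


@dataclass
class ToyTracker:
    """Toy stand-in for a per-node resource tracker (not nova's class)."""
    used_vcpus: int = 0
    # instance uuid -> flavor this node used when accounting for the move
    tracked: dict = field(default_factory=dict)

    def track(self, uuid, flavor):
        # Record usage for a migrating instance on this node.
        self.tracked[uuid] = flavor
        self.used_vcpus += flavor.vcpus

    def drop_move_claim(self, uuid, lookup_flavor):
        # Free usage early, but only if the looked-up flavor matches
        # the flavor that was tracked for this instance.
        tracked_flavor = self.tracked.get(uuid)
        if tracked_flavor is None or tracked_flavor != lookup_flavor:
            return False  # mismatch: nothing is freed
        del self.tracked[uuid]
        self.used_vcpus -= tracked_flavor.vcpus
        return True


old_flavor = Flavor("small", 2)
new_flavor = Flavor("large", 8)

# The source node tracked the migration with the old flavor.
source = ToyTracker()
source.track("inst-1", old_flavor)

# Looking up new_flavor never matches on the source node, so the call
# is a no-op and the usage stays in place.
freed_with_new = source.drop_move_claim("inst-1", new_flavor)
usage_after_new = source.used_vcpus

# Looking up old_flavor matches and frees the usage right away.
freed_with_old = source.drop_move_claim("inst-1", old_flavor)
usage_after_old = source.used_vcpus
```

In this model the early release only happens when the lookup uses the
flavor the source node actually tracked. Until then the usage stays
accounted for until the next RT pass, which matches the "optimization"
reading above.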

** Affects: nova
Importance: Undecided
Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1498126

Title:
Inconsistencies with resource tracking in the case of resize
operation.

Status in OpenStack Compute (nova):
New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1498126/+subscriptions