
yahoo-eng-team team mailing list archive

[Bug 1498126] [NEW] Inconsistencies with resource tracking in the case of resize operation.

 

Public bug reported:

All of these are being reported upon code inspection - I have yet to
confirm all of these as they are in fact edge cases and subtle race
conditions:

* We update the instance.host field to the value of the destination_node
in resize_migration which runs on the source host.
(https://github.com/openstack/nova/blob/1df8248b6ad7982174c417abf80070107eac8909/nova/compute/manager.py#L3750)
This means that between that DB write and the point where the flavor is
changed and the migration context is applied (which happens in
finish_resize, run on the destination host), all resource tracking runs
on the destination host
will be wrong (they will use the instance record and thus use the wrong
.

* There is a very similar race in the revert_resize path, as described
in the following comment
(https://github.com/openstack/nova/blob/1df8248b6ad7982174c417abf80070107eac8909/nova/compute/manager.py#L3448)
- we should fix that too.

* The drop_move_claim method makes sense only when called on the source
node, so its name should be changed to reflect that. It's really an
optimization where we free the resources sooner than the next RT pass,
which will not see the migration as in progress. This should be
documented better.

* drop_move_claim looks up the new_flavor to compare it with the flavor
that was used to track the migration, but on the source node it's
certain to be the old_flavor. Thus, as it stands now, drop_move_claim
(only run on source nodes) doesn't do anything. Not a big deal, but we
should probably fix it.
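
To make the first and last points above concrete, here is a toy model in
Python. It is not nova code: Flavor, Instance,
usage_from_instance_record and drop_move_claim_sketch are hypothetical
names invented for illustration, and the real resource tracker is
considerably more involved. The sketch only assumes what is described
above: a tracker pass that reads the flavor from the instance record,
and a claim drop that compares the tracked flavor against new_flavor.

```python
from dataclasses import dataclass


@dataclass
class Flavor:
    name: str
    vcpus: int


@dataclass
class Instance:
    host: str
    flavor: Flavor      # flavor currently stored on the instance record
    new_flavor: Flavor  # flavor the resize is moving to


def usage_from_instance_record(instance, host):
    """Toy tracker pass: vCPUs counted for `host` from the instance record."""
    return instance.flavor.vcpus if instance.host == host else 0


def drop_move_claim_sketch(tracked_flavor, instance):
    """Toy claim drop: release only if the tracked flavor is new_flavor."""
    return tracked_flavor == instance.new_flavor


old, new = Flavor("small", 1), Flavor("large", 4)
inst = Instance(host="source", flavor=old, new_flavor=new)

# Step 1 (source host): the record is pointed at the destination.
inst.host = "dest"
# A tracker pass on the destination inside this window still sees the
# old flavor, so it accounts for 1 vCPU instead of 4.
usage_in_window = usage_from_instance_record(inst, "dest")

# Step 2 (destination host): the flavor is switched, closing the window.
inst.flavor = inst.new_flavor
usage_after = usage_from_instance_record(inst, "dest")

# On the source node the migration was tracked with the old flavor, so
# the comparison against new_flavor never matches and nothing is freed.
released_on_source = drop_move_claim_sketch(old, inst)
```

In this model the destination under-counts usage between step 1 and
step 2, and the claim drop on the source is a no-op, which is the
behaviour suspected above.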

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1498126
