yahoo-eng-team team mailing list archive
[Bug 2035911] [NEW] Race conditions attaching/detaching volumes
Public bug reported:
For cinder volume attach and detach operations Nova is properly using
os-brick's `guard_connection` as a context manager to protect against
race conditions between one volume's local disconnection and the call
to cinder to unmap/unexport it, and other volumes' local connection
and map/export.
This is the code:
    def detach(self, context, instance, volume_api, virt_driver,
               attachment_id=None, destroy_bdm=False):
        volume = self._get_volume(context, volume_api, self.volume_id)
        # Let OS-Brick handle high level locking that covers the local os-brick
        # detach and the Cinder call to unmap the volume. Not all volume
        # backends or hosts require locking.
        with brick_utils.guard_connection(volume):
            self._do_detach(context, instance, volume_api, virt_driver,
                            attachment_id, destroy_bdm)

    @update_db
    def attach(self, context, instance, volume_api, virt_driver,
               do_driver_attach=False, **kwargs):
        volume = self._get_volume(context, volume_api, self.volume_id)
        volume_api.check_availability_zone(context, volume,
                                           instance=instance)
        # Let OS-Brick handle high level locking that covers the call to
        # Cinder that exports & maps the volume, and for the local os-brick
        # attach. Not all volume backends or hosts require locking.
        with brick_utils.guard_connection(volume):
            self._do_attach(context, instance, volume, volume_api,
                            virt_driver, do_driver_attach)
But there are many other places where Nova attaches or detaches volumes
without going through those two methods (`detach` and `attach`).
One example is when we delete an instance that has cinder volumes
attached; another is when finishing an instance live migration.
Nova needs to always use the `guard_connection` context manager.
There are places where it will be easy to fix, such as the
`_remove_volume_connection` method in nova/compute/manager.py.
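For illustration, here is a minimal sketch of what that fix could look
like, assuming the method keeps doing the same local detach and Cinder
attachment cleanup it does today (`_local_detach_volume` is a
hypothetical name standing in for the existing detach logic, not an
actual nova helper):

    def _remove_volume_connection(self, context, bdm, instance):
        volume = self.volume_api.get(context, bdm.volume_id)
        # Reuse the same os-brick guard that attach()/detach() already use,
        # so the local disconnect and the Cinder unmap/unexport cannot
        # interleave with another volume's local connect and map/export.
        with brick_utils.guard_connection(volume):
            self._local_detach_volume(context, bdm, instance)  # hypothetical
            self.volume_api.attachment_delete(context, bdm.attachment_id)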
But there are others where it looks like it will be harder, such as
`_shutdown_instance` in the same file, because the volumes are locally
detached through the `self.driver.destroy` call, and only after all the
volumes (potentially from different backends) have been locally removed
does it call `self.volume_api.attachment_delete` for each of them.
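To make the ordering problem concrete, a simplified sketch of the
current flow (structure only, not the literal nova code):

    def _shutdown_instance(self, context, instance, bdms):
        block_device_info = self._get_instance_block_device_info(
            context, instance, bdms=bdms)
        # The virt driver tears down the guest and locally disconnects every
        # volume inside this single call, outside any guard_connection scope.
        self.driver.destroy(context, instance, network_info=None,
                            block_device_info=block_device_info)
        # Only after all volumes (possibly from different backends) are gone
        # locally is Cinder told to unmap/unexport them, so a single
        # per-volume guard cannot cover both halves of the operation.
        for bdm in bdms:
            if bdm.is_volume and bdm.attachment_id:
                self.volume_api.attachment_delete(context, bdm.attachment_id)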
** Affects: nova
Importance: Undecided
Status: New
--
https://bugs.launchpad.net/bugs/2035911