
yahoo-eng-team team mailing list archive

[Bug 1367964] Re: Unable to recover from timeout of detaching cinder volume

 

** Changed in: nova
       Status: Fix Committed => Fix Released

** Changed in: nova
    Milestone: None => kilo-1

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1367964

Title:
  Unable to recover from timeout of detaching cinder volume

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  When cinder-volume is under heavy load, the RPC call for terminate_connection of a cinder volume may take longer than the RPC timeout.
  When the timeout occurs, nova gives up detaching the volume and restores the volume state to 'in-use', but does not reattach the volume.
  This leaves the DB in an inconsistent state:

    (1) libvirt has already detached the volume from the instance
    (2) the cinder volume is disconnected from the host by the terminate_connection RPC (but nova doesn't know this because of the timeout)
    (3) the nova.block_device_mapping entry still remains because of the timeout in (2)

  and the volume becomes impossible to re-attach or to fully detach.
  If volume-detach is issued again, it fails with exception.DiskNotFound:

  
  2014-07-17 10:58:17.333 2586 AUDIT nova.compute.manager [req-e251f834-9653-47aa-969c-b9524d4a683d f8c2ac613325450fa6403a89d48ac644 4be531199d5240f79733fb071e090e46] [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb] Detach volume f7d90bc8-eb55-4d46-a2c4-294dc9c6a92a from mountpoint /dev/vdb
  2014-07-17 10:58:17.337 2586 ERROR nova.compute.manager [req-e251f834-9653-47aa-969c-b9524d4a683d f8c2ac613325450fa6403a89d48ac644 4be531199d5240f79733fb071e090e46] [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb] Failed to detach volume f7d90bc8-eb55-4d46-a2c4-294dc9c6a92a from /dev/vdb
  2014-07-17 10:58:17.337 2586 TRACE nova.compute.manager [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb] Traceback (most recent call last):
  2014-07-17 10:58:17.337 2586 TRACE nova.compute.manager [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb]   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 4169, in _detach_volume
  2014-07-17 10:58:17.337 2586 TRACE nova.compute.manager [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb]     encryption=encryption)
  2014-07-17 10:58:17.337 2586 TRACE nova.compute.manager [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb]   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 1365, in detach_volume
  2014-07-17 10:58:17.337 2586 TRACE nova.compute.manager [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb]     raise exception.DiskNotFound(location=disk_dev)
  2014-07-17 10:58:17.337 2586 TRACE nova.compute.manager [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb] DiskNotFound: No disk at vdb
  2014-07-17 10:58:17.337 2586 TRACE nova.compute.manager [instance: 48c19bff-ec39-44c5-a63b-cac01ee813eb] 

  
  We should have a way to recover from this situation.

  For instance, we need something like "volume-detach --force", which
  ignores the DiskNotFound exception and continues to delete the
  nova.block_device_mapping entry.
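  The proposed recovery could look roughly like the following sketch. This
  is not actual nova code: DiskNotFound here is a stand-in for
  nova.exception.DiskNotFound, and force_detach, driver_detach, and
  destroy_bdm are hypothetical names illustrating the idea of tolerating a
  disk that is already gone while still cleaning up the
  block_device_mapping entry.

  ```python
  # Hypothetical sketch of a "volume-detach --force" code path.
  # Names below are illustrative, not real nova APIs.

  class DiskNotFound(Exception):
      """Stand-in for nova.exception.DiskNotFound."""


  def force_detach(driver_detach, destroy_bdm):
      """Detach the volume via the hypervisor driver; if the disk is
      already gone (e.g. the earlier detach succeeded before the RPC
      timeout), ignore the error and still remove the
      block_device_mapping entry so the DB becomes consistent again."""
      try:
          driver_detach()
      except DiskNotFound:
          # The guest no longer sees the disk: the previous detach
          # actually completed, so treat this as success and continue.
          pass
      destroy_bdm()
  ```

  The key point is that DiskNotFound is swallowed only on the forced
  path, so a normal detach would still surface unexpected errors.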

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1367964/+subscriptions

